Commit 8ca8d89b authored by Linus Torvalds

Merge tag 'cgroup-for-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

Pull cgroup updates from Tejun Heo:
 "All the changes are trivial: documentation updates and a trivial code
  cleanup"

* tag 'cgroup-for-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup/cpuset: fix a few kernel-doc warnings & coding style
  docs: cgroup-v1: use numbered lists for user interface setup
  docs: cgroup-v1: add internal cross-references
  docs: cgroup-v1: make swap extension subsections subsections
  docs: cgroup-v1: use bullet lists for list of stat file tables
  docs: cgroup-v1: move hierarchy of accounting caption
  docs: cgroup-v1: fix footnotes
  docs: cgroup-v1: use code block for locking order schema
  docs: cgroup-v1: wrap remaining admonitions in admonition blocks
  docs: cgroup-v1: replace custom note constructs with appropriate admonition blocks
  cgroup/cpuset: no need to explicitly init a global static variable
parents 3e82b41e 32a47817
@@ -80,6 +80,8 @@ access. For example, cpusets (see Documentation/admin-guide/cgroup-v1/cpusets.rs
 you to associate a set of CPUs and a set of memory nodes with the
 tasks in each cgroup.
 
+.. _cgroups-why-needed:
+
 1.2 Why are cgroups needed ?
 ----------------------------
@@ -2,18 +2,18 @@
 Memory Resource Controller
 ==========================
 
-NOTE:
+.. caution::
   This document is hopelessly outdated and it asks for a complete
   rewrite. It still contains a useful information so we are keeping it
   here but make sure to check the current code if you need a deeper
   understanding.
 
-NOTE:
+.. note::
   The Memory Resource Controller has generically been referred to as the
   memory controller in this document. Do not confuse memory controller
   used here with the memory controller that is used in hardware.
 
-(For editors) In this document:
+.. hint::
   When we mention a cgroup (cgroupfs's directory) with memory controller,
   we call it "memory cgroup". When you see git-log and source code, you'll
   see patch's title and function names tend to use "memcg".
@@ -23,7 +23,7 @@ Benefits and Purpose of the memory controller
 =============================================
 
 The memory controller isolates the memory behaviour of a group of tasks
-from the rest of the system. The article on LWN [12] mentions some probable
+from the rest of the system. The article on LWN [12]_ mentions some probable
 uses of the memory controller. The memory controller can be used to
 
 a. Isolate an application or a group of applications
@@ -55,7 +55,8 @@ Features:
 - Root cgroup has no limit controls.
 
 Kernel memory support is a work in progress, and the current version provides
-basically functionality. (See Section 2.7)
+basically functionality. (See :ref:`section 2.7
+<cgroup-v1-memory-kernel-extension>`)
 
 Brief summary of control files.
@@ -107,16 +108,16 @@ Brief summary of control files.
 ==========
 
 The memory controller has a long history. A request for comments for the memory
-controller was posted by Balbir Singh [1]. At the time the RFC was posted
+controller was posted by Balbir Singh [1]_. At the time the RFC was posted
 there were several implementations for memory control. The goal of the
 RFC was to build consensus and agreement for the minimal features required
-for memory control. The first RSS controller was posted by Balbir Singh[2]
-in Feb 2007. Pavel Emelianov [3][4][5] has since posted three versions of the
-RSS controller. At OLS, at the resource management BoF, everyone suggested
-that we handle both page cache and RSS together. Another request was raised
-to allow user space handling of OOM. The current memory controller is
+for memory control. The first RSS controller was posted by Balbir Singh [2]_
+in Feb 2007. Pavel Emelianov [3]_ [4]_ [5]_ has since posted three versions
+of the RSS controller. At OLS, at the resource management BoF, everyone
+suggested that we handle both page cache and RSS together. Another request was
+raised to allow user space handling of OOM. The current memory controller is
 at version 6; it combines both mapped (RSS) and unmapped Page
-Cache Control [11].
+Cache Control [11]_.
 
 2. Memory Control
 =================
@@ -147,7 +148,8 @@ specific data structure (mem_cgroup) associated with it.
 2.2. Accounting
 ---------------
 
-::
+.. code-block::
+   :caption: Figure 1: Hierarchy of Accounting
 
        +--------------------+
       |     mem_cgroup     |

@@ -167,7 +169,6 @@ specific data structure (mem_cgroup) associated with it.
       |               |  |               |
       +---------------+  +---------------+
 
-      (Figure 1: Hierarchy of Accounting)
 
 Figure 1 shows the important aspects of the controller
@@ -221,8 +222,9 @@ behind this approach is that a cgroup that aggressively uses a shared
 page will eventually get charged for it (once it is uncharged from
 the cgroup that brought it in -- this will happen on memory pressure).
 
-But see section 8.2: when moving a task to another cgroup, its pages may
-be recharged to the new cgroup, if move_charge_at_immigrate has been chosen.
+But see :ref:`section 8.2 <cgroup-v1-memory-movable-charges>` when moving a
+task to another cgroup, its pages may be recharged to the new cgroup, if
+move_charge_at_immigrate has been chosen.
 
 2.4 Swap Extension
 --------------------------------------
@@ -244,7 +246,8 @@ In this case, setting memsw.limit_in_bytes=3G will prevent bad use of swap.
 By using the memsw limit, you can avoid system OOM which can be caused by swap
 shortage.
 
-**why 'memory+swap' rather than swap**
+2.4.1 why 'memory+swap' rather than swap
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 The global LRU(kswapd) can swap out arbitrary pages. Swap-out means
 to move account from memory to swap...there is no change in usage of
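As a quick illustration (a sketch, assuming the /sys/fs/cgroup/memory/0 group
from the user-interface section below and a kernel with swap accounting
enabled), the memsw limit mentioned in the hunk header above is set via::

      # echo 3G > /sys/fs/cgroup/memory/0/memory.memsw.limit_in_bytes
      # cat /sys/fs/cgroup/memory/0/memory.memsw.limit_in_bytes
      3221225472

The echoed value is simply 3 * 1024^3 bytes; the memsw files are only present
when swap accounting is built in and enabled.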
@@ -252,7 +255,8 @@ memory+swap. In other words, when we want to limit the usage of swap without
 affecting global LRU, memory+swap limit is better than just limiting swap from
 an OS point of view.
 
-**What happens when a cgroup hits memory.memsw.limit_in_bytes**
+2.4.2. What happens when a cgroup hits memory.memsw.limit_in_bytes
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 When a cgroup hits memory.memsw.limit_in_bytes, it's useless to do swap-out
 in this cgroup. Then, swap-out will not be done by cgroup routine and file
@@ -268,26 +272,26 @@ global VM. When a cgroup goes over its limit, we first try
 to reclaim memory from the cgroup so as to make space for the new
 pages that the cgroup has touched. If the reclaim is unsuccessful,
 an OOM routine is invoked to select and kill the bulkiest task in the
-cgroup. (See 10. OOM Control below.)
+cgroup. (See :ref:`10. OOM Control <cgroup-v1-memory-oom-control>` below.)
 
 The reclaim algorithm has not been modified for cgroups, except that
 pages that are selected for reclaiming come from the per-cgroup LRU
 list.
 
-NOTE:
+.. note::
   Reclaim does not work for the root cgroup, since we cannot set any
   limits on the root cgroup.
 
-Note2:
+.. note::
   When panic_on_oom is set to "2", the whole system will panic.
   When oom event notifier is registered, event will be delivered.
-  (See oom_control section)
+  (See :ref:`oom_control <cgroup-v1-memory-oom-control>` section)
 
 2.6 Locking
 -----------
 
-Lock order is as follows:
+Lock order is as follows::
 
   Page lock (PG_locked bit of page->flags)
     mm->page_table_lock or split pte_lock
@@ -299,6 +303,8 @@ Per-node-per-memcgroup LRU (cgroup's private LRU) is guarded by
 lruvec->lru_lock; PG_lru bit of page->flags is cleared before
 isolating a page from its LRU under lruvec->lru_lock.
 
+.. _cgroup-v1-memory-kernel-extension:
+
 2.7 Kernel Memory Extension
 -----------------------------------------------
@@ -367,10 +373,10 @@ U != 0, K < U:
     never greater than the total memory, and freely set U at the cost of his
     QoS.
 
-    WARNING:
-    In the current implementation, memory reclaim will NOT be
-    triggered for a cgroup when it hits K while staying below U, which makes
-    this setup impractical.
+    .. warning::
+       In the current implementation, memory reclaim will NOT be triggered for
+       a cgroup when it hits K while staying below U, which makes this setup
+       impractical.
 
 U != 0, K >= U:
     Since kmem charges will also be fed to the user counter and reclaim will be
@@ -381,45 +387,41 @@ U != 0, K >= U:
 3. User Interface
 =================
 
-3.0. Configuration
-------------------
-
-a. Enable CONFIG_CGROUPS
-b. Enable CONFIG_MEMCG
-
-3.1. Prepare the cgroups (see cgroups.txt, Why are cgroups needed?)
--------------------------------------------------------------------
-
-::
+To use the user interface:
+
+1. Enable CONFIG_CGROUPS and CONFIG_MEMCG options
+2. Prepare the cgroups (see :ref:`Why are cgroups needed?
+   <cgroups-why-needed>` for the background information)::
 
       # mount -t tmpfs none /sys/fs/cgroup
      # mkdir /sys/fs/cgroup/memory
     # mount -t cgroup none /sys/fs/cgroup/memory -o memory
 
-3.2. Make the new group and move bash into it::
+3. Make the new group and move bash into it::
 
      # mkdir /sys/fs/cgroup/memory/0
     # echo $$ > /sys/fs/cgroup/memory/0/tasks
 
-Since now we're in the 0 cgroup, we can alter the memory limit::
+4. Since now we're in the 0 cgroup, we can alter the memory limit::
 
      # echo 4M > /sys/fs/cgroup/memory/0/memory.limit_in_bytes
 
-NOTE:
-  We can use a suffix (k, K, m, M, g or G) to indicate values in kilo,
-  mega or gigabytes. (Here, Kilo, Mega, Giga are Kibibytes, Mebibytes,
-  Gibibytes.)
-
-NOTE:
-  We can write "-1" to reset the ``*.limit_in_bytes(unlimited)``.
-
-NOTE:
-  We cannot set limits on the root cgroup any more.
-
-::
+   The limit can now be queried::
 
      # cat /sys/fs/cgroup/memory/0/memory.limit_in_bytes
     4194304
 
+.. note::
+   We can use a suffix (k, K, m, M, g or G) to indicate values in kilo,
+   mega or gigabytes. (Here, Kilo, Mega, Giga are Kibibytes, Mebibytes,
+   Gibibytes.)
+
+.. note::
+   We can write "-1" to reset the ``*.limit_in_bytes(unlimited)``.
+
+.. note::
+   We cannot set limits on the root cgroup any more.
+
 We can check the usage::
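For illustration, the suffix and "-1" reset behaviour described in the notes
above look like this (a sketch, assuming the same /sys/fs/cgroup/memory/0
group)::

      # echo 1G > /sys/fs/cgroup/memory/0/memory.limit_in_bytes
      # cat /sys/fs/cgroup/memory/0/memory.limit_in_bytes
      1073741824
      # echo -1 > /sys/fs/cgroup/memory/0/memory.limit_in_bytes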
@@ -458,6 +460,8 @@ test because it has noise of shared objects/status.
 But the above two are testing extreme situations.
 Trying usual test under memory controller is always helpful.
 
+.. _cgroup-v1-memory-test-troubleshoot:
+
 4.1 Troubleshooting
 -------------------
@@ -470,8 +474,11 @@ terminated by the OOM killer. There are several causes for this:
 A sync followed by echo 1 > /proc/sys/vm/drop_caches will help get rid of
 some of the pages cached in the cgroup (page cache pages).
 
-To know what happens, disabling OOM_Kill as per "10. OOM Control" (below) and
-seeing what happens will be helpful.
+To know what happens, disabling OOM_Kill as per :ref:`"10. OOM Control"
+<cgroup-v1-memory-oom-control>` (below) and seeing what happens will be
+helpful.
+
+.. _cgroup-v1-memory-test-task-migration:
 
 4.2 Task migration
 ------------------
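The drop_caches hint in the hunk above corresponds to roughly the following,
run as root on the host::

      # sync
      # echo 1 > /proc/sys/vm/drop_caches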
@@ -482,15 +489,16 @@ remain charged to it, the charge is dropped when the page is freed or
 reclaimed.
 
 You can move charges of a task along with task migration.
-See 8. "Move charges at task migration"
+See :ref:`8. "Move charges at task migration" <cgroup-v1-memory-move-charges>`
 
 4.3 Removing a cgroup
 ---------------------
 
-A cgroup can be removed by rmdir, but as discussed in sections 4.1 and 4.2, a
-cgroup might have some charge associated with it, even though all
-tasks have migrated away from it. (because we charge against pages, not
-against tasks.)
+A cgroup can be removed by rmdir, but as discussed in :ref:`sections 4.1
+<cgroup-v1-memory-test-troubleshoot>` and :ref:`4.2
+<cgroup-v1-memory-test-task-migration>`, a cgroup might have some charge
+associated with it, even though all tasks have migrated away from it. (because
+we charge against pages, not against tasks.)
 
 We move the stats to parent, and no change on the charge except uncharging
 from the child.
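A minimal sketch of the migrate-then-remove sequence discussed in 4.2 and 4.3
(assuming the group 0 from section 3; the second group "1" is made up for
illustration)::

      # mkdir /sys/fs/cgroup/memory/1
      # echo $$ > /sys/fs/cgroup/memory/1/tasks
      # rmdir /sys/fs/cgroup/memory/0

rmdir fails with EBUSY while tasks are still in the group; once it succeeds,
any remaining page charges are moved to the parent as described above.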
@@ -519,67 +527,66 @@ will be charged as a new owner of it.
 5.2 stat file
 -------------
 
-memory.stat file includes following statistics
+memory.stat file includes following statistics:
 
-per-memory cgroup local status
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+* per-memory cgroup local status
 
 =============== ===============================================================
 cache           # of bytes of page cache memory.
 rss             # of bytes of anonymous and swap cache memory (includes
                 transparent hugepages).
 rss_huge        # of bytes of anonymous transparent hugepages.
 mapped_file     # of bytes of mapped file (includes tmpfs/shmem)
 pgpgin          # of charging events to the memory cgroup. The charging
                 event happens each time a page is accounted as either mapped
                 anon page(RSS) or cache page(Page Cache) to the cgroup.
-pgpgout         # of uncharging events to the memory cgroup. The uncharging
-                event happens each time a page is unaccounted from the cgroup.
+pgpgout         # of uncharging events to the memory cgroup. The uncharging
+                event happens each time a page is unaccounted from the
+                cgroup.
 swap            # of bytes of swap usage
 dirty           # of bytes that are waiting to get written back to the disk.
 writeback       # of bytes of file/anon cache that are queued for syncing to
                 disk.
 inactive_anon   # of bytes of anonymous and swap cache memory on inactive
                 LRU list.
 active_anon     # of bytes of anonymous and swap cache memory on active
                 LRU list.
-inactive_file   # of bytes of file-backed memory and MADV_FREE anonymous memory(
-                LazyFree pages) on inactive LRU list.
+inactive_file   # of bytes of file-backed memory and MADV_FREE anonymous
+                memory (LazyFree pages) on inactive LRU list.
 active_file     # of bytes of file-backed memory on active LRU list.
 unevictable     # of bytes of memory that cannot be reclaimed (mlocked etc).
 =============== ===============================================================
 
-status considering hierarchy (see memory.use_hierarchy settings)
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+* status considering hierarchy (see memory.use_hierarchy settings):
 
 ========================= ===================================================
-hierarchical_memory_limit # of bytes of memory limit with regard to hierarchy
-                          under which the memory cgroup is
+hierarchical_memory_limit # of bytes of memory limit with regard to
+                          hierarchy
+                          under which the memory cgroup is
 hierarchical_memsw_limit  # of bytes of memory+swap limit with regard to
                           hierarchy under which memory cgroup is.
 total_<counter>           # hierarchical version of <counter>, which in
                           addition to the cgroup's own value includes the
                           sum of all hierarchical children's values of
                           <counter>, i.e. total_cache
 ========================= ===================================================
 
-The following additional stats are dependent on CONFIG_DEBUG_VM
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+* additional vm parameters (depends on CONFIG_DEBUG_VM):
 
 ========================= ========================================
 recent_rotated_anon       VM internal parameter. (see mm/vmscan.c)
 recent_rotated_file       VM internal parameter. (see mm/vmscan.c)
 recent_scanned_anon       VM internal parameter. (see mm/vmscan.c)
 recent_scanned_file       VM internal parameter. (see mm/vmscan.c)
 ========================= ========================================
 
-Memo:
+.. hint::
    recent_rotated means recent frequency of LRU rotation.
    recent_scanned means recent # of scans to LRU.
    showing for better debug please see the code for meanings.
 
-Note:
+.. note::
    Only anonymous and swap cache memory is listed as part of 'rss' stat.
    This should not be confused with the true 'resident set size' or the
    amount of physical memory used by the cgroup.
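To inspect these counters on a live system one can simply read the file, for
example (an illustrative path, assuming the group from section 3; the
recent_* lines only show up on CONFIG_DEBUG_VM kernels)::

      # grep -e '^cache ' -e '^rss ' -e '^total_cache ' /sys/fs/cgroup/memory/0/memory.stat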
@@ -710,13 +717,16 @@ If we want to change this to 1G, we can at any time use::
 
      # echo 1G > memory.soft_limit_in_bytes
 
-NOTE1:
+.. note::
    Soft limits take effect over a long period of time, since they involve
    reclaiming memory for balancing between memory cgroups
-NOTE2:
+
+.. note::
    It is recommended to set the soft limit always below the hard limit,
    otherwise the hard limit will take precedence.
 
+.. _cgroup-v1-memory-move-charges:
+
 8. Move charges at task migration
 =================================
@@ -735,23 +745,29 @@ If you want to enable it::
 
      # echo (some positive value) > memory.move_charge_at_immigrate
 
-Note:
+.. note::
    Each bits of move_charge_at_immigrate has its own meaning about what type
-   of charges should be moved. See 8.2 for details.
-Note:
+   of charges should be moved. See :ref:`section 8.2
+   <cgroup-v1-memory-movable-charges>` for details.
+
+.. note::
    Charges are moved only when you move mm->owner, in other words,
    a leader of a thread group.
-Note:
+
+.. note::
    If we cannot find enough space for the task in the destination cgroup, we
    try to make space by reclaiming memory. Task migration may fail if we
    cannot make enough space.
-Note:
+
+.. note::
    It can take several seconds if you move charges much.
 
 And if you want disable it again::
 
      # echo 0 > memory.move_charge_at_immigrate
 
+.. _cgroup-v1-memory-movable-charges:
+
 8.2 Type of charges which can be moved
 --------------------------------------
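A sketch of turning charge moving on and off (the destination group "1" is
illustrative; the v1 documentation's table in this section describes bit 0 as
selecting anonymous pages and bit 1 as selecting file pages, so 3 would select
both -- treat the exact bit meanings as an assumption here)::

      # echo 3 > /sys/fs/cgroup/memory/1/memory.move_charge_at_immigrate
      # echo (pid of the task to migrate) > /sys/fs/cgroup/memory/1/tasks
      # echo 0 > /sys/fs/cgroup/memory/1/memory.move_charge_at_immigrate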
@@ -801,6 +817,8 @@ threshold in any direction.
 
 It's applicable for root and non-root cgroup.
 
+.. _cgroup-v1-memory-oom-control:
+
 10. OOM Control
 ===============
@@ -956,15 +974,16 @@ commented and discussed quite extensively in the community.
 References
 ==========
 
-1. Singh, Balbir. RFC: Memory Controller, http://lwn.net/Articles/206697/
-2. Singh, Balbir. Memory Controller (RSS Control),
+.. [1] Singh, Balbir. RFC: Memory Controller, http://lwn.net/Articles/206697/
+.. [2] Singh, Balbir. Memory Controller (RSS Control),
    http://lwn.net/Articles/222762/
-3. Emelianov, Pavel. Resource controllers based on process cgroups
+.. [3] Emelianov, Pavel. Resource controllers based on process cgroups
    https://lore.kernel.org/r/45ED7DEC.7010403@sw.ru
-4. Emelianov, Pavel. RSS controller based on process cgroups (v2)
+.. [4] Emelianov, Pavel. RSS controller based on process cgroups (v2)
    https://lore.kernel.org/r/461A3010.90403@sw.ru
-5. Emelianov, Pavel. RSS controller based on process cgroups (v3)
+.. [5] Emelianov, Pavel. RSS controller based on process cgroups (v3)
    https://lore.kernel.org/r/465D9739.8070209@openvz.org
 6. Menage, Paul. Control Groups v10, http://lwn.net/Articles/236032/
 7. Vaidyanathan, Srinivasan, Control Groups: Pagecache accounting and control
    subsystem (v3), http://lwn.net/Articles/235534/
@@ -974,7 +993,8 @@ References
    https://lore.kernel.org/r/464D267A.50107@linux.vnet.ibm.com
 10. Singh, Balbir. Memory controller v6 test results,
     https://lore.kernel.org/r/20070819094658.654.84837.sendpatchset@balbir-laptop
-11. Singh, Balbir. Memory controller introduction (v6),
-    https://lore.kernel.org/r/20070817084228.26003.12568.sendpatchset@balbir-laptop
-12. Corbet, Jonathan, Controlling memory use in cgroups,
-    http://lwn.net/Articles/243795/
+
+.. [11] Singh, Balbir. Memory controller introduction (v6),
+   https://lore.kernel.org/r/20070817084228.26003.12568.sendpatchset@balbir-laptop
+.. [12] Corbet, Jonathan, Controlling memory use in cgroups,
+   http://lwn.net/Articles/243795/
@@ -1271,7 +1271,7 @@ static int update_flag(cpuset_flagbits_t bit, struct cpuset *cs,
 		       int turning_on);
 /**
  * update_parent_subparts_cpumask - update subparts_cpus mask of parent cpuset
- * @cpuset: The cpuset that requests change in partition root state
+ * @cs: The cpuset that requests change in partition root state
  * @cmd: Partition root state change command
  * @newmask: Optional new cpumask for partcmd_update
  * @tmp: Temporary addmask and delmask
@@ -3286,8 +3286,6 @@ struct cgroup_subsys cpuset_cgrp_subsys = {
 int __init cpuset_init(void)
 {
-	BUG_ON(percpu_init_rwsem(&cpuset_rwsem));
-
 	BUG_ON(!alloc_cpumask_var(&top_cpuset.cpus_allowed, GFP_KERNEL));
 	BUG_ON(!alloc_cpumask_var(&top_cpuset.effective_cpus, GFP_KERNEL));
 	BUG_ON(!zalloc_cpumask_var(&top_cpuset.subparts_cpus, GFP_KERNEL));
@@ -3907,8 +3905,7 @@ bool __cpuset_node_allowed(int node, gfp_t gfp_mask)
 }
 
 /**
- * cpuset_mem_spread_node() - On which node to begin search for a file page
- * cpuset_slab_spread_node() - On which node to begin search for a slab page
+ * cpuset_spread_node() - On which node to begin search for a page
  *
  * If a task is marked PF_SPREAD_PAGE or PF_SPREAD_SLAB (as for
  * tasks in a cpuset with is_spread_page or is_spread_slab set),
@@ -3932,12 +3929,14 @@ bool __cpuset_node_allowed(int node, gfp_t gfp_mask)
 * is passed an offline node, it will fall back to the local node.
 * See kmem_cache_alloc_node().
 */
 static int cpuset_spread_node(int *rotor)
 {
	return *rotor = next_node_in(*rotor, current->mems_allowed);
 }
 
+/**
+ * cpuset_mem_spread_node() - On which node to begin search for a file page
+ */
 int cpuset_mem_spread_node(void)
 {
	if (current->cpuset_mem_spread_rotor == NUMA_NO_NODE)
@@ -3947,6 +3946,9 @@ int cpuset_mem_spread_node(void)
	return cpuset_spread_node(&current->cpuset_mem_spread_rotor);
 }
 
+/**
+ * cpuset_slab_spread_node() - On which node to begin search for a slab page
+ */
 int cpuset_slab_spread_node(void)
 {
	if (current->cpuset_slab_spread_rotor == NUMA_NO_NODE)

@@ -3955,7 +3957,6 @@ int cpuset_slab_spread_node(void)
	return cpuset_spread_node(&current->cpuset_slab_spread_rotor);
 }
-EXPORT_SYMBOL_GPL(cpuset_mem_spread_node);
 
 /**