Commit 44195d1e authored by Jiaqi Yan's avatar Jiaqi Yan Committed by Andrew Morton

docs: mm: add enable_soft_offline sysctl

Add the documentation for soft offline behaviors / costs, and what the new
enable_soft_offline sysctl is for.

[jiaqiyan@google.com: fix kerneldoc warnings]
  Link: https://lkml.kernel.org/r/CACw3F52=GxTCDw-PqFh3-GDM-fo3GbhGdu0hedxYXOTT4TQSTg@mail.gmail.com
[jiaqiyan@google.com: there are more blank lines needed]
  Link: https://lkml.kernel.org/r/CACw3F52_obAB742XeDRNun4BHBYtrxtbvp5NkUincXdaob0j1g@mail.gmail.com
Link: https://lkml.kernel.org/r/20240626050818.2277273-5-jiaqiyan@google.comSigned-off-by: default avatarJiaqi Yan <jiaqiyan@google.com>
Acked-by: default avatarOscar Salvador <osalvador@suse.de>
Acked-by: default avatarMiaohe Lin <linmiaohe@huawei.com>
Acked-by: default avatarDavid Rientjes <rientjes@google.com>
Cc: Frank van der Linden <fvdl@google.com>
Cc: Jane Chu <jane.chu@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Lance Yang <ioworker0@gmail.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
parent 72ead83d
......@@ -36,6 +36,7 @@ Currently, these files are in /proc/sys/vm:
- dirtytime_expire_seconds
- dirty_writeback_centisecs
- drop_caches
- enable_soft_offline
- extfrag_threshold
- highmem_is_dirtyable
- hugetlb_shm_group
......@@ -267,6 +268,43 @@ used::
These are informational only. They do not mean that anything is wrong
with your system. To disable them, echo 4 (bit 2) into drop_caches.
enable_soft_offline
===================
Correctable memory errors are very common on servers. Soft-offline is kernel's
solution for memory pages having (excessive) corrected memory errors.
For different types of page, soft-offline has different behaviors / costs.
- For a raw error page, soft-offline migrates the in-use page's content to
a new raw page.
- For a page that is part of a transparent hugepage, soft-offline splits the
transparent hugepage into raw pages, then migrates only the raw error page.
As a result, user is transparently backed by 1 less hugepage, impacting
memory access performance.
- For a page that is part of a HugeTLB hugepage, soft-offline first migrates
the entire HugeTLB hugepage, during which a free hugepage will be consumed
as migration target. Then the original hugepage is dissolved into raw
pages without compensation, reducing the capacity of the HugeTLB pool by 1.
It is user's call to choose between reliability (staying away from fragile
physical memory) vs performance / capacity implications in transparent and
HugeTLB cases.
For all architectures, enable_soft_offline controls whether to soft offline
memory pages. When set to 1, kernel attempts to soft offline the pages
whenever it thinks needed. When set to 0, kernel returns EOPNOTSUPP to
the request to soft offline the pages. Its default value is 1.
It is worth mentioning that after setting enable_soft_offline to 0, the
following requests to soft offline pages will not be performed:
- Request to soft offline pages from RAS Correctable Errors Collector.
- On ARM, the request to soft offline pages from GHES driver.
- On PARISC, the request to soft offline pages from Page Deallocation Table.
extfrag_threshold
=================
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment