- 09 Jun, 2023 40 commits
-
-
Liam R. Howlett authored
Since the maple tree is inclusive in range, ensure that a range of 1 (min = max) works for searching for a gap in either direction, and make sure the size is at least 1 but not larger than the delta between min and max. This commit also updates the testing. Unfortunately there isn't a way to safely update the tests and code without a test failure. Link: https://lkml.kernel.org/r/20230518145544.1722059-26-Liam.Howlett@oracle.comSigned-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Suggested-by: Peng Zhang <zhangpeng.00@bytedance.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Liam R. Howlett authored
Keep a reference to the node when possible with mas_prev(). This will avoid re-walking the tree. In keeping a reference to the node, keep the last/index accurate to the range being referenced. This means the limit may be within the range, but the range may extend outside of the limit. Also fix the single entry tree to respect the range (of 0), or set the node to MAS_NONE in the case of shifting beyond 0. Link: https://lkml.kernel.org/r/20230518145544.1722059-25-Liam.Howlett@oracle.comSigned-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Liam R. Howlett authored
Clean up the mas_next() call to try and keep a node reference when possible. This will avoid re-walking the tree in most cases. Also clean up the single entry tree handling to ensure index/last are consistent with what one would expect. (returning NULL with limit of 1-oo). Link: https://lkml.kernel.org/r/20230518145544.1722059-24-Liam.Howlett@oracle.comSigned-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Liam R. Howlett authored
The maple tree iterator clean up is incompatible with the way do_vmi_align_munmap() expects it to behave. Update the expected behaviour to map now since the change will work currently. Link: https://lkml.kernel.org/r/20230518145544.1722059-23-Liam.Howlett@oracle.comSigned-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Liam R. Howlett authored
When a dead node is detected, the depth has already been set to 1 so reset it to 0. Link: https://lkml.kernel.org/r/20230518145544.1722059-22-Liam.Howlett@oracle.comSigned-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reviewed-by: Peng Zhang <zhangpeng.00@bytedance.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Liam R. Howlett authored
mas_destroy currently checks if mas->node is MAS_START prior to calling mas_start(), but this is unnecessary as mas_start() will do nothing if the node is anything but MAS_START. Link: https://lkml.kernel.org/r/20230518145544.1722059-21-Liam.Howlett@oracle.comSigned-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reviewed-by: Peng Zhang <zhangpeng.00@bytedance.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Liam R. Howlett authored
The test functions are not needed after the module is removed, so mark them as such. Add __exit to the module removal function. Some other variables have been marked as const static as well. Link: https://lkml.kernel.org/r/20230518145544.1722059-20-Liam.Howlett@oracle.comSigned-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Suggested-by: Andrew Morton <akpm@linux-foundation.org> Cc: David Binderman <dcb314@hotmail.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Liam R. Howlett authored
MAS_WARN_ON() will provide more information on the maple state and can be more useful for debugging. Use this version of WARN_ON() in the debugging code when storing to the tree. Update the printk to a pr_warn(), but this will only be printed when maple tree debug is enabled anyways. Making all print statements into one will keep them together on a busy terminal. Link: https://lkml.kernel.org/r/20230518145544.1722059-19-Liam.Howlett@oracle.comSigned-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: David Binderman <dcb314@hotmail.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Liam R. Howlett authored
Use the vma iterator in the validation code and combine the code to check the maple tree into the main validate_mm() function. Introduce a new function vma_iter_dump_tree() to dump the maple tree in hex layout. Replace all calls to validate_mm_mt() with validate_mm(). [Liam.Howlett@oracle.com: update validate_mm() to use vma iterator CONFIG flag] Link: https://lkml.kernel.org/r/20230606183538.588190-1-Liam.Howlett@oracle.com Link: https://lkml.kernel.org/r/20230518145544.1722059-18-Liam.Howlett@oracle.comSigned-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Liam R. Howlett authored
The test code is less useful without debug, but can still do general validations. Define mt_dump(), mas_dump() and mas_wr_dump() as a noop if debug is not enabled and document it in the test module information that more information can be obtained with another kernel config option. MT_BUG_ON() will report a failures without tree dumps, and the output will be less useful. Link: https://lkml.kernel.org/r/20230518145544.1722059-17-Liam.Howlett@oracle.comSigned-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Liam R. Howlett authored
Rename mte_pivots() to mas_pivots() and pass through the ma_state to set the error code to -EIO when the offset is out of range for the node type. Change the WARN_ON() to MAS_WARN_ON() to log the maple state. Link: https://lkml.kernel.org/r/20230518145544.1722059-16-Liam.Howlett@oracle.comSigned-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reviewed-by: Peng Zhang <zhangpeng.00@bytedance.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Liam R. Howlett authored
Replace the call to BUG_ON() in mas_meta_gap() with calls before the function call MAS_BUG_ON() to get more information on error condition. Link: https://lkml.kernel.org/r/20230518145544.1722059-15-Liam.Howlett@oracle.comSigned-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Liam R. Howlett authored
mas_store_prealloc() should never fail, but if it does due to internal tree issues then get as much debug information as possible prior to crashing the kernel. Link: https://lkml.kernel.org/r/20230518145544.1722059-14-Liam.Howlett@oracle.comSigned-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Liam R. Howlett authored
In the even of trying to remove data from a leaf node by use of mas_topiary_range(), log the maple state. Link: https://lkml.kernel.org/r/20230518145544.1722059-13-Liam.Howlett@oracle.comSigned-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Liam R. Howlett authored
Use MAS_BUG_ON() instead of MT_BUG_ON() to get the maple state information. In the unlikely event of a tree height of > 31, try to increase the probability of useful information being logged. Link: https://lkml.kernel.org/r/20230518145544.1722059-12-Liam.Howlett@oracle.comSigned-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Liam R. Howlett authored
Use MAS_BUG_ON() to dump the maple state and tree in the unlikely event of an issue. Link: https://lkml.kernel.org/r/20230518145544.1722059-11-Liam.Howlett@oracle.comSigned-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Liam R. Howlett authored
Using MT_WARN_ON() allows for the removal of if statements before logging. Using MAS_WARN_ON() will provide more information when issues are encountered. Link: https://lkml.kernel.org/r/20230518145544.1722059-10-Liam.Howlett@oracle.comSigned-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Liam R. Howlett authored
If RCU is enabled and the tree isn't locked, just warn the user and avoid crashing the kernel. Link: https://lkml.kernel.org/r/20230518145544.1722059-9-Liam.Howlett@oracle.comSigned-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Liam R. Howlett authored
Use MT_BUG_ON() to get more information when running with MAPLE_TREE_DEBUG enabled. Link: https://lkml.kernel.org/r/20230518145544.1722059-8-Liam.Howlett@oracle.comSigned-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Liam R. Howlett authored
Add debug macros to dump the maple state and/or the tree for both warning and bug_on calls. Link: https://lkml.kernel.org/r/20230518145544.1722059-7-Liam.Howlett@oracle.comSigned-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Liam R. Howlett authored
Allow different formatting strings to be used when dumping the tree. Currently supports hex and decimal. Link: https://lkml.kernel.org/r/20230518145544.1722059-6-Liam.Howlett@oracle.comSigned-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Liam R. Howlett authored
Convert loop type to ensure all variables are set to make the compiler happy, and use the mas_is_none() function instead of explicitly checking the node in the maple state. Link: https://lkml.kernel.org/r/20230518145544.1722059-5-Liam.Howlett@oracle.comSigned-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Liam R. Howlett authored
The maple tree node limits are implied by the parent. When walking up the tree, the limit may not be known until a slot that does not have implied limits are encountered. However, if the node is the left-most or right-most node, the walking up to find that limit can be skipped. This commit also fixes the debug/testing code that was not setting the limit on walking down the tree as that optimization is not compatible with this change. Link: https://lkml.kernel.org/r/20230518145544.1722059-4-Liam.Howlett@oracle.comSigned-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reviewed-by: Peng Zhang <zhangpeng.00@bytedance.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Liam R. Howlett authored
mas_parent_enum() is a simple wrapper for mte_parent_enum() which is only called from that wrapper. Remove the wrapper and inline mte_parent_enum() into mas_parent_enum(). At the same time, clean up the bit masking of the root pointer since it cannot be set by the time the bit masking occurs. Change the check on the root bit to a WARN_ON(), and fix the verification code to not trigger the WARN_ON() before checking if the node is root. Align the name to mas_parent_type() since mas_node_type() exists already. Link: https://lkml.kernel.org/r/20230518145544.1722059-3-Liam.Howlett@oracle.comSigned-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reported-by: Wei Yang <richard.weiyang@gmail.com> Reviewed-by: Wei Yang <richard.weiyang@gmail.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Liam R. Howlett authored
Patch series "Maple tree mas_{next,prev}_range() and cleanup", v4. This patchset contains a number of clean ups to the code to make it more usable (next/prev range), the addition of debug output formatting, the addition of printing the maple state information in the WARN_ON/BUG_ON code. There is also work done here to keep nodes active during iterations to reduce the necessity of re-walking the tree. Finally, there is a new interface added to move to the next or previous range in the tree, even if it is empty. The organisation of the patches is as follows: 0001-0004 - Small clean ups 0005-0018 - Additional debug options and WARN_ON/BUG_ON changes 0019 - Test module __init and __exit addition 0020-0021 - More functional clean ups 0022-0026 - Changes to keep nodes active 0027-0034 - Add new mas_{prev,next}_range() 0035 - Use new mas_{prev,next}_range() in mmap_region() This patch (of 35): Static analyser of the maple tree code noticed that the split variable is being used to dereference into an array prior to checking the variable itself. Fix this issue by changing the order of the statement to check the variable first. Link: https://lkml.kernel.org/r/20230518145544.1722059-1-Liam.Howlett@oracle.com Link: https://lkml.kernel.org/r/20230518145544.1722059-2-Liam.Howlett@oracle.comSigned-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reported-by: David Binderman <dcb314@hotmail.com> Reviewed-by: Peng Zhang<zhangpeng.00@bytedance.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Matthew Wilcox (Oracle) authored
Almost all of the callers & implementors of migrate_pages() were already converted to use folios. compaction_alloc() & compaction_free() are trivial to convert a part of this patch and not worth splitting out. Link: https://lkml.kernel.org/r/20230513001101.276972-1-willy@infradead.orgSigned-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: "Huang, Ying" <ying.huang@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Lorenzo Stoakes authored
Now we have eliminated all callers to GUP APIs which use the vmas parameter, eliminate it altogether. This eliminates a class of bugs where vmas might have been kept around longer than the mmap_lock and thus we need not be concerned about locks being dropped during this operation leaving behind dangling pointers. This simplifies the GUP API and makes it considerably clearer as to its purpose - follow flags are applied and if pinning, an array of pages is returned. Link: https://lkml.kernel.org/r/6811b4b2b4b3baf3dd07f422bb18853bb2cd09fb.1684350871.git.lstoakes@gmail.comSigned-off-by: Lorenzo Stoakes <lstoakes@gmail.com> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian König <christian.koenig@amd.com> Cc: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Janosch Frank <frankja@linux.ibm.com> Cc: Jarkko Sakkinen <jarkko@kernel.org> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Sakari Ailus <sakari.ailus@linux.intel.com> Cc: Sean Christopherson <seanjc@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Lorenzo Stoakes authored
We are now in a position where no caller of pin_user_pages() requires the vmas parameter at all, so eliminate this parameter from the function and all callers. This clears the way to removing the vmas parameter from GUP altogether. Link: https://lkml.kernel.org/r/195a99ae949c9f5cb589d2222b736ced96ec199a.1684350871.git.lstoakes@gmail.comSigned-off-by: Lorenzo Stoakes <lstoakes@gmail.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> [qib] Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com> [drivers/media] Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian König <christian.koenig@amd.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Janosch Frank <frankja@linux.ibm.com> Cc: Jarkko Sakkinen <jarkko@kernel.org> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Sean Christopherson <seanjc@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Lorenzo Stoakes authored
Now that the GUP explicitly checks FOLL_LONGTERM pin_user_pages() for broken file-backed mappings in "mm/gup: disallow FOLL_LONGTERM GUP-nonfast writing to file-backed mappings", there is no need to explicitly check VMAs for this condition, so simply remove this logic from io_uring altogether. Link: https://lkml.kernel.org/r/e4a4efbda9cd12df71e0ed81796dc630231a1ef2.1684350871.git.lstoakes@gmail.comSigned-off-by: Lorenzo Stoakes <lstoakes@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian König <christian.koenig@amd.com> Cc: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Janosch Frank <frankja@linux.ibm.com> Cc: Jarkko Sakkinen <jarkko@kernel.org> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Sakari Ailus <sakari.ailus@linux.intel.com> Cc: Sean Christopherson <seanjc@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Lorenzo Stoakes authored
The only instances of get_user_pages_remote() invocations which used the vmas parameter were for a single page which can instead simply look up the VMA directly. In particular:- - __update_ref_ctr() looked up the VMA but did nothing with it so we simply remove it. - __access_remote_vm() was already using vma_lookup() when the original lookup failed so by doing the lookup directly this also de-duplicates the code. We are able to perform these VMA operations as we already hold the mmap_lock in order to be able to call get_user_pages_remote(). As part of this work we add get_user_page_vma_remote() which abstracts the VMA lookup, error handling and decrementing the page reference count should the VMA lookup fail. This forms part of a broader set of patches intended to eliminate the vmas parameter altogether. [akpm@linux-foundation.org: avoid passing NULL to PTR_ERR] Link: https://lkml.kernel.org/r/d20128c849ecdbf4dd01cc828fcec32127ed939a.1684350871.git.lstoakes@gmail.comSigned-off-by: Lorenzo Stoakes <lstoakes@gmail.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> (for arm64) Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Janosch Frank <frankja@linux.ibm.com> (for s390) Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Christian König <christian.koenig@amd.com> Cc: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jarkko Sakkinen <jarkko@kernel.org> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Sakari Ailus <sakari.ailus@linux.intel.com> Cc: Sean Christopherson <seanjc@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Lorenzo Stoakes authored
No invocation of pin_user_pages_remote() uses the vmas parameter, so remove it. This forms part of a larger patch set eliminating the use of the vmas parameters altogether. Link: https://lkml.kernel.org/r/28f000beb81e45bf538a2aaa77c90f5482b67a32.1684350871.git.lstoakes@gmail.comSigned-off-by: Lorenzo Stoakes <lstoakes@gmail.com> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian König <christian.koenig@amd.com> Cc: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Janosch Frank <frankja@linux.ibm.com> Cc: Jarkko Sakkinen <jarkko@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Sakari Ailus <sakari.ailus@linux.intel.com> Cc: Sean Christopherson <seanjc@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Lorenzo Stoakes authored
Patch series "remove the vmas parameter from GUP APIs", v6. (pin_/get)_user_pages[_remote]() each provide an optional output parameter for an array of VMA objects associated with each page in the input range. These provide the means for VMAs to be returned, as long as mm->mmap_lock is never released during the GUP operation (i.e. the internal flag FOLL_UNLOCKABLE is not specified). In addition, these VMAs can only be accessed with the mmap_lock held and become invalidated the moment it is released. The vast majority of invocations do not use this functionality and of those that do, all but one case retrieve a single VMA to perform checks upon. It is not egregious in the single VMA cases to simply replace the operation with a vma_lookup(). In these cases we duplicate the (fast) lookup on a slow path already under the mmap_lock, abstracted to a new get_user_page_vma_remote() inline helper function which also performs error checking and reference count maintenance. The special case is io_uring, where io_pin_pages() specifically needs to assert that the VMAs underlying the range do not result in broken long-term GUP file-backed mappings. As GUP now internally asserts that FOLL_LONGTERM mappings are not file-backed in a broken fashion (i.e. requiring dirty tracking) - as implemented in "mm/gup: disallow FOLL_LONGTERM GUP-nonfast writing to file-backed mappings" - this logic is no longer required and so we can simply remove it altogether from io_uring. Eliminating the vmas parameter eliminates an entire class of danging pointer errors that might have occured should the lock have been incorrectly released. In addition, the API is simplified and now clearly expresses what it is intended for - applying the specified GUP flags and (if pinning) returning pinned pages. This change additionally opens the door to further potential improvements in GUP and the possible marrying of disparate code paths. I have run this series against gup_test with no issues. Thanks to Matthew Wilcox for suggesting this refactoring! This patch (of 6): No invocation of get_user_pages() use the vmas parameter, so remove it. The GUP API is confusing and caveated. Recent changes have done much to improve that, however there is more we can do. Exporting vmas is a prime target as the caller has to be extremely careful to preclude their use after the mmap_lock has expired or otherwise be left with dangling pointers. Removing the vmas parameter focuses the GUP functions upon their primary purpose - pinning (and outputting) pages as well as performing the actions implied by the input flags. This is part of a patch series aiming to remove the vmas parameter altogether. Link: https://lkml.kernel.org/r/cover.1684350871.git.lstoakes@gmail.com Link: https://lkml.kernel.org/r/589e0c64794668ffc799651e8d85e703262b1e9d.1684350871.git.lstoakes@gmail.comSigned-off-by: Lorenzo Stoakes <lstoakes@gmail.com> Suggested-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Acked-by: Christian König <christian.koenig@amd.com> (for radeon parts) Acked-by: Jarkko Sakkinen <jarkko@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Sean Christopherson <seanjc@google.com> (KVM) Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Cc: Janosch Frank <frankja@linux.ibm.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Sidhartha Kumar authored
All users of hugetlb_page_subpool() have been converted to use the folio equivalent. This function can be safely removed. Link: https://lkml.kernel.org/r/20230516225205.1429196-1-sidhartha.kumar@oracle.comSigned-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Kefeng Wang authored
The is_check_pages_enabled() only used in page_alloc.c, move it into page_alloc.c, also use it in free_tail_page_prepare(). Link: https://lkml.kernel.org/r/20230516063821.121844-14-wangkefeng.wang@huawei.comSigned-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: David Hildenbrand <david@redhat.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Iurii Zaikin <yzaikin@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Len Brown <len.brown@intel.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Oscar Salvador <osalvador@suse.de> Cc: Pavel Machek <pavel@ucw.cz> Cc: Rafael J. Wysocki <rafael@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Kefeng Wang authored
This moves all page alloc related sysctls to its own file, as part of the kernel/sysctl.c spring cleaning, also move some functions declarations from mm.h into internal.h. Link: https://lkml.kernel.org/r/20230516063821.121844-13-wangkefeng.wang@huawei.comSigned-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: David Hildenbrand <david@redhat.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Iurii Zaikin <yzaikin@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Len Brown <len.brown@intel.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Oscar Salvador <osalvador@suse.de> Cc: Pavel Machek <pavel@ucw.cz> Cc: Rafael J. Wysocki <rafael@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Kefeng Wang authored
Use gfp_has_io_fs() instead of open-code. Link: https://lkml.kernel.org/r/20230516063821.121844-12-wangkefeng.wang@huawei.comSigned-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: David Hildenbrand <david@redhat.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Iurii Zaikin <yzaikin@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Len Brown <len.brown@intel.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Oscar Salvador <osalvador@suse.de> Cc: Pavel Machek <pavel@ucw.cz> Cc: Rafael J. Wysocki <rafael@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Kefeng Wang authored
pm_restrict_gfp_mask()/pm_restore_gfp_mask() only used in power, let's move them out of page_alloc.c. Adding a general gfp_has_io_fs() function which return true if gfp with both __GFP_IO and __GFP_FS flags, then use it inside of pm_suspended_storage(), also the pm_suspended_storage() is moved into suspend.h. Link: https://lkml.kernel.org/r/20230516063821.121844-11-wangkefeng.wang@huawei.comSigned-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: David Hildenbrand <david@redhat.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Iurii Zaikin <yzaikin@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Len Brown <len.brown@intel.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Oscar Salvador <osalvador@suse.de> Cc: Pavel Machek <pavel@ucw.cz> Cc: Rafael J. Wysocki <rafael@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Kefeng Wang authored
The mark_free_page() is only used in kernel/power/snapshot.c, move it out to reduce a bit of page_alloc.c Link: https://lkml.kernel.org/r/20230516063821.121844-10-wangkefeng.wang@huawei.comSigned-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: David Hildenbrand <david@redhat.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Iurii Zaikin <yzaikin@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Len Brown <len.brown@intel.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Oscar Salvador <osalvador@suse.de> Cc: Pavel Machek <pavel@ucw.cz> Cc: Rafael J. Wysocki <rafael@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Kefeng Wang authored
Move DEBUG_PAGEALLOC related functions into a single file to reduce a bit of page_alloc.c. Link: https://lkml.kernel.org/r/20230516063821.121844-9-wangkefeng.wang@huawei.comSigned-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: David Hildenbrand <david@redhat.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Iurii Zaikin <yzaikin@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Len Brown <len.brown@intel.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Oscar Salvador <osalvador@suse.de> Cc: Pavel Machek <pavel@ucw.cz> Cc: Rafael J. Wysocki <rafael@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Kefeng Wang authored
... to a single file to reduce a bit of page_alloc.c. Link: https://lkml.kernel.org/r/20230516063821.121844-8-wangkefeng.wang@huawei.comSigned-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: David Hildenbrand <david@redhat.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Iurii Zaikin <yzaikin@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Len Brown <len.brown@intel.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Oscar Salvador <osalvador@suse.de> Cc: Pavel Machek <pavel@ucw.cz> Cc: Rafael J. Wysocki <rafael@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-