- 24 Feb, 2021 40 commits
-
-
Yanfei Xu authored
A gigantic page is a compound page whose order is greater than 1, so hpage_pincount must be available for it. Let's remove the redundant check for gigantic pages. Link: https://lkml.kernel.org/r/20210202112002.73170-1-yanfei.xu@windriver.com Signed-off-by: Yanfei Xu <yanfei.xu@windriver.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Miaohe Lin authored
Fix typos sasitfy to satisfy, reservtion to reservation, hugegpage to hugepage and uniprocesor to uniprocessor in comments. Link: https://lkml.kernel.org/r/20210128112028.64831-1-linmiaohe@huawei.com Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Reviewed-by: Souptick Joarder <jrdr.linux@gmail.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Joao Martins authored
For a given hugepage backing a VA, there's a rather inefficient loop which is solely responsible for storing subpages in the GUP @pages/@vmas array. For each subpage we check whether it's within range or size of @pages and keep incrementing @pfn_offset and a couple of other variables per subpage iteration.

Simplify this logic and minimize the cost of each iteration to just storing the output page/vma. Instead of incrementing the number of @refs iteratively, pre-calculate @refs and use a tight loop only for storing the pinned subpages/vmas.

Additionally, retain the existing behaviour of using mem_map_offset() when recording the subpages for configurations that don't have a contiguous mem_map.

Pinning consequently improves, bringing us close to {pin,get}_user_pages_fast:

  - 16G with 1G huge page size
  gup_test -f /mnt/huge/file -m 16384 -r 30 -L -S -n 512 -w

  PIN_LONGTERM_BENCHMARK: ~12.8k us -> ~5.8k us
  PIN_FAST_BENCHMARK: ~3.7k us

Link: https://lkml.kernel.org/r/20210128182632.24562-3-joao.m.martins@oracle.com Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
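The restructuring described in this commit can be illustrated with a small userspace sketch. This is not the kernel code: `struct page_stub`, `count_refs()` and `record_subpages()` are hypothetical names, and the subpages are modelled as a plain contiguous array (so the mem_map_offset() case is not covered).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; not the kernel type. */
struct page_stub { int id; };

/*
 * Pre-calculate how many subpages to record: bounded by what the caller
 * still wants (remainder) and by what is left in this huge page.
 */
static size_t count_refs(size_t pages_per_huge_page, size_t pfn_offset,
                         size_t remainder)
{
    size_t left;

    assert(pfn_offset < pages_per_huge_page);
    left = pages_per_huge_page - pfn_offset;
    return remainder < left ? remainder : left;
}

/* Tight loop: it only stores output pointers, with no per-iteration checks. */
static void record_subpages(struct page_stub *first, size_t refs,
                            struct page_stub **pages)
{
    size_t i;

    for (i = 0; i < refs; i++)
        pages[i] = first + i;
}
```

All bounds checking happens once in `count_refs()`, so the loop body shrinks to a single store per subpage.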
-
Joao Martins authored
Patch series "mm/hugetlb: follow_hugetlb_page() improvements", v2.

While looking at ZONE_DEVICE struct page reuse, particularly the last patch[0], I found two possible improvements for follow_hugetlb_page(), which is solely used for get_user_pages()/pin_user_pages(). The first patch batches page refcount updates while the second tidies up storing the subpages/vmas. Both together bring the cost of the slow variant of gup() from ~87.6k usecs to ~5.8k usecs. libhugetlbfs tests seem to pass, as well as gup_test benchmarks with hugetlbfs vmas.

This patch (of 2):

follow_hugetlb_page(), once it locks the pmd/pud, checks all its N subpages in a huge page and grabs a reference for each one. Similar to gup-fast, have follow_hugetlb_page() grab the head page refcount only after counting all its subpages that are part of the just-faulted huge page. Consequently we reduce the number of atomics necessary to pin said huge page, which improves non-fast gup() considerably:

  - 16G with 1G huge page size
  gup_test -f /mnt/huge/file -m 16384 -r 10 -L -S -n 512 -w

  PIN_LONGTERM_BENCHMARK: ~87.6k us -> ~12.8k us

Link: https://lkml.kernel.org/r/20210128182632.24562-1-joao.m.martins@oracle.com Link: https://lkml.kernel.org/r/20210128182632.24562-2-joao.m.martins@oracle.com Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
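The effect of batching the refcount update can be sketched in userspace with C11 atomics. The names below (`struct head_stub`, `pin_per_subpage()`, `pin_batched()`) are hypothetical and only model the idea of replacing N atomic increments with one atomic add; they are not the kernel's helpers.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical model of a compound page's head refcount. */
struct head_stub { atomic_int refcount; };

/* Before: one atomic operation per subpage. Returns the atomics issued. */
static int pin_per_subpage(struct head_stub *head, int nr_subpages)
{
    int i, atomics = 0;

    for (i = 0; i < nr_subpages; i++) {
        atomic_fetch_add(&head->refcount, 1);
        atomics++;
    }
    return atomics;
}

/* After: count first, then take all references with a single atomic. */
static int pin_batched(struct head_stub *head, int nr_subpages)
{
    assert(nr_subpages > 0);
    atomic_fetch_add(&head->refcount, nr_subpages);
    return 1;
}
```

Both variants leave the refcount at the same value; only the number of atomic operations differs.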
-
Jiapeng Zhong authored
Fix the following coccicheck warnings: mm/hugetlb.c:3372:20-22: WARNING !A || A && B is equivalent to !A || B. Link: https://lkml.kernel.org/r/1611643468-52233-1-git-send-email-abaci-bugfix@linux.alibaba.com Signed-off-by: Jiapeng Zhong <abaci-bugfix@linux.alibaba.com> Reported-by: Abaci Robot <abaci@linux.alibaba.com> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
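The equivalence that coccicheck reports can be checked exhaustively, since there are only four input combinations. The function names in this sketch are illustrative and do not appear in mm/hugetlb.c.

```c
#include <assert.h>
#include <stdbool.h>

/* The form coccicheck warns about. */
static bool verbose_form(bool a, bool b)
{
    return !a || (a && b);
}

/* The simplified form suggested by the warning. */
static bool simplified_form(bool a, bool b)
{
    return !a || b;
}

/* Compare both forms over all four input combinations. */
static void check_forms_equivalent(void)
{
    int a, b;

    for (a = 0; a <= 1; a++)
        for (b = 0; b <= 1; b++)
            assert(verbose_form(a, b) == simplified_form(a, b));
}
```

The `a &&` part is redundant because the right-hand side of `||` is only evaluated when `!a` is false, that is, when `a` is already true.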
-
Miaohe Lin authored
If a hugetlbfs filesystem is created with the min_size option and without the size option, used_hpages is always 0. This might lead to releasing the subpool prematurely, because it indicates that no pages are in use when there might be some. To fix this issue, we should check used_hpages == 0 iff max_hpages accounting is enabled. As max_hpages accounting should be enabled in the most common case, this is not worth a Cc stable. [mike.kravetz@oracle.com: new changelog] Link: https://lkml.kernel.org/r/20210126115510.53374-1-linmiaohe@huawei.com Signed-off-by: Hongxiang Lou <louhongxiang@huawei.com> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Miaohe Lin authored
Since commit a5516438 ("hugetlb: modular state for hugetlb page size"), we can use huge_page_order to access hstate->order and pages_per_huge_page to fetch the pages per huge page. But gather_bootmem_prealloc() forgot to use it. Link: https://lkml.kernel.org/r/20210114114435.40075-1-linmiaohe@huawei.comSigned-off-by: Miaohe Lin <linmiaohe@huawei.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Miaohe Lin authored
When reservation accounting remains unchanged, hugetlb_acct_memory() does nothing except take and release hugetlb_lock. We should avoid this unnecessary hugetlb_lock lock/unlock cycle, which happens on 'most' hugetlb munmap operations, by checking delta against 0 at the beginning of hugetlb_acct_memory(). Link: https://lkml.kernel.org/r/20210115092013.61012-1-linmiaohe@huawei.com Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
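A minimal userspace sketch of the early return follows. The lock is replaced by a counter so the saved lock/unlock cycle is observable; `stub_lock()`, `stub_unlock()`, `accounted_pages` and `acct_memory_sketch()` are hypothetical stand-ins, not the kernel's hugetlb_acct_memory().

```c
#include <assert.h>

/* Hypothetical stand-ins for hugetlb_lock and the accounted total. */
static int lock_cycles;       /* counts lock acquisitions */
static long accounted_pages;

static void stub_lock(void)   { lock_cycles++; }
static void stub_unlock(void) { }

/* Returns 0 on success; mirrors only the shape of the described check. */
static int acct_memory_sketch(long delta)
{
    if (!delta)
        return 0;   /* nothing to account: skip the lock entirely */

    stub_lock();
    accounted_pages += delta;
    assert(accounted_pages >= 0);
    stub_unlock();
    return 0;
}
```

A call with `delta == 0` leaves `lock_cycles` untouched, which is the saving the commit message describes.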
-
Li Xinhai authored
The current code would unnecessarily expand the address range. Consider one example: with (start, end) = (1G-2M, 3G+2M) and (vm_start, vm_end) = (1G-4M, 3G+4M), the expected adjustment is to keep (1G-2M, 3G+2M) without expanding it, but the current result is (1G-4M, 3G+4M). In fact, the ranges (1G-4M, 1G) and (3G, 3G+4M) would never be involved in pmd sharing.

After this patch, we check that the vma spans at least one PUD-aligned size and that the (start, end) range overlaps the aligned range of the vma. In the above example the aligned vma range is (1G, 3G), so if the (start, end) range is within (1G-4M, 1G) or within (3G, 3G+4M), neither start nor end is adjusted. Otherwise, we have a chance to adjust start downwards or end upwards without exceeding (vm_start, vm_end).

Mike:
: The 'adjusted range' is used for calls to mmu notifiers and cache(tlb)
: flushing. Since the current code unnecessarily expands the range in some
: cases, more entries than necessary would be flushed. This would/could
: result in performance degradation. However, this is highly dependent on
: the user runtime. Is there a combination of vma layout and calls to
: actually hit this issue? If the issue is hit, will those entries
: unnecessarily flushed be used again and need to be unnecessarily reloaded?

Link: https://lkml.kernel.org/r/20210104081631.2921415-1-lixinhai.lxh@gmail.com Fixes: 75802ca6 ("mm/hugetlb: fix calculation of adjust_range_if_pmd_sharing_possible") Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com> Suggested-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Cc: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
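The worked example in this commit message can be reproduced with a userspace sketch of the described check. It assumes a 1G PUD size, as in the example, and omits any other conditions the real function may test; `adjust_range_sketch()` and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_2M  (UINT64_C(2) << 20)
#define SZ_1G  (UINT64_C(1) << 30)
#define PUD_SZ SZ_1G   /* assumption: 1G PUD, as in the example */

static uint64_t align_down(uint64_t x, uint64_t a) { return x & ~(a - 1); }
static uint64_t align_up(uint64_t x, uint64_t a)   { return align_down(x + a - 1, a); }

/* Sketch of the described adjustment, based only on the commit message. */
static void adjust_range_sketch(uint64_t vm_start, uint64_t vm_end,
                                uint64_t *start, uint64_t *end)
{
    uint64_t v_start = align_up(vm_start, PUD_SZ);
    uint64_t v_end = align_down(vm_end, PUD_SZ);

    assert(*start < *end);

    /* The vma must span an aligned PUD and the range must overlap it. */
    if (v_end <= v_start || *end <= v_start || *start >= v_end)
        return;

    /* Only ever widen inside the aligned vma range. */
    if (*start > v_start)
        *start = align_down(*start, PUD_SZ);
    if (*end < v_end)
        *end = align_up(*end, PUD_SZ);
}
```

With (vm_start, vm_end) = (1G-4M, 3G+4M) the aligned range is (1G, 3G), so (1G-2M, 3G+2M) is left as it is, while a range such as (1G+2M, 2G+2M) widens to (1G, 3G).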
-
Miaohe Lin authored
In hugetlb_sysfs_add_hstate(), we do kobject_put() on hstate_kobjs when we fail to create the sysfs group, but forget to set hstate_kobjs to NULL. Then in the hugetlb_register_node() error path, we may free it again via hugetlb_unregister_node(). Link: https://lkml.kernel.org/r/20210107123249.36964-1-linmiaohe@huawei.com Fixes: a3437870 ("hugetlb: new sysfs interface") Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: Muchun Song <smuchun@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
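The double-put pattern and its fix can be modelled in userspace. This sketch assumes the cleanup path skips slots that are NULL; `struct kobj_stub`, `stub_put()`, `add_failed_sketch()` and `unregister_sketch()` are hypothetical stand-ins for the kobject and the two hugetlb functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical refcounted object standing in for a kobject. */
struct kobj_stub { int refs; int freed; };

static void stub_put(struct kobj_stub *k)
{
    assert(k->refs > 0);   /* a second put on a released object trips here */
    if (--k->refs == 0)
        k->freed++;
}

/* Error path of the "add" step: drop the reference AND clear the slot. */
static void add_failed_sketch(struct kobj_stub **slot)
{
    stub_put(*slot);
    *slot = NULL;
}

/* Later cleanup: only puts slots that are still populated. */
static void unregister_sketch(struct kobj_stub **slot)
{
    if (*slot) {
        stub_put(*slot);
        *slot = NULL;
    }
}
```

Without the `*slot = NULL` in the error path, the later cleanup would see a stale pointer and put the object a second time.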
-
Bibo Mao authored
set_pmd_at() sets a pmd entry. If a tlb entry needs to be flushed, a pmdp_huge_clear_flush()-like function is called before set_pmd_at(), so it is not necessary to call flush_tlb_all() in this function.

In these scenarios, the tlb for the pmd range needs to be flushed:
  - a privilege downgrade such as wrprotect is set on the pmd entry
  - the pmd entry is cleared
  - there is an exception if set_pmd_at() is issued by dup_mmap(): since flush_tlb_mm() is called for the parent process, it is not necessary to flush the tlb in copy_huge_pmd().

Link: http://lkml.kernel.org/r/1592990792-1923-3-git-send-email-maobibo@loongson.cn Signed-off-by: Bibo Mao <maobibo@loongson.cn> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Daniel Silsby <dansilsby@gmail.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Paul Burton <paulburton@kernel.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Bibo Mao authored
When set_pmd_at() is called in do_huge_pmd_anonymous_page(), a new tlb entry can be added by software on the MIPS platform. Add update_mmu_cache_pmd() when the pmd entry is set; update_mmu_cache_pmd() is defined as empty except on arc/mips platforms, so this patch has no negative effect on platforms other than arc/mips. Link: http://lkml.kernel.org/r/1592990792-1923-2-git-send-email-maobibo@loongson.cn Signed-off-by: Bibo Mao <maobibo@loongson.cn> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Daniel Silsby <dansilsby@gmail.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Paul Burton <paulburton@kernel.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Aili Yao authored
When an uncorrected memory error is triggered by a process that accessed the faulty address, it is an Action Required case only for the current process that triggered it; for other processes sharing the same page, it is an Action Optional case. Usually killing the current process is sufficient, and other processes sharing the same page will be signaled when they actually touch the poisoned page. But there is another scenario: other processes sharing the same page may want to be signaled early, with PF_MCE_EARLY set. In this case, we should add them to the kill list and send BUS_MCEERR_AO to them. So in this patch, task_early_kill() checks the current process if force_early is set, and if the task is not current, the code falls back to find_early_kill_thread() to check whether there is a PF_MCE_EARLY process that cares about the error. In kill_proc(), BUS_MCEERR_AR is only sent to current; other processes in the kill list are signaled with BUS_MCEERR_AO. Link: https://lkml.kernel.org/r/20210122132424.313c8f5f.yaoaili@kingsoft.com Signed-off-by: Aili Yao <yaoaili@kingsoft.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Matthew Wilcox (Oracle) authored
The generated html will link to the definition of the gfp_t automatically once we define it. Move the one-paragraph overview of GFP flags from the documentation directory into gfp.h and pull gfp.h into the documentation. This generates warnings with clang (https://lkml.kernel.org/r/20210219195509.GA59987@24bbad8f3778), so use a #if 0 to hide it from the compiler for now. Link: https://lkml.kernel.org/r/20210215204909.3824509-1-willy@infradead.org Link: https://lkml.kernel.org/r/20210220003049.GZ2858050@casper.infradead.orgSigned-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Mike Rapoport <rppt@linux.ibm.com> Cc: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
David Hildenbrand authored
adjust_managed_page_count() as called by free_reserved_page() properly handles pages in a highmem zone, so we can reuse it for free_highmem_page(). We can now get rid of totalhigh_pages_inc() and simplify free_reserved_page(). Link: https://lkml.kernel.org/r/20210126182113.19892-3-david@redhat.comSigned-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org> Cc: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
David Hildenbrand authored
Patch series "mm: simplify free_highmem_page() and free_reserved_page()". Let's simplify and unify free_highmem_page() and free_reserved_page(). This patch (of 2): This function is never used and it is one of the last remaining user of __free_reserved_page(). Let's just drop it. Link: https://lkml.kernel.org/r/20210126182113.19892-1-david@redhat.com Link: https://lkml.kernel.org/r/20210126182113.19892-2-david@redhat.com Fixes: ffd29195 ("drivers/video/acornfb.c: remove dead code") Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org> Cc: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Baoquan He authored
Local variable 'zone_start_pfn' is not needed since there's only one call site in free_area_init_core(). Let's remove it and pass zone->zone_start_pfn directly to init_currently_empty_zone(). Link: https://lkml.kernel.org/r/20210122135956.5946-6-bhe@redhat.comSigned-off-by: Baoquan He <bhe@redhat.com> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Baoquan He authored
Parameter 'zone' has got needed information, let's remove other unnecessary parameters. Link: https://lkml.kernel.org/r/20210122135956.5946-5-bhe@redhat.comSigned-off-by: Baoquan He <bhe@redhat.com> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Baoquan He authored
As David suggested, simply passing 'struct zone *zone' is enough. We can get all needed information from 'struct zone*' easily. Link: https://lkml.kernel.org/r/20210122135956.5946-4-bhe@redhat.comSigned-off-by: Baoquan He <bhe@redhat.com> Suggested-by: David Hildenbrand <david@redhat.com> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Baoquan He authored
The current memmap_init_zone() only handles a memory region inside one zone, while memmap_init() actually does the memmap init of one zone. So rename both of them accordingly. Link: https://lkml.kernel.org/r/20210122135956.5946-3-bhe@redhat.com Signed-off-by: Baoquan He <bhe@redhat.com> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Baoquan He authored
Patch series "mm: clean up names and parameters of memmap_init_xxxx functions", v5.

This patchset corrects the inappropriate names of the memmap_init_xxx functions and simplifies the parameters of functions in the code flow. It also fixes a prototype warning reported by lkp.

This patch (of 5):

The kernel test robot, calling make with 'W=1', triggers a warning like the one below for the memmap_init_zone() function.

  mm/page_alloc.c:6259:23: warning: no previous prototype for 'memmap_init_zone' [-Wmissing-prototypes]
   6259 | void __meminit __weak memmap_init_zone(unsigned long size, int nid,
        |                       ^~~~~~~~~~~~~~~~

Fix it by adding the function declaration in include/linux/mm.h. Since memmap_init_zone() has a generic version with '__weak', the declaration in the ia64 header file can simply be removed.

Link: https://lkml.kernel.org/r/20210122135956.5946-1-bhe@redhat.com Link: https://lkml.kernel.org/r/20210122135956.5946-2-bhe@redhat.com Signed-off-by: Baoquan He <bhe@redhat.com> Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Andrey Konovalov authored
Don't run KASAN tests when it's disabled with kasan.mode=off to avoid corrupting kernel memory. Link: https://linux-review.googlesource.com/id/I6447af436a69a94bfc35477f6bf4e2122948355e Link: https://lkml.kernel.org/r/25bd4fb5cae7b421d806a1f33fb633edd313f0c7.1610733117.git.andreyknvl@google.comSigned-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Marco Elver <elver@google.com> Reviewed-by: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Branislav Rankov <Branislav.Rankov@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Evgenii Stepanov <eugenis@google.com> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Peter Collingbourne <pcc@google.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Andrey Konovalov authored
Add a test for kmem_cache_alloc/free_bulk to make sure there are no false-positives when these functions are used. Link: https://linux-review.googlesource.com/id/I2a8bf797aecf81baeac61380c567308f319e263d Link: https://lkml.kernel.org/r/418122ebe4600771ac81e9ca6eab6740cf8dcfa1.1610733117.git.andreyknvl@google.comSigned-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Marco Elver <elver@google.com> Reviewed-by: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Branislav Rankov <Branislav.Rankov@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Evgenii Stepanov <eugenis@google.com> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Peter Collingbourne <pcc@google.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Andrey Konovalov authored
The currently existing page allocator tests rely on kmalloc fallback with large sizes that is only present for SLUB. Add proper tests that use alloc/free_pages(). Link: https://linux-review.googlesource.com/id/Ia173d5a1b215fe6b2548d814ef0f4433cf983570 Link: https://lkml.kernel.org/r/a2648930e55ff75b8e700f2e0d905c2b55a67483.1610733117.git.andreyknvl@google.comSigned-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Marco Elver <elver@google.com> Reviewed-by: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Branislav Rankov <Branislav.Rankov@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Evgenii Stepanov <eugenis@google.com> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Peter Collingbourne <pcc@google.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Andrey Konovalov authored
The currently existing kasan_check_read/write() annotations are intended to be used for kernel modules that have KASAN compiler instrumentation disabled. Thus, they are only relevant for the software KASAN modes that rely on compiler instrumentation. However there's another use case for these annotations: ksize() checks that the object passed to it is indeed accessible before unpoisoning the whole object. This is currently done via __kasan_check_read(), which is compiled away for the hardware tag-based mode that doesn't rely on compiler instrumentation. This leads to KASAN missing detecting some memory corruptions. Provide another annotation called kasan_check_byte() that is available for all KASAN modes. As the implementation rename and reuse kasan_check_invalid_free(). Use this new annotation in ksize(). To avoid having ksize() as the top frame in the reported stack trace pass _RET_IP_ to __kasan_check_byte(). Also add a new ksize_uaf() test that checks that a use-after-free is detected via ksize() itself, and via plain accesses that happen later. Link: https://linux-review.googlesource.com/id/Iaabf771881d0f9ce1b969f2a62938e99d3308ec5 Link: https://lkml.kernel.org/r/f32ad74a60b28d8402482a38476f02bb7600f620.1610733117.git.andreyknvl@google.comSigned-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Marco Elver <elver@google.com> Reviewed-by: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Branislav Rankov <Branislav.Rankov@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Evgenii Stepanov <eugenis@google.com> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Peter Collingbourne <pcc@google.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Andrey Konovalov authored
Generic mm functions that call KASAN annotations that might report a bug pass _RET_IP_ to them as an argument. This allows KASAN to include the name of the function that called the mm function in its report's header. Now that KASAN has inline wrappers for all of its annotations, move _RET_IP_ to those wrappers to simplify annotation call sites. Link: https://linux-review.googlesource.com/id/I8fb3c06d49671305ee184175a39591bc26647a67 Link: https://lkml.kernel.org/r/5c1490eddf20b436b8c4eeea83fce47687d5e4a4.1610733117.git.andreyknvl@google.comSigned-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Marco Elver <elver@google.com> Reviewed-by: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Branislav Rankov <Branislav.Rankov@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Evgenii Stepanov <eugenis@google.com> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Peter Collingbourne <pcc@google.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Andrey Konovalov authored
Since the hardware tag-based KASAN mode might not have a redzone that comes after an allocated object (when kasan.mode=prod is enabled), the kasan_bitops_tags() test ends up corrupting the next object in memory. Change the test so it always accesses the redzone that lies within the allocated object's boundaries. Link: https://linux-review.googlesource.com/id/I67f51d1ee48f0a8d0fe2658c2a39e4879fe0832a Link: https://lkml.kernel.org/r/7d452ce4ae35bb1988d2c9244dfea56cf2cc9315.1610733117.git.andreyknvl@google.comSigned-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Marco Elver <elver@google.com> Reviewed-by: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Branislav Rankov <Branislav.Rankov@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Evgenii Stepanov <eugenis@google.com> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Peter Collingbourne <pcc@google.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Andrey Konovalov authored
In the kmalloc_uaf2() test, the pointers to the two allocated memory blocks might happen to be the same, and the test will fail. With the software tag-based mode, the probability of that is 1/254, so it's hard to observe the failure. For the hardware tag-based mode though, the probability is 1/14, which is quite noticeable. Allow up to 16 attempts at generating different tags for the tag-based modes. Link: https://linux-review.googlesource.com/id/Ibfa458ef2804ff465d8eb07434a300bf36388d55 Link: https://lkml.kernel.org/r/9cd5cf2f633dcbf55cab801cd26845d2b075cec7.1610733117.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Marco Elver <elver@google.com> Reviewed-by: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Branislav Rankov <Branislav.Rankov@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Evgenii Stepanov <eugenis@google.com> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Peter Collingbourne <pcc@google.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Andrey Konovalov authored
It might not be obvious to the compiler that the expression must be executed between writing and reading to fail_data. In this case, the compiler might reorder or optimize away some of the accesses, and the tests will fail. Add compiler barriers around the expression in KUNIT_EXPECT_KASAN_FAIL and use READ/WRITE_ONCE() for accessing fail_data fields. Link: https://linux-review.googlesource.com/id/I046079f48641a1d36fe627fc8827a9249102fd50 Link: https://lkml.kernel.org/r/6f11596f367d8ae8f71d800351e9a5d91eda19f6.1610733117.git.andreyknvl@google.comSigned-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Marco Elver <elver@google.com> Reviewed-by: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Branislav Rankov <Branislav.Rankov@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Evgenii Stepanov <eugenis@google.com> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Peter Collingbourne <pcc@google.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
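The ordering problem can be sketched in userspace with approximations of the kernel macros. The macros below rely on GCC/Clang extensions and are simplified stand-ins, not the kernel definitions; `fail_data_stub`, `faulting_expression_stub()` and `expect_fail_sketch()` are hypothetical and only mirror the write, execute, read sequence the commit message describes.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace approximations of the kernel macros (GCC/Clang extensions). */
#define barrier_sketch()        __asm__ __volatile__("" ::: "memory")
#define READ_ONCE_SKETCH(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE_SKETCH(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

/* Hypothetical stand-in for the test's fail_data. */
static struct { bool report_expected; bool report_found; } fail_data_stub;

/* Stand-in for an expression that makes the checker record a report. */
static void faulting_expression_stub(void)
{
    if (READ_ONCE_SKETCH(fail_data_stub.report_expected))
        WRITE_ONCE_SKETCH(fail_data_stub.report_found, true);
}

/* Returns true if running the expression produced a report. */
static bool expect_fail_sketch(void (*expression)(void))
{
    bool found;

    assert(expression);
    WRITE_ONCE_SKETCH(fail_data_stub.report_expected, true);
    WRITE_ONCE_SKETCH(fail_data_stub.report_found, false);
    barrier_sketch();   /* keep the expression after the writes */
    expression();
    barrier_sketch();   /* keep the read after the expression */
    found = READ_ONCE_SKETCH(fail_data_stub.report_found);
    WRITE_ONCE_SKETCH(fail_data_stub.report_expected, false);
    return found;
}
```

The barriers constrain compiler reordering only; the volatile accesses keep the loads and stores of the shared fields from being merged or dropped.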
-
Andrey Konovalov authored
Rename CONFIG_TEST_KASAN_MODULE to CONFIG_KASAN_MODULE_TEST. This naming is more consistent with the existing CONFIG_KASAN_KUNIT_TEST. Link: https://linux-review.googlesource.com/id/Id347dfa5fe8788b7a1a189863e039f409da0ae5f Link: https://lkml.kernel.org/r/f08250246683981bcf8a094fbba7c361995624d2.1610733117.git.andreyknvl@google.comSigned-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Marco Elver <elver@google.com> Reviewed-by: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Branislav Rankov <Branislav.Rankov@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Evgenii Stepanov <eugenis@google.com> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Peter Collingbourne <pcc@google.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Andrey Konovalov authored
On a high level, this patch allows running KUnit KASAN tests with the hardware tag-based KASAN mode. Internally, this change reenables tag checking at the end of each KASAN test that triggers a tag fault and leads to tag checking being disabled. Also simplify is_write calculation in report_tag_fault. With this patch KASAN tests are still failing for the hardware tag-based mode; fixes come in the next few patches. [andreyknvl@google.com: export HW_TAGS symbols for KUnit tests] Link: https://lkml.kernel.org/r/e7eeb252da408b08f0c81b950a55fb852f92000b.1613155970.git.andreyknvl@google.com Link: https://linux-review.googlesource.com/id/Id94dc9eccd33b23cda4950be408c27f879e474c8 Link: https://lkml.kernel.org/r/51b23112cf3fd62b8f8e9df81026fa2b15870501.1610733117.git.andreyknvl@google.comSigned-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Branislav Rankov <Branislav.Rankov@arm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Evgenii Stepanov <eugenis@google.com> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Marco Elver <elver@google.com> Cc: Peter Collingbourne <pcc@google.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Andrey Konovalov authored
Add 3 new tests for tag-based KASAN modes: 1. Check that match-all pointer tag is not assigned randomly. 2. Check that 0xff works as a match-all pointer tag. 3. Check that there are no match-all memory tags. Note, that test #3 causes a significant number (255) of KASAN reports to be printed during execution for the SW_TAGS mode. [arnd@arndb.de: export kasan_poison] Link: https://lkml.kernel.org/r/20210125112831.2156212-1-arnd@kernel.org [akpm@linux-foundation.org: s/EXPORT_SYMBOL_GPL/EXPORT_SYMBOL/, per Andrey] Link: https://linux-review.googlesource.com/id/I78f1375efafa162b37f3abcb2c5bc2f3955dfd8e Link: https://lkml.kernel.org/r/da841a5408e2204bf25f3b23f70540a65844e8a4.1610733117.git.andreyknvl@google.comSigned-off-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Marco Elver <elver@google.com> Reviewed-by: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Branislav Rankov <Branislav.Rankov@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Evgenii Stepanov <eugenis@google.com> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Peter Collingbourne <pcc@google.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Andrey Konovalov authored
Some KASAN tests require specific kernel configs to be enabled. Instead of copy-pasting the checks for these configs add a few helper macros and use them. Link: https://linux-review.googlesource.com/id/I237484a7fddfedf4a4aae9cc61ecbcdbe85a0a63 Link: https://lkml.kernel.org/r/6a0fcdb9676b7e869cfc415893ede12d916c246c.1610733117.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Suggested-by: Alexander Potapenko <glider@google.com> Reviewed-by: Marco Elver <elver@google.com> Reviewed-by: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Branislav Rankov <Branislav.Rankov@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Evgenii Stepanov <eugenis@google.com> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Peter Collingbourne <pcc@google.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
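The general idea of replacing copy-pasted config checks with a helper macro might look like the userspace sketch below. All names here (`TEST_NEEDS_CONFIG_ON`, `CONFIG_DEMO_SLUB`, `CONFIG_DEMO_STACK`, the two test functions) are hypothetical and chosen for illustration; they are not the macros this commit adds, and the config symbols are plain 0/1 defines rather than real Kconfig options.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for kernel config options (1 = on, 0 = off). */
#define CONFIG_DEMO_SLUB  1
#define CONFIG_DEMO_STACK 0

/* Hypothetical helper: skip the calling test when a required option is
 * off. #config stringifies the option name for the skip message. */
#define TEST_NEEDS_CONFIG_ON(config) do {                       \
    if (!(config)) {                                            \
        printf("skipping, " #config " required\n");             \
        return 0;                                               \
    }                                                           \
} while (0)

/* Each test returns 1 if its body ran and 0 if it was skipped. */
int test_needs_slub(void)
{
    TEST_NEEDS_CONFIG_ON(CONFIG_DEMO_SLUB);
    /* ... test body would go here ... */
    return 1;
}

int test_needs_stack(void)
{
    TEST_NEEDS_CONFIG_ON(CONFIG_DEMO_STACK);
    /* ... test body would go here ... */
    return 1;
}
```

With the check folded into one macro, each test states its requirement in a single line and the skip message stays consistent across tests.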
-
Andrey Konovalov authored
Clarify and update comments in KASAN tests. Link: https://linux-review.googlesource.com/id/I6c816c51fa1e0eb7aa3dead6bda1f339d2af46c8 Link: https://lkml.kernel.org/r/ba6db104d53ae0e3796f80ef395f6873c1c1282f.1610733117.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Marco Elver <elver@google.com> Reviewed-by: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Branislav Rankov <Branislav.Rankov@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Evgenii Stepanov <eugenis@google.com> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Peter Collingbourne <pcc@google.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Andrey Konovalov authored
Mention in the documentation that enabling CONFIG_KASAN_HW_TAGS always results in in-kernel TBI (Top Byte Ignore) being enabled. Also do a few minor documentation cleanups. Link: https://linux-review.googlesource.com/id/Iba2a6697e3c6304cb53f89ec61dedc77fa29e3ae Link: https://lkml.kernel.org/r/3b4ea6875bb14d312092ad14ac55cb456c83c08e.1610733117.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Marco Elver <elver@google.com> Reviewed-by: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Branislav Rankov <Branislav.Rankov@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Evgenii Stepanov <eugenis@google.com> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Peter Collingbourne <pcc@google.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Andrey Konovalov authored
Patch series "kasan: HW_TAGS tests support and fixes", v4. This patchset adds support for running KASAN-KUnit tests with the hardware tag-based mode and also contains a few fixes. This patch (of 15): There's a number of internal KASAN functions that are used across multiple source code files and therefore aren't marked as static inline. To avoid littering the kernel function names list with generic function names, prefix all such KASAN functions with kasan_. As a part of this change: - Rename internal (un)poison_range() to kasan_(un)poison() (no _range) to avoid name collision with a public kasan_unpoison_range(). - Rename check_memory_region() to kasan_check_range(), as it's a more fitting name. Link: https://lkml.kernel.org/r/cover.1610733117.git.andreyknvl@google.com Link: https://linux-review.googlesource.com/id/I719cc93483d4ba288a634dba80ee6b7f2809cd26 Link: https://lkml.kernel.org/r/13777aedf8d3ebbf35891136e1f2287e2f34aaba.1610733117.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Suggested-by: Marco Elver <elver@google.com> Reviewed-by: Marco Elver <elver@google.com> Reviewed-by: Alexander Potapenko <glider@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Peter Collingbourne <pcc@google.com> Cc: Evgenii Stepanov <eugenis@google.com> Cc: Branislav Rankov <Branislav.Rankov@arm.com> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Yang Li authored
Fix the warning below, reported by coccicheck: fs/proc/vmcore.c:1503:2-7: WARNING: NULL check before some freeing functions is not needed. Link: https://lkml.kernel.org/r/1611216753-44598-1-git-send-email-abaci-bugfix@linux.alibaba.com Signed-off-by: Yang Li <abaci-bugfix@linux.alibaba.com> Reported-by: Abaci Robot <abaci@linux.alibaba.com> Acked-by: Baoquan He <bhe@redhat.com> Cc: Dave Young <dyoung@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: "Uladzislau Rezki (Sony)" <urezki@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
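The pattern behind this class of coccicheck warning can be shown with a userspace analogue. This is a sketch only: it uses the C library's `free()`, which is defined to do nothing when passed NULL, in place of the kernel freeing function involved, and the `struct dump` type and function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical owner of a heap buffer. */
struct dump {
    char *buf;
};

/* Before: the NULL check is redundant because free(NULL) is a no-op. */
void dump_release_checked(struct dump *d)
{
    if (d->buf)
        free(d->buf);
    d->buf = NULL;
}

/* After: same behaviour with the guard removed. */
void dump_release(struct dump *d)
{
    free(d->buf);
    d->buf = NULL;
}

/* Release one empty and one populated object; returns 1 on success. */
int demo_release(void)
{
    struct dump empty = { .buf = NULL };
    struct dump full = { .buf = malloc(16) };

    dump_release(&empty);  /* safe: free(NULL) does nothing */
    dump_release(&full);
    return empty.buf == NULL && full.buf == NULL;
}
```

Dropping the guard removes a branch and a line without changing behaviour, which is why the tool flags it.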
-
sh_def@163.com authored
Replace '&next->lru != list' with list_entry_is_head(). No functional change. Link: https://lkml.kernel.org/r/20201222182735.GA1257912@ubuntu-A520I-AC Signed-off-by: sh <sh_def@163.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
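To show why this substitution carries no functional change, here is a minimal userspace sketch. The list primitives below are simplified stand-ins written for the example, and `struct item`, `sum_items` and `demo_sum` are hypothetical; only the shape of the `list_entry_is_head()` comparison is meant to mirror the kernel helper.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-in for a circular doubly linked list node. */
struct list_head {
    struct list_head *next, *prev;
};

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* True when pos's embedded node is the list head, i.e. iteration wrapped. */
#define list_entry_is_head(pos, head, member) (&(pos)->member == (head))

/* Hypothetical element type with an embedded node named lru. */
struct item {
    int value;
    struct list_head lru;
};

void list_init(struct list_head *head)
{
    head->next = head->prev = head;
}

void list_add_tail(struct list_head *node, struct list_head *head)
{
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/* Sum all values; the loop condition replaces the open-coded
 * comparison '&pos->lru != list' with the named helper. */
int sum_items(struct list_head *list)
{
    int total = 0;
    struct item *pos = container_of(list->next, struct item, lru);

    while (!list_entry_is_head(pos, list, lru)) {
        total += pos->value;
        pos = container_of(pos->lru.next, struct item, lru);
    }
    return total;
}

/* Build a three-element list and return the sum of its values. */
int demo_sum(void)
{
    struct list_head list;
    struct item a = { .value = 1 }, b = { .value = 2 }, c = { .value = 3 };

    list_init(&list);
    list_add_tail(&a.lru, &list);
    list_add_tail(&b.lru, &list);
    list_add_tail(&c.lru, &list);
    return sum_items(&list);
}
```

The helper expands to the same pointer comparison as the open-coded form, so the change is purely about readability.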
-
Li Xinhai authored
mremap with MREMAP_DONTUNMAP can move all page table entries to a new vma, which means all pages allocated for the old vma are no longer relevant to it, and the relevant anon_vma links need to be unlinked; in effect, the old vma is much like a freshly created one with no pages faulted in. But we should not unlink if the new vma has effectively merged with the old one. [lixinhai.lxh@gmail.com: v2] Link: https://lkml.kernel.org/r/20210127083917.309264-2-lixinhai.lxh@gmail.com Link: https://lkml.kernel.org/r/20210119075126.3513154-2-lixinhai.lxh@gmail.com Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com> Cc: Brian Geffon <bgeffon@google.com> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Li Xinhai authored
If the vma will continue to be used after its relevant anon_vma is unlinked, we need to reset the vma->anon_vma pointer to NULL, so that when a fault later happens within this vma, a new anon_vma will be prepared. This way, the vma will only be checked for reverse mapping of pages which have been faulted in after the unlink_anon_vmas call. Currently, the mremap with MREMAP_DONTUNMAP scenario continues to use the vma after moving its page table entries to a new vma. In other scenarios, the vma itself is freed after calling unlink_anon_vmas. Link: https://lkml.kernel.org/r/20210119075126.3513154-1-lixinhai.lxh@gmail.com Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Brian Geffon <bgeffon@google.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-