- 22 Aug, 2018 19 commits
-
-
Vlastimil Babka authored
To prepare for handling /proc/pid/smaps_rollup differently from /proc/pid/smaps factor out from show_smap() printing the parts of output that are common for both variants, which is the bulk of the gathered memory stats. [vbabka@suse.cz: add const, per Alexey] Link: http://lkml.kernel.org/r/b45f319f-cd04-337b-37f8-77f99786aa8a@suse.cz Link: http://lkml.kernel.org/r/20180723111933.15443-4-vbabka@suse.czSigned-off-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Daniel Colascione <dancol@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Vlastimil Babka authored
To prepare for handling /proc/pid/smaps_rollup differently from /proc/pid/smaps factor out vma mem stats gathering from show_smap() - it will be used by both. Link: http://lkml.kernel.org/r/20180723111933.15443-3-vbabka@suse.czSigned-off-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Daniel Colascione <dancol@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Vlastimil Babka authored
Patch series "cleanups and refactor of /proc/pid/smaps*". The recent regression in /proc/pid/smaps made me look more into the code. Especially the issues with smaps_rollup reported in [1] as explained in Patch 4, which fixes them by refactoring the code. Patches 2 and 3 are preparations for that. Patch 1 is me realizing that there's a lot of boilerplate left from times where we tried (unsuccessfuly) to mark thread stacks in the output. Originally I had also plans to rework the translation from /proc/pid/*maps* file offsets to the internal structures. Now the offset means "vma number", which is not really stable (vma's can come and go between read() calls) and there's an extra caching of last vma's address. My idea was that offsets would be interpreted directly as addresses, which would also allow meaningful seeks (see the ugly seek_to_smaps_entry() in tools/testing/selftests/vm/mlock2.h). However loff_t is (signed) long long so that might be insufficient somewhere for the unsigned long addresses. So the result is fixed issues with skewed /proc/pid/smaps_rollup results, simpler smaps code, and a lot of unused code removed. [1] https://marc.info/?l=linux-mm&m=151927723128134&w=2 This patch (of 4): Commit b7643757 ("procfs: mark thread stack correctly in proc/<pid>/maps") introduced differences between /proc/PID/maps and /proc/PID/task/TID/maps to mark thread stacks properly, and this was also done for smaps and numa_maps. However it didn't work properly and was ultimately removed by commit b18cb64e ("fs/proc: Stop trying to report thread stacks"). Now the is_pid parameter for the related show_*() functions is unused and we can remove it together with wrapper functions and ops structures that differ for PID and TID cases only in this parameter. Link: http://lkml.kernel.org/r/20180723111933.15443-2-vbabka@suse.czSigned-off-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Daniel Colascione <dancol@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Michal Hocko authored
Andrew has noticed some inconsistencies in oom_reap_task_mm. Notably - Undocumented return value. - comment "failed to reap part..." is misleading - sounds like it's referring to something which happened in the past, is in fact referring to something which might happen in the future. - fails to call trace_finish_task_reaping() in one case - code duplication. - Increases mmap_sem hold time a little by moving trace_finish_task_reaping() inside the locked region. So sue me ;) - Sharing the finish: path means that the trace event won't distinguish between the two sources of finishing. Add a short explanation for the return value and fix the rest by reorganizing the function a bit to have unified function exit paths. Link: http://lkml.kernel.org/r/20180724141747.GP28386@dhcp22.suse.czSuggested-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Rodrigo Freire authored
The default page memory unit of OOM task dump events might not be intuitive and potentially misleading for the non-initiated when debugging OOM events: These are pages and not kBs. Add a small printk prior to the task dump informing that the memory units are actually memory _pages_. Also extends PID field to align on up to 7 characters. Reference https://lkml.org/lkml/2018/7/3/1201 Link: http://lkml.kernel.org/r/c795eb5129149ed8a6345c273aba167ff1bbd388.1530715938.git.rfreire@redhat.comSigned-off-by: Rodrigo Freire <rfreire@redhat.com> Acked-by: David Rientjes <rientjes@google.com> Acked-by: Rafael Aquini <aquini@redhat.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Michal Hocko authored
oom_reaper used to rely on the oom_lock since e2fe1456 ("oom_reaper: close race with exiting task"). We do not really need the lock anymore though. 21292580 ("mm: oom: let oom_reap_task and exit_mmap run concurrently") has removed serialization with the exit path based on the mm reference count and so we do not really rely on the oom_lock anymore. Tetsuo was arguing that at least MMF_OOM_SKIP should be set under the lock to prevent from races when the page allocator didn't manage to get the freed (reaped) memory in __alloc_pages_may_oom but it sees the flag later on and move on to another victim. Although this is possible in principle let's wait for it to actually happen in real life before we make the locking more complex again. Therefore remove the oom_lock for oom_reaper paths (both exit_mmap and oom_reap_task_mm). The reaper serializes with exit_mmap by mmap_sem + MMF_OOM_SKIP flag. There is no synchronization with out_of_memory path now. [mhocko@kernel.org: oom_reap_task_mm should return false when __oom_reap_task_mm did] Link: http://lkml.kernel.org/r/20180724141747.GP28386@dhcp22.suse.cz Link: http://lkml.kernel.org/r/20180719075922.13784-1-mhocko@kernel.orgSigned-off-by: Michal Hocko <mhocko@suse.com> Suggested-by: David Rientjes <rientjes@google.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Michal Hocko authored
There are several blockable mmu notifiers which might sleep in mmu_notifier_invalidate_range_start and that is a problem for the oom_reaper because it needs to guarantee a forward progress so it cannot depend on any sleepable locks. Currently we simply back off and mark an oom victim with blockable mmu notifiers as done after a short sleep. That can result in selecting a new oom victim prematurely because the previous one still hasn't torn its memory down yet. We can do much better though. Even if mmu notifiers use sleepable locks there is no reason to automatically assume those locks are held. Moreover majority of notifiers only care about a portion of the address space and there is absolutely zero reason to fail when we are unmapping an unrelated range. Many notifiers do really block and wait for HW which is harder to handle and we have to bail out though. This patch handles the low hanging fruit. __mmu_notifier_invalidate_range_start gets a blockable flag and callbacks are not allowed to sleep if the flag is set to false. This is achieved by using trylock instead of the sleepable lock for most callbacks and continue as long as we do not block down the call chain. I think we can improve that even further because there is a common pattern to do a range lookup first and then do something about that. The first part can be done without a sleeping lock in most cases AFAICS. The oom_reaper end then simply retries if there is at least one notifier which couldn't make any progress in !blockable mode. A retry loop is already implemented to wait for the mmap_sem and this is basically the same thing. The simplest way for driver developers to test this code path is to wrap userspace code which uses these notifiers into a memcg and set the hard limit to hit the oom. This can be done e.g. after the test faults in all the mmu notifier managed memory and set the hard limit to something really small. Then we are looking for a proper process tear down. [akpm@linux-foundation.org: coding style fixes] [akpm@linux-foundation.org: minor code simplification] Link: http://lkml.kernel.org/r/20180716115058.5559-1-mhocko@kernel.orgSigned-off-by: Michal Hocko <mhocko@suse.com> Acked-by: Christian König <christian.koenig@amd.com> # AMD notifiers Acked-by: Leon Romanovsky <leonro@mellanox.com> # mlx and umem_odp Reported-by: David Rientjes <rientjes@google.com> Cc: "David (ChunMing) Zhou" <David1.Zhou@amd.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: David Airlie <airlied@linux.ie> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Doug Ledford <dledford@redhat.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Mike Marciniszyn <mike.marciniszyn@intel.com> Cc: Dennis Dalessandro <dennis.dalessandro@intel.com> Cc: Sudeep Dutt <sudeep.dutt@intel.com> Cc: Ashutosh Dixit <ashutosh.dixit@intel.com> Cc: Dimitri Sivanich <sivanich@sgi.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Juergen Gross <jgross@suse.com> Cc: "Jérôme Glisse" <jglisse@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Huang Ying authored
In this patch, locking related code is shared between huge/normal code path in put_swap_page() to reduce code duplication. The `free_entries == 0` case is merged into the more general `free_entries != SWAPFILE_CLUSTER` case, because the new locking method makes it easy. The added lines is same as the removed lines. But the code size is increased when CONFIG_TRANSPARENT_HUGEPAGE=n. text data bss dec hex filename base: 24123 2004 340 26467 6763 mm/swapfile.o unified: 24485 2004 340 26829 68cd mm/swapfile.o Dig on step deeper with `size -A mm/swapfile.o` for base and unified kernel and compare the result, yields, -.text 17723 0 +.text 17835 0 -.orc_unwind_ip 1380 0 +.orc_unwind_ip 1480 0 -.orc_unwind 2070 0 +.orc_unwind 2220 0 -Total 26686 +Total 27048 The total difference is the same. The text segment difference is much smaller: 112. More difference comes from the ORC unwinder segments: (1480 + 2220) - (1380 + 2070) = 250. If the frame pointer unwinder is used, this costs nothing. Link: http://lkml.kernel.org/r/20180720071845.17920-9-ying.huang@intel.comSigned-off-by: "Huang, Ying" <ying.huang@intel.com> Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Shaohua Li <shli@kernel.org> Cc: Hugh Dickins <hughd@google.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Rik van Riel <riel@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Huang Ying authored
The part of __swap_entry_free() with lock held is separated into a new function __swap_entry_free_locked(). Because we want to reuse that piece of code in some other places. Just mechanical code refactoring, there is no any functional change in this function. Link: http://lkml.kernel.org/r/20180720071845.17920-8-ying.huang@intel.comSigned-off-by: "Huang, Ying" <ying.huang@intel.com> Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Shaohua Li <shli@kernel.org> Cc: Hugh Dickins <hughd@google.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Rik van Riel <riel@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Huang Ying authored
As suggested by Matthew Wilcox, it is better to use "int entry_size" instead of "bool cluster" as parameter to specify whether to operate for huge or normal swap entries. Because this improve the flexibility to support other swap entry size. And Dave Hansen thinks that this improves code readability too. So in this patch, the "bool cluster" parameter of get_swap_pages() is replaced by "int entry_size". And nr_swap_entries() trick is used to reduce the binary size when !CONFIG_TRANSPARENT_HUGE_PAGE. text data bss dec hex filename base 24215 2028 340 26583 67d7 mm/swapfile.o head 24123 2004 340 26467 6763 mm/swapfile.o Link: http://lkml.kernel.org/r/20180720071845.17920-7-ying.huang@intel.comSigned-off-by: "Huang, Ying" <ying.huang@intel.com> Suggested-by: Matthew Wilcox <willy@infradead.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Shaohua Li <shli@kernel.org> Cc: Hugh Dickins <hughd@google.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Rik van Riel <riel@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Huang Ying authored
In this patch, the normal/huge code path in put_swap_page() and several helper functions are unified to avoid duplicated code, bugs, etc. and make it easier to review the code. The removed lines are more than added lines. And the binary size is kept exactly same when CONFIG_TRANSPARENT_HUGEPAGE=n. Link: http://lkml.kernel.org/r/20180720071845.17920-6-ying.huang@intel.comSigned-off-by: "Huang, Ying" <ying.huang@intel.com> Suggested-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Shaohua Li <shli@kernel.org> Cc: Hugh Dickins <hughd@google.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Rik van Riel <riel@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Huang Ying authored
As suggested by Dave, we should unify the code path for normal and huge swap support if possible to avoid duplicated code, bugs, etc. and make it easier to review code. In this patch, the normal/huge code path in swap_page_trans_huge_swapped() is unified, the added and removed lines are same. And the binary size is kept almost same when CONFIG_TRANSPARENT_HUGEPAGE=n. text data bss dec hex filename base: 24179 2028 340 26547 67b3 mm/swapfile.o unified: 24215 2028 340 26583 67d7 mm/swapfile.o Link: http://lkml.kernel.org/r/20180720071845.17920-5-ying.huang@intel.comSigned-off-by: "Huang, Ying" <ying.huang@intel.com> Suggested-and-acked-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Shaohua Li <shli@kernel.org> Cc: Hugh Dickins <hughd@google.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Rik van Riel <riel@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Huang Ying authored
In swap_page_trans_huge_swapped(), to identify whether there's any page table mapping for a 4k sized swap entry, "si->swap_map[i] != SWAP_HAS_CACHE" is used. This works correctly now, because all users of the function will only call it after checking SWAP_HAS_CACHE. But as pointed out by Daniel, it is better to use "swap_count(map[i])" here, because it works for "map[i] == 0" case too. And this makes the implementation more consistent between normal and huge swap entry. Link: http://lkml.kernel.org/r/20180720071845.17920-4-ying.huang@intel.comSigned-off-by: "Huang, Ying" <ying.huang@intel.com> Suggested-and-reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Shaohua Li <shli@kernel.org> Cc: Hugh Dickins <hughd@google.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Rik van Riel <riel@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Huang Ying authored
In mm/swapfile.c, THP (Transparent Huge Page) swap specific code is enclosed by #ifdef CONFIG_THP_SWAP/#endif to avoid code dilating when THP isn't enabled. But #ifdef/#endif in .c file hurt the code readability, so Dave suggested to use IS_ENABLED(CONFIG_THP_SWAP) instead and let compiler to do the dirty job for us. This has potential to remove some duplicated code too. From output of `size`, text data bss dec hex filename THP=y: 26269 2076 340 28685 700d mm/swapfile.o ifdef/endif: 24115 2028 340 26483 6773 mm/swapfile.o IS_ENABLED: 24179 2028 340 26547 67b3 mm/swapfile.o IS_ENABLED() based solution works quite well, almost as good as that of #ifdef/#endif. And from the diffstat, the removed lines are more than added lines. One #ifdef for split_swap_cluster() is kept. Because it is a public function with a stub implementation for CONFIG_THP_SWAP=n in swap.h. Link: http://lkml.kernel.org/r/20180720071845.17920-3-ying.huang@intel.comSigned-off-by: "Huang, Ying" <ying.huang@intel.com> Suggested-and-acked-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Shaohua Li <shli@kernel.org> Cc: Hugh Dickins <hughd@google.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Rik van Riel <riel@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Huang Ying authored
Patch series "swap: THP optimizing refactoring", v4. Now the THP (Transparent Huge Page) swap optimizing is implemented in the way like below, #ifdef CONFIG_THP_SWAP huge_function(...) { } #else normal_function(...) { } #endif general_function(...) { if (huge) return thp_function(...); else return normal_function(...); } As pointed out by Dave Hansen, this will, 1. Create a new, wholly untested code path for huge page 2. Create two places to patch bugs 3. Are not reusing code when possible This patchset is to address these problems via merging huge/normal code path/functions if possible. One concern is that this may cause code size to dilate when !CONFIG_TRANSPARENT_HUGEPAGE. The data shows that most refactoring will only cause quite slight code size increase. This patch (of 8): To improve code readability. Link: http://lkml.kernel.org/r/20180720071845.17920-2-ying.huang@intel.comSigned-off-by: "Huang, Ying" <ying.huang@intel.com> Suggested-and-acked-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Shaohua Li <shli@kernel.org> Cc: Hugh Dickins <hughd@google.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Rik van Riel <riel@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Kirill Tkhai authored
Currently, there are two flags only, so unsigned is more then enough. Also, move int seeks to keep these fields together. Link: http://lkml.kernel.org/r/153199748720.21131.6476256940113102483.stgit@localhost.localdomainSigned-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Kirill Tkhai authored
Patch series "Reorderings in struct shrinker and struct shrink_control". These structures are intensively used during reclaim and, displace other data in cache, so there is no a reason they have int fields not grouped together. This patch (of 2): gfp_t is of unsigned type, so let's move nid to keep them together. Link: http://lkml.kernel.org/r/153199747930.21131.861043607301997810.stgit@localhost.localdomainSigned-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Kirill Tkhai authored
There is a sad BUG introduced in patch adding SHRINKER_REGISTERING. shrinker_idr business is only for memcg-aware shrinkers. Only such type of shrinkers have id and they must be finaly installed via idr_replace() in this function. For !memcg-aware shrinkers we never initialize shrinker->id field. But there are all types of shrinkers passed to idr_replace(), and every !memcg-aware shrinker with random ID (most probably, its id is 0) replaces memcg-aware shrinker pointed by the ID in IDR. This patch fixes the problem. Link: http://lkml.kernel.org/r/8ff8a793-8211-713a-4ed9-d6e52390c2fc@virtuozzo.com Fixes: 7e010df5 "mm: use special value SHRINKER_REGISTERING instead of list_empty() check" Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Reported-by: <syzbot+d5f648a1bfe15678786b@syzkaller.appspotmail.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Josef Bacik <jbacik@fb.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@suse.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Shakeel Butt <shakeelb@google.com> Cc: <syzkaller-bugs@googlegroups.com> Cc: Huang Ying <ying.huang@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Ian Kent authored
autofs_sbi() does not check the superblock magic number to verify it has been given an autofs super block. Link: http://lkml.kernel.org/r/153475422934.17131.7563724552005298277.stgit@pluto.themaw.net Reported-by: <syzbot+87c3c541582e56943277@syzkaller.appspotmail.com> Signed-off-by: Ian Kent <raven@themaw.net> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 21 Aug, 2018 4 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linuxLinus Torvalds authored
Pull ia64 NO_BOOTMEM conversion from Tony Luck: "Mike Rapoport kindly fixed up ia64 to work with NO_BOOTMEM" * tag 'please-pull-noboot' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux: ia64: switch to NO_BOOTMEM ia64: use mem_data to detect nodes' minimal and maximal PFNs ia64: remove unused num_dma_physpages member from 'struct early_node_data' ia64: contig/paging_init: reduce code duplication
-
Linus Torvalds authored
Merge tag 'linux-kselftest-4.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull Kselftest update from Shuah Khan: - add cgroup core selftests - fix compile warnings in android ion test - fix to bugs in exclude and skip paths in vDSO test - remove obsolete config options - add missing .gitignore file * tag 'linux-kselftest-4.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: selftests/ftrace: Fix kprobe string testcase to not probe notrace function selftests: mount: remove no longer needed config option selftests: cgroup: add gitignore file Add cgroup core selftests selftests: vDSO - fix to return KSFT_SKIP when test couldn't be run selftests: vDSO - fix to exclude x86 test on non-x86 platforms selftests/android: initialize heap_type to avoid compiling warning
-
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-traceLinus Torvalds authored
Pull tracing updates from Steven Rostedt: - Restructure of lockdep and latency tracers This is the biggest change. Joel Fernandes restructured the hooks from irqs and preemption disabling and enabling. He got rid of a lot of the preprocessor #ifdef mess that they caused. He turned both lockdep and the latency tracers to use trace events inserted in the preempt/irqs disabling paths. But unfortunately, these started to cause issues in corner cases. Thus, parts of the code was reverted back to where lockdep and the latency tracers just get called directly (without using the trace events). But because the original change cleaned up the code very nicely we kept that, as well as the trace events for preempt and irqs disabling, but they are limited to not being called in NMIs. - Have trace events use SRCU for "rcu idle" calls. This was required for the preempt/irqs off trace events. But it also had to not allow them to be called in NMI context. Waiting till Paul makes an NMI safe SRCU API. - New notrace SRCU API to allow trace events to use SRCU. - Addition of mcount-nop option support - SPDX headers replacing GPL templates. - Various other fixes and clean ups. - Some fixes are marked for stable, but were not fully tested before the merge window opened. * tag 'trace-v4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (44 commits) tracing: Fix SPDX format headers to use C++ style comments tracing: Add SPDX License format tags to tracing files tracing: Add SPDX License format to bpf_trace.c blktrace: Add SPDX License format header s390/ftrace: Add -mfentry and -mnop-mcount support tracing: Add -mcount-nop option support tracing: Avoid calling cc-option -mrecord-mcount for every Makefile tracing: Handle CC_FLAGS_FTRACE more accurately Uprobe: Additional argument arch_uprobe to uprobe_write_opcode() Uprobes: Simplify uprobe_register() body tracepoints: Free early tracepoints after RCU is initialized uprobes: Use synchronize_rcu() not synchronize_sched() tracing: Fix synchronizing to event changes with tracepoint_synchronize_unregister() ftrace: Remove unused pointer ftrace_swapper_pid tracing: More reverting of "tracing: Centralize preemptirq tracepoints and unify their usage" tracing/irqsoff: Handle preempt_count for different configs tracing: Partial revert of "tracing: Centralize preemptirq tracepoints and unify their usage" tracing: irqsoff: Account for additional preempt_disable trace: Use rcu_dereference_raw for hooks from trace-event subsystem tracing/kprobes: Fix within_notrace_func() to check only notrace functions ...
-
git://github.com/ceph/ceph-clientLinus Torvalds authored
Pull ceph updates from Ilya Dryomov: "The main things are support for cephx v2 authentication protocol and basic support for rbd images within namespaces (myself). Also included are y2038 conversion patches from Arnd, a pile of miscellaneous fixes from Chengguang and Zheng's feature bit infrastructure for the filesystem" * tag 'ceph-for-4.19-rc1' of git://github.com/ceph/ceph-client: (40 commits) ceph: don't drop message if it contains more data than expected ceph: support cephfs' own feature bits crush: fix using plain integer as NULL warning libceph: remove unnecessary non NULL check for request_key ceph: refactor error handling code in ceph_reserve_caps() ceph: refactor ceph_unreserve_caps() ceph: change to void return type for __do_request() ceph: compare fsc->max_file_size and inode->i_size for max file size limit ceph: add additional size check in ceph_setattr() ceph: add additional offset check in ceph_write_iter() ceph: add additional range check in ceph_fallocate() ceph: add new field max_file_size in ceph_fs_client libceph: weaken sizeof check in ceph_x_verify_authorizer_reply() libceph: check authorizer reply/challenge length before reading libceph: implement CEPHX_V2 calculation mode libceph: add authorizer challenge libceph: factor out encrypt_authorizer() libceph: factor out __ceph_x_decrypt() libceph: factor out __prepare_write_connect() libceph: store ceph_auth_handshake pointer in ceph_connection ...
-
- 20 Aug, 2018 17 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linuxLinus Torvalds authored
Pull RTC updates from Alexandre Belloni: "It is now possible to add custom sysfs attributes while avoiding a possible race condition. Unused code has been removed resulting in a nice reduction of the code base. And more drivers have been switched to SPDX by their maintainers. Summary: Subsystem: - new helpers to add custom sysfs attributes - struct rtc_task removal along with rtc_irq_[un]register() - rtc_irq_set_state and rtc_irq_set_freq are not exported anymore Drivers: - armada38x: reset after rtc power loss - ds1307: now supports m41t11 - isl1208: now supports isl1219 and tamper detection - pcf2127: internal SRAM support" * tag 'rtc-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (34 commits) rtc: ds1307: simplify hwmon config rtc: s5m: Add SPDX license identifier rtc: maxim: Add SPDX license identifiers rtc: isl1219: add device tree documentation rtc: isl1208: set ev-evienb bit from device tree rtc: isl1208: Add "evdet" interrupt source for isl1219 rtc: isl1208: add support for isl1219 with tamper detection rtc: sysfs: facilitate attribute add to rtc device rtc: remove struct rtc_task char: rtc: remove task handling rtc: pcf85063: preserve control register value between stop and start rtc: sh: remove unused variable rtc_dev rtc: unexport rtc_irq_set_* rtc: simplify rtc_irq_set_state/rtc_irq_set_freq rtc: remove irq_task and irq_task_lock rtc: remove rtc_irq_register/rtc_irq_unregister rtc: sh: remove dead code rtc: sa1100: don't set PIE frequency rtc: ds1307: support m41t11 variant rtc: ds1307: fix data pointer to m41t0 ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatchingLinus Torvalds authored
Pull livepatching updates from Jiri Kosina: "Code cleanups from Kamalesh Babulal" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching: livepatch: Validate module/old func name length livepatch: Remove reliable stacktrace check in klp_try_switch_task()
-
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hidLinus Torvalds authored
Pull HID updates from Jiri Kosina: - touch_max detection improvements and quirk handling fixes in wacom driver from Jason Gerecke and Ping Cheng - Palm rejection from Dmitry Torokhov and _dial support from Benjamin Tissoires for hid-multitouch driver - Low voltage support for i2c-hid driver from Stephen Boyd - Guitar-Hero support from Nicolas Adenis-Lamarre - other assorted small fixes and device ID additions * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (40 commits) HID: intel_ish-hid: tx_buf memory leak on probe/remove HID: intel-ish-hid: Prevent loading of driver on Mehlow HID: cougar: Add support for the Cougar 500k Gaming Keyboard HID: cougar: make compare_device_paths reusable HID: intel-ish-hid: remove redundant variable num_frags HID: multitouch: handle palm for touchscreens HID: multitouch: touchscreens also use confidence reports HID: multitouch: report MT_TOOL_PALM for non-confident touches HID: microsoft: support the Surface Dial HID: core: do not upper bound the collection stack HID: input: enable Totem on the Dell Canvas 27 HID: multitouch: remove one copy of values HID: multitouch: ditch mt_report_id HID: multitouch: store a per application quirks value HID: multitouch: Store per collection multitouch data HID: multitouch: make sure the static list of class is not changed input: add MT_TOOL_DIAL HID: elan: Add support for touchpad on the Toshiba Click Mini L9W HID: elan: Add USB-id for HP x2 10-n000nd touchpad HID: elan: Add a flag for selecting if the touchpad has a LED ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlightLinus Torvalds authored
Pull backlight updates from Lee Jones: "Core Framework: - Remove unused/obsolete code/comments New Functionality: - Allow less granular brightness specification for high-res PWMs; pwm_bl - Align brightness {inc,dec}rements with that perceived by the human-eye; pwm_bl Fix-ups: - Prepare for the introduction of -Wimplicit-fall-through; adp8860_bl Bug Fixes: - Fix uninitialised variable; pwm_bl" * tag 'backlight-next-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight: backlight: pwm_bl: Fix uninitialized variable backlight: adp8860: Mark expected switch fall-through backlight: Remove obsolete comment for ->state dt-bindings: pwm-backlight: Move brightness-levels to optional backlight: pwm_bl: Compute brightness of LED linearly to human eye dt-bindings: pwm-backlight: Add a num-interpolation-steps property backlight: pwm_bl: Linear interpolation between brightness-levels
-
git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfdLinus Torvalds authored
Pull MFD updates from Lee Jones: "New Drivers: - Add Cirrus Logic Madera Codec (CS47L35, CS47L85 and CS47L90/91) driver - Add ChromeOS EC CEC driver - Add ROHM BD71837 PMIC driver New Device Support: - Add support for Dialog Semi DA9063L PMIC variant to DA9063 - Add support for Intel Ice Lake to Intel-PLSS-PCI - Add support for X-Powers AXP806 to AXP20x New Functionality: - Add support for USB Charging to the ChromeOS Embedded Controller - Add support for HDMI CEC to the ChromeOS Embedded Controller - Add support for HDMI CEC to Intel HDMI - Add support for accessory detection to Madera devices - Allow individual pins to be configured via DT' wlf,csnaddr-pd - Provide legacy platform specific EEPROM/Watchdog commands; rave-sp Fix-upsL - Trivial renaming/spelling fixes; cros_ec, da9063-* - Convert to Managed Resources (devm_*); da9063-*, ti_am335x_tscadc - Transition to helper macros/functions; da9063-* - Constify; kempld-core - Improve error path/messages; wm8994-core - Disable IRQs locally instead of relying on USB subsystem; dln2 - Remove unused code; rave-sp - New exports; sec-core Bug Fixes: - Fix possible false I2C transaction error; arizona-core - Fix declared memory area size; hi655x-pmic - Fix checksum type; rave-sp - Fix incorrect default serial port configuration: rave-sp - Fix incorrect coherent DMA mask for sub-devices; sm501" * tag 'mfd-next-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (60 commits) mfd: madera: Add register definitions for accessory detect mfd: sm501: Set coherent_dma_mask when creating subdevices mfd: bd71837: Devicetree bindings for ROHM BD71837 PMIC mfd: bd71837: Core driver for ROHM BD71837 PMIC media: platform: cros-ec-cec: Fix dependency on MFD_CROS_EC mfd: sec-core: Export OF module alias table mfd: as3722: Disable auto-power-on when AC OK mfd: axp20x: Support AXP806 in I2C mode mfd: axp20x: Add self-working mode support for AXP806 dt-bindings: mfd: axp20x: Add "self-working" mode for AXP806 mfd: wm8994: Allow to configure CS/ADDR Pulldown from dts mfd: wm8994: Allow to configure Speaker Mode Pullup from dts mfd: rave-sp: Emulate CMD_GET_STATUS on device that don't support it mfd: rave-sp: Add legacy watchdog ping command translation mfd: rave-sp: Add legacy EEPROM access command translation mfd: rave-sp: Initialize flow control and parity of the port mfd: rave-sp: Fix incorrectly specified checksum type mfd: rave-sp: Remove unused defines mfd: hi655x: Fix regmap area declared size for hi655x mfd: ti_am335x_tscadc: Fix struct clk memory leak ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/bp/bpLinus Torvalds authored
Pull EDAC fix from Borislav Petkov: "An urgent fix for a NULL ptr deref on machines with LRDDR4 DIMMs, from Takashi Iwai" * tag 'edac_fixes_for_4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: EDAC: Add missing MEM_LRDDR4 entry in edac_mem_types[]
-
Joe Perches authored
Various architectures fail to build properly with older versions of the gcc compiler. An example from Guenter Roeck in thread [1]: > > In file included from ./include/linux/mm.h:17:0, > from ./include/linux/pid_namespace.h:7, > from ./include/linux/ptrace.h:10, > from arch/openrisc/kernel/asm-offsets.c:32: > ./include/linux/mm_types.h:497:16: error: flexible array member in otherwise empty struct > > This is just an example with gcc 4.5.1 for or32. I have seen the problem > with gcc 4.4 (for unicore32) as well. So update the minimum required version of gcc to 4.6. [1] https://lore.kernel.org/lkml/20180814170904.GA12768@roeck-us.net/ Miscellanea: - Update Documentation/process/changes.rst - Remove and consolidate version test blocks in compiler-gcc.h for versions lower than 4.6 Signed-off-by: Joe Perches <joe@perches.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Tony Luck authored
Commit 0bbf47ea ("ia64: use asm-generic/io.h") results in a BUG while booting ia64. This is because asm-generic/io.h defines PCI_IOBASE, which results in the function acpi_pci_root_remap_iospace() doing a lot of unnecessary (and wrong) things. I'd suggested an #if !CONFIG_IA64 in the functon, but Arnd suggested keeping the fix inside the arch/ia64 tree. Fixes: 0bbf47ea ("ia64: use asm-generic/io.h") Suggested-by: Arnd Bergman <arnd@arndb.de> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Jiri Kosina authored
-
Jiri Kosina authored
Guitar-Hero devices support for hid-wiimote
-
Jiri Kosina authored
Wacom driver updates: - touch_max detection improvements - quirk handling cleanup - get rid of wacom custom usages
-
Jiri Kosina authored
Assorted small driver/core fixes.
-
Jiri Kosina authored
devm_* API conversion for hid-sony
-
Jiri Kosina authored
Multitouch updates: - Dial support - Palm rejection for touchscreens - a few small assorted fixes
-
Jiri Kosina authored
Device-specific fixes for hid-intel-ish
-
Jiri Kosina authored
Low voltage support for i2c-hid
-
Jiri Kosina authored
Resolution/pressure fixes and new device support for hid-elan
-