1. 06 Mar, 2020 14 commits
    • Merge branch 'akpm' (patches from Andrew) · aeb542a1
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "7 fixes"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        arch/Kconfig: update HAVE_RELIABLE_STACKTRACE description
        mm, hotplug: fix page online with DEBUG_PAGEALLOC compiled but not enabled
        mm/z3fold.c: do not include rwlock.h directly
        fat: fix uninit-memory access for partial initialized inode
        mm: avoid data corruption on CoW fault into PFN-mapped VMA
        mm: fix possible PMD dirty bit lost in set_pmd_migration_entry()
        mm, numa: fix bad pmd by atomically check for pmd_trans_huge when marking page tables prot_numa
    • arch/Kconfig: update HAVE_RELIABLE_STACKTRACE description · 140d7e88
      Miroslav Benes authored
      save_stack_trace_tsk_reliable() is no longer the only function that
      provides reliable stack traces.  An architecture might define
      ARCH_STACKWALK, which provides a newer stack walking interface that
      includes the arch_stack_walk_reliable() function.  Update the description
      accordingly.
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Miroslav Benes <mbenes@suse.cz>
      Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Link: http://lkml.kernel.org/r/20200120154042.9934-1-mbenes@suse.cz
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm, hotplug: fix page online with DEBUG_PAGEALLOC compiled but not enabled · c87cbc1f
      Vlastimil Babka authored
      Commit cd02cf1a ("mm/hotplug: fix an imbalance with DEBUG_PAGEALLOC")
      fixed memory hotplug with debug_pagealloc enabled, where onlining a page
      goes through page freeing, which removes the direct mapping.  Some arches
      don't like when the page is not mapped in the first place, so
      generic_online_page() maps it first.  This is somewhat wasteful, but
      better than special casing page freeing fast paths.
      
      The commit however missed that DEBUG_PAGEALLOC configured doesn't mean
      it's actually enabled.  One has to test debug_pagealloc_enabled() since
      031bc574 ("mm/debug-pagealloc: make debug-pagealloc boottime
      configurable"), or alternatively debug_pagealloc_enabled_static() since
      8e57f8ac ("mm, debug_pagealloc: don't rely on static keys too early"),
      but this is not done.
      
      As a result, a s390 kernel with DEBUG_PAGEALLOC configured but not enabled
      will crash:
      
      Unable to handle kernel pointer dereference in virtual kernel address space
      Failing address: 0000000000000000 TEID: 0000000000000483
      Fault in home space mode while using kernel ASCE.
      AS:0000001ece13400b R2:000003fff7fd000b R3:000003fff7fcc007 S:000003fff7fd7000 P:000000000000013d
      Oops: 0004 ilc:2 [#1] SMP
      CPU: 1 PID: 26015 Comm: chmem Kdump: loaded Tainted: GX 5.3.18-5-default #1 SLE15-SP2 (unreleased)
      Krnl PSW : 0704e00180000000 0000001ecd281b9e (__kernel_map_pages+0x166/0x188)
      R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
      Krnl GPRS: 0000000000000000 0000000000000800 0000400b00000000 0000000000000100
      0000000000000001 0000000000000000 0000000000000002 0000000000000100
      0000001ece139230 0000001ecdd98d40 0000400b00000100 0000000000000000
      000003ffa17e4000 001fffe0114f7d08 0000001ecd4d93ea 001fffe0114f7b20
      Krnl Code: 0000001ecd281b8e: ec17ffff00d8 ahik %r1,%r7,-1
      0000001ecd281b94: ec111dbc0355 risbg %r1,%r1,29,188,3
      >0000001ecd281b9e: 94fb5006 ni 6(%r5),251
      0000001ecd281ba2: 41505008 la %r5,8(%r5)
      0000001ecd281ba6: ec51fffc6064 cgrj %r5,%r1,6,1ecd281b9e
      0000001ecd281bac: 1a07 ar %r0,%r7
      0000001ecd281bae: ec03ff584076 crj %r0,%r3,4,1ecd281a5e
      Call Trace:
      [<0000001ecd281b9e>] __kernel_map_pages+0x166/0x188
      [<0000001ecd4d9516>] online_pages_range+0xf6/0x128
      [<0000001ecd2a8186>] walk_system_ram_range+0x7e/0xd8
      [<0000001ecda28aae>] online_pages+0x2fe/0x3f0
      [<0000001ecd7d02a6>] memory_subsys_online+0x8e/0xc0
      [<0000001ecd7add42>] device_online+0x5a/0xc8
      [<0000001ecd7d0430>] state_store+0x88/0x118
      [<0000001ecd5b9f62>] kernfs_fop_write+0xc2/0x200
      [<0000001ecd5064b6>] vfs_write+0x176/0x1e0
      [<0000001ecd50676a>] ksys_write+0xa2/0x100
      [<0000001ecda315d4>] system_call+0xd8/0x2c8
      
      Fix this by checking debug_pagealloc_enabled_static() before calling
      kernel_map_pages(). Backports to kernels before 5.5 should use
      debug_pagealloc_enabled() instead. Also add comments.
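
      The guard described above can be illustrated with a small userspace
      model. This is only a sketch, not the kernel code: every name prefixed
      with model_ is hypothetical, and the runtime flag stands in for what
      debug_pagealloc_enabled_static() reports. It shows the mapping helper
      being skipped when the feature is compiled in but not enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a page; not a kernel structure. */
struct model_page {
    bool direct_mapped;   /* is the page present in the direct mapping? */
};

/* Runtime switch: the feature is compiled in, but may still be off. */
static bool model_debug_pagealloc_on = false;

/* Counts how often the mapping helper ran, so the guard can be observed. */
static int model_map_calls = 0;

/* Stand-in for the helper that touches the direct mapping. */
static void model_kernel_map_pages(struct model_page *page, bool enable)
{
    assert(page != NULL);
    model_map_calls++;
    page->direct_mapped = enable;
}

/* Onlining path with the guard: only touch the mapping when enabled. */
static void model_online_page(struct model_page *page)
{
    assert(page != NULL);
    if (model_debug_pagealloc_on)
        model_kernel_map_pages(page, true);
    /* ... the page would then be handed to the page allocator ... */
}
```

      With the flag off, model_online_page() never calls the mapping helper;
      with the flag on, it maps the page before handing it over.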
      
      Fixes: cd02cf1a ("mm/hotplug: fix an imbalance with DEBUG_PAGEALLOC")
      Reported-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Cc: <stable@vger.kernel.org>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Qian Cai <cai@lca.pw>
      Link: http://lkml.kernel.org/r/20200224094651.18257-1-vbabka@suse.cz
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/z3fold.c: do not include rwlock.h directly · a8198fed
      Sebastian Andrzej Siewior authored
      rwlock.h should not be included directly; linux/spinlock.h should be
      included instead.  Among other things, the direct include breaks the RT
      build.
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Vitaly Wool <vitaly.wool@konsulko.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20200224133631.1510569-1-bigeasy@linutronix.de
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • fat: fix uninit-memory access for partial initialized inode · bc87302a
      OGAWA Hirofumi authored
      When an error occurs in the middle of reading an inode, some fields in
      the inode might still be uninitialized.  The evict_inode path may then
      access those fields via iput().
      
      To fix this, make sure that the inode fields are initialized.
      
      Reported-by: syzbot+9d82b8de2992579da5d0@syzkaller.appspotmail.com
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      Cc: <stable@vger.kernel.org>
      Link: http://lkml.kernel.org/r/871rqnreqx.fsf@mail.parknet.co.jp
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: avoid data corruption on CoW fault into PFN-mapped VMA · c3e5ea6e
      Kirill A. Shutemov authored
      Jeff Moyer has reported that one of the xfstests triggers a warning when
      run on a DAX-enabled filesystem:
      
      	WARNING: CPU: 76 PID: 51024 at mm/memory.c:2317 wp_page_copy+0xc40/0xd50
      	...
      	wp_page_copy+0x98c/0xd50 (unreliable)
      	do_wp_page+0xd8/0xad0
      	__handle_mm_fault+0x748/0x1b90
      	handle_mm_fault+0x120/0x1f0
      	__do_page_fault+0x240/0xd70
      	do_page_fault+0x38/0xd0
      	handle_page_fault+0x10/0x30
      
      The warning is triggered by a failed __copy_from_user_inatomic(), which
      tries to copy data into a CoW page.
      
      This happens because of a race between MADV_DONTNEED and a CoW page fault:
      
      	CPU0					CPU1
       handle_mm_fault()
         do_wp_page()
           wp_page_copy()
             do_wp_page()
      					madvise(MADV_DONTNEED)
      					  zap_page_range()
      					    zap_pte_range()
      					      ptep_get_and_clear_full()
      					      <TLB flush>
      	 __copy_from_user_inatomic()
      	 sees empty PTE and fails
      	 WARN_ON_ONCE(1)
      	 clear_page()
      
      The solution is to retry __copy_from_user_inatomic() under the PTL after
      checking that the PTE matches orig_pte.
      
      The second copy attempt can still fail, for example due to a non-readable
      PTE, but there is nothing reasonable we can do about it except clear the
      CoW page.
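
      The retry logic can be sketched as a single-threaded userspace model.
      This is an illustration under stated assumptions, not the kernel
      implementation: all model_ names are hypothetical, the PTL is reduced
      to a flag, and returning a "retry the fault" result when the PTE no
      longer matches orig_pte is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 16

/* Hypothetical model of one user mapping; not a kernel structure. */
struct model_mapping {
    bool ptl_held;        /* stands in for the page table lock (PTL) */
    unsigned long pte;    /* 0 means the PTE has been zapped */
    const char *src;      /* source data; NULL models a non-readable PTE */
};

enum model_cow_result {
    MODEL_COW_COPIED,
    MODEL_COW_CLEARED,
    MODEL_COW_RETRY_FAULT
};

static void model_ptl_lock(struct model_mapping *m)
{
    assert(!m->ptl_held);
    m->ptl_held = true;
}

static void model_ptl_unlock(struct model_mapping *m)
{
    assert(m->ptl_held);
    m->ptl_held = false;
}

/* Fails instead of faulting, like an in-atomic user copy. */
static bool model_copy_inatomic(char *dst, const struct model_mapping *m)
{
    if (m->pte == 0 || m->src == NULL)
        return false;
    memcpy(dst, m->src, MODEL_PAGE_SIZE);
    return true;
}

static enum model_cow_result model_cow_copy(char *dst, struct model_mapping *m,
                                            unsigned long orig_pte)
{
    enum model_cow_result res = MODEL_COW_COPIED;

    /* Fast path: lockless copy attempt. */
    if (model_copy_inatomic(dst, m))
        return res;

    model_ptl_lock(m);
    if (m->pte != orig_pte) {
        /* The PTE changed under us (e.g. it was zapped): retry the fault. */
        res = MODEL_COW_RETRY_FAULT;
    } else if (!model_copy_inatomic(dst, m)) {
        /* Second attempt failed with a stable PTE: clear the CoW page. */
        memset(dst, 0, MODEL_PAGE_SIZE);
        res = MODEL_COW_CLEARED;
    }
    model_ptl_unlock(m);
    return res;
}
```

      In this model a zapped PTE is detected under the lock instead of
      silently producing a zeroed page, and the page is cleared only when
      the PTE is unchanged and the second copy still fails.
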
      Reported-by: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Tested-by: Jeff Moyer <jmoyer@redhat.com>
      Cc: <stable@vger.kernel.org>
      Cc: Justin He <Justin.He@arm.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Link: http://lkml.kernel.org/r/20200218154151.13349-1-kirill.shutemov@linux.intel.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: fix possible PMD dirty bit lost in set_pmd_migration_entry() · 8a8683ad
      Huang Ying authored
      In set_pmd_migration_entry(), pmdp_invalidate() is used to change the PMD
      atomically.  But the PMD is read before that with an ordinary memory
      read.  If the THP (transparent huge page) is written between the PMD
      read and pmdp_invalidate(), the PMD dirty bit may be lost, causing
      data corruption.  The race window is quite small, but the race is still
      possible in theory, so it needs to be fixed.
      
      The race is fixed by using the return value of pmdp_invalidate() to get
      the original content of the PMD.  pmdp_invalidate() is an atomic
      read/modify/write operation, so no THP write can occur in between.
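
      The difference between the two patterns can be modelled in userspace
      with C11 atomics. This is a hedged sketch rather than the kernel code:
      the model_ names and bit layout are hypothetical, and the callback
      simply simulates a write that sets the dirty bit at a chosen point.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit layout for the model; not the real PMD format. */
#define MODEL_PMD_PRESENT UINT64_C(0x1)
#define MODEL_PMD_DIRTY   UINT64_C(0x2)

typedef void (*model_writer_fn)(_Atomic uint64_t *pmd);

/* Stands in for hardware setting the dirty bit on a write to the page. */
static void model_hw_write(_Atomic uint64_t *pmd)
{
    atomic_fetch_or(pmd, MODEL_PMD_DIRTY);
}

/* Racy pattern: ordinary read first, invalidate later. */
static uint64_t model_invalidate_racy(_Atomic uint64_t *pmd,
                                      model_writer_fn concurrent_write)
{
    assert(pmd != NULL);
    uint64_t snapshot = atomic_load(pmd);   /* separate read */
    if (concurrent_write)
        concurrent_write(pmd);              /* a write lands in the window */
    atomic_store(pmd, snapshot & ~MODEL_PMD_PRESENT);
    return snapshot;                        /* may miss the dirty bit */
}

/* Fixed pattern: one atomic read-modify-write returns the old value. */
static uint64_t model_invalidate_atomic(_Atomic uint64_t *pmd,
                                        model_writer_fn earlier_write)
{
    assert(pmd != NULL);
    if (earlier_write)
        earlier_write(pmd);                 /* any write before the RMW ... */
    /* ... is visible in the value the RMW returns. */
    return atomic_fetch_and(pmd, ~MODEL_PMD_PRESENT);
}
```

      In the racy variant the returned snapshot lacks the dirty bit set in
      the window; in the atomic variant there is no window, so the returned
      old value always reflects earlier writes.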
      
      The race was introduced when THP migration support was added in
      commit 616b8371 ("mm: thp: enable thp migration in generic path").
      But this fix depends on commit d52605d7 ("mm: do not lose dirty
      and accessed bits in pmdp_invalidate()"), so it is easy to backport
      only after v4.16.  Because the race window is really small, it may also
      be fine not to backport the fix at all.
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
      Reviewed-by: Zi Yan <ziy@nvidia.com>
      Reviewed-by: William Kucharski <william.kucharski@oracle.com>
      Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: <stable@vger.kernel.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Link: http://lkml.kernel.org/r/20200220075220.2327056-1-ying.huang@intel.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm, numa: fix bad pmd by atomically check for pmd_trans_huge when marking page tables prot_numa · 8b272b3c
      Mel Gorman authored
      : A user reported a bug against a distribution kernel while running a
      : proprietary workload described as "memory intensive that is not swapping"
      : that is expected to apply to mainline kernels.  The workload is
      : read/write/modifying ranges of memory and checking the contents.  They
      : reported that within a few hours a bad PMD would be reported, followed
      : by memory corruption where expected data was all zeros.  A partial
      : report of the bad PMD looked like
      :
      :   [ 5195.338482] ../mm/pgtable-generic.c:33: bad pmd ffff8888157ba008(000002e0396009e2)
      :   [ 5195.341184] ------------[ cut here ]------------
      :   [ 5195.356880] kernel BUG at ../mm/pgtable-generic.c:35!
      :   ....
      :   [ 5195.410033] Call Trace:
      :   [ 5195.410471]  [<ffffffff811bc75d>] change_protection_range+0x7dd/0x930
      :   [ 5195.410716]  [<ffffffff811d4be8>] change_prot_numa+0x18/0x30
      :   [ 5195.410918]  [<ffffffff810adefe>] task_numa_work+0x1fe/0x310
      :   [ 5195.411200]  [<ffffffff81098322>] task_work_run+0x72/0x90
      :   [ 5195.411246]  [<ffffffff81077139>] exit_to_usermode_loop+0x91/0xc2
      :   [ 5195.411494]  [<ffffffff81003a51>] prepare_exit_to_usermode+0x31/0x40
      :   [ 5195.411739]  [<ffffffff815e56af>] retint_user+0x8/0x10
      :
      : Decoding revealed that the PMD was a valid prot_numa PMD and the bad PMD
      : was a false detection.  The bug does not trigger if automatic NUMA
      : balancing or transparent huge pages is disabled.
      :
      : The bug is due to a race in change_pmd_range between a pmd_trans_huge and
      : pmd_none_or_clear_bad check without any locks held.  During the
      : pmd_trans_huge check, a parallel protection update under lock can have
      : cleared the PMD and filled it with a prot_numa entry between the transhuge
      : check and the pmd_none_or_clear_bad check.
      :
      : While this could be fixed with heavy locking, it's only necessary to make
      : a copy of the PMD on the stack during change_pmd_range and avoid races.  A
      : new helper is created for this as the check is quite subtle and the
      : existing similar helper is not suitable.  This passed 154 hours of
      : testing (usually triggers between 20 minutes and 24 hours) without
      : detecting bad PMDs or corruption.  A basic test of an autonuma-intensive
      : workload showed no significant change in behaviour.
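
      The idea of checking a stack copy of the PMD can be sketched in
      userspace as follows. This is an illustration, not the helper added by
      the patch: the model_ names and the bit layout are hypothetical, and
      the callback simulates the parallel update that fills a cleared PMD
      with a huge entry between two reads.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit layout for the model; not the real PMD format. */
#define MODEL_PMD_TABLE UINT64_C(0x1)   /* entry points to a page table */
#define MODEL_PMD_HUGE  UINT64_C(0x2)   /* entry maps a huge page */

typedef void (*model_update_fn)(_Atomic uint64_t *pmd);

static bool model_is_none(uint64_t v) { return v == 0; }
static bool model_is_huge(uint64_t v) { return (v & MODEL_PMD_HUGE) != 0; }

/* A non-empty entry that is not a plain table pointer looks "bad". */
static bool model_is_bad(uint64_t v)
{
    return v != 0 && ((v & MODEL_PMD_TABLE) == 0 || (v & MODEL_PMD_HUGE) != 0);
}

/* Simulates the parallel update: the cleared PMD becomes a huge entry. */
static void model_fill_huge(_Atomic uint64_t *pmd)
{
    atomic_store(pmd, MODEL_PMD_HUGE);
}

/* Racy: two separate reads, so the checks can see different values. */
static bool model_skip_racy(_Atomic uint64_t *pmd, model_update_fn update,
                            bool *flagged_bad)
{
    if (model_is_huge(atomic_load(pmd)))    /* first read */
        return false;
    if (update)
        update(pmd);                        /* parallel update lands here */
    uint64_t v = atomic_load(pmd);          /* second read */
    if (model_is_none(v))
        return true;
    if (model_is_bad(v)) {
        *flagged_bad = true;                /* false detection of a bad PMD */
        return true;
    }
    return false;
}

/* Fixed: read once into a local copy and run every check on that copy. */
static bool model_skip_snapshot(_Atomic uint64_t *pmd, model_update_fn update,
                                bool *flagged_bad)
{
    uint64_t v = atomic_load(pmd);          /* single read onto the stack */
    if (update)
        update(pmd);                        /* cannot change the local copy */
    if (model_is_none(v))
        return true;
    if (model_is_huge(v))
        return false;
    if (model_is_bad(v)) {
        *flagged_bad = true;
        return true;
    }
    return false;
}
```

      In the racy variant the simulated update makes a valid huge entry look
      bad; with the snapshot, both checks see the same value and nothing is
      flagged.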
      
      Although Mel withdrew the patch in the face of the LKML comment
      https://lkml.org/lkml/2017/4/10/922, the race window described above is
      still open, and we have reports of the Linpack test reporting bad
      residuals after the bad PMD warning is observed.  In addition, bad
      rss-counter and non-zero pgtables assertions are triggered on mm teardown
      for the task hitting the bad PMD.
      
       host kernel: mm/pgtable-generic.c:40: bad pmd 00000000b3152f68(8000000d2d2008e7)
       ....
       host kernel: BUG: Bad rss-counter state mm:00000000b583043d idx:1 val:512
       host kernel: BUG: non-zero pgtables_bytes on freeing mm: 4096
      
      The issue is observed on a v4.18-based distribution kernel, but the race
      window is expected to be applicable to mainline kernels, as well.
      
      [akpm@linux-foundation.org: fix comment typo, per Rafael]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Rafael Aquini <aquini@redhat.com>
      Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
      Cc: <stable@vger.kernel.org>
      Cc: Zi Yan <zi.yan@cs.rutgers.edu>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Michal Hocko <mhocko@suse.com>
      Link: http://lkml.kernel.org/r/20200216191800.22423-1-aquini@redhat.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge tag 'devprop-5.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · b0b8a945
      Linus Torvalds authored
      Pull device properties framework fix from Rafael Wysocki:
       "Revert a problematic commit from the 5.3 development cycle (Brendan
        Higgins)"
      
      * tag 'devprop-5.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        Revert "software node: Simplify software_node_release() function"
    • Merge tag 'acpi-5.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · fe67d182
      Linus Torvalds authored
      Pull ACPI documentation fix from Rafael Wysocki:
       "Fix Sphinx format warnings in an ACPI fan document added recently
        (Randy Dunlap)"
      
      * tag 'acpi-5.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        Documentation/admin-guide/acpi: fix fan_performance_states.rst warnings
    • Merge tag 'drm-fixes-2020-03-06' of git://anongit.freedesktop.org/drm/drm · ba0ae9ac
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Weekly fixes round, looks like a few people woke up, got a bunch of
        fixes across the drivers. Bit bigger than I'd like but they all seem
        fine and hopefully it quiets down now.
      
        sun4i, kirin, mediatek and exynos on the ARM side. virtio-gpu and core
        have some mmap fixes, and there is a dma-buf leak. one ttm fence leak
        is also fixed.
      
        Otherwise it's mostly amdgpu and i915.
      
        One of the i915 fixes is for a very long latency I was seeing (using
        latencytop) running gnome-shell locally when using firefox and eating
        nearly all my RAM, it really helps with desktop responsiveness esp
        when firefox is chewing a lot.
      
        dma-buf:
         - fix memory leak
      
        core:
         - shmem object mmap fix.
      
        ttm:
         - Fix fence leak in ttm_buffer_object_transfer().
      
        amdgpu:
         - Gfx reset fix for gfx9, 10
         - Fix for gfx10
         - DP MST fix
         - DCC fix
         - Renoir power fixes
         - Navi power fix
      
        i915:
         - Break up long lists of object reclaim with cond_resched()
         - PSR probe fix
         - TGL workarounds
         - Selftest return value fix
         - Drop timeline mutex while waiting for retirement
         - Wait for OA configuration completion before writes to OA buffer
      
        virtio:
         - Fix resource id creation race in virtio.
         - mmap fixes
      
        sun4i:
         - Fixes for sun4i VI layer format support.
      
        kirin:
         - kirin: Revert "Fix for hikey620 display offset problem"
      
        exynos:
         - fix a kernel oops problem in case that driver is loaded as module.
         - fix a regulator warning issue when I2C DDC adapter cannot be gathered.
         - print out an error message only in error case excepting -EPROBE_DEFER.
      
        mediatek:
         - overlay, cursor and gce fixes"
      
      * tag 'drm-fixes-2020-03-06' of git://anongit.freedesktop.org/drm/drm: (38 commits)
        drm/amdgpu/display: navi1x copy dcn watermark clock settings to smu resume from s3 (v2)
        drm/amd/powerplay: map mclk to fclk for COMBINATIONAL_BYPASS case
        drm/amd/powerplay: fix pre-check condition for setting clock range
        drm/amd/display: fix dcc swath size calculations on dcn1
        drm/amd/display: Clear link settings on MST disable connector
        drm/amdgpu: disable 3D pipe 1 on Navi1x
        drm/amdgpu: clean wptr on wb when gpu recovery
        drm: kirin: Revert "Fix for hikey620 display offset problem"
        drm/i915/gt: Drop the timeline->mutex as we wait for retirement
        drm/i915/perf: Reintroduce wait on OA configuration completion
        drm/sun4i: Fix DE2 VI layer format support
        drm/sun4i: Add separate DE3 VI layer formats
        drm/sun4i: de2/de3: Remove unsupported VI layer formats
        drm/i915/selftests: Fix return in assert_mmap_offset()
        drm/i915: Protect i915_request_await_start from early waits
        drm/i915/tgl: Add Wa_1608008084
        drm/i915/tgl: Add Wa_22010178259:tgl
        drm/i915: Program MBUS with rmw during initialization
        drm/i915/psr: Force PSR probe only after full initialization
        drm/i915/gem: Break up long lists of object reclaim
        ...
    • Merge branch 'acpi-doc' · 86dfa5be
      Rafael J. Wysocki authored
      * acpi-doc:
        Documentation/admin-guide/acpi: fix fan_performance_states.rst warnings
    • Merge tag 'amd-drm-fixes-5.6-2020-03-05' of... · 2ac4853e
      Dave Airlie authored
      Merge tag 'amd-drm-fixes-5.6-2020-03-05' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
      
      amd-drm-fixes-5.6-2020-03-05:
      
      amdgpu:
      - Gfx reset fix for gfx9, 10
      - Fix for gfx10
      - DP MST fix
      - DCC fix
      - Renoir power fixes
      - Navi power fix
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      From: Alex Deucher <alexdeucher@gmail.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20200305185957.4268-1-alexander.deucher@amd.com
    • Merge tag 'drm-intel-fixes-2020-03-05' of... · 64c3fd53
      Dave Airlie authored
      Merge tag 'drm-intel-fixes-2020-03-05' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
      
      drm/i915 fixes for v5.6-rc5:
      - Break up long lists of object reclaim with cond_resched()
      - PSR probe fix
      - TGL workarounds
      - Selftest return value fix
      - Drop timeline mutex while waiting for retirement
      - Wait for OA configuration completion before writes to OA buffer
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      
      From: Jani Nikula <jani.nikula@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/87eeu7nl6z.fsf@intel.com
  2. 05 Mar, 2020 15 commits
  3. 04 Mar, 2020 5 commits
  4. 03 Mar, 2020 6 commits