  1. 06 Mar, 2020 7 commits
    • arch/Kconfig: update HAVE_RELIABLE_STACKTRACE description · 140d7e88
      Miroslav Benes authored
      save_stack_trace_tsk_reliable() is no longer the only function that
      provides reliable stack traces.  An architecture may define
      ARCH_STACKWALK, which provides a newer stack walking interface and the
      arch_stack_walk_reliable() function.  Update the description accordingly.

      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Miroslav Benes <mbenes@suse.cz>
      Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Link: http://lkml.kernel.org/r/20200120154042.9934-1-mbenes@suse.cz
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm, hotplug: fix page online with DEBUG_PAGEALLOC compiled but not enabled · c87cbc1f
      Vlastimil Babka authored
      Commit cd02cf1a ("mm/hotplug: fix an imbalance with DEBUG_PAGEALLOC")
      fixed memory hotplug with debug_pagealloc enabled, where onlining a page
      goes through page freeing, which removes the direct mapping.  Some arches
      don't like when the page is not mapped in the first place, so
      generic_online_page() maps it first.  This is somewhat wasteful, but
      better than special casing page freeing fast paths.
      
      The commit however missed that DEBUG_PAGEALLOC configured doesn't mean
      it's actually enabled.  One has to test debug_pagealloc_enabled() since
      031bc574 ("mm/debug-pagealloc: make debug-pagealloc boottime
      configurable"), or alternatively debug_pagealloc_enabled_static() since
      8e57f8ac ("mm, debug_pagealloc: don't rely on static keys too early"),
      but this is not done.
      
      As a result, an s390 kernel with DEBUG_PAGEALLOC configured but not enabled
      will crash:
      
      Unable to handle kernel pointer dereference in virtual kernel address space
      Failing address: 0000000000000000 TEID: 0000000000000483
      Fault in home space mode while using kernel ASCE.
      AS:0000001ece13400b R2:000003fff7fd000b R3:000003fff7fcc007 S:000003fff7fd7000 P:000000000000013d
      Oops: 0004 ilc:2 [#1] SMP
      CPU: 1 PID: 26015 Comm: chmem Kdump: loaded Tainted: GX 5.3.18-5-default #1 SLE15-SP2 (unreleased)
      Krnl PSW : 0704e00180000000 0000001ecd281b9e (__kernel_map_pages+0x166/0x188)
      R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
      Krnl GPRS: 0000000000000000 0000000000000800 0000400b00000000 0000000000000100
      0000000000000001 0000000000000000 0000000000000002 0000000000000100
      0000001ece139230 0000001ecdd98d40 0000400b00000100 0000000000000000
      000003ffa17e4000 001fffe0114f7d08 0000001ecd4d93ea 001fffe0114f7b20
      Krnl Code: 0000001ecd281b8e: ec17ffff00d8 ahik %r1,%r7,-1
      0000001ecd281b94: ec111dbc0355 risbg %r1,%r1,29,188,3
      >0000001ecd281b9e: 94fb5006 ni 6(%r5),251
      0000001ecd281ba2: 41505008 la %r5,8(%r5)
      0000001ecd281ba6: ec51fffc6064 cgrj %r5,%r1,6,1ecd281b9e
      0000001ecd281bac: 1a07 ar %r0,%r7
      0000001ecd281bae: ec03ff584076 crj %r0,%r3,4,1ecd281a5e
      Call Trace:
      [<0000001ecd281b9e>] __kernel_map_pages+0x166/0x188
      [<0000001ecd4d9516>] online_pages_range+0xf6/0x128
      [<0000001ecd2a8186>] walk_system_ram_range+0x7e/0xd8
      [<0000001ecda28aae>] online_pages+0x2fe/0x3f0
      [<0000001ecd7d02a6>] memory_subsys_online+0x8e/0xc0
      [<0000001ecd7add42>] device_online+0x5a/0xc8
      [<0000001ecd7d0430>] state_store+0x88/0x118
      [<0000001ecd5b9f62>] kernfs_fop_write+0xc2/0x200
      [<0000001ecd5064b6>] vfs_write+0x176/0x1e0
      [<0000001ecd50676a>] ksys_write+0xa2/0x100
      [<0000001ecda315d4>] system_call+0xd8/0x2c8
      
      Fix this by checking debug_pagealloc_enabled_static() before calling
      kernel_map_pages(). Backports to kernels before 5.5 should use
      debug_pagealloc_enabled() instead. Also add comments.
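
      The difference between "configured" and "enabled" can be sketched in
      userspace C.  This is a minimal model under stated assumptions, not the
      kernel code: MODEL_CONFIG_DEBUG_PAGEALLOC, struct model_state,
      model_kernel_map_pages() and model_online_page() are hypothetical
      stand-ins, and the boot-time switch is modelled as a plain struct field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical build-time option: the debug code is compiled in. */
#define MODEL_CONFIG_DEBUG_PAGEALLOC 1

/* Hypothetical runtime state; "enabled" models the boot-time switch. */
struct model_state {
    bool enabled;     /* compiled in does not imply switched on */
    int  map_calls;   /* how often the mapping helper ran */
};

/* Stand-in for the mapping helper that crashed in the report above. */
static void model_kernel_map_pages(struct model_state *st)
{
    st->map_calls++;
}

/* Onlining path: map the page only if the feature is really enabled. */
static void model_online_page(struct model_state *st)
{
    assert(st != NULL);
#if MODEL_CONFIG_DEBUG_PAGEALLOC
    if (st->enabled)
        model_kernel_map_pages(st);
#endif
    /* ... the page would be released to the allocator here ... */
}
```

      With enabled left false, model_online_page() never reaches the mapping
      helper, which is the behaviour the fix restores for kernels that are
      built with the option but booted without it.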
      
      Fixes: cd02cf1a ("mm/hotplug: fix an imbalance with DEBUG_PAGEALLOC")
      Reported-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Cc: <stable@vger.kernel.org>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Qian Cai <cai@lca.pw>
      Link: http://lkml.kernel.org/r/20200224094651.18257-1-vbabka@suse.cz
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/z3fold.c: do not include rwlock.h directly · a8198fed
      Sebastian Andrzej Siewior authored
      rwlock.h should not be included directly; linux/spinlock.h should be
      included instead.  Among other things, the direct include breaks the
      RT build.

      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Vitaly Wool <vitaly.wool@konsulko.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20200224133631.1510569-1-bigeasy@linutronix.de
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • fat: fix uninit-memory access for partial initialized inode · bc87302a
      OGAWA Hirofumi authored
      When an error occurs in the middle of reading an inode, some fields of
      the inode may still be uninitialized.  The evict_inode path may then
      access those fields via iput().

      To fix this, make sure that the inode fields are initialized.
      
      Reported-by: syzbot+9d82b8de2992579da5d0@syzkaller.appspotmail.com
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      Cc: <stable@vger.kernel.org>
      Link: http://lkml.kernel.org/r/871rqnreqx.fsf@mail.parknet.co.jp
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: avoid data corruption on CoW fault into PFN-mapped VMA · c3e5ea6e
      Kirill A. Shutemov authored
      Jeff Moyer has reported that one of the xfstests triggers a warning when
      run on a DAX-enabled filesystem:
      
      	WARNING: CPU: 76 PID: 51024 at mm/memory.c:2317 wp_page_copy+0xc40/0xd50
      	...
      	wp_page_copy+0x98c/0xd50 (unreliable)
      	do_wp_page+0xd8/0xad0
      	__handle_mm_fault+0x748/0x1b90
      	handle_mm_fault+0x120/0x1f0
      	__do_page_fault+0x240/0xd70
      	do_page_fault+0x38/0xd0
      	handle_page_fault+0x10/0x30
      
      The warning is triggered by a failed __copy_from_user_inatomic(), which
      tries to copy data into a CoW page.

      This happens because of a race between MADV_DONTNEED and a CoW page fault:
      
      	CPU0					CPU1
       handle_mm_fault()
         do_wp_page()
           wp_page_copy()
             do_wp_page()
      					madvise(MADV_DONTNEED)
      					  zap_page_range()
      					    zap_pte_range()
      					      ptep_get_and_clear_full()
      					      <TLB flush>
      	 __copy_from_user_inatomic()
      	 sees empty PTE and fails
      	 WARN_ON_ONCE(1)
      	 clear_page()
      
      The solution is to retry __copy_from_user_inatomic() under the PTL after
      checking that the PTE matches orig_pte.

      The second copy attempt can still fail, for example due to a non-readable
      PTE, but there is nothing reasonable we can do about that except clear
      the CoW page.

      Reported-by: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Tested-by: Jeff Moyer <jmoyer@redhat.com>
      Cc: <stable@vger.kernel.org>
      Cc: Justin He <Justin.He@arm.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Link: http://lkml.kernel.org/r/20200218154151.13349-1-kirill.shutemov@linux.intel.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: fix possible PMD dirty bit lost in set_pmd_migration_entry() · 8a8683ad
      Huang Ying authored
      In set_pmd_migration_entry(), pmdp_invalidate() is used to change the PMD
      atomically.  But the PMD is read before that with an ordinary memory
      read.  If the THP (transparent huge page) is written to between the PMD
      read and pmdp_invalidate(), the PMD dirty bit may be lost, causing data
      corruption.  The race window is quite small, but the race is still
      possible in theory, so it needs to be fixed.
      
      The race is fixed by using the return value of pmdp_invalidate() to get
      the original content of the PMD.  pmdp_invalidate() is an atomic
      read/modify/write operation, so no THP write can occur in between.
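
      The difference between the two patterns can be sketched with C11
      atomics in userspace.  This is only an illustration under stated
      assumptions: the bit layout (MODEL_PMD_PRESENT, MODEL_PMD_DIRTY), the
      model_* function names and the writer callback that simulates a
      concurrent THP write are all hypothetical, and atomic_exchange() merely
      stands in for an invalidate operation that returns the old entry.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PMD_PRESENT 0x1u   /* hypothetical bit layout */
#define MODEL_PMD_DIRTY   0x2u

typedef void (*model_writer_fn)(_Atomic uint64_t *pmd);

/* Simulated THP write: the hardware sets the dirty bit in the entry. */
static void model_set_dirty(_Atomic uint64_t *pmd)
{
    atomic_fetch_or(pmd, MODEL_PMD_DIRTY);
}

/* Racy pattern: plain read first, atomic invalidation afterwards. */
static bool model_dirty_racy(_Atomic uint64_t *pmd, model_writer_fn writer)
{
    assert(pmd != NULL);
    uint64_t snapshot = atomic_load(pmd);  /* read before invalidation */
    if (writer)
        writer(pmd);                       /* write lands in the window */
    (void)atomic_exchange(pmd, 0);         /* old value is thrown away */
    return (snapshot & MODEL_PMD_DIRTY) != 0;
}

/* Fixed pattern: the old value comes from the atomic exchange itself. */
static bool model_dirty_fixed(_Atomic uint64_t *pmd, model_writer_fn writer)
{
    assert(pmd != NULL);
    if (writer)
        writer(pmd);                       /* any write before the exchange */
    uint64_t old = atomic_exchange(pmd, 0);
    return (old & MODEL_PMD_DIRTY) != 0;   /* ...is visible in "old" */
}
```

      In the racy variant the dirty bit set by the simulated write is never
      seen, because the decision is based on a value read before the
      invalidation.  In the fixed variant the read and the invalidation are
      one step, so there is no window in which a write can be missed.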
      
      The race was introduced when THP migration support was added in commit
      616b8371 ("mm: thp: enable thp migration in generic path").  This fix
      depends on commit d52605d7 ("mm: do not lose dirty and accessed bits in
      pmdp_invalidate()"), so it is easy to backport to kernels after v4.16.
      But the race window is really small, so it may be fine not to backport
      the fix at all.

      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
      Reviewed-by: Zi Yan <ziy@nvidia.com>
      Reviewed-by: William Kucharski <william.kucharski@oracle.com>
      Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: <stable@vger.kernel.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Link: http://lkml.kernel.org/r/20200220075220.2327056-1-ying.huang@intel.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm, numa: fix bad pmd by atomically check for pmd_trans_huge when marking page tables prot_numa · 8b272b3c
      Mel Gorman authored
      : A user reported a bug against a distribution kernel while running a
      : proprietary workload described as "memory intensive that is not swapping"
      : that is expected to apply to mainline kernels.  The workload is
      : read/write/modifying ranges of memory and checking the contents.  They
      : reported that within a few hours a bad PMD would be reported, followed
      : by memory corruption where the expected data was all zeros.  A partial
      : report of the bad PMD looked like
      :
      :   [ 5195.338482] ../mm/pgtable-generic.c:33: bad pmd ffff8888157ba008(000002e0396009e2)
      :   [ 5195.341184] ------------[ cut here ]------------
      :   [ 5195.356880] kernel BUG at ../mm/pgtable-generic.c:35!
      :   ....
      :   [ 5195.410033] Call Trace:
      :   [ 5195.410471]  [<ffffffff811bc75d>] change_protection_range+0x7dd/0x930
      :   [ 5195.410716]  [<ffffffff811d4be8>] change_prot_numa+0x18/0x30
      :   [ 5195.410918]  [<ffffffff810adefe>] task_numa_work+0x1fe/0x310
      :   [ 5195.411200]  [<ffffffff81098322>] task_work_run+0x72/0x90
      :   [ 5195.411246]  [<ffffffff81077139>] exit_to_usermode_loop+0x91/0xc2
      :   [ 5195.411494]  [<ffffffff81003a51>] prepare_exit_to_usermode+0x31/0x40
      :   [ 5195.411739]  [<ffffffff815e56af>] retint_user+0x8/0x10
      :
      : Decoding revealed that the PMD was a valid prot_numa PMD and the bad PMD
      : was a false detection.  The bug does not trigger if automatic NUMA
      : balancing or transparent huge pages is disabled.
      :
      : The bug is due to a race in change_pmd_range between a pmd_trans_huge and
      : pmd_none_or_clear_bad check without any locks held.  During the
      : pmd_trans_huge check, a parallel protection update under lock can have
      : cleared the PMD and filled it with a prot_numa entry between the transhuge
      : check and the pmd_none_or_clear_bad check.
      :
      : While this could be fixed with heavy locking, it's only necessary to make
      : a copy of the PMD on the stack during change_pmd_range and avoid races.  A
      : new helper is created for this as the check is quite subtle and the
      : existing similar helper is not suitable.  This passed 154 hours of
      : testing (the bug usually triggers between 20 minutes and 24 hours) without
      : detecting bad PMDs or corruption.  A basic test of an autonuma-intensive
      : workload showed no significant change in behaviour.
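
      The "copy on the stack" idea in the quoted description can be sketched
      in userspace C.  This is a simplified model, not the helper added by the
      patch: the three entry kinds, the verdict enum, the model_* names and
      the racer callback that simulates the parallel protection update are
      all assumptions made for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical entry kinds for a page-middle-directory slot. */
enum model_pmd { MODEL_NONE, MODEL_TABLE, MODEL_HUGE };

enum model_verdict { MODEL_V_NONE, MODEL_V_TABLE, MODEL_V_HUGE, MODEL_V_BAD };

typedef void (*model_racer_fn)(_Atomic int *pmdp);

/* Simulated parallel update: installs a huge entry in the slot. */
static void model_make_huge(_Atomic int *pmdp)
{
    atomic_store(pmdp, MODEL_HUGE);
}

/* Racy pattern: two separate reads of the live entry. */
static enum model_verdict model_check_racy(_Atomic int *pmdp, model_racer_fn racer)
{
    assert(pmdp != NULL);
    if (atomic_load(pmdp) == MODEL_HUGE)   /* first read */
        return MODEL_V_HUGE;
    if (racer)
        racer(pmdp);                       /* update between the checks */
    int second = atomic_load(pmdp);        /* second read */
    if (second == MODEL_NONE)
        return MODEL_V_NONE;
    if (second != MODEL_TABLE)
        return MODEL_V_BAD;                /* valid huge entry misreported */
    return MODEL_V_TABLE;
}

/* Snapshot pattern: one read, every check uses the same stack copy. */
static enum model_verdict model_check_snapshot(_Atomic int *pmdp, model_racer_fn racer)
{
    assert(pmdp != NULL);
    int pmd = atomic_load(pmdp);           /* the only read */
    if (racer)
        racer(pmdp);                       /* too late to affect the copy */
    if (pmd == MODEL_NONE)
        return MODEL_V_NONE;
    if (pmd == MODEL_HUGE)
        return MODEL_V_HUGE;
    if (pmd != MODEL_TABLE)
        return MODEL_V_BAD;
    return MODEL_V_TABLE;
}
```

      Starting from an empty slot, the racy variant reports a bad entry once
      the simulated update slips in between its two reads, while the snapshot
      variant gives a verdict that is consistent with the single value it read.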
      
      Although Mel withdrew the patch in the face of an LKML comment
      (https://lkml.org/lkml/2017/4/10/922), the race window described above is
      still open, and we have reports of the Linpack test reporting bad residuals
      after the bad PMD warning is observed.  In addition, bad rss-counter and
      non-zero pgtables assertions are triggered on mm teardown for the task
      hitting the bad PMD.
      
       host kernel: mm/pgtable-generic.c:40: bad pmd 00000000b3152f68(8000000d2d2008e7)
       ....
       host kernel: BUG: Bad rss-counter state mm:00000000b583043d idx:1 val:512
       host kernel: BUG: non-zero pgtables_bytes on freeing mm: 4096
      
      The issue is observed on a v4.18-based distribution kernel, but the race
      window is expected to be applicable to mainline kernels, as well.
      
      [akpm@linux-foundation.org: fix comment typo, per Rafael]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Rafael Aquini <aquini@redhat.com>
      Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
      Cc: <stable@vger.kernel.org>
      Cc: Zi Yan <zi.yan@cs.rutgers.edu>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Michal Hocko <mhocko@suse.com>
      Link: http://lkml.kernel.org/r/20200216191800.22423-1-aquini@redhat.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 05 Mar, 2020 5 commits
  3. 04 Mar, 2020 1 commit
    • Merge tag 'for-5.6/dm-fixes' of... · 776e49e8
      Linus Torvalds authored
      Merge tag 'for-5.6/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
      
      Pull device mapper fixes from Mike Snitzer:
      
       - Fix request-based DM's congestion_fn and actually wire it up to the
         bdi.
      
       - Extend dm-bio-record to track additional struct bio members needed by
         DM integrity target.
      
       - Fix DM core to properly advertise that a device is suspended during
         unload (between the presuspend and postsuspend hooks). This change is
         a prereq for related DM integrity and DM writecache fixes. It
         elevates DM integrity's 'suspending' state tracking to DM core.
      
       - Four stable fixes for DM integrity target.
      
       - Fix crash in DM cache target due to incorrect work item cancelling.
      
       - Fix DM thin metadata lockdep warning that was introduced during 5.6
         merge window.
      
       - Fix DM zoned target's chunk work refcounting that regressed during
         recent conversion to refcount_t.
      
       - Bump the minor version for DM core and all target versions that have
         seen interface changes or important fixes during the 5.6 cycle.
      
      * tag 'for-5.6/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
        dm: bump version of core and various targets
        dm: fix congested_fn for request-based device
        dm integrity: use dm_bio_record and dm_bio_restore
        dm bio record: save/restore bi_end_io and bi_integrity
        dm zoned: Fix reference counter initial value of chunk works
        dm writecache: verify watermark during resume
        dm: report suspended device during destroy
        dm thin metadata: fix lockdep complaint
        dm cache: fix a crash due to incorrect work item cancelling
        dm integrity: fix invalid table returned due to argument count mismatch
        dm integrity: fix a deadlock due to offloading to an incorrect workqueue
        dm integrity: fix recalculation when moving from journal mode to bitmap mode
  4. 03 Mar, 2020 5 commits
  5. 02 Mar, 2020 4 commits
    • Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 2873dc25
      Linus Torvalds authored
      Pull x86 fixes from Ingo Molnar:
       "Misc fixes: a pkeys fix for a bug that triggers with weird BIOS
        settings, and two Xen PV fixes: a paravirt interface fix, and
        pagetable dumping fix"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/mm: Fix dump_pagetables with Xen PV
        x86/ioperm: Add new paravirt function update_io_bitmap()
        x86/pkeys: Manually set X86_FEATURE_OSPKE to preserve existing changes
    • Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · c105df5d
      Linus Torvalds authored
      Pull scheduler fix from Ingo Molnar:
       "Fix a scheduler statistics bug"
      
      * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched/fair: Fix statistics for find_idlest_group()
    • Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 852fb4a7
      Linus Torvalds authored
      Pull perf fixes from Ingo Molnar:
       "No kernel side changes, all tooling fixes plus two tooling cleanups
        that were committed late in the merge window alongside the perf
        annotate fixes, delayed by Arnaldo's European trip"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
        perf annotate: Fix segfault with source toggle
        perf annotate: Align struct annotate_args
        perf annotate: Simplify disasm_line allocation and freeing code
        perf annotate: Remove privsize from symbol__annotate() args
        perf probe: Check return value of strlist__add() for -ENOMEM
        perf config: Document missing config options
        perf annotate: Fix perf config option description
        perf annotate: Prefer cmdline option over default config
        perf annotate: Make perf config effective
        perf config: Introduce perf_config_u8()
        perf annotate: Fix --show-nr-samples for tui/stdio2
        perf annotate: Fix --show-total-period for tui/stdio2
        perf annotate/tui: Re-render title bar after switching back from script browser
        tools headers UAPI: Update tools's copy of kvm.h headers
        tools arch x86: Sync the msr-index.h copy with the kernel sources
        perf arch powerpc: Sync powerpc syscall.tbl with the kernel sources
        perf auxtrace: Add auxtrace_record__read_finish()
        perf arm-spe: Fix endless record after being terminated
        perf cs-etm: Fix endless record after being terminated
        perf intel-bts: Fix endless record after being terminated
        ...
    • Merge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e130a920
      Linus Torvalds authored
      Pull EFI fixes from Ingo Molnar:
       "Three fixes to EFI mixed boot mode, mostly related to x86-64 vmap
        stacks activated years ago, bug-fixed recently for EFI, which had
        knock-on effects of various 1:1 mapping assumptions in mixed mode.
      
        There's also a READ_ONCE() fix for reading an mmap-ed EFI firmware
        data field only once, out of caution"
      
      * 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        efi: READ_ONCE rng seed size before munmap
        efi/x86: Handle by-ref arguments covering multiple pages in mixed mode
        efi/x86: Remove support for EFI time and counter services in mixed mode
        efi/x86: Align GUIDs to their size in the mixed mode runtime wrapper
  6. 01 Mar, 2020 5 commits
  7. 29 Feb, 2020 7 commits
    • ext4: potential crash on allocation error in ext4_alloc_flex_bg_array() · 37b0b6b8
      Dan Carpenter authored
      If sbi->s_flex_groups_allocated is zero and the first allocation fails
      then this code will crash.  The problem is that "i--" will set "i" to
      -1 but when we compare "i >= sbi->s_flex_groups_allocated" then the -1
      is type promoted to unsigned and becomes UINT_MAX.  Since UINT_MAX
      is more than zero, the condition is true so we call kvfree(new_groups[-1]).
      The loop will carry on freeing invalid memory until it crashes.
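
      The conversion described above can be reproduced in a few lines of
      standalone C.  This is a minimal sketch, not the ext4 code: the two
      predicate functions are hypothetical, and the signed comparison shown
      as the "fixed" variant is only one possible way to avoid the problem
      (it assumes the count fits in an int), not necessarily what the patch
      does.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Converting -1 to unsigned int wraps around to UINT_MAX. */
static_assert((unsigned int)-1 == UINT_MAX, "-1 converts to UINT_MAX");

/*
 * Models "i >= allocated" with a signed index and an unsigned count.
 * The cast spells out the conversion the compiler applies implicitly.
 */
static bool model_loop_continues_buggy(int i, unsigned int allocated)
{
    return (unsigned int)i >= allocated;
}

/* Compares as signed values, so -1 is below any valid count. */
static bool model_loop_continues_fixed(int i, unsigned int allocated)
{
    return i >= (int)allocated;
}
```

      With i equal to -1 and a count of zero, the first predicate is true and
      a cleanup loop built on it would keep going, while the second predicate
      is false and the loop would stop.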
      
      Fixes: 7c990728 ("ext4: fix potential race between s_flex_groups online resizing and access")
      Reviewed-by: Suraj Jitindar Singh <surajjs@amazon.com>
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Cc: stable@kernel.org
      Link: https://lore.kernel.org/r/20200228092142.7irbc44yaz3by7nb@kili.mountain
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • macintosh: therm_windtunnel: fix regression when instantiating devices · 38b17afb
      Wolfram Sang authored
      Removing attach_adapter from this driver caused a regression for at
      least some machines. Those machines had the sensors described in their
      DT, too, so they didn't need manual creation of the sensor devices. The
      old code worked, though, because manual creation came first. Creation of
      DT devices then failed later and caused error logs, but the sensors
      worked nonetheless because of the manually created devices.
      
      With attach_adapter removed, manual creation now comes later and loses
      the race. The sensor devices were already registered via DT, yet with
      another binding, so the driver could not be bound to them.
      
      This fix refactors the code to remove the race and only manually creates
      devices if there are no DT nodes present. Also, the DT binding is updated
      to match both the DT and the manually created devices. Because we don't
      know which device creation method will be used at runtime, the code to
      start the kthread is moved to do_probe(), which is called by both methods.
      
      Fixes: 3e7bed52 ("macintosh: therm_windtunnel: drop using attach_adapter")
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=201723
      Reported-by: Erhard Furtner <erhard_f@mailbox.org>
      Tested-by: Erhard Furtner <erhard_f@mailbox.org>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
      Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
      Cc: stable@kernel.org # v4.19+
    • jbd2: fix data races at struct journal_head · 6c5d9112
      Qian Cai authored
      journal_head::b_transaction and journal_head::b_next_transaction can
      be accessed concurrently, as noticed by KCSAN:
      
       LTP: starting fsync04
       /dev/zero: Can't open blockdev
       EXT4-fs (loop0): mounting ext3 file system using the ext4 subsystem
       EXT4-fs (loop0): mounted filesystem with ordered data mode. Opts: (null)
       ==================================================================
       BUG: KCSAN: data-race in __jbd2_journal_refile_buffer [jbd2] / jbd2_write_access_granted [jbd2]
      
       write to 0xffff99f9b1bd0e30 of 8 bytes by task 25721 on cpu 70:
        __jbd2_journal_refile_buffer+0xdd/0x210 [jbd2]
        __jbd2_journal_refile_buffer at fs/jbd2/transaction.c:2569
        jbd2_journal_commit_transaction+0x2d15/0x3f20 [jbd2]
        (inlined by) jbd2_journal_commit_transaction at fs/jbd2/commit.c:1034
        kjournald2+0x13b/0x450 [jbd2]
        kthread+0x1cd/0x1f0
        ret_from_fork+0x27/0x50
      
       read to 0xffff99f9b1bd0e30 of 8 bytes by task 25724 on cpu 68:
        jbd2_write_access_granted+0x1b2/0x250 [jbd2]
        jbd2_write_access_granted at fs/jbd2/transaction.c:1155
        jbd2_journal_get_write_access+0x2c/0x60 [jbd2]
        __ext4_journal_get_write_access+0x50/0x90 [ext4]
        ext4_mb_mark_diskspace_used+0x158/0x620 [ext4]
        ext4_mb_new_blocks+0x54f/0xca0 [ext4]
        ext4_ind_map_blocks+0xc79/0x1b40 [ext4]
        ext4_map_blocks+0x3b4/0x950 [ext4]
        _ext4_get_block+0xfc/0x270 [ext4]
        ext4_get_block+0x3b/0x50 [ext4]
        __block_write_begin_int+0x22e/0xae0
        __block_write_begin+0x39/0x50
        ext4_write_begin+0x388/0xb50 [ext4]
        generic_perform_write+0x15d/0x290
        ext4_buffered_write_iter+0x11f/0x210 [ext4]
        ext4_file_write_iter+0xce/0x9e0 [ext4]
        new_sync_write+0x29c/0x3b0
        __vfs_write+0x92/0xa0
        vfs_write+0x103/0x260
        ksys_write+0x9d/0x130
        __x64_sys_write+0x4c/0x60
        do_syscall_64+0x91/0xb05
        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
       5 locks held by fsync04/25724:
        #0: ffff99f9911093f8 (sb_writers#13){.+.+}, at: vfs_write+0x21c/0x260
        #1: ffff99f9db4c0348 (&sb->s_type->i_mutex_key#15){+.+.}, at: ext4_buffered_write_iter+0x65/0x210 [ext4]
        #2: ffff99f5e7dfcf58 (jbd2_handle){++++}, at: start_this_handle+0x1c1/0x9d0 [jbd2]
        #3: ffff99f9db4c0168 (&ei->i_data_sem){++++}, at: ext4_map_blocks+0x176/0x950 [ext4]
        #4: ffffffff99086b40 (rcu_read_lock){....}, at: jbd2_write_access_granted+0x4e/0x250 [jbd2]
       irq event stamp: 1407125
       hardirqs last  enabled at (1407125): [<ffffffff980da9b7>] __find_get_block+0x107/0x790
       hardirqs last disabled at (1407124): [<ffffffff980da8f9>] __find_get_block+0x49/0x790
       softirqs last  enabled at (1405528): [<ffffffff98a0034c>] __do_softirq+0x34c/0x57c
       softirqs last disabled at (1405521): [<ffffffff97cc67a2>] irq_exit+0xa2/0xc0
      
       Reported by Kernel Concurrency Sanitizer on:
       CPU: 68 PID: 25724 Comm: fsync04 Tainted: G L 5.6.0-rc2-next-20200221+ #7
       Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019
      
      The plain reads are outside the jh->b_state_lock critical section, which
      results in data races. Fix them by adding pairs of READ|WRITE_ONCE().
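
      The pairing of marked writes and marked lockless reads can be sketched
      in userspace C.  This is an illustration under stated assumptions: the
      MODEL_READ_ONCE() and MODEL_WRITE_ONCE() macros are simplified
      stand-ins built on volatile accesses and the GNU __typeof__ extension
      (the kernel's own definitions are more elaborate), and the structs and
      functions are hypothetical reductions of the ones named in the report.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for scalar fields; needs GCC or Clang. */
#define MODEL_READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define MODEL_WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

struct model_transaction {
    int tid;
};

struct model_journal_head {
    struct model_transaction *b_transaction;
    struct model_transaction *b_next_transaction;
};

/* Writer side: would run with the lock held in the real code. */
static void model_refile(struct model_journal_head *jh)
{
    assert(jh != NULL);
    MODEL_WRITE_ONCE(jh->b_transaction, jh->b_next_transaction);
    MODEL_WRITE_ONCE(jh->b_next_transaction, NULL);
}

/* Reader side: lockless, so each field is read exactly once. */
static int model_write_access_granted(struct model_journal_head *jh,
                                      struct model_transaction *t)
{
    assert(jh != NULL);
    return MODEL_READ_ONCE(jh->b_transaction) == t ||
           MODEL_READ_ONCE(jh->b_next_transaction) == t;
}
```

      The marked accesses keep the compiler from tearing, fusing or repeating
      the loads and stores; they do not add ordering or mutual exclusion, so
      the writer still needs whatever lock protects the update as a whole.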

      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Qian Cai <cai@lca.pw>
      Link: https://lore.kernel.org/r/20200222043111.2227-1-cai@lca.pw
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 7557c1b3
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "Four small fixes.
      
        Three are in drivers for fairly obvious bugs. The fourth is a set of
        regressions introduced by the compat_ioctl changes because some of the
        compat updates wrongly replaced .ioctl instead of .compat_ioctl"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: compat_ioctl: cdrom: Replace .ioctl with .compat_ioctl in four appropriate places
        scsi: zfcp: fix wrong data and display format of SFP+ temperature
        scsi: sd_sbc: Fix sd_zbc_report_zones()
        scsi: libfc: free response frame from GPN_ID
    • x86/mm: Fix dump_pagetables with Xen PV · bba42aff
      Juergen Gross authored
      Commit 2ae27137 ("x86: mm: convert dump_pagetables to use
      walk_page_range") broke Xen PV guests as the hypervisor reserved hole in
      the memory map was not taken into account.
      
      Fix that by starting the kernel range only at GUARD_HOLE_END_ADDR.
      
      Fixes: 2ae27137 ("x86: mm: convert dump_pagetables to use walk_page_range")
      Reported-by: Julien Grall <julien@xen.org>
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Julien Grall <julien@xen.org>
      Link: https://lkml.kernel.org/r/20200221103851.7855-1-jgross@suse.com
    • x86/ioperm: Add new paravirt function update_io_bitmap() · 99bcd4a6
      Juergen Gross authored
      Commit 111e7b15 ("x86/ioperm: Extend IOPL config to control ioperm()
      as well") reworked the iopl syscall to use I/O bitmaps.
      
      Unfortunately this broke Xen PV domains using that syscall as there is
      currently no I/O bitmap support in PV domains.
      
      Add I/O bitmap support via a new paravirt function update_io_bitmap which
      Xen PV domains can use to update their I/O bitmaps via a hypercall.
      
      Fixes: 111e7b15 ("x86/ioperm: Extend IOPL config to control ioperm() as well")
      Reported-by: Jan Beulich <jbeulich@suse.com>
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Jan Beulich <jbeulich@suse.com>
      Reviewed-by: Jan Beulich <jbeulich@suse.com>
      Cc: <stable@vger.kernel.org> # 5.5
      Link: https://lkml.kernel.org/r/20200218154712.25490-1-jgross@suse.com
    • Merge tag 'perf-urgent-for-mingo-5.6-20200228' of... · 7977fed9
      Ingo Molnar authored
      Merge tag 'perf-urgent-for-mingo-5.6-20200228' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
      
      Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
      
      perf annotate:
      
        Ravi Bangoria:
      
        - Fix segfault with source toggle.
      
        - Fix --show-total-period and --show-nr-samples for tui/stdio2.
      
        - Fix handling of settings in ~/.perfconfig versus the ones passed
          in the command line
      
        - Re-render title bar after switching back from script browser.
      
        - Fix options man page, document some missing ones.
      
      perf probe:
      
        He Zhe:
      
        - Check return value of strlist__add() for -ENOMEM.
      
      tools UAPI:
      
        Arnaldo Carvalho de Melo:
      
        - Sync x86's msr-index.h copy with the kernel sources.
      
        - Update tools's copy of x86's kvm.h headers.

      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  8. 28 Feb, 2020 6 commits
    • Merge tag 'pci-v5.6-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci · 29795de0
      Linus Torvalds authored
      Pull PCI fixes from Bjorn Helgaas:
      
       - Fix build issue on 32-bit ARM with old compilers (Marek Szyprowski)
      
       - Update MAINTAINERS for recent Cadence driver file move (Lukas
         Bulwahn)
      
      * tag 'pci-v5.6-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
        MAINTAINERS: Correct Cadence PCI driver path
        PCI: brcmstb: Fix build on 32bit ARM platforms with older compilers
    • Merge tag 'block-5.6-2020-02-28' of git://git.kernel.dk/linux-block · 2edc78b9
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
      
       - Passthrough insertion fix (Ming)
      
       - Kill off some unused arguments (John)
      
       - blktrace RCU fix (Jan)
      
       - Dead fields removal for null_blk (Dongli)
      
       - NVMe polled IO fix (Bijan)
      
      * tag 'block-5.6-2020-02-28' of git://git.kernel.dk/linux-block:
        nvme-pci: Hold cq_poll_lock while completing CQEs
        blk-mq: Remove some unused function arguments
        null_blk: remove unused fields in 'nullb_cmd'
        blktrace: Protect q->blk_trace with RCU
        blk-mq: insert passthrough request into hctx->dispatch directly
    • Merge tag 'io_uring-5.6-2020-02-28' of git://git.kernel.dk/linux-block · 74dea5d9
      Linus Torvalds authored
      Pull io_uring fixes from Jens Axboe:
      
       - Fix for a race with IOPOLL used with SQPOLL (Xiaoguang)
      
       - Only show ->fdinfo if procfs is enabled (Tobias)
      
       - Fix for a chain with multiple personalities in the SQEs
      
       - Fix for a missing free of personality idr on exit
      
       - Removal of the spin-for-work optimization
      
       - Fix for next work lookup on request completion
      
       - Fix for non-vec read/write result propagation in case of links

       - Fix for fileset references on switch

       - Fix for recvmsg/sendmsg in 32-bit compatibility mode
      
      * tag 'io_uring-5.6-2020-02-28' of git://git.kernel.dk/linux-block:
        io_uring: fix 32-bit compatability with sendmsg/recvmsg
        io_uring: define and set show_fdinfo only if procfs is enabled
        io_uring: drop file set ref put/get on switch
        io_uring: import_single_range() returns 0/-ERROR
        io_uring: pick up link work on submit reference drop
        io-wq: ensure work->task_pid is cleared on init
        io-wq: remove spin-for-work optimization
        io_uring: fix poll_list race for SETUP_IOPOLL|SETUP_SQPOLL
        io_uring: fix personality idr leak
        io_uring: handle multiple personalities in link chains
    • Merge branch 'nvme-5.6-rc4' of git://git.infradead.org/nvme into block-5.6 · 5b8ea58b
      Jens Axboe authored
      Pull NVMe fix from Keith.
      
      * 'nvme-5.6-rc4' of git://git.infradead.org/nvme:
        nvme-pci: Hold cq_poll_lock while completing CQEs
    • Merge tag 'acpi-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · c60c0402
      Linus Torvalds authored
      Pull ACPI fixes from Rafael Wysocki:
       "Fix a couple of configuration issues in the ACPI watchdog (WDAT)
        driver (Mika Westerberg) and make it possible to disable that driver
        at boot time in case it still does not work as expected (Jean
        Delvare)"
      
      * tag 'acpi-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        ACPI: watchdog: Set default timeout in probe
        ACPI: watchdog: Fix gas->access_width usage
        ACPICA: Introduce ACPI_ACCESS_BYTE_WIDTH() macro
        ACPI: watchdog: Allow disabling WDAT at boot
    • Merge tag 'pm-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 36428598
      Linus Torvalds authored
      Pull power management fixes from Rafael Wysocki:
       "Fix a recent cpufreq initialization regression (Rafael Wysocki),
        revert a devfreq commit that made incompatible changes and broke
        userland on some systems (Orson Zhai), drop a stale reference to a
        document that has gone away recently (Jonathan Neuschäfer), and fix a
        typo in a hibernation code comment (Alexandre Belloni)"
      
      * tag 'pm-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        cpufreq: Fix policy initialization for internal governor drivers
        Revert "PM / devfreq: Modify the device name as devfreq(X) for sysfs"
        PM / hibernate: fix typo "reserverd_size" -> "reserved_size"
        Documentation: power: Drop reference to interface.rst