1. 09 Jun, 2019 27 commits
    • Qu Wenruo's avatar
      btrfs: reloc: Also queue orphan reloc tree for cleanup to avoid BUG_ON() · 19b65aac
      Qu Wenruo authored
      commit 30d40577 upstream.
      
      [BUG]
      When a fs has orphan reloc tree along with unfinished balance:
        ...
              item 16 key (TREE_RELOC ROOT_ITEM FS_TREE) itemoff 12090 itemsize 439
                      generation 12 root_dirid 256 bytenr 300400640 level 1 refs 0 <<<
                      lastsnap 8 byte_limit 0 bytes_used 1359872 flags 0x0(none)
                      uuid 7c48d938-33a3-4aae-ab19-6e5c9d406e46
              item 17 key (BALANCE TEMPORARY_ITEM 0) itemoff 11642 itemsize 448
                      temporary item objectid BALANCE offset 0
                      balance status flags 14
      
      Then at mount time, we can hit the following kernel BUG_ON():
        BTRFS info (device dm-3): relocating block group 298844160 flags metadata|dup
        ------------[ cut here ]------------
        kernel BUG at fs/btrfs/relocation.c:1413!
        invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
        CPU: 1 PID: 897 Comm: btrfs-balance Tainted: G           O      5.2.0-rc1-custom #15
        RIP: 0010:create_reloc_root+0x1eb/0x200 [btrfs]
        Call Trace:
         btrfs_init_reloc_root+0x96/0xb0 [btrfs]
         record_root_in_trans+0xb2/0xe0 [btrfs]
         btrfs_record_root_in_trans+0x55/0x70 [btrfs]
         select_reloc_root+0x7e/0x230 [btrfs]
         do_relocation+0xc4/0x620 [btrfs]
         relocate_tree_blocks+0x592/0x6a0 [btrfs]
         relocate_block_group+0x47b/0x5d0 [btrfs]
         btrfs_relocate_block_group+0x183/0x2f0 [btrfs]
         btrfs_relocate_chunk+0x4e/0xe0 [btrfs]
         btrfs_balance+0x864/0xfa0 [btrfs]
         balance_kthread+0x3b/0x50 [btrfs]
         kthread+0x123/0x140
         ret_from_fork+0x27/0x50
      
      [CAUSE]
      In btrfs, reloc trees are used to record swapped tree blocks during
      balance.
      Reloc tree either get merged (replace old tree blocks of its parent
      subvolume) in next transaction if its ref is 1 (fresh).
      Or is already merged and will be cleaned up if its ref is 0 (orphan).
      
      After commit d2311e69 ("btrfs: relocation: Delay reloc tree deletion
      after merge_reloc_roots"), reloc tree cleanup is delayed until one block
      group is balanced.
      
      Since fresh reloc roots are recorded during merge, as long as there
      is no power loss, those orphan reloc roots converted from fresh ones are
      handled without problem.
      
      However when power loss happens, orphan reloc roots can be recorded
      on-disk, thus at next mount time, we will have orphan reloc roots from
      on-disk data directly, and ignored by clean_dirty_subvols() routine.
      
      Then when background balance starts to balance another block group, and
      needs to create new reloc root for the same root, btrfs_insert_item()
      returns -EEXIST, and trigger that BUG_ON().
      
      [FIX]
      For orphan reloc roots, also queue them to rc->dirty_subvol_roots, so
      all reloc roots no matter orphan or not, can be cleaned up properly and
      avoid above BUG_ON().
      
      And to cooperate with above change, clean_dirty_subvols() will check if
      the queued root is a reloc root or a subvol root.
      For a subvol root, do the old work, and for a orphan reloc root, clean it
      up.
      
      Fixes: d2311e69 ("btrfs: relocation: Delay reloc tree deletion after merge_reloc_roots")
      CC: stable@vger.kernel.org # 5.1
      Signed-off-by: default avatarQu Wenruo <wqu@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      19b65aac
    • Filipe Manana's avatar
      Btrfs: incremental send, fix file corruption when no-holes feature is enabled · 8a7c6f19
      Filipe Manana authored
      commit 6b1f72e5 upstream.
      
      When using the no-holes feature, if we have a file with prealloc extents
      with a start offset beyond the file's eof, doing an incremental send can
      cause corruption of the file due to incorrect hole detection. Such case
      requires that the prealloc extent(s) exist in both the parent and send
      snapshots, and that a hole is punched into the file that covers all its
      extents that do not cross the eof boundary.
      
      Example reproducer:
      
        $ mkfs.btrfs -f -O no-holes /dev/sdb
        $ mount /dev/sdb /mnt/sdb
      
        $ xfs_io -f -c "pwrite -S 0xab 0 500K" /mnt/sdb/foobar
        $ xfs_io -c "falloc -k 1200K 800K" /mnt/sdb/foobar
      
        $ btrfs subvolume snapshot -r /mnt/sdb /mnt/sdb/base
      
        $ btrfs send -f /tmp/base.snap /mnt/sdb/base
      
        $ xfs_io -c "fpunch 0 500K" /mnt/sdb/foobar
      
        $ btrfs subvolume snapshot -r /mnt/sdb /mnt/sdb/incr
      
        $ btrfs send -p /mnt/sdb/base -f /tmp/incr.snap /mnt/sdb/incr
      
        $ md5sum /mnt/sdb/incr/foobar
        816df6f64deba63b029ca19d880ee10a   /mnt/sdb/incr/foobar
      
        $ mkfs.btrfs -f /dev/sdc
        $ mount /dev/sdc /mnt/sdc
      
        $ btrfs receive -f /tmp/base.snap /mnt/sdc
        $ btrfs receive -f /tmp/incr.snap /mnt/sdc
      
        $ md5sum /mnt/sdc/incr/foobar
        cf2ef71f4a9e90c2f6013ba3b2257ed2   /mnt/sdc/incr/foobar
      
          --> Different checksum, because the prealloc extent beyond the
              file's eof confused the hole detection code and it assumed
              a hole starting at offset 0 and ending at the offset of the
              prealloc extent (1200Kb) instead of ending at the offset
              500Kb (the file's size).
      
      Fix this by ensuring we never cross the file's size when issuing the
      write operations for a hole.
      
      Fixes: 16e7549f ("Btrfs: incompatible format change to remove hole extents")
      CC: stable@vger.kernel.org # 3.14+
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8a7c6f19
    • Qu Wenruo's avatar
      btrfs: qgroup: Check bg while resuming relocation to avoid NULL pointer dereference · afd72278
      Qu Wenruo authored
      commit 57949d03 upstream.
      
      [BUG]
      When mounting a fs with reloc tree and has qgroup enabled, it can cause
      NULL pointer dereference at mount time:
      
        BUG: kernel NULL pointer dereference, address: 00000000000000a8
        #PF: supervisor read access in kernel mode
        #PF: error_code(0x0000) - not-present page
        PGD 0 P4D 0
        Oops: 0000 [#1] PREEMPT SMP NOPTI
        RIP: 0010:btrfs_qgroup_add_swapped_blocks+0x186/0x300 [btrfs]
        Call Trace:
         replace_path.isra.23+0x685/0x900 [btrfs]
         merge_reloc_root+0x26e/0x5f0 [btrfs]
         merge_reloc_roots+0x10a/0x1a0 [btrfs]
         btrfs_recover_relocation+0x3cd/0x420 [btrfs]
         open_ctree+0x1bc8/0x1ed0 [btrfs]
         btrfs_mount_root+0x544/0x680 [btrfs]
         legacy_get_tree+0x34/0x60
         vfs_get_tree+0x2d/0xf0
         fc_mount+0x12/0x40
         vfs_kern_mount.part.12+0x61/0xa0
         vfs_kern_mount+0x13/0x20
         btrfs_mount+0x16f/0x860 [btrfs]
         legacy_get_tree+0x34/0x60
         vfs_get_tree+0x2d/0xf0
         do_mount+0x81f/0xac0
         ksys_mount+0xbf/0xe0
         __x64_sys_mount+0x25/0x30
         do_syscall_64+0x65/0x240
         entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      [CAUSE]
      In btrfs_recover_relocation(), we don't have enough info to determine
      which block group we're relocating, but only to merge existing reloc
      trees.
      
      Thus in btrfs_recover_relocation(), rc->block_group is NULL.
      btrfs_qgroup_add_swapped_blocks() hasn't taken this into consideration,
      and causes a NULL pointer dereference.
      
      The bug is introduced by commit 3d0174f7 ("btrfs: qgroup: Only trace
      data extents in leaves if we're relocating data block group"), and
      later qgroup refactoring still keeps this optimization.
      
      [FIX]
      Thankfully in the context of btrfs_recover_relocation(), there is no
      other progress can modify tree blocks, thus those swapped tree blocks
      pair will never affect qgroup numbers, no matter whatever we set for
      block->trace_leaf.
      
      So we only need to check if @bg is NULL before accessing @bg->flags.
      Reported-by: default avatarJuan Erbes <jerbes@gmail.com>
      Link: https://bugzilla.opensuse.org/show_bug.cgi?id=1134806
      Fixes: 3d0174f7 ("btrfs: qgroup: Only trace data extents in leaves if we're relocating data block group")
      CC: stable@vger.kernel.org # 4.20+
      Signed-off-by: default avatarQu Wenruo <wqu@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      afd72278
    • Dennis Zhou's avatar
      btrfs: correct zstd workspace manager lock to use spin_lock_bh() · c2d03b61
      Dennis Zhou authored
      commit fee13fe9 upstream.
      
      The btrfs zstd workspace manager uses a background timer to reclaim not
      recently used workspaces. I used spin_lock() from this context which
      should have been caught with lockdep, but was not. This deadlock was
      reported in bugzilla. The fix is to switch the zstd wsm lock to use
      spin_lock_bh() from the softirq context.
      
      This happened quite relibably on ppc64, unlike on other architectures.
      
        [  313.402874] ================================
        [  313.402875] WARNING: inconsistent lock state
        [  313.402879] 5.1.0-rc7 #1 Not tainted
        [  313.402880] --------------------------------
        [  313.402882] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
        [  313.402885] swapper/5/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
        [  313.402888] 0000000080d1120c (&(&wsm.lock)->rlock){+.?.}, at: .zstd_reclaim_timer_fn+0x40/0x230
        [  313.402895] {SOFTIRQ-ON-W} state was registered at:
        [  313.402899]   .lock_acquire+0xd0/0x240
        [  313.402903]   ._raw_spin_lock+0x34/0x60
        [  313.402906]   .zstd_get_workspace+0xd0/0x360
        [  313.402908]   .end_compressed_bio_read+0x3b8/0x540
        [  313.402911]   .bio_endio+0x174/0x2c0
        [  313.402914]   .end_workqueue_fn+0x4c/0x70
        [  313.402917]   .normal_work_helper+0x138/0x7e0
        [  313.402920]   .process_one_work+0x324/0x790
        [  313.402922]   .worker_thread+0x68/0x570
        [  313.402925]   .kthread+0x19c/0x1b0
        [  313.402928]   .ret_from_kernel_thread+0x58/0x78
        [  313.402930] irq event stamp: 2629216
        [  313.402933] hardirqs last  enabled at (2629216): [<c0000000009da738>] ._raw_spin_unlock_irq+0x38/0x60
        [  313.402936] hardirqs last disabled at (2629215): [<c0000000009da4c4>] ._raw_spin_lock_irq+0x24/0x70
        [  313.402939] softirqs last  enabled at (2629212): [<c0000000000af9fc>] .irq_enter+0x8c/0xd0
        [  313.402942] softirqs last disabled at (2629213): [<c0000000000afb58>] .irq_exit+0x118/0x170
        [  313.402944]
      		 other info that might help us debug this:
        [  313.402945]  Possible unsafe locking scenario:
      
        [  313.402947]        CPU0
        [  313.402948]        ----
        [  313.402949]   lock(&(&wsm.lock)->rlock);
        [  313.402951]   <Interrupt>
        [  313.402952]     lock(&(&wsm.lock)->rlock);
        [  313.402954]
      		  *** DEADLOCK ***
      
        [  313.402957] 1 lock held by swapper/5/0:
        [  313.402958]  #0: 000000004b612042 ((&wsm.timer)){+.-.}, at: .call_timer_fn+0x0/0x3c0
        [  313.402963]
      		 stack backtrace:
        [  313.402967] CPU: 5 PID: 0 Comm: swapper/5 Not tainted 5.1.0-rc7 #1
        [  313.402968] Call Trace:
        [  313.402972] [c0000007fa262e70] [c0000000009b3294] .dump_stack+0xe0/0x15c (unreliable)
        [  313.402975] [c0000007fa262f10] [c000000000125548] .print_usage_bug+0x348/0x390
        [  313.402978] [c0000007fa262fd0] [c000000000125cb4] .mark_lock+0x724/0x930
        [  313.402981] [c0000007fa263080] [c000000000126c20] .__lock_acquire+0xc90/0x16a0
        [  313.402984] [c0000007fa2631b0] [c000000000128040] .lock_acquire+0xd0/0x240
        [  313.402987] [c0000007fa263280] [c0000000009da2b4] ._raw_spin_lock+0x34/0x60
        [  313.402990] [c0000007fa263300] [c00000000054b0b0] .zstd_reclaim_timer_fn+0x40/0x230
        [  313.402993] [c0000007fa2633d0] [c000000000158b38] .call_timer_fn+0xc8/0x3c0
        [  313.402996] [c0000007fa2634a0] [c000000000158f74] .expire_timers+0x144/0x260
        [  313.402999] [c0000007fa263550] [c000000000159178] .run_timer_softirq+0xe8/0x230
        [  313.403002] [c0000007fa263680] [c0000000009db288] .__do_softirq+0x188/0x5d4
        [  313.403004] [c0000007fa263790] [c0000000000afb58] .irq_exit+0x118/0x170
        [  313.403008] [c0000007fa263800] [c000000000028d88] .timer_interrupt+0x158/0x430
        [  313.403012] [c0000007fa2638b0] [c0000000000091d4] decrementer_common+0x134/0x140
        [  313.403017] --- interrupt: 901 at replay_interrupt_return+0x0/0x4
      		     LR = .arch_local_irq_restore.part.0+0x68/0x80
        [  313.403020] [c0000007fa263bb0] [c00000000001a3ac] .arch_local_irq_restore.part.0+0x2c/0x80 (unreliable)
        [  313.403024] [c0000007fa263c30] [c0000000007bbbcc] .cpuidle_enter_state+0xec/0x670
        [  313.403027] [c0000007fa263d00] [c0000000000f5130] .call_cpuidle+0x40/0x90
        [  313.403031] [c0000007fa263d70] [c0000000000f554c] .do_idle+0x2dc/0x3a0
        [  313.403034] [c0000007fa263e30] [c0000000000f59ac] .cpu_startup_entry+0x2c/0x30
        [  313.403037] [c0000007fa263ea0] [c000000000045674] .start_secondary+0x644/0x650
        [  313.403041] [c0000007fa263f90] [c00000000000ad5c] start_secondary_prolog+0x10/0x14
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203517
      Fixes: 3f93aef5 ("btrfs: add zstd compression level support")
      CC: stable@vger.kernel.org # 5.1+
      Signed-off-by: default avatarDennis Zhou <dennis@kernel.org>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c2d03b61
    • Filipe Manana's avatar
      Btrfs: fix fsync not persisting changed attributes of a directory · a1d13d87
      Filipe Manana authored
      commit 60d9f503 upstream.
      
      While logging an inode we follow its ancestors and for each one we mark
      it as logged in the current transaction, even if we have not logged it.
      As a consequence if we change an attribute of an ancestor, such as the
      UID or GID for example, and then explicitly fsync it, we end up not
      logging the inode at all despite returning success to user space, which
      results in the attribute being lost if a power failure happens after
      the fsync.
      
      Sample reproducer:
      
        $ mkfs.btrfs -f /dev/sdb
        $ mount /dev/sdb /mnt
      
        $ mkdir /mnt/dir
        $ chown 6007:6007 /mnt/dir
      
        $ sync
      
        $ chown 9003:9003 /mnt/dir
        $ touch /mnt/dir/file
        $ xfs_io -c fsync /mnt/dir/file
      
        # fsync our directory after fsync'ing the new file, should persist the
        # new values for the uid and gid.
        $ xfs_io -c fsync /mnt/dir
      
        <power failure>
      
        $ mount /dev/sdb /mnt
        $ stat -c %u:%g /mnt/dir
        6007:6007
      
          --> should be 9003:9003, the uid and gid were not persisted, despite
              the explicit fsync on the directory prior to the power failure
      
      Fix this by not updating the logged_trans field of ancestor inodes when
      logging an inode, since we have not logged them. Let only future calls to
      btrfs_log_inode() to mark inodes as logged.
      
      This could be triggered by my recent fsync fuzz tester for fstests, for
      which an fstests patch exists titled "fstests: generic, fsync fuzz tester
      with fsstress".
      
      Fixes: 12fcfd22 ("Btrfs: tree logging unlink/rename fixes")
      CC: stable@vger.kernel.org # 4.4+
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a1d13d87
    • Filipe Manana's avatar
      Btrfs: fix race updating log root item during fsync · 1e6d76d4
      Filipe Manana authored
      commit 06989c79 upstream.
      
      When syncing the log, the final phase of a fsync operation, we need to
      either create a log root's item or update the existing item in the log
      tree of log roots, and that depends on the current value of the log
      root's log_transid - if it's 1 we need to create the log root item,
      otherwise it must exist already and we update it. Since there is no
      synchronization between updating the log_transid and checking it for
      deciding whether the log root's item needs to be created or updated, we
      end up with a tiny race window that results in attempts to update the
      item to fail because the item was not yet created:
      
                    CPU 1                                    CPU 2
      
        btrfs_sync_log()
      
          lock root->log_mutex
      
          set log root's log_transid to 1
      
          unlock root->log_mutex
      
                                                     btrfs_sync_log()
      
                                                       lock root->log_mutex
      
                                                       sets log root's
                                                       log_transid to 2
      
                                                       unlock root->log_mutex
      
          update_log_root()
      
            sees log root's log_transid
            with a value of 2
      
              calls btrfs_update_root(),
              which fails with -EUCLEAN
              and causes transaction abort
      
      Until recently the race lead to a BUG_ON at btrfs_update_root(), but after
      the recent commit 7ac1e464 ("btrfs: Don't panic when we can't find a
      root key") we just abort the current transaction.
      
      A sample trace of the BUG_ON() on a SLE12 kernel:
      
        ------------[ cut here ]------------
        kernel BUG at ../fs/btrfs/root-tree.c:157!
        Oops: Exception in kernel mode, sig: 5 [#1]
        SMP NR_CPUS=2048 NUMA pSeries
        (...)
        Supported: Yes, External
        CPU: 78 PID: 76303 Comm: rtas_errd Tainted: G                 X 4.4.156-94.57-default #1
        task: c00000ffa906d010 ti: c00000ff42b08000 task.ti: c00000ff42b08000
        NIP: d000000036ae5cdc LR: d000000036ae5cd8 CTR: 0000000000000000
        REGS: c00000ff42b0b860 TRAP: 0700   Tainted: G                 X  (4.4.156-94.57-default)
        MSR: 8000000002029033 <SF,VEC,EE,ME,IR,DR,RI,LE>  CR: 22444484  XER: 20000000
        CFAR: d000000036aba66c SOFTE: 1
        GPR00: d000000036ae5cd8 c00000ff42b0bae0 d000000036bda220 0000000000000054
        GPR04: 0000000000000001 0000000000000000 c00007ffff8d37c8 0000000000000000
        GPR08: c000000000e19c00 0000000000000000 0000000000000000 3736343438312079
        GPR12: 3930373337303434 c000000007a3a800 00000000007fffff 0000000000000023
        GPR16: c00000ffa9d26028 c00000ffa9d261f8 0000000000000010 c00000ffa9d2ab28
        GPR20: c00000ff42b0bc48 0000000000000001 c00000ff9f0d9888 0000000000000001
        GPR24: c00000ffa9d26000 c00000ffa9d261e8 c00000ffa9d2a800 c00000ff9f0d9888
        GPR28: c00000ffa9d26028 c00000ffa9d2aa98 0000000000000001 c00000ffa98f5b20
        NIP [d000000036ae5cdc] btrfs_update_root+0x25c/0x4e0 [btrfs]
        LR [d000000036ae5cd8] btrfs_update_root+0x258/0x4e0 [btrfs]
        Call Trace:
        [c00000ff42b0bae0] [d000000036ae5cd8] btrfs_update_root+0x258/0x4e0 [btrfs] (unreliable)
        [c00000ff42b0bba0] [d000000036b53610] btrfs_sync_log+0x2d0/0xc60 [btrfs]
        [c00000ff42b0bce0] [d000000036b1785c] btrfs_sync_file+0x44c/0x4e0 [btrfs]
        [c00000ff42b0bd80] [c00000000032e300] vfs_fsync_range+0x70/0x120
        [c00000ff42b0bdd0] [c00000000032e44c] do_fsync+0x5c/0xb0
        [c00000ff42b0be10] [c00000000032e8dc] SyS_fdatasync+0x2c/0x40
        [c00000ff42b0be30] [c000000000009488] system_call+0x3c/0x100
        Instruction dump:
        7f43d378 4bffebb9 60000000 88d90008 3d220000 e8b90000 3b390009 e87a01f0
        e8898e08 e8f90000 4bfd48e5 60000000 <0fe00000> e95b0060 39200004 394a0ea0
        ---[ end trace 8f2dc8f919cabab8 ]---
      
      So fix this by doing the check of log_transid and updating or creating the
      log root's item while holding the root's log_mutex.
      
      Fixes: 7237f183 ("Btrfs: fix tree logs parallel sync")
      CC: stable@vger.kernel.org # 4.4+
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1e6d76d4
    • Filipe Manana's avatar
      Btrfs: fix wrong ctime and mtime of a directory after log replay · a48f8ac4
      Filipe Manana authored
      commit 5338e43a upstream.
      
      When replaying a log that contains a new file or directory name that needs
      to be added to its parent directory, we end up updating the mtime and the
      ctime of the parent directory to the current time after we have set their
      values to the correct ones (set at fsync time), efectivelly losing them.
      
      Sample reproducer:
      
        $ mkfs.btrfs -f /dev/sdb
        $ mount /dev/sdb /mnt
      
        $ mkdir /mnt/dir
        $ touch /mnt/dir/file
      
        # fsync of the directory is optional, not needed
        $ xfs_io -c fsync /mnt/dir
        $ xfs_io -c fsync /mnt/dir/file
      
        $ stat -c %Y /mnt/dir
        1557856079
      
        <power failure>
      
        $ sleep 3
        $ mount /dev/sdb /mnt
        $ stat -c %Y /mnt/dir
        1557856082
      
          --> should have been 1557856079, the mtime is updated to the current
              time when replaying the log
      
      Fix this by not updating the mtime and ctime to the current time at
      btrfs_add_link() when we are replaying a log tree.
      
      This could be triggered by my recent fsync fuzz tester for fstests, for
      which an fstests patch exists titled "fstests: generic, fsync fuzz tester
      with fsstress".
      
      Fixes: e02119d5 ("Btrfs: Add a write ahead tree log to optimize synchronous operations")
      CC: stable@vger.kernel.org # 4.4+
      Reviewed-by: default avatarNikolay Borisov <nborisov@suse.com>
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a48f8ac4
    • Tomas Bortoli's avatar
      tracing: Avoid memory leak in predicate_parse() · 58abecd5
      Tomas Bortoli authored
      commit dfb4a6f2 upstream.
      
      In case of errors, predicate_parse() goes to the out_free label
      to free memory and to return an error code.
      
      However, predicate_parse() does not free the predicates of the
      temporary prog_stack array, thence leaking them.
      
      Link: http://lkml.kernel.org/r/20190528154338.29976-1-tomasbortoli@gmail.com
      
      Cc: stable@vger.kernel.org
      Fixes: 80765597 ("tracing: Rewrite filter logic to be simpler and faster")
      Reported-by: syzbot+6b8e0fb820e570c59e19@syzkaller.appspotmail.com
      Signed-off-by: default avatarTomas Bortoli <tomasbortoli@gmail.com>
      [ Added protection around freeing prog_stack[i].pred ]
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      58abecd5
    • Steffen Maier's avatar
      scsi: zfcp: fix to prevent port_remove with pure auto scan LUNs (only sdevs) · e0c4ec19
      Steffen Maier authored
      commit ef4021fe upstream.
      
      When the user tries to remove a zfcp port via sysfs, we only rejected it if
      there are zfcp unit children under the port. With purely automatically
      scanned LUNs there are no zfcp units but only SCSI devices. In such cases,
      the port_remove erroneously continued. We close the port and this
      implicitly closes all LUNs under the port. The SCSI devices survive with
      their private zfcp_scsi_dev still holding a reference to the "removed"
      zfcp_port (still allocated but invisible in sysfs) [zfcp_get_port_by_wwpn
      in zfcp_scsi_slave_alloc]. This is not a problem as long as the fc_rport
      stays blocked. Once (auto) port scan brings back the removed port, we
      unblock its fc_rport again by design.  However, there is no mechanism that
      would recover (open) the LUNs under the port (no "ersfs_3" without
      zfcp_unit [zfcp_erp_strategy_followup_success]).  Any pending or new I/O to
      such LUN leads to repeated:
      
        Done: NEEDS_RETRY Result: hostbyte=DID_IMM_RETRY driverbyte=DRIVER_OK
      
      See also v4.10 commit 6f2ce1c6 ("scsi: zfcp: fix rport unblock race
      with LUN recovery"). Even a manual LUN recovery
      (echo 0 > /sys/bus/scsi/devices/H:C:T:L/zfcp_failed)
      does not help, as the LUN links to the old "removed" port which remains
      to lack ZFCP_STATUS_COMMON_RUNNING [zfcp_erp_required_act].
      The only workaround is to first ensure that the fc_rport is blocked
      (e.g. port_remove again in case it was re-discovered by (auto) port scan),
      then delete the SCSI devices, and finally re-discover by (auto) port scan.
      The port scan includes an fc_rport unblock, which in turn triggers
      a new scan on the scsi target to freshly get new pure auto scan LUNs.
      
      Fix this by rejecting port_remove also if there are SCSI devices
      (even without any zfcp_unit) under this port. Re-use mechanics from v3.7
      commit d99b601b ("[SCSI] zfcp: restore refcount check on port_remove").
      However, we have to give up zfcp_sysfs_port_units_mutex earlier in unit_add
      to prevent a deadlock with scsi_host scan taking shost->scan_mutex first
      and then zfcp_sysfs_port_units_mutex now in our zfcp_scsi_slave_alloc().
      Signed-off-by: default avatarSteffen Maier <maier@linux.ibm.com>
      Fixes: b62a8d9b ("[SCSI] zfcp: Use SCSI device data zfcp scsi dev instead of zfcp unit")
      Fixes: f8210e34 ("[SCSI] zfcp: Allow midlayer to scan for LUNs when running in NPIV mode")
      Cc: <stable@vger.kernel.org> #2.6.37+
      Reviewed-by: default avatarBenjamin Block <bblock@linux.ibm.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e0c4ec19
    • Steffen Maier's avatar
      scsi: zfcp: fix missing zfcp_port reference put on -EBUSY from port_remove · 66ba9b51
      Steffen Maier authored
      commit d27e5e07 upstream.
      
      With this early return due to zfcp_unit child(ren), we don't use the
      zfcp_port reference from the earlier zfcp_get_port_by_wwpn() anymore and
      need to put it.
      Signed-off-by: default avatarSteffen Maier <maier@linux.ibm.com>
      Fixes: d99b601b ("[SCSI] zfcp: restore refcount check on port_remove")
      Cc: <stable@vger.kernel.org> #3.7+
      Reviewed-by: default avatarJens Remus <jremus@linux.ibm.com>
      Reviewed-by: default avatarBenjamin Block <bblock@linux.ibm.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      66ba9b51
    • Piotr Figiel's avatar
      brcmfmac: fix NULL pointer derefence during USB disconnect · 6da1e14b
      Piotr Figiel authored
      commit 5cdb0ef6 upstream.
      
      In case USB disconnect happens at the moment transmitting workqueue is in
      progress the underlying interface may be gone causing a NULL pointer
      dereference. Add synchronization of the workqueue destruction with the
      detach implementation in core so that the transmitting workqueue is stopped
      during detach before the interfaces are removed.
      
      Fix following Oops:
      
      Unable to handle kernel NULL pointer dereference at virtual address 00000008
      pgd = 9e6a802d
      [00000008] *pgd=00000000
      Internal error: Oops: 5 [#1] PREEMPT SMP ARM
      Modules linked in: nf_log_ipv4 nf_log_common xt_LOG xt_limit iptable_mangle
      xt_connmark xt_tcpudp xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4
      iptable_filter ip_tables x_tables usb_f_mass_storage usb_f_rndis u_ether
      usb_serial_simple usbserial cdc_acm brcmfmac brcmutil smsc95xx usbnet
      ci_hdrc_imx ci_hdrc ulpi usbmisc_imx 8250_exar 8250_pci 8250 8250_base
      libcomposite configfs udc_core
      CPU: 0 PID: 7 Comm: kworker/u8:0 Not tainted 4.19.23-00076-g03740aa-dirty #102
      Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
      Workqueue: brcmf_fws_wq brcmf_fws_dequeue_worker [brcmfmac]
      PC is at brcmf_txfinalize+0x34/0x90 [brcmfmac]
      LR is at brcmf_fws_dequeue_worker+0x218/0x33c [brcmfmac]
      pc : [<7f0dee64>]    lr : [<7f0e4140>]    psr: 60010093
      sp : ee8abef0  ip : 00000000  fp : edf38000
      r10: ffffffed  r9 : edf38970  r8 : edf38004
      r7 : edf3e970  r6 : 00000000  r5 : ede69000  r4 : 00000000
      r3 : 00000a97  r2 : 00000000  r1 : 0000888e  r0 : ede69000
      Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment none
      Control: 10c5387d  Table: 7d03c04a  DAC: 00000051
      Process kworker/u8:0 (pid: 7, stack limit = 0x24ec3e04)
      Stack: (0xee8abef0 to 0xee8ac000)
      bee0:                                     ede69000 00000000 ed56c3e0 7f0e4140
      bf00: 00000001 00000000 edf38004 edf3e99c ed56c3e0 80d03d00 edfea43a edf3e970
      bf20: ee809880 ee804200 ee971100 00000000 edf3e974 00000000 ee804200 80135a70
      bf40: 80d03d00 ee804218 ee809880 ee809894 ee804200 80d03d00 ee804218 ee8aa000
      bf60: 00000088 80135d5c 00000000 ee829f00 ee829dc0 00000000 ee809880 80135d30
      bf80: ee829f1c ee873eac 00000000 8013b1a0 ee829dc0 8013b07c 00000000 00000000
      bfa0: 00000000 00000000 00000000 801010e8 00000000 00000000 00000000 00000000
      bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      bfe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
      [<7f0dee64>] (brcmf_txfinalize [brcmfmac]) from [<7f0e4140>] (brcmf_fws_dequeue_worker+0x218/0x33c [brcmfmac])
      [<7f0e4140>] (brcmf_fws_dequeue_worker [brcmfmac]) from [<80135a70>] (process_one_work+0x138/0x3f8)
      [<80135a70>] (process_one_work) from [<80135d5c>] (worker_thread+0x2c/0x554)
      [<80135d5c>] (worker_thread) from [<8013b1a0>] (kthread+0x124/0x154)
      [<8013b1a0>] (kthread) from [<801010e8>] (ret_from_fork+0x14/0x2c)
      Exception stack(0xee8abfb0 to 0xee8abff8)
      bfa0:                                     00000000 00000000 00000000 00000000
      bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      bfe0: 00000000 00000000 00000000 00000000 00000013 00000000
      Code: e1530001 0a000007 e3560000 e1a00005 (05942008)
      ---[ end trace 079239dd31c86e90 ]---
      Signed-off-by: default avatarPiotr Figiel <p.figiel@camlintechnologies.com>
      Signed-off-by: default avatarKalle Valo <kvalo@codeaurora.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6da1e14b
    • Mauro Carvalho Chehab's avatar
      media: smsusb: better handle optional alignment · 2a1ba1ea
      Mauro Carvalho Chehab authored
      commit a4768663 upstream.
      
      Most Siano devices require an alignment for the response.
      
      Changeset f3be52b0056a ("media: usb: siano: Fix general protection fault in smsusb")
      changed the logic with gets such aligment, but it now produces a
      sparce warning:
      
      drivers/media/usb/siano/smsusb.c: In function 'smsusb_init_device':
      drivers/media/usb/siano/smsusb.c:447:37: warning: 'in_maxp' may be used uninitialized in this function [-Wmaybe-uninitialized]
        447 |   dev->response_alignment = in_maxp - sizeof(struct sms_msg_hdr);
            |                             ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~
      
      The sparse message itself is bogus, but a broken (or fake) USB
      eeprom could produce a negative value for response_alignment.
      
      So, change the code in order to check if the result is not
      negative.
      
      Fixes: 31e0456d ("media: usb: siano: Fix general protection fault in smsusb")
      CC: <stable@vger.kernel.org>
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2a1ba1ea
    • Alan Stern's avatar
      media: usb: siano: Fix false-positive "uninitialized variable" warning · 8c28bf8a
      Alan Stern authored
      commit 45457c01 upstream.
      
      GCC complains about an apparently uninitialized variable recently
      added to smsusb_init_device().  It's a false positive, but to silence
      the warning this patch adds a trivial initialization.
      Signed-off-by: default avatarAlan Stern <stern@rowland.harvard.edu>
      Reported-by: default avatarkbuild test robot <lkp@intel.com>
      CC: <stable@vger.kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8c28bf8a
    • Alan Stern's avatar
      media: usb: siano: Fix general protection fault in smsusb · 45b7d1be
      Alan Stern authored
      commit 31e0456d upstream.
      
      The syzkaller USB fuzzer found a general-protection-fault bug in the
      smsusb part of the Siano DVB driver.  The fault occurs during probe
      because the driver assumes without checking that the device has both
      IN and OUT endpoints and the IN endpoint is ep1.
      
      By slightly rearranging the driver's initialization code, we can make
      the appropriate checks early on and thus avoid the problem.  If the
      expected endpoints aren't present, the new code safely returns -ENODEV
      from the probe routine.
      Signed-off-by: default avatarAlan Stern <stern@rowland.harvard.edu>
      Reported-and-tested-by: syzbot+53f029db71c19a47325a@syzkaller.appspotmail.com
      CC: <stable@vger.kernel.org>
      Reviewed-by: default avatarJohan Hovold <johan@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      45b7d1be
    • Oliver Neukum's avatar
      USB: rio500: fix memory leak in close after disconnect · 97059f28
      Oliver Neukum authored
      commit e0feb734 upstream.
      
      If a disconnected device is closed, rio_close() must free
      the buffers.
      Signed-off-by: default avatarOliver Neukum <oneukum@suse.com>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      97059f28
    • Oliver Neukum's avatar
      USB: rio500: refuse more than one device at a time · 77198d72
      Oliver Neukum authored
      commit 3864d339 upstream.
      
      This driver is using a global variable. It cannot handle more than
      one device at a time. The issue has been existing since the dawn
      of the driver.
      Signed-off-by: default avatarOliver Neukum <oneukum@suse.com>
      Reported-by: syzbot+35f04d136fc975a70da4@syzkaller.appspotmail.com
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      77198d72
    • Maximilian Luz's avatar
      USB: Add LPM quirk for Surface Dock GigE adapter · 47991232
      Maximilian Luz authored
      commit ea261113 upstream.
      
      Without USB_QUIRK_NO_LPM ethernet will not work and rtl8152 will
      complain with
      
          r8152 <device...>: Stop submitting intr, status -71
      
      Adding the quirk resolves this. As the dock is externally powered, this
      should not have any drawbacks.
      Signed-off-by: default avatarMaximilian Luz <luzmaximilian@gmail.com>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      47991232
    • Oliver Neukum's avatar
      USB: sisusbvga: fix oops in error path of sisusb_probe · 90eab089
      Oliver Neukum authored
      commit 9a5729f6 upstream.
      
      The pointer used to log a failure of usb_register_dev() must
      be set before the error is logged.
      
      v2: fix that minor is not available before registration
      Signed-off-by: default avataroliver Neukum <oneukum@suse.com>
      Reported-by: syzbot+a0cbdbd6d169020c8959@syzkaller.appspotmail.com
      Fixes: 7b5cd5fe ("USB: SisUSB2VGA: Convert printk to dev_* macros")
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      90eab089
    • Alan Stern's avatar
      USB: Fix slab-out-of-bounds write in usb_get_bos_descriptor · 7cfb3a00
      Alan Stern authored
      commit a03ff544 upstream.
      
      The syzkaller USB fuzzer found a slab-out-of-bounds write bug in the
      USB core, caused by a failure to check the actual size of a BOS
      descriptor.  This patch adds a check to make sure the descriptor is at
      least as large as it is supposed to be, so that the code doesn't
      inadvertently access memory beyond the end of the allocated region
      when assigning to dev->bos->desc->bNumDeviceCaps later on.
      Signed-off-by: default avatarAlan Stern <stern@rowland.harvard.edu>
      Reported-and-tested-by: syzbot+71f1e64501a309fcc012@syzkaller.appspotmail.com
      CC: <stable@vger.kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7cfb3a00
    • Shuah Khan's avatar
      usbip: usbip_host: fix stub_dev lock context imbalance regression · e0161d24
      Shuah Khan authored
      commit 3ea3091f upstream.
      
      Fix the following sparse context imbalance regression introduced in
      a patch that fixed sleeping function called from invalid context bug.
      
      kbuild test robot reported on:
      
      tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git  usb-linus
      
      Regressions in current branch:
      
      drivers/usb/usbip/stub_dev.c:399:9: sparse: sparse: context imbalance in 'stub_probe' - different lock contexts for basic block
      drivers/usb/usbip/stub_dev.c:418:13: sparse: sparse: context imbalance in 'stub_disconnect' - different lock contexts for basic block
      drivers/usb/usbip/stub_dev.c:464:1-10: second lock on line 476
      
      Error ids grouped by kconfigs:
      
      recent_errors
      ├── i386-allmodconfig
      │   └── drivers-usb-usbip-stub_dev.c:second-lock-on-line
      ├── x86_64-allmodconfig
      │   ├── drivers-usb-usbip-stub_dev.c:sparse:sparse:context-imbalance-in-stub_disconnect-different-lock-contexts-for-basic-block
      │   └── drivers-usb-usbip-stub_dev.c:sparse:sparse:context-imbalance-in-stub_probe-different-lock-contexts-for-basic-block
      └── x86_64-allyesconfig
          └── drivers-usb-usbip-stub_dev.c:second-lock-on-line
      
      This is a real problem in an error leg where spin_lock() is called on an
      already held lock.
      
      Fix the imbalance in stub_probe() and stub_disconnect().
      Signed-off-by: default avatarShuah Khan <skhan@linuxfoundation.org>
      Fixes: 0c9e8b3c ("usbip: usbip_host: fix BUG: sleeping function called from invalid context")
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e0161d24
    • Shuah Khan's avatar
      usbip: usbip_host: fix BUG: sleeping function called from invalid context · 6787013a
      Shuah Khan authored
      commit 0c9e8b3c upstream.
      
      stub_probe() and stub_disconnect() call functions which could call
      sleeping function in invalid context whil holding busid_lock.
      
      Fix the problem by refining the lock holds to short critical sections
      to change the busid_priv fields. This fix restructures the code to
      limit the lock holds in stub_probe() and stub_disconnect().
      
      stub_probe():
      
      [15217.927028] BUG: sleeping function called from invalid context at mm/slab.h:418
      [15217.927038] in_atomic(): 1, irqs_disabled(): 0, pid: 29087, name: usbip
      [15217.927044] 5 locks held by usbip/29087:
      [15217.927047]  #0: 0000000091647f28 (sb_writers#6){....}, at: vfs_write+0x191/0x1c0
      [15217.927062]  #1: 000000008f9ba75b (&of->mutex){....}, at: kernfs_fop_write+0xf7/0x1b0
      [15217.927072]  #2: 00000000872e5b4b (&dev->mutex){....}, at: __device_driver_lock+0x3b/0x50
      [15217.927082]  #3: 00000000e74ececc (&dev->mutex){....}, at: __device_driver_lock+0x46/0x50
      [15217.927090]  #4: 00000000b20abbe0 (&(&busid_table[i].busid_lock)->rlock){....}, at: get_busid_priv+0x48/0x60 [usbip_host]
      [15217.927103] CPU: 3 PID: 29087 Comm: usbip Tainted: G        W         5.1.0-rc6+ #40
      [15217.927106] Hardware name: Dell Inc. OptiPlex 790/0HY9JP, BIOS A18 09/24/2013
      [15217.927109] Call Trace:
      [15217.927118]  dump_stack+0x63/0x85
      [15217.927127]  ___might_sleep+0xff/0x120
      [15217.927133]  __might_sleep+0x4a/0x80
      [15217.927143]  kmem_cache_alloc_trace+0x1aa/0x210
      [15217.927156]  stub_probe+0xe8/0x440 [usbip_host]
      [15217.927171]  usb_probe_device+0x34/0x70
      
      stub_disconnect():
      
      [15279.182478] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:908
      [15279.182487] in_atomic(): 1, irqs_disabled(): 0, pid: 29114, name: usbip
      [15279.182492] 5 locks held by usbip/29114:
      [15279.182494]  #0: 0000000091647f28 (sb_writers#6){....}, at: vfs_write+0x191/0x1c0
      [15279.182506]  #1: 00000000702cf0f3 (&of->mutex){....}, at: kernfs_fop_write+0xf7/0x1b0
      [15279.182514]  #2: 00000000872e5b4b (&dev->mutex){....}, at: __device_driver_lock+0x3b/0x50
      [15279.182522]  #3: 00000000e74ececc (&dev->mutex){....}, at: __device_driver_lock+0x46/0x50
      [15279.182529]  #4: 00000000b20abbe0 (&(&busid_table[i].busid_lock)->rlock){....}, at: get_busid_priv+0x48/0x60 [usbip_host]
      [15279.182541] CPU: 0 PID: 29114 Comm: usbip Tainted: G        W         5.1.0-rc6+ #40
      [15279.182543] Hardware name: Dell Inc. OptiPlex 790/0HY9JP, BIOS A18 09/24/2013
      [15279.182546] Call Trace:
      [15279.182554]  dump_stack+0x63/0x85
      [15279.182561]  ___might_sleep+0xff/0x120
      [15279.182566]  __might_sleep+0x4a/0x80
      [15279.182574]  __mutex_lock+0x55/0x950
      [15279.182582]  ? get_busid_priv+0x48/0x60 [usbip_host]
      [15279.182587]  ? reacquire_held_locks+0xec/0x1a0
      [15279.182591]  ? get_busid_priv+0x48/0x60 [usbip_host]
      [15279.182597]  ? find_held_lock+0x94/0xa0
      [15279.182609]  mutex_lock_nested+0x1b/0x20
      [15279.182614]  ? mutex_lock_nested+0x1b/0x20
      [15279.182618]  kernfs_remove_by_name_ns+0x2a/0x90
      [15279.182625]  sysfs_remove_file_ns+0x15/0x20
      [15279.182629]  device_remove_file+0x19/0x20
      [15279.182634]  stub_disconnect+0x6d/0x180 [usbip_host]
      [15279.182643]  usb_unbind_device+0x27/0x60
      Signed-off-by: default avatarShuah Khan <skhan@linuxfoundation.org>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6787013a
    • Carsten Schmid's avatar
      usb: xhci: avoid null pointer deref when bos field is NULL · 4eff5fb3
      Carsten Schmid authored
      commit 7aa1bb2f upstream.
      
      With defective USB sticks we see the following error happen:
      usb 1-3: new high-speed USB device number 6 using xhci_hcd
      usb 1-3: device descriptor read/64, error -71
      usb 1-3: device descriptor read/64, error -71
      usb 1-3: new high-speed USB device number 7 using xhci_hcd
      usb 1-3: device descriptor read/64, error -71
      usb 1-3: unable to get BOS descriptor set
      usb 1-3: New USB device found, idVendor=0781, idProduct=5581
      usb 1-3: New USB device strings: Mfr=1, Product=2, SerialNumber=3
      ...
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
      
      This comes from the following place:
      [ 1660.215380] IP: xhci_set_usb2_hardware_lpm+0xdf/0x3d0 [xhci_hcd]
      [ 1660.222092] PGD 0 P4D 0
      [ 1660.224918] Oops: 0000 [#1] PREEMPT SMP NOPTI
      [ 1660.425520] CPU: 1 PID: 38 Comm: kworker/1:1 Tainted: P     U  W  O    4.14.67-apl #1
      [ 1660.434277] Workqueue: usb_hub_wq hub_event [usbcore]
      [ 1660.439918] task: ffffa295b6ae4c80 task.stack: ffffad4580150000
      [ 1660.446532] RIP: 0010:xhci_set_usb2_hardware_lpm+0xdf/0x3d0 [xhci_hcd]
      [ 1660.453821] RSP: 0018:ffffad4580153c70 EFLAGS: 00010046
      [ 1660.459655] RAX: 0000000000000000 RBX: ffffa295b4d7c000 RCX: 0000000000000002
      [ 1660.467625] RDX: 0000000000000002 RSI: ffffffff984a55b2 RDI: ffffffff984a55b2
      [ 1660.475586] RBP: ffffad4580153cc8 R08: 0000000000d6520a R09: 0000000000000001
      [ 1660.483556] R10: ffffad4580a004a0 R11: 0000000000000286 R12: ffffa295b4d7c000
      [ 1660.491525] R13: 0000000000010648 R14: ffffa295a84e1800 R15: 0000000000000000
      [ 1660.499494] FS:  0000000000000000(0000) GS:ffffa295bfc80000(0000) knlGS:0000000000000000
      [ 1660.508530] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 1660.514947] CR2: 0000000000000008 CR3: 000000025a114000 CR4: 00000000003406a0
      [ 1660.522917] Call Trace:
      [ 1660.525657]  usb_set_usb2_hardware_lpm+0x3d/0x70 [usbcore]
      [ 1660.531792]  usb_disable_device+0x242/0x260 [usbcore]
      [ 1660.537439]  usb_disconnect+0xc1/0x2b0 [usbcore]
      [ 1660.542600]  hub_event+0x596/0x18f0 [usbcore]
      [ 1660.547467]  ? trace_preempt_on+0xdf/0x100
      [ 1660.552040]  ? process_one_work+0x1c1/0x410
      [ 1660.556708]  process_one_work+0x1d2/0x410
      [ 1660.561184]  ? preempt_count_add.part.3+0x21/0x60
      [ 1660.566436]  worker_thread+0x2d/0x3f0
      [ 1660.570522]  kthread+0x122/0x140
      [ 1660.574123]  ? process_one_work+0x410/0x410
      [ 1660.578792]  ? kthread_create_on_node+0x60/0x60
      [ 1660.583849]  ret_from_fork+0x3a/0x50
      [ 1660.587839] Code: 00 49 89 c3 49 8b 84 24 50 16 00 00 8d 4a ff 48 8d 04 c8 48 89 ca 4c 8b 10 45 8b 6a 04 48 8b 00 48 89 45 c0 49 8b 86 80 03 00 00 <48> 8b 40 08 8b 40 03 0f 1f 44 00 00 45 85 ff 0f 84 81 01 00 00
      [ 1660.608980] RIP: xhci_set_usb2_hardware_lpm+0xdf/0x3d0 [xhci_hcd] RSP: ffffad4580153c70
      [ 1660.617921] CR2: 0000000000000008
      
      Tracking this down shows that udev->bos is NULL in the following code:
      (xhci.c, in xhci_set_usb2_hardware_lpm)
      	field = le32_to_cpu(udev->bos->ext_cap->bmAttributes);  <<<<<<< here
      
      	xhci_dbg(xhci, "%s port %d USB2 hardware LPM\n",
      			enable ? "enable" : "disable", port_num + 1);
      
      	if (enable) {
      		/* Host supports BESL timeout instead of HIRD */
      		if (udev->usb2_hw_lpm_besl_capable) {
      			/* if device doesn't have a preferred BESL value use a
      			 * default one which works with mixed HIRD and BESL
      			 * systems. See XHCI_DEFAULT_BESL definition in xhci.h
      			 */
      			if ((field & USB_BESL_SUPPORT) &&
      			    (field & USB_BESL_BASELINE_VALID))
      				hird = USB_GET_BESL_BASELINE(field);
      			else
      				hird = udev->l1_params.besl;
      
      The failing case is when disabling LPM. So it is sufficient to avoid
      access to udev->bos by moving the instruction into the "enable" clause.
      
      Cc: Stable <stable@vger.kernel.org>
      Signed-off-by: default avatarCarsten Schmid <carsten_schmid@mentor.com>
      Signed-off-by: default avatarMathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4eff5fb3
    • Andrey Smirnov's avatar
      xhci: Convert xhci_handshake() to use readl_poll_timeout_atomic() · 239c0a91
      Andrey Smirnov authored
      commit f7fac17c upstream.
      
      Xhci_handshake() implements the algorithm already captured by
      readl_poll_timeout_atomic(). Convert the former to use the latter to
      avoid repetition.
      
      Turned out this patch also fixes a bug on the AMD Stoneyridge platform
      where usleep(1) sometimes takes over 10ms.
      This means a 5 second timeout can easily take over 15 seconds which will
      trigger the watchdog and reboot the system.
      
      [Add info about patch fixing a bug to commit message -Mathias]
      Signed-off-by: default avatarAndrey Smirnov <andrew.smirnov@gmail.com>
      Tested-by: default avatarRaul E Rangel <rrangel@chromium.org>
      Reviewed-by: default avatarRaul E Rangel <rrangel@chromium.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarMathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      239c0a91
    • Fabio Estevam's avatar
      xhci: Use %zu for printing size_t type · 1f44ae1f
      Fabio Estevam authored
      commit c1a145a3 upstream.
      
      Commit 597c56e3 ("xhci: update bounce buffer with correct sg num")
      caused the following build warnings:
      
      drivers/usb/host/xhci-ring.c:676:19: warning: format '%ld' expects argument of type 'long int', but argument 3 has type 'size_t {aka unsigned int}' [-Wformat=]
      
      Use %zu for printing size_t type in order to fix the warnings.
      
      Fixes: 597c56e3 ("xhci: update bounce buffer with correct sg num")
      Reported-by: default avatarkbuild test robot <lkp@intel.com>
      Signed-off-by: default avatarFabio Estevam <festevam@gmail.com>
      Cc: stable <stable@vger.kernel.org>
      Acked-by: default avatarMathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1f44ae1f
    • Henry Lin's avatar
      xhci: update bounce buffer with correct sg num · 3983bb9d
      Henry Lin authored
      commit 597c56e3 upstream.
      
      This change fixes a data corruption issue occurred on USB hard disk for
      the case that bounce buffer is used during transferring data.
      
      While updating data between sg list and bounce buffer, current
      implementation passes mapped sg number (urb->num_mapped_sgs) to
      sg_pcopy_from_buffer() and sg_pcopy_to_buffer(). This causes data
      not get copied if target buffer is located in the elements after
      mapped sg elements. This change passes sg number for full list to
      fix issue.
      
      Besides, for copying data from bounce buffer, calling dma_unmap_single()
      on the bounce buffer before copying data to sg list can avoid cache issue.
      
      Fixes: f9c589e1 ("xhci: TD-fragment, align the unsplittable case with a bounce buffer")
      Cc: <stable@vger.kernel.org> # v4.8+
      Signed-off-by: default avatarHenry Lin <henryl@nvidia.com>
      Signed-off-by: default avatarMathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3983bb9d
    • Rasmus Villemoes's avatar
      include/linux/bitops.h: sanitize rotate primitives · a40d0bd2
      Rasmus Villemoes authored
      commit ef4d6f6b upstream.
      
      The ror32 implementation (word >> shift) | (word << (32 - shift) has
      undefined behaviour if shift is outside the [1, 31] range.  Similarly
      for the 64 bit variants.  Most callers pass a compile-time constant
      (naturally in that range), but there's an UBSAN report that these may
      actually be called with a shift count of 0.
      
      Instead of special-casing that, we can make them DTRT for all values of
      shift while also avoiding UB.  For some reason, this was already partly
      done for rol32 (which was well-defined for [0, 31]).  gcc 8 recognizes
      these patterns as rotates, so for example
      
        __u32 rol32(__u32 word, unsigned int shift)
        {
      	return (word << (shift & 31)) | (word >> ((-shift) & 31));
        }
      
      compiles to
      
      0000000000000020 <rol32>:
        20:   89 f8                   mov    %edi,%eax
        22:   89 f1                   mov    %esi,%ecx
        24:   d3 c0                   rol    %cl,%eax
        26:   c3                      retq
      
      Older compilers unfortunately do not do as well, but this only affects
      the small minority of users that don't pass constants.
      
      Due to integer promotions, ro[lr]8 were already well-defined for shifts
      in [0, 8], and ro[lr]16 were mostly well-defined for shifts in [0, 16]
      (only mostly - u16 gets promoted to _signed_ int, so if bit 15 is set,
      word << 16 is undefined).  For consistency, update those as well.
      
      Link: http://lkml.kernel.org/r/20190410211906.2190-1-linux@rasmusvillemoes.dkSigned-off-by: default avatarRasmus Villemoes <linux@rasmusvillemoes.dk>
      Reported-by: default avatarIdo Schimmel <idosch@mellanox.com>
      Tested-by: default avatarIdo Schimmel <idosch@mellanox.com>
      Reviewed-by: default avatarWill Deacon <will.deacon@arm.com>
      Cc: Vadim Pasternak <vadimp@mellanox.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Jacek Anaszewski <jacek.anaszewski@gmail.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a40d0bd2
    • James Clarke's avatar
      sparc64: Fix regression in non-hypervisor TLB flush xcall · 42962ec2
      James Clarke authored
      commit d3c976c1 upstream.
      
      Previously, %g2 would end up with the value PAGE_SIZE, but after the
      commit mentioned below it ends up with the value 1 due to being reused
      for a different purpose. We need it to be PAGE_SIZE as we use it to step
      through pages in our demap loop, otherwise we set different flags in the
      low 12 bits of the address written to, thereby doing things other than a
      nucleus page flush.
      
      Fixes: a74ad5e6 ("sparc64: Handle extremely large kernel TLB range flushes more gracefully.")
      Reported-by: default avatarMeelis Roos <mroos@linux.ee>
      Tested-by: default avatarMeelis Roos <mroos@linux.ee>
      Signed-off-by: default avatarJames Clarke <jrtc27@jrtc27.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      42962ec2
  2. 04 Jun, 2019 13 commits