1. 13 Nov, 2018 40 commits
    • Josef Bacik's avatar
      btrfs: don't use ctl->free_space for max_extent_size · 7509d4f9
      Josef Bacik authored
      commit fb5c39d7 upstream.
      
      max_extent_size is supposed to be the largest contiguous range for the
      space info, and ctl->free_space is the total free space in the block
      group.  We need to keep track of these separately and _only_ use the
      max_free_space if we don't have a max_extent_size, as that means our
      original request was too large to search any of the block groups for and
      therefore wouldn't have a max_extent_size set.
      
      CC: stable@vger.kernel.org # 4.14+
      Reviewed-by: default avatarFilipe Manana <fdmanana@suse.com>
      Signed-off-by: default avatarJosef Bacik <jbacik@fb.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7509d4f9
    • Josef Bacik's avatar
      btrfs: set max_extent_size properly · 65d41e98
      Josef Bacik authored
      commit ad22cf6e upstream.
      
      We can't use entry->bytes if our entry is a bitmap entry, we need to use
      entry->max_extent_size in that case.  Fix up all the logic to make this
      consistent.
      
      CC: stable@vger.kernel.org # 4.4+
      Signed-off-by: default avatarJosef Bacik <jbacik@fb.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      65d41e98
    • Filipe Manana's avatar
      Btrfs: fix assertion on fsync of regular file when using no-holes feature · 947a9b02
      Filipe Manana authored
      commit 7ed586d0 upstream.
      
      When using the NO_HOLES feature and logging a regular file, we were
      expecting that if we find an inline extent, that either its size in RAM
      (uncompressed and unenconded) matches the size of the file or if it does
      not, that it matches the sector size and it represents compressed data.
      This assertion does not cover a case where the length of the inline extent
      is smaller than the sector size and also smaller the file's size, such
      case is possible through fallocate. Example:
      
        $ mkfs.btrfs -f -O no-holes /dev/sdb
        $ mount /dev/sdb /mnt
      
        $ xfs_io -f -c "pwrite -S 0xb60 0 21" /mnt/foobar
        $ xfs_io -c "falloc 40 40" /mnt/foobar
        $ xfs_io -c "fsync" /mnt/foobar
      
      In the above example we trigger the assertion because the inline extent's
      length is 21 bytes while the file size is 80 bytes. The fallocate() call
      merely updated the file's size and did not touch the existing inline
      extent, as expected.
      
      So fix this by adjusting the assertion so that an inline extent length
      smaller than the file size is valid if the file size is smaller than the
      filesystem's sector size.
      
      A test case for fstests follows soon.
      Reported-by: default avatarAnatoly Trosinenko <anatoly.trosinenko@gmail.com>
      Fixes: a89ca6f2 ("Btrfs: fix fsync after truncate when no_holes feature is enabled")
      CC: stable@vger.kernel.org # 4.14+
      Link: https://lore.kernel.org/linux-btrfs/CAE5jQCfRSBC7n4pUTFJcmHh109=gwyT9mFkCOL+NKfzswmR=_Q@mail.gmail.com/Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      947a9b02
    • Filipe Manana's avatar
      Btrfs: fix null pointer dereference on compressed write path error · bdda3175
      Filipe Manana authored
      commit 3527a018 upstream.
      
      At inode.c:compress_file_range(), under the "free_pages_out" label, we can
      end up dereferencing the "pages" pointer when it has a NULL value. This
      case happens when "start" has a value of 0 and we fail to allocate memory
      for the "pages" pointer. When that happens we jump to the "cont" label and
      then enter the "if (start == 0)" branch where we immediately call the
      cow_file_range_inline() function. If that function returns 0 (success
      creating an inline extent) or an error (like -ENOMEM for example) we jump
      to the "free_pages_out" label and then access "pages[i]" leading to a NULL
      pointer dereference, since "nr_pages" has a value greater than zero at
      that point.
      
      Fix this by setting "nr_pages" to 0 when we fail to allocate memory for
      the "pages" pointer.
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201119
      Fixes: 771ed689 ("Btrfs: Optimize compressed writeback and reads")
      CC: stable@vger.kernel.org # 4.4+
      Reviewed-by: default avatarLiu Bo <bo.liu@linux.alibaba.com>
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      bdda3175
    • Qu Wenruo's avatar
      btrfs: qgroup: Dirty all qgroups before rescan · 08374d8b
      Qu Wenruo authored
      commit 9c7b0c2e upstream.
      
      [BUG]
      In the following case, rescan won't zero out the number of qgroup 1/0:
      
        $ mkfs.btrfs -fq $DEV
        $ mount $DEV /mnt
      
        $ btrfs quota enable /mnt
        $ btrfs qgroup create 1/0 /mnt
        $ btrfs sub create /mnt/sub
        $ btrfs qgroup assign 0/257 1/0 /mnt
      
        $ dd if=/dev/urandom of=/mnt/sub/file bs=1k count=1000
        $ btrfs sub snap /mnt/sub /mnt/snap
        $ btrfs quota rescan -w /mnt
        $ btrfs qgroup show -pcre /mnt
        qgroupid         rfer         excl     max_rfer     max_excl parent  child
        --------         ----         ----     --------     -------- ------  -----
        0/5          16.00KiB     16.00KiB         none         none ---     ---
        0/257      1016.00KiB     16.00KiB         none         none 1/0     ---
        0/258      1016.00KiB     16.00KiB         none         none ---     ---
        1/0        1016.00KiB     16.00KiB         none         none ---     0/257
      
      So far so good, but:
      
        $ btrfs qgroup remove 0/257 1/0 /mnt
        WARNING: quotas may be inconsistent, rescan needed
        $ btrfs quota rescan -w /mnt
        $ btrfs qgroup show -pcre  /mnt
        qgoupid         rfer         excl     max_rfer     max_excl parent  child
        --------         ----         ----     --------     -------- ------  -----
        0/5          16.00KiB     16.00KiB         none         none ---     ---
        0/257      1016.00KiB     16.00KiB         none         none ---     ---
        0/258      1016.00KiB     16.00KiB         none         none ---     ---
        1/0        1016.00KiB     16.00KiB         none         none ---     ---
      	     ^^^^^^^^^^     ^^^^^^^^ not cleared
      
      [CAUSE]
      Before rescan we call qgroup_rescan_zero_tracking() to zero out all
      qgroups' accounting numbers.
      
      However we don't mark all qgroups dirty, but rely on rescan to do so.
      
      If we have any high level qgroup without children, it won't be marked
      dirty during rescan, since we cannot reach that qgroup.
      
      This will cause QGROUP_INFO items of childless qgroups never get updated
      in the quota tree, thus their numbers will stay the same in "btrfs
      qgroup show" output.
      
      [FIX]
      Just mark all qgroups dirty in qgroup_rescan_zero_tracking(), so even if
      we have childless qgroups, their QGROUP_INFO items will still get
      updated during rescan.
      Reported-by: default avatarMisono Tomohiro <misono.tomohiro@jp.fujitsu.com>
      CC: stable@vger.kernel.org # 4.4+
      Signed-off-by: default avatarQu Wenruo <wqu@suse.com>
      Reviewed-by: default avatarMisono Tomohiro <misono.tomohiro@jp.fujitsu.com>
      Tested-by: default avatarMisono Tomohiro <misono.tomohiro@jp.fujitsu.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      08374d8b
    • Filipe Manana's avatar
      Btrfs: fix wrong dentries after fsync of file that got its parent replaced · 45be291e
      Filipe Manana authored
      commit 0f375eed upstream.
      
      In a scenario like the following:
      
        mkdir /mnt/A               # inode 258
        mkdir /mnt/B               # inode 259
        touch /mnt/B/bar           # inode 260
      
        sync
      
        mv /mnt/B/bar /mnt/A/bar
        mv -T /mnt/A /mnt/B
        fsync /mnt/B/bar
      
        <power fail>
      
      After replaying the log we end up with file bar having 2 hard links, both
      with the name 'bar' and one in the directory with inode number 258 and the
      other in the directory with inode number 259. Also, we end up with the
      directory inode 259 still existing and with the directory inode 258 still
      named as 'A', instead of 'B'. In this scenario, file 'bar' should only
      have one hard link, located at directory inode 258, the directory inode
      259 should not exist anymore and the name for directory inode 258 should
      be 'B'.
      
      This incorrect behaviour happens because when attempting to log the old
      parents of an inode, we skip any parents that no longer exist. Fix this
      by forcing a full commit if an old parent no longer exists.
      
      A test case for fstests follows soon.
      
      CC: stable@vger.kernel.org # 4.4+
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      45be291e
    • Filipe Manana's avatar
      Btrfs: fix warning when replaying log after fsync of a tmpfile · 985579fa
      Filipe Manana authored
      commit f2d72f42 upstream.
      
      When replaying a log which contains a tmpfile (which necessarily has a
      link count of 0) we end up calling inc_nlink(), at
      fs/btrfs/tree-log.c:replay_one_buffer(), which produces a warning like
      the following:
      
        [195191.943673] WARNING: CPU: 0 PID: 6924 at fs/inode.c:342 inc_nlink+0x33/0x40
        [195191.943723] CPU: 0 PID: 6924 Comm: mount Not tainted 4.19.0-rc6-btrfs-next-38 #1
        [195191.943724] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.2-0-gf9626ccb91-prebuilt.qemu-project.org 04/01/2014
        [195191.943726] RIP: 0010:inc_nlink+0x33/0x40
        [195191.943728] RSP: 0018:ffffb96e425e3870 EFLAGS: 00010246
        [195191.943730] RAX: 0000000000000000 RBX: ffff8c0d1e6af4f0 RCX: 0000000000000006
        [195191.943731] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8c0d1e6af4f0
        [195191.943731] RBP: 0000000000000097 R08: 0000000000000001 R09: 0000000000000000
        [195191.943732] R10: 0000000000000000 R11: 0000000000000000 R12: ffffb96e425e3a60
        [195191.943733] R13: ffff8c0d10cff0c8 R14: ffff8c0d0d515348 R15: ffff8c0d78a1b3f8
        [195191.943735] FS:  00007f570ee24480(0000) GS:ffff8c0dfb200000(0000) knlGS:0000000000000000
        [195191.943736] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        [195191.943737] CR2: 00005593286277c8 CR3: 00000000bb8f2006 CR4: 00000000003606f0
        [195191.943739] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        [195191.943740] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        [195191.943741] Call Trace:
        [195191.943778]  replay_one_buffer+0x797/0x7d0 [btrfs]
        [195191.943802]  walk_up_log_tree+0x1c1/0x250 [btrfs]
        [195191.943809]  ? rcu_read_lock_sched_held+0x3f/0x70
        [195191.943825]  walk_log_tree+0xae/0x1d0 [btrfs]
        [195191.943840]  btrfs_recover_log_trees+0x1d7/0x4d0 [btrfs]
        [195191.943856]  ? replay_dir_deletes+0x280/0x280 [btrfs]
        [195191.943870]  open_ctree+0x1c3b/0x22a0 [btrfs]
        [195191.943887]  btrfs_mount_root+0x6b4/0x800 [btrfs]
        [195191.943894]  ? rcu_read_lock_sched_held+0x3f/0x70
        [195191.943899]  ? pcpu_alloc+0x55b/0x7c0
        [195191.943906]  ? mount_fs+0x3b/0x140
        [195191.943908]  mount_fs+0x3b/0x140
        [195191.943912]  ? __init_waitqueue_head+0x36/0x50
        [195191.943916]  vfs_kern_mount+0x62/0x160
        [195191.943927]  btrfs_mount+0x134/0x890 [btrfs]
        [195191.943936]  ? rcu_read_lock_sched_held+0x3f/0x70
        [195191.943938]  ? pcpu_alloc+0x55b/0x7c0
        [195191.943943]  ? mount_fs+0x3b/0x140
        [195191.943952]  ? btrfs_remount+0x570/0x570 [btrfs]
        [195191.943954]  mount_fs+0x3b/0x140
        [195191.943956]  ? __init_waitqueue_head+0x36/0x50
        [195191.943960]  vfs_kern_mount+0x62/0x160
        [195191.943963]  do_mount+0x1f9/0xd40
        [195191.943967]  ? memdup_user+0x4b/0x70
        [195191.943971]  ksys_mount+0x7e/0xd0
        [195191.943974]  __x64_sys_mount+0x21/0x30
        [195191.943977]  do_syscall_64+0x60/0x1b0
        [195191.943980]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
        [195191.943983] RIP: 0033:0x7f570e4e524a
        [195191.943986] RSP: 002b:00007ffd83589478 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5
        [195191.943989] RAX: ffffffffffffffda RBX: 0000563f335b2060 RCX: 00007f570e4e524a
        [195191.943990] RDX: 0000563f335b2240 RSI: 0000563f335b2280 RDI: 0000563f335b2260
        [195191.943992] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000020
        [195191.943993] R10: 00000000c0ed0000 R11: 0000000000000206 R12: 0000563f335b2260
        [195191.943994] R13: 0000563f335b2240 R14: 0000000000000000 R15: 00000000ffffffff
        [195191.944002] irq event stamp: 8688
        [195191.944010] hardirqs last  enabled at (8687): [<ffffffff9cb004c3>] console_unlock+0x503/0x640
        [195191.944012] hardirqs last disabled at (8688): [<ffffffff9ca037dd>] trace_hardirqs_off_thunk+0x1a/0x1c
        [195191.944018] softirqs last  enabled at (8638): [<ffffffff9cc0a5d1>] __set_page_dirty_nobuffers+0x101/0x150
        [195191.944020] softirqs last disabled at (8634): [<ffffffff9cc26bbe>] wb_wakeup_delayed+0x2e/0x60
        [195191.944022] ---[ end trace 5d6e873a9a0b811a ]---
      
      This happens because the inode does not have the flag I_LINKABLE set,
      which is a runtime only flag, not meant to be persisted, set when the
      inode is created through open(2) if the flag O_EXCL is not passed to it.
      Except for the warning, there are no other consequences (like corruptions
      or metadata inconsistencies).
      
      Since it's pointless to replay a tmpfile as it would be deleted in a
      later phase of the log replay procedure (it has a link count of 0), fix
      this by not logging tmpfiles and if a tmpfile is found in a log (created
      by a kernel without this change), skip the replay of the inode.
      
      A test case for fstests follows soon.
      
      Fixes: 471d557a ("Btrfs: fix loss of prealloc extents past i_size after fsync log replay")
      CC: stable@vger.kernel.org # 4.18+
      Reported-by: default avatarMartin Steigerwald <martin@lichtvoll.de>
      Link: https://lore.kernel.org/linux-btrfs/3666619.NTnn27ZJZE@merkaba/Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      985579fa
    • Josef Bacik's avatar
      btrfs: make sure we create all new block groups · 0ae66cf1
      Josef Bacik authored
      commit 545e3366 upstream.
      
      Allocating new chunks modifies both the extent and chunk tree, which can
      trigger new chunk allocations.  So instead of doing list_for_each_safe,
      just do while (!list_empty()) so we make sure we don't exit with other
      pending bg's still on our list.
      
      CC: stable@vger.kernel.org # 4.4+
      Reviewed-by: default avatarOmar Sandoval <osandov@fb.com>
      Reviewed-by: default avatarLiu Bo <bo.liu@linux.alibaba.com>
      Signed-off-by: default avatarJosef Bacik <josef@toxicpanda.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0ae66cf1
    • Josef Bacik's avatar
      btrfs: reset max_extent_size on clear in a bitmap · a7e3da49
      Josef Bacik authored
      commit 553cceb4 upstream.
      
      We need to clear the max_extent_size when we clear bits from a bitmap
      since it could have been from the range that contains the
      max_extent_size.
      
      CC: stable@vger.kernel.org # 4.4+
      Reviewed-by: default avatarLiu Bo <bo.liu@linux.alibaba.com>
      Signed-off-by: default avatarJosef Bacik <jbacik@fb.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a7e3da49
    • Josef Bacik's avatar
      btrfs: protect space cache inode alloc with GFP_NOFS · b5d59a48
      Josef Bacik authored
      commit 84de76a2 upstream.
      
      If we're allocating a new space cache inode it's likely going to be
      under a transaction handle, so we need to use memalloc_nofs_save() in
      order to avoid deadlocks, and more importantly lockdep messages that
      make xfstests fail.
      
      CC: stable@vger.kernel.org # 4.4+
      Reviewed-by: default avatarOmar Sandoval <osandov@fb.com>
      Signed-off-by: default avatarJosef Bacik <josef@toxicpanda.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b5d59a48
    • Josef Bacik's avatar
      btrfs: wait on caching when putting the bg cache · 9ce0f8b6
      Josef Bacik authored
      commit 3aa7c7a3 upstream.
      
      While testing my backport I noticed there was a panic if I ran
      generic/416 generic/417 generic/418 all in a row.  This just happened to
      uncover a race where we had outstanding IO after we destroy all of our
      workqueues, and then we'd go to queue the endio work on those free'd
      workqueues.
      
      This is because we aren't waiting for the caching threads to be done
      before freeing everything up, so to fix this make sure we wait on any
      outstanding caching that's being done before we free up the block group,
      so we're sure to be done with all IO by the time we get to
      btrfs_stop_all_workers().  This fixes the panic I was seeing
      consistently in testing.
      
      ------------[ cut here ]------------
      kernel BUG at fs/btrfs/volumes.c:6112!
      SMP PTI
      Modules linked in:
      CPU: 1 PID: 27165 Comm: kworker/u4:7 Not tainted 4.16.0-02155-g3553e54a578d-dirty #875
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
      Workqueue: btrfs-cache btrfs_cache_helper
      RIP: 0010:btrfs_map_bio+0x346/0x370
      RSP: 0000:ffffc900061e79d0 EFLAGS: 00010202
      RAX: 0000000000000000 RBX: ffff880071542e00 RCX: 0000000000533000
      RDX: ffff88006bb74380 RSI: 0000000000000008 RDI: ffff880078160000
      RBP: 0000000000000001 R08: ffff8800781cd200 R09: 0000000000503000
      R10: ffff88006cd21200 R11: 0000000000000000 R12: 0000000000000000
      R13: 0000000000000000 R14: ffff8800781cd200 R15: ffff880071542e00
      FS:  0000000000000000(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 000000000817ffc4 CR3: 0000000078314000 CR4: 00000000000006e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       btree_submit_bio_hook+0x8a/0xd0
       submit_one_bio+0x5d/0x80
       read_extent_buffer_pages+0x18a/0x320
       btree_read_extent_buffer_pages+0xbc/0x200
       ? alloc_extent_buffer+0x359/0x3e0
       read_tree_block+0x3d/0x60
       read_block_for_search.isra.30+0x1a5/0x360
       btrfs_search_slot+0x41b/0xa10
       btrfs_next_old_leaf+0x212/0x470
       caching_thread+0x323/0x490
       normal_work_helper+0xc5/0x310
       process_one_work+0x141/0x340
       worker_thread+0x44/0x3c0
       kthread+0xf8/0x130
       ? process_one_work+0x340/0x340
       ? kthread_bind+0x10/0x10
       ret_from_fork+0x35/0x40
      RIP: btrfs_map_bio+0x346/0x370 RSP: ffffc900061e79d0
      ---[ end trace 827eb13e50846033 ]---
      Kernel panic - not syncing: Fatal exception
      Kernel Offset: disabled
      ---[ end Kernel panic - not syncing: Fatal exception
      
      CC: stable@vger.kernel.org # 4.4+
      Signed-off-by: default avatarJosef Bacik <josef@toxicpanda.com>
      Reviewed-by: default avatarOmar Sandoval <osandov@fb.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9ce0f8b6
    • Jeff Mahoney's avatar
      btrfs: don't attempt to trim devices that don't support it · da05e937
      Jeff Mahoney authored
      commit 0be88e36 upstream.
      
      We check whether any device the file system is using supports discard in
      the ioctl call, but then we attempt to trim free extents on every device
      regardless of whether discard is supported.  Due to the way we mask off
      EOPNOTSUPP, we can end up issuing the trim operations on each free range
      on devices that don't support it, just wasting time.
      
      Fixes: 499f377f ("btrfs: iterate over unused chunk space in FITRIM")
      CC: stable@vger.kernel.org # 4.4+
      Signed-off-by: default avatarJeff Mahoney <jeffm@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      da05e937
    • Jeff Mahoney's avatar
      btrfs: iterate all devices during trim, instead of fs_devices::alloc_list · a7d24c89
      Jeff Mahoney authored
      commit d4e329de upstream.
      
      btrfs_trim_fs iterates over the fs_devices->alloc_list while holding the
      device_list_mutex.  The problem is that ->alloc_list is protected by the
      chunk mutex.  We don't want to hold the chunk mutex over the trim of the
      entire file system.  Fortunately, the ->dev_list list is protected by
      the dev_list mutex and while it will give us all devices, including
      read-only devices, we already just skip the read-only devices.  Then we
      can continue to take and release the chunk mutex while scanning each
      device.
      
      Fixes: 499f377f ("btrfs: iterate over unused chunk space in FITRIM")
      CC: stable@vger.kernel.org # 4.4+
      Signed-off-by: default avatarJeff Mahoney <jeffm@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a7d24c89
    • Qu Wenruo's avatar
      btrfs: Ensure btrfs_trim_fs can trim the whole filesystem · 2fdb337b
      Qu Wenruo authored
      commit 6ba9fc8e upstream.
      
      [BUG]
      fstrim on some btrfs only trims the unallocated space, not trimming any
      space in existing block groups.
      
      [CAUSE]
      Before fstrim_range passed to btrfs_trim_fs(), it gets truncated to
      range [0, super->total_bytes).  So later btrfs_trim_fs() will only be
      able to trim block groups in range [0, super->total_bytes).
      
      While for btrfs, any bytenr aligned to sectorsize is valid, since btrfs
      uses its logical address space, there is nothing limiting the location
      where we put block groups.
      
      For filesystem with frequent balance, it's quite easy to relocate all
      block groups and bytenr of block groups will start beyond
      super->total_bytes.
      
      In that case, btrfs will not trim existing block groups.
      
      [FIX]
      Just remove the truncation in btrfs_ioctl_fitrim(), so btrfs_trim_fs()
      can get the unmodified range, which is normally set to [0, U64_MAX].
      Reported-by: default avatarChris Murphy <lists@colorremedies.com>
      Fixes: f4c697e6 ("btrfs: return EINVAL if start > total_bytes in fitrim ioctl")
      CC: <stable@vger.kernel.org> # v4.4+
      Signed-off-by: default avatarQu Wenruo <wqu@suse.com>
      Reviewed-by: default avatarNikolay Borisov <nborisov@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2fdb337b
    • Qu Wenruo's avatar
      btrfs: Enhance btrfs_trim_fs function to handle error better · 7b23644c
      Qu Wenruo authored
      commit 93bba24d upstream.
      
      Function btrfs_trim_fs() doesn't handle errors in a consistent way. If
      error happens when trimming existing block groups, it will skip the
      remaining blocks and continue to trim unallocated space for each device.
      
      The return value will only reflect the final error from device trimming.
      
      This patch will fix such behavior by:
      
      1) Recording the last error from block group or device trimming
         The return value will also reflect the last error during trimming.
         Make developer more aware of the problem.
      
      2) Continuing trimming if possible
         If we failed to trim one block group or device, we could still try
         the next block group or device.
      
      3) Report number of failures during block group and device trimming
         It would be less noisy, but still gives user a brief summary of
         what's going wrong.
      
      Such behavior can avoid confusion for cases like failure to trim the
      first block group and then only unallocated space is trimmed.
      Reported-by: default avatarChris Murphy <lists@colorremedies.com>
      CC: stable@vger.kernel.org # 4.4+
      Signed-off-by: default avatarQu Wenruo <wqu@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      [ add bg_ret and dev_ret to the messages ]
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7b23644c
    • Jeff Mahoney's avatar
      btrfs: fix error handling in free_log_tree · e2fc220f
      Jeff Mahoney authored
      commit 374b0e2d upstream.
      
      When we hit an I/O error in free_log_tree->walk_log_tree during file system
      shutdown we can crash due to there not being a valid transaction handle.
      
      Use btrfs_handle_fs_error when there's no transaction handle to use.
      
        BUG: unable to handle kernel NULL pointer dereference at 0000000000000060
        IP: free_log_tree+0xd2/0x140 [btrfs]
        PGD 0 P4D 0
        Oops: 0000 [#1] SMP DEBUG_PAGEALLOC PTI
        Modules linked in: <modules>
        CPU: 2 PID: 23544 Comm: umount Tainted: G        W        4.12.14-kvmsmall #9 SLE15 (unreleased)
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
        task: ffff96bfd3478880 task.stack: ffffa7cf40d78000
        RIP: 0010:free_log_tree+0xd2/0x140 [btrfs]
        RSP: 0018:ffffa7cf40d7bd10 EFLAGS: 00010282
        RAX: 00000000fffffffb RBX: 00000000fffffffb RCX: 0000000000000002
        RDX: 0000000000000000 RSI: ffff96c02f07d4c8 RDI: 0000000000000282
        RBP: ffff96c013cf1000 R08: ffff96c02f07d4c8 R09: ffff96c02f07d4d0
        R10: 0000000000000000 R11: 0000000000000002 R12: 0000000000000000
        R13: ffff96c005e800c0 R14: ffffa7cf40d7bdb8 R15: 0000000000000000
        FS:  00007f17856bcfc0(0000) GS:ffff96c03f600000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 0000000000000060 CR3: 0000000045ed6002 CR4: 00000000003606e0
        DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        Call Trace:
         ? wait_for_writer+0xb0/0xb0 [btrfs]
         btrfs_free_log+0x17/0x30 [btrfs]
         btrfs_drop_and_free_fs_root+0x9a/0xe0 [btrfs]
         btrfs_free_fs_roots+0xc0/0x130 [btrfs]
         ? wait_for_completion+0xf2/0x100
         close_ctree+0xea/0x2e0 [btrfs]
         ? kthread_stop+0x161/0x260
         generic_shutdown_super+0x6c/0x120
         kill_anon_super+0xe/0x20
         btrfs_kill_super+0x13/0x100 [btrfs]
         deactivate_locked_super+0x3f/0x70
         cleanup_mnt+0x3b/0x70
         task_work_run+0x78/0x90
         exit_to_usermode_loop+0x77/0xa6
         do_syscall_64+0x1c5/0x1e0
         entry_SYSCALL_64_after_hwframe+0x42/0xb7
        RIP: 0033:0x7f1784f90827
        RSP: 002b:00007ffdeeb03118 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
        RAX: 0000000000000000 RBX: 0000556a60c62970 RCX: 00007f1784f90827
        RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000556a60c62b50
        RBP: 0000000000000000 R08: 0000000000000005 R09: 00000000ffffffff
        R10: 0000556a60c63900 R11: 0000000000000246 R12: 0000556a60c62b50
        R13: 00007f17854a81c4 R14: 0000000000000000 R15: 0000000000000000
        RIP: free_log_tree+0xd2/0x140 [btrfs] RSP: ffffa7cf40d7bd10
        CR2: 0000000000000060
      
      Fixes: 681ae509 ("Btrfs: cleanup reserved space when freeing tree log on error")
      CC: <stable@vger.kernel.org> # v3.13
      Signed-off-by: default avatarJeff Mahoney <jeffm@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e2fc220f
    • Qu Wenruo's avatar
      btrfs: locking: Add extra check in btrfs_init_new_buffer() to avoid deadlock · d36ac4a3
      Qu Wenruo authored
      commit b72c3aba upstream.
      
      [BUG]
      For certain crafted image, whose csum root leaf has missing backref, if
      we try to trigger write with data csum, it could cause deadlock with the
      following kernel WARN_ON():
      
        WARNING: CPU: 1 PID: 41 at fs/btrfs/locking.c:230 btrfs_tree_lock+0x3e2/0x400
        CPU: 1 PID: 41 Comm: kworker/u4:1 Not tainted 4.18.0-rc1+ #8
        Workqueue: btrfs-endio-write btrfs_endio_write_helper
        RIP: 0010:btrfs_tree_lock+0x3e2/0x400
        Call Trace:
         btrfs_alloc_tree_block+0x39f/0x770
         __btrfs_cow_block+0x285/0x9e0
         btrfs_cow_block+0x191/0x2e0
         btrfs_search_slot+0x492/0x1160
         btrfs_lookup_csum+0xec/0x280
         btrfs_csum_file_blocks+0x2be/0xa60
         add_pending_csums+0xaf/0xf0
         btrfs_finish_ordered_io+0x74b/0xc90
         finish_ordered_fn+0x15/0x20
         normal_work_helper+0xf6/0x500
         btrfs_endio_write_helper+0x12/0x20
         process_one_work+0x302/0x770
         worker_thread+0x81/0x6d0
         kthread+0x180/0x1d0
         ret_from_fork+0x35/0x40
      
      [CAUSE]
      That crafted image has missing backref for csum tree root leaf.  And
      when we try to allocate new tree block, since there is no
      EXTENT/METADATA_ITEM for csum tree root, btrfs consider it's free slot
      and use it.
      
      The extent tree of the image looks like:
      
        Normal image                      |       This fuzzed image
        ----------------------------------+--------------------------------
        BG 29360128                       | BG 29360128
         One empty slot                   |  One empty slot
        29364224: backref to UUID tree    | 29364224: backref to UUID tree
         Two empty slots                  |  Two empty slots
        29376512: backref to CSUM tree    |  One empty slot (bad type) <<<
        29380608: backref to D_RELOC tree | 29380608: backref to D_RELOC tree
        ...                               | ...
      
      Since bytenr 29376512 has no METADATA/EXTENT_ITEM, when btrfs try to
      alloc tree block, it's an valid slot for btrfs.
      
      And for finish_ordered_write, when we need to insert csum, we try to CoW
      csum tree root.
      
      By accident, empty slots at bytenr BG_OFFSET, BG_OFFSET + 8K,
      BG_OFFSET + 12K is already used by tree block COW for other trees, the
      next empty slot is BG_OFFSET + 16K, which should be the backref for CSUM
      tree.
      
      But due to the bad type, btrfs can recognize it and still consider it as
      an empty slot, and will try to use it for csum tree CoW.
      
      Then in the following call trace, we will try to lock the new tree
      block, which turns out to be the old csum tree root which is already
      locked:
      
      btrfs_search_slot() called on csum tree root, which is at 29376512
      |- btrfs_cow_block()
         |- btrfs_set_lock_block()
         |  |- Now locks tree block 29376512 (old csum tree root)
         |- __btrfs_cow_block()
            |- btrfs_alloc_tree_block()
               |- btrfs_reserve_extent()
                  | Now it returns tree block 29376512, which extent tree
                  | shows its empty slot, but it's already hold by csum tree
                  |- btrfs_init_new_buffer()
                     |- btrfs_tree_lock()
                        | Triggers WARN_ON(eb->lock_owner == current->pid)
                        |- wait_event()
                           Wait lock owner to release the lock, but it's
                           locked by ourself, so it will deadlock
      
      [FIX]
      This patch will do the lock_owner and current->pid check at
      btrfs_init_new_buffer().
      So above deadlock can be avoided.
      
      Since such problem can only happen in crafted image, we will still
      trigger kernel warning for later aborted transaction, but with a little
      more meaningful warning message.
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=200405Reported-by: default avatarXu Wen <wen.xu@gatech.edu>
      CC: stable@vger.kernel.org # 4.4+
      Signed-off-by: default avatarQu Wenruo <wqu@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d36ac4a3
    • Qu Wenruo's avatar
      btrfs: Handle owner mismatch gracefully when walking up tree · 61e3643d
      Qu Wenruo authored
      commit 65c6e82b upstream.
      
      [BUG]
      When mounting certain crafted image, btrfs will trigger kernel BUG_ON()
      when trying to recover balance:
      
        kernel BUG at fs/btrfs/extent-tree.c:8956!
        invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
        CPU: 1 PID: 662 Comm: mount Not tainted 4.18.0-rc1-custom+ #10
        RIP: 0010:walk_up_proc+0x336/0x480 [btrfs]
        RSP: 0018:ffffb53540c9b890 EFLAGS: 00010202
        Call Trace:
         walk_up_tree+0x172/0x1f0 [btrfs]
         btrfs_drop_snapshot+0x3a4/0x830 [btrfs]
         merge_reloc_roots+0xe1/0x1d0 [btrfs]
         btrfs_recover_relocation+0x3ea/0x420 [btrfs]
         open_ctree+0x1af3/0x1dd0 [btrfs]
         btrfs_mount_root+0x66b/0x740 [btrfs]
         mount_fs+0x3b/0x16a
         vfs_kern_mount.part.9+0x54/0x140
         btrfs_mount+0x16d/0x890 [btrfs]
         mount_fs+0x3b/0x16a
         vfs_kern_mount.part.9+0x54/0x140
         do_mount+0x1fd/0xda0
         ksys_mount+0xba/0xd0
         __x64_sys_mount+0x21/0x30
         do_syscall_64+0x60/0x210
         entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      [CAUSE]
      Extent tree corruption.  In this particular case, reloc tree root's
      owner is DATA_RELOC_TREE (should be TREE_RELOC), thus its backref is
      corrupted and we failed the owner check in walk_up_tree().
      
      [FIX]
      It's pretty hard to take care of every extent tree corruption, but at
      least we can remove such BUG_ON() and exit more gracefully.
      
      And since in this particular image, DATA_RELOC_TREE and TREE_RELOC share
      the same root (which is obviously invalid), we needs to make
      __del_reloc_root() more robust to detect such invalid sharing to avoid
      possible NULL dereference as root->node can be NULL in this case.
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=200411Reported-by: default avatarXu Wen <wen.xu@gatech.edu>
      CC: stable@vger.kernel.org # 4.4+
      Signed-off-by: default avatarQu Wenruo <wqu@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      61e3643d
    • Qu Wenruo's avatar
      btrfs: qgroup: Avoid calling qgroup functions if qgroup is not enabled · 2f7e525a
      Qu Wenruo authored
      commit 3628b4ca upstream.
      
      Some qgroup trace events like btrfs_qgroup_release_data() and
      btrfs_qgroup_free_delayed_ref() can still be triggered even if qgroup is
      not enabled.
      
      This is caused by the lack of qgroup status check before calling some
      qgroup functions.  Thankfully the functions can handle quota disabled
      case well and just do nothing for qgroup disabled case.
      
      This patch will do earlier check before triggering related trace events.
      
      And for enabled <-> disabled race case:
      
      1) For enabled->disabled case
         Disable will wipe out all qgroups data including reservation and
         excl/rfer. Even if we leak some reservation or numbers, it will
         still be cleared, so nothing will go wrong.
      
      2) For disabled -> enabled case
         Current btrfs_qgroup_release_data() will use extent_io tree to ensure
         we won't underflow reservation. And for delayed_ref we use
         head->qgroup_reserved to record the reserved space, so in that case
         head->qgroup_reserved should be 0 and we won't underflow.
      
      CC: stable@vger.kernel.org # 4.14+
      Reported-by: default avatarChris Murphy <lists@colorremedies.com>
      Link: https://lore.kernel.org/linux-btrfs/CAJCQCtQau7DtuUUeycCkZ36qjbKuxNzsgqJ7+sJ6W0dK_NLE3w@mail.gmail.com/Signed-off-by: default avatarQu Wenruo <wqu@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2f7e525a
    • Breno Leitao's avatar
      selftests/powerpc: Fix ptrace tm failure · e09399e8
      Breno Leitao authored
      commit 48dc0ef1 upstream.
      
      Test ptrace-tm-spd-gpr fails on current kernel (4.19) due to a segmentation
      fault that happens on the child process prior to setting cptr[2] = 1. This
      causes the parent process to wait forever at 'while (!pptr[2])' and the test to
      be killed by the test harness framework by timeout, thus, failing.
      
      The segmentation fault happens because of a inline assembly being
      generated as:
      
      	0x10000355c <tm_spd_gpr+492>    lfs    f0, 0(0)
      
      This is reading memory position 0x0 and causing the segmentation fault.
      
      This code is being generated by ASM_LOAD_FPR_SINGLE_PRECISION(flt_4), where
      flt_4 is passed to the inline assembly block as:
      
      	[flt_4] "r" (&d)
      
      Since the inline assembly 'r' constraint means any GPR, gpr0 is being
      chosen, thus causing this issue when issuing a Load Floating-Point Single
      instruction.
      
      This patch simply changes the constraint to 'b', which specify that this
      register will be used as base, and r0 is not allowed to be used, avoiding
      this issue.
      
      Other than that, removing flt_2 register from the input operands, since it
      is not used by the inline assembly code at all.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarBreno Leitao <leitao@debian.org>
      Acked-by: default avatarSegher Boessenkool <segher@kernel.crashing.org>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e09399e8
    • Johan Hovold's avatar
      soc/tegra: pmc: Fix child-node lookup · 59bc2d3a
      Johan Hovold authored
      commit 1dc6bd5e upstream.
      
      Fix child-node lookup during probe, which ended up searching the whole
      device tree depth-first starting at the parent rather than just matching
      on its children.
      
      To make things worse, the parent pmc node could end up being prematurely
      freed as of_find_node_by_name() drops a reference to its first argument.
      
      Fixes: 3568df3d ("soc: tegra: Add thermal reset (thermtrip) support to PMC")
      Cc: stable <stable@vger.kernel.org>     # 4.0
      Cc: Mikko Perttunen <mperttunen@nvidia.com>
      Signed-off-by: default avatarJohan Hovold <johan@kernel.org>
      Reviewed-by: default avatarMikko Perttunen <mperttunen@nvidia.com>
      Signed-off-by: default avatarThierry Reding <treding@nvidia.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      59bc2d3a
    • Thor Thayer's avatar
      arm64: dts: stratix10: Correct System Manager register size · 1258e45d
      Thor Thayer authored
      commit 74121b9a upstream.
      
      Correct the register size of the System Manager node.
      
      Cc: stable@vger.kernel.org
      Fixes: 78cd6a9d ("arm64: dts: Add base stratix 10 dtsi")
      Signed-off-by: default avatarThor Thayer <thor.thayer@linux.intel.com>
      Signed-off-by: default avatarDinh Nguyen <dinguyen@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1258e45d
    • Thor Thayer's avatar
      ARM: dts: socfpga: Fix SDRAM node address for Arria10 · 540631d9
      Thor Thayer authored
      commit ce3bf934 upstream.
      
      The address in the SDRAM node was incorrect. Fix this to agree with the
      correct address and to match the reg definition block.
      
      Cc: stable@vger.kernel.org
      Fixes: 54b4a8f5("arm: socfpga: dts: Add Arria10 SDRAM EDAC DTS support")
      Signed-off-by: default avatarThor Thayer <thor.thayer@linux.intel.com>
      Signed-off-by: default avatarDinh Nguyen <dinguyen@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      540631d9
    • Nicolas Pitre's avatar
      Cramfs: fix abad comparison when wrap-arounds occur · fdd780d0
      Nicolas Pitre authored
      commit 672ca9dd upstream.
      
      It is possible for corrupted filesystem images to produce very large
      block offsets that may wrap when a length is added, and wrongly pass
      the buffer size test.
      Reported-by: default avatarAnatoly Trosinenko <anatoly.trosinenko@gmail.com>
      Signed-off-by: default avatarNicolas Pitre <nico@linaro.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fdd780d0
    • Colin Ian King's avatar
      rpmsg: smd: fix memory leak on channel create · 7f46d951
      Colin Ian King authored
      commit 940c620d upstream.
      
      Currently a failed allocation of channel->name leads to an
      immediate return without freeing channel. Fix this by setting
      ret to -ENOMEM and jumping to an exit path that kfree's channel.
      
      Detected by CoverityScan, CID#1473692 ("Resource Leak")
      
      Fixes: 53e2822e ("rpmsg: Introduce Qualcomm SMD backend")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarColin Ian King <colin.king@canonical.com>
      Signed-off-by: default avatarBjorn Andersson <bjorn.andersson@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7f46d951
    • Tri Vo's avatar
      arm64: lse: remove -fcall-used-x0 flag · ad2c712b
      Tri Vo authored
      commit 2a6c7c36 upstream.
      
      x0 is not callee-saved in the PCS. So there is no need to specify
      -fcall-used-x0.
      
      Clang doesn't currently support -fcall-used flags. This patch will help
      building the kernel with clang.
      Tested-by: default avatarNick Desaulniers <ndesaulniers@google.com>
      Acked-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarTri Vo <trong@android.com>
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ad2c712b
    • Hans Verkuil's avatar
      media: media colorspaces*.rst: rename AdobeRGB to opRGB · baf1746d
      Hans Verkuil authored
      commit a58c3797 upstream.
      
      Drop all Adobe references and use the official opRGB standard
      instead.
      Signed-off-by: default avatarHans Verkuil <hans.verkuil@cisco.com>
      Cc: stable@vger.kernel.org
      Acked-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      baf1746d
    • Mauro Carvalho Chehab's avatar
      media: em28xx: make v4l2-compliance happier by starting sequence on zero · 453a8a45
      Mauro Carvalho Chehab authored
      commit afeaade9 upstream.
      
      The v4l2-compliance tool complains if a video doesn't start
      with a zero sequence number.
      
      While this shouldn't cause any real problem for apps, let's
      make it happier, in order to better check the v4l2-compliance
      differences before and after patchsets.
      
      This is actually an old issue. It is there since at least its
      videobuf2 conversion, e. g. changeset 3829fadc461 ("[media]
      em28xx: convert to videobuf2"), if VB1 wouldn't suffer from
      the same issue.
      
      Cc: stable@vger.kernel.org
      Fixes: d3829fad ("[media] em28xx: convert to videobuf2")
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      453a8a45
    • Mauro Carvalho Chehab's avatar
      media: em28xx: fix input name for Terratec AV 350 · a4e311ed
      Mauro Carvalho Chehab authored
      commit 15644bfa upstream.
      
      Instead of using a register value, use an AMUX name, as otherwise
      VIDIOC_G_AUDIO would fail.
      
      Cc: stable@vger.kernel.org
      Fixes: 766ed64d ("V4L/DVB (11827): Add support for Terratec Grabster AV350")
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a4e311ed
    • Mauro Carvalho Chehab's avatar
      media: tvp5150: avoid going past array on v4l2_querymenu() · 3dfd975b
      Mauro Carvalho Chehab authored
      commit 5c4c4505 upstream.
      
      The parameters of v4l2_ctrl_new_std_menu_items() are tricky: instead of
      the number of possible values, it requires the number of the maximum
      value. In other words, the ARRAY_SIZE() value should be decremented,
      otherwise it will go past the array bounds, as warned by KASAN:
      
      [  279.839688] BUG: KASAN: global-out-of-bounds in v4l2_querymenu+0x10d/0x180 [videodev]
      [  279.839709] Read of size 8 at addr ffffffffc10a4cb0 by task v4l2-compliance/16676
      
      [  279.839736] CPU: 1 PID: 16676 Comm: v4l2-compliance Not tainted 4.18.0-rc2+ #120
      [  279.839741] Hardware name:  /NUC5i7RYB, BIOS RYBDWi35.86A.0364.2017.0511.0949 05/11/2017
      [  279.839743] Call Trace:
      [  279.839758]  dump_stack+0x71/0xab
      [  279.839807]  ? v4l2_querymenu+0x10d/0x180 [videodev]
      [  279.839817]  print_address_description+0x1c9/0x270
      [  279.839863]  ? v4l2_querymenu+0x10d/0x180 [videodev]
      [  279.839871]  kasan_report+0x237/0x360
      [  279.839918]  v4l2_querymenu+0x10d/0x180 [videodev]
      [  279.839964]  __video_do_ioctl+0x2c8/0x590 [videodev]
      [  279.840011]  ? copy_overflow+0x20/0x20 [videodev]
      [  279.840020]  ? avc_ss_reset+0xa0/0xa0
      [  279.840028]  ? check_stack_object+0x21/0x60
      [  279.840036]  ? __check_object_size+0xe7/0x240
      [  279.840080]  video_usercopy+0xed/0x730 [videodev]
      [  279.840123]  ? copy_overflow+0x20/0x20 [videodev]
      [  279.840167]  ? v4l_enumstd+0x40/0x40 [videodev]
      [  279.840177]  ? __handle_mm_fault+0x9f9/0x1ba0
      [  279.840186]  ? __pmd_alloc+0x2c0/0x2c0
      [  279.840193]  ? __vfs_write+0xb6/0x350
      [  279.840200]  ? kernel_read+0xa0/0xa0
      [  279.840244]  ? video_usercopy+0x730/0x730 [videodev]
      [  279.840284]  v4l2_ioctl+0xa1/0xb0 [videodev]
      [  279.840295]  do_vfs_ioctl+0x117/0x8a0
      [  279.840303]  ? selinux_file_ioctl+0x211/0x2f0
      [  279.840313]  ? ioctl_preallocate+0x120/0x120
      [  279.840319]  ? selinux_capable+0x20/0x20
      [  279.840332]  ksys_ioctl+0x70/0x80
      [  279.840342]  __x64_sys_ioctl+0x3d/0x50
      [  279.840351]  do_syscall_64+0x6d/0x1c0
      [  279.840361]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [  279.840367] RIP: 0033:0x7fdfb46275d7
      [  279.840369] Code: b3 66 90 48 8b 05 b1 48 2d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 81 48 2d 00 f7 d8 64 89 01 48
      [  279.840474] RSP: 002b:00007ffee1179038 EFLAGS: 00000202 ORIG_RAX: 0000000000000010
      [  279.840483] RAX: ffffffffffffffda RBX: 00007ffee1179180 RCX: 00007fdfb46275d7
      [  279.840488] RDX: 00007ffee11790c0 RSI: 00000000c02c5625 RDI: 0000000000000003
      [  279.840493] RBP: 0000000000000002 R08: 0000000000000020 R09: 00000000009f0902
      [  279.840497] R10: 0000000000000000 R11: 0000000000000202 R12: 00007ffee117a5a0
      [  279.840501] R13: 00007ffee11790c0 R14: 0000000000000002 R15: 0000000000000000
      
      [  279.840515] The buggy address belongs to the variable:
      [  279.840535]  tvp5150_test_patterns+0x10/0xffffffffffffe360 [tvp5150]
      
      Fixes: c43875f6 ("[media] tvp5150: replace MEDIA_ENT_F_CONN_TEST by a control")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3dfd975b
    • Mauro Carvalho Chehab's avatar
      media: em28xx: use a default format if TRY_FMT fails · 1f1eb844
      Mauro Carvalho Chehab authored
      commit f823ce2a upstream.
      
      Follow the V4L2 spec, as warned by v4l2-compliance:
      
      	warn: v4l2-test-formats.cpp(732): TRY_FMT cannot handle an invalid pixelformat.
      	warn: v4l2-test-formats.cpp(733): This may or may not be a problem. For more information see:
      
      warn: v4l2-test-formats.cpp(734): http://www.mail-archive.com/linux-media@vger.kernel.org/msg56550.html
      
      Cc: stable@vger.kernel.org
      Fixes: bddcf633 ("V4L/DVB (9927): em28xx: use a more standard way to specify video formats")
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1f1eb844
    • Manjunath Patil's avatar
      xen-blkfront: fix kernel panic with negotiate_mq error path · 4e6d30de
      Manjunath Patil authored
      commit 6cc4a086 upstream.
      
      info->nr_rings isn't adjusted in case of ENOMEM error from
      negotiate_mq(). This leads to kernel panic in error path.
      
      Typical call stack involving panic -
       #8 page_fault at ffffffff8175936f
          [exception RIP: blkif_free_ring+33]
          RIP: ffffffffa0149491  RSP: ffff8804f7673c08  RFLAGS: 00010292
       ...
       #9 blkif_free at ffffffffa0149aaa [xen_blkfront]
       #10 talk_to_blkback at ffffffffa014c8cd [xen_blkfront]
       #11 blkback_changed at ffffffffa014ea8b [xen_blkfront]
       #12 xenbus_otherend_changed at ffffffff81424670
       #13 backend_changed at ffffffff81426dc3
       #14 xenwatch_thread at ffffffff81422f29
       #15 kthread at ffffffff810abe6a
       #16 ret_from_fork at ffffffff81754078
      
      Cc: stable@vger.kernel.org
      Fixes: 7ed8ce1c ("xen-blkfront: move negotiate_mq to cover all cases of new VBDs")
      Signed-off-by: default avatarManjunath Patil <manjunath.b.patil@oracle.com>
      Acked-by: default avatarRoger Pau Monné <roger.pau@citrix.com>
      Signed-off-by: default avatarJuergen Gross <jgross@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4e6d30de
    • Juergen Gross's avatar
      xen: fix xen_qlock_wait() · 034680fc
      Juergen Gross authored
      commit d3132b38 upstream.
      
      Commit a8565319 ("xen: make xen_qlock_wait() nestable")
      introduced a regression for Xen guests running fully virtualized
      (HVM or PVH mode). The Xen hypervisor wouldn't return from the poll
      hypercall with interrupts disabled in case of an interrupt (for PV
      guests it does).
      
      So instead of disabling interrupts in xen_qlock_wait() use a nesting
      counter to avoid calling xen_clear_irq_pending() in case
      xen_qlock_wait() is nested.
      
      Fixes: a8565319 ("xen: make xen_qlock_wait() nestable")
      Cc: stable@vger.kernel.org
      Reported-by: default avatarSander Eikelenboom <linux@eikelenboom.it>
      Signed-off-by: default avatarJuergen Gross <jgross@suse.com>
      Reviewed-by: default avatarBoris Ostrovsky <boris.ostrovsky@oracle.com>
      Tested-by: default avatarSander Eikelenboom <linux@eikelenboom.it>
      Signed-off-by: default avatarJuergen Gross <jgross@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      034680fc
    • He Zhe's avatar
      kgdboc: Passing ekgdboc to command line causes panic · 8305d98c
      He Zhe authored
      commit 1bd54d85 upstream.
      
      kgdboc_option_setup does not check input argument before passing it
      to strlen. The argument would be a NULL pointer if "ekgdboc", without
      its value, is set in command line and thus cause the following panic.
      
      PANIC: early exception 0xe3 IP 10:ffffffff8fbbb620 error 0 cr2 0x0
      [    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.18-rc8+ #1
      [    0.000000] RIP: 0010:strlen+0x0/0x20
      ...
      [    0.000000] Call Trace
      [    0.000000]  ? kgdboc_option_setup+0x9/0xa0
      [    0.000000]  ? kgdboc_early_init+0x6/0x1b
      [    0.000000]  ? do_early_param+0x4d/0x82
      [    0.000000]  ? parse_args+0x212/0x330
      [    0.000000]  ? rdinit_setup+0x26/0x26
      [    0.000000]  ? parse_early_options+0x20/0x23
      [    0.000000]  ? rdinit_setup+0x26/0x26
      [    0.000000]  ? parse_early_param+0x2d/0x39
      [    0.000000]  ? setup_arch+0x2f7/0xbf4
      [    0.000000]  ? start_kernel+0x5e/0x4c2
      [    0.000000]  ? load_ucode_bsp+0x113/0x12f
      [    0.000000]  ? secondary_startup_64+0xa5/0xb0
      
      This patch adds a check to prevent the panic.
      
      Cc: stable@vger.kernel.org
      Cc: jason.wessel@windriver.com
      Cc: gregkh@linuxfoundation.org
      Cc: jslaby@suse.com
      Signed-off-by: default avatarHe Zhe <zhe.he@windriver.com>
      Reviewed-by: default avatarDaniel Thompson <daniel.thompson@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8305d98c
    • Hans Verkuil's avatar
      media: v4l2-tpg: fix kernel oops when enabling HFLIP and OSD · fdff1cf0
      Hans Verkuil authored
      commit 250854ee upstream.
      
      When the OSD is on (i.e. vivid displays text on top of the test pattern), and
      you enable hflip, then the driver crashes.
      
      The cause turned out to be a division of a negative number by an unsigned value.
      You expect that -8 / 2U would be -4, but in reality it is 2147483644 :-(
      
      Fixes: 3e14e7a8 ("vivid-tpg: add hor/vert downsampling support to tpg_gen_text")
      Signed-off-by: default avatarHans Verkuil <hans.verkuil@cisco.com>
      Reported-by: default avatarMauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Cc: <stable@vger.kernel.org>      # for v4.1 and up
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fdff1cf0
    • Maciej W. Rozycki's avatar
      TC: Set DMA masks for devices · e7f6b41a
      Maciej W. Rozycki authored
      commit 3f2aa244 upstream.
      
      Fix a TURBOchannel support regression with commit 205e1b7f
      ("dma-mapping: warn when there is no coherent_dma_mask") that caused
      coherent DMA allocations to produce a warning such as:
      
      defxx: v1.11 2014/07/01  Lawrence V. Stefani and others
      tc1: DEFTA at MMIO addr = 0x1e900000, IRQ = 20, Hardware addr = 08-00-2b-a3-a3-29
      ------------[ cut here ]------------
      WARNING: CPU: 0 PID: 1 at ./include/linux/dma-mapping.h:516 dfx_dev_register+0x670/0x678
      Modules linked in:
      CPU: 0 PID: 1 Comm: swapper Not tainted 4.19.0-rc6 #2
      Stack : ffffffff8009ffc0 fffffffffffffec0 0000000000000000 ffffffff80647650
              0000000000000000 0000000000000000 ffffffff806f5f80 ffffffffffffffff
              0000000000000000 0000000000000000 0000000000000001 ffffffff8065d4e8
              98000000031b6300 ffffffff80563478 ffffffff805685b0 ffffffffffffffff
              0000000000000000 ffffffff805d6720 0000000000000204 ffffffff80388df8
              0000000000000000 0000000000000009 ffffffff8053efd0 ffffffff806657d0
              0000000000000000 ffffffff803177f8 0000000000000000 ffffffff806d0000
              9800000003078000 980000000307b9e0 000000001e900000 ffffffff80067940
              0000000000000000 ffffffff805d6720 0000000000000204 ffffffff80388df8
              ffffffff805176c0 ffffffff8004dc78 0000000000000000 ffffffff80067940
              ...
      Call Trace:
      [<ffffffff8004dc78>] show_stack+0xa0/0x130
      [<ffffffff80067940>] __warn+0x128/0x170
      ---[ end trace b1d1e094f67f3bb2 ]---
      
      This is because the TURBOchannel bus driver fails to set the coherent
      DMA mask for devices enumerated.
      
      Set the regular and coherent DMA masks for TURBOchannel devices then,
      observing that the bus protocol supports a 34-bit (16GiB) DMA address
      space, by interpreting the value presented in the address cycle across
      the 32 `ad' lines as a 32-bit word rather than byte address[1].  The
      architectural size of the TURBOchannel DMA address space exceeds the
      maximum amount of RAM any actual TURBOchannel system in existence may
      have, hence both masks are the same.
      
      This removes the warning shown above.
      
      References:
      
      [1] "TURBOchannel Hardware Specification", EK-369AA-OD-007B, Digital
          Equipment Corporation, January 1993, Section "DMA", pp. 1-15 -- 1-17
      Signed-off-by: default avatarMaciej W. Rozycki <macro@linux-mips.org>
      Signed-off-by: default avatarPaul Burton <paul.burton@mips.com>
      Patchwork: https://patchwork.linux-mips.org/patch/20835/
      Fixes: 205e1b7f ("dma-mapping: warn when there is no coherent_dma_mask")
      Cc: stable@vger.kernel.org # 4.16+
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e7f6b41a
    • Will Deacon's avatar
      iommu/arm-smmu: Ensure that page-table updates are visible before TLBI · 34e0ab87
      Will Deacon authored
      commit 7d321bd3 upstream.
      
      The IO-pgtable code relies on the driver TLB invalidation callbacks to
      ensure that all page-table updates are visible to the IOMMU page-table
      walker.
      
      In the case that the page-table walker is cache-coherent, we cannot rely
      on an implicit DSB from the DMA-mapping code, so we must ensure that we
      execute a DSB in our tlb_add_flush() callback prior to triggering the
      invalidation.
      
      Cc: <stable@vger.kernel.org>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Fixes: 2df7a25c ("iommu/arm-smmu: Clean up DMA API usage")
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      34e0ab87
    • Aaro Koskinen's avatar
      MIPS: OCTEON: fix out of bounds array access on CN68XX · d70d2e3c
      Aaro Koskinen authored
      commit c0fae7e2 upstream.
      
      The maximum number of interfaces is returned by
      cvmx_helper_get_number_of_interfaces(), and the value is used to access
      interface_port_count[]. When CN68XX support was added, we forgot
      to increase the array size. Fix that.
      
      Fixes: 2c8c3f02 ("MIPS: Octeon: Support additional interfaces on CN68XX")
      Signed-off-by: default avatarAaro Koskinen <aaro.koskinen@iki.fi>
      Signed-off-by: default avatarPaul Burton <paul.burton@mips.com>
      Patchwork: https://patchwork.linux-mips.org/patch/20949/
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: linux-kernel@vger.kernel.org
      Cc: stable@vger.kernel.org # v4.3+
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d70d2e3c
    • Christophe Leroy's avatar
      powerpc/msi: Fix compile error on mpc83xx · dfed53c8
      Christophe Leroy authored
      commit 0f99153d upstream.
      
      mpic_get_primary_version() is not defined when not using MPIC.
      The compile error log like:
      
      arch/powerpc/sysdev/built-in.o: In function `fsl_of_msi_probe':
      fsl_msi.c:(.text+0x150c): undefined reference to `fsl_mpic_primary_get_version'
      Signed-off-by: default avatarJia Hongtao <hongtao.jia@freescale.com>
      Signed-off-by: default avatarScott Wood <scottwood@freescale.com>
      Reported-by: default avatarRadu Rendec <radu.rendec@gmail.com>
      Fixes: 807d38b7 ("powerpc/mpic: Add get_version API both for internal and external use")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarChristophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      dfed53c8
    • Damien Le Moal's avatar
      dm zoned: fix various dmz_get_mblock() issues · 37531246
      Damien Le Moal authored
      commit 3d4e7383 upstream.
      
      dmz_fetch_mblock() called from dmz_get_mblock() has a race since the
      allocation of the new metadata block descriptor and its insertion in
      the cache rbtree with the READING state is not atomic. Two different
      contexts requesting the same block may end up each adding two different
      descriptors of the same block to the cache.
      
      Another problem for this function is that the BIO for processing the
      block read is allocated after the metadata block descriptor is inserted
      in the cache rbtree. If the BIO allocation fails, the metadata block
      descriptor is freed without first being removed from the rbtree.
      
      Fix the first problem by checking again if the requested block is not in
      the cache right before inserting the newly allocated descriptor,
      atomically under the mblk_lock spinlock. The second problem is fixed by
      simply allocating the BIO before inserting the new block in the cache.
      
      Finally, since dmz_fetch_mblock() also increments a block reference
      counter, rename the function to dmz_get_mblock_slow(). To be symmetric
      and clear, also rename dmz_lookup_mblock() to dmz_get_mblock_fast() and
      increment the block reference counter directly in that function rather
      than in dmz_get_mblock().
      
      Fixes: 3b1a94c8 ("dm zoned: drive-managed zoned block device target")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarDamien Le Moal <damien.lemoal@wdc.com>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      37531246