1. 23 Nov, 2016 2 commits
    • Filipe Manana's avatar
      Btrfs: fix emptiness check for dirtied extent buffers at check_leaf() · f177d739
      Filipe Manana authored
      We can not simply use the owner field from an extent buffer's header to
      get the id of the respective tree when the extent buffer is from a
      relocation tree. When we create the root for a relocation tree we leave
      (on purpose) the owner field with the same value as the subvolume's tree
      root (we do this at ctree.c:btrfs_copy_root()). So we must ignore extent
      buffers from relocation trees, which have the BTRFS_HEADER_FLAG_RELOC
      flag set, because otherwise we will always consider the extent buffer
      as not being the root of the tree (the root of original subvolume tree
      is always different from the root of the respective relocation tree).
      
      This lead to assertion failures when running with the integrity checker
      enabled (CONFIG_BTRFS_FS_CHECK_INTEGRITY=y) such as the following:
      
      [  643.393409] BTRFS critical (device sdg): corrupt leaf, non-root leaf's nritems is 0: block=38506496, root=260, slot=0
      [  643.397609] BTRFS info (device sdg): leaf 38506496 total ptrs 0 free space 3995
      [  643.407075] assertion failed: 0, file: fs/btrfs/disk-io.c, line: 4078
      [  643.408425] ------------[ cut here ]------------
      [  643.409112] kernel BUG at fs/btrfs/ctree.h:3419!
      [  643.409773] invalid opcode: 0000 [#1] PREEMPT SMP
      [  643.410447] Modules linked in: dm_flakey dm_mod crc32c_generic btrfs xor raid6_pq ppdev psmouse acpi_cpufreq parport_pc evdev parport tpm_tis tpm_tis_core pcspkr serio_raw i2c_piix4 sg tpm i2c_core button processor loop autofs4 ext4 crc16 jbd2 mbcache sr_mod cdrom sd_mod ata_generic virtio_scsi ata_piix libata virtio_pci virtio_ring scsi_mod virtio e1000 floppy
      [  643.414356] CPU: 11 PID: 32726 Comm: btrfs Not tainted 4.8.0-rc8-btrfs-next-35+ #1
      [  643.414356] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014
      [  643.414356] task: ffff880145e95b00 task.stack: ffff88014826c000
      [  643.414356] RIP: 0010:[<ffffffffa0352759>]  [<ffffffffa0352759>] assfail.constprop.41+0x1c/0x1e [btrfs]
      [  643.414356] RSP: 0018:ffff88014826fa28  EFLAGS: 00010292
      [  643.414356] RAX: 0000000000000039 RBX: ffff88014e2d7c38 RCX: 0000000000000001
      [  643.414356] RDX: ffff88023f4d2f58 RSI: ffffffff81806c63 RDI: 00000000ffffffff
      [  643.414356] RBP: ffff88014826fa28 R08: 0000000000000001 R09: 0000000000000000
      [  643.414356] R10: ffff88014826f918 R11: ffffffff82f3c5ed R12: ffff880172910000
      [  643.414356] R13: ffff880233992230 R14: ffff8801a68a3310 R15: fffffffffffffff8
      [  643.414356] FS:  00007f9ca305e8c0(0000) GS:ffff88023f4c0000(0000) knlGS:0000000000000000
      [  643.414356] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  643.414356] CR2: 00007f9ca3071000 CR3: 000000015d01b000 CR4: 00000000000006e0
      [  643.414356] Stack:
      [  643.414356]  ffff88014826fa50 ffffffffa02d655a 000000000000000a ffff88014e2d7c38
      [  643.414356]  0000000000000000 ffff88014826faa8 ffffffffa02b72f3 ffff88014826fab8
      [  643.414356]  00ffffffa03228e4 0000000000000000 0000000000000000 ffff8801bbd4e000
      [  643.414356] Call Trace:
      [  643.414356]  [<ffffffffa02d655a>] btrfs_mark_buffer_dirty+0xdf/0xe5 [btrfs]
      [  643.414356]  [<ffffffffa02b72f3>] btrfs_copy_root+0x18a/0x1d1 [btrfs]
      [  643.414356]  [<ffffffffa0322921>] create_reloc_root+0x72/0x1ba [btrfs]
      [  643.414356]  [<ffffffffa03267c2>] btrfs_init_reloc_root+0x7b/0xa7 [btrfs]
      [  643.414356]  [<ffffffffa02d9e44>] record_root_in_trans+0xdf/0xed [btrfs]
      [  643.414356]  [<ffffffffa02db04e>] btrfs_record_root_in_trans+0x50/0x6a [btrfs]
      [  643.414356]  [<ffffffffa030ad2b>] create_subvol+0x472/0x773 [btrfs]
      [  643.414356]  [<ffffffffa030b406>] btrfs_mksubvol+0x3da/0x463 [btrfs]
      [  643.414356]  [<ffffffffa030b406>] ? btrfs_mksubvol+0x3da/0x463 [btrfs]
      [  643.414356]  [<ffffffff810781ac>] ? preempt_count_add+0x65/0x68
      [  643.414356]  [<ffffffff811a6e97>] ? __mnt_want_write+0x62/0x77
      [  643.414356]  [<ffffffffa030b55d>] btrfs_ioctl_snap_create_transid+0xce/0x187 [btrfs]
      [  643.414356]  [<ffffffffa030b67d>] btrfs_ioctl_snap_create+0x67/0x81 [btrfs]
      [  643.414356]  [<ffffffffa030ecfd>] btrfs_ioctl+0x508/0x20dd [btrfs]
      [  643.414356]  [<ffffffff81293e39>] ? __this_cpu_preempt_check+0x13/0x15
      [  643.414356]  [<ffffffff81155eca>] ? handle_mm_fault+0x976/0x9ab
      [  643.414356]  [<ffffffff81091300>] ? arch_local_irq_save+0x9/0xc
      [  643.414356]  [<ffffffff8119a2b0>] vfs_ioctl+0x18/0x34
      [  643.414356]  [<ffffffff8119a8e8>] do_vfs_ioctl+0x581/0x600
      [  643.414356]  [<ffffffff814b9552>] ? entry_SYSCALL_64_fastpath+0x5/0xa8
      [  643.414356]  [<ffffffff81093fe9>] ? trace_hardirqs_on_caller+0x17b/0x197
      [  643.414356]  [<ffffffff8119a9be>] SyS_ioctl+0x57/0x79
      [  643.414356]  [<ffffffff814b9565>] entry_SYSCALL_64_fastpath+0x18/0xa8
      [  643.414356]  [<ffffffff81091b08>] ? trace_hardirqs_off_caller+0x3f/0xaa
      [  643.414356] Code: 89 83 88 00 00 00 31 c0 5b 41 5c 41 5d 5d c3 55 89 f1 48 c7 c2 98 bc 35 a0 48 89 fe 48 c7 c7 05 be 35 a0 48 89 e5 e8 13 46 dd e0 <0f> 0b 55 89 f1 48 c7 c2 9f d3 35 a0 48 89 fe 48 c7 c7 7a d5 35
      [  643.414356] RIP  [<ffffffffa0352759>] assfail.constprop.41+0x1c/0x1e [btrfs]
      [  643.414356]  RSP <ffff88014826fa28>
      [  643.468267] ---[ end trace 6a1b3fb1a9d7d6e3 ]---
      
      This can be easily reproduced by running xfstests with the integrity
      checker enabled.
      
      Fixes: 1ba98d08 (Btrfs: detect corruption when non-root leaf has zero item)
      Cc: stable@vger.kernel.org  # 4.8+
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Reviewed-by: default avatarLiu Bo <bo.li.liu@oracle.com>
      f177d739
    • Liu Bo's avatar
      Btrfs: fix BUG_ON in btrfs_mark_buffer_dirty · ef85b25e
      Liu Bo authored
      This can only happen with CONFIG_BTRFS_FS_CHECK_INTEGRITY=y.
      
      Commit 1ba98d08 ("Btrfs: detect corruption when non-root leaf has zero item")
      assumes that a leaf is its root when leaf->bytenr == btrfs_root_bytenr(root),
      however, we should not use btrfs_root_bytenr(root) since it's mainly got
      updated during committing transaction.  So the check can fail when doing
      COW on this leaf while it is a root.
      
      This changes to use "if (leaf == btrfs_root_node(root))" instead, just like
      how we check whether leaf is a root in __btrfs_cow_block().
      
      Fixes: 1ba98d08 (Btrfs: detect corruption when non-root leaf has zero item)
      Cc: stable@vger.kernel.org  # 4.8+
      Reported-by: default avatarJeff Mahoney <jeffm@suse.com>
      Signed-off-by: default avatarLiu Bo <bo.li.liu@oracle.com>
      Reviewed-by: default avatarFilipe Manana <fdmanana@suse.com>
      ef85b25e
  2. 19 Nov, 2016 3 commits
    • Filipe Manana's avatar
      Btrfs: remove rb_node field from the delayed ref node structure · 2a2a83de
      Filipe Manana authored
      After the last big change in the delayed references code that was needed
      for the last qgroups rework, the red black tree node field of struct
      btrfs_delayed_ref_node is no longer used, so just remove it, this helps
      us save some memory (since struct rb_node is 24 bytes on x86_64) for
      these structures.
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      2a2a83de
    • Filipe Manana's avatar
      Btrfs: remove unused code when creating and merging reloc trees · 001895b3
      Filipe Manana authored
      In commit 5bc7247a (Btrfs: fix broken nocow after balance) we started
      abusing the rtransid and otransid fields of root items from relocation
      trees to fix some issues with nodatacow mode. However later in commit
      ba8b0289 (Btrfs: do not reset last_snapshot after relocation) we
      dropped the code that made use of those fields but did not remove
      the code that sets those fields.
      
      So just remove them to avoid confusion.
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Reviewed-by: default avatarJosef Bacik <jbacik@fb.com>
      001895b3
    • Filipe Manana's avatar
      Btrfs: fix relocation incorrectly dropping data references · 054570a1
      Filipe Manana authored
      During relocation of a data block group we create a relocation tree
      for each fs/subvol tree by making a snapshot of each tree using
      btrfs_copy_root() and the tree's commit root, and then setting the last
      snapshot field for the fs/subvol tree's root to the value of the current
      transaction id minus 1. However this can lead to relocation later
      dropping references that it did not create if we have qgroups enabled,
      leaving the filesystem in an inconsistent state that keeps aborting
      transactions.
      
      Lets consider the following example to explain the problem, which requires
      qgroups to be enabled.
      
      We are relocating data block group Y, we have a subvolume with id 258 that
      has a root at level 1, that subvolume is used to store directory entries
      for snapshots and we are currently at transaction 3404.
      
      When committing transaction 3404, we have a pending snapshot and therefore
      we call btrfs_run_delayed_items() at transaction.c:create_pending_snapshot()
      in order to create its dentry at subvolume 258. This results in COWing
      leaf A from root 258 in order to add the dentry. Note that leaf A
      also contains file extent items referring to extents from some other
      block group X (we are currently relocating block group Y). Later on, still
      at create_pending_snapshot() we call qgroup_account_snapshot(), which
      switches the commit root for root 258 when it calls switch_commit_roots(),
      so now the COWed version of leaf A, lets call it leaf A', is accessible
      from the commit root of tree 258. At the end of qgroup_account_snapshot(),
      we call record_root_in_trans() with 258 as its argument, which results
      in btrfs_init_reloc_root() being called, which in turn calls
      relocation.c:create_reloc_root() in order to create a relocation tree
      associated to root 258, which results in assigning the value of 3403
      (which is the current transaction id minus 1 = 3404 - 1) to the
      last_snapshot field of root 258. When creating the relocation tree root
      at ctree.c:btrfs_copy_root() we add a shared reference for leaf A',
      corresponding to the relocation tree's root, when we call btrfs_inc_ref()
      against the COWed root (a copy of the commit root from tree 258), which
      is at level 1. So at this point leaf A' has 2 references, one normal
      reference corresponding to root 258 and one shared reference corresponding
      to the root of the relocation tree.
      
      Transaction 3404 finishes its commit and transaction 3405 is started by
      relocation when calling merge_reloc_root() for the relocation tree
      associated to root 258. In the meanwhile leaf A' is COWed again, in
      response to some filesystem operation, when we are still at transaction
      3405. However when we COW leaf A', at ctree.c:update_ref_for_cow(), we
      call btrfs_block_can_be_shared() in order to figure out if other trees
      refer to the leaf and if any such trees exists, add a full back reference
      to leaf A' - but btrfs_block_can_be_shared() incorrectly returns false
      because the following condition is false:
      
        btrfs_header_generation(buf) <= btrfs_root_last_snapshot(&root->root_item)
      
      which evaluates to 3404 <= 3403. So after leaf A' is COWed, it stays with
      only one reference, corresponding to the shared reference we created when
      we called btrfs_copy_root() to create the relocation tree's root and
      btrfs_inc_ref() ends up not being called for leaf A' nor we end up setting
      the flag BTRFS_BLOCK_FLAG_FULL_BACKREF in leaf A'. This results in not
      adding shared references for the extents from block group X that leaf A'
      refers to with its file extent items.
      
      Later, after merging the relocation root we do a call to to
      btrfs_drop_snapshot() in order to delete the relocation tree. This ends
      up calling do_walk_down() when path->slots[1] points to leaf A', which
      results in calling btrfs_lookup_extent_info() to get the number of
      references for leaf A', which is 1 at this time (only the shared reference
      exists) and this value is stored at wc->refs[0]. After this walk_up_proc()
      is called when wc->level is 0 and path->nodes[0] corresponds to leaf A'.
      Because the current level is 0 and wc->refs[0] is 1, it does call
      btrfs_dec_ref() against leaf A', which results in removing the single
      references that the extents from block group X have which are associated
      to root 258 - the expectation was to have each of these extents with 2
      references - one reference for root 258 and one shared reference related
      to the root of the relocation tree, and so we would drop only the shared
      reference (because leaf A' was supposed to have the flag
      BTRFS_BLOCK_FLAG_FULL_BACKREF set).
      
      This leaves the filesystem in an inconsistent state as we now have file
      extent items in a subvolume tree that point to extents from block group X
      without references in the extent tree. So later on when we try to decrement
      the references for these extents, for example due to a file unlink operation,
      truncate operation or overwriting ranges of a file, we fail because the
      expected references do not exist in the extent tree.
      
      This leads to warnings and transaction aborts like the following:
      
      [  588.965795] ------------[ cut here ]------------
      [  588.965815] WARNING: CPU: 2 PID: 2479 at fs/btrfs/extent-tree.c:1625 lookup_inline_extent_backref+0x432/0x5b0 [btrfs]
      [  588.965816] Modules linked in: af_packet iscsi_ibft iscsi_boot_sysfs xfs libcrc32c ppdev acpi_cpufreq button tpm_tis e1000 i2c_piix4 pcspkr parport_pc
      parport tpm qemu_fw_cfg joydev btrfs xor raid6_pq sr_mod cdrom ata_generic virtio_scsi ata_piix virtio_pci bochs_drm virtio_ring drm_kms_helper syscopyarea
      sysfillrect sysimgblt fb_sys_fops virtio ttm serio_raw drm floppy sg
      [  588.965831] CPU: 2 PID: 2479 Comm: kworker/u8:7 Not tainted 4.7.3-3-default-fdm+ #1
      [  588.965832] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014
      [  588.965844] Workqueue: btrfs-extent-refs btrfs_extent_refs_helper [btrfs]
      [  588.965845]  0000000000000000 ffff8802263bfa28 ffffffff813af542 0000000000000000
      [  588.965847]  0000000000000000 ffff8802263bfa68 ffffffff81081e8b 0000065900000000
      [  588.965848]  ffff8801db2af000 000000012bbe2000 0000000000000000 ffff880215703b48
      [  588.965849] Call Trace:
      [  588.965852]  [<ffffffff813af542>] dump_stack+0x63/0x81
      [  588.965854]  [<ffffffff81081e8b>] __warn+0xcb/0xf0
      [  588.965855]  [<ffffffff81081f7d>] warn_slowpath_null+0x1d/0x20
      [  588.965863]  [<ffffffffa0175042>] lookup_inline_extent_backref+0x432/0x5b0 [btrfs]
      [  588.965865]  [<ffffffff81143220>] ? trace_clock_local+0x10/0x30
      [  588.965867]  [<ffffffff8114c5df>] ? rb_reserve_next_event+0x6f/0x460
      [  588.965875]  [<ffffffffa0175215>] insert_inline_extent_backref+0x55/0xd0 [btrfs]
      [  588.965882]  [<ffffffffa017531f>] __btrfs_inc_extent_ref.isra.55+0x8f/0x240 [btrfs]
      [  588.965890]  [<ffffffffa017acea>] __btrfs_run_delayed_refs+0x74a/0x1260 [btrfs]
      [  588.965892]  [<ffffffff810cb046>] ? cpuacct_charge+0x86/0xa0
      [  588.965900]  [<ffffffffa017e74f>] btrfs_run_delayed_refs+0x9f/0x2c0 [btrfs]
      [  588.965908]  [<ffffffffa017ea04>] delayed_ref_async_start+0x94/0xb0 [btrfs]
      [  588.965918]  [<ffffffffa01c799a>] btrfs_scrubparity_helper+0xca/0x350 [btrfs]
      [  588.965928]  [<ffffffffa01c7c5e>] btrfs_extent_refs_helper+0xe/0x10 [btrfs]
      [  588.965930]  [<ffffffff8109b323>] process_one_work+0x1f3/0x4e0
      [  588.965931]  [<ffffffff8109b658>] worker_thread+0x48/0x4e0
      [  588.965932]  [<ffffffff8109b610>] ? process_one_work+0x4e0/0x4e0
      [  588.965934]  [<ffffffff810a1659>] kthread+0xc9/0xe0
      [  588.965936]  [<ffffffff816f2f1f>] ret_from_fork+0x1f/0x40
      [  588.965937]  [<ffffffff810a1590>] ? kthread_worker_fn+0x170/0x170
      [  588.965938] ---[ end trace 34e5232c933a1749 ]---
      [  588.966187] ------------[ cut here ]------------
      [  588.966196] WARNING: CPU: 2 PID: 2479 at fs/btrfs/extent-tree.c:2966 btrfs_run_delayed_refs+0x28c/0x2c0 [btrfs]
      [  588.966196] BTRFS: Transaction aborted (error -5)
      [  588.966197] Modules linked in: af_packet iscsi_ibft iscsi_boot_sysfs xfs libcrc32c ppdev acpi_cpufreq button tpm_tis e1000 i2c_piix4 pcspkr parport_pc
      parport tpm qemu_fw_cfg joydev btrfs xor raid6_pq sr_mod cdrom ata_generic virtio_scsi ata_piix virtio_pci bochs_drm virtio_ring drm_kms_helper syscopyarea
      sysfillrect sysimgblt fb_sys_fops virtio ttm serio_raw drm floppy sg
      [  588.966206] CPU: 2 PID: 2479 Comm: kworker/u8:7 Tainted: G        W       4.7.3-3-default-fdm+ #1
      [  588.966207] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014
      [  588.966217] Workqueue: btrfs-extent-refs btrfs_extent_refs_helper [btrfs]
      [  588.966217]  0000000000000000 ffff8802263bfc98 ffffffff813af542 ffff8802263bfce8
      [  588.966219]  0000000000000000 ffff8802263bfcd8 ffffffff81081e8b 00000b96345ee000
      [  588.966220]  ffffffffa021ae1c ffff880215703b48 00000000000005fe ffff8802345ee000
      [  588.966221] Call Trace:
      [  588.966223]  [<ffffffff813af542>] dump_stack+0x63/0x81
      [  588.966224]  [<ffffffff81081e8b>] __warn+0xcb/0xf0
      [  588.966225]  [<ffffffff81081eff>] warn_slowpath_fmt+0x4f/0x60
      [  588.966233]  [<ffffffffa017e93c>] btrfs_run_delayed_refs+0x28c/0x2c0 [btrfs]
      [  588.966241]  [<ffffffffa017ea04>] delayed_ref_async_start+0x94/0xb0 [btrfs]
      [  588.966250]  [<ffffffffa01c799a>] btrfs_scrubparity_helper+0xca/0x350 [btrfs]
      [  588.966259]  [<ffffffffa01c7c5e>] btrfs_extent_refs_helper+0xe/0x10 [btrfs]
      [  588.966260]  [<ffffffff8109b323>] process_one_work+0x1f3/0x4e0
      [  588.966261]  [<ffffffff8109b658>] worker_thread+0x48/0x4e0
      [  588.966263]  [<ffffffff8109b610>] ? process_one_work+0x4e0/0x4e0
      [  588.966264]  [<ffffffff810a1659>] kthread+0xc9/0xe0
      [  588.966265]  [<ffffffff816f2f1f>] ret_from_fork+0x1f/0x40
      [  588.966267]  [<ffffffff810a1590>] ? kthread_worker_fn+0x170/0x170
      [  588.966268] ---[ end trace 34e5232c933a174a ]---
      [  588.966269] BTRFS: error (device sda2) in btrfs_run_delayed_refs:2966: errno=-5 IO failure
      [  588.966270] BTRFS info (device sda2): forced readonly
      
      This was happening often on openSUSE and SLE systems using btrfs as the
      root filesystem (with its default layout where multiple subvolumes are
      used) where balance happens in the background triggered by a cron job and
      snapshots are automatically created before/after package installations,
      upgrades and removals. The issue could be triggered simply by running the
      following loop on the first system boot post installation:
      
        while true; do
           zypper -n in nfs-kernel-server
           zypper -n rm nfs-kernel-server
        done
      
      (If we were fast enough and made that loop before the cron job triggered
      a balance operation and the balance finished)
      
      So fix by setting the last_snapshot field of the root to the value of the
      generation of its commit root. Like this btrfs_block_can_be_shared()
      behaves correctly for the case where the relocation root is created during
      a transaction commit and for the case where it's created before a
      transaction commit.
      
      Fixes: 6426c7ad (btrfs: qgroup: Fix qgroup accounting when creating snapshot)
      Cc: stable@vger.kernel.org  # 4.7+
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Reviewed-by: default avatarJosef Bacik <jbacik@fb.com>
      054570a1
  3. 01 Nov, 2016 1 commit
  4. 27 Oct, 2016 1 commit
    • Chris Mason's avatar
      btrfs: fix races on root_log_ctx lists · 570dd450
      Chris Mason authored
      btrfs_remove_all_log_ctxs takes a shortcut where it avoids walking the
      list because it knows all of the waiters are patiently waiting for the
      commit to finish.
      
      But, there's a small race where btrfs_sync_log can remove itself from
      the list if it finds a log commit is already done.  Also, it uses
      list_del_init() to remove itself from the list, but there's no way to
      know if btrfs_remove_all_log_ctxs has already run, so we don't know for
      sure if it is safe to call list_del_init().
      
      This gets rid of all the shortcuts for btrfs_remove_all_log_ctxs(), and
      just calls it with the proper locking.
      
      This is part two of the corruption fixed by cbd60aa7.  I should have
      done this in the first place, but convinced myself the optimizations were
      safe.  A 12 hour run of dbench 2048 will eventually trigger a list debug
      WARN_ON for the list_del_init() in btrfs_sync_log().
      
      Fixes: d1433debReported-by: default avatarDave Jones <davej@codemonkey.org.uk>
      cc: stable@vger.kernel.org # 3.15+
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      570dd450
  5. 24 Oct, 2016 5 commits
  6. 18 Oct, 2016 1 commit
  7. 17 Oct, 2016 1 commit
    • Liu Bo's avatar
      Btrfs: kill BUG_ON in do_relocation · 4547f4d8
      Liu Bo authored
      While updating btree, we try to push items between sibling
      nodes/leaves in order to keep height as low as possible.
      But we don't memset the original places with zero when
      pushing items so that we could end up leaving stale content
      in nodes/leaves.  One may read the above stale content by
      increasing btree blocks' @nritems.
      
      One case I've come across is that in fs tree, a leaf has two
      parent nodes, hence running balance ends up with processing
      this leaf with two parent nodes, but it can only reach the
      valid parent node through btrfs_search_slot, so it'd be like,
      
      do_relocation
          for P in all parent nodes of block A:
              if !P->eb:
                  btrfs_search_slot(key);   --> get path from P to A.
              if lowest:
                  BUG_ON(A->bytenr != bytenr of A recorded in P);
              btrfs_cow_block(P, A);   --> change A's bytenr in P.
      
      After btrfs_cow_block, P has the new bytenr of A, but with the
      same @key, we get the same path again, and get panic by BUG_ON.
      
      Note that this is only happening in a corrupted fs, for a
      regular fs in which we have correct @nritems so that we won't
      read stale content in any case.
      Reviewed-by: default avatarJosef Bacik <jbacik@fb.com>
      Signed-off-by: default avatarLiu Bo <bo.li.liu@oracle.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      4547f4d8
  8. 12 Oct, 2016 2 commits
    • Chris Mason's avatar
      Merge branch 'fst-fixes' of... · d9ed71e5
      Chris Mason authored
      Merge branch 'fst-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.9
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      d9ed71e5
    • Filipe Manana's avatar
      Btrfs: fix incremental send failure caused by balance · d5e84fd8
      Filipe Manana authored
      Commit 95155585 ("Btrfs: send, don't bug on inconsistent snapshots")
      removed some BUG_ON() statements (replacing them with returning errors
      to user space and logging error messages) when a snapshot is in an
      inconsistent state due to failures to update a delayed inode item (ENOMEM
      or ENOSPC) after adding/updating/deleting references, xattrs or file
      extent items.
      
      However there is a case, when no errors happen, where a file extent item
      can be modified without having the corresponding inode item updated. This
      case happens during balance under very specific timings, when relocation
      is in the stage where it updates data pointers and a leaf that contains
      file extent items is COWed. When that happens file extent items get their
      disk_bytenr field updated to a new value that reflects the post relocation
      logical address of the extent, without updating their respective inode
      items (as there is nothing that needs to be updated on them). This is
      performed at relocation.c:replace_file_extents() through
      relocation.c:btrfs_reloc_cow_block().
      
      So make an incremental send deal with this case and don't do any processing
      for a file extent item that got its disk_bytenr field updated by relocation,
      since the extent's data is the same as the one pointed by the file extent
      item in the parent snapshot.
      
      After the recent commit mentioned above this case resulted in EIO errors
      returned to user space (and an error message logged to dmesg/syslog) when
      doing an incremental send, while before it, it resulted in hitting a
      BUG_ON leading to the following trace:
      
      [  952.206705] ------------[ cut here ]------------
      [  952.206714] kernel BUG at ../fs/btrfs/send.c:5653!
      [  952.206719] Internal error: Oops - BUG: 0 [#1] SMP
      [  952.209854] Modules linked in: st dm_mod nls_utf8 isofs fuse nf_log_ipv6 xt_pkttype xt_physdev br_netfilter nf_log_ipv4 nf_log_common xt_LOG xt_limit ebtable_filter ebtables af_packet bridge stp llc ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_raw ipt_REJECT iptable_raw xt_CT iptable_filter ip6table_mangle nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables xt_conntrack nf_conntrack ip6table_filter ip6_tables x_tables xfs libcrc32c nls_iso8859_1 nls_cp437 vfat fat joydev aes_ce_blk ablk_helper cryptd snd_intel8x0 aes_ce_cipher snd_ac97_codec ac97_bus snd_pcm ghash_ce sha2_ce sha1_ce snd_timer snd virtio_net soundcore btrfs xor sr_mod cdrom hid_generic usbhid raid6_pq virtio_blk virtio_scsi bochs_drm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm virtio_mmio xhci_pci xhci_hcd usbcore usb_common virtio_pci virtio_ring virtio drm sg efivarfs
      [  952.228333] Supported: Yes
      [  952.228908] CPU: 0 PID: 12779 Comm: snapperd Not tainted 4.4.14-50-default #1
      [  952.230329] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
      [  952.231683] task: ffff800058e94100 ti: ffff8000d866c000 task.ti: ffff8000d866c000
      [  952.233279] PC is at changed_cb+0x9f4/0xa48 [btrfs]
      [  952.234375] LR is at changed_cb+0x58/0xa48 [btrfs]
      [  952.236552] pc : [<ffff7ffffc39de7c>] lr : [<ffff7ffffc39d4e0>] pstate: 80000145
      [  952.238049] sp : ffff8000d866fa20
      [  952.238732] x29: ffff8000d866fa20 x28: 0000000000000019
      [  952.239840] x27: 00000000000028d5 x26: 00000000000024a2
      [  952.241008] x25: 0000000000000002 x24: ffff8000e66e92f0
      [  952.242131] x23: ffff8000b8c76800 x22: ffff800092879140
      [  952.243238] x21: 0000000000000002 x20: ffff8000d866fb78
      [  952.244348] x19: ffff8000b8f8c200 x18: 0000000000002710
      [  952.245607] x17: 0000ffff90d42480 x16: ffff800000237dc0
      [  952.246719] x15: 0000ffff90de7510 x14: ab000c000a2faf08
      [  952.247835] x13: 0000000000577c2b x12: ab000c000b696665
      [  952.248981] x11: 2e65726f632f6966 x10: 652d34366d72612f
      [  952.250101] x9 : 32627572672f746f x8 : ab000c00092f1671
      [  952.251352] x7 : 8000000000577c2b x6 : ffff800053eadf45
      [  952.252468] x5 : 0000000000000000 x4 : ffff80005e169494
      [  952.253582] x3 : 0000000000000004 x2 : ffff8000d866fb78
      [  952.254695] x1 : 000000000003e2a3 x0 : 000000000003e2a4
      [  952.255803]
      [  952.256150] Process snapperd (pid: 12779, stack limit = 0xffff8000d866c020)
      [  952.257516] Stack: (0xffff8000d866fa20 to 0xffff8000d8670000)
      [  952.258654] fa20: ffff8000d866fae0 ffff7ffffc308fc0 ffff800092879140 ffff8000e66e92f0
      [  952.260219] fa40: 0000000000000035 ffff800055de6000 ffff8000b8c76800 ffff8000d866fb78
      [  952.261745] fa60: 0000000000000002 00000000000024a2 00000000000028d5 0000000000000019
      [  952.263269] fa80: ffff8000d866fae0 ffff7ffffc3090f0 ffff8000d866fae0 ffff7ffffc309128
      [  952.264797] faa0: ffff800092879140 ffff8000e66e92f0 0000000000000035 ffff800055de6000
      [  952.268261] fac0: ffff8000b8c76800 ffff8000d866fb78 0000000000000002 0000000000001000
      [  952.269822] fae0: ffff8000d866fbc0 ffff7ffffc39ecfc ffff8000b8f8c200 ffff8000b8f8c368
      [  952.271368] fb00: ffff8000b8f8c378 ffff800055de6000 0000000000000001 ffff8000ecb17500
      [  952.272893] fb20: ffff8000b8c76800 ffff800092879140 ffff800062b6d000 ffff80007a9e2470
      [  952.274420] fb40: ffff8000b8f8c208 0000000005784000 ffff8000580a8000 ffff8000b8f8c200
      [  952.276088] fb60: ffff7ffffc39d488 00000002b8f8c368 0000000000000000 000000000003e2a4
      [  952.280275] fb80: 000000000000006c ffff7ffffc39ec00 000000000003e2a4 000000000000006c
      [  952.283219] fba0: ffff8000b8f8c300 0000000000000100 0000000000000001 ffff8000ecb17500
      [  952.286166] fbc0: ffff8000d866fcd0 ffff7ffffc3643c0 ffff8000f8842700 0000ffff8ffe9278
      [  952.289136] fbe0: 0000000040489426 ffff800055de6000 0000ffff8ffe9278 0000000040489426
      [  952.292083] fc00: 000000000000011d 000000000000001d ffff80007a9e4598 ffff80007a9e43e8
      [  952.294959] fc20: ffff8000b8c7693f 0000000000003b24 0000000000000019 ffff8000b8f8c218
      [  952.301161] fc40: 00000001d866fc70 ffff8000b8c76800 0000000000000128 ffffffffffffff84
      [  952.305749] fc60: ffff800058e941ff 0000000000003a58 ffff8000d866fcb0 ffff8000000f7390
      [  952.308875] fc80: 000000000000012a 0000000000010290 ffff8000d866fc00 000000000000007b
      [  952.311915] fca0: 0000000000010290 ffff800046c1b100 74732d7366727462 000001006d616572
      [  952.314937] fcc0: ffff8000fffc4100 cb88537fdc8ba60e ffff8000d866fe10 ffff8000002499e8
      [  952.318008] fce0: 0000000040489426 ffff8000f8842700 0000ffff8ffe9278 ffff80007a9e4598
      [  952.321321] fd00: 0000ffff8ffe9278 0000000040489426 000000000000011d 000000000000001d
      [  952.324280] fd20: ffff80000072c000 ffff8000d866c000 ffff8000d866fda0 ffff8000000e997c
      [  952.327156] fd40: ffff8000fffc4180 00000000000031ed ffff8000fffc4180 ffff800046c1b7d4
      [  952.329895] fd60: 0000000000000140 0000ffff907ea170 000000000000011d 00000000000000dc
      [  952.334641] fd80: ffff80000072c000 ffff8000d866c000 0000000000000000 0000000000000002
      [  952.338002] fda0: ffff8000d866fdd0 ffff8000000ebacc ffff800046c1b080 ffff800046c1b7d4
      [  952.340724] fdc0: ffff8000d866fdf0 ffff8000000db67c 0000000000000040 ffff800000e69198
      [  952.343415] fde0: 0000ffff8ffea790 00000000000031ed ffff8000d866fe20 ffff800000254000
      [  952.346101] fe00: 000000000000001d 0000000000000004 ffff8000d866fe90 ffff800000249d3c
      [  952.348980] fe20: ffff8000f8842700 0000000000000000 ffff8000f8842701 0000000000000008
      [  952.351696] fe40: ffff8000d866fe70 0000000000000008 ffff8000d866fe90 ffff800000249cf8
      [  952.354387] fe60: ffff8000f8842700 0000ffff8ffe9170 ffff8000f8842701 0000000000000008
      [  952.357083] fe80: 0000ffff8ffe9278 ffff80008ff85500 0000ffff8ffe90c0 ffff800000085c84
      [  952.359800] fea0: 0000000000000000 0000ffff8ffe9170 ffffffffffffffff 0000ffff90d473bc
      [  952.365351] fec0: 0000000000000000 0000000000000015 0000000000000008 0000000040489426
      [  952.369550] fee0: 0000ffff8ffe9278 0000ffff907ea790 0000ffff907ea170 0000ffff907ea790
      [  952.372416] ff00: 0000ffff907ea170 0000000000000000 000000000000001d 0000000000000004
      [  952.375223] ff20: 0000ffff90a32220 00000000003d0f00 0000ffff907ea0a0 0000ffff8ffe8f30
      [  952.378099] ff40: 0000ffff9100f554 0000ffff91147000 0000ffff91117bc0 0000ffff90d473b0
      [  952.381115] ff60: 0000ffff9100f620 0000ffff880069b0 0000ffff8ffe9170 0000ffff8ffe91a0
      [  952.384003] ff80: 0000ffff8ffe9160 0000ffff8ffe9140 0000ffff88006990 0000ffff8ffe9278
      [  952.386860] ffa0: 0000ffff88008a60 0000ffff8ffe9480 0000ffff88014ca0 0000ffff8ffe90c0
      [  952.389654] ffc0: 0000ffff910be8e8 0000ffff8ffe90c0 0000ffff90d473bc 0000000000000000
      [  952.410986] ffe0: 0000000000000008 000000000000001d 6e2079747265706f 72616d223d656d61
      [  952.415497] Call trace:
      [  952.417403] [<ffff7ffffc39de7c>] changed_cb+0x9f4/0xa48 [btrfs]
      [  952.420023] [<ffff7ffffc308fc0>] btrfs_compare_trees+0x500/0x6b0 [btrfs]
      [  952.422759] [<ffff7ffffc39ecfc>] btrfs_ioctl_send+0xb4c/0xe10 [btrfs]
      [  952.425601] [<ffff7ffffc3643c0>] btrfs_ioctl+0x374/0x29a4 [btrfs]
      [  952.428031] [<ffff8000002499e8>] do_vfs_ioctl+0x33c/0x600
      [  952.430360] [<ffff800000249d3c>] SyS_ioctl+0x90/0xa4
      [  952.432552] [<ffff800000085c84>] el0_svc_naked+0x38/0x3c
      [  952.434803] Code: 2a1503e0 17fffdac b9404282 17ffff28 (d4210000)
      [  952.437457] ---[ end trace 9afd7090c466cf15 ]---
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      d5e84fd8
  9. 10 Oct, 2016 1 commit
  10. 03 Oct, 2016 7 commits
  11. 26 Sep, 2016 16 commits
    • Liu Bo's avatar
      Btrfs: remove unnecessary btrfs_mark_buffer_dirty in split_leaf · 196e0249
      Liu Bo authored
      When we're not able to get enough space through splitting leaf,
      we'd create a new sibling leaf instead, and it's possible that we return
       a zero-nritem sibling leaf and mark it dirty before it's in a consistent
      state.  With CONFIG_BTRFS_FS_CHECK_INTEGRITY=y, the integrity check of
      check_leaf will report panic due to this zero-nritem non-root leaf.
      
      This removes the unnecessary btrfs_mark_buffer_dirty.
      Reported-by: default avatarFilipe Manana <fdmanana@gmail.com>
      Signed-off-by: default avatarLiu Bo <bo.li.liu@oracle.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      196e0249
    • Josef Bacik's avatar
      Btrfs: don't BUG() during drop snapshot · 4867268c
      Josef Bacik authored
      Really there's lots of things that can go wrong here, kill all the
      BUG_ON()'s and replace the logic ones with ASSERT()'s and return EIO
      instead.
      Signed-off-by: default avatarJosef Bacik <jbacik@fb.com>
      [ switched to btrfs_err, errors go to common label ]
      Reviewed-by: default avatarLiu Bo <bo.li.liu@oracle.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      4867268c
    • Arnd Bergmann's avatar
      btrfs: fix btrfs_no_printk stub helper · 2fd57fcb
      Arnd Bergmann authored
      The addition of btrfs_no_printk() caused a build failure when
      CONFIG_PRINTK is disabled:
      
      fs/btrfs/send.c: In function 'send_rename':
      fs/btrfs/ctree.h:3367:2: error: implicit declaration of function 'btrfs_no_printk' [-Werror=implicit-function-declaration]
      
      This moves the helper outside of that #ifdef so it is always
      defined, and changes the existing #ifdef to refer to that
      helper as well for consistency.
      
      Fixes: 47c57058ff2c ("btrfs: btrfs_debug should consume fs_info when DEBUG is not defined")
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      2fd57fcb
    • Liu Bo's avatar
      Btrfs: memset to avoid stale content in btree leaf · 851cd173
      Liu Bo authored
      This is an additional patch to
      "Btrfs: memset to avoid stale content in btree node block".
      
      This uses memset to initialize the unused space in a leaf to avoid
      potential stale content, which may be incurred by pushing items
      between sibling leaves.
      Signed-off-by: default avatarLiu Bo <bo.li.liu@oracle.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      851cd173
    • Goldwyn Rodrigues's avatar
      btrfs: parent_start initialization cleanup · 0f5053eb
      Goldwyn Rodrigues authored
      Code cleanup. parent_start is initialized multiple times when it is
      not necessary to do so.
      Signed-off-by: default avatarGoldwyn Rodrigues <rgoldwyn@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      0f5053eb
    • Goldwyn Rodrigues's avatar
      btrfs: Remove already completed TODO comment · 6cea66e5
      Goldwyn Rodrigues authored
      Fixes: 7cf5b976 ("btrfs: qgroup: Cleanup old inaccurate facilities")
      Signed-off-by: default avatarGoldwyn Rodrigues <rgoldwyn@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      6cea66e5
    • Goldwyn Rodrigues's avatar
      btrfs: Do not reassign count in btrfs_run_delayed_refs · dd12d5b8
      Goldwyn Rodrigues authored
      Code cleanup. count is already (unsgined long)-1. That is the reason
      run_all was set. Do not reassign it (unsigned long)-1.
      Signed-off-by: default avatarGoldwyn Rodrigues <rgoldwyn@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      dd12d5b8
    • Anand Jain's avatar
      btrfs: fix a possible umount deadlock · 0ccd0528
      Anand Jain authored
      btrfs_show_devname() is using the device_list_mutex, sometimes
      a call to blkdev_put() leads vfs calling into this func. So
      call blkdev_put() outside of device_list_mutex, as of now.
      
      [  983.284212] ======================================================
      [  983.290401] [ INFO: possible circular locking dependency detected ]
      [  983.296677] 4.8.0-rc5-ceph-00023-g1b39cec2 #1 Not tainted
      [  983.302081] -------------------------------------------------------
      [  983.308357] umount/21720 is trying to acquire lock:
      [  983.313243]  (&bdev->bd_mutex){+.+.+.}, at: [<ffffffff9128ec51>] blkdev_put+0x31/0x150
      [  983.321264]
      [  983.321264] but task is already holding lock:
      [  983.327101]  (&fs_devs->device_list_mutex){+.+...}, at: [<ffffffffc033d6f6>] __btrfs_close_devices+0x46/0x200 [btrfs]
      [  983.337839]
      [  983.337839] which lock already depends on the new lock.
      [  983.337839]
      [  983.346024]
      [  983.346024] the existing dependency chain (in reverse order) is:
      [  983.353512]
      -> #4 (&fs_devs->device_list_mutex){+.+...}:
      [  983.359096]        [<ffffffff910dfd0c>] lock_acquire+0x1bc/0x1f0
      [  983.365143]        [<ffffffff91823125>] mutex_lock_nested+0x65/0x350
      [  983.371521]        [<ffffffffc02d8116>] btrfs_show_devname+0x36/0x1f0 [btrfs]
      [  983.378710]        [<ffffffff9129523e>] show_vfsmnt+0x4e/0x150
      [  983.384593]        [<ffffffff9126ffc7>] m_show+0x17/0x20
      [  983.389957]        [<ffffffff91276405>] seq_read+0x2b5/0x3b0
      [  983.395669]        [<ffffffff9124c808>] __vfs_read+0x28/0x100
      [  983.401464]        [<ffffffff9124eb3b>] vfs_read+0xab/0x150
      [  983.407080]        [<ffffffff9124ec32>] SyS_read+0x52/0xb0
      [  983.412609]        [<ffffffff91825fc0>] entry_SYSCALL_64_fastpath+0x23/0xc1
      [  983.419617]
      -> #3 (namespace_sem){++++++}:
      [  983.424024]        [<ffffffff910dfd0c>] lock_acquire+0x1bc/0x1f0
      [  983.430074]        [<ffffffff918239e9>] down_write+0x49/0x80
      [  983.435785]        [<ffffffff91272457>] lock_mount+0x67/0x1c0
      [  983.441582]        [<ffffffff91272ab2>] do_add_mount+0x32/0xf0
      [  983.447458]        [<ffffffff9127363a>] finish_automount+0x5a/0xc0
      [  983.453682]        [<ffffffff91259513>] follow_managed+0x1b3/0x2a0
      [  983.459912]        [<ffffffff9125b750>] lookup_fast+0x300/0x350
      [  983.465875]        [<ffffffff9125d6e7>] path_openat+0x3a7/0xaa0
      [  983.471846]        [<ffffffff9125ef75>] do_filp_open+0x85/0xe0
      [  983.477731]        [<ffffffff9124c41c>] do_sys_open+0x14c/0x1f0
      [  983.483702]        [<ffffffff9124c4de>] SyS_open+0x1e/0x20
      [  983.489240]        [<ffffffff91825fc0>] entry_SYSCALL_64_fastpath+0x23/0xc1
      [  983.496254]
      -> #2 (&sb->s_type->i_mutex_key#3){+.+.+.}:
      [  983.501798]        [<ffffffff910dfd0c>] lock_acquire+0x1bc/0x1f0
      [  983.507855]        [<ffffffff918239e9>] down_write+0x49/0x80
      [  983.513558]        [<ffffffff91366237>] start_creating+0x87/0x100
      [  983.519703]        [<ffffffff91366647>] debugfs_create_dir+0x17/0x100
      [  983.526195]        [<ffffffff911df153>] bdi_register+0x93/0x210
      [  983.532165]        [<ffffffff911df313>] bdi_register_owner+0x43/0x70
      [  983.538570]        [<ffffffff914080fb>] device_add_disk+0x1fb/0x450
      [  983.544888]        [<ffffffff91580226>] loop_add+0x1e6/0x290
      [  983.550596]        [<ffffffff91fec358>] loop_init+0x10b/0x14f
      [  983.556394]        [<ffffffff91002207>] do_one_initcall+0xa7/0x180
      [  983.562618]        [<ffffffff91f932e0>] kernel_init_freeable+0x1cc/0x266
      [  983.569370]        [<ffffffff918174be>] kernel_init+0xe/0x100
      [  983.575166]        [<ffffffff9182620f>] ret_from_fork+0x1f/0x40
      [  983.581131]
      -> #1 (loop_index_mutex){+.+.+.}:
      [  983.585801]        [<ffffffff910dfd0c>] lock_acquire+0x1bc/0x1f0
      [  983.591858]        [<ffffffff91823125>] mutex_lock_nested+0x65/0x350
      [  983.598256]        [<ffffffff9157ed3f>] lo_open+0x1f/0x60
      [  983.603704]        [<ffffffff9128eec3>] __blkdev_get+0x123/0x400
      [  983.609757]        [<ffffffff9128f4ea>] blkdev_get+0x34a/0x350
      [  983.615639]        [<ffffffff9128f554>] blkdev_open+0x64/0x80
      [  983.621428]        [<ffffffff9124aff6>] do_dentry_open+0x1c6/0x2d0
      [  983.627651]        [<ffffffff9124c029>] vfs_open+0x69/0x80
      [  983.633181]        [<ffffffff9125db74>] path_openat+0x834/0xaa0
      [  983.639152]        [<ffffffff9125ef75>] do_filp_open+0x85/0xe0
      [  983.645035]        [<ffffffff9124c41c>] do_sys_open+0x14c/0x1f0
      [  983.650999]        [<ffffffff9124c4de>] SyS_open+0x1e/0x20
      [  983.656535]        [<ffffffff91825fc0>] entry_SYSCALL_64_fastpath+0x23/0xc1
      [  983.663541]
      -> #0 (&bdev->bd_mutex){+.+.+.}:
      [  983.668107]        [<ffffffff910def43>] __lock_acquire+0x1003/0x17b0
      [  983.674510]        [<ffffffff910dfd0c>] lock_acquire+0x1bc/0x1f0
      [  983.680561]        [<ffffffff91823125>] mutex_lock_nested+0x65/0x350
      [  983.686967]        [<ffffffff9128ec51>] blkdev_put+0x31/0x150
      [  983.692761]        [<ffffffffc033481f>] btrfs_close_bdev+0x4f/0x60 [btrfs]
      [  983.699699]        [<ffffffffc033d77b>] __btrfs_close_devices+0xcb/0x200 [btrfs]
      [  983.707178]        [<ffffffffc033d8db>] btrfs_close_devices+0x2b/0xa0 [btrfs]
      [  983.714380]        [<ffffffffc03081c5>] close_ctree+0x265/0x340 [btrfs]
      [  983.721061]        [<ffffffffc02d7959>] btrfs_put_super+0x19/0x20 [btrfs]
      [  983.727908]        [<ffffffff91250e2f>] generic_shutdown_super+0x6f/0x100
      [  983.734744]        [<ffffffff91250f56>] kill_anon_super+0x16/0x30
      [  983.740888]        [<ffffffffc02da97e>] btrfs_kill_super+0x1e/0x130 [btrfs]
      [  983.747909]        [<ffffffff91250fe9>] deactivate_locked_super+0x49/0x80
      [  983.754745]        [<ffffffff912515fd>] deactivate_super+0x5d/0x70
      [  983.760977]        [<ffffffff91270a1c>] cleanup_mnt+0x5c/0x80
      [  983.766773]        [<ffffffff91270a92>] __cleanup_mnt+0x12/0x20
      [  983.772738]        [<ffffffff910aa2fe>] task_work_run+0x7e/0xc0
      [  983.778708]        [<ffffffff91081b5a>] exit_to_usermode_loop+0x7e/0xb4
      [  983.785373]        [<ffffffff910039eb>] syscall_return_slowpath+0xbb/0xd0
      [  983.792212]        [<ffffffff9182605c>] entry_SYSCALL_64_fastpath+0xbf/0xc1
      [  983.799225]
      [  983.799225] other info that might help us debug this:
      [  983.799225]
      [  983.807291] Chain exists of:
        &bdev->bd_mutex --> namespace_sem --> &fs_devs->device_list_mutex
      
      [  983.816521]  Possible unsafe locking scenario:
      [  983.816521]
      [  983.822489]        CPU0                    CPU1
      [  983.827043]        ----                    ----
      [  983.831599]   lock(&fs_devs->device_list_mutex);
      [  983.836289]                                lock(namespace_sem);
      [  983.842268]                                lock(&fs_devs->device_list_mutex);
      [  983.849478]   lock(&bdev->bd_mutex);
      [  983.853127]
      [  983.853127]  *** DEADLOCK ***
      [  983.853127]
      [  983.859113] 3 locks held by umount/21720:
      [  983.863145]  #0:  (&type->s_umount_key#35){++++..}, at: [<ffffffff912515f5>] deactivate_super+0x55/0x70
      [  983.872713]  #1:  (uuid_mutex){+.+.+.}, at: [<ffffffffc033d8d3>] btrfs_close_devices+0x23/0xa0 [btrfs]
      [  983.882206]  #2:  (&fs_devs->device_list_mutex){+.+...}, at: [<ffffffffc033d6f6>] __btrfs_close_devices+0x46/0x200 [btrfs]
      [  983.893422]
      [  983.893422] stack backtrace:
      [  983.897824] CPU: 6 PID: 21720 Comm: umount Not tainted 4.8.0-rc5-ceph-00023-g1b39cec2 #1
      [  983.905958] Hardware name: Supermicro SYS-5018R-WR/X10SRW-F, BIOS 1.0c 09/07/2015
      [  983.913492]  0000000000000000 ffff8c8a53c17a38 ffffffff91429521 ffffffff9260f4f0
      [  983.921018]  ffffffff92642760 ffff8c8a53c17a88 ffffffff911b2b04 0000000000000050
      [  983.928542]  ffffffff9237d620 ffff8c8a5294aee0 ffff8c8a5294aeb8 ffff8c8a5294aee0
      [  983.936072] Call Trace:
      [  983.938545]  [<ffffffff91429521>] dump_stack+0x85/0xc4
      [  983.943715]  [<ffffffff911b2b04>] print_circular_bug+0x1fb/0x20c
      [  983.949748]  [<ffffffff910def43>] __lock_acquire+0x1003/0x17b0
      [  983.955613]  [<ffffffff910dfd0c>] lock_acquire+0x1bc/0x1f0
      [  983.961123]  [<ffffffff9128ec51>] ? blkdev_put+0x31/0x150
      [  983.966550]  [<ffffffff91823125>] mutex_lock_nested+0x65/0x350
      [  983.972407]  [<ffffffff9128ec51>] ? blkdev_put+0x31/0x150
      [  983.977832]  [<ffffffff9128ec51>] blkdev_put+0x31/0x150
      [  983.983101]  [<ffffffffc033481f>] btrfs_close_bdev+0x4f/0x60 [btrfs]
      [  983.989500]  [<ffffffffc033d77b>] __btrfs_close_devices+0xcb/0x200 [btrfs]
      [  983.996415]  [<ffffffffc033d8db>] btrfs_close_devices+0x2b/0xa0 [btrfs]
      [  984.003068]  [<ffffffffc03081c5>] close_ctree+0x265/0x340 [btrfs]
      [  984.009189]  [<ffffffff9126cc5e>] ? evict_inodes+0x15e/0x170
      [  984.014881]  [<ffffffffc02d7959>] btrfs_put_super+0x19/0x20 [btrfs]
      [  984.021176]  [<ffffffff91250e2f>] generic_shutdown_super+0x6f/0x100
      [  984.027476]  [<ffffffff91250f56>] kill_anon_super+0x16/0x30
      [  984.033082]  [<ffffffffc02da97e>] btrfs_kill_super+0x1e/0x130 [btrfs]
      [  984.039548]  [<ffffffff91250fe9>] deactivate_locked_super+0x49/0x80
      [  984.045839]  [<ffffffff912515fd>] deactivate_super+0x5d/0x70
      [  984.051525]  [<ffffffff91270a1c>] cleanup_mnt+0x5c/0x80
      [  984.056774]  [<ffffffff91270a92>] __cleanup_mnt+0x12/0x20
      [  984.062201]  [<ffffffff910aa2fe>] task_work_run+0x7e/0xc0
      [  984.067625]  [<ffffffff91081b5a>] exit_to_usermode_loop+0x7e/0xb4
      [  984.073747]  [<ffffffff910039eb>] syscall_return_slowpath+0xbb/0xd0
      [  984.080038]  [<ffffffff9182605c>] entry_SYSCALL_64_fastpath+0xbf/0xc1
      Reported-by: default avatarIlya Dryomov <idryomov@gmail.com>
      Signed-off-by: default avatarAnand Jain <anand.jain@oracle.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      0ccd0528
    • Liu Bo's avatar
      Btrfs: fix memory leak in do_walk_down · a958eab0
      Liu Bo authored
      The extent buffer 'next' needs to be free'd conditionally.
      Signed-off-by: default avatarLiu Bo <bo.li.liu@oracle.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      a958eab0
    • Jeff Mahoney's avatar
      btrfs: btrfs_debug should consume fs_info when DEBUG is not defined · c01f5f96
      Jeff Mahoney authored
      We can hit unused variable warnings when btrfs_debug and friends are
      just aliases for no_printk.  This is due to the fs_info not getting
      consumed by the function call, which can happen if convenenience
      variables are used.  This patch adds a new btrfs_no_printk static inline
      that consumes the convenience variable and does nothing else.  It
      silences the unused variable warning and has no impact on the generated
      code:
      
      $ size fs/btrfs/extent_io.o*
         text	   data	    bss	    dec	    hex	filename
        44072	    152	     32	  44256	   ace0	fs/btrfs/extent_io.o.btrfs_no_printk
        44072	    152	     32	  44256	   ace0	fs/btrfs/extent_io.o.no_printk
      
      Fixes: 27a0dd61 (Btrfs: make btrfs_debug match pr_debug handling related to DEBUG)
      Signed-off-by: default avatarJeff Mahoney <jeffm@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      c01f5f96
    • Jeff Mahoney's avatar
      btrfs: convert send's verbose_printk to btrfs_debug · 04ab956e
      Jeff Mahoney authored
      This was basically an open-coded, less flexible dynamic printk.  We can
      just use btrfs_debug instead.
      Signed-off-by: default avatarJeff Mahoney <jeffm@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      04ab956e
    • Jeff Mahoney's avatar
      btrfs: convert pr_* to btrfs_* where possible · ab8d0fc4
      Jeff Mahoney authored
      For many printks, we want to know which file system issued the message.
      
      This patch converts most pr_* calls to use the btrfs_* versions instead.
      In some cases, this means adding plumbing to allow call sites access to
      an fs_info pointer.
      
      fs/btrfs/check-integrity.c is left alone for another day.
      Signed-off-by: default avatarJeff Mahoney <jeffm@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      ab8d0fc4
    • Jeff Mahoney's avatar
      btrfs: convert printk(KERN_* to use pr_* calls · 62e85577
      Jeff Mahoney authored
      This patch converts printk(KERN_* style messages to use the pr_* versions.
      
      One side effect is that anything that was KERN_DEBUG is now automatically
      a dynamic debug message.
      Signed-off-by: default avatarJeff Mahoney <jeffm@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      62e85577
    • Jeff Mahoney's avatar
      btrfs: unsplit printed strings · 5d163e0e
      Jeff Mahoney authored
      CodingStyle chapter 2:
      "[...] never break user-visible strings such as printk messages,
      because that breaks the ability to grep for them."
      
      This patch unsplits user-visible strings.
      Signed-off-by: default avatarJeff Mahoney <jeffm@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      5d163e0e
    • Jeff Mahoney's avatar
      btrfs: clean the old superblocks before freeing the device · cea67ab9
      Jeff Mahoney authored
      btrfs_rm_device frees the block device but then re-opens it using
      the saved device name.  A race exists between the close and the
      re-open that allows the block size to be changed.  The result
      is getting stuck forever in the reclaim loop in __getblk_slow.
      
      This patch moves the superblock cleanup before closing the block
      device, which is also consistent with other callers.  We also don't
      need a private copy of dev_name as the whole routine operates under
      the uuid_mutex.
      Signed-off-by: default avatarJeff Mahoney <jeffm@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      cea67ab9
    • Liu Bo's avatar
      Btrfs: kill BUG_ON in run_delayed_tree_ref · 02794222
      Liu Bo authored
      In a corrupted btrfs image, we can come across this BUG_ON and
      get an unreponsive system, but if we return errors instead,
      its caller can handle everything gracefully by aborting the current
      transaction.
      Signed-off-by: default avatarLiu Bo <bo.li.liu@oracle.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      02794222