Commits · 2135fb9bb4b8d05d288d994c4f9f8077ce90d890 · nexedi / linux

22 Jan, 2018 38 commits

btrfs: sink get_extent parameter to extent_fiemap · 2135fb9b

David Sterba authored Jun 23, 2017

All callers pass btrfs_get_extent_fiemap and we don't expect anything
else in the context of extent_fiemap.
Signed-off-by: David Sterba <dsterba@suse.com>


btrfs: drop get_extent from extent_page_data · 3c98c62f

David Sterba authored Jun 23, 2017

Previous patches cleaned up all places where
extent_page_data::get_extent was set and it was btrfs_get_extent all the
time, so we can simply call that instead.

This also reduces the size of extent_page_data by 8 bytes, which has a
positive effect on stack consumption in various functions on the write-out
path.
Signed-off-by: David Sterba <dsterba@suse.com>
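
As a rough illustration of where the saved bytes come from, the sketch below compares a struct that carries a callback pointer with one that does not. The struct and field names are hypothetical stand-ins, not the real extent_page_data layout; the saving equals the size of a function pointer on the target, which is 8 bytes on 64-bit builds.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type standing in for the get_extent hook. */
typedef void *(*get_extent_fn)(void *inode, unsigned long long start);

/* Before: every instance carries the callback pointer. */
struct page_data_before {
	void *bio;
	void *tree;
	get_extent_fn get_extent;
};

/* After: the only possible callee is called directly, so the field is gone. */
struct page_data_after {
	void *bio;
	void *tree;
};

/* Bytes saved per instance, and so per stack frame that holds one. */
static size_t bytes_saved(void)
{
	assert(sizeof(struct page_data_before) > sizeof(struct page_data_after));
	return sizeof(struct page_data_before) - sizeof(struct page_data_after);
}
```

Calling bytes_saved() on an LP64 system is expected to report the size of one pointer.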


btrfs: sink get_extent parameter to extent_write_full_page · deac642d

David Sterba authored Jun 23, 2017

There's only one caller.
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: sink get_extent parameter to extent_write_locked_range · 916b9298

David Sterba authored Jun 23, 2017

There's only one caller.
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: sink get_extent parameter to extent_writepages · 43317599

David Sterba authored Jun 23, 2017

There's only one caller.
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: Cleanup existing name_len checks · bae15d95

Qu Wenruo authored Nov 08, 2017

Since the tree-checker verifies each leaf when it is read from disk, we
don't need the existing verify_dir_item() or btrfs_is_name_len_valid()
checks.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>


btrfs: tree-checker: Add checker for dir item · ad7b0368

Qu Wenruo authored Nov 08, 2017

Add checker for dir item, for key types DIR_ITEM, DIR_INDEX and
XATTR_ITEM.

This checker does comprehensive checks for:

1) dir_item header and its data size
   Against item boundary and maximum name/xattr length.
   This part is mostly the same as old verify_dir_item().

2) dir_type
   Against maximum file types, and against key type.
   An XATTR key should only have an FT_XATTR dir item, and a normal dir
   item type should not have an XATTR key.

   The check between key->type and dir_type is newly introduced by this
   patch.

3) name hash
   For XATTR and DIR_ITEM key, key->offset is name hash (crc32c).
   Check the hash of the name against the key to ensure it's correct.

   The name hash check is only found in btrfs-progs before this patch.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Su Yue <suy.fnst@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
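
The three checks listed above can be sketched in user space roughly as follows. This is a minimal sketch, not the kernel's tree-checker: the struct, the enum values, NAME_MAX_LEN and the hash seed are all assumptions made for illustration, and the hash here is the standard CRC-32C rather than whatever seed the kernel uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative constants; the kernel's actual values may differ. */
enum { KEY_XATTR_ITEM = 1, KEY_DIR_ITEM, KEY_DIR_INDEX };
enum { FT_REG_FILE = 1, FT_DIR, FT_XATTR, FT_MAX };
#define NAME_MAX_LEN 255u

/* Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78. */
static uint32_t crc32c(uint32_t seed, const void *data, size_t len)
{
	const uint8_t *p = data;
	uint32_t crc = seed;

	while (len--) {
		crc ^= *p++;
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return crc;
}

/* Hypothetical name hash: standard CRC-32C with all-ones init and final XOR. */
static uint32_t name_hash(const char *name, size_t len)
{
	return ~crc32c(~0u, name, len);
}

/* Hypothetical flattened view of one dir item and its key. */
struct dir_item_view {
	uint8_t key_type;
	uint64_t key_offset;
	uint8_t dir_type;
	const char *name;
	size_t name_len;
	size_t item_size; /* bytes available in the item for the name */
};

static bool check_dir_item(const struct dir_item_view *di)
{
	assert(di && di->name);

	/* 1) name must fit the item boundary and the maximum name length */
	if (di->name_len > NAME_MAX_LEN || di->name_len > di->item_size)
		return false;

	/* 2) dir_type must be a known type and agree with the key type */
	if (di->dir_type >= FT_MAX)
		return false;
	if ((di->key_type == KEY_XATTR_ITEM) != (di->dir_type == FT_XATTR))
		return false;

	/* 3) for XATTR and DIR_ITEM keys the key offset is the name hash */
	if (di->key_type != KEY_DIR_INDEX &&
	    di->key_offset != name_hash(di->name, di->name_len))
		return false;

	return true;
}
```

A caller would fill a dir_item_view from the on-disk item and reject the leaf as soon as check_dir_item() returns false.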


btrfs: use GFP_KERNEL in btrfs_alloc_inode · 712e36c5

David Sterba authored Oct 31, 2017

This callback is called directly from VFS, no locks are held at the
allocation time.
Signed-off-by: David Sterba <dsterba@suse.com>


btrfs: sink gfp parameter to clear_extent_uptodate · f08dc36f

David Sterba authored Oct 31, 2017

There's only one callsite with GFP_NOFS.
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: sink gfp parameter to clear_extent_bit · ae0f1625

David Sterba authored Oct 31, 2017

All callers use GFP_NOFS, so we don't have to pass it as an argument. The
built-in tests pass GFP_KERNEL, but they run only at module load time
and NOFS works there as well.
Signed-off-by: David Sterba <dsterba@suse.com>


btrfs: prepare to drop gfp mask parameter from clear_extent_bit · 66b0c887

David Sterba authored Oct 31, 2017

Use __clear_extent_bit directly in case we want to pass unknown
gfp flags. Otherwise all clear_extent_bit callers use GFP_NOFS, so we
can sink them to the function and reduce argument count, at the cost
that __clear_extent_bit has to be exported.
Signed-off-by: David Sterba <dsterba@suse.com>
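
The "sink a parameter" pattern described here can be sketched as a thin wrapper around a fully parameterized function. The names below (clear_bit_with_ctx, clear_bit_nofs, enum alloc_ctx) are hypothetical and only mimic the shape of the change; they are not the btrfs functions.

```c
#include <assert.h>

/* Hypothetical allocation contexts standing in for gfp masks. */
enum alloc_ctx { CTX_NOFS = 1, CTX_KERNEL = 2 };

/* Records the last context seen so the effect is observable. */
static enum alloc_ctx last_ctx;

/* Full variant: kept for the rare caller that needs another context. */
static int clear_bit_with_ctx(unsigned long *word, unsigned int nr,
			      enum alloc_ctx ctx)
{
	int was_set;

	assert(nr < 8 * sizeof(*word));
	last_ctx = ctx;
	was_set = (int)((*word >> nr) & 1UL);
	*word &= ~(1UL << nr);
	return was_set;
}

/* Common variant: the constant every caller used to pass now lives here. */
static int clear_bit_nofs(unsigned long *word, unsigned int nr)
{
	return clear_bit_with_ctx(word, nr, CTX_NOFS);
}
```

The trade-off mirrors the one in the commit message: ordinary callers get a shorter argument list, while the full variant has to stay visible for the exceptions.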


btrfs: use non-RCU list traversal in write_all_supers callees · 1538e6c5

David Sterba authored Jun 16, 2017

We take the fs_devices::device_list_mutex mutex in write_all_supers
which will prevent any add/del changes to the device list. Therefore we
don't need to use the RCU variant list_for_each_entry_rcu in any of the
called functions.
Signed-off-by: David Sterba <dsterba@suse.com>


btrfs: switch to RCU for device traversal in btrfs_ioctl_fs_info · d03262c7

David Sterba authored Jun 16, 2017

We don't need to use the mutex as we do not modify the devices nor the
list itself and just read information about device counts.
Move the fsid copy out of the protected section: like the rest of the
retrieved information, it does not need RCU protection.
Signed-off-by: David Sterba <dsterba@suse.com>


btrfs: switch to RCU for device traversal in btrfs_ioctl_dev_info · c5593ca3

David Sterba authored Jun 16, 2017

We don't need to use the mutex as we do not modify the devices nor the
list itself and just read some information:

does not change during device lifetime:
- devid
- uuid
- name (i.e. the path)

may change in parallel to the ioctl call, but can lead only to reporting
inaccuracy:
- bytes_used
- total_bytes
Signed-off-by: David Sterba <dsterba@suse.com>


btrfs: simplify btrfs_close_bdev · 08ffcae8

David Sterba authored Jun 19, 2017

Split the conditions a bit.
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: document device locking · 9c6b1c4d

David Sterba authored Jun 16, 2017

Overview of the main locks protecting various device-related structures.
Signed-off-by: David Sterba <dsterba@suse.com>


btrfs: simplify exit paths in btrfs_init_new_device · 5c4cf6c9

David Sterba authored Oct 30, 2017

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: use free_device where opencoded · 55de4803

David Sterba authored Oct 30, 2017

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>


btrfs: introduce free_device helper · 48dae9cf

David Sterba authored Oct 30, 2017

A helper to free a device and all its dynamically allocated members,
like the rcu_string name or flush_bio. This is going to replace all
open-coded places.
Signed-off-by: David Sterba <dsterba@suse.com>
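
A user-space sketch of such a helper might look like the following. The struct is a hypothetical, much reduced stand-in for btrfs_device, with a plain string for the name and an opaque buffer in place of flush_bio.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, reduced device structure. */
struct device {
	char *name;      /* dynamically allocated path */
	void *flush_buf; /* stand-in for a preallocated flush request */
};

/* Single place that knows about every dynamically allocated member. */
static void free_device(struct device *dev)
{
	if (!dev)
		return;
	free(dev->name);
	free(dev->flush_buf);
	free(dev);
}

static struct device *alloc_device(const char *path)
{
	struct device *dev;

	assert(path);
	dev = calloc(1, sizeof(*dev));
	if (!dev)
		return NULL;

	dev->name = malloc(strlen(path) + 1);
	dev->flush_buf = malloc(64);
	if (!dev->name || !dev->flush_buf) {
		/* Error path reuses the helper instead of open coding it. */
		free_device(dev);
		return NULL;
	}
	strcpy(dev->name, path);
	return dev;
}
```

Because free(NULL) is a no-op, the same helper serves both the normal teardown and partially initialized objects on error paths.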


btrfs: rename device free rcu helper to free_device_rcu · f06c5965

David Sterba authored Jun 06, 2017

Make it clear that it is an RCU helper; we want to use the name
free_device for a wrapper freeing all device members.
Signed-off-by: David Sterba <dsterba@suse.com>


Btrfs: document rules about bio async submit · 4c274bc6

Liu Bo authored Nov 01, 2017

These rules have been hidden in several if-else branches and are not
straightforward to follow. For example, the dio submit hook's nocsum case
had a bug, i.e. doing async submit instead of sync submit, which has
been fixed recently.

This documents the rules for reference.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>


btrfs: Reduce scope of delayed_rsv->lock in may_commit_trans · 057aac3e

Nikolay Borisov authored Nov 07, 2017

After commit 996478ca ("btrfs: change how we decide to commit
transactions during flushing") there is no need to hold the delayed_rsv
lock during the percpu_counter_compare call, since we take a snapshot of
the bytes earlier. So hold the lock only while reading delayed_rsv.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>


Btrfs: add __init macro to btrfs init functions · f5c29bd9

Liu Bo authored Nov 02, 2017

Adding the __init macro gives the kernel a hint that a function is only
used during the initialization phase and that its memory can be freed
afterwards.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>


btrfs: rename btrfs_add_device to btrfs_add_dev_item · c74a0b02

Anand Jain authored Nov 06, 2017

Function btrfs_add_device() adds the device item, so rename it to
reflect that. Similarly we have btrfs_rm_dev_item().
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>


btrfs: Don't generate UUID for non-fs tree · 33d85fda

Qu Wenruo authored Oct 31, 2017

btrfs_create_tree() unconditionally generates a UUID for any root, so
the quota tree and data reloc tree created by the kernel get unique
UUIDs.

However, the UUID in the root item is only referenced by the UUID tree,
which only records UUIDs for fs trees. This makes unique UUIDs for the
quota and data reloc trees meaningless.

Leave the UUID as zero for non-fs trees, making btrfs-debug-tree output
less confusing.
Reported-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>


btrfs: move volume_mutex into the btrfs_rm_device() · 2c997384

Anand Jain authored Nov 06, 2017

A cleanup patch with no functional change: we hold volume_mutex before
calling btrfs_rm_device, so move it into the function itself.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>


btrfs: Use locked_end rather than open coding it · 96b09dde

Nikolay Borisov authored Nov 01, 2017

Right before we go into this loop locked_end is set to alloc_end - 1 and
is being used in nearby functions, no need to have exceptions. This just
makes the code consistent, no functional changes.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>


btrfs: Move loop termination condition in while() · 6b7d6e93

Nikolay Borisov authored Nov 01, 2017

Fallocating a file in btrfs goes through several stages. The one before
actually inserting the fallocated extents is to create a qgroup
reservation covering the desired range. To this end there is a loop in
btrfs_fallocate which checks whether there are holes in the fallocated
range or !PREALLOC extents past EOF and, if so, creates qgroup
reservations for them. Unfortunately, the main condition of the loop is
buried right at the end of its body rather than in the actual while
statement, which makes it non-obvious. Fix this by moving the condition
into the while statement where it belongs. No functional changes.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
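
The shape of this change can be sketched with a toy loop. The functions below are hypothetical and merely count fixed-size steps across a range; both assume start < end, which is what makes the two forms equivalent.

```c
#include <assert.h>

/* Before: the real exit condition hides at the bottom of the body. */
static int count_chunks_before(unsigned long start, unsigned long end,
			       unsigned long step)
{
	unsigned long cur = start;
	int chunks = 0;

	assert(step > 0 && start < end);
	while (1) {
		chunks++;       /* stand-in for the per-extent work */
		cur += step;
		if (cur >= end) /* the actual loop condition */
			break;
	}
	return chunks;
}

/* After: the condition is stated where a reader looks for it. */
static int count_chunks_after(unsigned long start, unsigned long end,
			      unsigned long step)
{
	unsigned long cur = start;
	int chunks = 0;

	assert(step > 0 && start < end);
	while (cur < end) {
		chunks++;
		cur += step;
	}
	return chunks;
}
```

With the precondition in place, both versions run the body the same number of times, so the rewrite only changes readability.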


Btrfs: remove rcu_barrier in btrfs_close_devices · 47dba171

Liu Bo authored Oct 10, 2017

It was introduced because btrfs used to do blkdev_put in a deferred
work. Now that btrfs does blkdev_put in place, this rcu_barrier can be
removed.

modprobe -r btrfs will do btrfs_cleanup_fs_uuids(), which cleans up
every %fs_devices on the list. But by the time we do
btrfs_close_devices(), we have replaced the devices on the list with
dummy ones which only have the same name and uuid, so modprobe -r btrfs
will free those instead of the ones we were using, and this change won't
cause a problem for it.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ copied 2nd paragraph from mailinglist discussion ]
Signed-off-by: David Sterba <dsterba@suse.com>


btrfs: Move checks from btrfs_wq_run_delayed_node to btrfs_balance_delayed_items · 8577787f

Nikolay Borisov authored Oct 23, 2017

btrfs_balance_delayed_items is the sole caller of
btrfs_wq_run_delayed_node and already includes one of the checks whether
the delayed inodes should be run. On the other hand
btrfs_wq_run_delayed_node duplicates that check and performs an
additional one for wq congestion.

Let's remove the duplicate check and move the congestion one into
btrfs_balance_delayed_items, leaving btrfs_wq_run_delayed_node to only
care about setting up the wq run. No functional changes.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>


btrfs: Make btrfs_async_run_delayed_root use a loop rather than multiple labels · 617c54a8

Nikolay Borisov authored Oct 23, 2017

Currently btrfs_async_run_delayed_root's implementation uses 3 goto
labels to mimic the functionality of a simple do {} while loop. Refactor
the function to use a do {} while construct, making intention clear and
code easier to follow. No functional changes.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
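
The refactoring can be sketched on a toy drain loop. Both functions below are hypothetical and only mirror the control flow being described: process items until either the queue is empty or a limit is reached.

```c
#include <assert.h>

/* Label-based version: the loop structure has to be inferred from gotos. */
static int drain_with_goto(int pending, int limit)
{
	int done = 0;

	assert(pending >= 0 && limit > 0);
again:
	if (pending == 0)
		goto out;
	pending--; /* stand-in for processing one delayed node */
	done++;
	if (done < limit)
		goto again;
out:
	return done;
}

/* Loop version: same behaviour, with the repeat condition spelled out. */
static int drain_with_loop(int pending, int limit)
{
	int done = 0;

	assert(pending >= 0 && limit > 0);
	do {
		if (pending == 0)
			break;
		pending--;
		done++;
	} while (done < limit);
	return done;
}
```

The early exit becomes a break and the backward jump becomes the while condition, so no label is needed.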


btrfs: Remove redundant mirror_num arg · d3fac6ba

Nikolay Borisov authored Oct 24, 2017

The following callpath is always invoked with mirror_num set to 0, so
let's remove it as an argument and directly pass 0 to __do_readpage. No
functional change.

extent_readpages
  __extent_readpages
    __do_contiguous_readpages
      __do_readpage
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>


btrfs: Remove unused function · ac244ef1

Nikolay Borisov authored Oct 20, 2017

Its sole callsite was removed in a previous patch so just nuke it for good.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>


btrfs: Remove redundant memory barrier in dev stats · 4660c49f

Nikolay Borisov authored Oct 20, 2017

As per the atomic_t.txt documentation:
 - RMW operations that have a return value are fully ordered;

atomic_xchg is one such operation, so it already includes everything it
needs w.r.t. memory ordering. Add a comment to be more explicit about
that.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>


btrfs: Fix memory barriers usage with device stats counters · 9deae968

Nikolay Borisov authored Oct 24, 2017

Commit addc3fa7 ("Btrfs: Fix the problem that the dirty flag of dev
stats is cleared") reworked the way device stats changes are tracked. A
new atomic dev_stats_ccnt counter was introduced which is incremented
every time any of the device stats counters are changed. This serves as
a flag whether there are any pending stats changes. However, this patch
only partially implemented the correct memory barriers necessary:

- It only ordered the stores to the counters but not the reads, e.g. in
  btrfs_run_dev_stats
- It completely omitted any comments documenting the intended design and
  how the memory barriers pair with each other

This patch provides the necessary comments and adds the missing
smp_rmb in btrfs_run_dev_stats. Furthermore, since dev_stats_ccnt is only
a snapshot at best, there was no point in reading the counter twice:
once in btrfs_dev_stats_dirty and then again when assigning stats_cnt.
Just collapse both reads into one.

Fixes: addc3fa7 ("Btrfs: Fix the problem that the dirty flag of dev stats is cleared")
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
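
The pairing described above can be approximated in portable C11. This is a sketch under loud assumptions: release/acquire atomics are swapped in for the kernel's smp_wmb/smp_rmb, and the struct and function names are hypothetical rather than the btrfs ones. It also shows the counter being read once and that snapshot being reused.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_STATS 2

/* Hypothetical per-device statistics with a "something changed" counter. */
struct dev_stats {
	atomic_int values[NR_STATS];
	atomic_int change_cnt;
};

static void stats_init(struct dev_stats *s)
{
	for (int i = 0; i < NR_STATS; i++)
		atomic_init(&s->values[i], 0);
	atomic_init(&s->change_cnt, 0);
}

/* Writer: update the stat first, then publish the change with release. */
static void stat_inc(struct dev_stats *s, int idx)
{
	assert(idx >= 0 && idx < NR_STATS);
	atomic_fetch_add_explicit(&s->values[idx], 1, memory_order_relaxed);
	atomic_fetch_add_explicit(&s->change_cnt, 1, memory_order_release);
}

/*
 * Reader: a single acquire load of the counter pairs with the writer's
 * release, so every stat update counted in the snapshot is visible below.
 */
static bool stats_flush(struct dev_stats *s, int out[NR_STATS])
{
	int snap = atomic_load_explicit(&s->change_cnt, memory_order_acquire);

	if (snap == 0)
		return false; /* nothing pending */

	for (int i = 0; i < NR_STATS; i++)
		out[i] = atomic_load_explicit(&s->values[i],
					      memory_order_relaxed);

	/* Subtract only the snapshot so concurrent bumps stay pending. */
	atomic_fetch_sub_explicit(&s->change_cnt, snap, memory_order_relaxed);
	return true;
}
```

Subtracting the snapshot rather than resetting the counter to zero means an update that races with the flush still leaves the counter non-zero for the next pass.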


btrfs: clean up btrfs_dev_stat_inc usage · 1cb34c8e

Anand Jain authored Oct 21, 2017

btrfs_end_bio() calls btrfs_dev_stat_inc() and then
btrfs_dev_stat_print_on_error() separately; use
btrfs_dev_stat_inc_and_print() directly instead.

As of now there isn't any bio in btrfs that is both a non-empty write and
has the REQ_PREFLUSH flag set. So in practice the condition

   if (bio->bi_opf & REQ_PREFLUSH)

is never true in btrfs_end_bio(), and so calling
btrfs_dev_stat_inc_and_print() separately, once for write and once for
flush, won't produce any redundant error log.

This consolidation will help to add critical device error handling in
btrfs_dev_stat_inc_and_print(), which can be renamed as needed.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>


Btrfs: free btrfs_device in place · 9f5316c1

Liu Bo authored Oct 23, 2017

It's pointless to defer it to a kthread helper as we're not under a
special context.

For reference, commit 1f78160c ("Btrfs: using rcu lock in the reader
side of devices list") introduced RCU freeing for device structures.

Originally the blkdev_put was called from free_device and rcu_barrier had
to be called. This is no longer required: the bdev and our device
structures are now freed separately.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ enhance changelog ]
Signed-off-by: David Sterba <dsterba@suse.com>


Btrfs: remove redundant btrfs_balance_delayed_items · 1805f2ca

Liu Bo authored Oct 20, 2017

In functions like btrfs_create(), we run both
btrfs_balance_delayed_items() and btrfs_btree_balance_dirty() after
the operation, but btrfs_btree_balance_dirty() is surely going to run
btrfs_balance_delayed_items().

This keeps only btrfs_btree_balance_dirty().
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>


21 Jan, 2018 2 commits

Linux 4.15-rc9 · 0c5b9b5d
Linus Torvalds authored Jan 21, 2018


Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 55151142

Linus Torvalds authored Jan 21, 2018

Pull x86 pti fixes from Thomas Gleixner:
 "A small set of fixes for the meltdown/spectre mitigations:

   - Make kprobes aware of retpolines to prevent probes in the retpoline
     thunks.

   - Make the machine check exception speculation protected. MCE used to
     issue an indirect call directly from the ASM entry code. Convert
     that to a direct call into a C-function and issue the indirect call
     from there so the compiler can add the retpoline protection,

   - Make the vmexit_fill_RSB() assembly less stupid

   - Fix a typo in the PTI documentation"

* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/retpoline: Optimize inline assembler for vmexit_fill_RSB
  x86/pti: Document fix wrong index
  kprobes/x86: Disable optimizing on the function jumps to indirect thunk
  kprobes/x86: Blacklist indirect thunk functions for kprobes
  retpoline: Introduce start/end markers of indirect thunk
  x86/mce: Make machine check speculation protected
