Commits · 9345457f4a539a40056431aeb6f068750857472f · Kirill Smelkov / linux

27 Jun, 2012 2 commits

Btrfs: support root level changes in __resolve_indirect_ref · 9345457f

Jan Schmidt authored Jun 27, 2012

With the tree mod log, we can have a tree that's two levels high, but
btrfs_search_old_slot may still return a path with the tree root at level
one instead. __resolve_indirect_ref must account for this and accept parents
at a lower level than expected.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>

9345457f

Btrfs: avoid waiting for delayed refs when we must not · 8ca78f3e

Jan Schmidt authored Jun 27, 2012

We track two conditions to decide if we should sleep while waiting for more
delayed refs, the number of delayed refs (num_refs) and the first entry in
the list of blockers (first_seq).

When we suspect staleness, we save num_refs and do one more cycle. If
nothing changes, we then save first_seq for later comparison and do
wait_event. We ought to save first_seq the very same moment we're saving
num_refs. Otherwise we cannot be sure that nothing has changed and we might
start waiting when we shouldn't, which could lead to starvation.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>

8ca78f3e
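
A minimal sketch of the idea behind 8ca78f3e, not the kernel code: both conditions are recorded in the same step, so a later comparison can never pair an old num_refs with a newer first_seq. Only the names num_refs and first_seq come from the commit message; `wait_snapshot` and `may_sleep` are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of the two conditions named in the commit message. */
struct wait_snapshot {
	bool valid;              /* false until the first snapshot is taken */
	unsigned long num_refs;  /* number of delayed refs at snapshot time */
	uint64_t first_seq;      /* first blocker sequence at snapshot time */
};

/*
 * Return true only if a snapshot exists and neither value has changed
 * since it was taken. Otherwise record both values together and tell the
 * caller to do one more cycle instead of sleeping.
 */
static bool may_sleep(struct wait_snapshot *snap,
		      unsigned long num_refs, uint64_t first_seq)
{
	if (snap->valid && snap->num_refs == num_refs &&
	    snap->first_seq == first_seq)
		return true;

	snap->valid = true;
	snap->num_refs = num_refs;
	snap->first_seq = first_seq;
	return false;
}
```

Saving the two values in separate steps would leave a window in which first_seq changes unnoticed, which is the starvation risk the commit describes.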

21 Jun, 2012 4 commits

Btrfs: delay iput with async extents · cb77fcd8

Josef Bacik authored Jun 15, 2012

There is some concern that these iput()'s could be the final iputs and could
induce lockups on people waiting on writeback. This would happen in the
rare case that we don't create ordered extents because of an error, but it
is theoretically possible and we already have a mechanism to deal with this
so just make them delayed iputs to negate any worry.
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

cb77fcd8

Btrfs: add a missing spin_lock · e18fca73

Josef Bacik authored Jun 18, 2012

When fixing up the locking in the delayed ref destruction work I accidentally
broke the locking myself ;(.  Add back a spin_lock that should be there and
we are now all set.  Thanks,
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

e18fca73

Btrfs: don't assume to be on the correct extent in add_all_parents · 69bca40d

Alexander Block authored Jun 19, 2012

add_all_parents assumed that path is already at a correct extent data
item, which may not be true for data extents that were partly rewritten
and split.

We need to check whether we're on a matching extent for every item, not
only for the ones after the first. The loop is changed to do this now.

This patch also fixes a bug introduced with commit 3b127fd8 "Btrfs:
remove obsolete btrfs_next_leaf call from __resolve_indirect_ref".
The removal of next_leaf sometimes resulted in slot==nritems in the case
described above, and thus in invalid values (e.g. wanted_objectid) in
add_all_parents, leading to missed backrefs or even crashes.
Signed-off-by: Alexander Block <ablock84@googlemail.com>
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

69bca40d

Btrfs: introduce btrfs_next_old_item · 1c8f52a5

Alexander Block authored Jun 19, 2012

We introduce btrfs_next_old_item that uses btrfs_next_old_leaf instead
of btrfs_next_leaf.

btrfs_next_item is also changed to simply call btrfs_next_old_item with
time_seq being 0.
Signed-off-by: Alexander Block <ablock84@googlemail.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

1c8f52a5

16 Jun, 2012 2 commits

Btrfs: cast devid to unsigned long long for printk %llu · a8c4a33b

Chris Mason authored Jun 15, 2012

Avoid a warning on 32-bit machines.
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

a8c4a33b
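
A user-space sketch of the pattern named in the title of a8c4a33b, assuming a 64-bit id stored in a fixed-width type. The width-specific typedef may map to different underlying types on different platforms, so casting to unsigned long long makes the argument match %llu everywhere. `format_devid` is a hypothetical helper, and snprintf stands in for printk.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Format a 64-bit device id. The explicit cast guarantees the argument
 * type matches the %llu conversion regardless of how uint64_t is defined.
 */
static int format_devid(char *buf, size_t len, uint64_t devid)
{
	return snprintf(buf, len, "devid %llu", (unsigned long long)devid);
}
```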

Btrfs: init old_generation in get_old_root · 4325edd0

Chris Mason authored Jun 15, 2012

gcc was giving an uninit variable warning here.  Strictly
speaking we don't need to init it, but this will make things
much less error prone.
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

4325edd0

15 Jun, 2012 19 commits

Btrfs: update MAINTAINERS info for BTRFS FILE SYSTEM · 9c106405

Liu Bo authored Jun 14, 2012

Update to the latest btrfs maintainer email and git repo.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

9c106405

Btrfs: destroy the items of the delayed inodes in error handling routine · 67cde344

Miao Xie authored Jun 14, 2012

The items of the delayed inodes were never freed; this patch fixes that.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

67cde344

Btrfs: make sure that we've made everything in pinned tree clean · ed0eaa14

Liu Bo authored Jun 14, 2012

Since we have two trees for recording pinned extents, we need to go through
both of them to make sure that we've cleaned up everything.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

ed0eaa14

Btrfs: avoid memory leak of extent state in error handling routine · 6e841e32

Liu Bo authored Jun 14, 2012

We've forgotten to clear extent states in the pinned tree, which results in
a space counter mismatch and a memory leak:

WARNING: at fs/btrfs/extent-tree.c:7537 btrfs_free_block_groups+0x1f3/0x2e0 [btrfs]()
...
space_info 2 has 8380416 free, is not full
space_info total=12582912, used=4096, pinned=4096, reserved=0, may_use=0, readonly=4194304
btrfs state leak: start 29364224 end 29376511 state 1 in tree ffff880075f20090 refs 1
...
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

6e841e32

Btrfs: do not resize a seeding device · 4e42ae1b

Liu Bo authored Jun 14, 2012

Seeding devices are not supposed to change any more.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

4e42ae1b

Btrfs: fix missing inherited flag in rename · bc178237

Liu Bo authored Jun 14, 2012

When we move a file into a directory with the compression flag, we need to
inherit BTRFS_INODE_COMPRESS and clear BTRFS_INODE_NOCOMPRESS as well.
But if we move a file into a directory without the compression flag, we need
to clear both of them.

This is how our setflags handles the compression flag, so keep
the same behaviour here.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

bc178237
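
The rule described in bc178237 can be sketched as a small flag helper. This is an illustration under assumptions, not the kernel function: the `EX_INODE_*` bit values are made up and do not reproduce the real BTRFS_INODE_* definitions, and `inherit_compress_flags` is a hypothetical name.

```c
#include <stdint.h>

/* Illustrative bit values, not the on-disk BTRFS_INODE_* definitions. */
#define EX_INODE_NOCOMPRESS (1u << 0)
#define EX_INODE_COMPRESS   (1u << 1)

/*
 * Compute the flags of a file after it is moved into a directory:
 * - directory has the compression flag: set COMPRESS, clear NOCOMPRESS
 * - directory lacks it: clear both
 * All other bits of inode_flags are preserved.
 */
static uint32_t inherit_compress_flags(uint32_t inode_flags, uint32_t dir_flags)
{
	inode_flags &= ~(EX_INODE_COMPRESS | EX_INODE_NOCOMPRESS);
	if (dir_flags & EX_INODE_COMPRESS)
		inode_flags |= EX_INODE_COMPRESS;
	return inode_flags;
}
```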

Merge branch 'for-chris' of git://git.jan-o-sch.net/btrfs-unstable into for-linus · acbcabd2
Chris Mason authored Jun 14, 2012

acbcabd2

Btrfs: fix incompat flags setting · 69e380d1

Li Zefan authored Jun 11, 2012

It's a bug, but it happens to work, as BTRFS_COMPRESS_LZO == 2, which
has only one bit set.
Signed-off-by: Li Zefan <lizefan@huawei.com>

69e380d1
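
The commit message for 69e380d1 does not show the diff. Assuming the bug was a bitwise test against the compression type where an equality test was meant, the sketch below shows why it "happens to work" while only the single-bit value 2 matters, and how it would misfire for a later value. The enum and both helpers are hypothetical; only LZO == 2 comes from the message.

```c
#include <stdbool.h>

/* LZO == 2 is taken from the commit message; the other values are assumed. */
enum ex_compress_type {
	EX_COMPRESS_NONE = 0,
	EX_COMPRESS_ZLIB = 1,
	EX_COMPRESS_LZO  = 2,
	EX_COMPRESS_NEXT = 3,  /* hypothetical future type */
};

/* Assumed buggy form: treats an enumeration value as a bit mask. */
static bool is_lzo_bitwise(enum ex_compress_type t)
{
	return (t & EX_COMPRESS_LZO) != 0;
}

/* Corrected form: compares the enumeration value directly. */
static bool is_lzo_equal(enum ex_compress_type t)
{
	return t == EX_COMPRESS_LZO;
}
```

With the values above, the bitwise form is right for NONE, ZLIB and LZO, but would report the hypothetical value 3 as LZO.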

Btrfs: fix defrag regression · 6c282eb4

Li Zefan authored Jun 11, 2012

If a file has 3 small extents:

| ext1 | ext2 | ext3 |

Running "btrfs fi defrag" will only defrag the last two extents if those
extent mappings haven't been read into memory from disk.

This bug was introduced by commit 17ce6ef8
("Btrfs: add a check to decide if we should defrag the range")

The cause is that the commit looked into the previous and next extents using
lookup_extent_mapping() only.

While at it, remove the code that checks the previous extent, since
it's sufficient to check the next extent.
Signed-off-by: Li Zefan <lizefan@huawei.com>

6c282eb4

Btrfs: call filemap_fdatawrite twice for compression · 7ddf5a42

Josef Bacik authored Jun 08, 2012

I removed this in an earlier commit and I was wrong. Because compression
can return from filemap_fdatawrite() without having actually set any of its
pages as writeback, filemap_fdatawait() can end up doing essentially nothing,
and then we won't find any ordered extents because they may not have been
created yet. Not only does this make fsync() completely useless, it also
breaks truncate on a non-page-aligned offset, since we zero out the end,
wait on ordered extents, and then drop the cache. If we drop the cache
before the io completes, then when we try to unpin the extent we just wrote
we won't find it and everything goes sideways. So fix this by putting it
back, and add a giant comment there to keep me from trying to remove it in
the future. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

7ddf5a42

Btrfs: keep inode pinned when compressing writes · 8180ef88

Josef Bacik authored Jun 08, 2012

A user reported lots of problems using compression on the new code and it
turns out part of the problem was that igrab() was failing when we added a
new ordered extent. This is because when writing out an inode under
compression we immediately return without actually doing anything to the
pages, and then in another thread at some point down the line actually do
the ordered dance. The problem is between the point that we start writeback
and we actually add the ordered extent we could be trying to reclaim the
inode, which makes igrab() return NULL. So we need to do an igrab() when we
create the async extent and then drop it when we are done with it. This
makes sure we stay pinned in memory until the ordered extent can get a
reference on it and we are good to go. With this patch we no longer panic
in btrfs_finish_ordered_io(). Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

8180ef88

Btrfs: implement ->show_devname · 9c5085c1

Josef Bacik authored Jun 05, 2012

Because btrfs can remove the device that was mounted we need to have a
->show_devname so that in this case we can print out some other device in
the file system to /proc/mounts.  So if there are multiple devices in a btrfs
file system we will just print the device with the lowest devid that we can
find.  This will make everything consistent and deal with device removal
properly.  The drawback is if you mount with a device that is higher than
the lowest devid it won't show up as the mounted device in /proc/mounts,
but this is a small price to pay. This was inspired by Miao Xie's patch.
Thanks,
Reviewed-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <josef@redhat.com>

9c5085c1

Btrfs: use rcu to protect device->name · 606686ee

Josef Bacik authored Jun 04, 2012

Al pointed out that we can just toss out the old name on a device and add a
new one arbitrarily, so anybody who uses device->name in printk could
possibly use freed memory. Instead of adding locking around all of this he
suggested doing it with RCU, so I've introduced a struct rcu_string that
does just that and have gone through and protected all accesses to
device->name that aren't under the uuid_mutex with rcu_read_lock(). This
protects us and I will use it for dealing with removing the device that we
used to mount the file system in a later patch. Thanks,
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Josef Bacik <josef@redhat.com>

606686ee

Btrfs: unlock everything properly in the error case for nocow · 17ca04af

Josef Bacik authored May 31, 2012

I was getting hung on umount when a transaction was aborted because a range
of one of the free space inodes was still locked. This is because the nocow
stuff doesn't unlock anything on error. This fixed the problem and I
verified that is what was happening. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

17ca04af

Btrfs: fix btrfs_destroy_marked_extents · ee670f0a

Josef Bacik authored May 31, 2012

We're forcing the eb's to have their ref count set to 1 so invalidatepage
works, but this breaks lots of things, for example root nodes, and is just
plain wrong; we don't need to just evict all of this stuff. Also drop the
invalidatepage altogether and add a page_cache_release(). With this patch
we no longer hang when trying to access the root nodes after an aborted
transaction and we no longer leak memory. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

ee670f0a

Btrfs: abort the transaction if the commit fails · 7b8b92af

Josef Bacik authored May 31, 2012

If a transaction commit fails we don't abort it so we don't set an error on
the file system. This patch fixes that by actually calling the abort stuff
and then adding a check for a fs error in the transaction start stuff to
make sure it is caught properly. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

7b8b92af

Btrfs: wake up transaction waiters when aborting a transaction · d7096fc3

Josef Bacik authored May 31, 2012

I was getting lots of hung tasks and a NULL pointer dereference because we
are not cleaning up the transaction properly when it aborts. First we need
to reset the running_transaction to NULL so we don't get a bad dereference
for any start_transaction callers after this. Also we cannot rely on
waitqueue_active() since it's just a list_empty(), so just call wake_up()
directly since that will do the barrier for us and such. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

d7096fc3

Btrfs: fix locking in btrfs_destroy_delayed_refs · b939d1ab

Josef Bacik authored May 31, 2012

The transaction abort stuff was throwing warnings from the list debugging
code because we do a list_del_init outside of the delayed_refs spin lock.
The delayed refs locking makes baby Jesus cry so it's not hard to get wrong,
but we need to take the ref head mutex to make sure it's not being processed
currently, and so if it is we need to drop the spin lock and then take and
drop the mutex and do the search again. If we can take the mutex then we
can safely remove the head from the list and carry on. Now when the
transaction aborts I don't get the list debugging warnings. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

b939d1ab

Btrfs: pass locked_page into extent_clear_unlock_delalloc if theres an error · beb42dd7

Josef Bacik authored May 30, 2012

While doing my enospc work I got a transaction abort that resulted in a
panic when we tried to unlock_page() an already unlocked page.  This is
because we aren't calling extent_clear_unlock_delalloc with the locked page
so it was unlocking all the pages in the range.  This is wrong since
__extent_writepage expects to have the page locked still unless we return
*page_started as 1.  This should keep us from panicking.  Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

beb42dd7

14 Jun, 2012 5 commits

Btrfs: fix race in tree mod log addition · 3310c36e

Jan Schmidt authored Jun 11, 2012

When adding to the tree modification log, we grab two locks at different
stages. We must not drop the outer lock until we're done with the section
protected by the inner lock. This moves the unlock call for the outer lock
to the appropriate position.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>

3310c36e

Btrfs: add btrfs_next_old_leaf · 3d7806ec

Jan Schmidt authored Jun 11, 2012

To make sense of the tree mod log, the backref walker not only needs
btrfs_search_old_slot, but it also called btrfs_next_leaf, which in turn was
calling btrfs_search_slot. This obviously didn't give the correct result.

This commit adds btrfs_next_old_leaf, a drop-in replacement for
btrfs_next_leaf with a time_seq parameter. If it is zero, it behaves exactly
like btrfs_next_leaf. If it is non-zero, it will use btrfs_search_old_slot
with this time_seq parameter.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>

3d7806ec
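
The dispatch described in 3d7806ec can be modelled with stand-in functions. This is a toy sketch, not the kernel implementation: every `ex_*` name is hypothetical, the stubs only count which search routine would run, and writing `ex_next_leaf` as a wrapper that passes zero is an assumption drawn from the "drop-in replacement" wording.

```c
#include <stdint.h>

/* Counts which search routine each call would have used. */
struct ex_stats {
	int current_searches;  /* stand-in for btrfs_search_slot */
	int old_searches;      /* stand-in for btrfs_search_old_slot */
};

static int ex_search_slot(struct ex_stats *st)
{
	st->current_searches++;
	return 0;
}

static int ex_search_old_slot(struct ex_stats *st, uint64_t time_seq)
{
	(void)time_seq;  /* a real search would rewind to this sequence */
	st->old_searches++;
	return 0;
}

/* Zero time_seq: search the current tree. Non-zero: search the old view. */
static int ex_next_old_leaf(struct ex_stats *st, uint64_t time_seq)
{
	if (time_seq)
		return ex_search_old_slot(st, time_seq);
	return ex_search_slot(st);
}

/* Assumed wrapper: the plain variant is the time_seq == 0 case. */
static int ex_next_leaf(struct ex_stats *st)
{
	return ex_next_old_leaf(st, 0);
}
```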

Btrfs: fix return value for __tree_mod_log_oldest_root · a95236d9

Jan Schmidt authored Jun 05, 2012

In __tree_mod_log_oldest_root() we must return the found operation even if
it's not a ROOT_REPLACE operation. Otherwise, the caller assumes that there
are no operations to be rewound and returns immediately.

The code in the caller is modified to improve readability.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>

a95236d9

Btrfs: use btrfs_read_lock_root_node in get_old_root · 8ba97a15

Jan Schmidt authored Jun 04, 2012

get_old_root could race with root node updates because we weren't locking
the node early enough. Use btrfs_read_lock_root_node to grab the root locked
in the very beginning and release the lock as soon as possible (just like
btrfs_search_slot does).
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>

8ba97a15

Btrfs: remove obsolete btrfs_next_leaf call from __resolve_indirect_ref · f617e2fd

Jan Schmidt authored Jun 14, 2012

When resolving indirect refs, we used to call btrfs_next_leaf in case we
didn't find an exact match. While we should find exact matches most of the
time, in case we don't, we must continue searching. Treating those matches
differently depending on the level we're searching doesn't make sense.

Even worse, we might end up searching for a key larger than the largest, in
which case there is no next_leaf and subsequent jobs would fail. This commit
drops the bogus lines.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>

f617e2fd

04 Jun, 2012 1 commit

Btrfs: remove call to btrfs_header_nritems with no effect · 4d5a0565

Jan Schmidt authored Apr 30, 2012

This is a leftover from cleanup patch 559af821. Before the cleanup,
btrfs_header_nritems was called inside an if condition. As it has no side
effects we need to preserve here, it should simply be dropped.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>

4d5a0565

31 May, 2012 6 commits

Merge branch 'for-chris' of git://git.jan-o-sch.net/btrfs-unstable into for-linus · 1e20932a

Chris Mason authored May 31, 2012

Conflicts:
	fs/btrfs/ulist.h
Signed-off-by: Chris Mason <chris.mason@oracle.com>

1e20932a

Btrfs: fix tree mod log rewinded level and rewinding of moved keys · c3193108

Jan Schmidt authored May 31, 2012

When we rewind REMOVE_WHILE_FREEING operations, there's code that allocates
a fresh buffer instead of cloning the old one. Setting that buffer's level
correctly was missing in this case.

When rewinding a MOVE_KEYS operation, btrfs_node_key_ptr_offset(slot) was
missing for memmove_extent_buffer()'s arguments.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>

c3193108
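
The second fix in c3193108 concerns converting slot numbers into byte offsets before a memmove. The sketch below illustrates that conversion on a plain byte buffer; the layout (a 16-byte header followed by 16-byte entries) and all `ex_*` names are made up for illustration and do not reproduce the btrfs node format.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EX_HEADER_SIZE 16  /* assumed header size, for illustration only */

struct ex_key_ptr {
	uint64_t key;
	uint64_t blockptr;
};

/* Byte offset of a slot inside the buffer: header plus slot * entry size. */
static size_t ex_key_ptr_offset(int slot)
{
	return EX_HEADER_SIZE + (size_t)slot * sizeof(struct ex_key_ptr);
}

/*
 * Move nr entries from src_slot to dst_slot. Passing the raw slot numbers
 * as offsets would move bytes inside the header instead of whole entries.
 */
static void ex_move_keys(unsigned char *buf, int dst_slot, int src_slot, int nr)
{
	memmove(buf + ex_key_ptr_offset(dst_slot),
		buf + ex_key_ptr_offset(src_slot),
		(size_t)nr * sizeof(struct ex_key_ptr));
}
```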

Btrfs: fix tree mod log del_ptr · f395694c

Jan Schmidt authored May 31, 2012

Logging for del_ptr when we're not deleting the last pointer was wrong. This
fixes both the duplicate log entries and the log sequence.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>

f395694c

Btrfs: add tree_mod_dont_log helper · e9b7fd4d

Jan Schmidt authored May 31, 2012

Replace duplicated code with a small inline helper function.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>

e9b7fd4d

Btrfs: add missing spin_lock for insertion into tree mod log · 926dd8a6

Jan Schmidt authored May 31, 2012

tree_mod_alloc calls __get_tree_mod_seq and must acquire a spinlock before
doing so.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>

926dd8a6

Btrfs: add inodes before dropping the extent lock in find_all_leafs · 3301958b

Jan Schmidt authored May 30, 2012

We must build up the inode list with the extent lock held after following
indirect refs.

This also requires an extension to ulists, which allows modifying the stored
aux value in case a key already exists in the list.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>

3301958b
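
A toy model of the ulist extension mentioned in 3301958b: adding a value that is already present hands back a pointer to its stored aux so the caller can change it. The fixed-size array, the return codes and all `ex_*` names are assumptions made for this sketch and do not describe the kernel's ulist API.

```c
#include <stddef.h>
#include <stdint.h>

#define EX_ULIST_MAX 16  /* arbitrary capacity for the sketch */

struct ex_ulist_node {
	uint64_t val;
	uint64_t aux;
};

struct ex_ulist {
	struct ex_ulist_node nodes[EX_ULIST_MAX];
	size_t nnodes;
};

/*
 * Add val to the set. Returns 1 if it was added, 0 if it was already
 * present, -1 if the list is full. When val is already present and
 * old_aux is non-NULL, *old_aux points at the stored aux value so the
 * caller can update it in place.
 */
static int ex_ulist_add_merge(struct ex_ulist *ul, uint64_t val,
			      uint64_t aux, uint64_t **old_aux)
{
	for (size_t i = 0; i < ul->nnodes; i++) {
		if (ul->nodes[i].val == val) {
			if (old_aux)
				*old_aux = &ul->nodes[i].aux;
			return 0;
		}
	}
	if (ul->nnodes == EX_ULIST_MAX)
		return -1;
	ul->nodes[ul->nnodes].val = val;
	ul->nodes[ul->nnodes].aux = aux;
	ul->nnodes++;
	return 1;
}
```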

30 May, 2012 1 commit

Btrfs: use delayed ref sequence numbers for all fs-tree updates · 95a06077

Jan Schmidt authored May 29, 2012

The sequence number for delayed refs is needed to postpone certain delayed
refs for a very short period while walking backrefs. Before the tree
modification log, we thought we'd only have to hold back those references
that don't have a counter operation.

Now that we have the tree mod log, we're rewinding fs tree blocks to a
defined consistent state. We cannot know in advance for which tree block
we'll be doing rewind operations later. Therefore, we must postpone all the
delayed refs for fs-tree blocks, even those having a counter operation.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>

95a06077