Commits · 83bfccb5c085658b0ad1450a6fc13b0bb5440970 · Kirill Smelkov / linux

22 Jan, 2013 4 commits

Merge branch 'mutex-ops@next-for-chris' of git://github.com/idryomov/btrfs-unstable into linus · 83bfccb5

Chris Mason authored Jan 21, 2013

83bfccb5

Merge branch 'for-chris' of... · daf2c089

Chris Mason authored Jan 21, 2013

Merge branch 'for-chris' of git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-next into linus

daf2c089

Btrfs: prevent qgroup destroy when there are still relations · 2cf68703

Arne Jansen authored Jan 17, 2013

Currently you can just destroy a qgroup even though it is in use by other qgroups
or has qgroups assigned to it. This patch prevents destruction of qgroups unless
they are completely unused. Otherwise destroy will return EBUSY.
Reported-by: Eric Hopper <hopper@omnifarious.org>
Signed-off-by: Arne Jansen <sensille@gmx.net>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

2cf68703
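
The rule described in this commit can be sketched in plain C. This is only an illustration, not the kernel code: `struct qgroup_sketch`, its two counters, and `qgroup_can_destroy()` are hypothetical names, and the real qgroup structures are not shown in the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a qgroup; field names are illustrative only. */
struct qgroup_sketch {
    size_t nr_parents;  /* qgroups this one is assigned to */
    size_t nr_members;  /* qgroups assigned to this one */
};

/* Returns 0 if the qgroup may be destroyed, -EBUSY if any relation remains. */
int qgroup_can_destroy(const struct qgroup_sketch *qg)
{
    assert(qg != NULL);
    if (qg->nr_parents != 0 || qg->nr_members != 0)
        return -EBUSY;
    return 0;
}
```

A caller would check the return value before removing the qgroup and pass -EBUSY back to userspace otherwise.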

Btrfs: ignore orphan qgroup relations · ff24858c

Arne Jansen authored Jan 17, 2013

If a qgroup that still has assignments is deleted by the user, the corresponding
relations are left in the tree. This leads to an unmountable filesystem.
With this patch, those relations are simply ignored.
Reported-by: Eric Hopper <hopper@omnifarious.org>
Signed-off-by: Arne Jansen <sensille@gmx.net>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

ff24858c

20 Jan, 2013 5 commits

Btrfs: reorder locks and sanity checks in btrfs_ioctl_defrag · 25122d15

Ilya Dryomov authored Jan 20, 2013

Operation-specific check (whether subvol is readonly or not) should go
after the mutual exclusiveness check.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

25122d15

Btrfs: fix unlock order in btrfs_ioctl_rm_dev · 4ac20c70

Ilya Dryomov authored Jan 20, 2013

Fix unlock order in btrfs_ioctl_rm_dev().
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

4ac20c70

Btrfs: fix unlock order in btrfs_ioctl_resize · 18f39c41

Ilya Dryomov authored Jan 20, 2013

Fix unlock order in btrfs_ioctl_resize().
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

18f39c41

Btrfs: fix "mutually exclusive op is running" error code · 2c0c9da0

Ilya Dryomov authored Jan 20, 2013

The error code that is returned in response to starting a mutually
exclusive operation when there is one already running got silently
changed from EINVAL to EINPROGRESS by 5ac00add. Returning EINPROGRESS
to, say, add_dev, when rm_dev is running is misleading. Furthermore,
the operation itself may want to use EINPROGRESS for other purposes.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

2c0c9da0
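
A minimal userspace sketch of the behaviour described here, assuming the "one exclusive operation at a time" state is a single atomic flag. The flag and function names are hypothetical, and C11 `atomic_exchange` stands in for whatever primitive the kernel code uses.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical flag: nonzero while an exclusive volume operation runs. */
static atomic_int exclusive_op_running;

/* Try to claim the exclusive-operation slot; never blocks. */
int exclusive_op_begin(void)
{
    /* atomic_exchange returns the previous value; nonzero means it is taken. */
    if (atomic_exchange(&exclusive_op_running, 1) != 0)
        return -EINVAL;   /* not -EINPROGRESS, as the commit message argues */
    return 0;
}

/* Release the slot; only the holder may call this. */
void exclusive_op_end(void)
{
    assert(atomic_load(&exclusive_op_running) == 1);
    atomic_store(&exclusive_op_running, 0);
}
```

Keeping -EINPROGRESS out of this path leaves that code free for the operation itself to use.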

Btrfs: bring back balance pause/resume logic · ed0fb78f

Ilya Dryomov authored Jan 20, 2013

Balance pause/resume logic got broken by 5ac00add (which went into 3.8-rc1
as part of the dev-replace merge). The offending commit took a stab at making
mutually exclusive volume operations (add_dev, rm_dev, resize, balance,
replace_dev) not block behind volume_mutex if another such operation is
in progress and instead return an error right away. Balancing front-end
relied on the blocking behaviour, so the fix is ugly, but short of a
complete rework, it's the best we can do.
Reported-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

ed0fb78f

14 Jan, 2013 14 commits

btrfs: update timestamps on truncate() · 3972f260

Eric Sandeen authored Jan 12, 2013

truncate() vs. ftruncate() differ in the VFS; truncate()
doesn't set (ATTR_CTIME | ATTR_MTIME), and it's up to the
fs to do the timestamp updates if the size changes.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>

3972f260

btrfs: fix btrfs_cont_expand() freeing IS_ERR em · f2767956

Zach Brown authored Jan 08, 2013

btrfs_cont_expand() tries to free an IS_ERR em as it gets an error from
btrfs_get_extent() and breaks out of its loop.

An instance of -EEXIST was reported in the wild:

  https://bugzilla.redhat.com/show_bug.cgi?id=874407

I have no idea if that -EEXIST is surprising, or not.  Regardless, this
error handling should be cleaned up to handle other reasonable errors
(ENOMEM, EIO; whatever).

This seemed to be the only buggy freeing of the relatively rare IS_ERR
em so I opted to fix the caller rather than teach free_extent_map() to
use IS_ERR_OR_NULL().
Signed-off-by: Zach Brown <zab@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>

f2767956
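
The pattern behind this fix can be sketched outside the kernel. The helpers below are userspace imitations of the kernel's error-pointer idiom, and `get_extent_sketch()` and `expand_step_sketch()` are hypothetical stand-ins rather than the actual btrfs functions. The point is only that the caller returns on an error pointer instead of handing it to a free routine.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Userspace imitations of the error-pointer helpers. */
static inline void *err_ptr(long err) { return (void *)(intptr_t)err; }
static inline long ptr_err(const void *p) { return (long)(intptr_t)p; }
static inline int is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

struct em_sketch { int refs; };

/* Hypothetical lookup: returns an encoded errno when fail_with != 0. */
static struct em_sketch *get_extent_sketch(int fail_with)
{
    struct em_sketch *em;

    if (fail_with)
        return err_ptr(-fail_with);
    em = malloc(sizeof(*em));
    if (!em)
        return err_ptr(-ENOMEM);
    em->refs = 1;
    return em;
}

/* Caller-side handling: an error pointer is reported, never freed. */
int expand_step_sketch(int fail_with)
{
    struct em_sketch *em = get_extent_sketch(fail_with);

    if (is_err(em))
        return (int)ptr_err(em);   /* nothing was allocated, nothing to free */
    assert(em->refs == 1);
    free(em);
    return 0;
}
```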

Btrfs: fix a bug when llseek for delalloc bytes behind prealloc extents · f9e4fb53

Liu Bo authored Jan 07, 2013

xfstests case 285 complains.

This is because btrfs did not try to find unwritten delalloc
bytes (only dirty pages, not yet under writeback) behind prealloc
extents, so it ends up finding nothing when called with SEEK_DATA.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

f9e4fb53

Btrfs: fix off-by-one in lseek · 1214b53f

Liu Bo authored Jan 07, 2013

Lock end is inclusive.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

1214b53f
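
As a small illustration of what "inclusive" means for a byte range: the last byte of a range is `start + len - 1`, not `start + len`. The helper name is hypothetical, and the actual lseek code is not shown in the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Last byte covered by a range of len bytes starting at start. */
uint64_t range_end_inclusive(uint64_t start, uint64_t len)
{
    assert(len > 0);
    return start + len - 1;
}
```

Passing `start + len` where an inclusive end is expected would cover one byte too many.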

Btrfs: reset path lock state to zero · 3268a246

Liu Bo authored Dec 28, 2012

We forgot to reset the path lock state to zero after we unlock the path block,
and this can trigger the ASSERT check in the tree unlock API.
Reported-by: Slava Barinov <rayslava@gmail.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

3268a246

Btrfs: let allocation start from the right raid type · ac5c9300

Liu Bo authored Dec 27, 2012

This avoids empty loop iterations.

Say we have only one disk, so the metadata raid type defaults to DUP;
we then do not need to start from index=0 (RAID10) and go through two empty
loops to reach index=2 (DUP).
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

ac5c9300
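
The idea can be sketched as a mapping from the requested profile to the index where the allocation loop should start. Only index 0 = RAID10 and index 2 = DUP come from the commit message. The profile bits, the remaining index positions, and the function name are assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical profile bits; the real flag values are not given in the commit. */
enum profile_sketch {
    PROFILE_RAID10 = 1 << 0,
    PROFILE_RAID1  = 1 << 1,
    PROFILE_DUP    = 1 << 2,
    PROFILE_RAID0  = 1 << 3,
};

/* 0 = RAID10 and 2 = DUP follow the commit message; the other slots are assumed. */
enum { IDX_RAID10, IDX_RAID1, IDX_DUP, IDX_RAID0, IDX_SINGLE, NR_IDX };

/* Pick the loop start index that matches the requested profile. */
int start_index_for(unsigned int flags)
{
    int idx;

    if (flags & PROFILE_RAID10)
        idx = IDX_RAID10;
    else if (flags & PROFILE_RAID1)
        idx = IDX_RAID1;
    else if (flags & PROFILE_DUP)
        idx = IDX_DUP;
    else if (flags & PROFILE_RAID0)
        idx = IDX_RAID0;
    else
        idx = IDX_SINGLE;

    assert(idx >= 0 && idx < NR_IDX);
    return idx;
}
```

With a DUP request the loop would begin at index 2 directly instead of walking through indexes 0 and 1 first.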

Btrfs: add orphan before truncating pagecache · f3fe820c

Josef Bacik authored Jan 07, 2013

Running xfstests 83 in a loop would sometimes fail the fsck. This happens
because if we invalidate a page that already has an ordered extent setup for
it we will complete the ordered extent ourselves, assuming that the truncate
will clean everything up. The problem with this is there is plenty of time
for the truncate to fail after we've done this work. So to fix this we need
to add the orphan item first to make sure the cleanup gets done properly,
and then we can truncate the pagecache and all that stuff and be safe. This
fixes the btrfsck failures I was seeing while running 83 in a loop. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

f3fe820c

Btrfs: set flushing if we're limited flushing · 72bcd99d

Josef Bacik authored Dec 18, 2012

We still need to say we're flushing if we're limit flushing to keep somebody
from coming in and stealing our reservation.  Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

72bcd99d

Btrfs: fix missing write access release in btrfs_ioctl_resize() · 97547676

Miao Xie authored Dec 21, 2012

We forget to give up the write access after we find some device operation
is going on. Fix it.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

97547676

Btrfs: fix resize a readonly device · dba60f3f

Miao Xie authored Dec 21, 2012

We should not resize a read-only device. Fix it.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

dba60f3f

Btrfs: do not delete a subvolume which is in a R/O subvolume · 5c39da5b

Miao Xie authored Oct 22, 2012

Steps to reproduce:
 # mkfs.btrfs <disk>
 # mount <disk> <mnt>
 # btrfs sub create <mnt>/subv0
 # btrfs sub snap <mnt> <mnt>/subv0/snap0
 # change <mnt>/subv0 from R/W to R/O
 # btrfs sub del <mnt>/subv0/snap0

We deleted the snapshot successfully. I think we should not be able to delete
the snapshot since the parent subvolume is R/O.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>

5c39da5b

Btrfs: disable qgroup id 0 · d86e56cf

Miao Xie authored Nov 15, 2012

Qgroup id 0 is a special number, so we should not allow the id of a qgroup to
be set to 0. Fix it.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>

d86e56cf

btrfs: get the device in write mode when deleting it · cc975eb4

Lukas Czerner authored Dec 07, 2012

When we're deleting the device we should get it in write mode since
we're going to re-write the super block magic on that device. And it
should fail if the device is read-only.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>

cc975eb4

Btrfs: fix memory leak in name_cache_insert() · cfa7a9cc

Tsutomu Itoh authored Dec 17, 2012

We should free name_cache_entry before returning from the
error handling code.
Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>

cfa7a9cc

19 Dec, 2012 1 commit

Revert "Btrfs: reorder tree mod log operations in deleting a pointer" · 57ba86c0

Chris Mason authored Dec 18, 2012

This reverts commit 6a7a665d.

This bug was fixed differently in 3.6, so this commit
isn't needed.

Conflicts:
	fs/btrfs/ctree.c
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

57ba86c0

18 Dec, 2012 1 commit

Revert "Btrfs: MOD_LOG_KEY_REMOVE_WHILE_MOVING never change node's nritems" · 4c3e6969

Chris Mason authored Dec 18, 2012

This reverts commit 95c80bb1.

The bug addressed by this commit was fixed differently back in 3.6
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

4c3e6969

17 Dec, 2012 15 commits

Btrfs: fix a bug of per-file nocow · 213490b3

Liu Bo authored Sep 11, 2012

Users reported a bug; the reproducer is:
$ mkfs.btrfs /dev/loop0
$ mount /dev/loop0 /mnt/btrfs/
$ mkdir /mnt/btrfs/dir
$ chattr +C /mnt/btrfs/dir/
$ dd if=/dev/zero of=/mnt/btrfs/dir/foo bs=4K count=10;
$ lsattr /mnt/btrfs/dir/foo
---------------C- /mnt/btrfs/dir/foo
$ filefrag /mnt/btrfs/dir/foo
/mnt/btrfs/dir/foo: 1 extent found    ---> an extent
$ dd if=/dev/zero of=/mnt/btrfs/dir/foo bs=4K count=1 seek=5 conv=notrunc,nocreat; sync
$ filefrag /mnt/btrfs/dir/foo
/mnt/btrfs/dir/foo: 3 extents found   ---> with nocow, btrfs breaks the extent into three parts

The newly created file should not only inherit the NODATACOW flag, but also
honor the NODATASUM flag, because we must do COW on a file extent with a checksum.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

213490b3

Btrfs: fix hash overflow handling · 9c52057c

Chris Mason authored Dec 17, 2012

The handling for directory crc hash overflows was fairly obscure,
split_leaf returns EOVERFLOW when we try to extend the item and that is
supposed to bubble up to userland.  For a while it did so, but along the
way we added better handling of errors and forced the FS readonly if we
hit IO errors during the directory insertion.

Along the way, we started testing only for EEXIST and the EOVERFLOW case
was dropped.  The end result is that we may force the FS readonly if we
catch a directory hash bucket overflow.

This fixes a few problem spots.  First I add tests for EOVERFLOW in the
places where we can safely just return the error up the chain.

btrfs_rename is harder though, because it tries to insert the new
directory item only after it has already unlinked anything the rename
was going to overwrite.  Rather than adding very complex logic, I added
a helper to test for the hash overflow case early while it is still safe
to bail out.

Snapshot and subvolume creation had a similar problem, so they are using
the new helper now too.
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Reported-by: Pascal Junod <pascal@junod.info>

9c52057c

Btrfs: don't take inode delalloc mutex if we're a free space inode · c64c2bd8

Josef Bacik authored Dec 14, 2012

This confuses and angers lockdep even though it's ok.  We don't really need
the lock for free space inodes since only the transaction committer will be
reserving space.  Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

c64c2bd8

Btrfs: fix autodefrag and umount lockup · 1135d6df

Josef Bacik authored Dec 14, 2012

This happens because writeback_inodes_sb_nr_if_idle does down_read.  This
doesn't work for us and it has not been fixed upstream yet, so do it
ourselves and use that instead so we can stop having this stupid long
standing lockup.  Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

1135d6df

Btrfs: fix permissions of empty files not affected by umask · 9185aa58

Filipe Brandenburger authored Nov 30, 2012

When a new file is created with btrfs_create(), the inode will initially be
created with permissions 0666 and later on in btrfs_init_acl() it will be
adapted to mask out the umask bits. The problem is that this change won't make
it into the btrfs_inode unless there's another change to the inode (e.g. writing
content changing the size or touching the file changing the mtime.)

This fix adds a call to btrfs_update_inode() to btrfs_create() to make sure that
the change will not get lost if the in-memory inode is flushed before other
changes are made to the file.
Signed-off-by: Filipe Brandenburger <filbranden@google.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

9185aa58
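
A toy model of the problem, assuming an inode has an in-memory mode and a separately persisted copy. The struct and all function names are hypothetical. They only mirror the sequence the commit message describes: create with 0666, mask out the umask bits, then write the result back.

```c
#include <assert.h>

/* Hypothetical inode with an in-memory mode and its persisted copy. */
struct inode_sketch {
    unsigned int mem_mode;
    unsigned int disk_mode;
};

/* Mask out the umask bits in memory only. */
void init_acl_sketch(struct inode_sketch *i, unsigned int mask)
{
    i->mem_mode &= ~mask;
}

/* Write the in-memory mode back to the persisted copy. */
void update_inode_sketch(struct inode_sketch *i)
{
    i->disk_mode = i->mem_mode;
}

/* Create path with the extra update step the fix describes. */
struct inode_sketch create_sketch(unsigned int mask)
{
    struct inode_sketch i = { 0666, 0666 };

    init_acl_sketch(&i, mask);
    update_inode_sketch(&i);   /* without this, disk_mode would stay 0666 */
    assert(i.disk_mode == i.mem_mode);
    return i;
}
```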

Btrfs: put raid properties into global table · 31e50229

Liu Bo authored Nov 21, 2012

Raid properties can be shared among raid calculation code, we can put
them into a global table to keep it simple.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

31e50229

Btrfs: fix BUG() in scrub when first superblock reading gives EIO · 4ded4f63

Stefan Behrens authored Nov 14, 2012

This fixes a very special case that can be reproduced by just
disconnecting a disk at runtime, and without unmounting the
filesystem first, start scrub on the filesystem with the
disconnected disk. All read and write EIOs are handled
correctly, only the first superblock is an exception and gives
a BUG() in a subfunction. The BUG() is correct, it would crash
later otherwise. The subfunction must not be called for
superblocks and this is what the fix changes.
Reported-by: Joeri Vanthienen <mail@joerivanthienen.be>
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

4ded4f63

Btrfs: do not call file_update_time in aio_write · 6c760c07

Josef Bacik authored Nov 09, 2012

This starts a transaction and dirties the inode every time we call it, which
is super expensive if you have a write heavy workload. We will be updating
the inode when the IO completes and we reserve the space for the inode
update when we reserve space for the write, so there is no chance of loss of
information or enospc issues. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

6c760c07

Btrfs: only unlock and relock if we have to · 5124e00e

Josef Bacik authored Nov 07, 2012

I noticed while doing fsync tests that we were always dropping the path and
re-searching when we first cow the log root even though we've already gotten
the write lock on the root. That's because we don't take into account that
there might not be a parent node, so fix the check to make sure there is
actually a parent node before we undo all of this work for nothing. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

5124e00e

Btrfs: use tokens where we can in the tree log · 0b1c6cca

Josef Bacik authored Oct 23, 2012

If we are syncing over and over the overhead of doing all those maps in
fill_inode_item and log_changed_extents really starts to hurt, so use map
tokens so we can avoid all the extra mapping. Since the token maps from our
offset to the end of the page, make sure to set the first thing in the item
first so we really only do one map. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

0b1c6cca

Btrfs: optimize leaf_space_used · 41be1f3b

Josef Bacik authored Oct 15, 2012

This gets called at least 4 times for every level while adding an object,
and it involves 3 kmapping calls, which on my box take about 5us apiece.
So instead use a token, which brings us down to 1 kmap call and makes this
function take 1/3 of the time per call. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

41be1f3b

Btrfs: don't memset new tokens · ad914559

Josef Bacik authored Oct 15, 2012

Our token logic depends on token->kaddr being set, and if it is not it sets
everything properly as needed.  So instead of memsetting just set
token->kaddr to NULL.  Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

ad914559
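
A rough sketch of the initialization change, assuming a token struct whose `kaddr` field marks whether a mapping is cached. Only `token->kaddr` is named in the commit message. The struct layout, the other field names, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token layout; only kaddr is named in the commit message. */
struct map_token_sketch {
    void *eb;
    char *kaddr;
    unsigned long offset;
};

/* Cheap init: clear only the field the token logic tests. */
void init_token_sketch(struct map_token_sketch *t)
{
    assert(t != NULL);
    t->kaddr = NULL;
}

/* A token with no cached mapping gets its other fields set on first use. */
int token_is_mapped_sketch(const struct map_token_sketch *t)
{
    return t->kaddr != NULL;
}
```

Since the remaining fields are written when the first mapping is made, clearing them up front with memset would be redundant work.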

Btrfs: only clear dirty on the buffer if it is marked as dirty · ed7b63eb

Josef Bacik authored Oct 15, 2012

No reason to set the path blocking or loop through all of the pages if the
extent buffer isn't actually marked dirty.  Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

ed7b63eb

Btrfs: move checks in set_page_dirty under DEBUG · bb146eb2

Josef Bacik authored Oct 15, 2012

This is a high traffic function, let's try and do as little as possible
during normal operations shall we?
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

bb146eb2

Btrfs: log changed inodes based on the extent map tree · 70c8a91c

Josef Bacik authored Oct 11, 2012

We don't really need to copy extents from the source tree since we have all
of the information already available to us in the extent_map tree.  So
instead just write the extents straight to the log tree and don't bother to
copy the extent items from the source tree.
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

70c8a91c