Commits · a717531942f488209dded30f6bc648167bcefa72 · Kirill Smelkov / linux

22 Jan, 2009 1 commit

Btrfs: do less aggressive btree readahead · a7175319

Chris Mason authored Jan 22, 2009

Just before reading a leaf, btrfs scans the node for blocks that are
close by and reads them too.  It tries to build up a large window
of IO looking for blocks that are within a max distance from the top
and bottom of the IO window.

This patch changes things to just look for blocks within 64k of the
target block.  It will trigger less IO and make for lower latencies on
the read side.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

a7175319
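
A rough sketch of the 64k test described in a7175319, written as plain userland C rather than the kernel code. The function name, the constant name and the flat array of block addresses are assumptions made for illustration; the commit message only states that blocks within 64k of the target block are picked up.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name for the 64k distance mentioned in the commit message. */
#define READAHEAD_WINDOW_SKETCH (64 * 1024)

/*
 * Collect the slots whose block address lies within the window of the
 * target block, in either direction, skipping the target itself.
 * Returns the number of slots written to out[], at most out_cap.
 */
size_t pick_readahead_sketch(const uint64_t *blockptrs, size_t nr,
			     size_t target_slot, size_t *out, size_t out_cap)
{
	uint64_t target = blockptrs[target_slot];
	size_t found = 0;

	for (size_t i = 0; i < nr && found < out_cap; i++) {
		uint64_t addr = blockptrs[i];
		uint64_t dist = addr > target ? addr - target : target - addr;

		if (i == target_slot)
			continue;
		if (dist <= READAHEAD_WINDOW_SKETCH)
			out[found++] = i;
	}
	return found;
}
```

Because the test depends only on the distance to the target, the window no longer grows as more nearby blocks are found.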

21 Jan, 2009 16 commits

Btrfs: fiemap support · 1506fcc8

Yehuda Sadeh authored Jan 21, 2009

Now that bmap support is gone, this is the only way to get extent
mappings for userland. These are still not valid for IO, but they
can tell us if a file has holes or how much fragmentation there is.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>

1506fcc8
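
For readers who want to try the fiemap interface from 1506fcc8 in userland, here is a minimal sketch using the generic FS_IOC_FIEMAP ioctl from <linux/fs.h> and <linux/fiemap.h>. The helper names, the 32-extent cap and the hole-counting logic are assumptions for illustration; as the commit message says, the returned mappings are only informational and must not be used for IO.

```c
#include <fcntl.h>
#include <linux/fiemap.h>
#include <linux/fs.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

#define MAX_EXTENTS_SKETCH 32	/* hypothetical cap for this sketch */

/*
 * Count the gaps in [0, file_size) that no extent covers.
 * The extents must be sorted by fe_logical, as FIEMAP returns them.
 */
int count_holes_sketch(const struct fiemap_extent *ext, unsigned int nr,
		       unsigned long long file_size)
{
	unsigned long long pos = 0;
	int holes = 0;

	for (unsigned int i = 0; i < nr; i++) {
		if (ext[i].fe_logical > pos)
			holes++;
		if (ext[i].fe_logical + ext[i].fe_length > pos)
			pos = ext[i].fe_logical + ext[i].fe_length;
	}
	if (pos < file_size)
		holes++;
	return holes;
}

/*
 * Copy up to MAX_EXTENTS_SKETCH extent records for path into out[].
 * Returns the number of extents copied, or -1 on any error.
 */
int fetch_extents_sketch(const char *path, struct fiemap_extent *out)
{
	size_t sz = sizeof(struct fiemap) +
		    MAX_EXTENTS_SKETCH * sizeof(struct fiemap_extent);
	struct fiemap *fm = calloc(1, sz);
	int fd, n = -1;

	if (!fm)
		return -1;
	fd = open(path, O_RDONLY);
	if (fd < 0) {
		free(fm);
		return -1;
	}

	fm->fm_start = 0;
	fm->fm_length = FIEMAP_MAX_OFFSET;	/* map the whole file */
	fm->fm_flags = FIEMAP_FLAG_SYNC;	/* flush dirty data first */
	fm->fm_extent_count = MAX_EXTENTS_SKETCH;

	if (ioctl(fd, FS_IOC_FIEMAP, fm) == 0) {
		n = (int)fm->fm_mapped_extents;
		memcpy(out, fm->fm_extents, n * sizeof(*out));
	}
	close(fd);
	free(fm);
	return n;
}
```

The number of extents returned gives a rough measure of fragmentation, and count_holes_sketch() answers the "does this file have holes" question mentioned above.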

Btrfs: stop providing a bmap operation to avoid swapfile corruptions · 35054394

Chris Mason authored Jan 21, 2009

Swapfiles use bmap to build a list of extents belonging to the file,
and they assume these extents won't change over the life of the file.
They also use the resulting list to do IO directly to the block device.

This causes problems for btrfs in a few ways:

- btrfs returns logical block numbers through bmap, and these are not suitable
  for IO.  They might translate to different devices, raid etc.

- COW means that file block mappings are going to change frequently.

Using swapfiles on btrfs will lead to corruption, so we're avoiding the
problem for now by dropping bmap support entirely.  A later commit
will add fiemap support for people that really want to know how
a file is laid out.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

35054394

Btrfs: fix tree logs parallel sync · 7237f183

Yan Zheng authored Jan 21, 2009

To improve performance, btrfs_sync_log merges tree log sync
requests. But it wrongly merges sync requests for different
tree logs. If multiple tree logs are synced at the same time,
only one of them actually gets synced.

This patch makes the following changes to fix the bug:

Move most tree log related fields in btrfs_fs_info to
btrfs_root. This allows merging sync requests separately
for each tree log.

Don't insert the root item into the log root tree immediately
after the log tree is allocated. The root item for a log tree is
inserted when the log tree gets synced for the first time. This
allows syncing the log root tree without first syncing all
log trees.

At tree-log sync, btrfs_sync_log first syncs the log tree,
then updates the corresponding root item in the log root tree,
then syncs the log root tree, and finally updates the super block.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>

7237f183

Btrfs: open_ctree() error handling can oops on fs_info · 7e662854

Qinghuang Feng authored Jan 21, 2009

There is a bug in open_ctree:

struct btrfs_root *open_ctree(..)
{
....
	if (!extent_root || !tree_root || !fs_info ||
	    !chunk_root || !dev_root || !csum_root) {
		err = -ENOMEM;
		/* When control reaches "fail", fs_info may be NULL or uninitialized. */
		goto fail;
	}
....

fail:
	btrfs_close_devices(fs_info->fs_devices);		/* ! */
	btrfs_mapping_tree_free(&fs_info->mapping_tree);	/* ! */

	kfree(extent_root);
	kfree(tree_root);
	bdi_destroy(&fs_info->bdi);				/* ! */
...
}
Signed-off-by: Qinghuang Feng <qhfeng.kernel@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

7e662854
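
One common way to avoid the oops described in 7e662854 is to give the allocation-failure path its own label, so the error path never dereferences a pointer that may be NULL. The sketch below is userland C with malloc/free standing in for the kernel allocators; the struct and function names are hypothetical and this is not necessarily how the actual patch is structured.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures. */
struct fs_info_sketch {
	void *fs_devices;
};

struct root_sketch {
	int placeholder;
};

/*
 * Returns 0 on success and -1 on allocation failure.
 * simulate_fs_info_failure forces the fs_info allocation to "fail".
 * For brevity, the sketch tears everything down on success as well.
 */
int open_ctree_sketch(int simulate_fs_info_failure)
{
	struct root_sketch *tree_root = malloc(sizeof(*tree_root));
	struct fs_info_sketch *fs_info =
		simulate_fs_info_failure ? NULL : calloc(1, sizeof(*fs_info));
	int err = 0;

	if (!tree_root || !fs_info) {
		err = -1;
		goto fail;		/* skips every fs_info dereference */
	}

	fs_info->fs_devices = malloc(16);
	if (!fs_info->fs_devices) {
		err = -1;
		goto fail_devices;	/* fs_info is known to be valid here */
	}

	/* ... the real work would happen here ... */

fail_devices:
	free(fs_info->fs_devices);
fail:
	/* free(NULL) is a no-op, so these are safe on every path. */
	free(tree_root);
	free(fs_info);
	return err;
}
```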

Btrfs: fix stop searching test in replace_one_extent · 86288a19

Yan Zheng authored Jan 21, 2009

replace_one_extent searches tree leaves for references to a given extent. It
stops searching if it goes beyond the last possible position.

The last possible position is computed by adding the starting offset of a found
file extent to the full size of the extent. The code uses the physical size of the
extent as the full size. This is incorrect when compression is used.

The fix is to get the full size from the ram_bytes field of the file extent item.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>

86288a19
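
The stop test from 86288a19 can be illustrated with a small sketch. The struct below is a hypothetical, simplified stand-in for a file extent item, and treating the computed end as exclusive is an assumption; the point is only that the end position is derived from ram_bytes rather than from the on-disk size.

```c
#include <stdint.h>

/* Hypothetical, simplified view of a file extent item. */
struct file_extent_sketch {
	uint64_t key_offset;		/* file offset where the extent starts */
	uint64_t disk_num_bytes;	/* on-disk size, smaller when compressed */
	uint64_t ram_bytes;		/* uncompressed size */
};

/* Last possible position of a reference: start plus uncompressed size. */
uint64_t last_possible_offset_sketch(const struct file_extent_sketch *fi)
{
	return fi->key_offset + fi->ram_bytes;
}

/* Nonzero once the search has moved past the last possible position. */
int should_stop_sketch(const struct file_extent_sketch *fi,
		       uint64_t search_offset)
{
	return search_offset >= last_possible_offset_sketch(fi);
}
```

With a 128k extent compressed down to 4k on disk, a search at offset 8k must keep going; using disk_num_bytes as the full size would have stopped it too early.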

Btrfs: change/remove typedef · 95029d7d

Jan Engelhardt authored Jan 21, 2009

Change one typedef to a regular enum, and remove an unused one.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

95029d7d

Btrfs: remove duplicated #include · 653249ff

Huang Weiyi authored Jan 21, 2009

Removed duplicated #include "compat.h" in
fs/btrfs/extent-tree.c
Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

653249ff

Btrfs: Fix infinite loop in btrfs_extent_post_op · 5a7be515

Yan Zheng authored Jan 21, 2009

btrfs_extent_post_op calls finish_current_insert and del_pending_extents. Both
may enter infinite loops.

finish_current_insert enters an infinite loop if it only finds some backrefs to
update.  The fix is to check for pending backref updates before restarting the
loop.

The infinite loop in del_pending_extents is due to the skipped variable
not being properly reset before looping around.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>

5a7be515

Btrfs: fix locking issue in btrfs_remove_block_group · 3dfdb934

Yan Zheng authored Jan 21, 2009

We should hold the block_group_cache_lock while modifying the
block group red-black tree.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>

3dfdb934

Btrfs: simplify iteration codes · c6e30871

Qinghuang Feng authored Jan 21, 2009

Merge list_for_each* and list_entry into list_for_each_entry*.
Signed-off-by: Qinghuang Feng <qhfeng.kernel@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

c6e30871
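
To show what the merge in c6e30871 buys, here is a userland sketch with minimal re-implementations of the list helpers. The `_sketch` macros and the device struct are hypothetical stand-ins, not the definitions from the kernel's list.h, and `__typeof__` is a GCC/Clang extension.

```c
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Recover the enclosing struct from a pointer to its list member. */
#define container_of_sketch(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

#define list_for_each_sketch(pos, head) \
	for (pos = (head)->next; pos != (head); pos = pos->next)

#define list_for_each_entry_sketch(pos, head, member) \
	for (pos = container_of_sketch((head)->next, __typeof__(*pos), member); \
	     &pos->member != (head); \
	     pos = container_of_sketch(pos->member.next, __typeof__(*pos), member))

struct device_sketch {
	int devid;
	struct list_head list;
};

void list_init_sketch(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

void list_add_tail_sketch(struct list_head *item, struct list_head *head)
{
	item->prev = head->prev;
	item->next = head;
	head->prev->next = item;
	head->prev = item;
}

/* Old style: walk raw list nodes, then convert each one by hand. */
int sum_devids_two_step(struct list_head *head)
{
	struct list_head *cur;
	int sum = 0;

	list_for_each_sketch(cur, head) {
		struct device_sketch *dev =
			container_of_sketch(cur, struct device_sketch, list);
		sum += dev->devid;
	}
	return sum;
}

/* Merged style: the iterator yields the enclosing struct directly. */
int sum_devids_merged(struct list_head *head)
{
	struct device_sketch *dev;
	int sum = 0;

	list_for_each_entry_sketch(dev, head, list)
		sum += dev->devid;
	return sum;
}
```

Both functions visit the same entries; the merged form drops the extra cursor variable and the manual conversion.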

Btrfs: check return value for kthread_run() correctly · 57506d50

Qinghuang Feng authored Jan 21, 2009

kthread_run() returns the kthread or ERR_PTR(-ENOMEM), not NULL.
Signed-off-by: Qinghuang Feng <qhfeng.kernel@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

57506d50
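
The distinction behind 57506d50 is that an error-encoding pointer is never NULL, so a NULL test silently passes. The sketch below mimics the kernel's ERR_PTR/IS_ERR/PTR_ERR convention in userland; every `_sketch` name is a hypothetical stand-in, and start_worker_sketch() merely simulates a thread creator that fails the way kthread_run() is described above.

```c
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO_SKETCH 4095

/* Encode a negative errno value in a pointer. */
static inline void *err_ptr_sketch(long error)
{
	return (void *)error;
}

/* Decode the errno value from an error pointer. */
static inline long ptr_err_sketch(const void *ptr)
{
	return (long)ptr;
}

/* Error pointers occupy the top MAX_ERRNO_SKETCH addresses. */
static inline int is_err_sketch(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO_SKETCH;
}

/* Simulated thread creator: returns a handle or an encoded -ENOMEM. */
void *start_worker_sketch(int simulate_failure)
{
	static int worker;

	return simulate_failure ? err_ptr_sketch(-ENOMEM) : (void *)&worker;
}

/* Correct check: test for an error pointer, not for NULL. */
int start_and_check_sketch(int simulate_failure)
{
	void *task = start_worker_sketch(simulate_failure);

	if (is_err_sketch(task))
		return (int)ptr_err_sketch(task);
	return 0;
}
```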

Btrfs: Remove extra KERN_INFO in the middle of a line · 119e10cf

Roland Dreier authored Jan 21, 2009

The "devid <xxx> transid <xxx>" printk in btrfs_scan_one_device()
actually follows another printk that doesn't end in a newline (since the
intention is for the two printks to make one line of output), so the
KERN_INFO just ends up messing up the output:

    device label exp <6>devid 1 transid 9 /dev/sda5

Fix this by changing the extra KERN_INFO to KERN_CONT.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

119e10cf

Btrfs: removed unused #include <version.h>'s · 7eaebe7d

Huang Weiyi authored Jan 21, 2009

Removed unused #include <version.h>'s in btrfs
Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

7eaebe7d

Btrfs: cleanup xattr code · 07060404

Josef Bacik authored Jan 21, 2009

Andrew's review of the xattr code revealed some minor issues that this patch
addresses.  It fixes an error return, removes a useless statement, and
comments one of the trickier parts of __btrfs_getxattr.
Signed-off-by: Josef Bacik <jbacik@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

07060404

Btrfs: MAINTAINERS entry · eb1eb04f

Joe Perches authored Jan 21, 2009

Signed-off-by: Chris Mason <chris.mason@oracle.com>

eb1eb04f

Btrfs: cleanup fs/btrfs/super.c::btrfs_control_ioctl() · 19d00cc1

Wang Cong authored Jan 21, 2009

- Remove the unused local variable 'len';
- Check return value of kmalloc().
Signed-off-by: Wang Cong <wangcong@zeuux.org>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

19d00cc1

16 Jan, 2009 2 commits

Btrfs: fix ioctl arg size (userland incompatible change!) · c071fcfd

Chris Mason authored Jan 16, 2009

The structure used to send device in btrfs ioctl calls was not
properly aligned, and so 32 bit ioctls would not work properly on
64 bit kernels.

We could fix this with compat ioctls, but we're just one byte away
and it doesn't make sense to carry the compat ioctls around
forever at this stage in the project.

This patch brings the ioctl arg up to an evenly aligned 4k.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

c071fcfd
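
A sketch of how an ioctl argument can be pinned to exactly 4k so that 32 bit and 64 bit builds agree on its size, as c071fcfd sets out to do. The struct and macro names are hypothetical and the field layout (a 64 bit descriptor followed by a name buffer) is an assumption; only the 4k total comes from the commit message.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: name buffer sized so the whole struct comes to 4096 bytes. */
#define PATH_NAME_MAX_SKETCH 4087

struct vol_args_sketch {
	int64_t fd;				/* fixed-width, same on 32/64 bit */
	char name[PATH_NAME_MAX_SKETCH + 1];	/* 4088 bytes */
};

/* Fail the build if the layout ever drifts away from 4k. */
_Static_assert(sizeof(struct vol_args_sketch) == 4096,
	       "ioctl argument must be exactly 4k");

size_t vol_args_size_sketch(void)
{
	return sizeof(struct vol_args_sketch);
}
```

Using a fixed-width integer and a total size that is a multiple of 8 leaves no room for the compiler to add different padding on different ABIs.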

Btrfs: Clear the device->running_pending flag before bailing on congestion · 1d9e2ae9

Chris Mason authored Jan 16, 2009

Btrfs maintains a queue of async bio submissions so the checksumming
threads don't have to wait on get_request_wait.  In order to avoid
extra wakeups, this code has a running_pending flag that is used
to tell new submissions they don't need to wake the thread.

When the threads notice congestion on a single device, they
may decide to requeue the job and move on to other devices.  This
makes sure the running_pending flag is cleared before the
job is requeued.

It should help avoid IO stalls by making sure the task is woken up
when new submissions come in.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

1d9e2ae9

09 Jan, 2009 1 commit

Btrfs: explicitly mark the tree log root for writeback · e293e97e

Chris Mason authored Jan 09, 2009

Each subvolume has an extent_state_tree used to mark metadata
that needs to be sent to disk while syncing the tree. This is
used in addition to the dirty bits on the pages themselves so that
a single subvolume can be sent to disk efficiently in disk order.

Normally this marking happens in btrfs_alloc_free_block, which also does
special recording of dirty tree blocks for the tree log roots.

Yan Zheng noticed that when the root of the log tree is allocated, it is added
to the wrong writeback list. The fix used here is to explicitly set
it dirty as part of tree log creation.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

e293e97e

08 Jan, 2009 1 commit

Btrfs: Drop the hardware crc32c asm code · 755efdc3

Chris Mason authored Jan 07, 2009

This is already in the arch specific directories in mainline and
shouldn't be copied into btrfs.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

755efdc3

07 Jan, 2009 2 commits

Btrfs: Add Documentation/filesystem/btrfs.txt, remove old COPYING · 709ac06a

David Woodhouse authored Jan 07, 2009

Signed-off-by: Chris Mason <chris.mason@oracle.com>

709ac06a

Btrfs: kmap_atomic(KM_USER0) is safe for btrfs_readpage_end_io_hook · 9ab86c8e

Chris Mason authored Jan 07, 2009

None of the checksum verification code schedules, so we can use the faster
kmap_atomic.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

9ab86c8e

06 Jan, 2009 8 commits

Btrfs: Don't use kmap_atomic(..., KM_IRQ0) during checksum verifies · cc7172de

Chris Mason authored Jan 06, 2009

Checksum verification happens in a helper thread, and there is no
need to mess with interrupts.  This switches to kmap() instead.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

cc7172de

Btrfs: tree logging checksum fixes · 07d400a6

Yan Zheng authored Jan 06, 2009

This patch contains the following changes:

1) Limit the max size of the btrfs_ordered_sum structure to PAGE_SIZE.  This
struct is kmalloced so we want to keep it reasonable.

2) Replace copy_extent_csums with btrfs_lookup_csums_range.  This was
duplicated code in tree-log.c.

3) Remove replay_one_csum. csum items are replayed at the same time as
   replaying file extents. This guarantees we only replay useful csums.

4) nbytes accounting fix.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>

07d400a6

Btrfs: don't change file extent's ram_bytes in btrfs_drop_extents · 1ba12553

Yan Zheng authored Jan 06, 2009

btrfs_drop_extents doesn't change a file extent's ram_bytes
in the case of a bookend extent. To be consistent, we should
also not change ram_bytes when truncating an existing extent.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>

1ba12553

Btrfs: Use btrfs_join_transaction to avoid deadlocks during snapshot creation · 180591bc

Yan Zheng authored Jan 06, 2009

Snapshot creation happens at a specific time during transaction commit.  We
need to make sure the code called by snapshot creation doesn't wait
for the running transaction to commit.

This changes btrfs_delete_inode and finish_pending_snaps to use
btrfs_join_transaction instead of btrfs_start_transaction to avoid deadlocks.

It would be better if btrfs_delete_inode didn't use the join, but the
call path that triggers it is:

btrfs_commit_transaction->create_pending_snapshots->
create_pending_snapshot->btrfs_lookup_dentry->
fixup_tree_root_location->btrfs_read_fs_root->
btrfs_read_fs_root_no_name->btrfs_orphan_cleanup->iput

This will be fixed in a later patch by moving the orphan cleanup to the
cleaner thread.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

180591bc

Btrfs: drop remaining LINUX_KERNEL_VERSION checks and compat code · 9ca03b99

Chris Mason authored Jan 06, 2009

Signed-off-by: Chris Mason <chris.mason@oracle.com>

9ca03b99

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable · 860a7a0c

Chris Mason authored Jan 06, 2009

860a7a0c

Btrfs: drop EXPORT symbols from extent_io.c · 43b774ba

Chris Mason authored Jan 05, 2009

They should stay out until this is turned into generic code.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

43b774ba

Btrfs: Fix checkpatch.pl warnings · d397712b

Chris Mason authored Jan 05, 2009

There were many, most are fixed now.  struct-funcs.c generates some warnings
but these are bogus.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

d397712b

05 Jan, 2009 9 commits

Btrfs: Fix free block discard calls down to the block layer · 1f3c79a2

Liu Hui authored Jan 05, 2009

This patch fixes the discard semantics so that Btrfs works with FTLs and SSDs.
We can improve an FTL's performance by telling it which sectors the file
system has freed. But if we don't pass the FTL this information at the proper
time, the transaction mechanism of Btrfs breaks and Btrfs cannot
roll back to the previous transaction after a power loss.

There are some problems in the old implementation:
1. In __free_extent(), the pinned down extents should not be discarded.
2. In free_extents(), the free extents are all pinned, so they need to
   be discarded at transaction commit time instead of in free_extents().
3. The reserved extent used by the log tree should be discarded too.

This patch changes the discard behavior as follows:
1. For the extents which need to be freed at once,
   we discard them in update_block_group().
2. Delay discarding the pinned extents until btrfs_finish_extent_commit()
   when committing the transaction.
3. Remove discarding from free_extents() and __free_extent().
4. Add a discard interface to btrfs_free_reserved_extent().
5. Discard sectors before updating the free space cache; otherwise,
   the FTL will destroy file system data.

1f3c79a2

Btrfs: avoid orphan inode caused by log replay · ec051c0f

Yan Zheng authored Jan 05, 2009

drop_one_dir_item does not properly update the inode's link count. It can be
reproduced by executing the following commands:

# touch test
# sync
# rm -f test
# dd if=/dev/zero bs=4k count=1 of=test conv=fsync
# echo b > /proc/sysrq-trigger

This fixes it by adding a BTRFS_ORPHAN_ITEM_KEY for the inode.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>

ec051c0f

Btrfs: avoid potential super block corruption · 2d69a0f8

Yan Zheng authored Jan 05, 2009

The data in fs_info->super_for_commit are zeros before the
first transaction commit. If a tree log sync and a system crash
both occur before the first transaction commit, the super block
will get corrupted.

This fixes it by properly filling in the super_for_commit field at
open time.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>

2d69a0f8

Btrfs: do not call kfree if kmalloc failed in btrfs_sysfs_add_super · dd3fd8bd

Shen Feng authored Jan 05, 2009

Signed-off-by: Shen Feng <shen@cn.fujitsu.com>

dd3fd8bd

Btrfs: fix a memory leak in btrfs_get_sb · 1f483660

Shen Feng authored Jan 05, 2009

subvol_name should be freed if an error occurs.
Signed-off-by: Shen Feng <shen@cn.fujitsu.com>

1f483660

Btrfs: Fix typo in clear_state_cb · c584482b

Liu Hui authored Jan 05, 2009

In clear_state_cb, we should check 'tree->ops->clear_bit_hook' instead
of 'tree->ops->set_bit_hook'.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

c584482b
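
The typo fixed in c584482b is the classic "test one function pointer, call another" mistake. A minimal sketch, with a hypothetical ops table and hook signature rather than the real extent_io ones:

```c
#include <stddef.h>

/* Hypothetical ops table with two optional hooks. */
struct tree_ops_sketch {
	void (*set_bit_hook)(int *counter);
	void (*clear_bit_hook)(int *counter);
};

/* Example hook used to observe whether a callback ran. */
void count_hook_sketch(int *counter)
{
	(*counter)++;
}

/*
 * Guard the call with the hook that is actually invoked. Checking
 * set_bit_hook here instead would call through a NULL clear_bit_hook
 * whenever only the set hook is provided.
 */
void clear_state_cb_sketch(const struct tree_ops_sketch *ops, int *counter)
{
	if (ops && ops->clear_bit_hook)
		ops->clear_bit_hook(counter);
}
```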

Btrfs: Fix memset length in btrfs_file_write · 9aead435

yanhai zhu authored Jan 05, 2009

Signed-off-by: Chris Mason <chris.mason@oracle.com>

9aead435

Btrfs: update directory's size when creating subvol/snapshot · 52c26179

Yan Zheng authored Jan 05, 2009

Make sure the directory's size is properly updated when creating
a subvol/snapshot.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>

52c26179

Btrfs: add permission checks to the ioctls · e441d54d

Chris Mason authored Jan 05, 2009

Only root can add/remove devices
Only root can defrag subtrees
Only files open for writing can be defragged
Only files open for writing can be the destination for a clone
Signed-off-by: Chris Mason <chris.mason@oracle.com>

e441d54d
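
The four rules listed in e441d54d can be summarized as a small decision function. This is a userland sketch only: the enum, the is_admin flag (standing in for a root/capability check) and the use of open(2) access-mode flags are assumptions for illustration, not the kernel code.

```c
#include <fcntl.h>

/* Hypothetical labels for the ioctls named in the commit message. */
enum ioctl_op_sketch {
	OP_ADD_DEV,
	OP_RM_DEV,
	OP_DEFRAG_SUBTREE,
	OP_DEFRAG_FILE,
	OP_CLONE_DEST
};

/* Returns 1 if the operation is allowed and 0 otherwise. */
int ioctl_permitted_sketch(enum ioctl_op_sketch op, int is_admin,
			   int open_flags)
{
	int accmode = open_flags & O_ACCMODE;
	int writable = accmode == O_WRONLY || accmode == O_RDWR;

	switch (op) {
	case OP_ADD_DEV:
	case OP_RM_DEV:
	case OP_DEFRAG_SUBTREE:
		return is_admin ? 1 : 0;	/* root only */
	case OP_DEFRAG_FILE:
	case OP_CLONE_DEST:
		return writable ? 1 : 0;	/* file must be open for writing */
	}
	return 0;
}
```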