Commits · 1c691b330a19a1344df89bcb0f4cacd99e8b289a · nexedi / linux

29 Mar, 2012 4 commits

Merge branch 'for-chris' of git://github.com/idryomov/btrfs-unstable into for-linus · 1c691b33

Chris Mason authored Mar 28, 2012

1c691b33

Merge branch 'error-handling' into for-linus · 1d4284bd

Chris Mason authored Mar 28, 2012

Conflicts:
	fs/btrfs/ctree.c
	fs/btrfs/disk-io.c
	fs/btrfs/extent-tree.c
	fs/btrfs/extent_io.c
	fs/btrfs/extent_io.h
	fs/btrfs/inode.c
	fs/btrfs/scrub.c
Signed-off-by: Chris Mason <chris.mason@oracle.com>

1d4284bd

btrfs: disallow unequal data/metadata blocksize for mixed block groups · 65139ed9

David Sterba authored Feb 17, 2012

With support for bigger metadata blocks, we must avoid mounting a
filesystem with different block sizes for mixed block groups; doing so
causes corruption (found by xfstests/083).
Signed-off-by: David Sterba <dsterba@suse.cz>

65139ed9
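The mount-time check this commit describes can be sketched in user-space C as below. This is a minimal illustration, not the kernel code: `struct sb_sizes`, `check_mixed_blocksize()` and the value of `FEATURE_INCOMPAT_MIXED_GROUPS` are hypothetical names and values chosen for the example.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the mixed-groups feature bit; value is illustrative. */
#define FEATURE_INCOMPAT_MIXED_GROUPS (1ULL << 2)

/* Hypothetical, simplified view of the superblock fields involved. */
struct sb_sizes {
	uint64_t incompat_flags;
	uint32_t sectorsize;	/* data block size */
	uint32_t leafsize;	/* metadata block size */
};

/* Return 0 if the filesystem may be mounted, -EINVAL otherwise. */
static int check_mixed_blocksize(const struct sb_sizes *sb)
{
	/* Mixed block groups store data and metadata together,
	 * so both must use the same block size. */
	if ((sb->incompat_flags & FEATURE_INCOMPAT_MIXED_GROUPS) &&
	    sb->sectorsize != sb->leafsize)
		return -EINVAL;
	return 0;
}
```

A filesystem without the mixed-groups bit passes the check even when the two sizes differ.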

Btrfs: enhance superblock sanity checks · fcd1f065

David Sterba authored Mar 06, 2012

Validate checksum algorithm during mount and prevent BUG_ON later in
btrfs_super_csum_size.
Signed-off-by: David Sterba <dsterba@suse.cz>

fcd1f065
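One way to picture the checksum validation is a bounds check against the table of known checksum sizes, done once at mount time so that later lookups cannot hit an out-of-range index. The sketch below is an assumption-laden model: `csum_sizes`, `validate_csum_type()` and `csum_size()` are hypothetical names, and the table is assumed to hold a single 4-byte entry.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Sizes in bytes for each known checksum type; a single entry is assumed here. */
static const int csum_sizes[] = { 4 };

/* Mount-time check: reject checksum types the table does not know about. */
static int validate_csum_type(uint16_t csum_type)
{
	if (csum_type >= ARRAY_SIZE(csum_sizes))
		return -EINVAL;
	return 0;
}

/* Only safe to call after validate_csum_type() has succeeded. */
static int csum_size(uint16_t csum_type)
{
	return csum_sizes[csum_type];
}
```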

27 Mar, 2012 15 commits

Btrfs: change scrub to support big blocks · b5d67f64

Stefan Behrens authored Mar 27, 2012

Scrub used to be coded for nodesize == leafsize == sectorsize == PAGE_SIZE.
This is now changed to support sizes for nodesize and leafsize which are
N * PAGE_SIZE.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

b5d67f64

Btrfs: minor cleanup in scrub · 1623edeb

Stefan Behrens authored Mar 27, 2012

Just a minor cleanup commit in preparation for the big block changes.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

1623edeb

Btrfs: introduce common define for max number of mirrors · 94598ba8

Stefan Behrens authored Mar 27, 2012

Readahead already has a define for the max number of mirrors. Scrub
needs such a define now, the rest of the code will need something
like this soon. Therefore the define was added to ctree.h and removed
from the readahead code.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

94598ba8

Btrfs: fix infinite loop in btrfs_shrink_device() · 213e64da

Ilya Dryomov authored Mar 27, 2012

If relocate of block group 0 fails with ENOSPC we end up infinitely
looping because key.offset -= 1 statement in that case brings us back to
where we started.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

213e64da

Btrfs: fix memory leak in resolver code · 5eb56d25

Ilya Dryomov authored Mar 27, 2012

init_ipath() allocates btrfs_data_container which is never freed.  Free
it in free_ipath() and nuke the comment for init_data_container() - we
can safely free it with kfree().
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

5eb56d25

Btrfs: allow dup for data chunks in mixed mode · e4837f8f

Ilya Dryomov authored Mar 27, 2012

Generally we don't allow dup for data, but mixed chunks are special and
people seem to think this has its use cases.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

e4837f8f

Btrfs: validate target profiles only if we are going to use them · 6728b198

Ilya Dryomov authored Mar 27, 2012

Do not run sanity checks on all target profiles unless they all will be
used.  This came up because alloc_profile_is_valid() is now more strict
than it used to be.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

6728b198

Btrfs: improve the logic in btrfs_can_relocate() · 4a5e98f5

Ilya Dryomov authored Mar 27, 2012

Currently if we don't have enough space allocated we go ahead and loop
through devices in the hopes of finding enough space for a chunk of the
*same* type as the one we are trying to relocate.  The problem with that
is that if we are trying to restripe the chunk its target type can be
more relaxed than the current one (e.g. require fewer devices or less
space).  So, when restriping, run checks against the target profile
instead of the current one.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

4a5e98f5

Btrfs: add __get_block_group_index() helper · 7738a53a

Ilya Dryomov authored Mar 27, 2012

Add __get_block_group_index() helper to be able to derive block group
index from an arbitrary set of flags.  Implement get_block_group_index()
in terms of it.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

7738a53a
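The split between a flags-based helper and a wrapper that takes a block group can be sketched as follows. The bit values, the index ordering and the names `index_from_flags()`, `block_group_index()` and `struct block_group` are assumptions made for this example, not taken from the commit.

```c
#include <stdint.h>

/* Illustrative profile bits; the values are assumed for this sketch. */
#define BG_RAID0	(1ULL << 3)
#define BG_RAID1	(1ULL << 4)
#define BG_DUP		(1ULL << 5)
#define BG_RAID10	(1ULL << 6)

/* Hypothetical, stripped-down block group. */
struct block_group {
	uint64_t flags;
};

/* Derive the index from an arbitrary set of flags (ordering is assumed). */
static int index_from_flags(uint64_t flags)
{
	if (flags & BG_RAID10)
		return 0;
	if (flags & BG_RAID1)
		return 1;
	if (flags & BG_DUP)
		return 2;
	if (flags & BG_RAID0)
		return 3;
	return 4;	/* no profile bit set: single */
}

/* The block-group variant is a thin wrapper around the flags-based helper. */
static int block_group_index(const struct block_group *bg)
{
	return index_from_flags(bg->flags);
}
```

With the flags-only helper, a caller can ask which index a target profile would map to without having a block group of that type at hand.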

Btrfs: add get_restripe_target() helper · fc67c450

Ilya Dryomov authored Mar 27, 2012

Add get_restripe_target() helper and switch everybody to use it.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

fc67c450

Btrfs: move alloc_profile_is_valid() to volumes.c · 0c460c0d

Ilya Dryomov authored Mar 27, 2012

Header file is not a good place to define functions.  This also moves a
call to alloc_profile_is_valid() down the stack and removes a redundant
check from __btrfs_alloc_chunk() - alloc_profile_is_valid() takes it
into account.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

0c460c0d

Btrfs: make profile_is_valid() check more strict · e8920a64

Ilya Dryomov authored Mar 27, 2012

"0" is a valid value for an on-disk chunk profile, but it is not a valid
extended profile.  (We have a separate bit for single chunks in extended
case)

Also rename it to alloc_profile_is_valid() for clarity.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

e8920a64
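The stricter rule can be illustrated with a small user-space model: "0" passes as an on-disk chunk profile but is rejected as an extended profile, where single chunks have their own bit. The bit values and the name `profile_is_valid()` are assumptions for this sketch, as is the additional rule that a non-zero profile must have exactly one bit set.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit values, assumed for this sketch. */
#define PROFILE_RAID0	(1ULL << 3)
#define PROFILE_RAID1	(1ULL << 4)
#define PROFILE_DUP	(1ULL << 5)
#define PROFILE_RAID10	(1ULL << 6)
#define PROFILE_MASK	(PROFILE_RAID0 | PROFILE_RAID1 | \
			 PROFILE_DUP | PROFILE_RAID10)

/* Extended format only: a dedicated bit for single chunks. */
#define AVAIL_ALLOC_BIT_SINGLE	(1ULL << 48)
#define EXTENDED_PROFILE_MASK	(PROFILE_MASK | AVAIL_ALLOC_BIT_SINGLE)

static bool profile_is_valid(uint64_t flags, bool extended)
{
	uint64_t mask = extended ? EXTENDED_PROFILE_MASK : PROFILE_MASK;

	/* Any bit outside the relevant mask makes the profile invalid. */
	if (flags & ~mask)
		return false;

	/* "0" means single in the chunk format, but is invalid when extended. */
	if (flags == 0)
		return !extended;

	/* Assumed here: exactly one profile bit may be set. */
	return (flags & (flags - 1)) == 0;
}
```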

Btrfs: add wrappers for working with alloc profiles · 899c81ea

Ilya Dryomov authored Mar 27, 2012

Add functions to abstract the conversion between chunk and extended
allocation profile formats and switch everybody to use them.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

899c81ea
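A rough sketch of what such conversion wrappers might look like is given below. It assumes that the two formats differ only in how "single" is encoded (no profile bit in the chunk format, a dedicated bit in the extended format); the bit values and the function names are hypothetical.

```c
#include <stdint.h>

/* Illustrative bit values, assumed for this sketch. */
#define PROFILE_RAID1		(1ULL << 4)
#define PROFILE_MASK		(0xFULL << 3)	/* four assumed profile bits */
#define AVAIL_ALLOC_BIT_SINGLE	(1ULL << 48)

/* Chunk format -> extended format: make "single" explicit. */
static uint64_t chunk_to_extended(uint64_t flags)
{
	if ((flags & PROFILE_MASK) == 0)
		flags |= AVAIL_ALLOC_BIT_SINGLE;
	return flags;
}

/* Extended format -> chunk format: "single" goes back to no profile bit. */
static uint64_t extended_to_chunk(uint64_t flags)
{
	return flags & ~AVAIL_ALLOC_BIT_SINGLE;
}
```

Keeping the conversion in one pair of functions means callers never have to test or clear the single bit by hand.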

Btrfs: stop silently switching single chunks to raid0 on balance · e3176ca2

Ilya Dryomov authored Mar 27, 2012

This has been causing a lot of confusion for quite a while now and a lot
of users were surprised by this (some of them were even stuck in a
ENOSPC situation which they couldn't easily get out of). The addition
of restriper gives users a clear choice between raid0 and drive concat
setup so there's absolutely no excuse for us to keep doing this.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

e3176ca2

Btrfs: deal with read errors on extent buffers differently · ea466794

Josef Bacik authored Mar 26, 2012

Since we need to read and write extent buffers in their entirety we can't use
the normal bio_readpage_error stuff since it only works on a per page basis. So
instead make it so that if we see an io error in endio we just mark the eb as
having an IO error and then in btree_read_extent_buffer_pages we will manually
try other mirrors and then overwrite the bad mirror if we find a good copy.
This works with larger than page size blocks. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

ea466794
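The retry-and-repair flow described in this commit can be modelled in plain C as follows. This is only a toy model: `struct mirrors`, `read_block()` and `MAX_MIRRORS` are hypothetical, each mirror is reduced to a good/bad flag, and repairing only the first failed mirror is an assumption of the sketch.

```c
#include <errno.h>
#include <stdbool.h>

#define MAX_MIRRORS 3	/* assumed upper bound for this sketch */

/* Hypothetical model: each mirror either holds a good copy or not. */
struct mirrors {
	int num_copies;
	bool good[MAX_MIRRORS];
};

/*
 * Try every mirror in turn. If a later mirror holds a good copy,
 * rewrite the first mirror that failed and report success.
 */
static int read_block(struct mirrors *m)
{
	int failed = -1;

	for (int i = 0; i < m->num_copies; i++) {
		if (!m->good[i]) {
			if (failed < 0)
				failed = i;	/* remember the bad mirror */
			continue;
		}
		if (failed >= 0)
			m->good[failed] = true;	/* models overwriting it */
		return 0;
	}
	return -EIO;	/* no mirror had a good copy */
}
```

Because the whole block is the unit being read, the decision to retry and repair is made once per block rather than once per page.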

26 Mar, 2012 12 commits

Btrfs: don't use threaded IO completion helpers for metadata writes · f3f266ab

Chris Mason authored Mar 23, 2012

The metadata write IO completion code is now simple enough that we
don't need the threaded helpers anymore.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

f3f266ab

Btrfs: adjust the write_lock_level as we unlock · f7c79f30

Chris Mason authored Mar 19, 2012

btrfs_search_slot sometimes needs write locks on high levels of
the tree.  It remembers the highest level that needs a write lock
and will use that for all future searches through the tree in a given
call.

But, very often we'll just cow the top level or the level below and we
won't really need write locks on the root again after that.  This patch
changes things to adjust the write lock requirement as it unlocks
levels.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

f7c79f30

Btrfs: loop waiting on writeback · a098d8e8

Chris Mason authored Mar 21, 2012

lock_extent_buffer_for_io needs to loop around and make sure the
writeback bits are not set.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

a098d8e8

Btrfs: add the ability to cache a pointer into the eb · cfed81a0

Chris Mason authored Mar 03, 2012

This cuts down on the CPU time used by map_private_extent_buffer
Signed-off-by: Chris Mason <chris.mason@oracle.com>

cfed81a0

Btrfs: ensure an entire eb is written at once · 0b32f4bb

Josef Bacik authored Mar 13, 2012

This patch simplifies how we track our extent buffers. Previously we could exit
writepages with only having written half of an extent buffer, which meant we had
to track the state of the pages and the state of the extent buffers differently.
Now we only read in entire extent buffers and write out entire extent buffers,
this allows us to simply set bits in our bflags to indicate the state of the eb
and we no longer have to do things like track uptodate with our iotree. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

0b32f4bb

Btrfs: introduce mark_extent_buffer_accessed · 5df4235e

Josef Bacik authored Mar 15, 2012

Because an eb can have multiple pages we need to make sure that all pages within
the eb are marked as accessed, since releasepage can be called against any page
in the eb. This will keep us from possibly evicting hot eb's when we're doing
larger than pagesize eb's. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

5df4235e

Btrfs: introduce free_extent_buffer_stale · 3083ee2e

Josef Bacik authored Mar 09, 2012

Because btrfs cow's we can end up with extent buffers that are no longer
necessary just sitting around in memory. So instead of evicting these pages, we
could end up evicting things we actually care about. Thus we have
free_extent_buffer_stale for use when we are freeing tree blocks. This will
make it so that the ref for the eb being in the radix tree is dropped as soon as
possible and then is freed when the refcount hits 0 instead of waiting to be
released by releasepage. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

3083ee2e

Btrfs: only use the existing eb if it's count isn't 0 · 115391d2

Josef Bacik authored Mar 09, 2012

We can run into a problem where we find an eb for our existing page already on
the radix tree but it has a ref count of 0. It hasn't been removed by RCU
yet so this can cause issues where we will use the EB after free. So do
atomic_inc_not_zero on the exists->refs and if it is zero just do
synchronize_rcu() and try again. We won't have to worry about new allocators
coming in since they will block on the page lock at this point. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

115391d2
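The "take a reference only if the count is not already zero" idea can be sketched in user-space C11 with a compare-and-swap loop. This is an analogue of what the kernel's atomic_inc_not_zero does, not the kernel implementation; `ref_get_unless_zero()` is a hypothetical name, and the synchronize_rcu()-and-retry step from the commit message is left out.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Increment *refs unless it is zero.
 * Returns true if a reference was taken, false if the object is
 * already on its way to being freed and must not be reused.
 */
static bool ref_get_unless_zero(atomic_int *refs)
{
	int old = atomic_load(refs);

	while (old != 0) {
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(refs, &old, old + 1))
			return true;
	}
	return false;
}
```

A caller that gets `false` back would drop the stale pointer and look the object up again instead of using it.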

Btrfs: set page->private to the eb · 4f2de97a

Josef Bacik authored Mar 07, 2012

We spend a lot of time looking up extent buffers from pages when we could just
store the pointer to the eb the page is associated with in page->private. This
patch does just that, and it makes things a little simpler and reduces a bit of
CPU overhead involved with doing metadata IO. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

4f2de97a

Btrfs: allow metadata blocks larger than the page size · 727011e0

Chris Mason authored Aug 06, 2010

A few years ago the btrfs code to support blocks larger than
the page size was disabled to fix a few corner cases in the
page cache handling.  This fixes the code to properly support
large metadata blocks again.

Since current kernels will crash early and often with larger
metadata blocks, this adds an incompat bit so that older kernels
can't mount it.

This also does away with different blocksizes for nodes and leaves.
You get a single block size for all tree blocks.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

727011e0

Btrfs: remove search_start and search_end from find_free_extent and callers · 81c9ad23

Josef Bacik authored Jan 18, 2012

We have been passing nothing but (u64)-1 to find_free_extent for search_end in
all of the callers, so it's completely useless, and we've always been passing 0
in as search_start, so just remove them as function arguments and move
search_start into find_free_extent. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

81c9ad23

Btrfs: remove the ideal caching code · 285ff5af

Josef Bacik authored Jan 13, 2012

This is a relic from before we had the disk space cache and it was to make
bootup times when you had btrfs as root not be so damned slow. Now that we have
the disk space cache this isn't a problem anymore and really having this code
causes unneeded fragmentation and complexity, so just remove it. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

285ff5af

22 Mar, 2012 9 commits

btrfs: Fix busyloop in transaction_kthread() · 914b2007

Jan Kara authored Mar 12, 2012

When a filesystem got aborted due to an error, transaction_kthread() will
busyloop.  Fix it by going to sleep in that case as well. Maybe we should
just stop transaction_kthread() when filesystem is aborted but that would be
more complex.
Signed-off-by: Jan Kara <jack@suse.cz>

914b2007

btrfs: replace many BUG_ONs with proper error handling · 79787eaa

Jeff Mahoney authored Mar 12, 2012

 btrfs currently handles most errors with BUG_ON. This patch is a work-in-
 progress but aims to handle most errors other than internal logic
 errors and ENOMEM more gracefully.

 This iteration prevents most crashes but can run into lockups with
 the page lock on occasion when the timing "works out."
Signed-off-by: Jeff Mahoney <jeffm@suse.com>

79787eaa

btrfs: enhance transaction abort infrastructure · 49b25e05

Jeff Mahoney authored Mar 01, 2012

Signed-off-by: Jeff Mahoney <jeffm@suse.com>

49b25e05

btrfs: add varargs to btrfs_error · 4da35113

Jeff Mahoney authored Mar 01, 2012

 btrfs currently handles most errors with BUG_ON. This patch is a work-in-
 progress but aims to handle most errors other than internal logic
 errors and ENOMEM more gracefully.

 This iteration prevents most crashes but can run into lockups with
 the page lock on occasion when the timing "works out."
Signed-off-by: Jeff Mahoney <jeffm@suse.com>

4da35113

btrfs: Remove BUG_ON from __finish_chunk_alloc() · 3acd3953

Mark Fasheh authored Sep 08, 2011

btrfs_alloc_chunk() unconditionally BUGs on any error returned from
__finish_chunk_alloc() so there's no need for two BUG_ON lines. Remove the
one from __finish_chunk_alloc().
Signed-off-by: Mark Fasheh <mfasheh@suse.de>

3acd3953

btrfs: Remove BUG_ON from __btrfs_alloc_chunk() · 1dd4602f

Mark Fasheh authored Sep 08, 2011

We BUG_ON() error from add_extent_mapping(), but that error looks pretty
easy to bubble back up - as far as I can tell there have not been any
permanent modifications to fs state at that point.
Signed-off-by: Mark Fasheh <mfasheh@suse.de>

1dd4602f

btrfs: Don't BUG_ON insert errors in btrfs_alloc_dev_extent() · 2cdcecbc

Mark Fasheh authored Sep 08, 2011

The only caller of btrfs_alloc_dev_extent() is __btrfs_alloc_chunk() which
already bugs on any error returned. We can remove the BUG_ON's in
btrfs_alloc_dev_extent() then since __btrfs_alloc_chunk() will "catch" them
anyway.
Signed-off-by: Mark Fasheh <mfasheh@suse.de>

2cdcecbc

btrfs: Go readonly on tree errors in balance_level · 305a26af

Mark Fasheh authored Sep 01, 2011

balance_level() seems to deal with missing tree nodes by BUG_ON(). Instead,
we can easily just set the file system readonly and bubble -EROFS back up
the stack.
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

305a26af
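The general shape of replacing a BUG_ON() with "go read-only and bubble the error up" can be sketched as below. All names here (`struct fs_state`, `struct node`, `check_child()`, `rebalance()`) are hypothetical and stand in for much larger kernel structures; the sketch only shows the control flow.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, minimal filesystem state. */
struct fs_state {
	bool read_only;
};

/* Hypothetical tree node. */
struct node {
	int level;
};

/* Instead of crashing on a missing node, flip the fs read-only. */
static int check_child(struct fs_state *fs, const struct node *child)
{
	if (child == NULL) {
		fs->read_only = true;
		return -EROFS;
	}
	return 0;
}

/* The caller does no further work and passes the error up the stack. */
static int rebalance(struct fs_state *fs, const struct node *child)
{
	int ret = check_child(fs, child);

	if (ret)
		return ret;
	/* ... the actual rebalancing would happen here ... */
	return 0;
}
```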

btrfs: Don't BUG_ON errors from update_ref_for_cow() · b68dc2a9

Mark Fasheh authored Aug 29, 2011

__btrfs_cow_block(), the only caller of update_ref_for_cow() will BUG_ON()
any error return.  Instead, we can go read-only fs as update_ref_for_cow()
manipulates disk data in a way which doesn't look like it's easily rolled
back.
Signed-off-by: Mark Fasheh <mfasheh@suse.de>

b68dc2a9