Commit baf92114 authored by Hans van Kranenburg, committed by David Sterba

btrfs: alloc_chunk: fix more DUP stripe size handling

Commit 92e222df "btrfs: alloc_chunk: fix DUP stripe size handling"
fixed calculating the stripe_size for a new DUP chunk.

However, the same calculation reappears a bit later, and that one was
not changed yet. The resulting bug that is exposed is that the newly
allocated device extents ('stripes') can have a few MiB overlap with the
next thing stored after them, which is another device extent or the end
of the disk.

The scenario in which this can happen is:
* The block device for the filesystem is less than 10GiB in size.
* The amount of contiguous free unallocated disk space chosen to use for
  chunk allocation is 20% of the total device size, or a few MiB more or
  less.

An example:
- The filesystem device is 7880MiB (max_chunk_size gets set to 788MiB)
- There's 1578MiB unallocated raw disk space left in one contiguous
  piece.

In this case stripe_size is first calculated as 789MiB (half of
1578MiB).

Since 789MiB (stripe_size * data_stripes) > 788MiB (max_chunk_size), we
enter the if block. Now the stripe_size value is immediately overwritten
while calculating an adjusted value based on max_chunk_size, which ends
up as 788MiB.

Next, the value is rounded up to a 16MiB boundary, 800MiB, which is
actually more than the value we had before. However, the last comparison
fails to detect this, because it's comparing the value with the total
amount of free space, which is about twice the size of stripe_size.

In the example above, this means that the resulting raw disk space being
allocated is 1600MiB, while only a gap of 1578MiB has been found. The
second device extent object for this DUP chunk will overlap for 22MiB
with whatever comes next.
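
The arithmetic of this example can be reproduced with a small standalone
model. This is only a sketch under stated assumptions: old_stripe_size(),
round_up_u64() and min_u64() are hypothetical userspace helpers written for
illustration, not kernel functions, and the model covers a single device with
a DUP profile (dev_stripes = 2, data_stripes = 1) and nothing else from
__btrfs_alloc_chunk().

```c
#include <assert.h>
#include <stdint.h>

#define MiB (1024ULL * 1024ULL)

/* Hypothetical helper: round x up to the next multiple of align. */
static uint64_t round_up_u64(uint64_t x, uint64_t align)
{
	assert(align != 0);
	return (x + align - 1) / align * align;
}

/* Hypothetical helper: smaller of two values. */
static uint64_t min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/*
 * Simplified model of the logic before this change.
 * max_avail is the size of the contiguous free gap on the device.
 */
static uint64_t old_stripe_size(uint64_t max_avail, uint64_t max_chunk_size,
				uint64_t dev_stripes, uint64_t data_stripes)
{
	/* Each device extent gets an equal share of the gap. */
	uint64_t stripe_size = max_avail / dev_stripes;

	if (stripe_size * data_stripes > max_chunk_size) {
		/* The per-stripe share computed above is overwritten here. */
		stripe_size = max_chunk_size / data_stripes;
		stripe_size = round_up_u64(stripe_size, 16 * MiB);
		/*
		 * The limit is the whole gap, not the per-stripe share,
		 * so a value that grew during rounding is not caught.
		 */
		stripe_size = min_u64(max_avail, stripe_size);
	}
	return stripe_size;
}
```

Calling old_stripe_size(1578 * MiB, 788 * MiB, 2, 1) yields 800MiB, so the
two device extents need 1600MiB of raw space inside a 1578MiB gap, which is
the 22MiB overlap described above.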

The underlying problem here is that the stripe_size is reused all the
time for different things. So, when entering the code in the if block,
stripe_size is immediately overwritten with something else. When the
previous value was needed again later, the logic to compute it was
copy-pasted back in.

With this change, the value in stripe_size is not unnecessarily
destroyed, so the duplicated calculation is not needed any more.

Signed-off-by: Hans van Kranenburg <hans.van.kranenburg@mendix.com>
Signed-off-by: David Sterba <dsterba@suse.com>
parent 23f0ff1e
@@ -4794,19 +4794,17 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 	/*
 	 * Use the number of data stripes to figure out how big this chunk
 	 * is really going to be in terms of logical address space,
-	 * and compare that answer with the max chunk size
+	 * and compare that answer with the max chunk size. If it's higher,
+	 * we try to reduce stripe_size.
 	 */
 	if (stripe_size * data_stripes > max_chunk_size) {
-		stripe_size = div_u64(max_chunk_size, data_stripes);
-		/* bump the answer up to a 16MB boundary */
-		stripe_size = round_up(stripe_size, SZ_16M);
 		/*
-		 * But don't go higher than the limits we found while searching
-		 * for free extents
+		 * Reduce stripe_size, round it up to a 16MB boundary again and
+		 * then use it, unless it ends up being even bigger than the
+		 * previous value we had already.
 		 */
-		stripe_size = min(devices_info[ndevs - 1].max_avail,
+		stripe_size = min(round_up(div_u64(max_chunk_size,
+						   data_stripes), SZ_16M),
 				  stripe_size);
 	}
...
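
For comparison, the adjusted logic from the hunk above can be modelled the
same way. Again, this is a hedged userspace sketch rather than the kernel
code: new_stripe_size(), round_up_u64() and min_u64() are hypothetical names
introduced for illustration, and plain division stands in for div_u64().

```c
#include <assert.h>
#include <stdint.h>

#define MiB (1024ULL * 1024ULL)

/* Hypothetical helper: round x up to the next multiple of align. */
static uint64_t round_up_u64(uint64_t x, uint64_t align)
{
	assert(align != 0);
	return (x + align - 1) / align * align;
}

/* Hypothetical helper: smaller of two values. */
static uint64_t min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/*
 * Simplified model of the logic after this change.
 * max_avail is the size of the contiguous free gap on the device.
 */
static uint64_t new_stripe_size(uint64_t max_avail, uint64_t max_chunk_size,
				uint64_t dev_stripes, uint64_t data_stripes)
{
	/* Each device extent gets an equal share of the gap. */
	uint64_t stripe_size = max_avail / dev_stripes;

	if (stripe_size * data_stripes > max_chunk_size) {
		/*
		 * The reduced, rounded-up value is only used if it is not
		 * larger than the per-stripe share we already had.
		 */
		stripe_size = min_u64(round_up_u64(max_chunk_size / data_stripes,
						   16 * MiB),
				      stripe_size);
	}
	return stripe_size;
}
```

With the numbers from the commit message, new_stripe_size(1578 * MiB,
788 * MiB, 2, 1) stays at 789MiB, so both device extents together use exactly
the 1578MiB gap. With a much larger gap, for example 4000MiB, the result is
the rounded-up 800MiB, as before.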