Commit 64d2c847 authored by Naohiro Aota, committed by David Sterba

btrfs: zoned: fix calc_available_free_space() for zoned mode

calc_available_free_space() returns the total size of metadata (or
system) block groups that can still be allocated from unallocated disk
space. Its logic is wrong in two places on zoned mode.

First, the calculation of data_chunk_size is wrong. On zoned mode we
always allocate one zone as one chunk and never allocate a partial
zone, so we should use zone_size (= data_sinfo->chunk_size) as it is.

Second, the result "avail" may not be zone aligned. Since we always
allocate one zone as one chunk on zoned mode, the bytes beyond the last
whole zone can never actually be allocated. Returning them inflates the
apparent free space and puts less pressure on the async metadata
reclaim process.

This is serious when the filesystem is nearly full on a device with a
large zone size. Allowing too much over-commit means too little async
reclaim work gets done, and we end up in ENOSPC. Align the result down
to the zone size to avoid that; the sketch below illustrates the
arithmetic.
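
A minimal userspace sketch of that arithmetic, assuming a hypothetical
256MiB zone and 900MiB of computed avail (the ALIGN_DOWN macro is
re-derived here for power-of-two alignments rather than taken from the
kernel headers):

#include <stdio.h>
#include <stdint.h>

#define SZ_1M (1024ULL * 1024ULL)
/* Same rounding the kernel's ALIGN_DOWN() does for power-of-two sizes. */
#define ALIGN_DOWN(x, a) ((x) & ~((uint64_t)(a) - 1))

int main(void)
{
	/* Illustrative numbers only, not from the patch. */
	uint64_t zone_size = 256 * SZ_1M;
	uint64_t avail = 900 * SZ_1M;

	printf("unaligned avail: %llu MiB\n",
	       (unsigned long long)(avail / SZ_1M));
	printf("aligned avail:   %llu MiB\n",
	       (unsigned long long)(ALIGN_DOWN(avail, zone_size) / SZ_1M));
	return 0;
}

The unaligned value reports 900MiB, but only three whole zones (768MiB)
can actually be allocated; the 132MiB difference is exactly the
over-commit slack this patch removes.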

Fixes: cb6cbab7 ("btrfs: adjust overcommit logic when very close to full")
CC: stable@vger.kernel.org # 6.9
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Boris Burkov <boris@bur.io>
Signed-off-by: David Sterba <dsterba@suse.com>
parent 48f091fd
@@ -373,11 +373,18 @@ static u64 calc_available_free_space(struct btrfs_fs_info *fs_info,
 	 * "optimal" chunk size based on the fs size. However when we actually
 	 * allocate the chunk we will strip this down further, making it no more
 	 * than 10% of the disk or 1G, whichever is smaller.
+	 *
+	 * On the zoned mode, we need to use zone_size (=
+	 * data_sinfo->chunk_size) as it is.
 	 */
 	data_sinfo = btrfs_find_space_info(fs_info, BTRFS_BLOCK_GROUP_DATA);
-	data_chunk_size = min(data_sinfo->chunk_size,
-			      mult_perc(fs_info->fs_devices->total_rw_bytes, 10));
-	data_chunk_size = min_t(u64, data_chunk_size, SZ_1G);
+	if (!btrfs_is_zoned(fs_info)) {
+		data_chunk_size = min(data_sinfo->chunk_size,
+				      mult_perc(fs_info->fs_devices->total_rw_bytes, 10));
+		data_chunk_size = min_t(u64, data_chunk_size, SZ_1G);
+	} else {
+		data_chunk_size = data_sinfo->chunk_size;
+	}
 
 	/*
 	 * Since data allocations immediately use block groups as part of the
@@ -405,6 +412,17 @@ static u64 calc_available_free_space(struct btrfs_fs_info *fs_info,
 		avail >>= 3;
 	else
 		avail >>= 1;
+
+	/*
+	 * On the zoned mode, we always allocate one zone as one chunk.
+	 * Returning non-zone size aligned bytes here will result in
+	 * less pressure for the async metadata reclaim process, and it
+	 * will over-commit too much leading to ENOSPC. Align down to the
+	 * zone size to avoid that.
+	 */
+	if (btrfs_is_zoned(fs_info))
+		avail = ALIGN_DOWN(avail, fs_info->zone_size);
+
 	return avail;
 }
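
For reference, the fixed chunk-size selection can be exercised outside
the kernel with a standalone sketch. Everything below is a simplified
re-derivation, not kernel code: mult_perc() is approximated (the kernel
helper computes num * percent / 100), and the 4TiB device, 10GiB chunk
hint, and 256MiB zone are made-up values:

#include <stdio.h>
#include <stdint.h>

#define SZ_1G (1024ULL * 1024ULL * 1024ULL)
#define SZ_1M (1024ULL * 1024ULL)

/* Rough stand-in for the kernel's mult_perc(): num * percent / 100. */
static uint64_t mult_perc(uint64_t num, int percent)
{
	return num / 100 * percent;
}

/*
 * Mirror of the fixed logic: on regular (non-zoned) filesystems the
 * data chunk size is clamped to 10% of the disk and at most 1GiB; on
 * zoned filesystems a chunk is always exactly one zone, so the zone
 * size (== chunk_size) is used as is.
 */
static uint64_t pick_data_chunk_size(int zoned, uint64_t chunk_size,
				     uint64_t total_rw_bytes)
{
	uint64_t size = chunk_size;

	if (!zoned) {
		if (mult_perc(total_rw_bytes, 10) < size)
			size = mult_perc(total_rw_bytes, 10);
		if (size > SZ_1G)
			size = SZ_1G;
	}
	return size;
}

int main(void)
{
	/* Hypothetical device: 4TiB total, 256MiB zones. */
	uint64_t total = 4ULL * 1024 * SZ_1G;

	printf("regular: %llu MiB\n", (unsigned long long)
	       (pick_data_chunk_size(0, 10 * SZ_1G, total) / SZ_1M));
	printf("zoned:   %llu MiB\n", (unsigned long long)
	       (pick_data_chunk_size(1, 256 * SZ_1M, total) / SZ_1M));
	return 0;
}

The point of the zoned branch is that a zoned chunk is always exactly
one zone, so any clamp below the zone size would describe an allocation
the zoned allocator can never make.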