Commit d7a1fee1, authored by Dan Ehrenberg, committed by Theodore Ts'o

ext4: make the preallocation size be a multiple of stripe size

Previously, if a stripe width was provided, then it would be used
as the preallocation granularity, with no sanity checking and no
way to override this. Now, mb_prealloc_size defaults to the smallest
multiple of stripe size that is greater than or equal to the old
default mb_prealloc_size, and this can be overridden with the sysfs
interface.

Signed-off-by: Dan Ehrenberg <dehrenberg@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
parent 265c6a0f
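
The new default described in the commit message can be worked through with a small userspace sketch. It assumes the default of 512 blocks stated in the mballoc.c comment; `roundup_ul` and `default_group_prealloc` are hypothetical names standing in for the kernel's `roundup()` macro and the logic added to `ext4_mb_init()`, not kernel code. Note that rounding up yields the smallest multiple that is greater than or equal to the default, as the commit message says, so a stripe that divides 512 evenly leaves the default unchanged.

```c
#include <assert.h>

/* Default group preallocation in blocks, per the mballoc.c comment. */
#define MB_DEFAULT_GROUP_PREALLOC 512UL

/*
 * Userspace stand-in for the kernel's roundup():
 * smallest multiple of y that is greater than or equal to x.
 */
unsigned long roundup_ul(unsigned long x, unsigned long y)
{
	assert(y > 0);
	return ((x + y - 1) / y) * y;
}

/*
 * Hypothetical helper mirroring the patched initialization:
 * only a stripe larger than one block changes the default.
 */
unsigned long default_group_prealloc(unsigned long s_stripe)
{
	unsigned long prealloc = MB_DEFAULT_GROUP_PREALLOC;

	if (s_stripe > 1)
		prealloc = roundup_ul(prealloc, s_stripe);
	return prealloc;
}
```

Under these assumptions, a stripe of 16 blocks keeps the default at 512, a stripe of 48 blocks raises it to 528 (11 x 48), and a stripe of 1000 blocks raises it to 1000.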
@@ -128,12 +128,13 @@
  * we are doing a group prealloc we try to normalize the request to
  * sbi->s_mb_group_prealloc. Default value of s_mb_group_prealloc is
  * 512 blocks. This can be tuned via
- * /sys/fs/ext4/<partition/mb_group_prealloc. The value is represented in
+ * /sys/fs/ext4/<partition>/mb_group_prealloc. The value is represented in
  * terms of number of blocks. If we have mounted the file system with -O
  * stripe=<value> option the group prealloc request is normalized to the
- * stripe value (sbi->s_stripe)
+ * the smallest multiple of the stripe value (sbi->s_stripe) which is
+ * greater than the default mb_group_prealloc.
  *
- * The regular allocator(using the buddy cache) supports few tunables.
+ * The regular allocator (using the buddy cache) supports a few tunables.
  *
  * /sys/fs/ext4/<partition>/mb_min_to_scan
  * /sys/fs/ext4/<partition>/mb_max_to_scan
@@ -2474,6 +2475,18 @@ int ext4_mb_init(struct super_block *sb, int needs_recovery)
 	sbi->s_mb_stream_request = MB_DEFAULT_STREAM_THRESHOLD;
 	sbi->s_mb_order2_reqs = MB_DEFAULT_ORDER2_REQS;
 	sbi->s_mb_group_prealloc = MB_DEFAULT_GROUP_PREALLOC;
+	/*
+	 * If there is a s_stripe > 1, then we set the s_mb_group_prealloc
+	 * to the lowest multiple of s_stripe which is bigger than
+	 * the s_mb_group_prealloc as determined above. We want
+	 * the preallocation size to be an exact multiple of the
+	 * RAID stripe size so that preallocations don't fragment
+	 * the stripes.
+	 */
+	if (sbi->s_stripe > 1) {
+		sbi->s_mb_group_prealloc = roundup(
+			sbi->s_mb_group_prealloc, sbi->s_stripe);
+	}
 	sbi->s_locality_groups = alloc_percpu(struct ext4_locality_group);
 	if (sbi->s_locality_groups == NULL) {
@@ -2841,8 +2854,9 @@ ext4_mb_mark_diskspace_used(struct ext4_allocation_context *ac,
 /*
  * here we normalize request for locality group
- * Group request are normalized to s_strip size if we set the same via mount
- * option. If not we set it to s_mb_group_prealloc which can be configured via
+ * Group request are normalized to s_mb_group_prealloc, which goes to
+ * s_strip if we set the same via mount option.
+ * s_mb_group_prealloc can be configured via
  * /sys/fs/ext4/<partition>/mb_group_prealloc
  *
  * XXX: should we try to preallocate more than the group has now?
@@ -2853,10 +2867,7 @@ static void ext4_mb_normalize_group_request(struct ext4_allocation_context *ac)
 	struct ext4_locality_group *lg = ac->ac_lg;
 	BUG_ON(lg == NULL);
-	if (EXT4_SB(sb)->s_stripe)
-		ac->ac_g_ex.fe_len = EXT4_SB(sb)->s_stripe;
-	else
-		ac->ac_g_ex.fe_len = EXT4_SB(sb)->s_mb_group_prealloc;
+	ac->ac_g_ex.fe_len = EXT4_SB(sb)->s_mb_group_prealloc;
 	mb_debug(1, "#%u: goal %u blocks for locality group\n",
 		current->pid, ac->ac_g_ex.fe_len);
 }
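
The effect of the change to ext4_mb_normalize_group_request() can be illustrated by comparing the goal length before and after the patch. This is a minimal sketch, not kernel code: `struct sb_info_sketch`, `old_group_goal`, and `new_group_goal` are hypothetical names that model only the two `ext4_sb_info` fields the function reads.

```c
/* Hypothetical stand-in for the two ext4_sb_info fields used here. */
struct sb_info_sketch {
	unsigned long s_stripe;            /* stripe size in blocks, 0 if unset */
	unsigned long s_mb_group_prealloc; /* group preallocation in blocks */
};

/* Goal length before the patch: the stripe wins whenever it is set. */
unsigned long old_group_goal(const struct sb_info_sketch *sbi)
{
	if (sbi->s_stripe)
		return sbi->s_stripe;
	return sbi->s_mb_group_prealloc;
}

/* Goal length after the patch: always s_mb_group_prealloc. */
unsigned long new_group_goal(const struct sb_info_sketch *sbi)
{
	return sbi->s_mb_group_prealloc;
}
```

With a stripe of 4 blocks and a preallocation of 512 blocks, the old logic would ask for only 4 blocks, while the new logic asks for 512. A value written through the sysfs interface replaces `s_mb_group_prealloc`, so in this model only the new logic honors such an override when a stripe is set.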