Commit 1db7959a authored by Qu Wenruo, committed by David Sterba

btrfs: do not wait for short bulk allocation

[BUG]
There is a recent report that when memory pressure is high (including
cached pages), btrfs can spend most of its time on memory allocation in
btrfs_alloc_page_array() for compressed read/write.

[CAUSE]
For btrfs_alloc_page_array() we always go through
alloc_pages_bulk_array(), and even if the bulk allocation comes back
short (it already falls back to single page allocation internally) we
still retry, but only after an extra memalloc_retry_wait().

If the bulk allocation only returns one page at a time, we end up
spending a lot of time in those retry waits.
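
To make the retry pattern concrete, here is a simplified sketch of the
pre-patch loop (error-path cleanup omitted; the full version is in the
removed lines of the diff below):

	for (allocated = 0; allocated < nr_pages;) {
		unsigned int last = allocated;

		/* May return fewer pages than requested under pressure. */
		allocated = alloc_pages_bulk_array(GFP_NOFS | extra_gfp,
						   nr_pages, page_array);
		if (allocated == nr_pages)
			return 0;
		if (allocated == last)
			return -ENOMEM;	/* no progress at all */

		/*
		 * Reached on every short allocation, even when progress
		 * was made: one page per iteration means one wait per page.
		 */
		memalloc_retry_wait(GFP_NOFS);
	}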

The behavior was introduced in commit 395cb57e ("btrfs: wait between
incomplete batch memory allocations").

[FIX]
Although that commit mentioned that other filesystems do the same wait,
that is not the case, at least not nowadays.

All the mainline filesystems only call memalloc_retry_wait() if they
failed to allocate any page at all (and not only for bulk allocation).
If there is any progress, they do not call memalloc_retry_wait() at all.

For example, xfs_buf_alloc_pages() would only call memalloc_retry_wait()
if there is no allocation progress at all, and the call is not for
metadata readahead.
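
For reference, that retry pattern looks roughly like the following (a
paraphrased sketch of the control flow, not the literal
xfs_buf_alloc_pages() code; variable names are illustrative):

	for (;;) {
		unsigned int last = filled;

		filled = alloc_pages_bulk_array(gfp, nr_pages, pages);
		if (filled == nr_pages)
			break;		/* all pages allocated */
		if (filled != last)
			continue;	/* progress, retry right away */

		/* No progress at all: fail readahead, otherwise back off. */
		if (readahead)
			return -ENOMEM;	/* cleanup omitted */
		memalloc_retry_wait(gfp);
	}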

So I don't believe we should call memalloc_retry_wait() unconditionally
for a short allocation.

Only call memalloc_retry_wait() when we fail to allocate any page at
all, and only for tree block allocation (which goes with __GFP_NOFAIL
and may not need the special handling anyway), thus reducing the
latency of btrfs_alloc_page_array().
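
The tree block hunk is not part of the excerpt below; as a rough sketch
of the behavior described above (hypothetical helper name, for
illustration only):

	/* Hypothetical sketch: metadata allocation with __GFP_NOFAIL. */
	static int alloc_tree_block_pages(unsigned int nr_pages,
					  struct page **pages)
	{
		const gfp_t gfp = GFP_NOFS | __GFP_NOFAIL;
		unsigned int allocated = 0;

		while (allocated < nr_pages) {
			unsigned int last = allocated;

			allocated = alloc_pages_bulk_array(gfp, nr_pages, pages);
			/* Back off only when no page at all was allocated. */
			if (allocated == last)
				memalloc_retry_wait(gfp);
		}
		return 0;
	}

For the data path, btrfs_alloc_page_array() now simply returns -ENOMEM
when a round makes no progress, as shown in the diff below.
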
Reported-by: Julian Taylor <julian.taylor@1und1.de>
Tested-by: Julian Taylor <julian.taylor@1und1.de>
Link: https://lore.kernel.org/all/8966c095-cbe7-4d22-9784-a647d1bf27c3@1und1.de/
Fixes: 395cb57e ("btrfs: wait between incomplete batch memory allocations")
CC: stable@vger.kernel.org # 6.1+
Reviewed-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
parent 073bda7a
@@ -681,31 +681,21 @@ static void end_bbio_data_read(struct btrfs_bio *bbio)
 int btrfs_alloc_page_array(unsigned int nr_pages, struct page **page_array,
 			   gfp_t extra_gfp)
 {
+	const gfp_t gfp = GFP_NOFS | extra_gfp;
 	unsigned int allocated;
 
 	for (allocated = 0; allocated < nr_pages;) {
 		unsigned int last = allocated;
 
-		allocated = alloc_pages_bulk_array(GFP_NOFS | extra_gfp,
-						   nr_pages, page_array);
-
-		if (allocated == nr_pages)
-			return 0;
-
-		/*
-		 * During this iteration, no page could be allocated, even
-		 * though alloc_pages_bulk_array() falls back to alloc_page()
-		 * if it could not bulk-allocate. So we must be out of memory.
-		 */
-		if (allocated == last) {
+		allocated = alloc_pages_bulk_array(gfp, nr_pages, page_array);
+		if (unlikely(allocated == last)) {
+			/* No progress, fail and do cleanup. */
 			for (int i = 0; i < allocated; i++) {
 				__free_page(page_array[i]);
 				page_array[i] = NULL;
 			}
 			return -ENOMEM;
 		}
-
-		memalloc_retry_wait(GFP_NOFS);
 	}
 	return 0;
 }