Commit 600e19af authored by Roman Gushchin, committed by Linus Torvalds

mm: use only per-device readahead limit

The maximum readahead size is currently limited by two values:
 1) a global 2 MB constant (MAX_READAHEAD in max_sane_readahead())
 2) a configurable per-device value (bdi->ra_pages)
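
A rough userspace sketch of how the two limits combine before this patch. It assumes 4 KiB pages; PAGE_SIZE_BYTES and effective_readahead_pages() are hypothetical names used only for illustration, not kernel symbols:

```c
/* Assumption: 4 KiB pages (the kernel uses PAGE_CACHE_SIZE here). */
#define PAGE_SIZE_BYTES 4096UL

/* Global cap from the old code: 2 MB expressed in pages. */
#define MAX_READAHEAD ((512 * 4096) / PAGE_SIZE_BYTES)

/*
 * Hypothetical helper: the old behaviour clamps the per-device
 * value (bdi->ra_pages) to the global cap.
 */
static unsigned long effective_readahead_pages(unsigned long bdi_ra_pages)
{
	return bdi_ra_pages < MAX_READAHEAD ? bdi_ra_pages : MAX_READAHEAD;
}
```

Under these assumptions a device limit of 32 pages is kept as is, while any device limit above 512 pages is cut down to 512.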

Some devices require a custom readahead limit.
For instance, for RAID arrays it is calculated as the number of devices
multiplied by the chunk size times 2.
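
The RAID formula above can be sketched as plain arithmetic. raid_ra_pages() is a hypothetical helper that follows the wording of this message; it is not the md driver's actual code:

```c
/*
 * Hypothetical helper: readahead limit in pages for a RAID array,
 * computed as number of devices * chunk size * 2.
 */
static unsigned long raid_ra_pages(unsigned int nr_devices,
				   unsigned long chunk_bytes,
				   unsigned long page_bytes)
{
	unsigned long chunk_pages = chunk_bytes / page_bytes;

	return 2UL * nr_devices * chunk_pages;
}
```

With 8 devices, a 512 KiB chunk and 4 KiB pages (the chunk and page sizes are assumed values), this gives 2048 pages, i.e. 8 MiB, which is four times the global 2 MB cap.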

The readahead size can never be larger than bdi->ra_pages * 2
(POSIX_FADV_SEQUENTIAL doubles the readahead size).

If so, why do we need two limits?
I suggest removing max_sane_readahead() completely and
using the per-device readahead limit everywhere.

Also, using the right readahead size for RAID disks can significantly
increase I/O performance:

before:
  $ dd if=/dev/md2 of=/dev/null bs=100M count=100
  100+0 records in
  100+0 records out
  10485760000 bytes (10 GB) copied, 12.9741 s, 808 MB/s

after:
  $ dd if=/dev/md2 of=/dev/null bs=100M count=100
  100+0 records in
  100+0 records out
  10485760000 bytes (10 GB) copied, 8.91317 s, 1.2 GB/s

(This is an 8-disk RAID5 array.)
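
The throughput figures can be cross-checked from the byte counts and timings that dd printed. This is a small sketch; throughput_mb_s() and speedup() are hypothetical helper names, and MB means 10^6 bytes as in dd's output:

```c
/* Hypothetical helper: throughput in MB/s (1 MB = 10^6 bytes). */
static double throughput_mb_s(double bytes, double seconds)
{
	return bytes / seconds / 1e6;
}

/* Hypothetical helper: how many times faster the second run was. */
static double speedup(double seconds_before, double seconds_after)
{
	return seconds_before / seconds_after;
}
```

Feeding in 10485760000 bytes with 12.9741 s and 8.91317 s gives roughly 808 MB/s and 1176 MB/s, so the run after the patch is about 1.46 times faster.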

This patch doesn't change sys_readahead and madvise(MADV_WILLNEED)
behavior introduced by 6d2be915 ("mm/readahead.c: fix readahead
failure for memoryless NUMA nodes and limit readahead pages").

Signed-off-by: Roman Gushchin <klamm@yandex-team.ru>
Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
parent b171e409
@@ -2036,8 +2036,6 @@ void page_cache_async_readahead(struct address_space *mapping,
 				pgoff_t offset,
 				unsigned long size);
-unsigned long max_sane_readahead(unsigned long nr);
 /* Generic expand stack which grows the stack according to GROWS{UP,DOWN} */
 extern int expand_stack(struct vm_area_struct *vma, unsigned long address);
......
@@ -1807,7 +1807,6 @@ static void do_sync_mmap_readahead(struct vm_area_struct *vma,
 				   struct file *file,
 				   pgoff_t offset)
 {
-	unsigned long ra_pages;
 	struct address_space *mapping = file->f_mapping;
 	/* If we don't want any read-ahead, don't bother */
@@ -1836,10 +1835,9 @@ static void do_sync_mmap_readahead(struct vm_area_struct *vma,
 	/*
 	 * mmap read-around
 	 */
-	ra_pages = max_sane_readahead(ra->ra_pages);
-	ra->start = max_t(long, 0, offset - ra_pages / 2);
-	ra->size = ra_pages;
-	ra->async_size = ra_pages / 4;
+	ra->start = max_t(long, 0, offset - ra->ra_pages / 2);
+	ra->size = ra->ra_pages;
+	ra->async_size = ra->ra_pages / 4;
 	ra_submit(ra, mapping, file);
 }
......
@@ -213,7 +213,7 @@ int force_page_cache_readahead(struct address_space *mapping, struct file *filp,
 	if (unlikely(!mapping->a_ops->readpage && !mapping->a_ops->readpages))
 		return -EINVAL;
-	nr_to_read = max_sane_readahead(nr_to_read);
+	nr_to_read = min(nr_to_read, inode_to_bdi(mapping->host)->ra_pages);
 	while (nr_to_read) {
 		int err;
@@ -232,16 +232,6 @@ int force_page_cache_readahead(struct address_space *mapping, struct file *filp,
 	return 0;
 }
-#define MAX_READAHEAD ((512*4096)/PAGE_CACHE_SIZE)
-/*
- * Given a desired number of PAGE_CACHE_SIZE readahead pages, return a
- * sensible upper limit.
- */
-unsigned long max_sane_readahead(unsigned long nr)
-{
-	return min(nr, MAX_READAHEAD);
-}
 /*
  * Set the initial window size, round to next power of 2 and square
  * for small size, x 4 for medium, and x 2 for large
@@ -380,7 +370,7 @@ ondemand_readahead(struct address_space *mapping,
 		   bool hit_readahead_marker, pgoff_t offset,
 		   unsigned long req_size)
 {
-	unsigned long max = max_sane_readahead(ra->ra_pages);
+	unsigned long max = ra->ra_pages;
 	pgoff_t prev_offset;
 	/*
......