- 23 Sep, 2002 3 commits
- 22 Sep, 2002 32 commits
-
-
Linus Torvalds authored
into home.transmeta.com:/home/torvalds/v2.5/linux
-
Linus Torvalds authored
into home.transmeta.com:/home/torvalds/v2.5/linux
-
Andrew Morton authored
Convert the VM to not wait on other people's dirty data.

  - If we find a dirty page and its queue is not congested, do some writeback.
  - If we find a dirty page and its queue _is_ congested then just refile the page.
  - If we find a PageWriteback page then just refile the page.
  - There is additional throttling for write(2) callers. Within generic_file_write(), record their backing queue in ->current. Within page reclaim, if this task encounters a page which is dirty or under writeback on this queue, block on it. This gives some more writer throttling and reduces the page refiling frequency.

It's somewhat CPU expensive - under really heavy load we only get a 50% reclaim rate in pages coming off the tail of the LRU. This can be fixed by splitting the inactive list into reclaimable and non-reclaimable lists. But the CPU load isn't too bad, and latency is much, much more important in these situations.

Example: with `mem=512m', running 4 instances of `dbench 100', 2.5.34 took 35 minutes to compile a kernel. With this patch, it took three minutes, 45 seconds.

I haven't done swapcache or MAP_SHARED pages yet. If there's tons of dirty swapcache or mmap data around we still stall heavily in page reclaim. That's less important.

This patch also has a tweak for swapless machines: don't even bother bringing anon pages onto the inactive list if there is no swap online.
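The reclaim rules listed in this message can be sketched as a small decision function. This is a userspace illustration only, not the kernel code: `struct lru_page`, `struct reclaim_task`, `NO_QUEUE` and `toy_reclaim_decision()` are hypothetical names, and checking the write(2) throttling rule before the refile rules is an assumption about ordering that the message does not spell out.

```c
#include <stdbool.h>

/* What page reclaim does with a page taken off the tail of the LRU. */
enum reclaim_action {
    RECLAIM_PAGE,     /* clean page: it can simply be freed */
    START_WRITEBACK,  /* dirty page, queue not congested */
    REFILE_PAGE,      /* put the page back on the list and move on */
    BLOCK_ON_PAGE,    /* throttle a write(2) caller on its own queue */
};

#define NO_QUEUE (-1)

/* Hypothetical, simplified view of a page and its backing queue. */
struct lru_page {
    bool dirty;
    bool under_writeback;
    int  queue;            /* id of the backing request queue */
    bool queue_congested;  /* is that queue write-congested? */
};

/* Hypothetical task state: the queue recorded while inside write(2). */
struct reclaim_task {
    int backing_queue;     /* NO_QUEUE when not inside a write(2) call */
};

enum reclaim_action toy_reclaim_decision(const struct reclaim_task *task,
                                         const struct lru_page *page)
{
    bool busy = page->dirty || page->under_writeback;

    if (!busy)
        return RECLAIM_PAGE;

    /* Writers are throttled only on pages belonging to their own queue. */
    if (task->backing_queue != NO_QUEUE && task->backing_queue == page->queue)
        return BLOCK_ON_PAGE;

    /* Never wait on somebody else's writeback: refile instead. */
    if (page->under_writeback)
        return REFILE_PAGE;

    /* Dirty page: write it only if that will not block on the queue. */
    return page->queue_congested ? REFILE_PAGE : START_WRITEBACK;
}
```

A task that is not inside write(2) never gets `BLOCK_ON_PAGE` in this sketch, which is the point of the change: only the heavy writer waits, and only on its own queue.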
-
Andrew Morton authored
The key concept here is that pdflush does not block on request queues any more. Instead, it circulates across the queues, keeping any non-congested queues full of write data. When all queues are full, pdflush takes a nap, to be woken when *any* queue exits write congestion.

This code can keep sixty spindles saturated - we've never been able to do that before.

  - Add the `nonblocking' flag to struct writeback_control, and teach the writeback paths to honour it.
  - Add the `encountered_congestion' flag to struct writeback_control and teach the writeback paths to set it. So as soon as a mapping's backing_dev_info indicates that it is getting congested, bale out of writeback. And don't even start writeback against filesystems whose queues are congested.
  - Convert pdflush's background_writeback() function to use nonblocking writeback. This way, a single pdflush thread will circulate around all the dirty queues, keeping them filled.
  - Convert the pdflush `kupdate' function to do the same thing.

This solves the problem of pdflush thread pool exhaustion. It solves the problem of pdflush startup latency. It solves the (minor) problem wherein `kupdate' writeback only writes back a single disk at a time (it was getting blocked on each queue in turn). It probably means that we only ever need a single pdflush thread.
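The circulate-and-nap behaviour described in this message can be sketched in userspace C. Only the two flag names, `nonblocking` and `encountered_congestion`, come from the message; `struct toy_wbc`, `struct toy_bdi`, the slot-counting model of congestion and all function names are assumptions made for illustration, and a real blocking caller would sleep where the sketch just frees one slot.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cut-down writeback control block. */
struct toy_wbc {
    bool nonblocking;             /* caller must not block on a queue */
    bool encountered_congestion;  /* set when a queue turned us away */
    long nr_to_write;             /* remaining page budget */
};

/* Hypothetical queue model: congested when no request slot is free. */
struct toy_bdi {
    unsigned int dirty_pages;
    unsigned int free_slots;
};

/* Write back one queue until it is clean, congested, or the budget ends. */
static long toy_writeback_queue(struct toy_bdi *bdi, struct toy_wbc *wbc)
{
    long written = 0;

    while (bdi->dirty_pages > 0 && wbc->nr_to_write > 0) {
        if (bdi->free_slots == 0) {
            if (wbc->nonblocking) {
                /* Bail out and let the caller try the next queue. */
                wbc->encountered_congestion = true;
                break;
            }
            /* Blocking caller: pretend we slept until a request completed. */
            bdi->free_slots = 1;
        }
        bdi->free_slots--;
        bdi->dirty_pages--;
        wbc->nr_to_write--;
        written++;
    }
    return written;
}

/* One pass of a single flusher thread across every queue. */
long toy_background_pass(struct toy_bdi *queues, size_t nr_queues,
                         struct toy_wbc *wbc)
{
    long total = 0;

    wbc->encountered_congestion = false;
    for (size_t i = 0; i < nr_queues; i++)
        total += toy_writeback_queue(&queues[i], wbc);
    return total;
}

/* Nap only when nothing could be written because queues were congested. */
bool toy_should_nap(long written, const struct toy_wbc *wbc)
{
    return written == 0 && wbc->encountered_congestion;
}
```

With `nonblocking` set, one slow queue costs the thread a single failed check per pass instead of a sleep, so the remaining queues keep being fed.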
-
Andrew Morton authored
Use the new queue congestion detector in ext2_preread_inode(). Don't try the speculative read if the read queue is congested. Also, don't try it if the disk is write-congested. Presumably it is more important to get the dirty memory cleaned out.
-
Andrew Morton authored
The patch provides a means for the VM to be able to determine whether a request queue is in a "congested" state. If it is congested, then a write to (or read from) the queue may cause blockage in get_request_wait().

So the VM can do:

    if (!bdi_write_congested(page->mapping->backing_dev_info))
        writepage(page);

This is not exact. The code assumes that if the request queue still has 1/4 of its capacity (queue_nr_requests) available then a request will be non-blocking. There is a small chance that another CPU could zoom in and consume those requests. But on the rare occasions where that may happen the result will merely be some unexpected latency - it's not worth doing anything elaborate to prevent this.

The patch decreases the size of `batch_requests'. batch_requests is positively harmful - when a "heavy" writer and a "light" writer are both writing to the same queue, batch_requests provides a means for the heavy writer to massively stall the light writer. Instead of waiting for one or two requests to come free, the light writer has to wait for 32 requests to complete.

Plus batch_requests generally makes things harder to tune, understand and predict. I wanted to kill it altogether, but Jens says that it is important for some hardware - it allows decent size requests to be submitted.

The VM changes which go along with this code cause batch_requests to be not so painful anyway - the only processes which sleep in get_request_wait() are the ones which we elect, by design, to wait in there - typically heavy writers.

The patch changes the meaning of `queue_nr_requests'. It used to mean "total number of requests per queue". Half of these are for reads, and half are for writes. This always confused the heck out of me, and the code needs to divide queue_nr_requests by two all over the place.

So queue_nr_requests now means "the number of write requests per queue" and "the number of read requests per queue". i.e. I halved it.

Also, queue_nr_requests was converted to static scope. Nothing else uses it.

The accuracy of bdi_read_congested() and bdi_write_congested() depends upon the accuracy of mapping->backing_dev_info. With complex block stacking arrangements it is possible that ->backing_dev_info is pointing at the wrong queue. I don't know.

But the cost of getting this wrong is merely latency, and if it is a problem we can fix it up in the block layer, by getting stacking devices to communicate their congestion state upwards in some manner.
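The quarter-of-capacity rule from this message can be sketched in userspace C. This is an illustration under stated assumptions, not the block-layer code: `struct toy_queue` and the `toy_*` functions are hypothetical names, and reading "still has 1/4 of its capacity available" as "congested once fewer than a quarter of the per-direction requests are free" is an interpretation of the text.

```c
#include <stdbool.h>

/* Hypothetical request queue with separate read and write pools. */
struct toy_queue {
    unsigned int nr_requests;  /* requests per direction (the halved meaning) */
    unsigned int free_reads;   /* unused read requests */
    unsigned int free_writes;  /* unused write requests */
};

/* Congested when fewer than 1/4 of the requests remain free. */
static bool toy_congested(unsigned int free, unsigned int total)
{
    return free * 4 < total;
}

bool toy_write_congested(const struct toy_queue *q)
{
    return toy_congested(q->free_writes, q->nr_requests);
}

bool toy_read_congested(const struct toy_queue *q)
{
    return toy_congested(q->free_reads, q->nr_requests);
}

/*
 * Mirror of the caller pattern quoted in the message: only start a write
 * when it is unlikely to block. Returns true if the page was submitted.
 */
bool toy_maybe_writepage(const struct toy_queue *q, unsigned int *pages_written)
{
    if (toy_write_congested(q))
        return false;
    (*pages_written)++;
    return true;
}
```

The check is advisory in the same sense as in the message: nothing stops another submitter from consuming the free requests between the test and the write, and the only cost of losing that race is latency.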
-
Andrew Morton authored
__set_page_dirty_buffers() is calling __mark_inode_dirty under mapping->private_lock. We don't need to hold ->private_lock across that call. It's only there to pin page->buffers. This simplifies the VM locking hierarchy.
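The locking change can be illustrated with a toy model that records whether the lock was held when the inode was marked dirty. Everything here is hypothetical: `struct toy_lock` is a plain flag rather than a real spinlock, and `struct toy_mapping`, `struct buf_page` and the `toy_*` functions only mimic the shape of the call sequence described above.

```c
#include <stdbool.h>

/* Hypothetical lock: a flag is enough to observe who holds it. */
struct toy_lock {
    bool held;
};

static void toy_lock_acquire(struct toy_lock *l) { l->held = true; }
static void toy_lock_release(struct toy_lock *l) { l->held = false; }

struct toy_mapping {
    struct toy_lock private_lock;   /* pins the page's buffers */
    bool inode_dirty;
    bool inode_dirtied_under_lock;  /* instrumentation for the sketch */
};

struct buf_page {
    struct toy_mapping *mapping;
    bool buffers_dirty;
    bool dirty;
};

/* Stand-in for marking the inode dirty; notes the lock state it saw. */
static void toy_mark_inode_dirty(struct toy_mapping *m)
{
    m->inode_dirty = true;
    m->inode_dirtied_under_lock = m->private_lock.held;
}

void toy_set_page_dirty_buffers(struct buf_page *page)
{
    struct toy_mapping *m = page->mapping;

    /* The lock is only needed while walking the page's buffers. */
    toy_lock_acquire(&m->private_lock);
    page->buffers_dirty = true;
    toy_lock_release(&m->private_lock);

    /* Dirty the page and inode after the lock has been dropped. */
    page->dirty = true;
    toy_mark_inode_dirty(m);
}
```

Keeping the inode call outside the critical section means the inode-dirtying path never nests inside this lock, which is what removes one edge from the lock ordering.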
-
Andrew Morton authored
When I converted ext3 to use direct-to-BIO writeback for data=writeback mode I forgot that we need to hold a transaction open on behalf of MAP_SHARED pages. The filesystem is BUGging in get_block() because there is no transaction open. So let's forget that idea for now and send data=writeback mode back to ext3_writepage.
-
David S. Miller authored
into nuts.ninka.net:/home/davem/src/BK/sparc-2.5
-
David S. Miller authored
into nuts.ninka.net:/home/davem/src/BK/net-2.5
-
David S. Miller authored
into nuts.ninka.net:/home/davem/src/BK/net-2.5
-
Arnaldo Carvalho de Melo authored
Slowly killing the ugly struct forest.
-
Arnaldo Carvalho de Melo authored
-
Arnaldo Carvalho de Melo authored
With this llc_ui_sockets is almost not needed anymore, next changesets will deal with the dataunit/xid/test primitives, that are still using it.
-
Arnaldo Carvalho de Melo authored
-
David S. Miller authored
-
Arnaldo Carvalho de Melo authored
It is the same bug fixed some months ago in tcp_v6_get_port, i.e. we can't touch ipv6 private areas without checking if the socket is AF_INET6.
-
David S. Miller authored
-
David S. Miller authored
into nuts.ninka.net:/home/davem/src/BK/net-2.5
-
Alexander Viro authored
it is an ex-parrot
-
Alexander Viro authored
assorted compile fixes
-
Alexander Viro authored
mtdblock switched to use of gendisks + compile fixes
-
Alexander Viro authored
z2ram.c switched to use of gendisks
-
Alexander Viro authored
ataflop.c switched to use of gendisks
-
Alexander Viro authored
amiflop.c switched to use of gendisks
-
Alexander Viro authored
macroectomy a-la pf.c and pcd.c ones, ditto for passing pointers to structures instead of minors.
-
Alexander Viro authored
pd.c fed through Lindent
-
Alexander Viro authored
dumb expansion of macro - it had #define CURRENT current_req
-
Alexander Viro authored
tapeblock never assigns anything to its elements of blk_size[][]; we need not bother allocating it in the first place.
-
Alexander Viro authored
More trivial fixes: typos in partitions/check.c, block/floppy.c and acorn/block/fd1772.c + replacement of #define with inline in block/floppy.c (fd_eject()).
-
Adrian Bunk authored
Some trivial fixes for some typos introduced by Al's gendisk changes:

  - missing comma in cdu31a
  - missing semicolon in cdu31a
  - comma instead of colon in gscd
  - semicolon instead of comma in mcd
  - missing closing bracket in sonycd535
-
Linus Torvalds authored
From Andries.
-
- 21 Sep, 2002 5 commits
-
-
Harald Welte authored
  - Fix module usage counting for ip6_tables.o
  - make ipt_ULOG compile on SPARC
  - fix experimental ipt_unclean match, do not consider udp w/o csum unclean
-
Andrew Morton authored
From davem: replace `unsigned' with size_t.
-
David S. Miller authored
into nuts.ninka.net:/home/davem/src/BK/net-2.5
-
Petr Vandrovec authored
-
http://gkernel.bkbits.net/net-drivers-2.5
Linus Torvalds authored
into home.transmeta.com:/home/torvalds/v2.5/linux
-