Commits · c9b22619390dfa338193a704109a29d93bbcfd00 · nexedi / linux

22 Sep, 2002 18 commits

[PATCH] use the congestion APIs in pdflush · c9b22619

Andrew Morton authored Sep 22, 2002

The key concept here is that pdflush does not block on request queues
any more.  Instead, it circulates across the queues, keeping any
non-congested queues full of write data.  When all queues are full,
pdflush takes a nap, to be woken when *any* queue exits write
congestion.

This code can keep sixty spindles saturated - we've never been able to
do that before.

 - Add the `nonblocking' flag to struct writeback_control, and teach
   the writeback paths to honour it.

 - Add the `encountered_congestion' flag to struct writeback_control
   and teach the writeback paths to set it.

So as soon as a mapping's backing_dev_info indicates that it is getting
congested, bale out of writeback.  And don't even start writeback
against filesystems whose queues are congested.

 - Convert pdflush's background_writeback() function to use
   nonblocking writeback.

This way, a single pdflush thread will circulate around all the
dirty queues, keeping them filled.

 - Convert the pdflush `kupdate' function to do the same thing.

This solves the problem of pdflush thread pool exhaustion.

It solves the problem of pdflush startup latency.

It solves the (minor) problem wherein `kupdate' writeback only writes
back a single disk at a time (it was getting blocked on each queue in
turn).

It probably means that we only ever need a single pdflush thread.
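
The circulating behaviour described above can be sketched in a few lines of userspace C. This is only a toy model, not the kernel code: `struct toy_queue`, `struct toy_wbc`, `toy_write_congested()` and `toy_writeback_pass()` are hypothetical names, and "congested" is simplified here to "no free request slots". Only the two flag names, `nonblocking` and `encountered_congestion`, come from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a request queue; not the kernel's structure. */
struct toy_queue {
	int free_slots;   /* write requests still available */
	int dirty_pages;  /* pages waiting to be written to this queue */
};

/* Carries the two flags named in the commit message. */
struct toy_wbc {
	bool nonblocking;             /* caller refuses to wait on a full queue */
	bool encountered_congestion;  /* set when a congested queue was skipped */
};

/* Assumption: a queue counts as congested once it has no free slots. */
static bool toy_write_congested(const struct toy_queue *q)
{
	return q->free_slots <= 0;
}

/*
 * One pass over all queues.  Returns the number of pages submitted.
 * In nonblocking mode a congested queue is skipped, not waited on.
 */
static int toy_writeback_pass(struct toy_queue *queues, size_t nr,
			      struct toy_wbc *wbc)
{
	int submitted = 0;

	for (size_t i = 0; i < nr; i++) {
		struct toy_queue *q = &queues[i];

		while (q->dirty_pages > 0) {
			if (toy_write_congested(q)) {
				if (!wbc->nonblocking)
					return submitted;  /* a blocking caller would sleep here */
				wbc->encountered_congestion = true;
				break;  /* move on to the next queue */
			}
			q->free_slots--;
			q->dirty_pages--;
			submitted++;
		}
	}
	return submitted;
}
```

A caller in the spirit of the message would repeat the pass and, whenever `encountered_congestion` comes back set with no progress made, sleep until some queue leaves congestion.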

c9b22619

[PATCH] use the queue congestion API in ext2_preread_inode() · f3332384

Andrew Morton authored Sep 22, 2002

Use the new queue congestion detector in ext2_preread_inode(). Don't
try the speculative read if the read queue is congested.

Also, don't try it if the disk is write-congested. Presumably it is
more important to get the dirty memory cleaned out.
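
The decision described above amounts to a two-part gate. The sketch below is a standalone illustration, not the ext2 code: `struct toy_bdi` and `toy_should_preread()` are hypothetical, and the two booleans stand in for whatever the congestion API reports.

```c
#include <stdbool.h>

/* Hypothetical snapshot of a device's congestion state. */
struct toy_bdi {
	bool read_congested;
	bool write_congested;
};

/* Returns true when a speculative inode-block read is worth issuing. */
static bool toy_should_preread(const struct toy_bdi *bdi)
{
	if (bdi->read_congested)
		return false;  /* the read itself could block */
	if (bdi->write_congested)
		return false;  /* let dirty memory drain first */
	return true;
}
```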

f3332384

[PATCH] infrastructure for monitoring queue congestion state · 4cef1b04

Andrew Morton authored Sep 22, 2002

The patch provides a means for the VM to be able to determine whether a
request queue is in a "congested" state.  If it is congested, then a
write to (or read from) the queue may cause blockage in
get_request_wait().

So the VM can do:

	if (!bdi_write_congested(page->mapping->backing_dev_info))
		writepage(page);

This is not exact.  The code assumes that if the request queue still
has 1/4 of its capacity (queue_nr_requests) available then a request
will be non-blocking.  There is a small chance that another CPU could
zoom in and consume those requests.  But on the rare occasions where
that may happen the result will merely be some unexpected latency -
it's not worth doing anything elaborate to prevent this.
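
As a rough illustration of the 1/4 rule, the check can be written as a single comparison. This is a sketch under assumptions: `toy_queue_congested()` is a hypothetical helper, the strict `<` boundary is a guess, and the real patch may well use different on/off thresholds.

```c
#include <stdbool.h>

/*
 * Hypothetical check: treat the queue as congested once fewer than a
 * quarter of its requests are still free.
 */
static bool toy_queue_congested(int free_requests, int queue_nr_requests)
{
	return free_requests < queue_nr_requests / 4;
}
```

With 64 requests per direction, for example, the toy check reports congestion once fewer than 16 remain free.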

The patch decreases the size of `batch_requests'.  batch_requests is
positively harmful - when a "heavy" writer and a "light" writer are
both writing to the same queue, batch_requests provides a means for the
heavy writer to massively stall the light writer.  Instead of waiting
for one or two requests to come free, the light writer has to wait for
32 requests to complete.

Plus batch_requests generally makes things harder to tune, understand
and predict.  I wanted to kill it altogether, but Jens says that it is
important for some hardware - it allows decent size requests to be
submitted.

The VM changes which go along with this code cause batch_requests to be
not so painful anyway - the only processes which sleep in
get_request_wait() are the ones which we elect, by design, to wait in
there - typically heavy writers.


The patch changes the meaning of `queue_nr_requests'.  It used to mean
"total number of requests per queue".  Half of these are for reads, and
half are for writes.  This always confused the heck out of me, and the
code needs to divide queue_nr_requests by two all over the place.

So queue_nr_requests now means "the number of write requests per queue"
and "the number of read requests per queue".  ie: I halved it.

Also, queue_nr_requests was converted to static scope.  Nothing else
uses it.


The accuracy of bdi_read_congested() and bdi_write_congested() depends
upon the accuracy of mapping->backing_dev_info.  With complex block
stacking arrangements it is possible that ->backing_dev_info is
pointing at the wrong queue.  I don't know.

But the cost of getting this wrong is merely latency, and if it is a
problem we can fix it up in the block layer, by getting stacking
devices to communicate their congestion state upwards in some manner.

4cef1b04

[PATCH] don't hold mapping->private_lock while marking a page dirty · b5742733

Andrew Morton authored Sep 22, 2002

__set_page_dirty_buffers() is calling __mark_inode_dirty under
mapping->private_lock.

We don't need to hold ->private_lock across that call.  It's only there
to pin page->buffers.

This simplifies the VM locking hierarchy.

b5742733

[PATCH] fix ext3 in data=writeback mode · c8b254cc

Andrew Morton authored Sep 22, 2002

When I converted ext3 to use direct-to-BIO writeback for
data=writeback mode I forgot that we need to hold a transaction open on
behalf of MAP_SHARED pages.  The filesystem is BUGging in get_block()
because there is no transaction open.

So let's forget that idea for now and send data=writeback mode back to
ext3_writepage.

c8b254cc

[PATCH] blk_size[] is gone · f5076217
Alexander Viro authored Sep 21, 2002
```
it is an ex-parrot
```
f5076217
[PATCH] compile fixes for ftl · 981de136
Alexander Viro authored Sep 21, 2002
```
assorted compile fixes
```
981de136
[PATCH] gendisk for mtdblock · b55a9a52
Alexander Viro authored Sep 21, 2002
```
mtdblock switched to use of gendisks + compile fixes
```
b55a9a52
[PATCH] gendisk for z2ram · 6ebe755c
Alexander Viro authored Sep 21, 2002
```
z2ram.c switched to use of gendisks
```
6ebe755c
[PATCH] gendisk for ataflop · 3f028def
Alexander Viro authored Sep 21, 2002
```
ataflop.c switched to use of gendisks
```
3f028def
[PATCH] gendisk for amiflop · b7264cd3
Alexander Viro authored Sep 21, 2002
```
amiflop.c switched to use of gendisks
```
b7264cd3

[PATCH] cleanup of pd.c · 8e273c4e

Alexander Viro authored Sep 21, 2002

macroectomy a-la pf.c and pcd.c ones, ditto for passing pointers to
structures instead of minors.

8e273c4e

[PATCH] Lindent pd.c · 91c42c4d
Alexander Viro authored Sep 21, 2002
```
pd.c fed through Lindent
```
91c42c4d
[PATCH] kills CURRENT in floppy.c · f299adc6
Alexander Viro authored Sep 21, 2002
```
dumb expansion of macro - it had #define CURRENT current_req
```
f299adc6

[PATCH] tapeblock blk_size removal · 0beb090c

Alexander Viro authored Sep 21, 2002

tapeblock never assigns anything to its elements of blk_size[][], so there
is no need to allocate it in the first place.

0beb090c

[PATCH] Re: Linux 2.5.38 · 1751d060

Alexander Viro authored Sep 21, 2002

More trivial fixes: typos in partitions/check.c, block/floppy.c and
acorn/block/fd1772.c + replacement of #define with inline in block/floppy.c
(fd_eject()).

1751d060

[PATCH] gendisk typo fixes · e2d496c5

Adrian Bunk authored Sep 21, 2002

Some trivial fixes for some typos introduced by Al's gendisk changes.

 - missing comma in cdu31a
 - missing semicolon in cdu31a
 - comma instead of colon in gscd
 - semicolon instead of comma in mcd
 - missing closing bracket in sonycd535

e2d496c5

IDE: Try to use PCI dma_mask only if the device actually _is_ PCI. · b890a32b
Linus Torvalds authored Sep 21, 2002
```
From Andries.
```
b890a32b

21 Sep, 2002 22 commits
- [PATCH] 64-bit type correctness in filemap.c · dd2ad358
  Andrew Morton authored Sep 21, 2002
```
From davem: replace `unsigned' with size_t.
```
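
The kind of bug this guards against can be shown in isolation. On common 64-bit Linux ABIs `unsigned` is 32 bits wide while `size_t` is 64, so a byte count above 4 GiB is silently truncated when stored in the narrower type. The sketch below uses fixed-width types as stand-ins so that it behaves the same on any host; both function names are hypothetical and nothing here is taken from filemap.c.

```c
#include <stdint.h>

/* Models storing a byte count in a 32-bit `unsigned`. */
static uint64_t toy_store_in_unsigned(uint64_t nbytes)
{
	uint32_t narrow = (uint32_t)nbytes;  /* high 32 bits are discarded */
	return narrow;
}

/* Models storing the same count in a 64-bit size_t. */
static uint64_t toy_store_in_size_t(uint64_t nbytes)
{
	uint64_t wide = nbytes;  /* full value preserved */
	return wide;
}
```

A 5 GiB count, for instance, comes back as 1 GiB from the first function and unchanged from the second.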
  dd2ad358
- Merge http://gkernel.bkbits.net/net-drivers-2.5 · fa63ebcc
  Linus Torvalds authored Sep 21, 2002
```
into home.transmeta.com:/home/torvalds/v2.5/linux
```
  fa63ebcc
- Merge mandrakesoft.com:/home/jgarzik/repo/linus-2.5 · cc777197
  Jeff Garzik authored Sep 21, 2002
```
into mandrakesoft.com:/home/jgarzik/repo/net-drivers-2.5
```
  cc777197
- Merge mandrakesoft.com:/home/jgarzik/repo/linus-2.5 · 27d8034c
  Jeff Garzik authored Sep 21, 2002
```
into mandrakesoft.com:/home/jgarzik/repo/net-drivers-2.5
```
  27d8034c
- airo wireless: Fixes signal level retrieval in SPY mode · e9d52616
  Javier Achirica authored Sep 21, 2002
```
(releases memory block after read out)
```
  e9d52616
- airo wireless: fix "non-probe mode" setup · 11c72f75
  Javier Achirica authored Sep 21, 2002
  
  11c72f75
- airo wireless: power down on if down, define local 'ai' to fix build · 3c32c3e7
  Javier Achirica authored Sep 21, 2002
  
  3c32c3e7
- airo wireless: more verbose MAC-enable errors · 6df2eeec
  Javier Achirica authored Sep 21, 2002
  
  6df2eeec
- airo wireless: disable card while prom flashing is in progress · c9907289
  Javier Achirica authored Sep 21, 2002
```
[note - more work needs to be done here, but this is better than
nothing -jgarzik]
```
  c9907289
- airo wireless: use ETH_ALEN constant where appropriate · 3f370c85
  Javier Achirica authored Sep 21, 2002
  
  3f370c85
- Linux v2.5.38 · cc8f2609
  Linus Torvalds authored Sep 21, 2002
  
  cc8f2609
- Avoid confusion "mount" and "fsck" - don't show things like · d4c9cb10
  Linus Torvalds authored Sep 21, 2002
```
floppies and CD's in /proc/partitions.
```
  d4c9cb10
- [PATCH] free_area_init_node fix (for non discontigmem direct use) · 16903606
  Martin J. Bligh authored Sep 21, 2002
```
Some idiot (OK, it was me) broke free_area_init_node for
non discontigmem systems that call it directly (eg sparc64),
during a recent cleanup, thus invoking the wrath of DaveM.

I know Dave sent you a patch yesterday, but I think the BUG
statement in it will break anyone who just uses free_area_init
(eg any PC). So here's a portion of Dave's patch that should
fix things for everyone I think. Unfortunately my non-NUMA
test box is borked right now, but it just removes the BUG
statement from what he tested, and it's so simple that even
I couldn't screw this up (famous last words).

This code really needs some more cleanup work, but this will
fix it for now so everyone can do their work ...
```
  16903606
- Don't do a 64-bit divide when a simple shift will do.. · 914744e2
  Linus Torvalds authored Sep 21, 2002
  
  914744e2
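
The commit title does not say which expression was changed, so the following is only a generic sketch of the idea: when the divisor is a power of two, an unsigned 64-bit division can be replaced by a right shift. `toy_div_pow2()` is a hypothetical helper; `shift` must be less than 64.

```c
#include <stdint.h>

/* Divide an unsigned 64-bit value by 2^shift without a divide instruction. */
static uint64_t toy_div_pow2(uint64_t value, unsigned shift)
{
	return value >> shift;  /* same result as value / (1ULL << shift) */
}
```

This matters mostly on 32-bit targets, where a full 64-bit division typically compiles to a call into a helper routine rather than a single instruction.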
- Merge home.transmeta.com:/home/torvalds/v2.5/viro · b79f413c
  Linus Torvalds authored Sep 21, 2002
```
into home.transmeta.com:/home/torvalds/v2.5/linux
```
  b79f413c
- [PATCH] gendisk for swim_iop · ec4258f9
  Alexander Viro authored Sep 21, 2002
```
swim_iop switched to use of gendisk
```
  ec4258f9
- [PATCH] gendisk for acorn floppy · db5ea6e9
  Alexander Viro authored Sep 21, 2002
```
acorn floppy switched to use of gendisk
```
  db5ea6e9
- [PATCH] gendisk for xpram · 5594d5a3
  Alexander Viro authored Sep 21, 2002
```
xpram switched to use of gendisk
```
  5594d5a3
- [PATCH] gendisk for nbd · 0d608510
  Alexander Viro authored Sep 21, 2002
```
nbd switched to use of gendisk
```
  0d608510
- [PATCH] gendisk for rd · e873005d
  Alexander Viro authored Sep 21, 2002
```
rd switched to use of gendisk
```
  e873005d
- [PATCH] gendisk for stram · 82cc65a9
  Alexander Viro authored Sep 21, 2002
```
stram switched to use of gendisk
```
  82cc65a9
- [PATCH] gendisk for sonycd · 71dd8909
  Alexander Viro authored Sep 21, 2002
```
sonycd switched to use of gendisk; missing initcall restored
```
  71dd8909