Commits · af738c8a482069b4660c14f700a470dec757ad5d · Kirill Smelkov / linux

10 Jul, 2003 40 commits

Andrew Morton authored Jul 10, 2003

From: Alex Tomas <bzzz@tmi.comex.ru>

fsync_super() calls ->sync_fs() just after ->write_super().  But
write_super() will start a commit.  In this case, ext3_sync_fs() will not
itself start a commit, and it hence forgets to wait on the commit which
ext3_write_super() started.

Fix that up by making journal_start_commit() return the transaction ID of
any currently-running transaction.
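
As a rough illustration of the idea (not the actual JBD code), the sketch below models a journal in which the commit request reports the transaction ID even when an earlier caller already requested the commit. All `toy_` names and fields are hypothetical stand-ins invented for this example.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int toy_tid_t;

/* Hypothetical, heavily simplified journal state. */
struct toy_journal {
    bool      has_running;      /* is a transaction currently running? */
    toy_tid_t running_tid;      /* its ID, valid only if has_running */
    bool      commit_requested; /* has a commit already been asked for? */
    toy_tid_t last_committed;   /* newest transaction known to be on disk */
};

/* Ask for a commit of the running transaction.  Returns true and stores
 * the transaction ID whenever there is something to wait for, even if a
 * previous caller already requested the commit. */
bool toy_start_commit(struct toy_journal *j, toy_tid_t *tid)
{
    if (!j->has_running)
        return false;
    j->commit_requested = true;
    *tid = j->running_tid;
    return true;
}

/* Stand-in for waiting until the commit of `tid` has finished. */
void toy_wait_commit(struct toy_journal *j, toy_tid_t tid)
{
    assert(j->has_running && j->commit_requested && j->running_tid == tid);
    j->last_committed = tid;
    j->has_running = false;
    j->commit_requested = false;
}

/* write_super()-like step: kicks off a commit but does not wait. */
void toy_write_super(struct toy_journal *j)
{
    toy_tid_t unused;
    (void)toy_start_commit(j, &unused);
}

/* sync_fs()-like step: waits for the commit, whoever started it. */
void toy_sync_fs(struct toy_journal *j)
{
    toy_tid_t tid;
    if (toy_start_commit(j, &tid))
        toy_wait_commit(j, tid);
}
```

Calling `toy_write_super()` followed by `toy_sync_fs()` leaves `last_committed` equal to the running transaction's ID, which is the behaviour the fix is after.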

af738c8a

[PATCH] JBD: transaction buffer accounting fix · 4152cdfa

Andrew Morton authored Jul 10, 2003

From: Alex Tomas <bzzz@tmi.comex.ru>

start_this_handle() takes into account t_outstanding_credits when calculating
log free space, but journal_next_log_block() accounts for blocks being logged
also.  Hence, blocks are accounted for twice.  This effectively reduces the
amount of log space available to transactions and forces more commits.

Fix it by decrementing t_outstanding_credits each time we allocate a new
journal block.
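
The double counting can be shown with a small model. This is only a sketch under assumed semantics: `toy_log`, its fields and the helper names are hypothetical and do not mirror the real JBD structures.

```c
#include <assert.h>

/* Hypothetical model of log space accounting. */
struct toy_log {
    int free_blocks;         /* log blocks not yet consumed */
    int outstanding_credits; /* blocks reserved by handles, not yet logged */
};

/* Space that a new handle could still reserve. */
int toy_available(const struct toy_log *log)
{
    return log->free_blocks - log->outstanding_credits;
}

/* A handle reserves room for nblocks future log blocks. */
void toy_reserve(struct toy_log *log, int nblocks)
{
    log->outstanding_credits += nblocks;
}

/* Buggy variant: the block is consumed, but its reservation stays,
 * so the same block is subtracted twice in toy_available(). */
void toy_next_log_block_buggy(struct toy_log *log)
{
    assert(log->free_blocks > 0);
    log->free_blocks--;
}

/* Fixed variant: consuming a block also releases one credit. */
void toy_next_log_block_fixed(struct toy_log *log)
{
    assert(log->free_blocks > 0 && log->outstanding_credits > 0);
    log->free_blocks--;
    log->outstanding_credits--;
}
```

With 100 free blocks and 10 reserved credits, logging all 10 blocks leaves 90 blocks available in the fixed variant but only 80 in the buggy one.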

4152cdfa

[PATCH] JBD: checkpointing optimisations · a2df663d

Andrew Morton authored Jul 10, 2003

From: Alex Tomas <bzzz@tmi.comex.ru>

Some transaction checkpointing improvements for the JBD commit phase.  Decent
speedups:

creation of 500K files in single dir (with htree, of course):
 before: 4m16.094s, 4m12.035s, 4m11.911s
 after:  1m41.364s, 1m43.461s, 1m45.189s

removal of 500K files in single dir:
 before: 43m50.161s
 after:  38m45.510s


- Make __log_wait_for_space() recalculate the needed blocks because journal
  free space changes during commit

- Make log_do_checkpoint() start scanning from the oldest transaction

- Make log_do_checkpoint() stop scanning if a transaction gets dropped.
  The caller will reevaluate the transaction state and decide whether more
  space needs to be generated in the log.

  The effect of this is to smooth out the I/O patterns, avoid the huge
  stop-and-go which currently happens when forced checkpointing writes out
  and waits upon 3/4 of the journal's size worth of data.

a2df663d

[PATCH] nbd: make nbd and block layer agree about device and · 20c52ab8

Andrew Morton authored Jul 10, 2003

From: Paul Clements <Paul.Clements@SteelEye.com>

Ensure that nbd and the block layer agree about device block sizes and total
device sizes.

20c52ab8

[PATCH] nbd: remove unneeded nbd_open/nbd_release and refcnt · 627c0412
Andrew Morton authored Jul 10, 2003
```
From: Paul Clements <Paul.Clements@SteelEye.com>

Remove the unneeded nbd_open and nbd_release functions.
```
627c0412
[PATCH] NBD documentation update · f4c39f4b
Andrew Morton authored Jul 10, 2003
```
From: Paul Clements <Paul.Clements@SteelEye.com>

Modernise nbd.txt a bit.
```
f4c39f4b

[PATCH] nbd: cleanup PARANOIA usage & code · d7b92e1d

Andrew Morton authored Jul 10, 2003

From: Lou Langholtz <ldl@aros.net>

This fifth patch cleans up usage of the PARANOIA sanity checking macro and
code.  This patch modifies both drivers/block/nbd.c and
include/linux/nbd.h.  It's intended to be applied incrementally on top of
my fourth patch (4.1 really if you count the memset addition as .1's worth)
that simply removed the unneeded blksize_bits field.  Again, I wanted to get
this smaller change out of the way before my next patch, which is much more
major.

d7b92e1d

[PATCH] nbd: initialise the embedded kobject · 4f9420c6
Andrew Morton authored Jul 10, 2003
```
From: Lou Langholtz <ldl@aros.net>

Fixes the NBD oopses which people have been reporting.
```
4f9420c6

[PATCH] nbd: remove unneeded blksize_bits field · 49e57bfc

Andrew Morton authored Jul 10, 2003

From: Lou Langholtz <ldl@aros.net>

This fourth patch simply removes the blksize_bits field from the nbd_device
struct and driver implementation.  How this field made it into this driver
to begin with is a mystery (where was Al Viro when that patch was
submitted??).  :-)

This patch modifies both drivers/block/nbd.c and include/linux/nbd.h files.
It's intended to be applied incrementally on top of my third patch (for
enhanced diagnostics support).

49e57bfc

[PATCH] nbd: enhanced diagnostics support · 9c976399

Andrew Morton authored Jul 10, 2003

From: Lou Langholtz <ldl@aros.net>

This third patch (for enhancing diagnostics support) applies incrementally
after my last LKML'd patch (for cosmetic changes).  These changes introduce
configurable KERN_DEBUG level printk output for a variety of different
things that the driver does and provides the framework for enhanced future
debugging support as well.

9c976399

[PATCH] NBD: cosmetic cleanups · 52fa6e21

Andrew Morton authored Jul 10, 2003

From: Lou Langholtz <ldl@aros.net>

It's a helpful step in being better able to identify code inefficiencies
and problems particularly w.r.t.  locking.  It also modifies some of the
output messages for greater consistency and better diagnostic support.

In that way, this second patch is a lead-in to the third patch, which will
simply introduce the dprintk() debugging facility that my jumbo patch
originally had.

With the cosmetics patch and debugging enhancement (patch), it will make it
easier to fix or at least improve the locking bugs/races in NBD (that will
likely make up the fourth patch in my envisioned roadmap).

52fa6e21

[PATCH] fix for CPU scheduler load distribution · e0a3db1a

Andrew Morton authored Jul 10, 2003

From: Ingo Molnar <mingo@elte.hu>

It makes hot-balancing happen in the 'busy tick' case as well, which should
spread out processes more aggressively.

e0a3db1a

[PATCH] separate locking for vfsmounts · 91b79ba7

Andrew Morton authored Jul 10, 2003

From: Maneesh Soni <maneesh@in.ibm.com>

While path walking we do follow_mount or follow_down which uses
dcache_lock for serialisation.  vfsmount related operations also use
dcache_lock for all updates. I think we can use a separate lock for
vfsmount related work and can improve path walking.

The following two patches do just that. The first one replaces
dcache_lock with a new vfsmount_lock in namespace.c. The lock is
local to namespace.c and is not required outside. The second patch
uses RCU to get a lock-free lookup_mnt(). The patches are quite simple
and straightforward.

The lockmeter results show reduced contention and fewer lock acquisitions
for dcache_lock while running dcachebench* on a 4-way SMP box:

    SPINLOCKS         HOLD            WAIT
    UTIL  CON    MEAN(  MAX )   MEAN(  MAX )(% CPU)     TOTAL NOWAIT SPIN RJECT  NAME

  baselkm-2569:
    20.7% 20.9%  0.5us( 146us)  2.9us( 144us)(0.81%)  31590840 79.1% 20.9%    0%  dcache_lock
  mntlkm-2569:
    14.3% 13.6%  0.4us( 170us)  2.9us( 187us)(0.42%)  23071746 86.4% 13.6%    0%  dcache_lock

We get more than 8% improvement on 4-way SMP and 44% improvement on 16-way
NUMAQ while running dcachebench*.

                Average (usecs/iteration)   Std. Deviation
                (lower is better)
4-way SMP
  2.5.69        15739.3                     470.90
  2.5.69-mnt    14459.6                     298.51

16-way NUMAQ
  2.5.69        120426.5                    363.78
  2.5.69-mnt     63225.8                    427.60

*dcachebench is a microbenchmark written by Bill Hartner and is available at
http://www-124.ibm.com/developerworks/opensource/linuxperf/dcachebench/dcachebench.html

 vfsmount_lock.patch
 -------------------
 - Patch replacing dcache_lock with the new vfsmount_lock for all
   mount-related operations. This removes the need to take dcache_lock
   while doing follow_mount or follow_down operations in path walking.

I re-ran dcachebench with 2.5.70 as base on 16-way NUMAQ box.

                            Average (usecs/iteration)   Std. Deviation
                            (lower is better)
16-way NUMAQ
  2.5.70                    120710.9                    230.67
   + vfsmount_lock.patch     65209.6                    242.97
    + lookup_mnt-rcu.patch   64042.3                    416.61

So just the lock splitting (vfsmount_lock.patch) gives almost the same benefit.

91b79ba7

[PATCH] Fix race condition between aio_complete and · 679c40a8

Andrew Morton authored Jul 10, 2003

From: "Chen, Kenneth W" <kenneth.w.chen@intel.com>

We hit a memory ordering race condition on AIO ring buffer tail pointer
between function aio_complete() and aio_read_evt().

What happens is that on an architecture that has a relaxed memory ordering
model like IPF (ia64), an explicit memory barrier is required in an SMP
execution environment. Consider the following case:

One CPU is executing a tight loop of aio_read_evt(), pulling events off
the ring buffer. During that loop, another CPU is executing aio_complete(),
where it puts an event into the ring buffer and then updates the tail
pointer. However, due to the relaxed memory ordering model, the tail pointer
can become visible before the event itself has been written. So the first CPU
sees the updated tail pointer but picks up stale event data.

A memory barrier is required in this case between the event data and tail
pointer update. Same is true for the head pointer but the window of the
race condition is nil. For function correctness, it is fixed here as well.

By the way, this bug has been fixed in the major distributors' 2.4.x
kernels for a while, but somehow the fix hasn't been propagated to the 2.5
kernel yet.
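
The ordering requirement can be sketched in portable C11 rather than with kernel barriers. The single-producer, single-consumer ring below is a hypothetical illustration (the `toy_` names, the ring size and the event layout are assumptions, not the AIO ring's real layout); release stores and acquire loads play the role of the memory barriers described above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define TOY_RING_SIZE 8u /* hypothetical capacity; one slot stays empty */

struct toy_event { long data; };

struct toy_ring {
    struct toy_event events[TOY_RING_SIZE];
    atomic_uint head; /* next slot to read, advanced by the consumer */
    atomic_uint tail; /* next slot to write, advanced by the producer */
};

void toy_ring_init(struct toy_ring *r)
{
    atomic_init(&r->head, 0u);
    atomic_init(&r->tail, 0u);
}

/* Producer, in the role of aio_complete(): write the event first, then
 * publish the new tail with release semantics so the event data cannot
 * become visible after the tail update. */
bool toy_ring_put(struct toy_ring *r, struct toy_event ev)
{
    unsigned tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    unsigned next = (tail + 1u) % TOY_RING_SIZE;

    if (next == atomic_load_explicit(&r->head, memory_order_acquire))
        return false; /* ring is full */
    r->events[tail] = ev;
    atomic_store_explicit(&r->tail, next, memory_order_release);
    return true;
}

/* Consumer, in the role of aio_read_evt(): the acquire load of tail
 * pairs with the producer's release store, so the event read below is
 * never stale.  The head update is ordered the same way. */
bool toy_ring_get(struct toy_ring *r, struct toy_event *out)
{
    unsigned head = atomic_load_explicit(&r->head, memory_order_relaxed);

    assert(head < TOY_RING_SIZE);
    if (head == atomic_load_explicit(&r->tail, memory_order_acquire))
        return false; /* ring is empty */
    *out = r->events[head];
    atomic_store_explicit(&r->head, (head + 1u) % TOY_RING_SIZE,
                          memory_order_release);
    return true;
}
```

The sketch assumes exactly one producer thread and one consumer thread; multiple producers would need additional serialisation.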

679c40a8

[PATCH] Bug fix in AIO initialization · b1648ead

Andrew Morton authored Jul 10, 2003

From: "Chen, Kenneth W" <kenneth.w.chen@intel.com>

We hit this bug when we have the following scenario:

One process initializes an AIO context and then forks out many child
processes.  When those child processes exit, many BUG checks
(effectively kernel oops) were triggered from put_ioctx(ctx) in function
exit_aio().

The issue was that the AIO context was incorrectly copied upon forking,
misleading all child processes into thinking they had an IO context and
making them try to free one they did not really own.  The following patch
fixes the issue.
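
A minimal model of the problem, assuming a reference-counted context hanging off a per-process structure. The `toy_` types and functions are hypothetical and only mimic the shape of the bug, not the real mm or AIO code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical AIO context with a reference count. */
struct toy_ioctx { int users; };

/* Hypothetical per-process state holding a pointer to the context. */
struct toy_mm { struct toy_ioctx *ioctx; };

/* Drop one reference; the assert stands in for the kernel's BUG check. */
void toy_put_ioctx(struct toy_ioctx *ctx)
{
    assert(ctx->users > 0);
    ctx->users--;
}

/* Buggy fork: the child inherits a pointer it holds no reference for. */
void toy_copy_mm_buggy(struct toy_mm *child, const struct toy_mm *parent)
{
    *child = *parent;
}

/* Fixed fork: the child starts without any AIO context. */
void toy_copy_mm_fixed(struct toy_mm *child, const struct toy_mm *parent)
{
    *child = *parent;
    child->ioctx = NULL;
}

/* Exit path: release the context if this process believes it owns one. */
void toy_exit_aio(struct toy_mm *mm)
{
    if (mm->ioctx != NULL) {
        toy_put_ioctx(mm->ioctx);
        mm->ioctx = NULL;
    }
}
```

In the buggy variant the first exiting child drops the parent's only reference, and a second child would trip the assertion, which loosely corresponds to the BUG checks reported above.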

b1648ead

[PATCH] Set umask correctly for nfsd kernel threads · b14241c4

Andrew Morton authored Jul 10, 2003

From: Andreas Gruenbacher <agruen@suse.de>

Without acls, when creating files the umask is applied directly in the vfs.
ACLs require that the umask is applied at the file system level, depending on
whether or not the containing directory has a default acl. The daemonize()
function makes kernel threads share their fs_struct structure with the init
process. Among other things, fs_struct contains the umask, so all kernel
threads share their umask with init.

The kernel nfsd needs to create files with a umask of 0. Init's umask cannot
simply be changed to 0 --- this would have side effects on init, and init
would have side effects on nfsd. So this patch recreates a fs_struct
structure for nfsd kernel threads, and sets its umask to 0.
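
To make the effect of the umask concrete, here is a small sketch. `toy_fs_struct` and both helpers are hypothetical simplifications; only the masking rule (`mode & ~umask`) is the standard behaviour.

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for the shared fs_struct. */
struct toy_fs_struct {
    unsigned umask_bits; /* permission bits to strip on create */
    int      users;      /* number of threads sharing this structure */
};

/* Standard umask rule: bits set in the umask are removed from the mode. */
unsigned toy_apply_umask(unsigned requested_mode, unsigned umask_bits)
{
    assert((requested_mode & ~07777u) == 0); /* permission bits only */
    return requested_mode & ~umask_bits;
}

/* Give a kernel thread its own copy with umask 0, leaving the structure
 * it used to share (for example with init) untouched. */
struct toy_fs_struct toy_unshare_fs_for_nfsd(struct toy_fs_struct *shared)
{
    struct toy_fs_struct private_fs = *shared;

    assert(shared->users > 1); /* we must currently be a sharer */
    shared->users--;
    private_fs.users = 1;
    private_fs.umask_bits = 0;
    return private_fs;
}
```

With a shared umask of 022, a request for mode 0666 yields 0644, while the private copy with umask 0 keeps the full 0666 for the file system to adjust according to any default ACL.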

This fixes bug #721, <http://www.osdl.net/show_bug.cgi?id=721>.

b14241c4

[PATCH] misc fixes · ecbaa730

Andrew Morton authored Jul 10, 2003

- remove accidental debug code from ext3 commit.

- /proc/profile documentation fix (Randy Dunlap)

- use sb_breadahead() in ext2_preread_inode()

- unused var in mpage_writepages()

ecbaa730

[PATCH] make CONFIG_KALLSYMS default to "on" · f3eee922

Andrew Morton authored Jul 10, 2003

From: Diego Calleja Garcia <diegocg@teleline.es>

Move CONFIG_KALLSYMS out of the arch directory and into init/.

It defaults to "on" unless the user explicitly turns it off in the
"embedded systems" menu.

f3eee922

[PATCH] kmap() -> kmap_atomic() in fs/exec.c · 9f1ed86f
Andrew Morton authored Jul 10, 2003
```
replace a kmap() with kmap_atomic()
```
9f1ed86f

[PATCH] i_size atomic access · eafe5916

Andrew Morton authored Jul 10, 2003

From: Daniel McNeil <daniel@osdl.org>

This adds i_seqcount to the inode structure and then uses i_size_read() and
i_size_write() to provide atomic access to i_size.  This is a port of
Andrea Arcangeli's i_size atomic access patch from 2.4.  This only uses the
generic reader/writer consistent mechanism.

Before:
mnm:/usr/src/25> size vmlinux
   text    data     bss     dec     hex filename
2229582 1027683  162436 3419701  342e35 vmlinux

After:
mnm:/usr/src/25> size vmlinux
   text    data     bss     dec     hex filename
2225642 1027655  162436 3415733  341eb5 vmlinux

3.9k more text, a lot of it fastpath :(

It's a very minor bug, and the fix has a fairly non-minor cost.  The most
compelling reason for fixing this is that writepage() checks i_size.  If it
sees a transient value it may decide that page is outside i_size and will
refuse to write it.  Lost user data.

eafe5916

[PATCH] i_size atomic access: infrastructure · e9b94f6a

Andrew Morton authored Jul 10, 2003

From: Daniel McNeil <daniel@osdl.org>

This adds a sequence counter only version of the reader/writer consistent
mechanism to seqlock.h.  This is used in the second part of this patch to give
atomic access to i_size.
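
A sequence-counter-only scheme can be sketched in userspace C11 as follows. This is not the seqlock.h implementation: the `toy_` names are hypothetical, the 64-bit size is deliberately split into two 32-bit halves to imitate a 32-bit machine, and writers are assumed to be serialised by some other lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical inode with a 64-bit size stored as two 32-bit halves. */
struct toy_inode {
    _Atomic uint32_t seq;     /* even: stable, odd: write in progress */
    _Atomic uint32_t size_lo;
    _Atomic uint32_t size_hi;
};

void toy_inode_init(struct toy_inode *inode)
{
    atomic_init(&inode->seq, 0);
    atomic_init(&inode->size_lo, 0);
    atomic_init(&inode->size_hi, 0);
}

/* Writer: make the counter odd, update both halves, make it even again.
 * Callers must already exclude other writers. */
void toy_i_size_write(struct toy_inode *inode, uint64_t size)
{
    uint32_t s = atomic_load_explicit(&inode->seq, memory_order_relaxed);

    assert((s & 1u) == 0); /* no other write may be in flight */
    atomic_store_explicit(&inode->seq, s + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&inode->size_lo, (uint32_t)size,
                          memory_order_relaxed);
    atomic_store_explicit(&inode->size_hi, (uint32_t)(size >> 32),
                          memory_order_relaxed);
    atomic_store_explicit(&inode->seq, s + 2, memory_order_release);
}

/* Reader: retry until the counter is even and unchanged across the
 * read, so a half-updated (transient) size is never returned. */
uint64_t toy_i_size_read(struct toy_inode *inode)
{
    uint32_t s0, s1, lo, hi;

    do {
        s0 = atomic_load_explicit(&inode->seq, memory_order_acquire);
        lo = atomic_load_explicit(&inode->size_lo, memory_order_relaxed);
        hi = atomic_load_explicit(&inode->size_hi, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        s1 = atomic_load_explicit(&inode->seq, memory_order_relaxed);
    } while ((s0 & 1u) != 0 || s0 != s1);

    return ((uint64_t)hi << 32) | lo;
}
```

Readers never block writers in this scheme; they simply retry, which suits a value that is read far more often than it is written.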

e9b94f6a

[PATCH] wall_to_monotonic initialization fixes for · 1ac38088

Andrew Morton authored Jul 10, 2003

From: Tim Schmielau <tim@physik3.uni-rostock.de>

This patch adds (or fixes) initialization of wall_to_monotonic for a few
more architectures.

This should get rid of the strange uptime>14600 days reports, except on arm
whose arch file layout is too unfamiliar to me.

The patch is blessed by George Anzinger, but untested due to lack of
hardware.

1ac38088

[PATCH] fix reiserfs for 64bit arches · 9ed052e6

Andrew Morton authored Jul 10, 2003

From: Oleg Drokin <green@namesys.com>

Since the inclusion of reiserfs_file_write, no 64bit arch has been able to
work with reiserfs, for a pretty stupid reason (incorrect "unsigned
long" definition of the blocknumber type).

This fixes the problem.

9ed052e6

[PATCH] reiserfs dirty memory accounting fix · 0b124d82

Andrew Morton authored Jul 10, 2003

The ClearPageDirty() in there is wrong - it doesn't adjust the VM's dirty
memory accounting.  The system thinks it's full of dirty memory and stops.
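
The accounting problem can be pictured with a toy model in which a per-page dirty flag and a global dirty counter must move together. Everything named `toy_` here is hypothetical, including the throttling rule; it is a sketch of the failure mode, not of the VM's real bookkeeping.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_page { bool dirty; };

/* Hypothetical global dirty-memory accounting. */
struct toy_vm {
    long nr_dirty;    /* pages the VM believes are dirty */
    long dirty_limit; /* writers are throttled at or above this */
};

void toy_set_page_dirty(struct toy_vm *vm, struct toy_page *page)
{
    if (!page->dirty) {
        page->dirty = true;
        vm->nr_dirty++;
    }
}

/* Buggy: clears only the flag, so nr_dirty never comes back down. */
void toy_clear_flag_only(struct toy_page *page)
{
    page->dirty = false;
}

/* Correct: flag and counter are updated together. */
void toy_clear_page_dirty(struct toy_vm *vm, struct toy_page *page)
{
    if (page->dirty) {
        assert(vm->nr_dirty > 0);
        page->dirty = false;
        vm->nr_dirty--;
    }
}

/* Writers stall once the VM thinks too much memory is dirty. */
bool toy_must_throttle(const struct toy_vm *vm)
{
    return vm->nr_dirty >= vm->dirty_limit;
}
```

After enough flag-only clears, `toy_must_throttle()` stays true even though no page is dirty any more, which matches the "thinks it's full of dirty memory and stops" symptom.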

0b124d82

[PATCH] remove proc_mknod() · 5ad9cb65

Andrew Morton authored Jul 10, 2003

From: Christoph Hellwig <hch@lst.de>

It's not used anymore since ALSA switched to traditional devices, and device
nodes in procfs are a bad idea in general.

Also update the docs.

5ad9cb65

[PATCH] fix return of compat_sys_sched_getaffinity · ca2a459c
Andrew Morton authored Jul 10, 2003
```
From: rwhron@earthlink.net

It returns sizeof(compat_ulong_t) even if put_user() faulted.
```
ca2a459c
Merge from DRI CVS tree: avoid zero DRI "handles". · 2f20d8da
Linus Torvalds authored Jul 10, 2003

2f20d8da
Update radeon driver from DRI CVS: add more commands. · 8f3fb748
Linus Torvalds authored Jul 10, 2003
```
(version 1.8.0 -> 1.9.0)
```
8f3fb748
Update r128 driver from DRI CVS: add support for ycbcr textures. · fb4b152a
Linus Torvalds authored Jul 10, 2003
```
(version 2.3.0 -> 2.4.0)
```
fb4b152a
Update i810 DRI driver from CVS to add page flipping. · c2cff270
Linus Torvalds authored Jul 10, 2003
```
(version 1.2.1 to 1.3.0)
```
c2cff270
Merge comment updates from DRI CVS tree. · fd80ab16
Linus Torvalds authored Jul 10, 2003

fd80ab16
Merge bk://kernel.bkbits.net/davem/net-2.5 · 2850310b
Linus Torvalds authored Jul 10, 2003
```
into home.osdl.org:/home/torvalds/v2.5/linux
```
2850310b
[PATCH] gsc-ps2 update · 7e65788c
Matthew Wilcox authored Jul 10, 2003
```
Update gsc_ps2 for recent changes.
```
7e65788c
[PATCH] Remove warning from binfmt_elf.c for upwards growing stack · 46c7cd8b
Matthew Wilcox authored Jul 10, 2003

46c7cd8b
[PATCH] Add two sysctls for PA-RISC · e572d2bc
Matthew Wilcox authored Jul 10, 2003
```
Add two PA-RISC sysctls.
```
e572d2bc
[PATCH] eisa Kconfig update for parisc · 659b9a33
Matthew Wilcox authored Jul 10, 2003
```
PA-RISC doesn't have PCI<->EISA bridges (they're all GSC<->EISA).
```
659b9a33
[PATCH] Makefile update for parisc · 162fee61
Matthew Wilcox authored Jul 10, 2003
```
parisc64 machines should build parisc kernels.
```
162fee61

[PATCH] parisc updates · 233382d3

Matthew Wilcox authored Jul 10, 2003

arch/parisc, drivers/parisc and include/asm-parisc updates:

 - Fixups for struct timespec changes (James Bottomley)
 - Add CONFIG_FRAME_POINTER (Thibaut Varene)
 - Fix hpux ustat emulation (Helge Deller)
 - Add a ->remove operation to struct parisc_device (James Bottomley)
 - More work on modules (James Bottomley)
 - More unaligned instructions handled (LaMont Jones)
 - Fix byteswap assembly (Grant Grundler)
 - Allow ISA support to be selected (Matthew Wilcox)
 - Fix swapping (James Bottomley)

233382d3

[PATCH] cryptoloop · 05081dcb

Andries E. Brouwer authored Jul 10, 2003

util-linux is waiting for this: it needs to update "struct loop_info64"
to add the encryption policy name.

05081dcb

[PATCH] via-agp.c - agp_try_unsupported typo · a804e66c
Petr Sebor authored Jul 10, 2003
```
via-agp.c has the agp_try_unsupported test inverted
```
a804e66c