Commits · 990aef1a1fc69756773e647cd58ea06dc1188685 · Kirill Smelkov / linux

18 Jun, 2003 11 commits

[PATCH] JBD: Finish protection of journal_head.b_frozen_data · 990aef1a

Andrew Morton authored Jun 17, 2003

We now start to move across the JBD data structures' fields, from the
"innermost" outwards.

Start with journal_head.b_frozen_data, because the locking for this field was
partially implemented in jbd-010-b_committed_data-race-fix.patch.

It is protected by jbd_lock_bh_state().  We keep the lock_journal() and
spin_lock(&journal_datalist_lock) calls in place.  Later,
spin_lock(&journal_datalist_lock) is replaced by
spin_lock(&journal->j_list_lock).

Of course, this completion of the locking around b_frozen_data also puts a
lot of the locking for other fields in place.

990aef1a

[PATCH] JBD: rename journal_unlock_journal_head to · eacf9510

Andrew Morton authored Jun 17, 2003

journal_unlock_journal_head() is misnamed: what it does is to drop a ref on
the journal_head and free it if that ref fell to zero.  It doesn't actually
unlock anything.

Rename it to journal_put_journal_head().
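The get/put naming can be illustrated with a minimal, single-threaded sketch. This is not the JBD code: `struct demo_jh`, `demo_jh_alloc()`, `demo_jh_get()`, `demo_jh_put()` and `demo_jh_freed` are hypothetical names, and the real function also has to cope with locking and the attached buffer_head, which this sketch ignores.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a journal_head; only the refcount matters here. */
struct demo_jh {
    int b_jcount;               /* number of outstanding references */
};

static int demo_jh_freed;       /* counts frees so the effect is observable */

static struct demo_jh *demo_jh_alloc(void)
{
    struct demo_jh *jh = calloc(1, sizeof(*jh));

    if (jh)
        jh->b_jcount = 1;       /* the allocator hands back one reference */
    return jh;
}

/* "get": take an additional reference. */
static void demo_jh_get(struct demo_jh *jh)
{
    jh->b_jcount++;
}

/* "put": drop one reference and free the object when the count hits zero.
 * Nothing is unlocked here, which is why "unlock" was a misleading name. */
static void demo_jh_put(struct demo_jh *jh)
{
    assert(jh->b_jcount > 0);
    if (--jh->b_jcount == 0) {
        free(jh);
        demo_jh_freed++;
    }
}
```

Each `demo_jh_get()` must be balanced by one `demo_jh_put()`; the last put is the one that frees.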

eacf9510

[PATCH] JBD: fine-grain journal_add_journal_head locking · 1c69516f

Andrew Morton authored Jun 17, 2003

buffer_heads and journal_heads are joined at the hip.  We need a lock to
protect the joint and its refcounts.

JBD is currently using a global spinlock for that.  Change it to use one bit
in bh->b_state.
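As a rough userspace illustration of using one bit of a state word as a lock, the sketch below uses C11 atomics. `struct demo_bh`, `DEMO_BH_LOCK_BIT` and the `demo_bit_*()` helpers are assumptions made for this example; the kernel uses its own bit operations and a bit position chosen by the patch, neither of which is shown here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define DEMO_BH_LOCK_BIT 0x1UL  /* hypothetical bit reserved for the lock */

/* Hypothetical stand-in for a buffer_head: just the state word. */
struct demo_bh {
    atomic_ulong b_state;       /* other bits carry unrelated flags */
};

/* Try to set the lock bit; succeeds only if it was previously clear. */
static bool demo_bit_trylock(struct demo_bh *bh)
{
    unsigned long old = atomic_fetch_or_explicit(&bh->b_state,
                                                 DEMO_BH_LOCK_BIT,
                                                 memory_order_acquire);
    return (old & DEMO_BH_LOCK_BIT) == 0;
}

/* Spin until the lock bit is ours. */
static void demo_bit_lock(struct demo_bh *bh)
{
    while (!demo_bit_trylock(bh))
        ;                       /* busy-wait */
}

/* Clear the lock bit, leaving every other bit of the word untouched. */
static void demo_bit_unlock(struct demo_bh *bh)
{
    unsigned long old = atomic_fetch_and_explicit(&bh->b_state,
                                                  ~DEMO_BH_LOCK_BIT,
                                                  memory_order_release);
    assert(old & DEMO_BH_LOCK_BIT);     /* must have been locked */
    (void)old;
}
```

The attraction of this layout is that every buffer gets its own lock without growing the structure; the cost is that each lock attempt is an atomic read-modify-write on a word shared with other flags.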

1c69516f

[PATCH] JBD: remove jh_splice_lock · 6fe2ab38

Andrew Morton authored Jun 17, 2003

This was a strange spinlock which was designed to prevent another CPU from
ripping a buffer's journal_head away while this CPU was inspecting its state.

Really, we don't need it - we can inspect that state directly from bh->b_state.

So kill it off, along with a few things which used it which are themselves
not actually used any more.

6fe2ab38

[PATCH] JBD: plan JBD locking schema · 13d8498a

Andrew Morton authored Jun 17, 2003

This is the start of the JBD locking rework.

The aims of all this are to remove all lock_kernel() calls from JBD, to
remove all lock_journal() calls (the context switch rate is astonishing when
the lock_kernel()s are removed) and to remove all sleep_on() instances.




The strategy which is taken is:

a) Define the locking schema (this patch)

b) Work through every JBD data structure and implement its locking fully,
   according to the above schema.  We work from "innermost" data structures
   and outwards.

It isn't guaranteed that the filesystem will work very well at all stages of
this patch series.



In this patch:


Add commentary and various locks to jbd.h describing the locking scheme which
is about to be implemented.

Initialise the new locks.

Coding-style goodness in jbd.h

13d8498a

[PATCH] JBD: fix race over access to b_committed_data · 47bb09d8

Andrew Morton authored Jun 17, 2003

From: Alex Tomas <bzzz@tmi.comex.ru>

We have a race wherein the block allocator can decide that
journal_head.b_committed_data is present and then will use it. But kjournald
can concurrently free it and set the pointer to NULL. It goes oops.

We introduce per-buffer_head "spinlocking" based on a bit in b_state. To do
this we abstract out pte_chain_lock() and reuse the implementation.

The bit-based spinlocking is pretty inefficient CPU-wise (hence the warning
in there) and we may move this to a hashed spinlock later.
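The shape of the fix can be sketched as follows: the test of the pointer and its use happen inside one critical section, and the side that frees the data takes the same lock. This is an illustration only. `struct demo_jh` and the `demo_*()` functions are hypothetical, and a C11 `atomic_flag` stands in for the per-buffer_head bit lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for a journal_head plus its per-buffer lock. */
struct demo_jh {
    atomic_flag lock;           /* stand-in for the bit lock in b_state */
    char *b_committed_data;     /* the committer may free this and set NULL */
};

static void demo_lock(struct demo_jh *jh)
{
    while (atomic_flag_test_and_set_explicit(&jh->lock, memory_order_acquire))
        ;                       /* spin */
}

static void demo_unlock(struct demo_jh *jh)
{
    atomic_flag_clear_explicit(&jh->lock, memory_order_release);
}

/* Allocator side: check the pointer and dereference it under one lock hold,
 * so it cannot be freed between the check and the use. */
static int demo_block_in_use(struct demo_jh *jh, size_t idx)
{
    int ret = 0;

    demo_lock(jh);
    if (jh->b_committed_data)
        ret = jh->b_committed_data[idx] != 0;
    demo_unlock(jh);
    return ret;
}

/* Committer side: free and clear the pointer under the same lock. */
static void demo_drop_committed_data(struct demo_jh *jh)
{
    demo_lock(jh);
    free(jh->b_committed_data);
    jh->b_committed_data = NULL;
    demo_unlock(jh);
}
```

Without the lock, the allocator could see a non-NULL pointer, lose the CPU to the committer, and then dereference freed memory.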

47bb09d8

[PATCH] ext3: scalable counters and locks · 17aff938

Andrew Morton authored Jun 17, 2003

From: Alex Tomas <bzzz@tmi.comex.ru>

This is a port from ext2 of the fuzzy counters (for Orlov allocator
heuristics) and the hashed spinlocking (for the inode and block allocators).
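One common way to build an approximate ("fuzzy") counter is to let each CPU accumulate a private delta and fold it into the shared value only once it crosses a threshold. The sketch below shows that idea in plain single-threaded C; it is not taken from the patch. `struct demo_fuzzy_counter`, `DEMO_NR_CPUS`, `DEMO_BATCH` and the `demo_counter_*()` functions are hypothetical, and the CPU index is passed in explicitly.

```c
#include <assert.h>

#define DEMO_NR_CPUS 4
#define DEMO_BATCH   8          /* hypothetical fold threshold */

struct demo_fuzzy_counter {
    long global;                /* shared, approximate value */
    long local[DEMO_NR_CPUS];   /* per-CPU deltas not yet folded in */
};

/* Add to this CPU's private delta; touch the shared value only when the
 * delta reaches the batch size.  Real code would lock around the fold. */
static void demo_counter_add(struct demo_fuzzy_counter *c, int cpu, long delta)
{
    assert(cpu >= 0 && cpu < DEMO_NR_CPUS);
    c->local[cpu] += delta;
    if (c->local[cpu] >= DEMO_BATCH || c->local[cpu] <= -DEMO_BATCH) {
        c->global += c->local[cpu];
        c->local[cpu] = 0;
    }
}

/* Cheap read: may lag behind by up to DEMO_BATCH - 1 per CPU. */
static long demo_counter_read_approx(const struct demo_fuzzy_counter *c)
{
    return c->global;
}

/* Exact read: walks every CPU's delta, so it is slower. */
static long demo_counter_read_exact(const struct demo_fuzzy_counter *c)
{
    long sum = c->global;

    for (int i = 0; i < DEMO_NR_CPUS; i++)
        sum += c->local[i];
    return sum;
}
```

A heuristic such as an allocator's group choice can tolerate the cheap, slightly stale read, while consistency checks would use the exact sum.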

17aff938

[PATCH] ext3: concurrent block/inode allocation · c12b9866

Andrew Morton authored Jun 17, 2003

From: Alex Tomas <bzzz@tmi.comex.ru>


This patch weans ext3 off lock_super()-based protection for the inode and
block allocators.

It's basically the same as the ext2 changes.


1) each group has its own spinlock, which is used for group counter
   modifications

2) sb->s_free_blocks_count isn't used any more.  ext2_statfs() and
   find_group_orlov() loop over groups to count free blocks

3) sb->s_free_blocks_count is recalculated at mount/umount/sync_super time
   in order to check consistency and to avoid fsck warnings

4) reserved blocks are distributed over last groups

5) ext3_new_block() tries to use non-reserved blocks and if it fails then
   tries to use reserved blocks

6) ext3_new_block() and ext3_free_blocks() do not modify sb->s_free_blocks,
   therefore they do not call mark_buffer_dirty() for the superblock's
   buffer_head.  This should reduce I/O a bit
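Points 1 and 2 above can be sketched in userspace C: each group carries its own lock and free-block counter, and the filesystem-wide total is obtained by looping over the groups instead of reading one global field. All names here (`struct demo_group`, `demo_groups`, `DEMO_NGROUPS` and the `demo_*()` functions) are hypothetical, and a C11 `atomic_flag` stands in for the kernel spinlock.

```c
#include <assert.h>
#include <stdatomic.h>

#define DEMO_NGROUPS 4

/* Hypothetical per-group state: a private lock and a private counter. */
struct demo_group {
    atomic_flag lock;
    unsigned long free_blocks;
};

static struct demo_group demo_groups[DEMO_NGROUPS];

static void demo_groups_init(unsigned long free_per_group)
{
    for (int i = 0; i < DEMO_NGROUPS; i++) {
        atomic_flag_clear(&demo_groups[i].lock);
        demo_groups[i].free_blocks = free_per_group;
    }
}

static void demo_group_lock(struct demo_group *g)
{
    while (atomic_flag_test_and_set_explicit(&g->lock, memory_order_acquire))
        ;                       /* spin */
}

static void demo_group_unlock(struct demo_group *g)
{
    atomic_flag_clear_explicit(&g->lock, memory_order_release);
}

/* Take one block from a group.  Only that group's lock is held, so
 * allocations in different groups do not contend.  Returns 1 on success,
 * 0 if the group has no free blocks. */
static int demo_alloc_from_group(unsigned int group)
{
    struct demo_group *g;
    int ok = 0;

    assert(group < DEMO_NGROUPS);
    g = &demo_groups[group];
    demo_group_lock(g);
    if (g->free_blocks > 0) {
        g->free_blocks--;
        ok = 1;
    }
    demo_group_unlock(g);
    return ok;
}

/* No global counter: sum the per-group counters on demand. */
static unsigned long demo_count_free_blocks(void)
{
    unsigned long total = 0;

    for (int i = 0; i < DEMO_NGROUPS; i++) {
        demo_group_lock(&demo_groups[i]);
        total += demo_groups[i].free_blocks;
        demo_group_unlock(&demo_groups[i]);
    }
    return total;
}
```

The trade-off is that the hot allocation path never touches shared global state, while the comparatively rare total count becomes a loop over all groups.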


Also fix orlov allocator boundary case:

In the interests of SMP scalability the ext2 free blocks and free inodes
counters are "approximate".  But there is a piece of code in the Orlov
allocator which fails due to boundary conditions on really small
filesystems.

Fix that up via a final allocation pass which simply uses first-fit for
allocation of a directory inode.

c12b9866

[PATCH] JBD: journal_get_write_access() speedup · 78f2f471
Andrew Morton authored Jun 17, 2003
```
Move some lock_kernel() calls from the caller to the callee, reducing
holdtimes.
```
78f2f471

[PATCH] ext3: move lock_kernel() down into the JBD layer. · 3307fbd1

Andrew Morton authored Jun 17, 2003

This is the start of the ext3 scalability rework.  It basically comes in two
halves:

- ext3 BKL/lock_super removal and scalable inode/block allocators

- JBD locking rework.

The ext3 scalability work was completed a couple of months ago.

The JBD rework has been stable for a couple of weeks now.  My gut feeling is
that there should be one, maybe two bugs left in it, but no problems have
been discovered...


Performance-wise, throughput is increased by up to 2x on dual CPU.  10x on
16-way has been measured.  Given that current ext3 is able to chew two whole
CPUs spinning on locks on a 4-way, that wasn't especially surprising.

These patches were prepared by Alex Tomas <bzzz@tmi.comex.ru> and myself.


First patch: ext3 lock_kernel() removal.

The only reason why ext3 takes lock_kernel() is because it is required by the
JBD API.

The patch removes the lock_kernel() calls from ext3 and pushes them down into
JBD itself.

3307fbd1

Merge http://lia64.bkbits.net/to-linus-2.5 · 0d0d8534
Linus Torvalds authored Jun 17, 2003
```
into home.transmeta.com:/home/torvalds/v2.5/linux
```
0d0d8534

17 Jun, 2003 29 commits
- ia64: Initial sync with 2.5.72. · 1626bd5b
  David Mosberger authored Jun 17, 2003
  
  1626bd5b
- ia64: Sync with 2.5.71. · 2b5f799d
  David Mosberger authored Jun 17, 2003
  
  2b5f799d
- Merge tiger.hpl.hp.com:/data1/bk/vanilla/linux-2.5 · 84a2f00e
  David Mosberger authored Jun 17, 2003
```
into tiger.hpl.hp.com:/data1/bk/lia64/to-linus-2.5
```
  84a2f00e
- [PATCH] Add __raw_ read/write ops to v850 io.h · 6da97790
  Miles Bader authored Jun 17, 2003
  
  6da97790
- [PATCH] Add linker script support for v850 "rte_nb85e_cb" platform · 99782bcc
  Miles Bader authored Jun 17, 2003
  
  99782bcc
- [PATCH] Add .con_initcall.init section on v850 · 68062a3e
  Miles Bader authored Jun 17, 2003
  
  68062a3e
- [PATCH] v850 whitespace tweaks · 82e8fb3b
  Miles Bader authored Jun 17, 2003
  
  82e8fb3b
- [PATCH] Fix compat_sys_getrusage. Again · d780ba1c
  Anton Blanchard authored Jun 17, 2003
```
I must not ignore compiler warnings.
I must not ignore compiler warnings.
I must not ignore compiler warnings.
```
  d780ba1c
- [PATCH] OProfile: thread switching performance fix · 5e9a1167
  John Levon authored Jun 17, 2003
```
Avoid the linear list walk of get_exec_dcookie() when we've switched to a task
using the same mm.
```
  5e9a1167
- [PATCH] OProfile: IO-APIC based NMI delivery · 438612ea
  John Levon authored Jun 17, 2003
```
Use the IO-APIC NMI delivery when the local APIC performance counter delivery is
not available. By Zwane Mwaikambo.
```
  438612ea
- [PATCH] OProfile: small NMI shutdown fix · 991cae79
  John Levon authored Jun 17, 2003
```
Reduce the possibility of dazed-and-confuseds.
```
  991cae79
- [PATCH] syncppp fixes · 3b08caa4
  Paul Fulghum authored Jun 17, 2003
```
 - Fix 'badness in local_bh_enable' warning

   This involved moving dev_queue_xmit() calls
   outside of sections with spinlock held.

 - Fix 'fix old protocol handler' warning

   This includes accounting for shared skbs,
   setting protocol .data field to non-null,
   and adding per device synchronization to
   receive handler.

This has been tested in PPP and Cisco modes
with and without keepalives enabled
on an SMP machine.
```
  3b08caa4
- [PATCH] Consolidate Kconfigs for binfmts · 65008dc6
  Matthew Wilcox authored Jun 17, 2003
```
This patch creates fs/Kconfig.binfmt and converts all architectures to
use it.  I took the opportunity to spruce up the a.out help text for
the new millennium.
```
  65008dc6
- [PATCH] kNFSd: Set nfsd user every time a filehandle is verified. · f36e10e5
  Neil Brown authored Jun 17, 2003
```
request might traverse several export points which may
have different uid squashing.
```
  f36e10e5
- [PATCH] kNFSd: Do NFSv4 server state initialisation when nfsd starts instead of when module loaded. · eaee716b
  Neil Brown authored Jun 17, 2003
```
From: "William A.(Andy) Adamson" <andros@citi.umich.edu>
```
  eaee716b
- [PATCH] kNFSd: RENEW and lease management for NFSv4 server · 1ac4906c
  Neil Brown authored Jun 17, 2003
```
From: "William A.(Andy) Adamson" <andros@citi.umich.edu>

Put all clients in a LRU list and use a "work_queue" to
expire old clients periodically.
```
  1ac4906c
- [PATCH] kNFSd: Make sure unused bits of NFSv4 xfr buffered are zero.. · 22239375
  Neil Brown authored Jun 17, 2003
  
  22239375
- [PATCH] kNFSd: Allow nfsv4 readdir to return filehandle when a mountpoint is found in a directory · 7e54636e
  Neil Brown authored Jun 17, 2003
```
From: "William A.(Andy) Adamson" <andros@citi.umich.edu>

When readdir is enumerating a directory and finds a mountpoint,
it needs to do a bit of extra work to find the filehandle to be
returned in the readdir reply.

It is even possible that finding the filehandle requires an up-call,
so the request might be dropped to be re-tried later.
```
  7e54636e
- [PATCH] kNFSd: Make sure an early close on a nfs/tcp connection is handled properly. · 69408a2f
  Neil Brown authored Jun 17, 2003
```
From: Hirokazu Takahashi <taka@valinux.co.jp>

In svc_tcp_listen_data_ready we should be waiting for
TCP_LISTEN, not TCP_ESTABLISHED.  The latter only worked
by accident.

Also, if a socket is closed as soon as we accept it, we
must shut it down straight away as we will never get a 'close'
event.
```
  69408a2f
- [PATCH] kNFSd: Assorted fixes for NFS export cache · 74001bcd
  Neil Brown authored Jun 17, 2003
```
The most significant fix is cleaning up properly when
nfs service is stopped.

Also fix some refcounting problems and other little bits.
```
  74001bcd
- [PATCH] kNFSd: Fix bug in svc_pushback_unused_pages that occurs on zero byte NFS read · e927119b
  Neil Brown authored Jun 17, 2003
```
svc_pushback_unused_pages must be ready for the possibility that
no pages were allocated or will need to be pushed back.
```
  e927119b
- Fix moxa compile (at least for UP) and remove a few warnings. · 925971b6
  Linus Torvalds authored Jun 17, 2003
```
From Adrian Bunk.
```
  925971b6
- Merge bk://kernel.bkbits.net/davem/sparc-2.5 · 0dddcf52
  Linus Torvalds authored Jun 17, 2003
```
into home.transmeta.com:/home/torvalds/v2.5/linux
```
  0dddcf52
- [PATCH] eata and u14-34f update · 1bc50b61
  Dario Ballabio authored Jun 17, 2003
```
Enclosed is an update for the new IRQ and module_param APIs.
eata.h and u14-34f.h are no longer used and will be deleted.
```
  1bc50b61
- Merge bk://linux-scsi.bkbits.net/scsi-for-linus-2.5 · fa7875ec
  Linus Torvalds authored Jun 17, 2003
```
into home.transmeta.com:/home/torvalds/v2.5/linux
```
  fa7875ec
- [PATCH] aha1740.c doesn't compile. · 65a76955
  Adrian Bunk authored Jun 17, 2003
  
  65a76955
- SCSI: tidy up io vs mem mapping in 53c700 driver · 388dc0e4
  James Bottomley authored Jun 17, 2003
```
The parisc ports may use both the lasi700 and sim710 versions of this driver.
Unfortunately, one must be memory mapped, and one must be IO mapped, so
add code to the driver for this case.
```
  388dc0e4
- Merge davem@nuts.ninka.net:/home/davem/src/BK/sparc-2.5 · 45c7262e
  David S. Miller authored Jun 17, 2003
```
into kernel.bkbits.net:/home/davem/sparc-2.5
```
  45c7262e
- [SPARC]: Fix wall_to_monotonic initialization. · dcc320b6
  David S. Miller authored Jun 17, 2003
  
  dcc320b6