Commits · 88d416dbdfa39bc9b7c4ab8e68176368262b0609 · nexedi / linux

12 Apr, 2003 24 commits

[PATCH] kNFSd: NFSD binary compatibility breakage · 88d416db

Neil Brown authored Apr 12, 2003

The removal of "struct nfsctl_uidmap" from "nfsctl_fdparm" broke
binary compatiblity on 64-bit platforms (strictly speaking: on all
platforms with alignof(void *) > alignof(int)). The problem is that
nfsctl_uidmap contained a "char *", which forced the alignment of the
entire union to be 64 bits. With the removal of the uidmap, the
required alignment drops to 32 bits. Since the first member is only
32 bits in size, this breaks compatibility with user-space. Patch
below fixes the problem.

88d416db

[PATCH] kNFSd: Return correct result for ACCESS(READ) on eXecute-only file. · 4fe13364

Neil Brown authored Apr 12, 2003

Currently, an NFSv3 ACCESS check for READ permission on an
eXecute-only file will succeed where it should fail.

This is because nfsd_permission allows READ access to eXecute only
files so that mode 711 executables can be loaded and run, and
nfsd_access simply uses nfsd_permission.

This patch changes nfsd_permission to only map eXecute permission to
read permission of MAY_OWNER_OVERRIDE was set.  This is only set
when trying to read from a file, so ACCESS will no longer be tricked.

This change will only affect callers of nfsd_permission that specify
MAY_READ and not MAY_OWNER_OVERRIDE, and nfsd_access is the only
routine that calls nfsd_permission (via fh_verify) that way.

4fe13364

[PATCH] kNFSd: nfsd/export.c tidyup and add missing exp_put · 3a280533

Neil Brown authored Apr 12, 2003

There was a missing exp_put in export.c so that after a client
mounts an exported filesystem, the server would never be able to
unmount, even after trying to unexport.  This is fixed by the last
chunk of this patch.

Also assorted cleanups to the code found while hunting.

3a280533

[PATCH] Put all functions in kallsyms · 4b4bd81a

Andrew Morton authored Apr 12, 2003

From: Rusty Russell <rusty@rustcorp.com.au>

Introduce _sinittext and _einittext (cf. _stext and _etext), so kallsyms
includes __init functions.

TODO: Use huffman name compression and 16-bit offsets (see IDE
oopser patch)

4b4bd81a

[PATCH] use spinlocking in the ext2 inode allocator · 77a9874a

Andrew Morton authored Apr 12, 2003

From Alex Tomas and myself

It is identical in concept to the block allocator change.  It uses the same
hashed spinlock.

77a9874a

[PATCH] use spinlocking in the ext2 block allocator · c14c1a44

Andrew Morton authored Apr 12, 2003

From Alex Tomas and myself

ext2 currently uses lock_super() to protect the filesystem's in-core block
allocation bitmaps.

On big SMP machines the contention on that semaphore is causing high context
switch rates, large amounts of idle time and reduced throughput.

The context switch rate can also worsen block allocation: if several tasks
are trying to allocate blocks inside the same blockgroup for different files,
madly rotating between those tasks will cause the files' blocks to be
intermingled.

On SDET and dbench-style worloads (lots of tasks doing lots of allocation)
this patch (and a similar one for the inode allocator) improve throughout on
an 8-way by ~15%.  On 16-way NUMAQ the speedup is 150%.

What wedo isto remove the lock altogether and just rely on the atomic
semantics of test_and_set_bit(): if the allocator sees a block was free it
runs test_and_set_bit().  If that fails, then we raced and the allocator will
go and look for another block.

Of course, we don't really use test_and_set_bit() because that
isn'tendian-dependent.  New atomic endian-independent functions are
introduced: ext2_set_bit_atomic() and ext2_clear_bit_atomic().  We do not
need ext2_test_bit_atomic(), since even if ext2_test_bit() returns the wrong
result, that error will be detected and naturally handled in the subsequent
ext2_set_bit_atomic().

For little-endian machines the new atomic ops map directly onto the
test_and_set_bit(), etc.

For big-endian machines we provide the architecture's impementation with the
address of a spinlock whcih can be taken around the nonatomic ext2_set_bit().
 The spinlocks are hashed, and the hash is scaled according to the machine
size.  Architectures are free to implement optimised versions of
ext2_set_bit_atomic() and ext2_clear_bit_atomic().

c14c1a44

[PATCH] blockgroup_lock: hashed spinlocks for ext2 and ext3 · c9db333a

Andrew Morton authored Apr 12, 2003

ext2 and ext3 per-blockgroup metadata needs locking. An fs-wide lock is
expensive, and a per-blockgroup lock consumes too much storage (up to 32768
blockgroups per filesystem). We need something in-between.

blockgroup_locks are very simple hashed spinlocks which provide this
compromise. The size of the lock is scaled by NR_CPUS to implement an
additional speed/space tradeoff.

These locks are actually fairly generic. However I presented it as something
which is specific to ext2 and ext3 so that people wouldn't go using them all
over the place. They consume a lot of storage.

c9db333a

[PATCH] percpu_counters: approximate but scalable counters · ba8e8755

Andrew Morton authored Apr 12, 2003

Several places in ext2 and ext3 are using filesystem-wide counters which use
global locking.  Mainly for the orlov allocator's heuristics.

To solve the contention which this causes we can trade off accuracy against
speed.

This patch introduces a "percpu_counter" library type in which the counts are
per-cpu and are periodically spilled into a global counter.  Readers only
read the global counter.

These objects are *large*.  On a 32 CPU P4, they are 4 kbytes.  On a 4 way
p3, 128 bytes.

ba8e8755

[PATCH] /proc/meminfo documentation · f688c084
Andrew Morton authored Apr 12, 2003
```
From: Dave Hansen <haveblue@us.ibm.com>

Documents the information in /proc/meminfo
```
f688c084

[PATCH] vmalloc stats in /proc/meminfo · ffa5b8eb

Andrew Morton authored Apr 12, 2003

From: Matt Porter <porter@cox.net>

There was a thread a while back on lkml where Dave Hansen proposed this
simple vmalloc usage reporting patch. The thread pretty much died out as
most people seemed focused on what VM loading type bugs it could solve. I
had posted that this type of information was really valuable in debugging
embedded Linux board ports. A common example is where people do arch
specific setup that limits there vmalloc space and then they find modules
won't load. ;) Having the Vmalloc* info readily available is real useful in
helping folks to fix their kernel ports.

ffa5b8eb

[PATCH] /proc/interrupts allocates too much memory · 873015a8

Andrew Morton authored Apr 12, 2003

From: David Mosberger <davidm@napali.hpl.hp.com>

interrupts_open() can easily try to kmalloc() more memory than
supported by kmalloc.  E.g., with 16KB page size and NR_CPUS==64, it
would try to allocate 147456 bytes.

The workaround below is to allocate 4KB per 8 CPUs.  Not really a
solution, but the fundamental problem is that /proc/interrupts
shouldn't use a fixed buffer size in the first place.  I suppose
another solution would be to use vmalloc() instead.  It all feels like
bandaids though.

873015a8

[PATCH] Fix kmalloc_sizes[] indexing · 830d6ef2

Andrew Morton authored Apr 12, 2003

From: Brian Gerst and David Mosberger

The previous fix to the kmalloc_sizes[] array didn't null-terminate the
correct array.

Fix that up, and also avoid running ARRAY_SIZE() against an array which is
really a null-terminated list.

830d6ef2

[PATCH] architecture hooks for mem_map initialization · 17817b89

Andrew Morton authored Apr 12, 2003

From: Christoph Hellwig <hch@lst.de>

This patch is from the IA64 tree, with minor cleanups from me.

Split out initialization of pgdat->node_mem_map into a separate function
and allow architectures to override it.  This is needed for HP IA64
machines that have a virtually mapped memory map to support big
memory holes without having to use discontigmem.

(memmap_init_zone is non-static to allow the IA64 code to use it -
 I did that instead of passing it's address into the arch hook as
 it is done currently in the IA64 tree)

17817b89

[PATCH] bootmem speedup from the IA64 tree · 79e626e1

Andrew Morton authored Apr 12, 2003

From: Christoph Hellwig <hch@lst.de>

This patch is from the IA64 tree, with some minor cleanups by me.
David described it as:

This is a performance speed up and some minor indendation fixups.

The problem is that the bootmem code is (a) hugely slow and (b) has
execution that grow quadratically with the size of the bootmap bitmap.
This causes noticable slowdowns, especially on machines with (relatively)
large holes in the physical memory map. Issue (b) is addressed by
maintaining the "last_success" cache, so that we start the next search
from the place where we last found some memory (this part of the patch
could stand additional reviewing/testing). Issue (a) is addressed by
using find_next_zero_bit() instead of the slow bit-by-bit testing.

79e626e1

[PATCH] convert file_lock to a spinlock · a413a276

Andrew Morton authored Apr 12, 2003

Time to write a 2M file, one byte at a time:

Before:
        1.09s user 4.92s system 99% cpu 6.014 total
        0.74s user 5.28s system 99% cpu 6.023 total
        1.03s user 4.97s system 100% cpu 5.991 total

After:
	0.79s user 5.17s system 99% cpu 5.993 total
	0.79s user 5.17s system 100% cpu 5.957 total
	0.84s user 5.11s system 100% cpu 5.942 total

a413a276

[PATCH] correct vm_page_prot on stack pages · 300c2652

Andrew Morton authored Apr 12, 2003

From: David Mosberger <davidm@napali.hpl.hp.com>

The patch below is needed to make it possible to map stack pages
without execution permission (as we do on ia64).

300c2652

[PATCH] don't clear PG_uptodate on ENOSPC · 2accc2e3

Andrew Morton authored Apr 12, 2003

If get_block() returns -ENOSPC __block_write_full_page() is currently
clearing PG_uptodate.

Tht doesn't make any sense - failure to allocate space (or an IO error) does
not make the page not uptodate. It will create pages which are dirty, mapped
into pagetables and not uptodate, which is a nonsensical state.

2accc2e3

[PATCH] Fix deadlock with ext3+quota · 36b4f825

Andrew Morton authored Apr 12, 2003

From: Jan Kara <jack@ucw.cz>

Fixes a deadlock-causing lock-ranking bug between dqio_sem and
journal_start().

It sets up the needed infrastructure so that the quota code's sync_dquot()
operation can call into ext3 and arrange for the transaction start to be
nested outside the taking of dqio_sem.

36b4f825

[PATCH] Remove flush_page_to_ram() · edf20d3a

Andrew Morton authored Apr 12, 2003

From: Hugh Dickins <hugh@veritas.com>

This patch removes the long deprecated flush_page_to_ram. We have
two different schemes for doing this cache flushing stuff, the old
flush_page_to_ram way and the not so old flush_dcache_page etc. way:
see DaveM's Documentation/cachetlb.txt. Keeping flush_page_to_ram
around is confusing, and makes it harder to get this done right.

All architectures are updated, but the only ones where it amounts
to more than deleting a line or two are m68k, mips, mips64 and v850.

I followed a prescription from DaveM (though not to the letter), that
those arches with non-nop flush_page_to_ram need to do what it did
in their clear_user_page and copy_user_page and flush_dcache_page.

Dave is consterned that, in the v850 nb85e case, this patch leaves its
flush_dcache_page as was, uses it in clear_user_page and copy_user_page,
instead of making them all flush icache as well. That may be wrong:
I'm just hesitant to add cruft blindly, changing a flush_dcache macro
to flush icache too; and naively hope that the necessary flush_icache
calls are already in place. Miles, please let us know which way is
right for v850 nb85e - thanks.

edf20d3a

[PATCH] remove the test for null waitqueue in __wake_up() · 831cbe24

Andrew Morton authored Apr 12, 2003

I've had a warning in there for 4-5 months and it has never triggered.  I
think it's safe to remove this test.

831cbe24

[PATCH] Fix gen_rtc compilation error · a802b873

Andrew Morton authored Apr 12, 2003

From: Geert Uytterhoeven <geert@linux-m68k.org>

It updates include/asm-{generic,parisc}/rtc.h for the recent changes in
drivers/char/genrtc.c and include/asm-{m68k,ppc}/rtc.h.

get_rtc_time() now returns some RTC flags instead of a 0/-1 success/failure
indicator.  These flags include:

   - RTC_BATT_BAD: RTC battery is bad (can be detected on PA-RISC)
   - RTC_24H: Clock runs in 24 hour mode

Most of these flags are the same as drivers/char/rtc.c, but RTC_BATT_BAD is a
new one.

a802b873

[PATCH] radix_tree_delete API improvement · ed49cb09

Andrew Morton authored Apr 12, 2003

radix_tree_delete() currently returns 0 on success, -ENOENT if there was
nothing to delete.

But it is more useful to return the address of the deleted item on success
and NULL if there was no matching item.  It can potentially save a
lookup+delete operation.

ed49cb09

[PATCH] kobject hotplug fixes · 932fd605

Andrew Morton authored Apr 12, 2003

- allocated storage `envp' was being leaked on an error path

- kmalloc() returns void*, no need to cast it

- don't return 0 from a void-returning function

Greg has acked this patch.

932fd605

[PATCH] Fix module param decleration in pcilynx · a94538ff
Ben Collins authored Apr 11, 2003

a94538ff

11 Apr, 2003 16 commits

Merge davem@nuts.ninka.net:/home/davem/src/BK/net-2.5 · 7867b36d
David S. Miller authored Apr 11, 2003
```
into kernel.bkbits.net:/home/davem/net-2.5
```
7867b36d
[IGMP]: Dont dork with igmp timers on device down if not CONFIG_IP_MULTICAST. · 98bcb391
David S. Miller authored Apr 11, 2003

98bcb391
[IPSEC]: Add ipv4 tunnel transformer. · 66d2856c
David S. Miller authored Apr 11, 2003

66d2856c
[EBTABLES]: Get rid of brlock in ebtable_broute. · 925511b7
Stephen Hemminger authored Apr 11, 2003

925511b7
[BRIDGE]: Kill excessive stack usage in br_ioctl. · 21fed407
Stephen Hemminger authored Apr 11, 2003

21fed407
[IPV4]: Fix bootup lockup when !CONFIG_IP_MULTICAST. · 419c3e2e
Andrew Morton authored Apr 11, 2003

419c3e2e
Annotate sys_uselib() with user pointer annotation · 21347414
Linus Torvalds authored Apr 11, 2003

21347414

[PATCH] too much timer simplification... · 29c36d50

George Anzinger authored Apr 11, 2003

Noted by David Mosberger:

 "If someone happens to arm a periodic timer at exactly 256 jiffies (as
  ohci happens to do on platforms with HZ=1024), then you end up getting
  an endless loop of timer activations, causing a machine hang.

  The problem is that __run_timers updates base->timer_jiffies _before_
  running the callback routines.  If a callback re-arms the timer at
  exactly 256 jiffies, add_timers() will reinsert the timer into the list
  that we're currently processing, which of course will cause the timer to
  expire immediately again, etc., etc., ad naseum... "

The answer here is to move the whole expired list to a local header and
to not look back.

29c36d50

[PATCH] IEEE-1394/Firewire updates · 7b55ea65

Ben Collins authored Apr 11, 2003

- Convert nodemgr to new driver model.
- Convert to new module_param() calls.
- Merged fixes for devfs mkdir and some sleep-in-atomic fixes from
  mainline 2.5-bk
- Fix possible memory corruption on highlevel local read/write.
- Fix bitmap usage for some bitops.
- Fix bug in closing ISO stream.
- Fixes for nodemgr probing in the event of a reset storm.
- Workaround for nForce2 firewire chipset. This is preliminary.
- Conversion of SBP-2 to use new driver model in nodemgr, including
  providing a driver for firewire unit directories and registering
  proper callbacks.

7b55ea65

Merge bk://kernel.bkbits.net/gregkh/linux/linus-2.5 · 54fa1ff0
Linus Torvalds authored Apr 11, 2003
```
into home.transmeta.com:/home/torvalds/v2.5/linux
```
54fa1ff0
Merge kroah.com:/home/greg/linux/BK/bleed-2.5 · b97fe142
Greg Kroah-Hartman authored Apr 11, 2003
```
into kroah.com:/home/greg/linux/BK/gregkh-2.5
```
b97fe142
Merge kroah.com:/home/greg/linux/BK/bleed-2.5 · eab53a50
Greg Kroah-Hartman authored Apr 11, 2003
```
into kroah.com:/home/greg/linux/BK/i2c-2.5
```
eab53a50
[PATCH] i2c: Add i2c-viapro.c driver · 7e23bc16
Luca Tettamanti authored Apr 11, 2003

7e23bc16
[PATCH] USB: remove configuration change from pegasus.c · d33008a0
Oliver Neukum authored Apr 10, 2003
```
the driver should not mess with configurations here.
```
d33008a0
[PATCH] USB: remove configuration change from rtl8150 · 9adec06c
Oliver Neukum authored Apr 10, 2003
```
there's no reason this driver should mess with configurations.
```
9adec06c

Make sure to kunmap() the right address in fs/nfs/dir.c. · 8f44c5c2

Linus Torvalds authored Apr 10, 2003

Found by Rik van Riel:

 "There's a serious bug in the handling of the pointer returned
  by kmap_atomic() in nfs/dir.c.   The pointer (part of desc) is
  passed into find_dirent_name and from there into dir_decode,
  which modifies the pointer.

  That means you end up passing a wrong address to kunmap_atomic()."

8f44c5c2