Commits · 110eecfb33d49f8265cfdcf228926d7791154ba1 · Kirill Smelkov / linux

10 May, 2004 40 commits

[PATCH] blk: cache queue_congestion_on/off_threshold values · 110eecfb

Andrew Morton authored May 10, 2004

From: "Chen, Kenneth W" <kenneth.w.chen@intel.com>

It's kind of redundant that queue_congestion_on/off_threshold gets
calculated on every I/O and they produce the same number over and over
again unless q->nr_requests gets changed (which is probably a very rare
event). We can cache those values in the request_queue structure.
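
A minimal user-space sketch of the caching idea, not the patch itself. The `demo_queue` structure, the function names, and the 7/8 and 13/16 ratios are all assumptions made for illustration; the commit message does not state the kernel's exact arithmetic. The thresholds are recomputed only when `nr_requests` changes, and the per-I/O path reads the cached fields.

```c
#include <assert.h>

/* Hypothetical stand-in for the request_queue fields discussed above. */
struct demo_queue {
	unsigned long nr_requests;
	unsigned long nr_congestion_on;   /* cached "on" threshold */
	unsigned long nr_congestion_off;  /* cached "off" threshold */
};

/* Slow path: recompute both thresholds when nr_requests changes.
 * The ratios used here are illustrative assumptions. */
static void demo_set_nr_requests(struct demo_queue *q, unsigned long nr)
{
	assert(nr > 0);
	q->nr_requests = nr;
	q->nr_congestion_on = nr - nr / 8;
	q->nr_congestion_off = nr - nr / 8 - nr / 16;
}

/* Hot path: a plain load instead of a recalculation on every I/O. */
static unsigned long demo_congestion_on_threshold(const struct demo_queue *q)
{
	return q->nr_congestion_on;
}

static unsigned long demo_congestion_off_threshold(const struct demo_queue *q)
{
	return q->nr_congestion_off;
}
```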

110eecfb

[PATCH] swsusp documentation updates · bbfbb758

Andrew Morton authored May 10, 2004

From: Pavel Machek <pavel@ucw.cz>

bbfbb758

[PATCH] simplify mqueue_inode_info->messages allocation · b3f8802c

Andrew Morton authored May 10, 2004

From: Chris Wright <chrisw@osdl.org>

Currently, if a user creates an mqueue and passes an mq_attr, the
info->messages will be created twice (and the extra one is properly freed).
This patch simply delays the allocation so that it only ever happens once. 
The relevant mq_attr data is passed to lower levels via the dentry->d_fsdata
fs private data.  This also helps isolate the areas we'd need to touch to do
rlimits on mqueues.

b3f8802c

[PATCH] bfs filesystem read past the end of dir · e37a41af

Andrew Morton authored May 10, 2004

From: Jakub Jermar <jermar@itbs.cz>

I found out that BFS filesystem will eventually try to read and interpret
garbage past the end of directory in bfs_add_entry().  If the garbage
(interpreted as i-node number) is not set to zero (does it have to be?)
bfs_add_entry() will consider it a regular directory entry. 

This causes weird things like this:
# touch a
# rm a
# ls
# touch b
# ls
a

My patch detects an attempt to read past the end of directory and explicitly
clears the garbage that represents i-node number.  Thus the correct behaviour
is achieved.

(was unable to contact Tigran)

e37a41af

[PATCH] update Documentation/md.txt · 28d627fb

Andrew Morton authored May 10, 2004

From: <spam@altium.nl> (Dick Streefland)

The following patch documents the currently undocumented raid= kernel
parameter.

28d627fb

[PATCH] es7000 subarch update for generic arch · 9828805c

Andrew Morton authored May 10, 2004

From: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>

This is an ES7000 subarchitecture update. It makes ES7000 a part of the
generic architecture, so a single compiled kernel will be able to choose
the correct set of parameters, routines ("genapic"), and boot path. It
uses criteria provided by the subarch for platform identification. In the
case of ES7000, these are a unique product/vendor string in the ACPI/MP OEM
table and the server control registers. The patch is confined to the es7000
subarch and the generic subarch. It was tested on ES7000 as well as on a
generic Intel 8x Xeon system. Andi Kleen has reviewed the changes.

9828805c

[PATCH] CLOCK_TICK_RATE: use CLOCK_TICK_RATE · 1dd5cc77

Andrew Morton authored May 10, 2004

From: Thorsten Kranzkowski <dl8bcu@dl8bcu.de>

use CLOCK_TICK_RATE where 1193180 was used in general timing calculations. 
(optional)

1dd5cc77

[PATCH] CLOCK_TICK_RATE: use PIT_TICK_RATE in *spkr.c · 52161621

Andrew Morton authored May 10, 2004

From: Thorsten Kranzkowski <dl8bcu@dl8bcu.de>

52161621

[PATCH] CLOCK_TICK_RATE: introduce asm-*/8253pit.h, #define PIT_TICK_RATE constant. · 80c44e42

Andrew Morton authored May 10, 2004

From: Thorsten Kranzkowski <dl8bcu@dl8bcu.de>

The calculation of the counter values in drivers/input/misc/pcspkr.c is
incorrectly based on CLOCK_TICK_RATE.  This goes unnoticed in i386 because
there the system clock is driven by the same Programmable Interval Timer chip
as the speaker.  But this doesn't hold true on other archs, e.g.  Alpha.
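
To make the distinction concrete, here is a small sketch of how a speaker divisor would be derived from the PIT input clock rather than from the system tick clock. `DEMO_PIT_TICK_RATE` and `demo_speaker_count()` are hypothetical names, and the 1193182 Hz value is an assumed nominal i8253 clock; as the message notes, the real constant is not the same on every architecture.

```c
#include <assert.h>

/* Assumed nominal i8253 input clock in Hz; a hypothetical stand-in
 * for a per-architecture PIT_TICK_RATE. */
#define DEMO_PIT_TICK_RATE 1193182UL

/* Divisor to program into the PIT speaker channel for a tone of hz Hz.
 * It depends only on the PIT clock, not on the system timer clock. */
static unsigned int demo_speaker_count(unsigned int hz)
{
	assert(hz > 0);
	return (unsigned int)(DEMO_PIT_TICK_RATE / hz);
}
```

On an architecture whose system clock is not driven by the PIT, substituting the system tick rate for `DEMO_PIT_TICK_RATE` would produce the wrong divisor and therefore the wrong pitch.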

To solve this problem I made these patches:

1/3:    introduce asm-*/8253pit.h, #define PIT_TICK_RATE constant.
        It seems this is not always the same value.
2/3:    use PIT_TICK_RATE in *spkr.c
3/3:    use CLOCK_TICK_RATE where 1193180 was used in general timing
        calculations. (optional)

There are still some places where the magic number is used instead of the
#define (vt_ioctl.c, gameport.c) but I left them as-is.  I got some responses
from arch maintainers to specifically not touch their respective architectures
so changing these places would mean breakage for them.

Tested on Alpha and i386, ack'ed by Ralf Baechle for MIPS.


This patch:

introduce asm-*/8253pit.h, #define PIT_TICK_RATE constant.

80c44e42

[PATCH] readahead: keep file->f_ra sane · 2a12ed0e

Andrew Morton authored May 10, 2004

When two threads are simultaneously pread()ing from the same fd (which is a
legitimate thing to do), the readahead code thinks that a huge amount of
seeking is happening and shrinks the window, damaging performance a lot.

I don't see a sane way to avoid this within the readahead code, so take a
private copy of the readahead state and restore it prior to returning from the
read.

2a12ed0e

[PATCH] jiffies-to-clockt fix · 60967810

Andrew Morton authored May 10, 2004

From: john stultz <johnstul@us.ibm.com>

This patch polishes up Tim Schmielau's (tim@physik3.uni-rostock.de) fix for
jiffies_to_clock_t() and jiffies_64_to_clock_t(). The issue observed was
/proc output not matching up to wall time due to accumulated error
caused by HZ not being exactly 1000 on i386 systems. The solution is to
correct that error by using the more accurate TICK_NSEC in our calculation.
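
The accumulated error can be seen in a short sketch. All names below are hypothetical, and `DEMO_TICK_NSEC` is an assumed tick length for a nominal HZ of 1000 on i386; the functions only illustrate the two styles of conversion and are not the kernel's implementation.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_HZ           1000u        /* nominal tick rate */
#define DEMO_USER_HZ      100u         /* clock_t units per second */
#define DEMO_NSEC_PER_SEC 1000000000u
#define DEMO_TICK_NSEC    999848u      /* assumed real tick length in ns */

/* Naive conversion: pretends every tick is exactly 1/HZ seconds long. */
static uint64_t demo_jiffies_to_clock_t_naive(uint64_t jiffies)
{
	return jiffies / (DEMO_HZ / DEMO_USER_HZ);
}

/* Conversion based on the real tick length in nanoseconds. */
static uint64_t demo_jiffies_to_clock_t(uint64_t jiffies)
{
	/* Guard the multiplication against 64-bit overflow. */
	assert(jiffies <= UINT64_MAX / DEMO_TICK_NSEC);
	return jiffies * DEMO_TICK_NSEC / (DEMO_NSEC_PER_SEC / DEMO_USER_HZ);
}
```

Under these assumptions, one million ticks convert to 100000 clock_t units with the naive form but 99984 with the tick-length form, a gap that keeps growing with uptime.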

Additionally, this patch corrects 3 warnings in the TCP layer uncovered by
this change.

60967810

[PATCH] cyclades cleanups · c2e48749

Andrew Morton authored May 10, 2004

From: Marcelo Tosatti <marcelo.tosatti@cyclades.com>

- cleanups for cyclades Kconfig entry	(Adrian Bunk/me)
- janitors project: remove dead function	(Don Koch)

From: aris@cathedrallabs.org (Aristeu Sergio Rozanski Filho)

	Use the standard min/max macros

c2e48749

[PATCH] fix ramdisk size assembler warning · 15e5643c

Andrew Morton authored May 10, 2004

From: Jorn Engel <joern@wohnheim.fh-wedel.de>

 AS	arch/i386/boot/setup.o
/usr/src/linux-2.6.5/arch/i386/boot/setup.S: Assembler messages:
/usr/src/linux-2.6.5/arch/i386/boot/setup.S:159: Warning: value 0x37ffffff truncated to 0x37ffffff

The warning is correct, the calculated value for ramdisk_max would be
0xb7ffffff instead of 0x37ffffff.  Truncating 0xb7ffffff to 0x37ffffff
is desired behaviour, so we should do it explicitly.
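
As an illustration only: one way to make such a truncation explicit is to mask off the top bit. The mask and the helper name are assumptions; the message gives the two values but not the expression used in setup.S.

```c
#include <assert.h>
#include <stdint.h>

/* Keep only the low 31 bits so the truncation is visible in the source
 * instead of being left to the assembler. */
static uint32_t demo_truncate_ramdisk_max(uint32_t computed)
{
	uint32_t masked = computed & 0x7fffffffu;

	assert((masked >> 31) == 0);  /* postcondition: bit 31 is clear */
	return masked;
}
```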

15e5643c

[PATCH] ppc64: use generic ipc syscall translation · 53fbd0b0

Andrew Morton authored May 10, 2004

From: David Gibson <david@gibson.dropbear.id.au>

Currently ppc64 has its own code to convert 32-bit ipc() syscalls to 64-bit,
rather than using the common translation code from ipc/compat.c. This patch,
tweaked slightly from an earlier version of Anton Blanchard's, fixes that,
replacing the ppc64 code with calls to the common code.

I've run the LSB IPC tests, and as many of the LTP IPC tests as I could figure
out how to run easily, and it seems to pass them all.

53fbd0b0

[PATCH] gcc-3.4.0 fixes for 2.6.6-rc3 x86_64 kernel · bf434bf2

Andrew Morton authored May 10, 2004

From: Mikael Pettersson <mikpe@csd.uu.se>

Here are some patches to fix compilation warnings from
gcc-3.4.0 in the 2.6.6-rc3 x86_64 kernel.

- puts() type conflict in boot/compressed/misc.c:
  rename to putstr(), just like i386 did
- cast-as-lvalue in ia32_copy_siginfo_from_user():
  use temporary
- code before declaration in io_apic.c:
  move decl up
- code before declaration in ioremap.c:
  move existing #ifndef up
- cast-as-lvalue (tons of them) from UP version of per_cpu():
  merged asm-generic's version

bf434bf2

[PATCH] fixup 68360 module refcounting · 0b4e162c

Andrew Morton authored May 10, 2004

From: Christoph Hellwig <hch@lst.de>

0b4e162c

[PATCH] Warn when smp_call_function() is called with interrupts disabled · 43653667

Andrew Morton authored May 10, 2004

From: Keith Owens <kaos@sgi.com>

Almost every architecture has a comment above smp_call_function()

 * You must not call this function with disabled interrupts or from a
 * hardware interrupt handler or from a bottom half handler.

I have not seen any problems with calling smp_call_function() from a bottom
half handler, but calling it with interrupts disabled can definitely
deadlock.  This bug is hard to reproduce and even harder to debug.

CPU A                               CPU B
Disable interrupts
                                    smp_call_function()
                                    Take call_lock
                                    Send IPIs
                                    Wait for all cpus to acknowledge IPI
                                    CPU A has not responded, spin waiting
                                    for cpu A to respond, holding call_lock
smp_call_function()
Spin waiting for call_lock
Deadlock                            Deadlock

Change all smp_call_function() to WARN_ON(irqs_disabled()).  It should be
BUG_ON() but some buggy code like SCSI sg will break with BUG_ON, so just
warn for now.  Change it to BUG_ON after the buggy code has been fixed.

43653667

[PATCH] worker_thread race fix · 5805ad40

Andrew Morton authored May 10, 2004

Fix a waitqueue-handling race in worker_thread().

5805ad40

[PATCH] pcmcia/i82365.c warning fix · df125ce9

Andrew Morton authored May 10, 2004

From: "Luiz Fernando N. Capitulino" <lcapitulino@prefeitura.sp.gov.br>

drivers/pcmcia/i82365.c: At top level:
drivers/pcmcia/i82365.c:71: warning: `version' defined but not used

df125ce9

[PATCH] throttle P4 thermal warnings · d14c7e92

Andrew Morton authored May 10, 2004

From: Zwane Mwaikambo <zwane@linuxpower.ca>

In really bad conditions this can keep printing for a while, so throttle the
output somewhat.  Also change the "CPU%d" formatting to better match the
other boot output.

d14c7e92

[PATCH] fix deadlock in create_workqueue() · b4ad84fc

Andrew Morton authored May 10, 2004

Fix bug identified by Srivatsa Vaddagiri <vatsa@in.ibm.com>:

There's a deadlock in __create_workqueue when CONFIG_HOTPLUG_CPU is set. This
can happen when create_workqueue_thread fails to create a worker thread. In
that case, we call destroy_workqueue with cpu hotplug lock held.
destroy_workqueue however also attempts to take the same lock.
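
The shape of the problem can be modelled in user space. This is a sketch under stated assumptions, not the actual fix: every `demo_` name is hypothetical, the non-recursive lock is reduced to a flag whose double acquisition trips an assertion, and deferring the teardown until after the unlock is just one plausible way to avoid taking the lock twice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical non-recursive lock: taking it twice would deadlock,
 * which is modelled here as a failed assertion. */
static bool demo_hotplug_locked;

static void demo_lock(void)
{
	assert(!demo_hotplug_locked);
	demo_hotplug_locked = true;
}

static void demo_unlock(void)
{
	assert(demo_hotplug_locked);
	demo_hotplug_locked = false;
}

struct demo_wq {
	int nr_threads;
};

/* Teardown takes the lock itself, so callers must not already hold it. */
static void demo_destroy_workqueue(struct demo_wq *wq)
{
	demo_lock();
	wq->nr_threads = 0;
	demo_unlock();
	free(wq);
}

/* fail_at < 0 means every worker starts; otherwise that CPU's worker fails. */
static struct demo_wq *demo_create_workqueue(int ncpus, int fail_at)
{
	struct demo_wq *wq = calloc(1, sizeof(*wq));
	bool failed = false;

	if (!wq)
		return NULL;

	demo_lock();
	for (int cpu = 0; cpu < ncpus; cpu++) {
		if (cpu == fail_at) {
			failed = true;  /* remember the failure, clean up later */
			break;
		}
		wq->nr_threads++;
	}
	demo_unlock();

	/* Clean up only after dropping the lock, because teardown re-takes it. */
	if (failed) {
		demo_destroy_workqueue(wq);
		return NULL;
	}
	return wq;
}
```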

b4ad84fc

[PATCH] remove blk_queue_bounce() printks · 7676bfa0

Andrew Morton authored May 10, 2004

From: Matt Domsch <Matt_Domsch@dell.com>

Jens Axboe wrote:
It should just be deleted. As you note, it is a debug message. I
originally added it so we would have some clues as to dma capability for
bug reports. There never was any, the check can go :)

7676bfa0

[PATCH] Fix MTD suspend/resume · b94ef24c

Andrew Morton authored May 10, 2004

From: Russell King <rmk@arm.linux.org.uk>

This patch carries forward the following bug fix from MTD CVS, which causes a
lot of noise after a suspend/resume cycle on ARM devices.

revision 1.127
date: 2003/07/02 20:29:38;  author: acurtis;  state: Exp;  lines: +2 -1
Added FL_STATUS to the FL_READY case in put_chip(). (Eliminate noise)

b94ef24c

[PATCH] dentry and inode cache hash algorithm performance changes. · 99effef9

Andrew Morton authored May 10, 2004

From: "Jose R. Santos" <jrsantos@austin.ibm.com>

It alleviates some issues seen with Linux when accessing millions of files on
machines with large amounts of RAM (+32GB).  Both algorithms are based on some
studies that Dominique Heger was doing on hash table efficiencies in Linux.
The dentry hash table has been tested in small systems with one internal IDE
hard disk as well as in large SMP with many Fibre Channel disks.  Dominique
claims that in all the testing done, they did not see one case where this hash
function provided worse performance and that in most tests they were seeing
better performance.

The inode hash function was done by me based on Dominique's original work and
has only been stress tested with SpecSFS.  It provided a 3% improvement over
the default algorithm in the SpecSFS results and speedups in the response
time of almost all filesystem operations the benchmark stresses.  With the
better distribution it is also possible to reduce the number of inode buckets
from 32 million to 16 million and still get slightly better results.

Anton was nice enough to provide some graphs that show the distribution 
before and after the patch at http://samba.org/~anton/linux/sfs/1/

For the dentry hash function, some of my other coworkers have put this hash
function through various testing and have concluded that the hash function was
equal or better than the default hash function.  These runs were done with a
(hopefully to be Open Source soon) benchmark called FFSB which can simulate
various I/O patterns across many filesystems and variable file sizes.

The SpecSFS fileset is basically a lot of small files, the number of which
varies depending on the size of the run.  For a not so big SMP system the
number of files is in the +20 million range.  Of those 20 million files only
10% are accessed randomly by the client.  The purpose of this is that the
benchmark tries to stress not only the NFS layer but the VM and filesystem
layers as well.  The filesets are also hundreds of gigabytes in size in order
to promote disk head movement by guaranteeing cache misses in memory.  In SFS,
27% of the workload is lookups, and __d_lookup has been showing high in my
profiles.

For the inode hash the problem that I see is that when running a benchmark
with this huge fileset we end up trying to free a lot of inode entries during
the run while trying to put new entries in cache.  We end up calling
ifind_fast() which calls find_inodes_fast() with inode_lock held.  In order
to avoid holding the inode_lock for long we needed to avoid having long chains
in that hash function.

When I took a look at the original hash function, I found it to be a bit too
simple for any workload.  My solution (which takes advantage of Dominique's
work) was to create a hash function that could generate completely
different hashes depending on the hashval and the superblock in order to have
the hash scale as we added more filesystems to the machine.
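
A rough sketch of a hash that mixes both inputs, so that the same hashval on two different superblocks tends to land in different buckets. This is not the function from the patch: the mixing constant, the bucket count, and every name are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_HASH_BITS 10u
#define DEMO_HASH_MASK ((1u << DEMO_HASH_BITS) - 1u)
/* Multiplicative mixing constant, chosen only for illustration. */
#define DEMO_MIX_PRIME 0x9e370001u

/* sb stands in for the superblock address, hashval for the inode number. */
static uint32_t demo_inode_hash(uintptr_t sb, uint32_t hashval)
{
	uint32_t tmp;

	assert(sb != 0);
	/* Fold the superblock into the value so filesystems spread apart. */
	tmp = (hashval * (uint32_t)sb) ^ (DEMO_MIX_PRIME + hashval);
	/* Mix high bits down before masking to the bucket count. */
	tmp ^= (tmp ^ DEMO_MIX_PRIME) >> DEMO_HASH_BITS;
	return tmp & DEMO_HASH_MASK;
}
```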

Both of these problems can be somewhat tuned out by increasing the number of
buckets of both d and i cache but it got to a point where I had 256MB of inode
and 128MB in dentry hash buckets on a not so large SMP.  With the hash changes
I have been able to reduce the number of buckets to 128MB for inode cache and
to 32MB for dentry cache and still get better performance.

If it helps my case...  I haven't been running this benchmark for long, so I
haven't been able to find a way to cheat.  I need to come up with generic
solutions until I can find a cheat for the benchmark.  :)


SDET results:

Steve Pratt seems to have an SDET setup already and he did me the favor of
running SDET with a reduced dentry hash table size.  I believe that
his table suggests that less than 3% change is acceptable variability, but
overall he got a 5% better number using the new hash algorithm.

A) x4408way1.sdet.2.6.5100000-8p.04-05-05_12.08.44 vs 
B) x4408way1.sdet.2.6.5+hash-100000-8p.04-05-05_11.48.02


  Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
  Inode-cache hash table entries: 1048576 (order: 10, 4194304 bytes) 

Results:Throughput

                                          tolerance = 0.00 + 3.00% of A
                      A            B
   Threads      Ops/sec      Ops/sec    %diff         diff    tolerance
----------- ------------ ------------ -------- ------------ ------------
         1    4341.9300    4401.9500     1.38        60.02       130.26 
         2    8242.2000    8165.1200    -0.94       -77.08       247.27 
         4   15274.4900   15257.1000    -0.11       -17.39       458.23 
         8   21326.9200   21320.7000    -0.03        -6.22       639.81 
        16   23056.2100   24282.8000     5.32      1226.59       691.69  * 
        32   23397.2500   24684.6100     5.50      1287.36       701.92  * 
        64   23372.7600   23632.6500     1.11       259.89       701.18 
       128   17009.3900   16651.9600    -2.10      -357.43       510.28 
=========================================================================

99effef9

[PATCH] cmpci OSS driver update · 9e315f49

Andrew Morton authored May 10, 2004

From: C.L. Tien <cltien@cmedia.com.tw>

Current version from cmedia.

9e315f49

[PATCH] EDD: follow sysfs convention, MODULE_VERSION, remove dead SCSI symlink · da78fe73

Andrew Morton authored May 10, 2004

From: Matt Domsch <Matt_Domsch@dell.com>

Clean up the edd.c driver.

* use kobject_set_name() instead of snprintf() per GregKH's recommendation.
* Add MODULE_VERSION()
* s/driverfs/sysfs/ in Kconfig
* Remove report URL message, as there have been too many BIOSs reported,
  virtually none of which are EDD-capable.  This may return if/when I
  develop a better reporting method and database to capture/store the
  data from users.
* Remove the unused code for creating a symlink to the scsi_device.
  This never worked right, and I'm going to show the relationship from
  a userspace tool which uses libsysfs instead.

da78fe73

[PATCH] blk_start_queue() should use kblockd · 12db2584

Andrew Morton authored May 10, 2004

kblockd is the thread which runs unplug functions, not keventd.

12db2584

[PATCH] Only Print Taint Message Once · d137ab48

Andrew Morton authored May 10, 2004

From: Rusty Russell <rusty@rustcorp.com.au>

Only print the tainted message the first time.  Its purpose is to warn
users that we can't support them, not to fill their logs.
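
The print-once behaviour amounts to a latch around the message. A minimal sketch, with hypothetical names and message text: the taint flag is still recorded on every call, but the warning is emitted only the first time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

static unsigned int demo_taint_flags;       /* accumulated taint bits */
static unsigned int demo_messages_printed;  /* how often we actually warned */

/* Record the taint every time, but log only on the first call. */
static void demo_add_taint(unsigned int flag, const char *module)
{
	static bool warned;

	assert(module != NULL);
	demo_taint_flags |= flag;
	if (!warned) {
		warned = true;
		demo_messages_printed++;
		fprintf(stderr, "%s: module taints kernel\n", module);
	}
}
```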

d137ab48

[PATCH] Un-inline spinlocks on ppc64 · 5dfd0a43

Andrew Morton authored May 10, 2004

From: Paul Mackerras <paulus@samba.org>

The patch below moves the ppc64 spinlocks and rwlocks out of line and into
arch/ppc64/lib/locks.c, and implements _raw_spin_lock_flags for ppc64.

Part of the motivation for moving the spinlocks and rwlocks out of line was
that I needed to add code to the slow paths to yield the processor to the
hypervisor on systems with shared processors. On these systems, a cpu as
seen by the kernel is a virtual processor that is not necessarily running
full-time on a real physical cpu. If we are spinning on a lock which is
held by another virtual processor which is not running at the moment, we
are just wasting time. In such a situation it is better to do a hypervisor
call to ask it to give the rest of our time slice to the lock holder so
that forward progress can be made.

The one problem with out-of-line spinlock routines is that lock contention
will show up in profiles in the spin_lock etc. routines rather than in the
callers, as it does with inline spinlocks. I have added a CONFIG_SPINLINE
config option for people that want to do profiling. In the longer term, Anton
is talking about teaching the profiling code to attribute samples in the spin
lock routines to the routine's caller.

This patch reduces the kernel by about 80kB on my G5. With inline
spinlocks selected, the kernel gets about 4kB bigger than without the
patch, because _raw_spin_lock_flags is slightly bigger than _raw_spin_lock.

This patch depends on the patch from Keith Owens to add
_raw_spin_lock_flags.

5dfd0a43

[PATCH] Allow architectures to reenable interrupts on contended spinlocks · 07f94531

Andrew Morton authored May 10, 2004

From: Keith Owens <kaos@sgi.com>

As requested by Linus, update all architectures to add the common
infrastructure.  Tested on ia64 and i386.

Enable interrupts while waiting for a disabled spinlock, but only if
interrupts were enabled before issuing spin_lock_irqsave().

This patch consists of three sections :-

* An architecture independent change to call _raw_spin_lock_flags()
  instead of _raw_spin_lock() when the flags are available.

* An ia64 specific change to implement _raw_spin_lock_flags() and to
  define _raw_spin_lock(lock) as _raw_spin_lock_flags(lock, 0) for the
  ASM_SUPPORTED case.

* Patches for all other architectures and for ia64 with !ASM_SUPPORTED
  to map _raw_spin_lock_flags(lock, flags) to _raw_spin_lock(lock).
  Architecture maintainers can define _raw_spin_lock_flags() to do
  something useful if they want to enable interrupts while waiting for
  a disabled spinlock.
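
The intended behaviour can be modelled in a single-threaded sketch. Everything here is an assumption made for illustration: the interrupt state is a plain variable, contention is simulated by a counter of failed polls, and the `demo_` functions only mirror the roles of spin_lock_irqsave() and _raw_spin_lock_flags(). Interrupts are re-enabled while waiting only if the saved flags say they were enabled beforehand.

```c
#include <assert.h>
#include <stdbool.h>

/* Simulated CPU interrupt state and a counter of waits with IRQs on. */
static bool demo_irqs_enabled = true;
static unsigned int demo_waits_with_irqs_on;

struct demo_lock {
	bool held;
	unsigned int busy_polls;  /* simulated contention: polls that fail */
};

static bool demo_trylock(struct demo_lock *l)
{
	if (l->busy_polls > 0) {
		l->busy_polls--;
		return false;
	}
	l->held = true;
	return true;
}

/* irqs_were_enabled plays the role of the saved flags. */
static void demo_spin_lock_flags(struct demo_lock *l, bool irqs_were_enabled)
{
	assert(!demo_irqs_enabled);  /* caller has already disabled interrupts */
	while (!demo_trylock(l)) {
		if (irqs_were_enabled) {
			demo_irqs_enabled = true;   /* let interrupts in while waiting */
			demo_waits_with_irqs_on++;
			demo_irqs_enabled = false;  /* disable again before retrying */
		}
	}
}

/* Model of spin_lock_irqsave(): save state, disable, then take the lock. */
static bool demo_spin_lock_irqsave(struct demo_lock *l)
{
	bool saved = demo_irqs_enabled;

	demo_irqs_enabled = false;
	demo_spin_lock_flags(l, saved);
	return saved;
}
```

An architecture that does not implement the flags variant would simply ignore `irqs_were_enabled` and spin with interrupts off, which corresponds to the fallback mapping described in the last bullet.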

07f94531

[PATCH] Kill some 'No description found...' warnings. (kernel-api.sgml) · a023cd55

Andrew Morton authored May 10, 2004

From: Alexey Dobriyan <adobriyan@mail.ru>

Fix various kernel-doc parameters.

a023cd55

[PATCH] Kill a warning while making pdfdocs. · 72468a40

Andrew Morton authored May 10, 2004

From: Alexey Dobriyan <adobriyan@mail.ru>

  DOCPROC Documentation/DocBook/parportbook.sgml
Warning(drivers/parport/share.c:188): No description found for parameter 'drv'
(kernel-doc parameter name is incorrect.)

72468a40

[PATCH] com90xx error message patch: check_region() gone · 8b3ca458

Andrew Morton authored May 10, 2004

From: Greg Aumann <Greg_Aumann@sil.org>

This patch updates two error messages to reflect changes in the code.

8b3ca458

[PATCH] Improve laptop mode's block_dump output · 6835de14

Andrew Morton authored May 10, 2004

From: "Theodore Ts'o" <tytso@mit.edu>

This patch improves the output produced by "echo 1 >
/proc/sys/vm/block_dump", in the following ways:

1) The messages are printed with KERN_DEBUG, so that even if sysklogd is
   running, if configured appropriately, it will not need to write to log
   files.

2) The inode which is dirtied by a process is now identified more
   precisely by inode number and filesystem ID, and by a dcache name if
   present.

3) In the generic filesystem sget function, the superblock id (s_id) is
   filled in with the filesystem type by default.  Filesystems which are
   block-device based will override s_id, but this allows pseudo
   filesystems such as tmpfs, procfs, etc.  to be identified in (2).

6835de14

[PATCH] find_user locking and leak fix · 475c3656

Andrew Morton authored May 10, 2004

find_user() is being called from set/get_priority(), but it doesn't take the
needed lock, and those callers were forgetting to drop the refcount which
find_user() took.
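
Both halves of the fix can be pictured with a small model. The names, the fixed user table, and the lock-depth counter are hypothetical; the sketch only shows a lookup that takes the lock itself and a caller that drops the reference it was handed.

```c
#include <assert.h>
#include <stddef.h>

struct demo_user {
	int uid;
	int refcount;
};

static struct demo_user demo_users[] = { { 0, 1 }, { 1000, 1 } };
static int demo_lock_depth;  /* stands in for the lock protecting the table */

/* Look up a user under the lock and take a reference for the caller. */
static struct demo_user *demo_find_user(int uid)
{
	struct demo_user *found = NULL;

	demo_lock_depth++;  /* lock */
	for (size_t i = 0; i < sizeof(demo_users) / sizeof(demo_users[0]); i++) {
		if (demo_users[i].uid == uid) {
			demo_users[i].refcount++;
			found = &demo_users[i];
			break;
		}
	}
	demo_lock_depth--;  /* unlock */
	return found;
}

/* Drop a reference obtained from demo_find_user(). */
static void demo_free_uid(struct demo_user *u)
{
	assert(u->refcount > 0);
	u->refcount--;
}

/* Caller pattern: every successful lookup is paired with a release. */
static int demo_user_exists(int uid)
{
	struct demo_user *u = demo_find_user(uid);

	if (!u)
		return 0;
	demo_free_uid(u);
	return 1;
}
```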

475c3656

[PATCH] mptfusion depends on scsi · 5a80c2ea

Andrew Morton authored May 10, 2004

From: Olaf Hering <olh@suse.de>

5a80c2ea

[PATCH] reiserfs: add device info to diagnostic messages · 9511c080

Andrew Morton authored May 10, 2004

From: Chris Mason <mason@suse.com>

From: Jeff Mahoney <jeffm@suse.com>

Add device info to the various reiserfs warnings and panics so you can tell
which filesystem triggers the message.  Loosely based on code from Oleg
Drokin.

9511c080

[PATCH] reiserfs: xattr permission fix · cee42600

Andrew Morton authored May 10, 2004

From: Chris Mason <mason@suse.com>

From: jeffm@suse.com

reiserfs permission bug fix for xattrs

cee42600

[PATCH] reiserfs: quota support · 446a7461

Andrew Morton authored May 10, 2004

From: Chris Mason <mason@suse.com>

ReiserFS support for quotas.  Originally from Jan Kara

446a7461

[PATCH] reiserfs: xattr locking fixes · 30304fc9

Andrew Morton authored May 09, 2004

From: Chris Mason <mason@suse.com>

From: jeffm@suse.com

reiserfs xattr locking fixes

30304fc9