Commits · 8f421acbbdc3daba1682bdf7b99083f4725ce5d3 · Kirill Smelkov / linux

20 Apr, 2003 40 commits

Merge home.transmeta.com:/home/torvalds/v2.5/akpm · 8f421acb
Linus Torvalds authored Apr 20, 2003
```
into home.transmeta.com:/home/torvalds/v2.5/linux
```
8f421acb
Update ensoniq driver to return whether the interrupt was for it · 2d0ed106
Linus Torvalds authored Apr 20, 2003
```
or not.
```
2d0ed106

Andrew Morton authored Apr 20, 2003

- fb_prepare_logo() is calling the undefined find_logo().  I think it wants
  fb_find_logo().

- fb_prepare_logo is not __init, therefore fb_find_logo() cannot be __init.

b009a1c6

[PATCH] Aggregated disk statistics · 34221611

Andrew Morton authored Apr 20, 2003

From: Rick Lindsley <ricklind@us.ibm.com>

To access all the system's disk statistics we currently need to access one
sysfs file per disk.  This clearly will not be acceptable with thousands of
disks.

The patch aggregates the system-wide statistics in real time and exposes them
via /proc/diskstats

34221611

[PATCH] Fix nfsctl for larger dev_t · 36ba76bb

Andrew Morton authored Apr 20, 2003

From: Andries.Brouwer@cwi.nl

The old NFS control interface passes dev_t's in from userspace.  This patch
keeps it working when the size of dev_t changes.

This is a deprecated interface - new nfs-utils uses an ascii representation
in exportfs.

Acked by Neil.

36ba76bb

[PATCH] smbfs: larger dev_t preparation · f0d10803

Andrew Morton authored Apr 20, 2003

Discard fewer bits of the device number received with smb.
This does not depend on anything else.

Andries

f0d10803

[PATCH] prepare device mapper for larger dev_t · cc43a08a

Andrew Morton authored Apr 20, 2003

From: Joe Thornber <thornber@sistina.com>

The only other thing that will need changing in dm to cope with 64bit
dev_t concerns the bitset I'm using to keep track of allocated minor
numbers.  A trivial patch like this would work for now:

cc43a08a

[PATCH] don't shrink slab for highmem allocations · 5a08774a

Andrew Morton authored Apr 20, 2003

From: William Lee Irwin III <wli@holomorphy.com>

If one's goal is to free highmem pages, shrink_slab() is an ineffective
method of recovering them, as slab pages are all ZONE_NORMAL or ZONE_DMA.
Hence, this "FIXME: do not do for zone highmem". Presumably this is a
question of policy, as highmem allocations may be satisfied by reaping slab
pages and handing them back; but the FIXME says what we should do.

5a08774a

[PATCH] Extend map_vm_area()/get_vm_area() · 2096040f

Andrew Morton authored Apr 20, 2003

From: Christoph Hellwig <hch@infradead.org> and David M-T

The ia64 port can use vmap(), but needs to be able to specify the protection
flags and the resulting vma's vm_flags.

The patch adds the two extra args to vmap(), updates the two callers and
fixes some comment spellos.

2096040f

[PATCH] fix CONFIG_NOMMU mismerges · 4a6b60f2

Andrew Morton authored Apr 20, 2003

From: Christoph Hellwig <hch@lst.de>

We already have better stubs in nommu.c; the additional inlines in mm.h only
cause compile failures.

4a6b60f2

[PATCH] Allocate hd_structs dynamically · 5fb58500

Andrew Morton authored Apr 20, 2003

From: Badari Pulavarty <pbadari@us.ibm.com>

Here is the patch to allocate hd_struct dynamically as we find
partitions.

There are 3 things I didn't like in the patch.

1) The patch allocates 15 pointers instead of 15 hd_structs (in the case of
   SCSI).  I was really hoping to get rid of "15" and make it really
   dynamic (in case we ever want to support more than 15 partitions
   per disk, etc.).  I thought about making it a linked list, but
   blk_partition_remap() needs to get to the hd_struct for a given partition
   every time we do IO.  So a linked list would be bad; we really need direct
   access to partition info.

2) I had to add "partno" to hd_struct, since part_dev_read() used to
   calculate the partition number from the address before.

3) kmalloc() failure in add_partition() will be silently ignored.

It saves 2048 bytes per disk.
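
The layout described above can be sketched in userspace C. This is only an illustration of "an array of pointers, filled in as partitions are found", not the kernel code: `struct part`, `struct disk`, `MAX_PARTS`, `add_partition()` as written here, `get_partition()` and `free_partitions()` are hypothetical stand-ins, and the real hd_struct has different fields.

```c
#include <stdlib.h>

#define MAX_PARTS 15    /* assumed limit, taken from the "15" in the message */

struct part {           /* hypothetical stand-in for hd_struct */
    unsigned long start_sect;
    unsigned long nr_sects;
    int partno;         /* stored explicitly; no longer derivable from the address */
};

struct disk {           /* hypothetical stand-in for the per-disk structure */
    struct part *part[MAX_PARTS];   /* pointers only; records allocated on demand */
};

/* Allocate the record for partition `partno` (1-based).
 * Returns 0 on success, -1 on a bad number, a duplicate, or allocation failure. */
int add_partition(struct disk *d, int partno,
                  unsigned long start, unsigned long len)
{
    struct part *p;

    if (partno < 1 || partno > MAX_PARTS || d->part[partno - 1])
        return -1;
    p = malloc(sizeof(*p));
    if (!p)
        return -1;      /* this sketch reports the failure instead of ignoring it */
    p->start_sect = start;
    p->nr_sects = len;
    p->partno = partno;
    d->part[partno - 1] = p;
    return 0;
}

/* Direct O(1) lookup, the property a linked list would lose; NULL if absent. */
struct part *get_partition(const struct disk *d, int partno)
{
    if (partno < 1 || partno > MAX_PARTS)
        return NULL;
    return d->part[partno - 1];
}

/* Release every allocated partition record. */
void free_partitions(struct disk *d)
{
    for (int i = 0; i < MAX_PARTS; i++) {
        free(d->part[i]);
        d->part[i] = NULL;
    }
}
```

A disk with one partition then pays for one record plus the pointer array, rather than for 15 full records.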

5fb58500

[PATCH] shm_get_stat-handle-hugetlb-pages.patch · 88bdd4c3

Andrew Morton authored Apr 20, 2003

From: William Lee Irwin III <wli@holomorphy.com>

shm_get_stat() didn't know about hugetlbpage-backed shm.

88bdd4c3

[PATCH] DAC960: add call to blk_queue_bounce_limit · d7b557d1

Andrew Morton authored Apr 20, 2003

From: Dave Olien <dmo@osdl.org>

The following patch adds a call to blk_queue_bounce_limit to the DAC960
driver.  Otherwise, it uses bounce buffering more than it needs to.

d7b557d1

[PATCH] oom-kill: preferentially kill swapoff · cca095e0

Andrew Morton authored Apr 20, 2003

From: Hugh Dickins <hugh@veritas.com>

The current behaviour is that once swapoff has filled memory, other tasks get
OOMkilled one by one until swapoff completes, or more likely hangs.  It is
better that swapoff be the first choice for OOMkill.

The patch changes the oom-killer so that it will kill off any
currently-running swapoff instance before killing any other task.

(Bit kludgy, couldn't think of a better way)
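
A rough sketch of the selection policy, under the assumption that each candidate carries a score and some marker identifying a running swapoff. The message does not say how the kernel recognises swapoff, so `is_swapoff`, `badness` and `select_victim()` are hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

struct task_sketch {
    const char *name;
    unsigned long badness;  /* hypothetical score: higher means a better victim */
    bool is_swapoff;        /* hypothetical marker for a running swapoff */
};

/* Pick the task to kill: any running swapoff wins outright,
 * otherwise fall back to the highest score. Returns NULL if n == 0. */
const struct task_sketch *select_victim(const struct task_sketch *t, size_t n)
{
    const struct task_sketch *best = NULL;

    for (size_t i = 0; i < n; i++) {
        if (t[i].is_swapoff)
            return &t[i];
        if (!best || t[i].badness > best->badness)
            best = &t[i];
    }
    return best;
}
```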

cca095e0

[PATCH] Permit interruption of swapoff · 6bf11a46

Andrew Morton authored Apr 20, 2003

From: Hugh Dickins <hugh@veritas.com>

Sometimes you start a swapoff and, seeing how long it takes, wish you had
not: allow a signal to interrupt and stop swapoff.

6bf11a46

[PATCH] Disallow swapoff if there is insufficient memory · 464f4e78

Andrew Morton authored Apr 20, 2003

From: Hugh Dickins <hugh@veritas.com>

First of three small "stop swapoff" patches based on 2.5.67-mm3:

stop swapoff 1/3 vm_enough_memory?

Before embarking upon swapoff, check vm_enough_memory. Mainly
for consistency in the overcommit_memory 2 (strict accounting) case:
fail with -ENOMEM if it wouldn't let the amount removed be committed.

Will always succeed in the overcommit_memory 1 case, as it should in
root-shoot-foot mode. In the overcommit_memory 0 case, well, I don't
care much either way, so opted for the simplest code: no special case.
Which means it could now fail at the start; but that's unlikely (case 0
is over-generous) and only when it would have got stuck later on anyway.
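
The three cases can be sketched as a single pre-check. This is a simplified model, not the kernel's vm_enough_memory(): the structure, its fields, and the heuristic used for mode 0 are assumptions made for the example.

```c
#include <errno.h>

enum overcommit_mode {
    OVERCOMMIT_HEURISTIC = 0,
    OVERCOMMIT_ALWAYS    = 1,
    OVERCOMMIT_STRICT    = 2
};

struct vm_state_sketch {            /* hypothetical, for illustration only */
    enum overcommit_mode mode;
    unsigned long committed;        /* pages already committed */
    unsigned long commit_limit;     /* limit enforced in strict mode */
    unsigned long free_estimate;    /* assumed heuristic input for mode 0 */
};

/* Would committing `pages` more pages be allowed? 0 if yes, -ENOMEM if not. */
int sketch_enough_memory(const struct vm_state_sketch *vm, unsigned long pages)
{
    switch (vm->mode) {
    case OVERCOMMIT_ALWAYS:
        return 0;                   /* never refuses */
    case OVERCOMMIT_STRICT:
        if (pages <= vm->commit_limit &&
            vm->committed <= vm->commit_limit - pages)
            return 0;
        return -ENOMEM;
    default:                        /* mode 0: no special case, generous heuristic */
        return pages <= vm->free_estimate ? 0 : -ENOMEM;
    }
}

/* Before starting swapoff, ask whether the pages held in that swap area
 * could be committed elsewhere. */
int sketch_swapoff_precheck(const struct vm_state_sketch *vm,
                            unsigned long swap_pages_in_use)
{
    return sketch_enough_memory(vm, swap_pages_in_use);
}
```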

464f4e78

[PATCH] use __GFP_REPEAT in pmd_alloc_one() · 36f6aa1b
Andrew Morton authored Apr 20, 2003
```
Convert all pmd_alloc_one() implementations to use __GFP_REPEAT
```
36f6aa1b

[PATCH] use __GFP_REPEAT in pte_alloc_one() · 68b5a30f

Andrew Morton authored Apr 20, 2003

Remove all the open-coded retry loops in various architectures, use
__GFP_REPEAT.

It could be that at some time in the future we change __GFP_REPEAT to give up
after ten seconds or so, so all the checks for failed allocations are
retained.

68b5a30f

[PATCH] make alloc_buffer_head take gfp_flags · 8db50e8b

Andrew Morton authored Apr 20, 2003

- alloc_buffer_head() should take the allocation mode as an arg, and not
  assume.

- Use __GFP_NOFAIL in JBD's call to alloc_buffer_head().

- Remove all the retry code from jbd_kmalloc() - do it via page allocator
  controls.

8db50e8b

[PATCH] implement __GFP_REPEAT, __GFP_NOFAIL, __GFP_NORETRY · 75908778

Andrew Morton authored Apr 20, 2003

This is a cleanup patch.

There are quite a lot of places in the kernel which will infinitely retry a
memory allocation.

Generally, they get it wrong.  Some do yield(), the semantics of which have
changed over time.  Some do schedule(), which can lock up if the caller is
SCHED_FIFO/RR.  Some do schedule_timeout(), etc.

And often it is unnecessary, because the page allocator will do the retry
internally anyway.  But we cannot rely on that - this behaviour may change
(-aa and -rmap kernels do not do this, for instance).

So it is good to formalise and to centralise this operation.  If an
allocation specifies __GFP_REPEAT then the page allocator must infinitely
retry the allocation.

The semantics of __GFP_REPEAT are "try harder".  The allocation _may_ fail
(the 2.4 -aa and -rmap VM's do not retry infinitely by default).

The semantics of __GFP_NOFAIL are "cannot fail".  It is a no-op in this VM,
but needs to be honoured (or fix up the callers) if the VM is changed to not
retry infinitely by default.

The semantics of __GFP_NORETRY are "try once, don't loop".  This isn't used
at present (although perhaps it should be, in swapoff).  It is mainly for
completeness.
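
A minimal sketch of how one central retry loop could honour the three flags. It models a hypothetical allocator that does not retry by default (the situation the message anticipates), so it is not the behaviour of this VM. The flag values, `REPEAT_LIMIT`, `alloc_with_policy()` and the `fail_n_times()` stub are all assumptions for the example.

```c
#include <stddef.h>

/* Illustrative bit values; the kernel's real assignments are not given here. */
#define GFP_REPEAT   0x1u   /* "try harder": may still fail */
#define GFP_NOFAIL   0x2u   /* "cannot fail": retry until it succeeds */
#define GFP_NORETRY  0x4u   /* "try once, don't loop" */

#define REPEAT_LIMIT 16     /* arbitrary bound chosen for this sketch */

typedef void *(*try_alloc_fn)(void *ctx);   /* one allocation attempt */

/* Call try_once() according to the retry policy in `flags`.
 * In this sketch GFP_NOFAIL takes precedence over the other two. */
void *alloc_with_policy(unsigned flags, try_alloc_fn try_once, void *ctx)
{
    unsigned attempts = 0;

    for (;;) {
        void *p = try_once(ctx);

        if (p)
            return p;
        attempts++;
        if (flags & GFP_NOFAIL)
            continue;               /* never give up */
        if (flags & GFP_NORETRY)
            return NULL;            /* single attempt only */
        if ((flags & GFP_REPEAT) && attempts < REPEAT_LIMIT)
            continue;               /* try harder, but bounded */
        return NULL;
    }
}

/* Demonstration stub: fails until the counter behind ctx reaches zero. */
static int dummy_object;

void *fail_n_times(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        return NULL;
    }
    return &dummy_object;
}
```

With the retry policy in one place, callers no longer need their own yield() or schedule() loops; they only state which guarantee they need.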

75908778

[PATCH] shmdt() speedup · efbb77b2

Andrew Morton authored Apr 20, 2003

From: William Lee Irwin III <wli@holomorphy.com>

Micro-optimize sys_shmdt(). There are methods of exploiting knowledge
of the vma's being searched to restrict the search space. These are:

(1) shm mappings always start their lives at file offset 0, so only
	vma's above shmaddr need be considered. find_vma() can be used
	to seek to the proper position in mm->mmap in O(lg(n)) time.

(2) The search is for a vma which could be a fragment of a broken-up
	shm mapping, which would have been created starting at shmaddr
	with vm_pgoff 0 and then continued no further into userspace
	than shmaddr + size. So after having found an initial vma, find
	the size of the shm segment it maps to calculate an upper bound
	to the virtualspace that needs to be searched.

(3) mremap() would have caused the original checks to miss vma's mapping
	the shm segment if shmaddr were the original address at which
	the shm segments were attached. This does no better and no worse
	than the original code in that situation.

(4) If the chain of references in vma->vm_file->f_dentry->d_inode->i_size
	is not guaranteed by refcounting and/or the shm code then this is
	oopsable; AFAICT an inode is always allocated.
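
Points (1) and (2) can be sketched over a sorted array instead of the kernel's vma structures. This is a simplification: it requires the first vma found to start exactly at shmaddr, and it stores the segment size in the vma itself as a stand-in for the i_size reached through vm_file. All names and the page size are assumptions.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL     /* assumed page size */

struct vma_sketch {
    unsigned long start, end;       /* mapping covers [start, end) */
    unsigned long pgoff;            /* offset into the segment, in pages */
    int shm_id;                     /* 0 if not a shm mapping (hypothetical) */
    unsigned long seg_size;         /* segment size in bytes (stand-in for i_size) */
};

/* Index of the first vma whose end lies above addr, by binary search over
 * an address-sorted array; n if there is none. Mirrors what find_vma() finds. */
size_t find_vma_index(const struct vma_sketch *v, size_t n, unsigned long addr)
{
    size_t lo = 0, hi = n;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (v[mid].end > addr)
            hi = mid;
        else
            lo = mid + 1;
    }
    return lo;
}

/* Count the vmas a detach at shmaddr would cover: start at the vma found by
 * the binary search and stop once past shmaddr + segment size. */
size_t shmdt_count(const struct vma_sketch *v, size_t n, unsigned long shmaddr)
{
    size_t i = find_vma_index(v, n, shmaddr);
    size_t count = 0;
    unsigned long limit;
    int id;

    if (i == n || v[i].start != shmaddr || v[i].pgoff != 0 || v[i].shm_id == 0)
        return 0;
    id = v[i].shm_id;
    limit = shmaddr + v[i].seg_size;    /* upper bound of the search */

    for (; i < n && v[i].start < limit; i++) {
        /* A fragment of the original mapping keeps a matching offset. */
        if (v[i].shm_id == id &&
            v[i].start - shmaddr == v[i].pgoff * SKETCH_PAGE_SIZE)
            count++;
    }
    return count;
}
```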

efbb77b2

[PATCH] AIO mmap fix · bb455250

Andrew Morton authored Apr 20, 2003

From: Badari Pulavarty <pbadari@us.ibm.com>

Here is a small bug fix for AIO. get_user_pages() takes number
of pages to map as argument. (not in bytes)
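
The unit mix-up is easy to make. A small sketch of converting a byte range into the page count such a call expects; `SKETCH_PAGE_SIZE` and `pages_spanned()` are illustrative names, and the page size is an assumption.

```c
#define SKETCH_PAGE_SIZE 4096UL     /* assumed page size for illustration */

/* Number of pages touched by the byte range [addr, addr + len). */
unsigned long pages_spanned(unsigned long addr, unsigned long len)
{
    unsigned long first, last;

    if (len == 0)
        return 0;
    first = addr / SKETCH_PAGE_SIZE;
    last = (addr + len - 1) / SKETCH_PAGE_SIZE;
    return last - first + 1;
}
```

Passing `len` itself where a page count is expected would request thousands of times more pages than intended.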

bb455250

[PATCH] quotactl(): sync all quotas · d637ceb0

Andrew Morton authored Apr 20, 2003

From: Jan Kara <jack@suse.cz>

I'm resending a patch which implements a quotactl(2) call for syncing
all devices.  In particular, it allows the caller not to specify the device
for syncing, in which case quotas on all devices are written.
The patch is rather trivial (mostly moving code).

d637ceb0

[PATCH] ATI Mach64 build fix · 061fa91f

Andrew Morton authored Apr 20, 2003

From: Geert Uytterhoeven <geert@linux-m68k.org>

Atyfb: Add missing parts of reversal of Mobility changes, allowing ATI Mach64
GX support to compile again.

061fa91f

[PATCH] hugetlb math overflow fix · 03b83710

Andrew Morton authored Apr 20, 2003

From: William Lee Irwin III <wli@holomorphy.com>

And this one fixes an overflow when there is more than 4GB of hugetlb.
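
The message does not show the arithmetic involved, so the following only illustrates the general class of bug: a page count converted to bytes in 32 bits wraps at 4GB. The huge page size and function names are assumptions.

```c
#include <stdint.h>

#define SKETCH_HPAGE_SHIFT 22   /* assumed 4MB huge pages, for illustration */

/* Buggy form: the shift is performed in 32 bits and wraps at 4GB. */
uint32_t hugetlb_bytes_32(uint32_t nr_pages)
{
    return nr_pages << SKETCH_HPAGE_SHIFT;
}

/* Fixed form: widen to 64 bits before shifting. */
uint64_t hugetlb_bytes_64(uint32_t nr_pages)
{
    return (uint64_t)nr_pages << SKETCH_HPAGE_SHIFT;
}
```

With 1024 huge pages of 4MB each, the 32-bit version yields 0 while the 64-bit version yields 4294967296.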

03b83710

[PATCH] follow_hugetlb_page fix · a3efc1fa

Andrew Morton authored Apr 20, 2003

From: William Lee Irwin III <wli@holomorphy.com>

follow_hugetlb_page() drops out of the loop prematurely and fails to take the
appropriate refcounts if its starting address was not hugepage-aligned.

It looked a bit unclean too, so I rewrote it.  This fixes a bug, and more
importantly, makes the thing readable by something other than a compiler
(e.g.  programmers).

a3efc1fa

[PATCH] Clean up various buffer-head dependencies · cda55f33

Andrew Morton authored Apr 20, 2003

From: William Lee Irwin III <wli@holomorphy.com>

Remove page_has_buffers() from various functions, document the dependencies
on buffer_head.h from other files besides filemap.c, and s/this file/core VM/
in filemap.c

cda55f33

[PATCH] Move __set_page_dirty_buffers to fs/buffer.c · 5549174d

Andrew Morton authored Apr 20, 2003

From: William Lee Irwin III <wli@holomorphy.com>

Move __set_page_dirty_buffers() to fs/buffer.c, as per the FIXME.

5549174d

[PATCH] Turn on NUMA rebalancing · 26fbf90f

Andrew Morton authored Apr 20, 2003

From: "Martin J. Bligh" <mbligh@aracnet.com>

I'd forgotten that I'd set this to only fire every 20s in the past, because
it would rebalance too aggressively.  That seems to be fixed now, so we should
turn it back on.

26fbf90f

[PATCH] Make PCI scanning order the same as 2.4 · 609b0188

Andrew Morton authored Apr 20, 2003

From: Chuck Ebbert <76306.1226@compuserve.com>

2.4 builds its global PCI device list in breadth-first order.

2.5 is doing the scan that way but defers the construction of the global list
until later and then does it depth-first.  This causes devices to be found in
a different order by drivers.  This patch fixed that problem for me.
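
The difference between the two orders can be seen on a small tree. This sketch uses a hypothetical `bus_node` type rather than the kernel's PCI structures, and the fixed array bounds are arbitrary.

```c
#include <stddef.h>

#define MAX_CHILDREN 4
#define MAX_NODES    16

struct bus_node {               /* hypothetical: a device that may bridge to children */
    int id;
    size_t nr_children;
    const struct bus_node *child[MAX_CHILDREN];
};

/* Breadth-first: every device on a level before anything behind its bridges.
 * Writes visited ids to out[] (at most cap) and returns how many were written. */
size_t order_breadth_first(const struct bus_node *root, int *out, size_t cap)
{
    const struct bus_node *queue[MAX_NODES];
    size_t head = 0, tail = 0, n = 0;

    queue[tail++] = root;
    while (head < tail && n < cap) {
        const struct bus_node *b = queue[head++];

        out[n++] = b->id;
        for (size_t i = 0; i < b->nr_children && tail < MAX_NODES; i++)
            queue[tail++] = b->child[i];
    }
    return n;
}

/* Depth-first: descend behind each bridge before visiting its siblings.
 * `n` is the number of ids already written; pass 0 for the root call. */
size_t order_depth_first(const struct bus_node *node, int *out,
                         size_t cap, size_t n)
{
    if (n >= cap)
        return n;
    out[n++] = node->id;
    for (size_t i = 0; i < node->nr_children; i++)
        n = order_depth_first(node->child[i], out, cap, n);
    return n;
}
```

For a root 0 with children 1 and 2, where 1 bridges to 3, breadth-first visits 0, 1, 2, 3 and depth-first visits 0, 1, 3, 2, so a driver binding to "the first matching device" may see a different one.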

Russell King has acked this change.

609b0188

[PATCH] keyboard.c Fix SAK in raw mode · 5da505b1

Andrew Morton authored Apr 20, 2003

From: Chris Heath <chris@heathens.co.nz>

Trivial fix to get the SAK key working in raw and medium raw modes.  Patch is
against kernel 2.5.67.

5da505b1

[PATCH] Minor fix for driver/serial/core.c · 72689e67

Andrew Morton authored Apr 20, 2003

From: Jean Tourrilhes <jt@bougret.hpl.hp.com>

	The following command will do nothing at all on 2.5.X :
		setserial /dev/ttyS0 uart none

72689e67

[PATCH] detect_lost_tick locking fixes · d9a4b6c5

Andrew Morton authored Apr 20, 2003

From: john stultz <johnstul@us.ibm.com>

This patch fixes a race in the timer_interrupt code caused by
detect_lost_tick(). Since we're doing lost-tick compensation outside
timer->mark_offset, time can pass between time-source reads which can cause
gettimeofday inconsistencies.

Additionally detect_lost_tick() was broken for the PIT case, since the whole
point of detect_lost_tick() is to interpolate between two time sources to
find inconsistencies. Additionally this could cause xtime_lock seq_lock
reader starvation which has been causing machine hangs for SMP boxes that use
the PIT as a time source.

This patch fixes the described race by removing detect_lost_tick() and
instead implementing the lost tick detection code inside mark_offset().

Some of the divs and mods being added here might concern folks, but by not
calling timer->get_offset() in detect_lost_tick() we eliminate much of the
same math. I did some simple cycle counting and the new code comes out on
average equivalent or faster.

d9a4b6c5

[PATCH] get_offset_pit and do_timer_overflow vs IRQ locking · e2ac56f6

Andrew Morton authored Apr 20, 2003

From: john stultz <johnstul@us.ibm.com>, Alexander Atanasov <alex@ssi.bg>

We want to make sure we update jiffies_p and count_p atomically.  So I'm
inserting the spin_unlock_irqrestore() after we update count_p, rather than
just before.

e2ac56f6

[PATCH] Fix jiffies_to_time[spec | val] and converse to use · 0ebcfd99

Andrew Morton authored Apr 20, 2003

From: george anzinger <george@mvista.com>

In the current system (2.5.67) time_spec to jiffies, time_val to
jiffies and the converse (jiffies to time_val and jiffies to
time_spec) all use 1/HZ as the measure of a jiffie.  Because of the
inability of the PIT to actually generate an accurate 1/HZ interrupt,
the wall clock is updated with a more accurate value (999848
nanoseconds per jiffie for HZ = 1000).  This causes a 1/HZ
interpretation of jiffies based timing to run faster than the wall
clock, thus causing sleeps and timers to expire short of the requested
time.  Try, for example:

time sleep 60

This patch changes the conversion routines to use the same value as
the wall clock update code to do the conversions.

The actual math is almost all done at compile time.  The run time
conversions require little if any more execution time.
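
The size of the error can be worked through with the 999848 ns figure from the message. The functions below are a simplified illustration, not the kernel's conversion routines (which do more than this); the names are assumptions.

```c
#include <stdint.h>

#define NSEC_PER_SEC_SKETCH 1000000000ULL
#define SKETCH_HZ           1000ULL
#define TICK_NSEC_REAL      999848ULL   /* actual jiffy length at HZ = 1000 */

/* Old behaviour: treat each jiffy as exactly 1/HZ seconds. */
uint64_t sec_to_jiffies_nominal(uint64_t sec)
{
    return sec * SKETCH_HZ;
}

/* Sketch of the fix: convert with the real tick length, rounding up
 * so the timer never expires before the requested time. */
uint64_t sec_to_jiffies_real(uint64_t sec)
{
    return (sec * NSEC_PER_SEC_SKETCH + TICK_NSEC_REAL - 1) / TICK_NSEC_REAL;
}

/* Wall-clock nanoseconds that actually pass during `jiffies` ticks. */
uint64_t jiffies_to_wall_ns(uint64_t jiffies)
{
    return jiffies * TICK_NSEC_REAL;
}
```

For a 60 second sleep, the nominal conversion gives 60000 jiffies, which last only 59990880000 ns of wall time, about 9.12 ms short. Converting with the real tick length gives 60010 jiffies, which cover the full 60 seconds.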

This patch must be applied after the patch I posted earlier today
which fixed the CLOCK_MONOTONIC resolution issue.

0ebcfd99

[PATCH] Fix POSIX timers to give CLOCK_MONOTONIC full · 2f98681f

Andrew Morton authored Apr 20, 2003

The POSIX CLOCK_MONOTONIC currently has only 1/HZ resolution. Further, it is
tied to jiffies (i.e. is a restatement of jiffies) rather than "xtime" or the
gettimeofday() clock.

This patch changes CLOCK_MONOTONIC to be a restatement of gettimeofday() plus
an offset to remove any clock setting activity from CLOCK_MONOTONIC. An
offset is kept that represents the difference between CLOCK_MONOTONIC and
gettimeofday(). This offset is updated whenever the gettimeofday() clock is
set to back the clock setting change out of CLOCK_MONOTONIC (which by the
standard, can not be set).

With this change CLOCK_REALTIME (a direct restatement of gettimeofday()),
CLOCK_MONOTONIC and gettimeofday() will all tick at the same time and with
the same rate. And all will be affected by NTP adjustments (save those which
actually set the time).
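
The offset scheme can be modelled in a few lines. This is a toy model of the relationship described above, with hypothetical names and a single nanosecond counter standing in for the gettimeofday() clock.

```c
#include <stdint.h>

struct clocks_sketch {
    int64_t wall_ns;            /* stand-in for the gettimeofday() clock */
    int64_t mono_offset_ns;     /* CLOCK_MONOTONIC minus the wall clock */
};

int64_t sketch_realtime(const struct clocks_sketch *c)
{
    return c->wall_ns;
}

int64_t sketch_monotonic(const struct clocks_sketch *c)
{
    return c->wall_ns + c->mono_offset_ns;
}

/* Time passing (including rate adjustments) advances both clocks together. */
void sketch_advance(struct clocks_sketch *c, int64_t delta_ns)
{
    c->wall_ns += delta_ns;
}

/* Setting the wall clock: back the jump out of the offset so that
 * the monotonic reading does not move. */
void sketch_set_wall(struct clocks_sketch *c, int64_t new_wall_ns)
{
    c->mono_offset_ns -= new_wall_ns - c->wall_ns;
    c->wall_ns = new_wall_ns;
}
```

Because the monotonic value is derived from the wall clock, it inherits the wall clock's resolution and rate while staying immune to clock setting.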

2f98681f

[PATCH] Fix and clean up DCACHE_REFERENCED usage · 0e3efbd1

Andrew Morton authored Apr 20, 2003

From: Maneesh Soni <maneesh@in.ibm.com>

This patch changes the way DCACHE_REFERENCED flag is used. It
got messed up in dcache_rcu iterations. I hope this will be ok now.

The flag was meant to be an advisory flag used by
prune_dcache() so as not to free dentries which have recently
entered the d_lru list. On the first pass in prune_dcache(), the dentries
marked DCACHE_REFERENCED are left in place with the flag reset, and they
are freed on the next pass.

So, now we mark the dentry as DCACHE_REFERENCED when it first
enters the d_lru list in dput() and reset the flag in prune_dcache().
If the flag remains reset in the next call to prune_dcache(), the
dentry is then freed.

Also, I don't think any filesystem has to use this flag, as it is taken
care of by the dcache layer. The patch removes such code from a few
filesystems. Moreover, these filesystems were doing the wrong thing anyway,
as they were changing the flag outside dcache_lock.

Changes:
o dput() marks dentry DCACHE_REFERENCED when it is added to the dentry_unused
  list
o no need to set the flag in dget, dget_locked, d_lookup as these
  increment the ref count anyway.
o check the ref count in prune_dcache and use DCACHE_REFERENCED flag just for
  two stage aging.
o remove code for setting DCACHE_REFERENCED from reiserfs, fat, xfs and
  exportfs.
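
The two-stage aging described above can be sketched with a much-reduced dentry. The structure and function names are hypothetical, and locking (dcache_lock) is omitted.

```c
#include <stdbool.h>
#include <stddef.h>

struct dentry_sketch {          /* hypothetical, much-reduced dentry */
    int refcount;
    bool referenced;            /* stands in for DCACHE_REFERENCED */
    bool on_unused_list;
    bool freed;
};

/* dput(): when the last reference goes away, the dentry enters the
 * unused list marked as referenced. */
void sketch_dput(struct dentry_sketch *d)
{
    if (--d->refcount == 0) {
        d->on_unused_list = true;
        d->referenced = true;
    }
}

/* One prune pass over n dentries; returns how many were freed. */
size_t sketch_prune(struct dentry_sketch *v, size_t n)
{
    size_t freed = 0;

    for (size_t i = 0; i < n; i++) {
        struct dentry_sketch *d = &v[i];

        if (!d->on_unused_list || d->refcount > 0)
            continue;               /* in use: the ref count decides, not the flag */
        if (d->referenced) {
            d->referenced = false;  /* first pass: clear the flag, keep the dentry */
            continue;
        }
        d->on_unused_list = false;  /* second pass: flag still clear, free it */
        d->freed = true;
        freed++;
    }
    return freed;
}
```

A dentry that has just been released therefore survives exactly one prune pass before it becomes eligible for freeing.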

0e3efbd1

[PATCH] dentry_stat accounting fix · de8e3749

Andrew Morton authored Apr 20, 2003

From: Maneesh Soni <maneesh@in.ibm.com>

This patch corrects the dentry_stat.nr_unused calculation.

In select_parent() and shrink_dcache_anon() we were not doing any adjustments
to the nr_unused count after manipulating the dentry_unused list.  Now the
nr_unused count is decremented if the dentry is on dentry_unused list and is
removed from there.

Further in the same routines, we have to adjust the nr_unused count again if
the dentry is moved to the end of d_lru list for pruning.

de8e3749

[PATCH] dmfe: don't free skb with local interrupts disabled · 70d67000

Andrew Morton authored Apr 20, 2003

dev_kfree_skb() can end up calling local_bh_enable() which goes BUG if local
interrupts are disabled.  Apparently it can deadlock.

So move the skb freeing outside the lock in the dmfe driver.  It will
decrease the lock hold time as well.

70d67000

[PATCH] Fix nec98 partition parser link error · cb970405
Andrew Morton authored Apr 20, 2003
```
Fix this:

fs/partitions/nec98.c:169: undefined reference to `parse_bsd'
```
cb970405