Commits · 68b5a30ff3f3bd193ab1c824b7dc9ed464988a41 · nexedi / linux

20 Apr, 2003 24 commits

[PATCH] use __GFP_REPEAT in pte_alloc_one() · 68b5a30f

Andrew Morton authored Apr 20, 2003

Remove all the open-coded retry loops in various architectures, use
__GFP_REPEAT.

It could be that at some time in the future we change __GFP_REPEAT to give up
after ten seconds or so, so all the checks for failed allocations are
retained.

68b5a30f

[PATCH] make alloc_buffer_head take gfp_flags · 8db50e8b

Andrew Morton authored Apr 20, 2003

- alloc_buffer_head() should take the allocation mode as an arg, and not
  assume.

- Use __GFP_NOFAIL in JBD's call to alloc_buffer_head().

- Remove all the retry code from jbd_kmalloc() - do it via page allocator
  controls.

8db50e8b

[PATCH] implement __GFP_REPEAT, __GFP_NOFAIL, __GFP_NORETRY · 75908778

Andrew Morton authored Apr 20, 2003

This is a cleanup patch.

There are quite a lot of places in the kernel which will infinitely retry a
memory allocation.

Generally, they get it wrong.  Some do yield(), the semantics of which have
changed over time.  Some do schedule(), which can lock up if the caller is
SCHED_FIFO/RR.  Some do schedule_timeout(), etc.

And often it is unnecessary, because the page allocator will do the retry
internally anyway.  But we cannot rely on that - this behaviour may change
(-aa and -rmap kernels do not do this, for instance).

So it is good to formalise and to centralise this operation.  If an
allocation specifies __GFP_REPEAT then the page allocator must infinitely
retry the allocation.

The semantics of __GFP_REPEAT are "try harder".  The allocation _may_ fail
(the 2.4 -aa and -rmap VM's do not retry infinitely by default).

The semantics of __GFP_NOFAIL are "cannot fail".  It is a no-op in this VM,
but needs to be honoured (or fix up the callers) if the VM is changed to not
retry infinitely by default.

The semantics of __GFP_NORETRY are "try once, don't loop".  This isn't used
at present (although perhaps it should be, in swapoff).  It is mainly for
completeness.
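
As a rough illustration of the three flags described above, here is a userspace sketch of a centralised retry policy. It models a VM that does not retry forever by default. The flag values, `struct demo_pool`, `demo_alloc()` and the two retry budgets are all hypothetical names chosen for this example; none of them come from the patch.

```c
#include <stdbool.h>

/* Hypothetical flag bits for this sketch; not the kernel's actual values. */
#define DEMO_GFP_REPEAT  0x1u
#define DEMO_GFP_NOFAIL  0x2u
#define DEMO_GFP_NORETRY 0x4u

/* Toy memory pool: the next `failures_left` allocation attempts fail. */
struct demo_pool {
    int failures_left;
    int attempts;       /* total attempts made so far */
};

static bool demo_try_once(struct demo_pool *pool)
{
    pool->attempts++;
    if (pool->failures_left > 0) {
        pool->failures_left--;
        return false;
    }
    return true;
}

/*
 * Central retry policy. base_tries and repeat_tries are assumed tuning
 * knobs (repeat_tries >= base_tries >= 1), not kernel parameters.
 */
bool demo_alloc(struct demo_pool *pool, unsigned int flags,
                int base_tries, int repeat_tries)
{
    int budget = base_tries;

    if (flags & DEMO_GFP_NORETRY)
        budget = 1;                 /* try once, don't loop */
    else if (flags & DEMO_GFP_REPEAT)
        budget = repeat_tries;      /* try harder, but may still fail */

    for (int tries = 0; ; tries++) {
        if (demo_try_once(pool))
            return true;
        if (flags & DEMO_GFP_NOFAIL)
            continue;               /* cannot fail: keep retrying */
        if (tries + 1 >= budget)
            return false;
    }
}
```

Callers in this sketch still check the return value, matching the note that checks for failed allocations are retained.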

75908778

[PATCH] shmdt() speedup · efbb77b2

Andrew Morton authored Apr 20, 2003

From: William Lee Irwin III <wli@holomorphy.com>

Micro-optimize sys_shmdt(). There are methods of exploiting knowledge
of the vma's being searched to restrict the search space. These are:

(1) shm mappings always start their lives at file offset 0, so only
	vma's above shmaddr need be considered. find_vma() can be used
	to seek to the proper position in mm->mmap in O(lg(n)) time.

(2) The search is for a vma which could be a fragment of a broken-up
	shm mapping, which would have been created starting at shmaddr
	with vm_pgoff 0 and then continued no further into userspace
	than shmaddr + size. So after having found an initial vma, find
	the size of the shm segment it maps to calculate an upper bound
	to the virtual space that needs to be searched.

(3) mremap() would have caused the original checks to miss vma's mapping
	the shm segment if shmaddr were the original address at which
	the shm segments were attached. This does no better and no worse
	than the original code in that situation.

(4) If the chain of references in vma->vm_file->f_dentry->d_inode->i_size
	is not guaranteed by refcounting and/or the shm code then this is
	oopsable; AFAICT an inode is always allocated.
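
Points (1) and (2) can be sketched in userspace over a sorted array of mappings. This is a simplified model, not the kernel code: `struct demo_vma`, `demo_find_vma()` and `demo_shmdt_count()` are hypothetical, the segment size is stored directly in each entry instead of being read through the inode, and the function only counts the fragments that would be detached.

```c
#include <stddef.h>

#define DEMO_PAGE_SIZE 4096UL

struct demo_vma {
    unsigned long start;     /* inclusive */
    unsigned long end;       /* exclusive */
    unsigned long pgoff;     /* offset into the backing object, in pages */
    int seg;                 /* backing shm segment id, or -1 if not shm */
    unsigned long seg_size;  /* segment size in bytes (valid if seg >= 0) */
};

/* Binary search: index of the first vma with end > addr, or n if none. */
static size_t demo_find_vma(const struct demo_vma *v, size_t n,
                            unsigned long addr)
{
    size_t lo = 0, hi = n;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (v[mid].end > addr)
            hi = mid;
        else
            lo = mid + 1;
    }
    return lo;
}

/* A fragment of an attach at shmaddr keeps pgoff == distance from shmaddr. */
static int demo_matches(const struct demo_vma *v, unsigned long shmaddr)
{
    return v->seg >= 0 && v->start >= shmaddr &&
           (v->start - shmaddr) / DEMO_PAGE_SIZE == v->pgoff;
}

/* Count the fragments belonging to the segment attached at shmaddr. */
size_t demo_shmdt_count(const struct demo_vma *v, size_t n,
                        unsigned long shmaddr)
{
    size_t i = demo_find_vma(v, n, shmaddr);   /* skip everything below */
    size_t count = 0;
    unsigned long limit;
    int seg;

    while (i < n && !demo_matches(&v[i], shmaddr))
        i++;
    if (i == n)
        return 0;

    seg = v[i].seg;
    limit = shmaddr + v[i].seg_size;           /* upper bound of the search */
    for (; i < n && v[i].start < limit; i++)
        if (v[i].seg == seg && demo_matches(&v[i], shmaddr))
            count++;
    return count;
}
```

The binary search stands in for find_vma(), and `limit` is the upper bound from point (2): nothing at or above `shmaddr + size` is examined.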

efbb77b2

[PATCH] AIO mmap fix · bb455250

Andrew Morton authored Apr 20, 2003

From: Badari Pulavarty <pbadari@us.ibm.com>

Here is a small bug fix for AIO. get_user_pages() takes the number
of pages to map as its argument, not a byte count.
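
To make the bytes-versus-pages distinction concrete, here is a small helper that converts a byte range into a page count. It is only a sketch of the arithmetic, not the actual AIO change; `demo_nr_pages()` and the 4 KB page size are assumptions for illustration.

```c
#define DEMO_PAGE_SHIFT 12   /* assumed 4 KB pages */

/* Number of pages touched by the byte range [addr, addr + len). */
unsigned long demo_nr_pages(unsigned long addr, unsigned long len)
{
    unsigned long first, last;

    if (len == 0)
        return 0;
    first = addr >> DEMO_PAGE_SHIFT;
    last = (addr + len - 1) >> DEMO_PAGE_SHIFT;
    return last - first + 1;
}
```

Passing `len` itself where a page count is expected would ask for 4096 times too many pages on a page-aligned range.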

bb455250

[PATCH] quotactl(): sync all quotas · d637ceb0

Andrew Morton authored Apr 20, 2003

From: Jan Kara <jack@suse.cz>

  I'm resending a patch which implements quotactl(2) call for syncing
all devices. In particular, it allows the caller not to specify the device
for syncing and in that case quotas on all the devices are written.
The patch is rather trivial (mostly moving the code).

d637ceb0

[PATCH] ATI Mach64 build fix · 061fa91f

Andrew Morton authored Apr 20, 2003

From: Geert Uytterhoeven <geert@linux-m68k.org>

Atyfb: Add missing parts of reversal of Mobility changes, allowing ATI Mach64
GX support to compile again.

061fa91f

[PATCH] hugetlb math overflow fix · 03b83710

Andrew Morton authored Apr 20, 2003

From: William Lee Irwin III <wli@holomorphy.com>

And this one fixes an overflow when there is more than 4GB of hugetlb.
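
The commit message does not say where the overflow occurred. As a generic illustration of how a byte count wraps at 4 GB in 32-bit arithmetic, consider the sketch below; the 4 MB huge page size and both function names are assumptions for the example.

```c
#include <stdint.h>

/* Assumed huge page size for illustration: 4 MB (shift of 22). */
#define DEMO_HPAGE_SHIFT 22

/* Byte count computed in 32 bits: wraps once the total reaches 4 GB. */
uint32_t demo_hugetlb_bytes32(uint32_t nr_hugepages)
{
    return nr_hugepages << DEMO_HPAGE_SHIFT;
}

/* Widening to 64 bits before the shift keeps the full value. */
uint64_t demo_hugetlb_bytes64(uint32_t nr_hugepages)
{
    return (uint64_t)nr_hugepages << DEMO_HPAGE_SHIFT;
}
```

With 1024 huge pages (exactly 4 GB) the 32-bit version yields 0, while the 64-bit version yields 4294967296.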

03b83710

[PATCH] follow_hugetlb_page fix · a3efc1fa

Andrew Morton authored Apr 20, 2003

From: William Lee Irwin III <wli@holomorphy.com>

follow_hugetlb_page() drops out of the loop prematurely and fails to take the
appropriate refcounts if its starting address was not hugepage-aligned.

It looked a bit unclean too, so I rewrote it.  This fixes a bug, and more
importantly, makes the thing readable by something other than a compiler
(e.g.  programmers).

a3efc1fa

[PATCH] Clean up various buffer-head dependencies · cda55f33

Andrew Morton authored Apr 20, 2003

From: William Lee Irwin III <wli@holomorphy.com>

Remove page_has_buffers() from various functions, document the dependencies
on buffer_head.h from other files besides filemap.c, and s/this file/core VM/
in filemap.c

cda55f33

[PATCH] Move __set_page_dirty_buffers to fs/buffer.c · 5549174d

Andrew Morton authored Apr 20, 2003

From: William Lee Irwin III <wli@holomorphy.com>

Move __set_page_dirty_buffers() to fs/buffer.c, as per the FIXME.

5549174d

[PATCH] Turn on NUMA rebalancing · 26fbf90f

Andrew Morton authored Apr 20, 2003

From: "Martin J. Bligh" <mbligh@aracnet.com>

I'd forgotten that I'd set this to only fire every 20s in the past, because
it would rebalance too aggressively.  That seems to be fixed now, so we should
turn it back on.

26fbf90f

[PATCH] Make PCI scanning order the same as 2.4 · 609b0188

Andrew Morton authored Apr 20, 2003

From: Chuck Ebbert <76306.1226@compuserve.com>

2.4 builds its global PCI device list in breadth-first order.

2.5 is doing the scan that way but defers the construction of the global list
until later and then does it depth-first.  This causes devices to be found in
a different order by drivers.  The below fixed that problem for me.

Russell King has acked this change.
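
The difference between the two list orders can be seen on a toy bus tree. The sketch below is not the PCI code; `struct demo_dev` and both traversal functions are hypothetical, and it assumes a tree rooted at bus 0 with at most `DEMO_MAX_DEVS` devices.

```c
#include <stddef.h>

#define DEMO_MAX_DEVS 16

struct demo_dev {
    char name;       /* one-letter label */
    int  bus;        /* bus the device sits on */
    int  child_bus;  /* bus behind this device if it is a bridge, else -1 */
};

/* Depth-first: descend behind each bridge as soon as it is listed. */
static size_t demo_dfs(const struct demo_dev *devs, size_t n, int bus,
                       char *out, size_t pos)
{
    for (size_t i = 0; i < n; i++) {
        if (devs[i].bus != bus)
            continue;
        out[pos++] = devs[i].name;
        if (devs[i].child_bus >= 0)
            pos = demo_dfs(devs, n, devs[i].child_bus, out, pos);
    }
    return pos;
}

/* out must have room for n + 1 characters. */
void demo_depth_first(const struct demo_dev *devs, size_t n, char *out)
{
    size_t pos = demo_dfs(devs, n, 0, out, 0);
    out[pos] = '\0';
}

/* Breadth-first: list every device on a bus before visiting child buses. */
void demo_breadth_first(const struct demo_dev *devs, size_t n, char *out)
{
    int queue[DEMO_MAX_DEVS + 1];
    size_t head = 0, tail = 0, pos = 0;

    queue[tail++] = 0;                       /* start at root bus 0 */
    while (head < tail) {
        int bus = queue[head++];
        for (size_t i = 0; i < n; i++) {
            if (devs[i].bus != bus)
                continue;
            out[pos++] = devs[i].name;
            if (devs[i].child_bus >= 0 && tail < DEMO_MAX_DEVS + 1)
                queue[tail++] = devs[i].child_bus;
        }
    }
    out[pos] = '\0';
}
```

For a root bus holding A, bridge B and E, with C and bridge D behind B, and F behind D, the breadth-first list is A B E C D F while the depth-first list is A B C D F E. A driver that binds to "the first matching device" can therefore pick a different one.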

609b0188

[PATCH] keyboard.c Fix SAK in raw mode · 5da505b1

Andrew Morton authored Apr 20, 2003

From: Chris Heath <chris@heathens.co.nz>

Trivial fix to get the SAK key working in raw and medium raw modes.  Patch is
against kernel 2.5.67.

5da505b1

[PATCH] Minor fix for driver/serial/core.c · 72689e67

Andrew Morton authored Apr 20, 2003

From: Jean Tourrilhes <jt@bougret.hpl.hp.com>

	The following command will do nothing at all on 2.5.X :
		setserial /dev/ttyS0 uart none

72689e67

[PATCH] detect_lost_tick locking fixes · d9a4b6c5

Andrew Morton authored Apr 20, 2003

From: john stultz <johnstul@us.ibm.com>

This patch fixes a race in the timer_interrupt code caused by
detect_lost_tick(). Since we're doing lost-tick compensation outside
timer->mark_offset, time can pass between time-source reads which can cause
gettimeofday inconsistencies.

Additionally detect_lost_tick() was broken for the PIT case, since the whole
point of detect_lost_tick() is to interpolate between two time sources to
find inconsistencies. Additionally this could cause xtime_lock seq_lock
reader starvation which has been causing machine hangs for SMP boxes that use
the PIT as a time source.

This patch fixes the described race by removing detect_lost_tick() and
instead implementing the lost tick detection code inside mark_offset().

Some of the divs and mods being added here might concern folks, but by not
calling timer->get_offset() in detect_lost_tick() we eliminate much of the
same math. I did some simple cycle counting and the new code comes out on
average equivalent or faster.

d9a4b6c5

[PATCH] get_offset_pit and do_timer_overflow vs IRQ locking · e2ac56f6

Andrew Morton authored Apr 20, 2003

From: john stultz <johnstul@us.ibm.com>, Alexander Atanasov <alex@ssi.bg>

We want to make sure we update jiffies_p and count_p atomically.  So I'm
inserting the spin_unlock_irqrestore() after we update count_p, rather than
just before.

e2ac56f6

[PATCH] Fix jiffies_to_time[spec | val] and converse to use · 0ebcfd99

Andrew Morton authored Apr 20, 2003

From: george anzinger <george@mvista.com>

In the current system (2.5.67) time_spec to jiffies, time_val to
jiffies and the converse (jiffies to time_val and jiffies to
time_spec) all use 1/HZ as the measure of a jiffie.  Because of the
inability of the PIT to actually generate an accurate 1/HZ interrupt,
the wall clock is updated with a more accurate value (999848
nanoseconds per jiffie for HZ = 1000).  This causes a 1/HZ
interpretation of jiffies based timing to run faster than the wall
clock, thus causing sleeps and timers to expire short of the requested
time.  Try, for example:

time sleep 60

This patch changes the conversion routines to use the same value as
the wall clock update code to do the conversions.

The actual math is almost all done at compile time.  The run time
conversions require little if any more execution time.

This patch must be applied after the patch I posted earlier today
which fixed the CLOCK_MONOTONIC resolution issue.
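
The size of the error for `time sleep 60` can be worked out from the figures quoted above. The helpers below are a sketch of the arithmetic only, not the kernel's conversion routines; the function names are hypothetical and rounding up is an assumption.

```c
#include <stdint.h>

#define DEMO_HZ            1000
#define DEMO_NOMINAL_NSEC  (1000000000LL / DEMO_HZ)  /* 1/HZ = 1,000,000 ns */
#define DEMO_ACTUAL_NSEC   999848LL                  /* value quoted in the message */

/* Old-style conversion: assume a jiffy is exactly 1/HZ, rounding up. */
int64_t demo_nsec_to_jiffies_nominal(int64_t nsec)
{
    return (nsec + DEMO_NOMINAL_NSEC - 1) / DEMO_NOMINAL_NSEC;
}

/* Conversion using the same jiffy length as the wall clock update. */
int64_t demo_nsec_to_jiffies_actual(int64_t nsec)
{
    return (nsec + DEMO_ACTUAL_NSEC - 1) / DEMO_ACTUAL_NSEC;
}

/* Wall-clock time that really elapses over a number of jiffies. */
int64_t demo_jiffies_to_wall_nsec(int64_t jiffies)
{
    return jiffies * DEMO_ACTUAL_NSEC;
}
```

A 60 s request becomes 60000 jiffies under the 1/HZ assumption, which is only 59.99088 s of wall time, about 9.12 ms short. Using 999848 ns per jiffy gives 60010 jiffies, or 60.00087848 s, which no longer expires early.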

0ebcfd99

[PATCH] Fix POSIX timers to give CLOCK_MONOTONIC full · 2f98681f

Andrew Morton authored Apr 20, 2003

The POSIX CLOCK_MONOTONIC currently has only 1/HZ resolution. Further, it is
tied to jiffies (i.e. is a restatement of jiffies) rather than "xtime" or the
gettimeofday() clock.

This patch changes CLOCK_MONOTONIC to be a restatement of gettimeofday() plus
an offset to remove any clock setting activity from CLOCK_MONOTONIC. An
offset is kept that represents the difference between CLOCK_MONOTONIC and
gettimeofday(). This offset is updated whenever the gettimeofday() clock is
set, to back the clock setting change out of CLOCK_MONOTONIC (which, by the
standard, cannot be set).

With this change CLOCK_REALTIME (a direct restatement of gettimeofday()),
CLOCK_MONOTONIC and gettimeofday() will all tick at the same time and with
the same rate. And all will be affected by NTP adjustments (save those which
actually set the time).
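
The offset bookkeeping described above can be modelled in a few lines. This is a userspace sketch under simplifying assumptions (a single nanosecond counter, no locking, no NTP); `struct demo_clock` and its helpers are hypothetical names, not the kernel's.

```c
#include <stdint.h>

struct demo_clock {
    int64_t wall_nsec;    /* gettimeofday()-style clock */
    int64_t mono_offset;  /* CLOCK_MONOTONIC minus the wall clock */
};

/* CLOCK_MONOTONIC is the wall clock plus the stored offset. */
int64_t demo_monotonic(const struct demo_clock *c)
{
    return c->wall_nsec + c->mono_offset;
}

/* Normal passage of time advances both clocks together. */
void demo_tick(struct demo_clock *c, int64_t nsec)
{
    c->wall_nsec += nsec;
}

/* Setting the wall clock: back the jump out of the monotonic clock. */
void demo_settime(struct demo_clock *c, int64_t new_wall)
{
    c->mono_offset -= new_wall - c->wall_nsec;
    c->wall_nsec = new_wall;
}
```

After `demo_settime()` the wall clock reads the new value, but `demo_monotonic()` returns exactly what it returned before the call.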

2f98681f

[PATCH] Fix and clean up DCACHE_REFERENCED usage · 0e3efbd1

Andrew Morton authored Apr 20, 2003

From: Maneesh Soni <maneesh@in.ibm.com>

This patch changes the way DCACHE_REFERENCED flag is used. It
got messed up in dcache_rcu iterations. I hope this will be ok now.

The flag was meant to be an advisory flag which is used during
prune_dcache() so as not to free dentries which have recently
entered the d_lru list. On the first pass in prune_dcache() the dentries
marked DCACHE_REFERENCED are left with the flag reset, and they
are freed in the next pass.

So, now we mark the dentry as DCACHE_REFERENCED when it is first
entering the d_lru list in dput() and reset the flag in prune_dcache().
If the flag remains reset in the next call to prune_dcache(), the
dentry is then freed.

Also, I don't think any filesystem has to use this flag, as it is taken
care of by the dcache layer. The patch removes such code from a few
filesystems. Moreover, these filesystems were doing the wrong thing anyway, as
they were changing the flag outside of dcache_lock.

Changes:
o dput() marks dentry DCACHE_REFERENCED when it is added to the dentry_unused
  list
o no need to set the flag in dget, dget_locked, d_lookup as these anyway
  increment the ref count.
o check the ref count in prune_dcache and use DCACHE_REFERENCED flag just for
  two stage aging.
o remove code for setting DCACHE_REFERENCED from reiserfs, fat, xfs and
  exportfs.
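
The two stage aging described above can be sketched as follows. This is a simplified userspace model, not the dcache code: `struct demo_dentry`, `demo_dput()` and `demo_prune()` are hypothetical, the unused list is a plain array, and there is no locking.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_REFERENCED 0x1u   /* stand-in for DCACHE_REFERENCED */

struct demo_dentry {
    unsigned int flags;
    int  count;    /* reference count */
    bool on_lru;   /* on the unused list */
    bool freed;
};

/* Dropping the last reference puts the dentry on the unused list, marked. */
void demo_dput(struct demo_dentry *d)
{
    if (--d->count == 0) {
        d->on_lru = true;
        d->flags |= DEMO_REFERENCED;
    }
}

/* One pruning pass over n dentries; returns how many were freed. */
size_t demo_prune(struct demo_dentry *d, size_t n)
{
    size_t freed = 0;

    for (size_t i = 0; i < n; i++) {
        if (!d[i].on_lru || d[i].count > 0)
            continue;                        /* still in use: skip */
        if (d[i].flags & DEMO_REFERENCED) {
            d[i].flags &= ~DEMO_REFERENCED;  /* first pass: second chance */
            continue;
        }
        d[i].on_lru = false;                 /* flag stayed clear: free it */
        d[i].freed = true;
        freed++;
    }
    return freed;
}
```

A dentry that has just become unused survives the first pruning pass and is freed on the second, provided nothing takes a reference in between.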

0e3efbd1

[PATCH] dentry_stat accounting fix · de8e3749

Andrew Morton authored Apr 20, 2003

From: Maneesh Soni <maneesh@in.ibm.com>

This patch corrects the dentry_stat.nr_unused calculation.

In select_parent() and shrink_dcache_anon() we were not doing any adjustments
to the nr_unused count after manipulating the dentry_unused list.  Now the
nr_unused count is decremented if the dentry is on dentry_unused list and is
removed from there.

Further in the same routines, we have to adjust the nr_unused count again if
the dentry is moved to the end of d_lru list for pruning.

de8e3749

[PATCH] dmfe: don't free skb with local interrupts disabled · 70d67000

Andrew Morton authored Apr 20, 2003

dev_kfree_skb() can end up calling local_bh_enable() which goes BUG if local
interrupts are disabled.  Apparently it can deadlock.

So move the skb freeing outside the lock in the dmfe driver.  It will
decrease the lock hold time as well.

70d67000

[PATCH] Fix nec98 partition parser link error · cb970405

Andrew Morton authored Apr 20, 2003

```
Fix this:

fs/partitions/nec98.c:169: undefined reference to `parse_bsd'
```

cb970405

[PATCH] 3c574_cs fixes · 0f23a3a8

Andrew Morton authored Apr 20, 2003

- It was doing spin_lock_irqsave()/spin_unlock()

- Can't free the skb inside local_irq_save(): kfree_skb ends up running
  local_bh_enable(), which enables interrupts.

0f23a3a8

19 Apr, 2003 6 commits

Linux 2.5.68 · b2520649

Linus Torvalds authored Apr 19, 2003

b2520649

[PATCH] IEEE-1394/Firewire updates · ffb74927

Ben Collins authored Apr 19, 2003

- Cleaned up hostinfo usage in all drivers and created a central API to
  handle them all.
- Fixup some spinlock mis-usage.
- Remove devfs_handle mis-usage.
- Cleaned up some heavy handed spinlocking to use mutexes instead.
- Add function to send PHY config packets and use it to settle
  IRM/cycle-master/root discrepancies.

ffb74927

Merge bk://bk.arm.linux.org.uk/linux-2.5-serial · 77ad1af8

Linus Torvalds authored Apr 19, 2003

```
into home.transmeta.com:/home/torvalds/v2.5/linux
```

77ad1af8

[PATCH] s/to long/too long/ · e4a874f0

Andries E. Brouwer authored Apr 19, 2003

e4a874f0

[PATCH] correct error message for failed clone ns · e02fafe8

Andries E. Brouwer authored Apr 19, 2003

```
If copy_namespace() returns -EPERM, copy_process() will
return a confusing -ENOMEM. Fix it thus.
```
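
The shape of this fix can be illustrated in isolation. The functions below are hypothetical stand-ins, not the kernel's copy_namespace() or copy_process(); they only show the difference between hard-coding an error code and passing the callee's code through.

```c
#include <errno.h>

/* Hypothetical stand-in for a setup step that can fail with -EPERM. */
static int demo_copy_namespace(int allowed)
{
    return allowed ? 0 : -EPERM;
}

/* Confusing shape: every failure is reported as -ENOMEM. */
int demo_copy_process_old(int allowed)
{
    if (demo_copy_namespace(allowed))
        return -ENOMEM;
    return 0;
}

/* Clearer shape: return whatever error the callee reported. */
int demo_copy_process_new(int allowed)
{
    int err = demo_copy_namespace(allowed);

    if (err)
        return err;
    return 0;
}
```

With the second shape, a permission failure reaches the caller as -EPERM instead of looking like an out-of-memory condition.
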
e02fafe8

[PATCH] fix slab corruption in namespace.c · d35f1926

Andries E. Brouwer authored Apr 19, 2003

	new_ns = kmalloc(sizeof(struct namespace *), GFP_KERNEL);
thing wasn't a very good idea.

The rest are whitespace cleanups.
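
The problem with the quoted line is that `sizeof(struct namespace *)` is the size of a pointer, not of the structure, so the allocation is too small and later writes run past it. A userspace sketch with a hypothetical stand-in structure (the real `struct namespace` layout differs):

```c
#include <stddef.h>

/* Hypothetical stand-in for struct namespace; the real layout differs. */
struct demo_namespace {
    int count;
    void *root;
    long list[2];
};

/* Size requested by the buggy call: just one pointer. */
size_t demo_buggy_size(void)
{
    return sizeof(struct demo_namespace *);
}

/* Size the object actually needs. */
size_t demo_correct_size(void)
{
    return sizeof(struct demo_namespace);
}
```

A common way to avoid this class of mistake is to size the allocation from the destination pointer, as in `p = malloc(sizeof(*p))`; whether the actual fix used that form is not stated here.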

d35f1926

18 Apr, 2003 10 commits

[PATCH] struct loop_info64 with __u64 · ca49321f

Andries E. Brouwer authored Apr 18, 2003

(i) Replace in struct loop_info the dev_t field by __kernel_old_dev_t,
where this type is defined in <asm/posix_types.h>, so that problems
with a differently sized dev_t in userspace are avoided.

(ii) Introduce a new loop_info64, with __u64 device, inode and offset
fields.

ca49321f

[PATCH] gconfig: bug #540 · 24ebdc14

Romain Liévin authored Apr 18, 2003

This replaces checkboxes by radiobuttons wherever necessary (menu
choices).  It partially fixes the #540 bug report.

24ebdc14

Merge bk://kernel.bkbits.net/davem/sparc-2.5 · 84545d88
Linus Torvalds authored Apr 18, 2003
```
into home.transmeta.com:/home/torvalds/v2.5/linux
```
84545d88

[BRIDGE]: Ethernet bridge driver device mangling cleanup · a4f64ccb

Stephen Hemminger authored Apr 17, 2003

Second try at the bridge driver module handling cleanup...

1) Eliminate keeping a separate bridge_list and use a bit on
   the priv_flags structure.  This is equivalent to how the VLAN
   code works. Makes code cleaner and correctly handles cases like
   creating a bridge with the same name as an existing ether device etc.

2) Don't do own module ref counting that is inherently racy.
   Instead set owner field and cleanup debris on unload.

3) Do last state cleanup in destructor

4) Change of bridge state (dev_open/stop) should use write_lock

5) Make sure timer is not running when cleared.

6) Use "const char *" where possible

a4f64ccb

[BRIDGE]: Fix race in br_fdb_get_entries. · 2631ea60

Stephen Hemminger authored Apr 17, 2003

2631ea60

[SPARC64]: Missed rusage/rlimit/wait4 compat conversions. · bce0381a

David S. Miller authored Apr 17, 2003

bce0381a

[SPARC64]: Remove LVM ioctls. · 11493c87

Andries E. Brouwer authored Apr 17, 2003

11493c87

[DECNET]: DECnet routing fixes etc. · e155ad0c

Steven Whitehouse authored Apr 17, 2003

o As requested, macros in dn_fib.h changed to decnet specific names
o Two bugs fixed (only in 2.5 decnet stack) relating to bind and connection
  states.
o Numerous style changes: using C99 initialisers and inline rather
  than __inline__
o Use struct flowi as routing key (for forthcoming flow cache)
o Add metrics to routing table
o Many routing table bug fixes
o New wait code to improve efficiency
o We use real device MTUs now rather than saying "hmm... looks like ethernet
  must be 1500" as we used to (still one or two places to fix, but it's
  mostly correct in this patch)
o Tidy up in af_decnet.c:dn_sendmsg() in preparation for zerocopy
o Updates to rtnetlink code to return more information
o Removed ioctl() for decnet fib. It never did anything and rtnetlink is
  a far better interface anyway.
o Converted /proc/decnet_neigh to seq_file (other /proc files to follow)
o DECnet route cache now uses RCU like the ipv4 route cache
o Misc bug fixes wherever I found them
o SO_BINDTODEVICE works for outgoing connections

e155ad0c

[VLAN]: Cleaner module interface. · bd9056f7

Stephen Hemminger authored Apr 17, 2003

bd9056f7

Merge nuts.ninka.net:/home/davem/src/BK/network-2.5 · 73ee6aab

David S. Miller authored Apr 17, 2003

```
into nuts.ninka.net:/home/davem/src/BK/net-2.5
```

73ee6aab