Commits · 5dd7d1b6ad9e81c8376b0c95c73dd14a254d81f9 · Kirill Smelkov / linux

04 Feb, 2003 13 commits

Andrew Morton authored Feb 03, 2003

Patch from: Joel Becker <Joel.Becker@oracle.com>

This kernel module will detect long durations when jiffies has failed to
increment, and will reboot the machine in response.

Joel says:

"Here's why Oracle wants such a thing. We run clusters. Imagine a two node
cluster. Node1 pauses completely for some reason. There are multiple
reasons this can happen. A bad driver can udelay() for 90 seconds (qla used
to do this). zVM on S/390 can page Linux out for minutes at a time.
Anything that causes the box to freeze. Jiffies does *not* count during
this, so when Node1 returns it feels that no time has passed.

Node2, however, has been counting time. When Node1 goes away, the Oracle
cluster manager starts looking for it. After a timeout, it gives up. It
then recovers any in-progress transactions from Node1. After that, it
starts new operations, modifying the data in ways that Node1 has no idea
about (it's still out to lunch).

When Node1 finally returns (udelay() ends, zVM pages it in, whatever), any
I/O that it has queued or is about to queue will get sent to the disk.
Oops, you've just corrupted your shared data.

hangcheck-timer should catch this and reboot the box.

This is why Oracle wants this driver. We figure that such functionality
would be beneficial to others as well, so we posted to l-k. We'd all hope
that driver writers don't udelay() for 90s, but S/390 with zVM is still
around. Some folks might want to notice when it happens. I am sure other
things exist that trigger the same symptoms."
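
The check described above can be sketched in plain C. This is only an illustration of the idea, not the module's code: the struct, the function name and the nanosecond units are hypothetical, and it assumes a reference clock that keeps counting while the machine is stalled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tunables for illustration; not taken from the patch. */
struct hangcheck_cfg {
	uint64_t tick_ns;   /* how often the periodic check is expected to run */
	uint64_t margin_ns; /* extra lateness tolerated before declaring a hang */
};

/*
 * Called each time the periodic check runs. last_ns and now_ns are readings
 * of a reference clock that keeps advancing even when jiffies does not.
 * Returns true when the check ran later than tick + margin, i.e. the box
 * was frozen for a while. A real driver would reboot at that point.
 */
bool hangcheck_expired(const struct hangcheck_cfg *cfg,
		       uint64_t last_ns, uint64_t now_ns)
{
	uint64_t elapsed = now_ns - last_ns;

	return elapsed > cfg->tick_ns + cfg->margin_ns;
}
```

The point of the design is that the decision uses a time source independent of the tick that schedules the check, so a stall that stops jiffies still shows up as an oversized gap.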

5dd7d1b6

[PATCH] remove unneeded locking in do_syslog() · 46052b73
Andrew Morton authored Feb 03, 2003
```
Lots of nonsensical locking in there.
```
46052b73

[PATCH] Avoid losing timer ticks when slab debug is enabled. · cf336416

Andrew Morton authored Feb 03, 2003

Patch from Manfred Spraul <manfred@colorfullife.com>

When slab debugging is enabled we're holding off interrupts for too long
(more than a jiffy), so reduce the alloc/free batching size when slab debug
is enabled.

cf336416

[PATCH] pgd_ctor update · ee3ddbbd

Andrew Morton authored Feb 03, 2003

From wli

A moment's reflection on the subject suggests to me it's worthwhile to
generalize pgd_ctor support so it works (without #ifdefs!) on both PAE
and non-PAE. This tiny tweak is actually more noticeably beneficial
on non-PAE systems but only really because pgd_alloc() is more visible;
the most likely reason it's less visible on PAE is "other overhead".
It looks particularly nice since it removes more code than it adds.

Touch tested on NUMA-Q (PAE). OFTC #kn testers testing the non-PAE case.

ee3ddbbd

[PATCH] Use a slab cache for pgd and pmd pages · a85cb652

Andrew Morton authored Feb 03, 2003

From Bill Irwin

This allocates pgd's and pmd's using the slab and slab ctors.  It has a
benefit beyond preconstruction in that PAE pmd's are accounted via
/proc/slabinfo.

Profiling of kernel builds by Martin Bligh shows a 30-40% drop in CPU load
due to pgd_alloc()'s page clearing activity.  But this was already a tiny
fraction of the overall CPU time.

a85cb652

[PATCH] remove __GFP_HIGHIO · 3ac8c845

Andrew Morton authored Feb 03, 2003

Patch From: Hugh Dickins <hugh@veritas.com>

Recently noticed that __GFP_HIGHIO has played no real part since bounce
buffering was converted to mempool in 2.5.12: so this patch (over 2.5.58-mm1)
removes it and GFP_NOHIGHIO and SLAB_NOHIGHIO.

Also removes GFP_KSWAPD, in 2.5 same as GFP_KERNEL; leaves GFP_USER, which
can be a useful comment, even though in 2.5 same as GFP_KERNEL.

One anomaly needs comment: strictly, if there's no __GFP_HIGHIO, then
GFP_NOHIGHIO translates to GFP_NOFS; but GFP_NOFS looks wrong in the block
layer, and if you follow them down, you find that GFP_NOFS and GFP_NOIO
behave the same way in mempool_alloc - so I've used the less surprising
GFP_NOIO to replace GFP_NOHIGHIO.

3ac8c845

[PATCH] cleanup in read_cache_pages() · 99c88bc2

Andrew Morton authored Feb 03, 2003

Patch from Nikita Danilov <Nikita@Namesys.COM>

read_cache_pages() is passed a bunch of pages to start I/O against and it is
supposed to consume all those pages. But if there is an I/O error, someone
needs to throw away the unused pages.

At present the single user of read_cache_pages() (nfs_readpages) does that
cleanup by hand. But it should be done in the core kernel.

99c88bc2

[PATCH] mm/mmap.c whitespace cleanups · cecee739
Andrew Morton authored Feb 03, 2003
```
- Don't require a 160-col xterm

- Coding style consistency
```
cecee739

[PATCH] file-backed vma merging · 6b2ca90b

Andrew Morton authored Feb 03, 2003

Implements merging of file-backed VMA's.  Based on Andrea's 2.4 patch.

It's only done for mmap().  mprotect() and mremap() still will not merge
VMA's.

It works for hugetlbfs mappings also.
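
For readers unfamiliar with VMA merging, the kind of test involved can be sketched as follows. This is a generic illustration, not the patch's code: the struct, field names and the 4 KiB page shift are assumptions, and the real kernel checks more than this.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12 /* assumed 4 KiB pages */

/* Hypothetical, simplified stand-in for a VMA. */
struct merge_vma {
	unsigned long start; /* first address of the mapping */
	unsigned long end;   /* one past the last address */
	unsigned long flags; /* protection and mapping flags */
	const void *file;    /* backing file, NULL for anonymous memory */
	unsigned long pgoff; /* file offset of start, in pages */
};

/* Can next be folded into prev, which sits immediately below it? */
bool can_merge_after(const struct merge_vma *prev,
		     const struct merge_vma *next)
{
	if (prev->end != next->start)
		return false; /* not adjacent in the address space */
	if (prev->flags != next->flags)
		return false; /* different permissions or mapping type */
	if (prev->file != next->file)
		return false; /* different backing objects */
	if (prev->file == NULL)
		return true;  /* anonymous memory: nothing more to compare */

	/* File-backed: the file offsets must be contiguous as well. */
	unsigned long pages = (prev->end - prev->start) >> SKETCH_PAGE_SHIFT;

	return prev->pgoff + pages == next->pgoff;
}
```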

6b2ca90b

[PATCH] add stats for page reclaim via inode freeing · b29422e3

Andrew Morton authored Feb 03, 2003

pagecache can be reclaimed via the page LRU and via prune_icache.  We
currently don't know how much reclaim is happening via each.

The patch adds instrumentation to display the number of pages which were
freed via prune_icache.  This is displayed in /proc/vmstat:pginodesteal and
/proc/vmstat:kswapd_inodesteal.

Turns out that under some workloads (well, dbench at least), fully half of
page reclaim is via the unused inode list.  Which seems quite OK to me.
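
A small userspace helper for reading the two counters might look like the sketch below. It assumes /proc/vmstat consists of "name value" lines; the function name is made up for this example.

```c
#include <stdio.h>
#include <string.h>

/*
 * Scan a stream of "name value" lines (the assumed /proc/vmstat layout)
 * for one counter. Returns 0 and stores the value on success, -1 if the
 * counter is not present.
 */
int vmstat_lookup(FILE *f, const char *name, unsigned long long *value)
{
	char key[64];
	unsigned long long v;

	while (fscanf(f, "%63s %llu", key, &v) == 2) {
		if (strcmp(key, name) == 0) {
			*value = v;
			return 0;
		}
	}
	return -1;
}
```

A caller would fopen("/proc/vmstat", "r") and look up "pginodesteal" and "kswapd_inodesteal" in turn, rewinding the stream between lookups.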

b29422e3

[PATCH] fix agp compile warning · f5585f5d
Andrew Morton authored Feb 03, 2003
```
A static function in a header where presumably a static inline was intended.
```
f5585f5d

[PATCH] implement posix_fadvise64() · fccbe384

Andrew Morton authored Feb 03, 2003

An implementation of posix_fadvise64().  It adds 368 bytes to my vmlinux and
is worth it.

I didn't bother doing posix_fadvise(), as userspace can implement that by
calling fadvise64().

The main reason for wanting this syscall is to provide userspace with the
ability to explicitly shoot down pagecache when streaming large files.  This
is what O_STREAMING does, only posix_fadvise() is standards-based, and harder
to use.

posix_fadvise() also subsumes sys_readahead().

POSIX_FADV_WILLNEED will generally provide asynchronous readahead semantics
for small amounts of I/O, as long as things like indirect blocks are already
in core.

POSIX_FADV_RANDOM gives unprivileged applications a way of disabling
readahead on a per-fd basis, which may provide some benefit for super-seeky
access patterns such as databases.
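
From userspace, the streaming case could be sketched like this, going through the C library's posix_fadvise() wrapper. The function name and buffer size are invented for the example, and POSIX_FADV_SEQUENTIAL and POSIX_FADV_DONTNEED are standard POSIX advice values that the text above does not name explicitly. The advice calls are hints, so their return values are ignored here.

```c
#define _XOPEN_SOURCE 600
#include <fcntl.h>
#include <unistd.h>

/*
 * Read a file once from start to end, asking the kernel to drop each
 * chunk from pagecache after it has been consumed. Returns the number
 * of bytes read, or -1 on error.
 */
long long stream_and_drop(const char *path)
{
	char buf[65536];
	off_t done = 0;
	ssize_t n;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;

	/* Hint that the whole file (len 0 = to EOF) is read sequentially. */
	posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);

	while ((n = read(fd, buf, sizeof(buf))) > 0) {
		/* The range just read will not be needed again. */
		posix_fadvise(fd, done, n, POSIX_FADV_DONTNEED);
		done += n;
	}

	close(fd);
	return n < 0 ? -1 : (long long)done;
}
```

A seek-heavy reader would instead issue a single posix_fadvise(fd, 0, 0, POSIX_FADV_RANDOM) after open().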



The POSIX_FADV_* values are already implemented in glibc, and this patch
ensures that they are in sync.

A test app (fadvise.c) is available in ext3 CVS.  See

	http://www.zip.com.au/~akpm/linux/ext3/

for CVS details.

Ulrich has reviewed this patch (thanks).

fccbe384

[PATCH] stradis.c "proper" port to 2.5.x · e7bfb1db
Nathan Laredo authored Feb 03, 2003

e7bfb1db

03 Feb, 2003 15 commits

Merge http://jfs.bkbits.net/linux-2.5 · c98a2447
Linus Torvalds authored Feb 02, 2003
```
into penguin.transmeta.com:/home/penguin/torvalds/repositories/kernel/linux
```
c98a2447
kbuild: Remove export-objs := ... statements · 46124528
Kai Germaschewski authored Feb 03, 2003
```
One of the goals of the whole new modversions implementation:
export-objs is gone for good!
```
46124528
Merge jfs@jfs.bkbits.net:linux-2.5 · 13156dad
Dave Kleikamp authored Feb 03, 2003
```
into shaggy.austin.ibm.com:/shaggy/bk/jfs-2.5
```
13156dad

kbuild: Assorted fixlets · 03e7dcfb

Kai Germaschewski authored Feb 03, 2003

o Build modules with CONFIG_MODVERSIONS when just saying "make"
o Ignore generated *.ver.c files
o Fix a typo (Sam Ravnborg)
o Fix another typo (Paul Marinceu)

03e7dcfb

Merge tp1.ruhr-uni-bochum.de:/scratch/kai/kernel/v2.5/linux-2.5.make-sam · 44e26651
Kai Germaschewski authored Feb 03, 2003
```
into tp1.ruhr-uni-bochum.de:/scratch/kai/kernel/v2.5/linux-2.5.make
```
44e26651
Hand merged · c99e695c
Kai Germaschewski authored Feb 03, 2003

c99e695c

kbuild: Ignore kernel version part of vermagic if CONFIG_MODVERSIONS · e24c9231

Rusty Russell authored Feb 03, 2003

Skip over the first part of __vermagic if modversioning is on: otherwise
you'll have to force it when changing from 2.6.0 to 2.6.1.

e24c9231

kbuild: Modversions fixes · 2fe90be7

Rusty Russell authored Feb 03, 2003

Fix the case where no CRCs are supplied (OK, but taints kernel), and
only print one tainted message (otherwise --force gives hundreds of them).

2fe90be7

kbuild: Generate module versions in the normal object directories · 46f08e8a

Kai Germaschewski authored Feb 03, 2003

We generated the intermediate files that contain checksums for
unresolved symbols in .tmp_versions, which had the disadvantage
that it obscured what's going on during the build. Just
generate them as .ver.[co] right next to the actual objects in the
object tree.

46f08e8a

kbuild: Rename CONFIG_MODVERSIONING -> CONFIG_MODVERSIONS · d5ea3bb5

Kai Germaschewski authored Feb 03, 2003

CONFIG_MODVERSIONING was a temporary name introduced to distinguish
between the old and new module version implementation. Since the
traces of the old implementation are now gone from the build system,
we rename the config option back in order to not confuse users more
than necessary in 2.6.
 
Also, remove some historic modversions cruft throughout the tree.

d5ea3bb5

Merge bk://linux.bkbits.net/linux-2.5 · 6e5f0131
Dave Kleikamp authored Feb 02, 2003
```
into hostme.bitkeeper.com:/ua/repos/j/jfs/linux-2.5
```
6e5f0131

[PATCH] fix references to discarded sections · c92cacc2

Randy Dunlap authored Feb 02, 2003

After disabling files that wouldn't build, there were 2 (in-kernel)
modules that referenced _init or _exit code sections when they
shouldn't.

This fixes those modules.

c92cacc2

Merge http://linux-scsi.bkbits.net/scsi-for-linus-2.5 · 7f4174f1
Linus Torvalds authored Feb 02, 2003
```
into home.transmeta.com:/home/torvalds/v2.5/linux
```
7f4174f1
Merge raven.il.steeleye.com:/home/jejb/BK/scsi-misc-2.5 · 3b76b263
James Bottomley authored Feb 02, 2003
```
into raven.il.steeleye.com:/home/jejb/BK/scsi-for-linus-2.5
```
3b76b263
Merge raven.il.steeleye.com:/home/jejb/BK/scsi-combined-2.5 · 69a9afd5
James Bottomley authored Feb 02, 2003
```
into raven.il.steeleye.com:/home/jejb/BK/scsi-for-linus-2.5
```
69a9afd5

02 Feb, 2003 12 commits

3c509 fixes: correct MCA probing, add back ISA probe to Space.c · edf68308
James Bottomley authored Feb 02, 2003

edf68308
Merge kernel.bkbits.net:net-drivers-2.5 · 0145e5c9
Jeff Garzik authored Feb 02, 2003
```
into redhat.com:/garz/repo/net-drivers-2.5
```
0145e5c9
Merge bk://bk.arm.linux.org.uk · 49a85c6a
Linus Torvalds authored Feb 02, 2003
```
into home.transmeta.com:/home/torvalds/v2.5/linux
```
49a85c6a

[ARM] Add arch/arm/common · 1687c697

Russell King authored Feb 02, 2003

Certain support files are shared between various ARM machine classes.
In order to sanely support these, we place the shared files in
arch/arm/common instead of the individual machine class directories.

1687c697

Merge · 4251bd1a
Linus Torvalds authored Feb 02, 2003

4251bd1a

[PATCH] fix show_task oops · c7766898

Andrew Morton authored Feb 02, 2003

Patch from Russell King <rmk@arm.linux.org.uk>

show_task() attempts to calculate the amount of free space which hasn't been
written to on the kernel stack by reading from the base of the kernel stack
upwards.

However, it mistakenly uses the task_struct pointer as the base of the stack,
which it isn't, and this can cause an oops.

Here is a patch which uses the task thread pointer instead, which should be
located at the bottom of the kernel stack.  It appears this was missed when
the thread structure was introduced.
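
The free-space calculation amounts to a scan like the one below, shown as a standalone sketch rather than the kernel's code. The function name is hypothetical, and it assumes the stack area starts out zeroed and grows down towards the base.

```c
#include <stddef.h>

/*
 * Walk upwards from the lowest address of a stack area and count how
 * many bytes are still zero, i.e. were never written to.
 * base must point at the real bottom of the stack area; starting the
 * scan from some other object is the kind of mistake fixed here.
 */
size_t stack_unused_bytes(const unsigned long *base, size_t nwords)
{
	size_t i = 0;

	while (i < nwords && base[i] == 0)
		i++;

	return i * sizeof(unsigned long);
}
```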

c7766898

[PATCH] exit_mmap fix for 64bit->32bit execs · 2e7c21ea

Andrew Morton authored Feb 02, 2003

The recent exit_mmap() changes broke PPC64 when 64-bit applications exec
32-bit ones. ia32-on-ia64 was broken as well.

What is happening is that load_elf_binary() sets TIF_32BIT (via
SET_PERSONALITY) _before_ running exit_mmap(). So when we're unmapping the
vma's of the old image, we are running under the new image's personality.

This causes PPC64 to pass a 32-bit TASK_SIZE to unmap_vmas(), even when the
execing process had a 64-bit image. Because unmap_vmas() is not provided
with the correct virtual address span it does not unmap all the old image's
vma's and we go BUG_ON(mm->map_count) in exit_mmap().

The early SET_PERSONALITY() is required before we look up the interpreter
because the lookup of the executable has to happen under the alternate root
which SET_PERSONALITY() may set.

Unfortunately this means that we're running flush_old_exec() under the new
exec's personality. Hence this bug.

So what the patch does is to simply pass ~0UL into unmap_vmas(), which tells
it to unmap everything regardless of current personality. Which is what the
old open-coded VMA killer was doing.
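
The effect of the limit can be shown with a toy model. Everything below is hypothetical (struct, function, addresses); it only mirrors the reasoning that a 32-bit limit leaves high mappings behind while an all-ones limit, standing in for ~0UL, does not.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified mapping record. */
struct span_vma {
	uint64_t start;
	uint64_t end;
	int mapped;
};

/*
 * Unmap every mapping that starts below limit and return how many are
 * still mapped afterwards. A non-zero result corresponds to the
 * leftover-mapping situation that trips the BUG_ON described above.
 */
size_t unmap_below(struct span_vma *vmas, size_t n, uint64_t limit)
{
	size_t left = 0;

	for (size_t i = 0; i < n; i++) {
		if (vmas[i].start < limit)
			vmas[i].mapped = 0;
		if (vmas[i].mapped)
			left++;
	}
	return left;
}
```

With one mapping below 4 GiB and one above, a limit of 0x100000000 leaves one mapping behind, while UINT64_MAX clears both.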

There remains the problem that some architectures are sometimes passing the
incorrect TASK_SIZE into tlb_finish_mmu(). They've always been doing that.

2e7c21ea

[PATCH] Fix generic_file_readonly_mmap() · b91c1b1b

Andrew Morton authored Feb 02, 2003

We cannot clear VM_MAYWRITE in there - it turns writeable MAP_PRIVATE
mappings into readonly ones.

So change it back to the 2.4 form - disallow a writeable MAP_SHARED mapping
against filesystems which do not implement ->writepage().
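
The restored check has roughly the following shape. This is a userspace sketch: the flag values are placeholders rather than the kernel's, the function name is invented, and the choice of -EINVAL as the error is an assumption.

```c
#include <errno.h>

/* Placeholder bit values for illustration; not the kernel's definitions. */
#define SKETCH_VM_MAYWRITE 0x1u
#define SKETCH_VM_SHARED   0x2u

/*
 * Refuse only mappings that are both shared and potentially writable.
 * A writable MAP_PRIVATE mapping passes, because its changes never
 * reach the file and so need no ->writepage().
 */
int readonly_mmap_check(unsigned int vm_flags)
{
	if ((vm_flags & SKETCH_VM_SHARED) && (vm_flags & SKETCH_VM_MAYWRITE))
		return -EINVAL;
	return 0;
}
```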

b91c1b1b

[PATCH] soundcore.c referenced non-existent errno variable · 858743c2

Andrew Morton authored Feb 02, 2003

Patch from: Petr Vandrovec <vandrove@vc.cvut.cz>

soundcore is trying to perform kernel syscalls to load firmware, but falls
afoul of missing `errno'. Convert it to use VFS API functions.

858743c2

[PATCH] floppy locking fix · 7bb503fc

Andrew Morton authored Feb 02, 2003

redo_fd_request() needs to take the queue lock around the call to
elv_next_request().

7bb503fc

[PATCH] atyfb compilation fix · 0391b9be

Andrew Morton authored Feb 02, 2003

Patch from "Andres Salomon" <dilinger@voxel.net>

Fix compilation of atyfb_base.c

0391b9be

[PATCH] correct wait accounting in wait_on_buffer() · 8458eee6

Andrew Morton authored Feb 02, 2003

__wait_on_buffer() needs to use io_schedule(), so processes in there are
accounted as being in I/O wait.

8458eee6