Commits · 6a69bfeed836db64c6c3e16881d28c887bf52322 · Kirill Smelkov / linux

18 Feb, 2004 40 commits

[PATCH] off_t in nfsd_commit needs to be loff_t · 6a69bfee

Andrew Morton authored Feb 18, 2004

From: Neil Brown <neilb@cse.unsw.edu.au>,

From: Miquel van Smoorenburg <miquels@cistron.nl>

While I was stress-testing NFS/XFS on 2.6.1/2.6.2-rc, I found that
sometimes my "dd" would exit with:

	#  dd if=/dev/zero bs=4096 > /mnt/file
	dd: writing `standard output': Invalid argument
	1100753+0 records in
	1100752+0 records out

After adding some debug printk's to the server and client code and some
tcpdump-ing, I found that the NFSERR_INVAL was returned by nfsd_commit on
the server.

Turns out that the "offset" argument is off_t instead of loff_t.  It isn't
used at all (unfortunately), but it _is_ checked for sanity, so that's
where the error came from.
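
The effect of a too-narrow offset type can be illustrated in user space. The sketch below is not the nfsd code: the function names are hypothetical, and it assumes a 32-bit signed off_t and a sanity check that rejects negative offsets, neither of which is spelled out in the message above.

```c
#include <stdint.h>

/*
 * Hypothetical model: what a 64-bit file offset looks like after being
 * squeezed through a 32-bit signed type (the assumed width of off_t).
 */
int64_t offset_through_32bit(int64_t offset)
{
	int64_t low = offset & 0xFFFFFFFFLL;	/* keep the low 32 bits */

	if (low >= 0x80000000LL)		/* bit 31 set: reads as negative */
		low -= 0x100000000LL;
	return low;
}

/* Hypothetical sanity check of the kind a server might apply. */
int offset_is_sane(int64_t offset)
{
	return offset >= 0;
}
```

With these definitions, an offset of 3 GiB (3221225472) comes back as -1073741824 and fails the check, while small offsets pass through unchanged.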

6a69bfee

[PATCH] drivers/char/vt possible race · ce8b13c9

Andrew Morton authored Feb 18, 2004

From: Benjamin Herrenschmidt <benh@kernel.crashing.org>

I hit the crash in con_do_write() again, with driver->data being
NULL. It happens during boot, when userland is playing open/close games
with ttys. I was intentionally typing keys like mad during boot, trying to
trigger another problem, when this one popped up.

Looking at the code, I'm not sure how protected we are by the above (tty)
layer. paulus told me not to rely on any locking coming from there, so I
decided to extend the scope of the console semaphore a bit further to cover
races between calls to con_open, con_close and con_write.
Note that in con_do_write, I intentionally drop the semaphore to avoid
keeping it held when waiting on the local buffer, and I added some sanity
checks on tty->driver_data with some printk's in case we still have an open
race by the tty layer. At least now the pair vc_allocated &
tty->driver_data should be protected.

ce8b13c9

[PATCH] /proc thread visibility fixes · 9b6722ed

Andrew Morton authored Feb 18, 2004

From: Kingsley Cheung <kingsley@aurema.com>

It is possible to examine the data of tasks currently existing in the system
which are not threads of the same thread group.

For example, the only task in the group where init is group leader is itself:

gen2 02:50:44 ~: ls /proc/1/task
1

However, I can then read the contents of 'stat' for any other task in the
system:

gen2 02:49:45 ~: cat /proc/1/task/$$/stat
1669 (bash) S 1668 1669 1669 34816 1730 256 1480 6479 12 4 8 5 5 17 15 0 1 0
+8065 3252224 451 4294967295 134512640 134955932 3221225104 3221222840
+4294960144 0 65536 3686404 1266761467 3222442959 0 0 17 0 0 0

I had a look at fs/proc/base.c and found that the 'lookup' functions for
these directories were checking that the task in question existed, but
overlooked the following:

1.  In the function proc_pid_lookup, a check is required to ensure that
    the task in question is a thread group leader.  Without the check, any
    task can have its data retrieved accordingly.  Consider the following.
    There is a multithreaded process 1777.

gen2 23:22:47 /proc/1777: ls task
1777  1778  1779  1780  1781  1782  1783  1784  1785  1786  1787  1788

However, I can read the stat file for its thread 1778 as follows:

gen2 23:22:50 /proc/1777: cat /proc/1778/stat
1778 (multithreadtest) T 1777 1777 1672 34816 1672 64 0 0 0 0 14 17 0 0 15 0 12 0 8871 24727552 104 4294967295 134512640 134515104 3221222496 1077365276 4294960144 0 0 0 0 3222479248 0 0 -1 1 0 0

But 1778 is not meant to show up in /proc/, and indeed it is not listed:

gen2 23:22:56 /proc/1777: ls /proc/
1     1365  1661  1793  881        dma          kcore       scsi
10    1371  1662  18    9          driver       kmsg        self
1014  1372  1663  2     909        execdomains  loadavg     slabinfo
1032  14    1664  3     963        fb           locks       stat
1062  15    1665  4     966        filesystems  mdstat      swaps
1066  16    1666  5     buddyinfo  fs           meminfo     sys
1067  1605  1669  6     bus        ide          misc        sysrq-trigger
1087  1610  1670  7     cmdline    interrupts   modules     sysvipc
1095  1611  1671  736   cpuinfo    iomem        mounts      tty
11    1641  1672  8     crypto     ioports      mtrr        uptime
12    1658  17    807   devices    irq          net         version
13    1660  1777  810   diskstats  kallsyms     partitions  vmstat

2.  The other part of the bug is in the function proc_task_lookup.  Here
    there needs to be a check that the task X is indeed a thread of the
    thread group Y when we read /proc/<Y>/task/<X>.

Right now, this check does not exist, which allows any existing task to
have its data read from another thread group's directory.  The following
reads the stat file of my bash shell from thread group 1.

gen2 23:28:07 ~: cd /proc/1
gen2 23:28:10 /proc/1: ls
auxv     cwd      exe  maps  mounts  stat   status  wchan
cmdline  environ  fd   mem   root    statm  task
gen2 23:28:11 /proc/1: ls task
1
gen2 23:28:27 /proc/1: cat task/$$/stat
1671 (bash) S 1670 1671 1671 34817 1802 256 1953 8101 12 4 10 6 9 26 15 0 1 0 5789 3252224 454 4294967295 134512640 134955932 3221225104 3221222840 4294960144 0 65536 3686404 1266761467 3222442959 0 0 17 0 0 0
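
The two missing checks can be sketched as follows. This is a user-space illustration, not the code in fs/proc/base.c: struct task_sketch and the function names are hypothetical, and the sketch assumes that a thread group leader is the task whose pid equals its thread group id (tgid).

```c
/* Hypothetical stand-in for the kernel's task structure. */
struct task_sketch {
	int pid;	/* id of this task */
	int tgid;	/* id of the thread group it belongs to */
};

/* /proc/<pid>: only thread group leaders should be found. */
int pid_lookup_allowed(const struct task_sketch *t)
{
	return t->pid == t->tgid;
}

/* /proc/<Y>/task/<X>: X must be a thread of the group led by Y. */
int task_lookup_allowed(const struct task_sketch *leader,
			const struct task_sketch *t)
{
	return pid_lookup_allowed(leader) && t->tgid == leader->tgid;
}
```

Under these assumptions, thread 1778 of process 1777 would be rejected as /proc/1778 but accepted as /proc/1777/task/1778, and the bash shell would be rejected under /proc/1/task/.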

9b6722ed

[PATCH] Minor cross-compile issues · 8bbb25c3

Andrew Morton authored Feb 18, 2004

From: Pratik Solanki <pratik.solanki@timesys.com>

- Fix include path for build.c so that it finds asm/boot.h.
  /usr/include/asm/boot.h may not be present when cross-compiling on a
  non-Linux machine.

- $(CONFIG_SHELL) instead of sh.

8bbb25c3

[PATCH] cpufreq_scale() fixes · 2da050c4

Andrew Morton authored Feb 18, 2004

From: Dominik Brodowski <linux@dominikbrodowski.de>

Use do_div on 32-bit archs in cpufreq_scale, and native "/" on 64-bit
archs.
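
A user-space sketch of the scaling arithmetic, assuming cpufreq_scale computes old * mult / div: the point is that the product needs a 64-bit intermediate. The names below are hypothetical, uint32_t models a 32-bit unsigned long, and plain C division stands in for do_div, which is a kernel-only helper. The divisor is assumed to be non-zero.

```c
#include <stdint.h>

/* Hypothetical sketch: scale `old` by mult/div without overflowing. */
uint32_t scale_sketch(uint32_t old, uint32_t div, uint32_t mult)
{
	uint64_t result = (uint64_t)old * mult;	/* 64-bit product */

	result /= div;	/* a 32-bit kernel would use do_div() here */
	return (uint32_t)result;
}

/* The same computation with a 32-bit intermediate, for comparison. */
uint32_t scale_overflowing(uint32_t old, uint32_t div, uint32_t mult)
{
	return old * mult / div;	/* product wraps modulo 2^32 */
}
```

For old = 4000000, div = 1000000 and mult = 2000000, scale_sketch returns 8000000, while the 32-bit variant returns a wrong value because the product does not fit in 32 bits.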

2da050c4

[PATCH] defer panic for too many items in boot parameter line · f9d4cdfc

Andrew Morton authored Feb 18, 2004

From: Werner Almesberger <werner@almesberger.net>

When passing too many unrecognized boot command line options (which become
arguments or environment variables), the 2.6 kernel panics (unlike 2.4,
which just ignores the extra items).  Unfortunately, this happens before
the console is initialized, so all you get is a kernel that dies quickly,
for no apparent reason.

This is particularly irritating if using UML with
init=something wi th a lot of ar gu men t s

The patch below delays the panic until after console_init.

(akpm: I mainly added this in because we have other places where the
panic-later-on machinery is needed).
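
The deferral pattern can be sketched in a few lines. This is an illustration only: the names below are hypothetical, and the sketch returns the pending message where the kernel would call panic().

```c
#include <stddef.h>

/* Hypothetical: message recorded early, before any console exists. */
static const char *deferred_panic_msg;

/* Called during early command-line parsing instead of panicking. */
void note_early_failure(const char *msg)
{
	if (deferred_panic_msg == NULL)	/* keep the first failure */
		deferred_panic_msg = msg;
}

/*
 * Called once the console is up.  Returns the message that would now
 * be passed to panic(), or NULL if nothing went wrong.
 */
const char *check_deferred_panic(void)
{
	return deferred_panic_msg;
}
```

Recording only the first failure keeps the eventual message tied to the original cause rather than to a follow-on error.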

f9d4cdfc

[PATCH] adfs: remove a kernel 2.2 #ifdef · 0ad0b87d

Andrew Morton authored Feb 18, 2004

From: Adrian Bunk <bunk@fs.tum.de>

The patch below removes a kernel 2.2 #ifdef from fs/adfs/adfs.h .

Note that this #ifdef was only present in the header, the implementation
of adfs_bmap was already removed.

0ad0b87d

[PATCH] kbuild documentation fix · 4f3a9491

Andrew Morton authored Feb 18, 2004

From: Ryan Boder <icanoop@bitwiser.org>

Explains how to compile external modules in
Documentation/kbuild/modules.txt.

4f3a9491

[PATCH] remove kernel 2.2 #ifdef's from {i,}stallion.h · c9700b7e

Andrew Morton authored Feb 18, 2004

From: Adrian Bunk <bunk@fs.tum.de>

The patch below removes kernel 2.2 #ifdef's from {i,}stallion.h .

c9700b7e

[PATCH] OSS: remove #ifdef's for kernel 2.0 · 32856f32

Andrew Morton authored Feb 18, 2004

From: Adrian Bunk <bunk@fs.tum.de>

The patch below removes two #ifdef's for kernel 2.0 from OSS.

32856f32

[PATCH] Rename bitmap_snprintf() and cpumask_snprintf() to *_scnprintf() · e9dc2e51

Andrew Morton authored Feb 18, 2004

From: Joe Korty <joe.korty@ccur.com>

Rename bitmap_snprintf() to bitmap_scnprintf() and cpumask_snprintf() to
cpumask_scnprintf(), as these functions now belong to the scnprintf family
of functions.

e9dc2e51

[PATCH] MCE fixes and cleanups · 3aa6ed84

Andrew Morton authored Feb 18, 2004

Andi notes that the

	smp_call_function(foo);
	foo();

in there is incorrect on preemptible kernels.

Fix that by using on_each_cpu(), which takes care of such things.

Also, remove the open-coded timer from here.  We have
schedule_delayed_work().

And remove the `timerset' variable, which doesn't do anything.

3aa6ed84

[PATCH] Fix printk level on non fatal MCEs · 2d943d44

Andrew Morton authored Feb 18, 2004

From: Andi Kleen <ak@suse.de>

For various reasons non fatal Machine Checks can happen on Athlons (e.g.
we have reports that laptops like to trigger them on suspend/resume)

They are not necessarily fatal and often only minor hardware glitches.

But what's annoying is that they're KERN_EMERG and pollute your console and
scare the user into writing confused kernel bug reports.

This patch just replaces the KERN_EMERGs with KERN_INFO for now. Longer
term I think it would be better to log this stuff into a separate log.

2d943d44

[PATCH] 8259 timer ack fix · 660ab10c

Andrew Morton authored Feb 18, 2004

From: "Maciej W. Rozycki" <macro@ds2.pg.gda.pl>

Fix up the 8259 ack handling for buggy SMM firmware.

See http://www.ussg.iu.edu/hypermail/linux/kernel/0203.2/0956.html

Apparently the embedded 8259A-compatible core is not fully functional.
This patch lets the I/O APIC-driven NMI watchdog function correctly.
Credit to Ross Dickson for discovering this.

660ab10c

[PATCH] dm: drop BIO_SEG_VALID bit · d4634c58

Andrew Morton authored Feb 18, 2004

From: Joe Thornber <thornber@redhat.com>

I just noticed that bio_clone copies the BIO_SEG_VALID bit from the original
bio when it was set. When we modify bi_idx or bi_vcnt afterwards the segment
counts are invalid and the bit must be dropped (though it is fairly unlikely
that it has already been set). [Christophe Saout]

d4634c58

[PATCH] dm: Remove redundant spin lock in dec_pending() · 01fce686

Andrew Morton authored Feb 18, 2004

From: Joe Thornber <thornber@redhat.com>

Remove redundant spin lock in dec_pending()

01fce686

[PATCH] dm: Zero size target sanity check · bc553993

Andrew Morton authored Feb 18, 2004

From: Joe Thornber <thornber@redhat.com>

Add sanity check to dm_table_add_target() against zero length targets.
[Christophe Saout]

bc553993

[PATCH] dm: Correct GFP flag in dm_table_create() · 2c2eae81

Andrew Morton authored Feb 18, 2004

From: Joe Thornber <thornber@redhat.com>

For some reason dm_table_create() was allocating GFP_NOIO rather than
GFP_KERNEL.

2c2eae81

[PATCH] dm: Tidy up the error path for alloc_dev() · 6b1b56f9

Andrew Morton authored Feb 18, 2004

From: Joe Thornber <thornber@redhat.com>

Tidy up the error path for alloc_dev()

6b1b56f9

[PATCH] dm: Maintain ordering when deferring bios · 54e37e09

Andrew Morton authored Feb 18, 2004

From: Joe Thornber <thornber@redhat.com>

Make sure that we maintain ordering when deferring bios.

54e37e09

[PATCH] dm: Get rid of struct dm_deferred_io in dm.c · a0befbbc

Andrew Morton authored Feb 18, 2004

From: Joe Thornber <thornber@redhat.com>

Remove struct dm_deferred_io from dm.c.  [Christophe Saout]

a0befbbc

[PATCH] dm: Move to_bytes() and to_sectors() into dm.h · 0901c174

Andrew Morton authored Feb 18, 2004

From: Joe Thornber <thornber@redhat.com>

Move to_bytes() and to_sectors() into dm.h

0901c174

[PATCH] dm: Export dm_vcalloc() · c087ec3d

Andrew Morton authored Feb 18, 2004

From: Joe Thornber <thornber@redhat.com>

Export dm_vcalloc()

c087ec3d

[PATCH] md: Allow partitioning of MD devices. · 1797a796

Andrew Morton authored Feb 18, 2004

From: NeilBrown <neilb@cse.unsw.edu.au>

With this patch, md uses two major numbers for arrays.

One major is number 9, with name 'md', and has unpartitioned md arrays, one
per minor number.

The other major is allocated dynamically, with name 'mdp', and has one array
for every 64 minors, allowing for up to 63 partitions.

The arrays under one major are completely separate from the arrays under the
other.

The preferred names for devices with the new major are of the form:

  /dev/md/d1p3  # partition 3 of device 1 - minor 67

When a partitioned md device is assembled, the partitions are not recognised
until after the whole-array device is opened again.  A future version of
mdadm will perform this open so that the need will be transparent.
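
The minor-number layout described above can be written out as arithmetic. The sketch assumes 64 minors per array, with partition 0 standing for the whole array; the macro and function names are hypothetical.

```c
#define MINORS_PER_ARRAY 64	/* from the description: one array per 64 minors */

/* Minor number for partition `part` of partitioned array `dev`. */
int mdp_minor_sketch(int dev, int part)
{
	return dev * MINORS_PER_ARRAY + part;
}

/* Inverse mapping: which array and which partition a minor refers to. */
int mdp_array_of(int minor)
{
	return minor / MINORS_PER_ARRAY;
}

int mdp_part_of(int minor)
{
	return minor % MINORS_PER_ARRAY;
}
```

For example, /dev/md/d1p3 maps to minor 1 * 64 + 3 = 67, matching the example in the message.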

1797a796

[PATCH] md: Dynamically limit size of bio requests used for raid1 resync · 5077fef0

Andrew Morton authored Feb 18, 2004

From: NeilBrown <neilb@cse.unsw.edu.au>

Currently raid1 uses PAGE_SIZE read/write requests for resync, as it doesn't
know how to honour per-device restrictions. This patch uses bio_add_page
to honour those restrictions and ups the limit on request size to 64K. This
has a measurable impact on rebuild speed (25M/s -> 60M/s)

5077fef0

[PATCH] md: Avoid unnecessary bio allocation during raid1 resync · 89654f5b

Andrew Morton authored Feb 18, 2004

From: NeilBrown <neilb@cse.unsw.edu.au>

For each resync request, we allocate a "r1_bio" which has a bio "master_bio"
attached that goes largely unused.  We also allocate a read_bio which is
used.  This patch removes the read_bio and just uses the master_bio instead.

This fixes a bug wherein bi_bdev of the master_bio wasn't being set, but was
being used.

We also introduce a new "sectors" field into the r1_bio as we can no longer
rely on master_bio->bi_sectors.

89654f5b

[PATCH] md: Remove some un-needed fields from r1bio_s · d0d464b1

Andrew Morton authored Feb 18, 2004

From: NeilBrown <neilb@cse.unsw.edu.au>

next_r1 is never used, so it can just go.

read_bio isn't needed as we can easily use one of the pointers in the
write_bios array - write_bios[->read_disk].  So rename "write_bios" to "bios"
and store the pointer to the read bio in there.

d0d464b1

[PATCH] md: Discard the cmd field from r1_bio structure · ebf7768e

Andrew Morton authored Feb 18, 2004

From: NeilBrown <neilb@cse.unsw.edu.au>

The only time it is really needed is to differentiate a retry-on-fail from a
write-after-read-for-resync request to raid1d.  So we use a bit in 'state'
for that.

ebf7768e

[PATCH] md: Split read and write end_request handlers · c1dd448e

Andrew Morton authored Feb 18, 2004

From: NeilBrown <neilb@cse.unsw.edu.au>

Instead of having a single end_request handler that must determine whether it
was a read or a write request, we have two separate handlers, which makes
each of them easier to follow.

c1dd448e

[PATCH] md: Print "deprecated" warning when START_ARRAY is used. · a2c4e506

Andrew Morton authored Feb 18, 2004

From: NeilBrown <neilb@cse.unsw.edu.au>

The "START_ARRAY" ioctl depends on major/minor numbers (as stored in the raid
superblock) being stable over reboots, which is increasingly untrue.

There are better ways to start an array (e.g. with mdadm) so we mark the
ioctl as deprecated for 2.6, and will remove it in 2.7.

a2c4e506

[PATCH] kNFSd:fix build problems in nfs w/o proc_fs on 2.6.0-test5 · 67afcb4f

Andrew Morton authored Feb 18, 2004

From: NeilBrown <neilb@cse.unsw.edu.au>

From: Stephen Hemminger <shemminger@osdl.org>
Date: Fri, 12 Sep 2003 11:31:06 -0700

NFS won't build w/o CONFIG_PROC_FS.  Looks like typos (or a C++
programmer) in stats.h

67afcb4f

[PATCH] kNFSd: convert NFS /proc interfaces to seq_file · 2a0807bd

Andrew Morton authored Feb 18, 2004

From: NeilBrown <neilb@cse.unsw.edu.au>

From: shemminger@osdl.org Sat Sep  6 09:19:50 2003
Date: Fri, 5 Sep 2003 16:19:30 -0700

Converts /proc/net/rpc/nfs and /proc/net/rpc/nfsd to use the simpler
seq_file interface.

2a0807bd

[PATCH] kNFSd: ip_map_init does a kmalloc which isn't checked... · bbcc5fa8

Andrew Morton authored Feb 18, 2004

From: NeilBrown <neilb@cse.unsw.edu.au>

There is no way to return an error from a cache init routine, so instead we
make sure to pre-allocate the memory needed, and free it after the lookup
if the lookup failed.

bbcc5fa8

[PATCH] kNFSd: Allow sunrpc/svc cache init function to modify the "key" · 9417bd87

Andrew Morton authored Feb 18, 2004

From: NeilBrown <neilb@cse.unsw.edu.au>

When adding an item to a sunrpc/svc cache that contains kmalloced data it is
useful to move the malloced data out of the key object into the new cache
object rather than copying (as then we would need to cope with kmalloc
failure and such).  This means modifying the original.

If the kmalloced data forms part of the key, then we must not move the data
out until after the key isn't needed any more.  So this patch moves the
call to "INIT" on a new item (which fills in the key) to *after* the item
has been found (or not), and also makes sure we only call the HASH function
once.

Thanks to "J.  Bruce Fields" <bfields@fieldses.org>

also

 1/ remove unnecessary assignment
 2/ fix comments that lag behind implementation.

9417bd87

[PATCH] kNFSd: Fix possible scheduling_while_atomic in cache.c · 16b82dca

Andrew Morton authored Feb 18, 2004

From: NeilBrown <neilb@cse.unsw.edu.au>

We currently call cache_put, which can schedule(), under a spin_lock.  This
patch moves that call outside the spinlock.

16b82dca

[PATCH] #if versus #ifdef cleanup · c65febbb

Andrew Morton authored Feb 18, 2004

From: Valdis.Kletnieks@vt.edu

15 changes of #if to #ifdef, and 2 places where CONFIG_FOO should be
defined(CONFIG_FOO).  This gets rid of spurious warnings when you build with
"-Wundef", so that you get a warning only for a preprocessor command like:

#if CONFIG_ETRAX_DS1302_RSTBIT == 27

and you'll be told if it's substituting a zero rather than silent
weirdness and unexpected code generation.
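
A small illustration of the behaviour being guarded against. In standard C, an identifier that is not defined as a macro evaluates to 0 inside #if, so a test against an unset option can silently take a branch. CONFIG_SKETCH_OPTION and the function names are hypothetical; the macro is deliberately left undefined.

```c
/* CONFIG_SKETCH_OPTION is intentionally NOT defined anywhere. */

/* Plain #if: the undefined macro is silently replaced by 0. */
int plain_if_sees_zero(void)
{
#if CONFIG_SKETCH_OPTION == 0
	return 1;	/* taken, although the option was never set */
#else
	return 0;
#endif
}

/* Guarded form: the comparison only counts if the macro exists. */
int guarded_if_sees_zero(void)
{
#if defined(CONFIG_SKETCH_OPTION) && CONFIG_SKETCH_OPTION == 0
	return 1;
#else
	return 0;	/* taken: the option is not defined */
#endif
}
```

Built without the macro, plain_if_sees_zero() returns 1 and guarded_if_sees_zero() returns 0; GCC's -Wundef flags the first form.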

c65febbb

[PATCH] MIPS: New 2.6 serial drivers · b7df53b3

Andrew Morton authored Feb 18, 2004

From: Ralf Baechle <ralf@linux-mips.org>

Three new MIPS-specific serial drivers. ip22.c is derived from the sparc
zilog driver; guess we should write a generic Zilog driver sometime ...

b7df53b3

[PATCH] Enable coredumps > 2GB · 95b387a4

Andrew Morton authored Feb 18, 2004

From: Andi Kleen <ak@muc.de>

Some x86-64 users were complaining that coredumps >2GB don't work.

This will enable large coredump for everybody.  Apparently the 32bit
gdb/binutils cannot handle them, but I hear the binutils people are working
on fixing that.  I doubt it will harm people - unreadable coredumps are not
worse than no coredump and it won't make any difference in space usage if
you get a 1.99GB or a 2.5GB coredump.  So just enable it unconditionally.
If it should be really a problem for 32bit the rlimit defaults in
resource.h could be changed.

For file systems that don't support O_LARGEFILE you should just get a
truncated coredump for big address spaces.

95b387a4

[PATCH] devfs: race fixes and cleanup · bf98c406

Andrew Morton authored Feb 18, 2004

From: Andrey Borzenkov <arvidjaar@mail.ru>

- use struct nameidata in devfs_d_revalidate_wait to detect when it is
  called without i_sem held; take i_sem on parent in this case.  This
  prevents both deadlock with devfs_lookup by allowing it to drop i_sem
  consistently and oops in d_instantiate by ensuring that it always runs
  protected

- remove dead code that deals with major number allocation.  The only
  remaining user was devfs itself and the patch changes it to use
  register_chardev to get the device number for internal /dev/.devfsd and
  /dev/.statd.

- remove dead auto allocation flag as well

- remove code that does module get on dev open - it is handled by fops_get.
   Use init_special_inode consistently

- get rid of struct cdev_type and bdev_type - both have just single dev_t
  now

bf98c406

[PATCH] snprintf fixes · 01d1a791

Andrew Morton authored Feb 18, 2004

From: Juergen Quade <quade@hsnr.de>

Lots of places in the kernel are using [v]snprintf wrongly: they assume it
returns the number of characters copied.  It doesn't.  It returns the
number of characters which _would_ have been copied had the buffer not been
filled up.

So create new functions vscnprintf() and scnprintf() which have the
expected (sane) semantics, and migrate callers over to using them.
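
The difference in return values can be seen with the C library's snprintf. The wrapper below is a user-space sketch of the scnprintf semantics described above, not the kernel implementation; the name scnprintf_sketch is hypothetical.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical user-space sketch: like snprintf(), but returns the
 * number of characters actually stored in buf (excluding the NUL).
 */
int scnprintf_sketch(char *buf, size_t size, const char *fmt, ...)
{
	va_list args;
	int n;

	va_start(args, fmt);
	n = vsnprintf(buf, size, fmt, args);
	va_end(args);

	if (n < 0 || size == 0)		/* error, or nothing could be stored */
		return 0;
	if ((size_t)n >= size)		/* output was truncated */
		return (int)(size - 1);
	return n;
}
```

With an 8-byte buffer and the string "hello, world", snprintf returns 12 (the length that would have been written), while the wrapper returns 7, the number of characters stored.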

01d1a791