Commits · 7c876d457b5c7e949032a4ac7aec64af0136d52a · Kirill Smelkov / linux

04 Jan, 2007 4 commits

Fix incorrect user space access locking in mincore() (CVE-2006-4814) · 7c876d45

Linus Torvalds authored Jan 04, 2007

Doug Chapman noticed that mincore() will doa "copy_to_user()" of the
result while holding the mmap semaphore for reading, which is a big
no-no.  While a recursive read-lock on a semaphore in the case of a page
fault happens to work, we don't actually allow them due to deadlock
schenarios with writers due to fairness issues.

Doug and Marcel sent in a patch to fix it, but I decided to just rewrite
the mess instead - not just fixing the locking problem, but making the
code smaller and (imho) much easier to understand.

Also included are two fixes for the original patch including one
by Oleg Nesterov.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

7c876d45

fuse: fix hang on SMP · 571525bb

Miklos Szeredi authored Jan 04, 2007

Fuse didn't always call i_size_write() with i_mutex held which caused
rare hangs on SMP/32bit.  This bug has been present since fuse-2.2,
well before being merged into mainline.

The simplest solution is to protect i_size_write() with the
per-connection spinlock.  Using i_mutex for this purpose would require
some restructuring of the code and I'm not even sure it's always safe
to acquire i_mutex in all places i_size needs to be set.

Since most of vmtruncate is already duplicated for other reasons,
duplicate the remaining part as well, making all i_size_write() calls
internal to fuse.

Using i_size_write() was unnecessary in fuse_init_inode(), since this
function is only called on a newly created locked inode.

Reported by a few people over the years, but special thanks to Dana
Henriksen who was persistent enough in helping me debug it.

Adrian Bunk:
Backported to 2.6.16.
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

571525bb

[PKTGEN]: Fix module load/unload races. · e79366b5

Robert Olsson authored Jan 04, 2007

Adrian Bunk:
Backported to 2.6.16.
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

e79366b5

i2c: fix broken ds1337 initialization · 51b73a03

Dirk Eibach authored Jan 04, 2007

On a custom board with ds1337 RTC I found that upgrade from 2.6.15 to
2.6.18 broke RTC support.

The main problem are changes to ds1337_init_client().
When a ds1337 recognizes a problem (e.g. power or clock failure) bit 7
in status register is set. This has to be reset by writing 0 to status
register. But since there are only 16 byte written to the chip and the
first byte is interpreted as an address, the status register (which is
the 16th) is never written.
The other problem is, that initializing all registers to zero is not
valid for day, date and month register. Funny enough this is checked by
ds1337_detect(), which depends on this values not being zero. So then
treated by ds1337_init_client() the ds1337 is not detected anymore,
whereas the failure bit in the status register is still set.

Broken by commit f9e89579 (2.6.16-rc1,
2006-01-06). This fix is in Linus' tree since 2.6.20-rc1 (commit
763d9c04).
Signed-off-by: Dirk Stieler <stieler@gdsys.de>
Signed-off-by: Dirk Eibach <eibach@gdsys.de>
Signed-off-by: Jean Delvare <khali@linux-fr.org>

51b73a03

03 Jan, 2007 1 commit

NET_SCHED: Fix fallout from dev->qdisc RCU change · 83d285a2

Patrick McHardy authored Jan 04, 2007

The move of qdisc destruction to a rcu callback broke locking in the
entire qdisc layer by invalidating previously valid assumptions about
the context in which changes to the qdisc tree occur.

The two assumptions were:

- since changes only happen in process context, read_lock doesn't need
  bottem half protection. Now invalid since destruction of inner qdiscs,
  classifiers, actions and estimators happens in the RCU callback unless
  they're manually deleted, resulting in dead-locks when read_lock in
  process context is interrupted by write_lock_bh in bottem half context.

- since changes only happen under the RTNL, no additional locking is
  necessary for data not used during packet processing (f.e. u32_list).
  Again, since destruction now happens in the RCU callback, this assumption
  is not valid anymore, causing races while using this data, which can
  result in corruption or use-after-free.

Instead of "fixing" this by disabling bottem halfs everywhere and adding
new locks/refcounting, this patch makes these assumptions valid again by
moving destruction back to process context. Since only the dev->qdisc
pointer is protected by RCU, but ->enqueue and the qdisc tree are still
protected by dev->qdisc_lock, destruction of the tree can be performed
immediately and only the final free needs to happen in the rcu callback
to make sure dev_queue_xmit doesn't access already freed memory.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

83d285a2

26 Dec, 2006 1 commit
- Linux 2.6.16.37 · ec7b3c30
  Adrian Bunk authored Dec 26, 2006
  
  ec7b3c30
18 Dec, 2006 4 commits

Linux 2.6.16.37-rc1 · fc190993
Adrian Bunk authored Dec 18, 2006

fc190993

NFS: nfs_lookup - don't hash dentry when optimising away the lookup · 1488da20

Trond Myklebust authored Dec 18, 2006

If the open intents tell us that a given lookup is going to result in a,
exclusive create, we currently optimize away the lookup call itself. The
reason is that the lookup would not be atomic with the create RPC call, so
why do it in the first place?

A problem occurs, however, if the VFS aborts the exclusive create operation
after the lookup, but before the call to create the file/directory: in this
case we will end up with a hashed negative dentry in the dcache that has
never been looked up.
Fix this by only actually hashing the dentry once the create operation has
been successfully completed.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

1488da20

[SCSI] DAC960: PCI id table fixup · e634599b

Brian King authored Dec 18, 2006

The PCI ID table in the DAC960 driver conflicts with some devices
that use the ipr driver. All ipr adapters that use this chip
have an IBM subvendor ID and all DAC960 adapters that use this
chip have a Mylex subvendor id.
Signed-off-by: Brian King <brking@us.ibm.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

e634599b

bridge-netfilter: don't overwrite memory outside of skb · d0354e3b

Stephen Hemminger authored Dec 18, 2006

The bridge netfilter code needs to check for space at the
front of the skb before overwriting; otherwise if skb from
device doesn't have headroom, then it will cause random
memory corruption.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

d0354e3b

17 Dec, 2006 11 commits

hvc_console suspend fix · 17735bd9

Andrew Morton authored Dec 18, 2006

Fix http://bugzilla.kernel.org/show_bug.cgi?id=7152Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

17735bd9

[WATCHDOG] sc1200wdt.c pnp unregister fix. · 8132c7a5

Akinobu Mita authored Dec 18, 2006

If no devices found or invalid parameter is specified,
scl200wdt_pnp_driver is left unregistered.
It breaks global list of pnp drivers.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

8132c7a5

[WATCHDOG] sc1200wdt.c printk fix · 18d16ac9

Dave Jones authored Dec 18, 2006

Fix printk output.

sc1200wdt: build 20020303<3>sc1200wdt: io parameter must be specified
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

18d16ac9

ISDN: fix drivers, by handling errors thrown by ->readstat() · 99146a21

Jeff Garzik authored Dec 18, 2006

This is a particularly ugly on-failure bug, possibly security, since the
lack of error handling here is covering up another class of bug: failure to
handle copy_to_user() return values.

The I4L API function ->readstat() returns an integer, and by looking at
several existing driver implementations, it is clear that a negative return
value was meant to indicate an error.

Given that several drivers already return a negative value indicating an
errno-style error, the current code would blindly accept that [negative]
value as a valid amount of bytes read.  Obvious damage ensues.

Correcting ->readstat() handling to properly notice errors fixes the
existing code to work correctly on error, and enables future patches to
more easily indicate errors during operation.
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

99146a21

r8169: tweak the PCI data parity error recovery · 83eda202

Francois Romieu authored Dec 17, 2006

The 8110SB based n2100 board signals a lot of what ought to be
PCI data parity errors durint operation of the 8169 as target.
Experiment proved that the driver can ignore the error and
process the packet as if nothing had happened.

Let's add an ad-hoc knob to enable users to fix their system while
avoiding the risks of a wholesale change.
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

83eda202

r8169: fix infinite loop during hotplug · 1d244821

Arnaud Patard authored Dec 17, 2006

Bug reported for PCMCIA.
Signed-off-by: Arnaud Patard <apatard@mandriva.com>
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

1d244821

r8169: RX fifo overflow recovery · 0263245a

Francois Romieu authored Dec 17, 2006

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

0263245a

x86-64: Mark rdtsc as sync only for netburst, not for core2 · 5871db77

Arjan van de Ven authored Dec 17, 2006

On the Core2 cpus, the rdtsc instruction is not serializing (as defined
in the architecture reference since rdtsc exists) and due to the deep
speculation of these cores, it's possible that you can observe time go
backwards between cores due to this speculation. Since the kernel
already deals with this with the SYNC_RDTSC flag, the solution is
simple, only assume that the instruction is serializing on family 15...

The price one pays for this is a slightly slower gettimeofday (by a
dozen or two cycles), but that increase is quite small to pay for a
really-going-forward tsc counter.

Backport by Chris Wright.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

5871db77

[IPV4] ip_fragment: Always compute hash with ipfrag_lock held. · 862c2997

David S. Miller authored Dec 17, 2006

Otherwise we could compute an inaccurate hash due to the
random seed changing.

Noticed by Zach Brown and patch is based upon some feedback
from Herbert Xu.
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

862c2997

IA64: bte_unaligned_copy() transfers one extra cache line. · 4adf3c78

Robin Holt authored Dec 17, 2006

When called to do a transfer that has a start offset within the cache
line which is uneven between source and destination and a length which
terminates the source of the copy exactly on a cache line, one extra
line gets copied into a temporary buffer.  This is normally not an issue
since the buffer is a kernel buffer and only the requested information
gets copied into the user buffer.

The problem arises when the source ends at the very last physical page
of memory.  That last cache line does not exist and results in the SHUB
chip raising an MCA.
Signed-off-by: Robin Holt <holt@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

4adf3c78

scsi: clear garbage after CDBs on SG_IO · 961428b9

Tejun Heo authored Dec 17, 2006

ATAPI devices transfer fixed number of bytes for CDBs (12 or 16).  Some
ATAPI devices choke when shorter CDB is used and the left bytes contain
garbage.  Block SG_IO cleared left bytes but SCSI SG_IO didn't.  This patch
makes SCSI SG_IO clear it and simplify CDB clearing in block SG_IO.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

961428b9

15 Dec, 2006 5 commits

AGP: Allocate AGP pages with GFP_DMA32 by default · dcc6e343

Linus Torvalds authored Dec 15, 2006

Not all graphic page remappers support physical addresses over the 4GB
mark for remapping, so while some do (the AMD64 GART always did, and I
just fixed the i965 to do so properly), we're safest off just forcing
GFP_DMA32 allocations to make sure graphics pages get allocated in the
low 32-bit address space by default.

AGP sub-drivers that really care, and can do better, could just choose
to implement their own allocator (or we could add another "64-bit safe"
default allocator for their use), but quite frankly, you're not likely
to care in practice.

So for now, this trivial change means that we won't be allocating pages
that we can't map correctly by mistake on x86-64.

[ On traditional 32-bit x86, this could never happen, because GFP_KERNEL
  would never allocate any highmem memory anyway ]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

dcc6e343

md: Fix md grow/size code to correctly find the maximum available space · 75ba82c6

Neil Brown authored Dec 15, 2006

An md array can be asked to change the amount of each device that it is using,
and in particular can be asked to use the maximum available space.  This
currently only works if the first device is not larger than the rest.  As
'size' gets changed and so 'fit' becomes wrong.  So check if a 'fit' is
required early and don't corrupt it.
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

75ba82c6

softirq: remove BUG_ONs which can incorrectly trigger · e89da8fc

Zachary Amsden authored Dec 15, 2006

It is possible to have tasklets get scheduled before softirqd has had a chance
to spawn on all CPUs. This is totally harmless; after success during action
CPU_UP_PREPARE, action CPU_ONLINE will be called, which immediately wakes
softirqd on the appropriate CPU to process the already pending tasklets. So
there is no danger of having a missed wakeup for any tasklets that were
already pending.

In particular, i386 is affected by this during startup, and is visible when
using a very large initrd; during the time it takes for the initrd to be
decompressed, a timer IRQ can come in and schedule RCU callbacks. It is also
possible that resending of a hardware IRQ via a softirq triggers the same bug.

Because of different timing conditions, this shows up in all emulators and
virtual machines tested, including Xen, VMware, Virtual PC, and Qemu. It is
also possible to trigger on native hardware with a large enough initrd,
although I don't have a reliable case demonstrating that.
Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

e89da8fc

dm crypt: Fix data corruption with dm-crypt over RAID5 · a26b7719

Christophe Saout authored Dec 15, 2006

Fix corruption issue with dm-crypt on top of software raid5. Cancelled
readahead bio's that report no error, just have BIO_UPTODATE cleared
were reported as successful reads to the higher layers (and leaving
random content in the buffer cache). Already fixed in 2.6.19.
Signed-off-by: Christophe Saout <christophe@saout.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

a26b7719

Fix SUNRPC wakeup/execute race condition · bbb97831

Christophe Saout authored Dec 15, 2006

The sunrpc scheduler contains a race condition that can let an RPC
task end up being neither running nor on any wait queue. The race takes
place between rpc_make_runnable (called from rpc_wake_up_task) and
__rpc_execute under the following condition:

First __rpc_execute calls tk_action which puts the task on some wait
queue. The task is dequeued by another process before __rpc_execute
continues its execution. While executing rpc_make_runnable exactly after
setting the task `running' bit and before clearing the `queued' bit
__rpc_execute picks up execution, clears `running' and subsequently
both functions fall through, both under the false assumption somebody
else took the job.

Swapping rpc_test_and_set_running with rpc_clear_queued in
rpc_make_runnable fixes that hole. This introduces another possible
race condition that can be handled by checking for `queued' after
setting the `running' bit.

Bug noticed on a 4-way x86_64 system under XEN with an NFSv4 server
on the same physical machine, apparently one of the few ways to hit
this race condition at all.
Signed-off-by: Christophe Saout <christophe@saout.de>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

bbb97831

14 Dec, 2006 14 commits

[ALSA] fix usbmixer double kfree · ea09f457

Dave Jones authored Dec 15, 2006

snd_ctl_add() kfree's the kcontrol already if we fail there,
so this driver is currently doing a double kfree.

Coverity bug #959
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

ea09f457

[ALSA] sound/isa/sb/sb_mixer.c double kfree · 7819bfbb

Dave Jones authored Dec 15, 2006

snd_ctl_add() already does the free on error.

Coverity bug #957
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

7819bfbb

[ALSA] Fix use after free in opl3_seq and opl3_oss · 6f6c1475

Dave Jones authored Dec 15, 2006

Don't read from free'd memory.  Also make use of the return
value, and don't register the device if something went wrong
creating the port.

Coverity #954, #955
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

6f6c1475

[ALSA] ad1848 double free · 79bb7147

Dave Jones authored Dec 15, 2006

snd_ctl_add() already kfree's on error.

Coverity #956
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

79bb7147

[ALSA] sound/pci/rme9652/hdspm.c: fix off-by-one errors · 5bee6f62

Adrian Bunk authored Dec 15, 2006

This patch fixes off-by-one errors found by the Coverity checker.
Signed-off-by: Adrian Bunk <bunk@stusta.de>

5bee6f62

[ALSA] fix some memory leaks · e2cde312

Adrian Bunk authored Dec 15, 2006

This patch fixes two memory leaks spotted by the Coverity checker.
Signed-off-by: Adrian Bunk <bunk@stusta.de>

e2cde312

[ALSA] sound/core/: fix 3 off-by-one errors · 2b126698

Adrian Bunk authored Dec 15, 2006

This patch fixes three off-by-one errors found by the Coverity checker.
Signed-off-by: Adrian Bunk <bunk@stusta.de>

2b126698

IDE: Add the support of nvidia PATA controllers of MCP67 to amd74xx.c · bbd75502

Peer Chen authored Dec 14, 2006

Add support for PATA controllers of MCP67 to amd74xx.c.
Signed-off-by: Peer Chen <pchen@nvidia.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

bbd75502

pci_ids.h: Add NVIDIA PCI ID · d84a90e3
Peer Chen authored Dec 14, 2006
```
Signed-off-by: Adrian Bunk <bunk@stusta.de>
```
d84a90e3
amd74xx.c: add some NVIDIA chipset IDs · 88c9c162
Randy Dunlap authored Dec 14, 2006
```
Add some nVidia chipset ID's support.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
```
88c9c162

sata_nv/amd74xx: Add MCP61 support · 2633840d

Andrew Chew authored Dec 14, 2006

Added MCP61 support to sata_nv and amd74xx.
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

2633840d

[libata] sata_nv: add PCI IDs · 74e7ab11

Jeff Garzik authored Dec 14, 2006

Based on a patch contributed by Andrew Chew @ NVIDIA.
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

74e7ab11

dm snapshot: fix metadata writing when suspending · 87b79cef

Mark McLoughlin authored Dec 14, 2006

When suspending a device-mapper device, dm_suspend() sleeps until all
necessary I/O is completed.  This state is triggered by a callback from
persistent_commit().  But some I/O can still be issued *after* the callback
(to prepare the next metadata area for use if the current one is full).  This
patch delays the callback until after that I/O is complete.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

87b79cef

dm: Fix deadlock under high i/o load in raid1 setup. · 970d548a

Daniel Kobras authored Dec 14, 2006

On an nForce4-equipped machine with two SATA disk in raid1 setup using dmraid,
we experienced frequent deadlock of the system under high i/o load.  'cat
/dev/zero > ~/zero' was the most reliable way to reproduce them: Randomly
after a few GB, 'cp' would be left in 'D' state along with kjournald and
kmirrord.  The functions cp and kjournald were blocked in did vary, but
kmirrord's wchan always pointed to 'mempool_alloc()'.  We've seen this pattern
on 2.6.15 and 2.6.17 kernels.  http://lkml.org/lkml/2005/4/20/142 indicates
that this problem has been around even before.

So much for the facts, here's my interpretation: mempool_alloc() first tries
to atomically allocate the requested memory, or falls back to hand out
preallocated chunks from the mempool.  If both fail, it puts the calling
process (kmirrord in this case) on a private waitqueue until somebody refills
the pool.  Where the only 'somebody' is kmirrord itself, so we have a
deadlock.

I worked around this problem by falling back to a (blocking) kmalloc when
before kmirrord would have ended up on the waitqueue.  This defeats part of
the benefits of using the mempool, but at least keeps the system running.  And
it could be done with a two-line change.  Note that mempool_alloc() clears the
GFP_NOIO flag internally, and only uses it to decide whether to wait or return
an error if immediate allocation fails, so the attached patch doesn't change
behaviour in the non-deadlocking case.  Path is against current git
(2.6.18-rc4), but should apply to earlier versions as well.  I've tested on
2.6.15, where this patch makes the difference between random lockup and a
stable system.
Signed-off-by: Daniel Kobras <kobras@linux.de>
Acked-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

970d548a