1. 11 Oct, 2003 2 commits
    • [PATCH] Bug in timer_tsc cpufreq callback · 348e9f70
      Venkatesh Pallipadi authored
      There is a bug in the cpufreq callback function in the timer_tsc
      routines that can result in a system deadlock.  The issue: the code
      grabs the write lock on xtime_lock without disabling interrupts, so
      if a timer interrupt arrives while we are in this code, the system
      deadlocks.
      
      This bug only affects kernels that have CONFIG_CPU_FREQ enabled.
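
      A minimal sketch of the deadlock pattern and its repair, assuming
      the 2.6 seqlock-based xtime_lock (the callback context here is
      illustrative, not the literal timer_tsc code):

              /* buggy: a timer interrupt on this CPU also takes
               * xtime_lock for writing and spins on it forever */
              write_seqlock(&xtime_lock);
              /* ... recalibrate time/TSC state ... */
              write_sequnlock(&xtime_lock);

              /* fixed: keep local interrupts off across the section */
              write_seqlock_irq(&xtime_lock);
              /* ... recalibrate time/TSC state ... */
              write_sequnlock_irq(&xtime_lock);
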
    • [PATCH] SMP races in the timer code · 158fb15f
      Ingo Molnar authored
      This fixes two del_timer_sync() races that are still in the timer code. 
      
      The first race was actually triggered in a 2.4 backport of the 2.6 timer
      code.  The second race was never triggered - it is mostly theoretical on
      a standalone kernel.  (It's more likely in any virtualized or otherwise
      preemptable environment.)
      
      Both races happen when self-rearming timers are used.  One mainstream
      example is kernel/itimer.c.  The effect of the races is that
      del_timer_sync() leaves a timer running instead of synchronizing with
      it, causing logic bugs (and crashes) in the affected kernel code.  One
      typical incarnation of the race is a double add_timer().
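
      For reference, a minimal self-rearming timer in the style of
      kernel/itimer.c (the function and names here are illustrative, not
      taken from the patch):

              static void my_timer_fn(unsigned long data)
              {
                      struct timer_list *t = (struct timer_list *)data;

                      /* ... periodic work ... */
                      t->expires = jiffies + HZ;
                      add_timer(t);   /* re-arm: this add_timer() is what
                                       * races with del_timer_sync() on
                                       * another CPU */
              }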
      
      race #1:
      
      this code in __run_timers() is running on CPU0:
      
                        list_del(&timer->entry);
                        timer->base = NULL;
                        [*]
                        set_running_timer(base, timer);
                        spin_unlock_irq(&base->lock);
                        [**]
                        fn(data);
                        spin_lock_irq(&base->lock);
      
      CPU0 gets stuck at the [*] code-point briefly - after the timer->base has
      been set to NULL, but before the base->running_timer pointer has been set
      up. This is a fundamentally volatile scenario, as there's _zero_ knowledge
      in the data structures that this timer is about to be executed!
      
      Now CPU1 comes along and calls del_timer_sync().  It will find nothing -
      neither timer->base nor base->running_timer will cause it to synchronize.
      It will return and report that the timer has been deleted - shortly
      afterwards CPU0 unstalls and executes the timer fn, which will cause
      crashes.
      
      This particular race is easy to fix by reordering the timer->base
      clearing with set_running_timer() and putting a wmb() between them;
      a sketch of that reordering follows.
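
      A minimal sketch of the reordered __run_timers() fragment (the idea
      of the fix, not necessarily the literal patch):

              list_del(&timer->entry);
              set_running_timer(base, timer);  /* publish "running" first */
              wmb();                           /* order the store before ... */
              timer->base = NULL;              /* ... detaching the timer */
              spin_unlock_irq(&base->lock);
              fn(data);
              spin_lock_irq(&base->lock);

      With this ordering there is no window in which the timer is neither
      pending nor visibly running.  But there are more races: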
      
      race #2
      
      The del_timer_sync() check for a 'pending or running timer' is
      fundamentally fragile.  E.g., if CPU0 gets stuck at the [***] point
      below:
      
                base = &per_cpu(tvec_bases, i);
                if (base->running_timer == timer) {
                        while (base->running_timer == timer) {
                                cpu_relax();
                                preempt_check_resched();
                        }
                        [***]
                        break;
                }
        }
        smp_rmb();
        if (timer_pending(timer))
                goto del_again;
      
      
      then del_timer_sync() has already decided that this timer is not running
      (we just finished loop-waiting for it), but we have not done the
      timer_pending() check yet.
      
      If the timer has re-armed itself, and if the timer expires on CPU1
      (this needs a long delay on CPU0, but that's not hard to achieve,
      e.g. in UML or with kernel preemption enabled), then CPU1 could start
      to expire the timer and reach the [**] point in __run_timers() (see
      above).  Then CPU1 gets stalled and CPU0 is unstalled; the
      timer_pending() check in del_timer_sync() will not notice the running
      timer, and del_timer_sync() returns - while CPU1 is just about to run
      the timer!
      
      Fixing this second race is hard - it involves a heavy race-check
      operation that has to lock all bases and re-check the
      base->running_timer value and the timer_pending() condition
      atomically; a sketch follows.
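
      A sketch of such an atomic re-check, assuming per-CPU bases as in
      the fragment quoted above (illustrative, not the literal patch):

              unsigned long flags;
              int i, busy = 0;

              local_irq_save(flags);
              /* with every base lock held, neither ->running_timer nor
               * the pending state can change under us */
              for (i = 0; i < NR_CPUS; i++)
                      spin_lock(&per_cpu(tvec_bases, i).lock);

              for (i = 0; i < NR_CPUS; i++)
                      if (per_cpu(tvec_bases, i).running_timer == timer)
                              busy = 1;        /* still executing somewhere */
              if (timer_pending(timer))
                      busy = 1;                /* re-armed in the meantime */

              for (i = 0; i < NR_CPUS; i++)
                      spin_unlock(&per_cpu(tvec_bases, i).lock);
              local_irq_restore(flags);

              if (busy)
                      goto del_again;          /* redo the full wait/check */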
      
      This fix also cures the first race, by forcing del_timer_sync() to
      always observe the timer state atomically, so the [*] code point will
      always synchronize with del_timer_sync().
      
      The patch is ugly but safe, and it has fixed the crashes in the 2.4
      backport.  I tested the patch on 2.6.0-test7 with some heavy itimer
      use and it works fine.  Removing self-arming timers safely is the
      sole purpose of del_timer_sync(), so I think there is no way around
      this overhead.  I believe we should ultimately fix all major
      del_timer_sync() users to not use self-arming timers - having
      del_timer_sync() in the thread-exit path is now a considerable source
      of SMP overhead.  But that is out of the scope of the current 2.6
      fixes, of course, and we have to support self-arming timers as well.