Commits · 653d48b22166db2d8b1515ebe6f9f0f7c95dfc86 · Kirill Smelkov / linux

17 Sep, 2010 2 commits

arm: fix really nasty sigreturn bug · 653d48b2

Al Viro authored Sep 17, 2010

If a signal hits us outside of a syscall and another gets delivered
when we are in sigreturn (e.g. because it had been in sa_mask for
the first one and got sent to us while we'd been in the first handler),
we have a chance of returning from the second handler to location one
insn prior to where we ought to return.  If r0 happens to contain -513
(-ERESTARTNOINTR), sigreturn will get confused into doing restart
syscall song and dance.

Incredible joy to debug, since it manifests as random, infrequent and
very hard to reproduce double execution of instructions in userland
code...

The fix is simple - mark it "don't bother with restarts" in wrapper,
i.e. set r8 to 0 in sys_sigreturn and sys_rt_sigreturn wrappers,
suppressing the syscall restart handling on return from these guys.
They can't legitimately return a restart-worthy error anyway.

Testcase:
	#include <unistd.h>
	#include <signal.h>
	#include <stdlib.h>
	#include <sys/time.h>
	#include <errno.h>

	void f(int n)
	{
		__asm__ __volatile__(
			"ldr r0, [%0]\n"
			"b 1f\n"
			"b 2f\n"
			"1:b .\n"
			"2:\n" : : "r"(&n));
	}

	void handler1(int sig) { }
	void handler2(int sig) { raise(1); }
	void handler3(int sig) { exit(0); }

	main()
	{
		struct sigaction s = {.sa_handler = handler2};
		struct itimerval t1 = { .it_value = {1} };
		struct itimerval t2 = { .it_value = {2} };

		signal(1, handler1);

		sigemptyset(&s.sa_mask);
		sigaddset(&s.sa_mask, 1);
		sigaction(SIGALRM, &s, NULL);

		signal(SIGVTALRM, handler3);

		setitimer(ITIMER_REAL, &t1, NULL);
		setitimer(ITIMER_VIRTUAL, &t2, NULL);

		f(-513); /* -ERESTARTNOINTR */

		write(1, "buggered\n", 9);
		return 1;
	}
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

653d48b2

Merge branch 'x86-fixes-for-linus' of... · a5b61736

Linus Torvalds authored Sep 16, 2010

Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: hpet: Work around hardware stupidity
  x86, build: Disable -fPIE when compiling with CONFIG_CC_STACKPROTECTOR=y
  x86, cpufeature: Suppress compiler warning with gcc 3.x
  x86, UV: Fix initialization of max_pnode

a5b61736

16 Sep, 2010 10 commits

Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6 · 03a7ab08

Linus Torvalds authored Sep 16, 2010

* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  cifs: fix potential double put of TCP session reference

03a7ab08

Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 · 7bb41904
Linus Torvalds authored Sep 16, 2010
```
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
  [IA64] Optimize ticket spinlocks in fsys_rt_sigprocmask
```
7bb41904

Merge branch '2.6.36-fixes' of git://github.com/schandinat/linux-2.6 · 1f0ce990

Linus Torvalds authored Sep 16, 2010

* '2.6.36-fixes' of git://github.com/schandinat/linux-2.6:
  drivers/video/via/ioctl.c: prevent reading uninitialized stack memory

1f0ce990

Merge branch 'urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia-2.6 · bd12e5c3

Linus Torvalds authored Sep 16, 2010

* 'urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia-2.6:
  pcmcia pcnet_cs: try setting io_lines to 16 if card setup fails
  pcmcia: per-device, not per-socket debug messages
  pcmcia serial_cs.c: fix multifunction card handling

bd12e5c3

Merge git://git.infradead.org/users/cbou/battery-2.6.36 · de109c98

Linus Torvalds authored Sep 16, 2010

* git://git.infradead.org/users/cbou/battery-2.6.36:
  apm_power: Add missing break statement
  intel_pmic_battery: Fix battery charging status on mrst

de109c98

Merge git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog · 7fd3fce3

Linus Torvalds authored Sep 16, 2010

* git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog:
  watchdog: Enable NXP LPC32XX support in Kconfig (resend)
  watchdog: ts72xx_wdt: disable watchdog at probe
  watchdog: sb_wdog: release irq and reboot notifier in error path and module_exit()

7fd3fce3

Merge branch 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile · 8be7eb35

Linus Torvalds authored Sep 16, 2010

* 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
  arch/tile: fix formatting bug in register dumps
  arch/tile: fix memcpy_fromio()/memcpy_toio() signatures
  arch/tile: Save and restore extra user state for tilegx
  arch/tile: Change struct sigcontext to be more useful
  arch/tile: finish const-ifying sys_execve()

8be7eb35

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lrg/voltage-2.6 · 3a919cf0

Linus Torvalds authored Sep 16, 2010

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lrg/voltage-2.6:
  regulator: wm8350-regulator - fix the logic of checking REGULATOR_MODE_STANDBY mode
  regulator: wm831x-ldo - fix the logic to set REGULATOR_MODE_IDLE and REGULATOR_MODE_STANDBY modes
  regulator: ab8500 - fix off-by-one value range checking for selector
  regulator: 88pm8607 - fix value range checking for accessing info->vol_table
  regulator: isl6271a-regulator - fix regulator_desc parameter for regulator_register()
  regulator: ad5398 - fix a memory leak
  regulator: Update e-mail address for Liam Girdwood
  regulator: set max8998->dev to &pdev->dev.
  regulator: tps6586x-regulator - fix bit_mask parameter for tps6586x_set_bits()
  regulator: tps6586x-regulator - fix value range checking for val
  regulator: max8998 - set max8998->num_regulators
  regulator: max8998 - fix memory allocation size for max8998->rdev
  regulator: tps6507x - remove incorrect comments
  regulator: max1586 - improve the logic of choosing selector
  regulator: ab8500 - fix the logic to remove already registered regulators in error path
  regulator: ab3100 - fix the logic to remove already registered regulators in error path
  regulator/ab8500: move dereference below the check for NULL

3a919cf0

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq · 94ca9d66
Linus Torvalds authored Sep 16, 2010
```
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: add documentation
```
94ca9d66

Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 · 2c35cd01

Linus Torvalds authored Sep 16, 2010

* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
  drm/radeon/kms: only warn on mipmap size checks in r600 cs checker (v2)
  drm/radeon/kms: force legacy pll algo for RV620 LVDS
  drm: fix race between driver loading and userspace open.
  drm: Use a nondestructive mode for output detect when polling (v2)
  drm/radeon/kms: fix the colorbuffer CS checker for r300-r500
  drm/radeon/kms: increase lockup detection interval to 10 sec for r100-r500
  drm/radeon/kms/evergreen: fix backend setup
  drm: Use a nondestructive mode for output detect when polling
  drm/radeon: add some missing copyright headers
  drm: Only decouple the old_fb from the crtc is we call mode_set*
  drm/radeon/kms: don't enable underscan with interlaced modes
  drm/radeon/kms: add connector table for Mac x800
  drm/radeon/kms: fix regression in RMX code (v2)
  drm: Fix regression in disable polling e58f637b

2c35cd01

15 Sep, 2010 20 commits

drivers/video/via/ioctl.c: prevent reading uninitialized stack memory · b4aaa78f

Dan Rosenberg authored Sep 15, 2010

The VIAFB_GET_INFO device ioctl allows unprivileged users to read 246
bytes of uninitialized stack memory, because the "reserved" member of
the viafb_ioctl_info struct declared on the stack is not altered or
zeroed before being copied back to the user.  This patch takes care of
it.
Signed-off-by: Dan Rosenberg <dan.j.rosenberg@gmail.com>
Signed-off-by: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>

b4aaa78f

[IA64] Optimize ticket spinlocks in fsys_rt_sigprocmask · 2d2b6901

Petr Tesarik authored Sep 15, 2010

Tony's fix (f574c843) has a small bug,
it incorrectly uses "r3" as a scratch register in the first of the two
unlock paths ... it is also inefficient.  Optimize the fast path again.
Signed-off-by: Petr Tesarik <ptesarik@suse.cz>
Signed-off-by: Tony Luck <tony.luck@intel.com>

2d2b6901

watchdog: Enable NXP LPC32XX support in Kconfig (resend) · 0a18e155

Kevin Wells authored Aug 17, 2010

The NXP LPC32XX processor use the same watchdog as the Philips
PNX4008 processor.
Signed-off-by: Kevin Wells <wellsk40@gmail.com>
Tested-by: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>

0a18e155

watchdog: ts72xx_wdt: disable watchdog at probe · 0e901bed

Mika Westerberg authored Aug 29, 2010

Since it may be already enabled by bootloader or some other utility. This patch
makes sure that the watchdog is disabled before any userspace daemon opens the
device. It is also required by the watchdog API.
Signed-off-by: Mika Westerberg <mika.westerberg@iki.fi>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>

0e901bed

watchdog: sb_wdog: release irq and reboot notifier in error path and module_exit() · ae44855a

Akinobu Mita authored Aug 21, 2010

irq and reboot notifier are acquired in module_init() but never released.
They should be released correctly, otherwise reloading the module or error
during module_init() will cause a problem.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Andrew Sharp <andy.sharp@lsi.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>

ae44855a

pcmcia pcnet_cs: try setting io_lines to 16 if card setup fails · b76dc054

Dominik Brodowski authored Sep 13, 2010

Some pcnet_cs compatible cards require an exact 16-lines match
of the ioport areas specified in CIS, but set the "iolines"
value in the CIS incorrectly. We can easily work around this
issue -- same as we do in serial_cs -- by first trying setting
iolines to the CIS-specified value, and then trying a 16-line
match.
Reported-and-tested-by: Wolfram Sang <w.sang@pengutronix.de>
Hardware-supplied-by: Jochen Frieling <j.frieling@pengutronix.de>
CC: netdev@vger.kernel.org
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>

b76dc054

pcmcia: per-device, not per-socket debug messages · eb838fe1

Dominik Brodowski authored Sep 13, 2010

As the iomem / ioport setup differs per device, it is much better
to print out the device instead of the socket.
Tested-by: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>

eb838fe1

pcmcia serial_cs.c: fix multifunction card handling · c494bc6c

Dominik Brodowski authored Aug 30, 2010

We shouldn't overwrite pre-set values, and we should also
set the port address to the beginning, and not the end of
the 8-port range.

CC: linux-serial@vger.kernel.org
Reported-by: Komuro <komurojun-mbn@nifty.com>
Hardware-supplied-by: Jochen Frieling <j.frieling@pengutronix.de>
Tested-by: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>

c494bc6c

arch/tile: fix formatting bug in register dumps · 7040dea4

Chris Metcalf authored Sep 15, 2010

This cut-and-paste bug was caused by rewriting the register dump
code to use only a single printk per line of output.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>

7040dea4

arch/tile: fix memcpy_fromio()/memcpy_toio() signatures · 0fab59e5

Chris Metcalf authored Sep 15, 2010

This tripped up a driver (not yet committed to git).  Fix it now.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>

0fab59e5

arch/tile: Save and restore extra user state for tilegx · a802fc68

Chris Metcalf authored Sep 15, 2010

During context switch, save and restore a couple of additional bits of
tilegx user state that can be persistently modified by userspace.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>

a802fc68

arch/tile: Change struct sigcontext to be more useful · 74fca9da

Chris Metcalf authored Sep 15, 2010

Rather than just using pt_regs, it now contains the actual saved
state explicitly, similar to pt_regs.  By doing it this way, we
provide a cleaner API for userspace (or equivalently, we avoid the
need for libc to provide its own definition of sigcontext).

While we're at it, move PT_FLAGS_xxx to where they are not visible
from userspace.  And always pass siginfo and mcontext to signal
handlers, even if they claim they don't need it, since sometimes
they actually try to use it anyway in practice.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>

74fca9da

arch/tile: finish const-ifying sys_execve() · e6e6c46d

Chris Metcalf authored Sep 15, 2010

The sys_execve() implementation was properly const-ified but not
the declaration, the syscall wrappers, or the compat version.
This change completes the constification process.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>

e6e6c46d

drm/radeon/kms: only warn on mipmap size checks in r600 cs checker (v2) · fe725d4f

Alex Deucher authored Sep 14, 2010

The texture base address registers are in units of 256 bytes.
The original CS checker treated these offsets as bytes, so the
original check was wrong.  I fixed the units in a patch during
the 2.6.36 cycle, but this ended up breaking some existing
userspace (probably due to a bug in either userspace texture allocation
or the drm texture mipmap checker).  So for now, until we come
up with a better fix, just warn if the mipmap size it too large.
This will keep existing userspace working and it should be just
as safe as before when we were checking the wrong units.  These
are GPU MC addresses, so if they fall outside of the VRAM or
GART apertures, they end up at the GPU default page, so this should
be safe from a security perspective.

v2: Just disable the warning.  It just spams the log and there's
nothing the user can do about it.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Cc: Jerome Glisse <glisse@freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>

fe725d4f

Merge ssh://master.kernel.org/home/hpa/tree/sec · 9c03f162

Linus Torvalds authored Sep 14, 2010

* ssh://master.kernel.org/home/hpa/tree/sec:
  x86-64, compat: Retruncate rax after ia32 syscall entry tracing
  x86-64, compat: Test %rax for the syscall number, not %eax
  compat: Make compat_alloc_user_space() incorporate the access_ok()

9c03f162

MN10300: Fix up the IRQ names for the on-chip serial ports · a4128b03

David Howells authored Sep 14, 2010

Fix up the IRQ names for the MN10300 on-chip serial ports in the driver as
request_interrupt() no longer allows names containing slashes, giving a warning
like the following if one is encountered:

	------------[ cut here ]------------
	WARNING: at fs/proc/generic.c:323 __xlate_proc_name+0x62/0x7c()
	name 'ttySM0/Rx'
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

a4128b03

Merge git://git.infradead.org/mtd-2.6 · 65e0b598

Linus Torvalds authored Sep 14, 2010

* git://git.infradead.org/mtd-2.6:
  mtd: pxa3xx: fix build error when CONFIG_MTD_PARTITIONS is not defined
  mtd: mxc_nand: configure pages per block for v2 controller
  mtd: OneNAND: Fix loop hang when DMA error at Samsung SoCs
  mtd: OneNAND: Fix 2KiB pagesize handling at Samsung SoCs
  mtd: Blackfin NFC: fix invalid free in remove()
  mtd: Blackfin NFC: fix build error after nand_scan_ident() change
  mxc_nand: Do not do byte accesses to the NFC buffer.

65e0b598

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid · d7a4b63b

Linus Torvalds authored Sep 14, 2010

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
  HID: fix hiddev's use of usb_find_interface
  HID: fixup blacklist entry for Asus T91MT
  HID: add device ID for new Asus Multitouch Controller
  HID: add no-get quirk for eGalax touch controller
  HID: Add quirk for eGalax touch controler.
  HID: add support for another BTC Emprex remote control
  HID: Set Report ID properly for Output reports on the Control endpoint.
  HID: Kanvus Note A5 tablet needs HID_QUIRK_MULTI_INPUT
  HID: Add support for chicony multitouch screens.

d7a4b63b

Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6 · de8d4f5d

Linus Torvalds authored Sep 14, 2010

* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
  SUNRPC: Fix the NFSv4 and RPCSEC_GSS Kconfig dependencies
  statfs() gives ESTALE error
  NFS: Fix a typo in nfs_sockaddr_match_ipaddr6
  sunrpc: increase MAX_HASHTABLE_BITS to 14
  gss:spkm3 miss returning error to caller when import security context
  gss:krb5 miss returning error to caller when import security context
  Remove incorrect do_vfs_lock message
  SUNRPC: cleanup state-machine ordering
  SUNRPC: Fix a race in rpc_info_open
  SUNRPC: Fix race corrupting rpc upcall
  Fix null dereference in call_allocate

de8d4f5d

aio: check for multiplication overflow in do_io_submit · 75e1c70f

Jeff Moyer authored Sep 10, 2010

Tavis Ormandy pointed out that do_io_submit does not do proper bounds
checking on the passed-in iocb array:

       if (unlikely(nr < 0))
               return -EINVAL;

       if (unlikely(!access_ok(VERIFY_READ, iocbpp, (nr*sizeof(iocbpp)))))
               return -EFAULT;                      ^^^^^^^^^^^^^^^^^^

The attached patch checks for overflow, and if it is detected, the
number of iocbs submitted is scaled down to a number that will fit in
the long.  This is an ok thing to do, as sys_io_submit is documented as
returning the number of iocbs submitted, so callers should handle a
return value of less than the 'nr' argument passed in.
Reported-by: Tavis Ormandy <taviso@cmpxchg8b.com>
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

75e1c70f

14 Sep, 2010 8 commits

cifs: fix potential double put of TCP session reference · 460cf341

Jeff Layton authored Sep 14, 2010

cifs_get_smb_ses must be called on a server pointer on which it holds an
active reference. It first does a search for an existing SMB session. If
it finds one, it'll put the server reference and then try to ensure that
the negprot is done, etc.

If it encounters an error at that point then it'll return an error.
There's a potential problem here though. When cifs_get_smb_ses returns
an error, the caller will also put the TCP server reference leading to a
double-put.

Fix this by having cifs_get_smb_ses only put the server reference if
it found an existing session that it could use and isn't returning an
error.

Cc: stable@kernel.org
Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>

460cf341

x86-64, compat: Retruncate rax after ia32 syscall entry tracing · eefdca04

Roland McGrath authored Sep 14, 2010

In commit d4d67150, we reopened an old hole for a 64-bit ptracer touching a
32-bit tracee in system call entry.  A %rax value set via ptrace at the
entry tracing stop gets used whole as a 32-bit syscall number, while we
only check the low 32 bits for validity.

Fix it by truncating %rax back to 32 bits after syscall_trace_enter,
in addition to testing the full 64 bits as has already been added.
Reported-by: Ben Hawkes <hawkes@sota.gen.nz>
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

eefdca04

x86-64, compat: Test %rax for the syscall number, not %eax · 36d001c7

H. Peter Anvin authored Sep 14, 2010

On 64 bits, we always, by necessity, jump through the system call
table via %rax.  For 32-bit system calls, in theory the system call
number is stored in %eax, and the code was testing %eax for a valid
system call number.  At one point we loaded the stored value back from
the stack to enforce zero-extension, but that was removed in checkin
d4d67150.  An actual 32-bit process
will not be able to introduce a non-zero-extended number, but it can
happen via ptrace.

Instead of re-introducing the zero-extension, test what we are
actually going to use, i.e. %rax.  This only adds a handful of REX
prefixes to the code.
Reported-by: Ben Hawkes <hawkes@sota.gen.nz>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: <stable@kernel.org>
Cc: Roland McGrath <roland@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>

36d001c7

compat: Make compat_alloc_user_space() incorporate the access_ok() · c41d68a5

H. Peter Anvin authored Sep 07, 2010

compat_alloc_user_space() expects the caller to independently call
access_ok() to verify the returned area.  A missing call could
introduce problems on some architectures.

This patch incorporates the access_ok() check into
compat_alloc_user_space() and also adds a sanity check on the length.
The existing compat_alloc_user_space() implementations are renamed
arch_compat_alloc_user_space() and are used as part of the
implementation of the new global function.

This patch assumes NULL will cause __get_user()/__put_user() to either
fail or access userspace on all architectures.  This should be
followed by checking the return value of compat_access_user_space()
for NULL in the callers, at which time the access_ok() in the callers
can also be removed.
Reported-by: Ben Hawkes <hawkes@sota.gen.nz>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Tony Luck <tony.luck@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: James Bottomley <jejb@parisc-linux.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: <stable@kernel.org>

c41d68a5

x86: hpet: Work around hardware stupidity · 54ff7e59

Thomas Gleixner authored Sep 14, 2010

This more or less reverts commits 08be9796 (x86: Force HPET
readback_cmp for all ATI chipsets) and 30a564be (x86, hpet: Restrict
read back to affected ATI chipsets) to the status of commit 8da854cb
(x86, hpet: Erratum workaround for read after write of HPET
comparator).

The delta to commit 8da854cb is mostly comments and the change from
WARN_ONCE to printk_once as we know the call path of this function
already.

This needs really in depth explanation:

First of all the HPET design is a complete failure. Having a counter
compare register which generates an interrupt on matching values
forces the software to do at least one superfluous readback of the
counter register.

While it is nice in theory to program "absolute" time events it is
practically useless because the timer runs at some absurd frequency
which can never be matched to real world units. So we are forced to
calculate a relative delta and this forces a readout of the actual
counter value, adding the delta and programming the compare
register. When the delta is small enough we run into the danger that
we program a compare value which is already in the past. Due to the
compare for equal nature of HPET we need to read back the counter
value after writing the compare rehgister (btw. this is necessary for
absolute timeouts as well) to make sure that we did not miss the timer
event. We try to work around that by setting the minimum delta to a
value which is larger than the theoretical time which elapses between
the counter readout and the compare register write, but that's only
true in theory. A NMI or SMI which hits between the readout and the
write can easily push us beyond that limit. This would result in
waiting for the next HPET timer interrupt until the 32bit wraparound
of the counter happens which takes about 306 seconds.

So we designed the next event function to look like:

   match = read_cnt() + delta;
   write_compare_ref(match);
   return read_cnt() < match ? 0 : -ETIME;

At some point we got into trouble with certain ATI chipsets. Even the
above "safe" procedure failed. The reason was that the write to the
compare register was delayed probably for performance reasons. The
theory was that they wanted to avoid the synchronization of the write
with the HPET clock, which is understandable. So the write does not
hit the compare register directly instead it goes to some intermediate
register which is copied to the real compare register in sync with the
HPET clock. That opens another window for hitting the dreaded "wait
for a wraparound" problem.

To work around that "optimization" we added a read back of the compare
register which either enforced the update of the just written value or
just delayed the readout of the counter enough to avoid the issue. We
unfortunately never got any affirmative info from ATI/AMD about this.

One thing is sure, that we nuked the performance "optimization" that
way completely and I'm pretty sure that the result is worse than
before some HW folks came up with those.

Just for paranoia reasons I added a check whether the read back
compare register value was the same as the value we wrote right
before. That paranoia check triggered a couple of years after it was
added on an Intel ICH9 chipset. Venki added a workaround (commit
8da854cb) which was reading the compare register twice when the first
check failed. We considered this to be a penalty in general and
restricted the readback (thus the wasted CPU cycles) to the known to
be affected ATI chipsets.

This turned out to be a utterly wrong decision. 2.6.35 testers
experienced massive problems and finally one of them bisected it down
to commit 30a564be which spured some further investigation.

Finally we got confirmation that the write to the compare register can
be delayed by up to two HPET clock cycles which explains the problems
nicely. All we can do about this is to go back to Venki's initial
workaround in a slightly modified version.

Just for the record I need to say, that all of this could have been
avoided if hardware designers and of course the HPET committee would
have thought about the consequences for a split second. It's out of my
comprehension why designing a working timer is so hard. There are two
ways to achieve it:

 1) Use a counter wrap around aware compare_reg <= counter_reg
    implementation instead of the easy compare_reg == counter_reg

    Downsides:

	- It needs more silicon.

	- It needs a readout of the counter to apply a relative
	  timeout. This is necessary as the counter does not run in
	  any useful (and adjustable) frequency and there is no
	  guarantee that the counter which is used for timer events is
	  the same which is used for reading the actual time (and
	  therefor for calculating the delta)

    Upsides:

	- None

  2) Use a simple down counter for relative timer events

    Downsides:

	- Absolute timeouts are not possible, which is not a problem
	  at all in the context of an OS and the expected
	  max. latencies/jitter (also see Downsides of #1)

   Upsides:

	- It needs less or equal silicon.

	- It works ALWAYS

	- It is way faster than a compare register based solution (One
	  write versus one write plus at least one and up to four
	  reads)

I would not be so grumpy about all of this, if I would not have been
ignored for many years when pointing out these flaws to various
hardware folks. I really hate timers (at least those which seem to be
designed by janitors).

Though finally we got a reasonable explanation plus a solution and I
want to thank all the folks involved in chasing it down and providing
valuable input to this.
Bisected-by: Nix <nix@esperi.org.uk>
Reported-by: Artur Skawina <art.08.09@gmail.com>
Reported-by: Damien Wyart <damien.wyart@free.fr>
Reported-by: John Drescher <drescherjm@gmail.com>
Cc: Venkatesh Pallipadi <venki@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: stable@kernel.org
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

54ff7e59

drm/radeon/kms: force legacy pll algo for RV620 LVDS · f90087ee

Alex Deucher authored Sep 07, 2010

There has been periodic evidence that LVDS, on at least some
panels, prefers the dividers selected by the legacy pll algo.
This patch forces the use of the legacy pll algo on RV620
LVDS panels.  The old behavior (new pll algo) can be selected
by setting the new_pll module parameter to 1.

Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=30029Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>

f90087ee

drm: fix race between driver loading and userspace open. · b64c115e

Dave Airlie authored Sep 14, 2010

Not 100% sure this is due to BKL removal, its most likely a combination
of that + userspace timing changes in udev/plymouth. The drm adds the sysfs
device before the driver has completed internal loading, this causes udev
to make the node and plymouth to open it before we've completed loading.

The proper solution is to delay the sysfs manipulation until later in loading
however this causes knock on issues with sysfs connector nodes, so we can use
the global mutex to serialise loading and userspace opens.

Reported-by: Toni Spets (hifi on #radeon)
Signed-off-by: Dave Airlie <airlied@redhat.com>

b64c115e

drm: Use a nondestructive mode for output detect when polling (v2) · 930a9e28

Chris Wilson authored Sep 14, 2010

v2: Julien Cristau pointed out that @nondestructive results in
double-negatives and confusion when trying to interpret the parameter,
so use @force instead. Much easier to type as well. ;-)

And fix the miscompilation of vmgfx reported by Sedat Dilek.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>

930a9e28