Commits · 6a29b109b9fb4c76bf55851223ae59e3322d9616 · Kirill Smelkov / linux

07 Oct, 2012 8 commits

ntp: Fix leap-second hrtimer livelock · 6a29b109

John Stultz authored Jul 17, 2012

This is a backport of 6b43ae8a

This should have been backported when it was commited, but I
mistook the problem as requiring the ntp_lock changes
that landed in 3.4 in order for it to occur.

Unfortunately the same issue can happen (with only one cpu)
as follows:
do_adjtimex()
 write_seqlock_irq(&xtime_lock);
  process_adjtimex_modes()
   process_adj_status()
    ntp_start_leap_timer()
     hrtimer_start()
      hrtimer_reprogram()
       tick_program_event()
        clockevents_program_event()
         ktime_get()
          seq = req_seqbegin(xtime_lock); [DEADLOCK]

This deadlock will no always occur, as it requires the
leap_timer to force a hrtimer_reprogram which only happens
if its set and there's no sooner timer to expire.

NOTE: This patch, being faithful to the original commit,
introduces a bug (we don't update wall_to_monotonic),
which will be resovled by backporting a following fix.

Original commit message below:

Since commit 7dffa3c6 the ntp
subsystem has used an hrtimer for triggering the leapsecond
adjustment. However, this can cause a potential livelock.

Thomas diagnosed this as the following pattern:
CPU 0                                                    CPU 1
do_adjtimex()
  spin_lock_irq(&ntp_lock);
    process_adjtimex_modes();				 timer_interrupt()
      process_adj_status();                                do_timer()
        ntp_start_leap_timer();                             write_lock(&xtime_lock);
          hrtimer_start();                                  update_wall_time();
             hrtimer_reprogram();                            ntp_tick_length()
               tick_program_event()                            spin_lock(&ntp_lock);
                 clockevents_program_event()
		   ktime_get()
                     seq = req_seqbegin(xtime_lock);

This patch tries to avoid the problem by reverting back to not using
an hrtimer to inject leapseconds, and instead we handle the leapsecond
processing in the second_overflow() function.

The downside to this change is that on systems that support highres
timers, the leap second processing will occur on a HZ tick boundary,
(ie: ~1-10ms, depending on HZ)  after the leap second instead of
possibly sooner (~34us in my tests w/ x86_64 lapic).

This patch applies on top of tip/timers/core.

CC: Sasha Levin <levinsasha928@gmail.com>
CC: Thomas Gleixner <tglx@linutronix.de>
Reported-by: Sasha Levin <levinsasha928@gmail.com>
Diagnoised-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Sasha Levin <levinsasha928@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Linux Kernel <linux-kernel@vger.kernel.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>

6a29b109

futex: Fix uninterruptible loop due to gate_area · a5bd5781

Hugh Dickins authored Dec 31, 2011

commit e6780f72 upstream.

It was found (by Sasha) that if you use a futex located in the gate
area we get stuck in an uninterruptible infinite loop, much like the
ZERO_PAGE issue.

While looking at this problem, PeterZ realized you'll get into similar
trouble when hitting any install_special_pages() mapping.  And are there
still drivers setting up their own special mmaps without page->mapping,
and without special VM or pte flags to make get_user_pages fail?

In most cases, if page->mapping is NULL, we do not need to retry at all:
Linus points out that even /proc/sys/vm/drop_caches poses no problem,
because it ends up using remove_mapping(), which takes care not to
interfere when the page reference count is raised.

But there is still one case which does need a retry: if memory pressure
called shmem_writepage in between get_user_pages_fast dropping page
table lock and our acquiring page lock, then the page gets switched from
filecache to swapcache (and ->mapping set to NULL) whatever the refcount.
Fault it back in to get the page->mapping needed for key->shared.inode.
Reported-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[PG: 2.6.34 variable is page, not page_head, since it doesn't have a5b338f2]
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>

a5bd5781

fix pgd_lock deadlock · f4bced84

Philipp Hahn authored Apr 23, 2012

commit a79e53d8 upstream.

On Wednesday 16 February 2011 15:49:47 Andrea Arcangeli wrote:
> Subject: fix pgd_lock deadlock
>
> From: Andrea Arcangeli <aarcange@redhat.com>
>
> It's forbidden to take the page_table_lock with the irq disabled or if
> there's contention the IPIs (for tlb flushes) sent with the page_table_lock
> held will never run leading to a deadlock.
>
> Apparently nobody takes the pgd_lock from irq so the _irqsave can be
> removed.
>
> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>

This patch (original commit Id for 2.6.38 a79e53d8)
needs to be back-ported to 2.6.32.x as well.
I observed a dead-lock problem when running a PAE enabled Debian 2.6.32.46+
kernel with 6 VCPUs as a KVM on (2.6.32, 3.2, 3.3) kernel, which showed the
following behaviour:

1 VCPU is stuck in
  pgd_alloc() =E2=86=92 pgd_prepopulate_pmb() =E2=86=92... =E2=86=92  flush_tlb_others_ipi()
while (!cpumask_empty(to_cpumask(f->flush_cpumask)))
    cpu_relax();
(gdb) print f->flush_cpumask
$5 = {1}

while all other VCPUs are stuck in
  pgd_alloc() =E2=86=92 spin_lock_irqsave(pgd_lock)

I tracked it down to the commit
 2.6.39-rc1: 4981d01e
 2.6.32.34: ba456fd7
 x86: Flush TLB if PGD entry is changed in i386 PAE mode
which when reverted made the bug disappear.

Comparing 3.2 to 2.6.32.34 showed that the 'pgd-deadlock'-patch went into
2.6.38, that is before the 'PAE correctness'-patch, so the problem was
probably never observed in the main development branch.
But for 2.6.32 the 'pgd-deadlock' patch is still missing, so the 'PAE
corretness'-patch made the problem worse with 2.6.32.

The Patch was also back-ported to the OpenSUSE Kernel
<http://kernel.opensuse.org/cgit/kernel-source/commit/?id=ac27c01aa880c65d17043ab87249c613ac4c3635>,
Since the patch didn't apply cleanly on the current Debian kernel, I had to
backport it for us and Debian. The patch is also available from our (German)
Bugzilla <https://forge.univention.org/bugzilla/show_bug.cgi?id=26661> or
from the Debian BTS at <http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=669335>.

I have no easy test case, but running multiple parallel builds inside the VM
normally triggers the bug within seconds to minutes. With the patch applied
the VM survived a night building packages without any problem.
Signed-off-by: Philipp Hahn <hahn@univention.de>

Sincerely
Philipp
-
Philipp Hahn           Open Source Software Engineer      hahn@univention.de
Univention GmbH        be open.                       fon: +49 421 22 232- 0
Mary-Somerville-Str.1  D-28359 Bremen                 fax: +49 421 22 232-99
                                                   http://www.univention.de/

It's forbidden to take the page_table_lock with the irq disabled
or if there's contention the IPIs (for tlb flushes) sent with
the page_table_lock held will never run leading to a deadlock.

Nobody takes the pgd_lock from irq context so the _irqsave can be
removed.
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Rik van Riel <riel@redhat.com>
Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: <stable@kernel.org>
LKML-Reference: <201102162345.p1GNjMjm021738@imap1.linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Git-commit: a79e53d8Signed-off-by: Willy Tarreau <w@1wt.eu>

f4bced84

jbd2: clear BH_Delay & BH_Unwritten in journal_unmap_buffer · 7cfe055f

Eric Sandeen authored Feb 20, 2012

commit 15291164 upstream.

journal_unmap_buffer()'s zap_buffer: code clears a lot of buffer head
state ala discard_buffer(), but does not touch _Delay or _Unwritten as
discard_buffer() does.

This can be problematic in some areas of the ext4 code which assume
that if they have found a buffer marked unwritten or delay, then it's
a live one.  Perhaps those spots should check whether it is mapped
as well, but if jbd2 is going to tear down a buffer, let's really
tear it down completely.

Without this I get some fsx failures on sub-page-block filesystems
up until v3.2, at which point 4e96b2db
and 189e868f make the failures go
away, because buried within that large change is some more flag
clearing.  I still think it's worth doing in jbd2, since
->invalidatepage leads here directly, and it's the right place
to clear away these flags.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org

BugLink: http://bugs.launchpad.net/bugs/929781
CVE-2011-4086
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>

7cfe055f

Bluetooth: btusb: fix bInterval for high/super speed isochronous endpoints · b87881d8

Bing Zhao authored Mar 28, 2012

commit fa0fb93f upstream

For high-speed/super-speed isochronous endpoints, the bInterval
value is used as exponent, 2^(bInterval-1). Luckily we have
usb_fill_int_urb() function that handles it correctly. So we just
call this function to fill in the RX URB.

Cc: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Bing Zhao <bzhao@marvell.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Willy Tarreau <w@1wt.eu>

b87881d8

powerpc/pmac: Fix SMP kernels on pre-core99 UP machines · bff4680d

Benjamin Herrenschmidt authored Mar 21, 2012

commit 78c5c68a upstream.

The code for "powersurge" SMP would kick in and cause a crash
at boot due to the lack of a NULL test.

Adam Conrad reports that the 3.2 kernel, with CONFIG_SMP=y, will not
boot on an OldWorld G3; we're unconditionally writing to psurge_start,
but this is only set on powersurge machines.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Jeremy Kerr <jeremy.kerr@canonical.com>
Reported-by: Adam Conrad <adconrad@ubuntu.com>
Tested-by: Adam Conrad <adconrad@ubuntu.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>

bff4680d

Fix sparc build with newer tools. · 99612d20

David Miller authored Mar 15, 2012

commit e0adb990 upstream.

Newer version of binutils are more strict about specifying the
correct options to enable certain classes of instructions.

The sparc32 build is done for v7 in order to support sun4c systems
which lack hardware integer multiply and divide instructions.

So we have to pass -Av8 when building the assembler routines that
use these instructions and get patched into the kernel when we find
out that we have a v8 capable cpu.
Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>

99612d20

netxen: support for GbE port settings · ef1ff0d0

Sony Chacko authored Mar 15, 2011

commit bfd823bd upstream.

o Enable setting speed and auto negotiation parameters for GbE ports.
o Hardware do not support half duplex setting currently.

David Miller:
	Amit please update your patch to silently reject link setting
	attempts that are unsupported by the device.

[jn: backported for 2.6.32.y by Ana Guerrero]
Signed-off-by: Sony Chacko <sony.chacko@qlogic.com>
Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tested-by: Ana Guerrero <ana@debian.org> # HP NC375i
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Acked-by: Sony Chacko <sony.chacko@qlogic.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>

ef1ff0d0

17 Mar, 2012 13 commits

Linux 2.6.32.59 · 6d224606
Willy Tarreau authored Mar 17, 2012
```
Signed-off-by: Willy Tarreau <w@1wt.eu>
```
6d224606

blkfront: Fix backtrace in del_gendisk · bee69f9f

David Vrabel authored Mar 14, 2012

Commit 89de1669 upstream.

The call to del_gendisk follows an non-refcounted gd->queue
pointer. We release the last ref in blk_cleanup_queue. Fixed by
reordering releases accordingly.
Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>

bee69f9f

watchdog: hpwdt: clean up set_memory_x call for 32 bit · 53148aeb

Maxim Uvarov authored Jan 15, 2012

commit 97d2a10d upstream.

1. address has to be page aligned.
2. set_memory_x uses page size argument, not size.
Bug causes with following commit:
	commit da28179b4e90dda56912ee825c7eaa62fc103797
	Author: Mingarelli, Thomas <Thomas.Mingarelli@hp.com>
	Date:   Mon Nov 7 10:59:00 2011 +0100

     watchdog: hpwdt: Changes to handle NX secure bit in 32bit path

    commit e67d668e upstream.

    This patch makes use of the set_memory_x() kernel API in order
    to make necessary BIOS calls to source NMIs.
Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>

53148aeb

regset: Return -EFAULT, not -EIO, on host-side memory fault · 09735ec3

H. Peter Anvin authored Mar 02, 2012

commit 5189fa19 upstream.

There is only one error code to return for a bad user-space buffer
pointer passed to a system call in the same address space as the
system call is executed, and that is EFAULT.  Furthermore, the
low-level access routines, which catch most of the faults, return
EFAULT already.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Roland McGrath <roland@hack.frob.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>

09735ec3

regset: Prevent null pointer reference on readonly regsets · fe8b46f4

H. Peter Anvin authored Mar 02, 2012

commit c8e25258 upstream.

The regset common infrastructure assumed that regsets would always
have .get and .set methods, but not necessarily .active methods.
Unfortunately people have since written regsets without .set methods.

Rather than putting in stub functions everywhere, handle regsets with
null .get or .set methods explicitly.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Roland McGrath <roland@hack.frob.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>

fe8b46f4

net/usbnet: avoid recursive locking in usbnet_stop() · 559d6c82

Sebastian Siewior authored Mar 07, 2012

commit 4231d47e upstream.

|kernel BUG at kernel/rtmutex.c:724!
|[<c029599c>] (rt_spin_lock_slowlock+0x108/0x2bc) from [<c01c2330>] (defer_bh+0x1c/0xb4)
|[<c01c2330>] (defer_bh+0x1c/0xb4) from [<c01c3afc>] (rx_complete+0x14c/0x194)
|[<c01c3afc>] (rx_complete+0x14c/0x194) from [<c01cac88>] (usb_hcd_giveback_urb+0xa0/0xf0)
|[<c01cac88>] (usb_hcd_giveback_urb+0xa0/0xf0) from [<c01e1ff4>] (musb_giveback+0x34/0x40)
|[<c01e1ff4>] (musb_giveback+0x34/0x40) from [<c01e2b1c>] (musb_advance_schedule+0xb4/0x1c0)
|[<c01e2b1c>] (musb_advance_schedule+0xb4/0x1c0) from [<c01e2ca8>] (musb_cleanup_urb.isra.9+0x80/0x8c)
|[<c01e2ca8>] (musb_cleanup_urb.isra.9+0x80/0x8c) from [<c01e2ed0>] (musb_urb_dequeue+0xec/0x108)
|[<c01e2ed0>] (musb_urb_dequeue+0xec/0x108) from [<c01cbb90>] (unlink1+0xbc/0xcc)
|[<c01cbb90>] (unlink1+0xbc/0xcc) from [<c01cc2ec>] (usb_hcd_unlink_urb+0x54/0xa8)
|[<c01cc2ec>] (usb_hcd_unlink_urb+0x54/0xa8) from [<c01c2a84>] (unlink_urbs.isra.17+0x2c/0x58)
|[<c01c2a84>] (unlink_urbs.isra.17+0x2c/0x58) from [<c01c2b44>] (usbnet_terminate_urbs+0x94/0x10c)
|[<c01c2b44>] (usbnet_terminate_urbs+0x94/0x10c) from [<c01c2d68>] (usbnet_stop+0x100/0x15c)
|[<c01c2d68>] (usbnet_stop+0x100/0x15c) from [<c020f718>] (__dev_close_many+0x94/0xc8)

defer_bh() takes the lock which is hold during unlink_urbs(). The safe
walk suggest that the skb will be removed from the list and this is done
by defer_bh() so it seems to be okay to drop the lock here.

Cc: stable@kernel.org
Reported-by: Aníbal Almeida Pinto <anibal.pinto@efacec.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Oliver Neukum <oliver@neukum.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>

559d6c82

cifs: fix dentry refcount leak when opening a FIFO on lookup · 3af36e8a

Jeff Layton authored Feb 23, 2012

commit 5bccda0e upstream.

The cifs code will attempt to open files on lookup under certain
circumstances. What happens though if we find that the file we opened
was actually a FIFO or other special file?

Currently, the open filehandle just ends up being leaked leading to
a dentry refcount mismatch and oops on umount. Fix this by having the
code close the filehandle on the server if it turns out not to be a
regular file. While we're at it, change this spaghetti if statement
into a switch too.

Cc: stable@vger.kernel.org
Reported-by: CAI Qian <caiqian@redhat.com>
Tested-by: CAI Qian <caiqian@redhat.com>
Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>

3af36e8a

KEYS: Enable the compat keyctl wrapper on s390x · f30c620b

David Howells authored Feb 24, 2012

commit 1d057720 upstream.

Enable the compat keyctl wrapper on s390x so that 32-bit s390 userspace can
call the keyctl() syscall.

There's an s390x assembly wrapper that truncates all the register values to
32-bits and this then calls compat_sys_keyctl() - but the latter only exists if
CONFIG_KEYS_COMPAT is enabled, and the s390 Kconfig doesn't enable it.

Without this patch, 32-bit calls to the keyctl() syscall are given an ENOSYS
error:

	[root@devel4 ~]# keyctl show
	Session Keyring
	-3: key inaccessible (Function not implemented)
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: dan@danny.cz
Cc: Carsten Otte <cotte@de.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: linux-s390@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>

f30c620b

eCryptfs: Handle failed metadata read in lookup · 70e74484

Tyler Hicks authored Mar 06, 2012

When failing to read the lower file's crypto metadata during a lookup,
eCryptfs must continue on without throwing an error. For example, there
may be a plaintext file in the lower mount point that the user wants to
delete through the eCryptfs mount.

If an error is encountered while reading the metadata in lookup(), the
eCryptfs inode's size could be incorrect. We must be sure to reread the
plaintext inode size from the metadata when performing an open() or
setattr(). The metadata is already being read in those paths, so this
adds minimal performance overhead.

This patch introduces a flag which will track whether or not the
plaintext inode size has been read so that an incorrect i_size can be
fixed in the open() or setattr() paths.

BugLink: http://bugs.launchpad.net/bugs/509180
https://lists.ubuntu.com/archives/kernel-team/2012-March/019137.html

Cc: <stable@kernel.org>
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>

(backported from 3aeb86ea)
Cc: Tyler Hicks <tyler.hicks@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>

70e74484

bsg: fix sysfs link remove warning · 8e5ad053

Stanislaw Gruszka authored Mar 05, 2012

BugLink: http://bugs.launchpad.net/bugs/946928

We create "bsg" link if q->kobj.sd is not NULL, so remove it only
when the same condition is true.

Fixes:

WARNING: at fs/sysfs/inode.c:323 sysfs_hash_and_remove+0x2b/0x77()
sysfs: can not remove 'bsg', no directory
Call Trace:
  [<c0429683>] warn_slowpath_common+0x6a/0x7f
  [<c0537a68>] ? sysfs_hash_and_remove+0x2b/0x77
  [<c042970b>] warn_slowpath_fmt+0x2b/0x2f
  [<c0537a68>] sysfs_hash_and_remove+0x2b/0x77
  [<c053969a>] sysfs_remove_link+0x20/0x23
  [<c05d88f1>] bsg_unregister_queue+0x40/0x6d
  [<c0692263>] __scsi_remove_device+0x31/0x9d
  [<c069149f>] scsi_forget_host+0x41/0x52
  [<c0689fa9>] scsi_remove_host+0x71/0xe0
  [<f7de5945>] quiesce_and_remove_host+0x51/0x83 [usb_storage]
  [<f7de5a1e>] usb_stor_disconnect+0x18/0x22 [usb_storage]
  [<c06c29de>] usb_unbind_interface+0x4e/0x109
  [<c067a80f>] __device_release_driver+0x6b/0xa6
  [<c067a861>] device_release_driver+0x17/0x22
  [<c067a46a>] bus_remove_device+0xd6/0xe6
  [<c06785e2>] device_del+0xf2/0x137
  [<c06c101f>] usb_disable_device+0x94/0x1a0
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
(cherry picked from commit 37b40adf)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>

8e5ad053

writeback: fixups for !dirty_writeback_centisecs · 05053902

Jens Axboe authored May 21, 2010

commit 6423104b upstream.

Commit 69b62d01 fixed up most of the places where we would enter
busy schedule() spins when disabling the periodic background
writeback. This fixes up the sb timer so that it doesn't get
hammered on with the delay disabled, and ensures that it gets
rearmed if needed when /proc/sys/vm/dirty_writeback_centisecs
gets modified.

bdi_forker_task() also needs to check for !dirty_writeback_centisecs
and use schedule() appropriately, fix that up too.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Tested-by: Xavier Roche <roche@httrack.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>

05053902

IA64: Remove COMPAT_IA32 support · d9a25c03

Ben Hutchings authored Mar 09, 2012

commit 32974ad4 upstream

This just changes Kconfig rather than touching all the other files the
original commit did.

Patch description from the original commit :

  |  [IA64] Remove COMPAT_IA32 support
  |
  |  This has been broken since May 2008 when Al Viro killed altroot support.
  |  Since nobody has complained, it would appear that there are no users of
  |  this code (A plausible theory since the main OSVs that support ia64 prefer
  |  to use the IA32-EL software emulation).
  |
  |  Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Willy Tarreau <w@1wt.eu>

d9a25c03

compat: Re-add missing asm/compat.h include to fix compile breakage on s390 · ee116431

Heiko Carstens authored Mar 05, 2012

For kernels < 3.0 the backport of 048cd4e5
"compat: fix compile breakage on s390" will break compilation...

Re-add a single #include <asm/compat.h> in order to fix this.

This patch is _not_ necessary for upstream, only for stable kernels
which include the "build fix" mentioned above.
Reported-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>

ee116431

04 Mar, 2012 19 commits

Linux 2.6.32.58 · 51c9aee2
Greg Kroah-Hartman authored Mar 04, 2012

51c9aee2

PM / Sleep: Fix read_unlock_usermodehelper() call. · bf698b51

Tetsuo Handa authored Feb 29, 2012

[ Upstream commit e4c89a50 ]

Commit b298d289
 "PM / Sleep: Fix freezer failures due to racy usermodehelper_is_disabled()"
added read_unlock_usermodehelper() but read_unlock_usermodehelper() is called
without read_lock_usermodehelper() when kmalloc() failed.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

bf698b51

PM / Sleep: Fix freezer failures due to racy usermodehelper_is_disabled() · 0707a9cd

Srivatsa S. Bhat authored Feb 29, 2012

[ Upstream commit b298d289 ]

Commit a144c6a6 (PM: Print a warning if firmware is requested when tasks
are frozen) introduced usermodehelper_is_disabled() to warn and exit
immediately if firmware is requested when usermodehelpers are disabled.

However, it is racy. Consider the following scenario, currently used in
drivers/base/firmware_class.c:

...
if (usermodehelper_is_disabled())
        goto out;

/* Do actual work */
...

out:
        return err;

Nothing prevents someone from disabling usermodehelpers just after the check
in the 'if' condition, which means that it is quite possible to try doing the
"actual work" with usermodehelpers disabled, leading to undesirable
consequences.

In particular, this race condition in _request_firmware() causes task freezing
failures whenever suspend/hibernation is in progress because, it wrongly waits
to get the firmware/microcode image from userspace when actually the
usermodehelpers are disabled or userspace has been frozen.
Some of the example scenarios that cause freezing failures due to this race
are those that depend on userspace via request_firmware(), such as x86
microcode module initialization and microcode image reload.

Previous discussions about this issue can be found at:
http://thread.gmane.org/gmane.linux.kernel/1198291/focus=1200591

This patch adds proper synchronization to fix this issue.

It is to be noted that this patchset fixes the freezing failures but doesn't
remove the warnings. IOW, it does not attempt to add explicit synchronization
to x86 microcode driver to avoid requesting microcode image at inopportune
moments. Because, the warnings were introduced to highlight such cases, in the
first place. And we need not silence the warnings, since we take care of the
*real* problem (freezing failure) and hence, after that, the warnings are
pretty harmless anyway.
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

0707a9cd

firmware loader: allow builtin firmware load even if usermodehelper is disabled · 11dcbf06

Linus Torvalds authored Feb 29, 2012

[ Upstream commit caca9510 ]

In commit a144c6a6 ("PM: Print a warning if firmware is requested
when tasks are frozen") we not only printed a warning if somebody tried
to load the firmware when tasks are frozen - we also failed the load.

But that check was done before the check for built-in firmware, and then
when we disallowed usermode helpers during bootup (commit 288d5abe:
"Boot up with usermodehelper disabled"), that actually means that
built-in modules can no longer load their firmware even if the firmware
is built in too.  Which used to work, and some people depended on it for
the R100 driver.

So move the test for usermodehelper_is_disabled() down, to after
checking the built-in firmware.

This should fix:

	https://bugzilla.kernel.org/show_bug.cgi?id=40952Reported-by: James Cloos <cloos@hjcloos.com>
Bisected-by: Elimar Riesebieter <riesebie@lxtec.de>
Cc: Michel Dänzer <michel@daenzer.net>
Cc: Rafael Wysocki <rjw@sisk.pl>
Cc: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

11dcbf06

PM: Print a warning if firmware is requested when tasks are frozen · 4e5c29cf

Rafael J. Wysocki authored Feb 29, 2012

[ Upstream commit a144c6a6 ]

Some drivers erroneously use request_firmware() from their ->resume()
(or ->thaw(), or ->restore()) callbacks, which is not going to work
unless the firmware has been built in.  This causes system resume to
stall until the firmware-loading timeout expires, which makes users
think that the resume has failed and reboot their machines
unnecessarily.  For this reason, make _request_firmware() print a
warning and return immediately with error code if it has been called
when tasks are frozen and it's impossible to start any new usermode
helpers.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Reviewed-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

4e5c29cf

compat: fix compile breakage on s390 · 7fce3f2d

Heiko Carstens authored Feb 27, 2012

commit 048cd4e5 upstream.

The new is_compat_task() define for the !COMPAT case in
include/linux/compat.h conflicts with a similar define in
arch/s390/include/asm/compat.h.

This is the minimal patch which fixes the build issues.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

7fce3f2d

Fix autofs compile without CONFIG_COMPAT · 25d10dda

Linus Torvalds authored Feb 26, 2012

commit 3c761ea0 upstream.

The autofs compat handling fix caused a compile failure when
CONFIG_COMPAT isn't defined.

Instead of adding random #ifdef'fery in autofs, let's just make the
compat helpers earlier to use: without CONFIG_COMPAT, is_compat_task()
just hardcodes to zero.

We could probably do something similar for a number of other cases where
we have #ifdef's in code, but this is the low-hanging fruit.
Reported-and-tested-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

25d10dda

autofs: work around unhappy compat problem on x86-64 · 82e43e2a

Ian Kent authored Feb 22, 2012

commit a32744d4 upstream.

When the autofs protocol version 5 packet type was added in commit
5c0a32fc ("autofs4: add new packet type for v5 communications"), it
obvously tried quite hard to be word-size agnostic, and uses explicitly
sized fields that are all correctly aligned.

However, with the final "char name[NAME_MAX+1]" array at the end, the
actual size of the structure ends up being not very well defined:
because the struct isn't marked 'packed', doing a "sizeof()" on it will
align the size of the struct up to the biggest alignment of the members
it has.

And despite all the members being the same, the alignment of them is
different: a "__u64" has 4-byte alignment on x86-32, but native 8-byte
alignment on x86-64.  And while 'NAME_MAX+1' ends up being a nice round
number (256), the name[] array starts out a 4-byte aligned.

End result: the "packed" size of the structure is 300 bytes: 4-byte, but
not 8-byte aligned.

As a result, despite all the fields being in the same place on all
architectures, sizeof() will round up that size to 304 bytes on
architectures that have 8-byte alignment for u64.

Note that this is *not* a problem for 32-bit compat mode on POWER, since
there __u64 is 8-byte aligned even in 32-bit mode.  But on x86, 32-bit
and 64-bit alignment is different for 64-bit entities, and as a result
the structure that has exactly the same layout has different sizes.

So on x86-64, but no other architecture, we will just subtract 4 from
the size of the structure when running in a compat task.  That way we
will write the properly sized packet that user mode expects.

Not pretty.  Sadly, this very subtle, and unnecessary, size difference
has been encoded in user space that wants to read packets of *exactly*
the right size, and will refuse to touch anything else.
Reported-and-tested-by: Thomas Meyer <thomas@m3y3r.de>
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

82e43e2a

cdrom: use copy_to_user() without the underscores · 3e9d6c33

Dan Carpenter authored Feb 06, 2012

commit 822bfa51 upstream.

"nframes" comes from the user and "nframes * CD_FRAMESIZE_RAW" can wrap
on 32 bit systems.  That would have been ok if we used the same wrapped
value for the copy, but we use a shifted value.  We should just use the
checked version of copy_to_user() because it's not going to make a
difference to the speed.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

3e9d6c33

eCryptfs: Clear i_nlink in rmdir · d298243d

Tyler Hicks authored Apr 29, 2011

commit 07850552 upstream.

eCryptfs wasn't clearing the eCryptfs inode's i_nlink after a successful
vfs_rmdir() on the lower directory. This resulted in the inode evict and
destroy paths to be missed.

https://bugs.launchpad.net/ecryptfs/+bug/723518Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
Signed-off-by: Colin King <colin.king@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

d298243d

eCryptfs: Remove extra d_delete in ecryptfs_rmdir · 12a9959b

Tyler Hicks authored Apr 12, 2011

commit 35ffa948 upstream.

vfs_rmdir() already calls d_delete() on the lower dentry. That was being
duplicated in ecryptfs_rmdir() and caused a NULL pointer dereference
when NFSv3 was the lower filesystem.

BugLink: http://bugs.launchpad.net/bugs/723518Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
Signed-off-by: Colin King <colin.king@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

12a9959b

eCryptfs: Use notify_change for truncating lower inodes · 3c9b2dad

Tyler Hicks authored Oct 14, 2009

commit 5f3ef64f upstream.

When truncating inodes in the lower filesystem, eCryptfs directly
invoked vmtruncate(). As Christoph Hellwig pointed out, vmtruncate() is
a filesystem helper function, but filesystems may need to do more than
just a call to vmtruncate().

This patch moves the lower inode truncation out of ecryptfs_truncate()
and renames the function to truncate_upper().  truncate_upper() updates
an iattr for the lower inode to indicate if the lower inode needs to be
truncated upon return.  ecryptfs_setattr() then calls notify_change(),
using the updated iattr for the lower inode, to complete the truncation.

For eCryptfs functions needing to truncate, ecryptfs_truncate() is
reintroduced as a simple way to truncate the upper inode to a specified
size and then truncate the lower inode accordingly.

https://bugs.launchpad.net/bugs/451368Reported-by: Christoph Hellwig <hch@lst.de>
Acked-by: Dustin Kirkland <kirkland@canonical.com>
Cc: ecryptfs-devel@lists.launchpad.net
Cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

3c9b2dad

hdpvr: fix race conditon during start of streaming · 1bf5d0a4

Janne Grunau authored Feb 02, 2012

commit afa15953 upstream.

status has to be set to STREAMING before the streaming worker is
queued. hdpvr_transmit_buffers() will exit immediately otherwise.
Reported-by: Joerg Desch <vvd.joede@googlemail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

1bf5d0a4

xhci: Fix encoding for HS bulk/control NAK rate. · 5027714d

Sarah Sharp authored Feb 13, 2012

commit 340a3504 upstream.

The xHCI 0.96 spec says that HS bulk and control endpoint NAK rate must
be encoded as an exponent of two number of microframes.  The endpoint
descriptor has the NAK rate encoded in number of microframes.  We were
just copying the value from the endpoint descriptor into the endpoint
context interval field, which was not correct.  This lead to the VIA
host rejecting the add of a bulk OUT endpoint from any USB 2.0 mass
storage device.

The fix is to use the correct encoding.  Refactor the code to convert
number of frames to an exponential number of microframes, and make sure
we convert the number of microframes in HS bulk and control endpoints to
an exponent.

This should be back ported to kernels as old as 2.6.31, that contain the
commit dfa49c4a "USB: xhci - fix math
in xhci_get_endpoint_interval"
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Tested-by: Felipe Contreras <felipe.contreras@gmail.com>
Suggested-by: Andiry Xu <andiry.xu@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

5027714d

USB: Fix handoff when BIOS disables host PCI device. · addf2e19

Sarah Sharp authored Feb 07, 2012

commit cab928ee upstream.

On some systems with an Intel Panther Point xHCI host controller, the
BIOS disables the xHCI PCI device during boot, and switches the xHCI
ports over to EHCI.  This allows the BIOS to access USB devices without
having xHCI support.

The downside is that the xHCI BIOS handoff mechanism will fail because
memory mapped I/O is not enabled for the disabled PCI device.
Jesse Barnes says this is expected behavior.  The PCI core will enable
BARs before quirks run, but it will leave it in an undefined state, and
it may not have memory mapped I/O enabled.

Make the generic USB quirk handler call pci_enable_device() to re-enable
MMIO, and call pci_disable_device() once the host-specific BIOS handoff
is finished.  This will balance the ref counts in the PCI core.  When
the PCI probe function is called, usb_hcd_pci_probe() will call
pci_enable_device() again.

This should be back ported to kernels as old as 2.6.31.  That was the
first kernel with xHCI support, and no one has complained about BIOS
handoffs failing due to memory mapped I/O being disabled on other hosts
(EHCI, UHCI, or OHCI).
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Acked-by: Oliver Neukum <oneukum@suse.de>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

addf2e19

USB: Added Kamstrup VID/PIDs to cp210x serial driver. · c8070954

Bruno Thomsen authored Feb 21, 2012

commit c6c1e449 upstream.
Signed-off-by: Bruno Thomsen <bruno.thomsen@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

c8070954

ARM: 7325/1: fix v7 boot with lockdep enabled · fc218b7c

Rabin Vincent authored Feb 15, 2012

commit 8e43a905 upstream.

Bootup with lockdep enabled has been broken on v7 since b46c0f74
("ARM: 7321/1: cache-v7: Disable preemption when reading CCSIDR").

This is because v7_setup (which is called very early during boot) calls
v7_flush_dcache_all, and the save_and_disable_irqs added by that patch
ends up attempting to call into lockdep C code (trace_hardirqs_off())
when we are in no position to execute it (no stack, MMU off).

Fix this by using a notrace variant of save_and_disable_irqs.  The code
already uses the notrace variant of restore_irqs.
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Stephen Boyd <sboyd@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

fc218b7c

ARM: 7321/1: cache-v7: Disable preemption when reading CCSIDR · 0e003003

Stephen Boyd authored Feb 07, 2012

commit b46c0f74 upstream.

armv7's flush_cache_all() flushes caches via set/way. To
determine the cache attributes (line size, number of sets,
etc.) the assembly first writes the CSSELR register to select a
cache level and then reads the CCSIDR register. The CSSELR register
is banked per-cpu and is used to determine which cache level CCSIDR
reads. If the task is migrated between when the CSSELR is written and
the CCSIDR is read the CCSIDR value may be for an unexpected cache
level (for example L1 instead of L2) and incorrect cache flushing
could occur.

Disable interrupts across the write and read so that the correct
cache attributes are read and used for the cache flushing
routine. We disable interrupts instead of disabling preemption
because the critical section is only 3 instructions and we want
to call v7_dcache_flush_all from __v7_setup which doesn't have a
full kernel stack with a struct thread_info.

This fixes a problem we see in scm_call() when flush_cache_all()
is called from preemptible context and sometimes the L2 cache is
not properly flushed out.
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

0e003003

SCSI: 3w-9xxx fix bug in sgl loading · 95964a5e

adam radford authored Dec 10, 2009

commit 53ca3535 upstream.

This small patch fixes a bug in the 3w-9xxx driver where it would load
an invalid sgl address in the ioctl path even if request length was zero.
Signed-off-by: Adam Radford <aradford@gmail.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Cc: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

95964a5e