1. 23 Jun, 2011 28 commits
    • loop: handle on-demand devices correctly · e4eb3c88
      Namhyung Kim authored
      commit a1c15c59 upstream.
      
      When finding or allocating a loop device, loop_probe() did not take
      partition numbers into account, so it could end up at a different
      device. Consider the following example:
      
      $ sudo modprobe loop max_part=15
      $ ls -l /dev/loop*
      brw-rw---- 1 root disk 7,   0 2011-05-24 22:16 /dev/loop0
      brw-rw---- 1 root disk 7,  16 2011-05-24 22:16 /dev/loop1
      brw-rw---- 1 root disk 7,  32 2011-05-24 22:16 /dev/loop2
      brw-rw---- 1 root disk 7,  48 2011-05-24 22:16 /dev/loop3
      brw-rw---- 1 root disk 7,  64 2011-05-24 22:16 /dev/loop4
      brw-rw---- 1 root disk 7,  80 2011-05-24 22:16 /dev/loop5
      brw-rw---- 1 root disk 7,  96 2011-05-24 22:16 /dev/loop6
      brw-rw---- 1 root disk 7, 112 2011-05-24 22:16 /dev/loop7
      $ sudo mknod /dev/loop8 b 7 128
      $ sudo losetup /dev/loop8 ~/temp/disk-with-3-parts.img
      $ sudo losetup -a
      /dev/loop128: [0805]:278201 (/home/namhyung/temp/disk-with-3-parts.img)
      $ ls -l /dev/loop*
      brw-rw---- 1 root disk 7,    0 2011-05-24 22:16 /dev/loop0
      brw-rw---- 1 root disk 7,   16 2011-05-24 22:16 /dev/loop1
      brw-rw---- 1 root disk 7, 2048 2011-05-24 22:18 /dev/loop128
      brw-rw---- 1 root disk 7, 2049 2011-05-24 22:18 /dev/loop128p1
      brw-rw---- 1 root disk 7, 2050 2011-05-24 22:18 /dev/loop128p2
      brw-rw---- 1 root disk 7, 2051 2011-05-24 22:18 /dev/loop128p3
      brw-rw---- 1 root disk 7,   32 2011-05-24 22:16 /dev/loop2
      brw-rw---- 1 root disk 7,   48 2011-05-24 22:16 /dev/loop3
      brw-rw---- 1 root disk 7,   64 2011-05-24 22:16 /dev/loop4
      brw-rw---- 1 root disk 7,   80 2011-05-24 22:16 /dev/loop5
      brw-rw---- 1 root disk 7,   96 2011-05-24 22:16 /dev/loop6
      brw-rw---- 1 root disk 7,  112 2011-05-24 22:16 /dev/loop7
      brw-r--r-- 1 root root 7,  128 2011-05-24 22:17 /dev/loop8
      
      After this patch, /dev/loop8 - instead of /dev/loop128 - was
      accessed correctly.
      
      In addition, the 'range' passed to blk_register_region() should
      cover the whole range of dev_t that LOOP_MAJOR can address. It does
      not need to be limited by partition numbers unless the 'max_loop'
      param was specified.
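
The minor numbers in the listing above can be checked with a small sketch. It assumes the driver reserves fls(max_part) minor bits per device for partitions, so that the owning device index is the minor shifted right by that amount. The helper names are hypothetical and are not taken from the patch.

```c
/* Bits reserved per loop device for partition minors.
 * Same result as the kernel's fls(): index of the highest set bit, 0 for 0. */
static int part_shift_for(unsigned int max_part)
{
    int shift = 0;

    while (max_part) {
        shift++;
        max_part >>= 1;
    }
    return shift;
}

/* Hypothetical helper: index of the loop device that owns a given minor. */
static unsigned int loop_index_from_minor(unsigned int minor, int part_shift)
{
    return minor >> part_shift;
}
```

With max_part=15 the shift is 4, so minor 128 belongs to loop8. Treating the raw minor 128 as the index instead gives loop128, whose first minor is 128 << 4 = 2048, which matches the stray nodes in the listing.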

      Signed-off-by: Namhyung Kim <namhyung@gmail.com>
      Cc: Laurent Vivier <Laurent.Vivier@bull.net>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • loop: limit 'max_part' module param to DISK_MAX_PARTS · 2a140e31
      Namhyung Kim authored
      commit 78f4bb36 upstream.
      
      The 'max_part' parameter controls the maximum number of partitions
      a loop block device can have. However, if a user specifies a very
      large value, it exceeds the limit of the device minor number
      and can cause a kernel panic (or, at least, produce invalid
      device nodes in some cases).
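
A minimal sketch of the kind of bound the subject line implies, assuming the partition shift is fls(max_part) and that DISK_MAX_PARTS is 256 as in the kernel's genhd.h. The function names are hypothetical, and the real patch may clamp or reject differently.

```c
#define DISK_MAX_PARTS 256  /* assumed to match the kernel's per-disk limit */

/* Same result as the kernel's fls(): index of the highest set bit, 0 for 0. */
static int fls_sketch(unsigned long v)
{
    int bits = 0;

    while (v) {
        bits++;
        v >>= 1;
    }
    return bits;
}

/* Hypothetical validator: returns 0 if max_part is usable, -1 otherwise. */
static int check_max_part(unsigned long max_part)
{
    int part_shift = fls_sketch(max_part);

    /* Guard the shift itself, then compare against the per-disk limit. */
    if (part_shift >= (int)(sizeof(unsigned long) * 8) ||
        (1UL << part_shift) > DISK_MAX_PARTS)
        return -1;
    return 0;
}
```

Under these assumptions max_part=15 and max_part=255 pass, while 256 and anything larger are refused before any disk is registered.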
      
      On my desktop system, the following command kills the kernel. On qemu,
      it triggers a similar oops but the kernel stays alive:
      
      $ sudo modprobe loop max_part=100000
       ------------[ cut here ]------------
       kernel BUG at /media/Linux_Data/project/linux/fs/sysfs/group.c:65!
       invalid opcode: 0000 [#1] SMP
       last sysfs file:
       CPU 0
       Modules linked in: loop(+)
      
       Pid: 43, comm: insmod Tainted: G        W   2.6.39-qemu+ #155 Bochs Bochs
       RIP: 0010:[<ffffffff8113ce61>]  [<ffffffff8113ce61>] internal_create_group+0x2a/0x170
       RSP: 0018:ffff880007b3fde8  EFLAGS: 00000246
       RAX: 00000000ffffffef RBX: ffff880007b3d878 RCX: 00000000000007b4
       RDX: ffffffff8152da50 RSI: 0000000000000000 RDI: ffff880007b3d878
       RBP: ffff880007b3fe38 R08: ffff880007b3fde8 R09: 0000000000000000
       R10: ffff88000783b4a8 R11: ffff880007b3d878 R12: ffffffff8152da50
       R13: ffff880007b3d868 R14: 0000000000000000 R15: ffff880007b3d800
       FS:  0000000002137880(0063) GS:ffff880007c00000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 0000000000422680 CR3: 0000000007b50000 CR4: 00000000000006b0
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 0000000000000000 DR7: 0000000000000000
       Process insmod (pid: 43, threadinfo ffff880007b3e000, task ffff880007afb9c0)
       Stack:
        ffff880007b3fe58 ffffffff811e66dd ffff880007b3fe58 ffffffff811e570b
        0000000000000010 ffff880007b3d800 ffff880007a7b390 ffff880007b3d868
        0000000000400920 ffff880007b3d800 ffff880007b3fe48 ffffffff8113cfc8
       Call Trace:
        [<ffffffff811e66dd>] ? device_add+0x4bc/0x5af
        [<ffffffff811e570b>] ? dev_set_name+0x3c/0x3e
        [<ffffffff8113cfc8>] sysfs_create_group+0xe/0x12
        [<ffffffff810b420e>] blk_trace_init_sysfs+0x14/0x16
        [<ffffffff8116a090>] blk_register_queue+0x47/0xf7
        [<ffffffff8116f527>] add_disk+0xdf/0x290
        [<ffffffffa00060eb>] loop_init+0xeb/0x1b8 [loop]
        [<ffffffffa0006000>] ? 0xffffffffa0005fff
        [<ffffffff8100020a>] do_one_initcall+0x7a/0x12e
        [<ffffffff81096804>] sys_init_module+0x9c/0x1e0
        [<ffffffff813329bb>] system_call_fastpath+0x16/0x1b
       Code: c3 55 48 89 e5 41 57 41 56 41 89 f6 41 55 41 54 49 89 d4 53 48 89 fb 48 83 ec 28 48 85 ff 74 0b 85 f6 75 0b 48 83 7f 30 00 75 14 <0f> 0b eb fe 48 83 7f 30 00 b9 ea ff ff ff 0f 84 18 01 00 00 49
       RIP  [<ffffffff8113ce61>] internal_create_group+0x2a/0x170
        RSP <ffff880007b3fde8>
       ---[ end trace a123eb592043acad ]---

      Signed-off-by: Namhyung Kim <namhyung@gmail.com>
      Cc: Laurent Vivier <Laurent.Vivier@bull.net>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • PCI: allow matching of prefetchable resources to non-prefetchable windows · d93bd2d1
      Linus Torvalds authored
      commit 8c8def26 upstream.
      
      I'm not entirely sure it needs to go into 32, but it's probably the right
      thing to do. Another way of explaining the patch is:
      
       - we currently pick the _first_ exactly matching bus resource entry, but
         the _last_ inexactly matching one. Normally first/last shouldn't
         matter, but bus resource entries aren't actually all created equal: in
         a transparent bus, the last resources will be the parent resources,
         which we should generally try to avoid unless we have no choice. So
         "first matching" is the thing we should always aim for.
      
       - the patch is a bit bigger than it needs to be, because I simplified the
         logic at the same time. It used to be a fairly incomprehensible
      
      	if ((res->flags & IORESOURCE_PREFETCH) && !(r->flags & IORESOURCE_PREFETCH))
      		best = r;       /* Approximating prefetchable by non-prefetchable */
      
         and technically, all the patch did was to make that complex choice be
         even more complex (it basically added a "&& !best" to say that if we
         already found a non-prefetchable window for the prefetchable resource,
         then we won't override an earlier one with that later one: remember
         "first matching").
      
       - So instead of that complex one with three separate conditionals in one,
         I split it up a bit, and am taking advantage of the fact that we
         already handled the exact case, so if 'res->flags' has the PREFETCH
         bit, then we already know that 'r->flags' will _not_ have it. So the
         simplified code drops the redundant test, and does the new '!best' test
         separately. It also uses 'continue' as a way to ignore the bus
         resource we know doesn't work (ie a prefetchable bus resource is _not_
         acceptable for anything but an exact match), so it turns into:
      
      	/* We can't insert a non-prefetch resource inside a prefetchable parent .. */
      	if (r->flags & IORESOURCE_PREFETCH)
      		continue;
      	/* .. but we can put a prefetchable resource inside a non-prefetchable one */
      	if (!best)
      		best = r;
      
         instead. With the comments, it's now six lines instead of two, but it's
         conceptually simpler, and I _could_ have written it as two lines:
      
      	if ((res->flags & IORESOURCE_PREFETCH) && !best)
      		best = r;	/* Approximating prefetchable by non-prefetchable */
      
         but I thought that was too damn subtle.
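
For context, the selection rule described above can be sketched as a self-contained loop. This is an illustration only: the struct, the flag value, and the function name are hypothetical stand-ins, not the kernel's definitions.

```c
#include <stddef.h>

#define IORESOURCE_PREFETCH 0x1u  /* illustrative bit, not the kernel's value */

struct window {
    unsigned long start, end;
    unsigned int flags;
};

/* Return the first exactly matching window, else the first
 * non-prefetchable window that contains res, else NULL. */
static const struct window *find_parent_window(const struct window *bus,
                                               size_t n,
                                               const struct window *res)
{
    const struct window *best = NULL;

    for (size_t i = 0; i < n; i++) {
        const struct window *r = &bus[i];

        if (res->start < r->start || res->end > r->end)
            continue;               /* res does not fit in this window */
        if (!((r->flags ^ res->flags) & IORESOURCE_PREFETCH))
            return r;               /* exact match: first one wins */
        /* We can't insert a non-prefetch resource inside a prefetchable parent .. */
        if (r->flags & IORESOURCE_PREFETCH)
            continue;
        /* .. but we can put a prefetchable resource inside a non-prefetchable one */
        if (!best)
            best = r;
    }
    return best;
}
```

Because `best` is only set while it is still NULL, a later inexact window (typically a parent resource on a transparent bus) never displaces an earlier one.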

      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      Cc: Seth Forshee <seth.forshee@canonical.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • mm/page_alloc.c: prevent unending loop in __alloc_pages_slowpath() · 864fce82
      Andrew Barry authored
      commit cfa54a0f upstream.
      
      I believe I found a problem in __alloc_pages_slowpath, which allows a
      process to get stuck endlessly looping, even when lots of memory is
      available.
      
      Running an I/O and memory intensive stress-test I see a 0-order page
      allocation with __GFP_IO and __GFP_WAIT, running on a system with very
      little free memory.  Right about the same time that the stress-test gets
      killed by the OOM-killer, the utility trying to allocate memory gets stuck
      in __alloc_pages_slowpath even though most of the system's memory was freed
      by the oom-kill of the stress-test.
      
      The utility ends up looping from the rebalance label down through the
      wait_iff_congested call continuously.  Because order=0,
      __alloc_pages_direct_compact skips the call to get_page_from_freelist.
      Because all of the reclaimable memory on the system has already been
      reclaimed, __alloc_pages_direct_reclaim skips the call to
      get_page_from_freelist.  Since there is no __GFP_FS flag, the block with
      __alloc_pages_may_oom is skipped.  The loop hits the wait_iff_congested,
      then jumps back to rebalance without ever trying to
      get_page_from_freelist.  This loop repeats infinitely.
      
      The test case is pretty pathological.  Running a mix of I/O stress-tests
      that do a lot of fork() and consume all of the system memory, I can pretty
      reliably hit this on 600 nodes, in about 12 hours.  32GB/node.

      Signed-off-by: Andrew Barry <abarry@cray.com>
      Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
      Reviewed-by: Rik van Riel <riel@redhat.com>
      Acked-by: Mel Gorman <mgorman@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • ASoC: Ensure output PGA is enabled for line outputs in wm_hubs · 01b242ac
      Mark Brown authored
      commit d0b48af6 upstream.
      
      Also fix a left/right typo while we're at it.

      Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
      Acked-by: Liam Girdwood <lrg@ti.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • ALSA: HDA: Use one dmic only for Dell Studio 1558 · ce9f8da9
      David Henningsson authored
      commit e033ebfb upstream.
      
      There are no signs of a dmic at node 0x0b, so the user is left with
      an additional internal mic which does not exist. This commit removes
      that non-existing mic.
      
      BugLink: http://bugs.launchpad.net/bugs/731706
      Reported-by: James Page <james.page@canonical.com>
      Signed-off-by: David Henningsson <david.henningsson@canonical.com>
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • seqlock: Don't smp_rmb in seqlock reader spin loop · ff3af587
      Milton Miller authored
      commit 5db1256a upstream.
      
      Move the smp_rmb after cpu_relax loop in read_seqlock and add
      ACCESS_ONCE to make sure the test and return are consistent.
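
The shape of the change can be sketched in user space with C11 atomics. This is only an analogy under stated assumptions: a relaxed atomic load stands in for ACCESS_ONCE, an acquire fence stands in for smp_rmb, and the struct and function names are hypothetical rather than the kernel's.

```c
#include <stdatomic.h>

struct seq_sketch {
    atomic_uint sequence;   /* odd while a writer is mid-update */
};

/* Reader entry: spin with plain (relaxed) loads only, and issue the
 * read barrier once, after the loop has seen an even sequence value. */
static unsigned int read_begin_sketch(struct seq_sketch *sl)
{
    unsigned int ret;

    while ((ret = atomic_load_explicit(&sl->sequence,
                                       memory_order_relaxed)) & 1u)
        ;   /* the kernel would call cpu_relax() here */

    atomic_thread_fence(memory_order_acquire);  /* stand-in for smp_rmb() */
    return ret;     /* the same value that was tested is returned */
}

/* Reader exit: nonzero means a writer intervened and the read must retry. */
static int read_retry_sketch(struct seq_sketch *sl, unsigned int start)
{
    atomic_thread_fence(memory_order_acquire);
    return atomic_load_explicit(&sl->sequence, memory_order_relaxed) != start;
}
```

Keeping the barrier out of the spin loop means a waiting reader only repeats the cheap load, which is the property the message credits with avoiding the hang.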
      
      A multi-threaded core in the lab didn't like the update
      from 2.6.35 to 2.6.36, to the point it would hang during
      boot when multiple threads were active.  Bisection showed
      af5ab277 (clockevents:
      Remove the per cpu tick skew) as the culprit and it is
      supported with stack traces showing xtime_lock waits including
      tick_do_update_jiffies64 and/or update_vsyscall.
      
      Experimentation showed the combination of cpu_relax and smp_rmb
      was significantly slowing the progress of other threads sharing
      the core, and this patch is effective in avoiding the hang.
      
      A theory is that the rmb is affecting the whole core while the
      cpu_relax is causing a resource rebalance flush; together they
      cause an interference cadence that is unbroken when the seqlock
      reader has interrupts disabled.
      
      At first I was confused why the refactor in
      3c22cd57 (kernel: optimise
      seqlock) didn't affect this patch application, but after some
      study that affected seqcount not seqlock. The new seqcount was
      not factored back into the seqlock.  I defer that to the future.
      
      While the removal of the timer interrupt offset created
      contention for the xtime lock while a cpu does the
      additional work to update the system clock, the seqlock
      implementation with the tight rmb spin loop goes back much
      further, and was just waiting for the right trigger.

      Signed-off-by: Milton Miller <miltonm@bga.com>
      Cc: <linuxppc-dev@lists.ozlabs.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
      Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
      Link: http://lkml.kernel.org/r/%3Cseqlock-rmb%40mdm.bga.com%3E
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • Fix for buffer overflow in ldm_frag_add not sufficient · 8bdae892
      Timo Warns authored
      commit cae13fe4 upstream.
      
      As Ben Hutchings discovered [1], the patch for CVE-2011-1017 (buffer
      overflow in ldm_frag_add) is not sufficient.  The original patch in
      commit c340b1d6 ("fs/partitions/ldm.c: fix oops caused by corrupted
      partition table") does not consider that, for subsequent fragments,
      previously allocated memory is used.
      
      [1] http://lkml.org/lkml/2011/5/6/407

      Reported-by: Ben Hutchings <ben@decadent.org.uk>
      Signed-off-by: Timo Warns <warns@pre-sense.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • staging: usbip: fix wrong endian conversion · 353be243
      David Chang authored
      commit cacd18a8 upstream.
      
      Fix number_of_packets wrong endian conversion in function
      correct_endian_ret_submit()

      Signed-off-by: David Chang <dchang@novell.com>
      Acked-by: Arjan Mels <arjan.mels@gmx.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • rcu: Fix unpaired rcu_irq_enter() from locking selftests · 6fa3d714
      Frederic Weisbecker authored
      commit ba9f207c upstream.
      
      HARDIRQ_ENTER() maps to irq_enter() which calls rcu_irq_enter().
      But HARDIRQ_EXIT() maps to __irq_exit() which doesn't call
      rcu_irq_exit().
      
      So for every locking selftest that simulates hardirq disabled,
      we create an imbalance in the rcu extended quiescent state
      internal state.
      
      As a result, after the first missing rcu_irq_exit(), subsequent
      irqs won't exit dyntick-idle mode after leaving the interrupt
      handler.  This means that RCU won't see the affected CPU as being
      in an extended quiescent state, resulting in long grace-period
      delays (as in grace periods extending for hours).
      
      To fix this, just use __irq_enter() to simulate the hardirq
      context. This is sufficient for the locking selftests as we
      don't need to exit any extended quiescent state or perform
      any check that irqs normally do when they wake up from idle.
      
      As a side effect, this patch makes it possible to restore
      "rcu: Decrease memory-barrier usage based on semi-formal proof",
      which eventually helped finding this bug.

      Reported-and-tested-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • x86, amd: Use _safe() msr access for GartTlbWlk disable code · 03710bb4
      Roedel, Joerg authored
      commit d47cc0db upstream.
      
      The workaround for Bugzilla:
      
      	https://bugzilla.kernel.org/show_bug.cgi?id=33012
      
      introduced a read and a write to the MC4 mask msr.
      
      Unfortunately this MSR is not emulated by the KVM hypervisor,
      so the kernel gets a #GP and crashes when applying
      this workaround while running inside KVM.
      
      This issue was reported as:
      
      	https://bugzilla.kernel.org/show_bug.cgi?id=35132
      
      and is fixed with this patch. The change just lets the kernel
      ignore any #GP it gets while accessing this MSR by using the
      _safe msr access methods.

      Reported-by: Török Edwin <edwintorok@gmail.com>
      Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
      Cc: Rafael J. Wysocki <rjw@sisk.pl>
      Cc: Maciej Rutecki <maciej.rutecki@gmail.com>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • x86, amd: Do not enable ARAT feature on AMD processors below family 0x12 · 94073753
      Boris Ostrovsky authored
      commit e9cdd343 upstream.
      
      Commit b87cf80a added support for
      ARAT (Always Running APIC timer) on AMD processors that are not
      affected by erratum 400. This erratum is present on certain processor
      families and prevents the APIC timer from waking up the CPU when it
      is in a deep C state, including the C1E state.
      
      Determining whether a processor is affected by this erratum may
      have some corner cases and handling these cases is somewhat
      complicated. In the interest of simplicity we won't claim ARAT
      support on processor families below 0x12 and will go back to
      broadcasting timer when going idle.

      Signed-off-by: Boris Ostrovsky <ostr@amd64.org>
      Link: http://lkml.kernel.org/r/1306423192-19774-1-git-send-email-ostr@amd64.org
      Tested-by: Boris Petkov <borislav.petkov@amd.com>
      Cc: Hans Rosenfeld <Hans.Rosenfeld@amd.com>
      Cc: Andreas Herrmann <Andreas.Herrmann3@amd.com>
      Cc: Chuck Ebbert <cebbert@redhat.com>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • Fix Ultrastor asm snippet · 4efd0b08
      Samuel Thibault authored
      commit fad4dab5 upstream.
      
      Commit 1292500b replaced
      
      "=m" (*field) : "1" (*field)
      
      with
      
      "=m" (*field) :
      
      with the comment "The following patch fixes it by using the '+' operator on
      the (*field) operand, marking it as read-write to gcc."
      The '+' was actually forgotten.  This patch really adds it.

      Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
      Signed-off-by: James Bottomley <jbottomley@parallels.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • ext4: release page cache in ext4_mb_load_buddy error path · 5252bdb1
      Yang Ruirui authored
      commit 26626f11 upstream.
      
      Add missing page_cache_release in the error path of ext4_mb_load_buddy

      Signed-off-by: Yang Ruirui <ruirui.r.yang@tieto.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • jbd: fix fsync() tid wraparound bug · cdc57f82
      Ted Ts'o authored
      commit d9b01934 upstream.
      
      If an application program does not make any changes to the indirect
      blocks or extent tree, i_datasync_tid will not get updated.  If there
      are enough commits (i.e., 2**31) such that tid_geq()'s calculations
      wrap, and there isn't a currently active transaction at the time of
      the fdatasync() call, this can end up triggering a BUG_ON in
      fs/jbd/commit.c:
      
      	J_ASSERT(journal->j_running_transaction != NULL);
      
      It's pretty rare that this can happen, since it requires the use of
      fdatasync() plus *very* frequent and excessive use of fsync().  But
      with the right workload, it can.
      
      We fix this by replacing the use of tid_geq() with an equality test,
      since there's only one valid transaction id that is valid for us to
      start: namely, the currently running transaction (if it exists).
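
The wraparound can be reproduced in isolation. The sketch below assumes tid_t is a 32-bit unsigned int and that tid_geq() compares via a signed difference, which is how the jbd header is commonly written; the narrowing conversion relies on the usual two's-complement behaviour of mainstream compilers.

```c
typedef unsigned int tid_t;   /* assumed to be 32 bits wide */

/* Signed-difference comparison: "x is at or after y" in sequence space. */
static int tid_geq(tid_t x, tid_t y)
{
    int difference = (int)(x - y);

    return difference >= 0;
}

/* The stricter test the message describes: only the running tid qualifies. */
static int tid_is_running(tid_t target, tid_t running)
{
    return target == running;
}
```

With a stale target of 90 and a running transaction id of 90 + 0x80000001, the unsigned difference is 0x7fffffff, so tid_geq() reports the stale id as current or newer, whereas the equality test rejects it.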
      
      Reported-by: Martin_Zielinski@McAfee.com
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • jbd: Fix forever sleeping process in do_get_write_access() · 538e7bf8
      Jan Kara authored
      commit 2842bb20 upstream.
      
      In do_get_write_access() we wait on the BH_Unshadow bit for the buffer to
      leave the shadow state. The waking code in journal_commit_transaction() has
      a bug because it does not issue a memory barrier after the buffer is moved
      from the shadow state and before wake_up_bit() is called. Thus a waitqueue
      check can happen before the buffer is actually moved from the shadow state,
      and the waiting process may never be woken. Fix the problem by issuing a
      proper barrier.

      Reported-by: Tao Ma <boyu.mt@taobao.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • ext3: Fix fs corruption when make_indexed_dir() fails · d23b7b62
      Jan Kara authored
      commit 86c4f6d8 upstream.
      
      When make_indexed_dir() fails (e.g. because of ENOSPC) after it has allocated
      a block for the index tree root, we did not properly mark all changed buffers
      dirty. This led to only some of these buffers being written out, effectively
      corrupting the directory.
      
      Fix the issue by marking all changed data dirty even in the error case.

      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • x86, 64-bit: Fix copy_[to/from]_user() checks for the userspace address limit · 45b0dfab
      Jiri Olsa authored
      commit 26afb7c6 upstream.
      
      As reported in BZ #30352:
      
        https://bugzilla.kernel.org/show_bug.cgi?id=30352
      
      there's a kernel bug related to reading the last allowed page on x86_64.
      
      The _copy_to_user() and _copy_from_user() functions use the following
      check for address limit:
      
        if (buf + size >= limit)
      	fail();
      
      while it should be more permissive:
      
        if (buf + size > limit)
      	fail();
      
      That's because the size represents the number of bytes being
      read/write from/to buf address AND including the buf address.
      So the copy function will actually never touch the limit
      address even if "buf + size == limit".
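
The off-by-one can be checked with plain integer arithmetic. In this sketch the function names are hypothetical, and the limit value used in the note below (0x7ffffffff000) is an assumption about the x86_64 user address limit rather than something stated in this message.

```c
#include <stdint.h>

/* Old check: rejects a buffer whose last byte sits right below the limit. */
static int range_rejected_old(uint64_t buf, uint64_t size, uint64_t limit)
{
    return buf + size >= limit;
}

/* Fixed check: buf + size is one past the last byte touched, so equality
 * with the limit is still a valid access. */
static int range_rejected_new(uint64_t buf, uint64_t size, uint64_t limit)
{
    return buf + size > limit;
}
```

For buf = 0x7fffffffe000 and size = 4096, buf + size equals 0x7ffffffff000: the old check rejects the copy, while the fixed check accepts it and still rejects anything one byte longer.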
      
      The following program fails to use the last page as a buffer
      due to the wrong limit check:
      
       #include <stdio.h>
       #include <sys/mman.h>
       #include <sys/socket.h>
       #include <assert.h>
      
       #define PAGE_SIZE       (4096)
       #define LAST_PAGE       ((void*)(0x7fffffffe000))
      
       int main()
       {
              int fds[2], err;
              void * ptr = mmap(LAST_PAGE, PAGE_SIZE, PROT_READ | PROT_WRITE,
                                MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED, -1, 0);
              assert(ptr == LAST_PAGE);
              err = socketpair(AF_LOCAL, SOCK_STREAM, 0, fds);
              assert(err == 0);
              err = send(fds[0], ptr, PAGE_SIZE, 0);
              perror("send");
              assert(err == PAGE_SIZE);
              err = recv(fds[1], ptr, PAGE_SIZE, MSG_WAITALL);
              perror("recv");
              assert(err == PAGE_SIZE);
              return 0;
       }
      
      The other place checking the addr limit is the access_ok() function,
      which is working properly. There's just a misleading comment
      for the __range_not_ok() macro - which this patch fixes as well.
      
      The last page of the user-space address range is a guard page, and
      Brian Gerst observed that the guard page itself exists due to an erratum
      on K8 cpus (#121 Sequential Execution Across Non-Canonical Boundary Causes
      Processor Hang).
      
      However, the test code is using the last valid page before the guard page.
      The bug is that the last byte before the guard page can't be read
      because of the off-by-one error. The guard page is left in place.
      
      This bug would normally not show up because the last page is
      part of the process stack and never accessed via syscalls.

      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      Acked-by: Brian Gerst <brgerst@gmail.com>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1305210630-7136-1-git-send-email-jolsa@redhat.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • mtd: mtdconcat: fix NAND OOB write · d9296aeb
      Felix Radensky authored
      commit 431e1eca upstream.
      
      Currently mtdconcat is broken for NAND. An attempt to create a
      JFFS2 filesystem on a concatenation of several NAND devices fails
      with OOB write errors. This patch fixes that problem.

      Signed-off-by: Felix Radensky <felix@embedded-sol.com>
      Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • block: add proper state guards to __elv_next_request · 5e4c1dbf
      James Bottomley authored
      commit 0a58e077 upstream.
      
      blk_cleanup_queue() calls elevator_exit() and after this, we can't
      touch the elevator without oopsing.  __elv_next_request() must check
      for this state because in the refcounted queue model, we can still
      call it after blk_cleanup_queue() has been called.
      
      This was reported as causing an oops attributable to scsi.

      Signed-off-by: James Bottomley <James.Bottomley@suse.de>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • block: rescan partitions on invalidated devices on -ENOMEDIA too · 5b2745db
      Tejun Heo authored
      commit 02e35228 upstream.
      
      __blkdev_get() doesn't rescan partitions if disk->fops->open() fails,
      which leads to ghost partition devices lingering after medium removal
      is known to both the kernel and userland.  The behavior also creates a
      subtle inconsistency where an O_NONBLOCK open, which doesn't fail even if
      there's no medium, clears the ghost partitions, which is exploited to
      work around the problem from userland.
      
      Fix it by updating __blkdev_get() to issue partition rescan after
      -ENOMEDIA too.
      
      This was reported in the following bz.
      
       https://bugzilla.kernel.org/show_bug.cgi?id=13029
      
      Stable: 2.6.38
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: David Zeuthen <zeuthen@gmail.com>
      Reported-by: Martin Pitt <martin.pitt@ubuntu.com>
      Reported-by: Kay Sievers <kay.sievers@vrfy.org>
      Tested-by: Kay Sievers <kay.sievers@vrfy.org>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • powerpc/oprofile: Handle events that raise an exception without overflowing · 24fb3f4c
      Eric B Munson authored
      commit ad5d5292 upstream.
      
      Commit 0837e324 fixes a situation on POWER7
      where events can roll back if a speculative event doesn't actually complete.
      This can raise a performance monitor exception.  We need to catch this to ensure
      that we reset the PMC.  In all cases the PMC will be less than 256 cycles from
      overflow.
      
      This patch lifts Anton's fix for the problem in perf and applies it to oprofile
      as well.

      Signed-off-by: Eric B Munson <emunson@mgebm.net>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • powerpc/kexec: Fix memory corruption from unallocated slaves · c3bf5293
      Milton Miller authored
      commit 3d2cea73 upstream.
      
      Commit 1fc711f7 (powerpc/kexec: Fix race
      in kexec shutdown) moved the write to signal the cpu had exited the kernel
      from before the transition to real mode in kexec_smp_wait to kexec_wait.
      
      Unfortunately it missed that kexec_wait is used both by cpus leaving the
      kernel and by secondary slave cpus that were not allocated a paca for
      whatever reason -- they could be beyond nr_cpus or not described in
      the current device tree (for example, kexec-load was not refreshed
      after a cpu hotplug operation).  Cpus coming through that path will
      write to paca[NR_CPUS], which is beyond the space allocated for the
      paca data, and overwrite memory not allocated to pacas (but very
      likely still real mode accessible).
      
      Move the write back to kexec_smp_wait, which is used only by cpus that
      found their paca, but after the transition to real mode.
      Signed-off-by: Milton Miller <miltonm@bga.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      c3bf5293
    • steven finney's avatar
      Fix memory leak in cpufreq_stat · 301808c5
      steven finney authored
      commit 98586ed8 upstream.
      
      When a CPU is taken offline in an SMP system, cpufreq_remove_dev()
      nulls out the per-cpu policy before cpufreq_stats_free_table() can
      make use of it.  cpufreq_stats_free_table() then skips the
      call to sysfs_remove_group(), leaving about 100 bytes of sysfs-related
      memory unclaimed each time a CPU removal occurs. Break up
      cpufreq_stats_free_table() into sysfs and table portions, and
      call the sysfs portion early.
      Signed-off-by: Steven Finney <steven.finney@palm.com>
      Signed-off-by: Dave Jones <davej@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      301808c5
    • Jacob Shin's avatar
      CPU hotplug, re-create sysfs directory and symlinks · 1c6d0873
      Jacob Shin authored
      commit 27ecddc2 upstream.
      
      When we discover CPUs that are affected by each other's
      frequency/voltage transitions, the first CPU gets a sysfs directory
      created, and the rest of the siblings get symlinks. Currently, when we
      hotplug off only the first CPU, all of the symlinks and the sysfs
      directory get removed. Even though the rest of the siblings are still
      online and functional, they are orphaned, and no longer governed by
      cpufreq.
      
      This patch, given the above scenario, creates a sysfs directory for
      the first sibling and symlinks for the rest of the siblings.
      
      Please note the recursive call; it was rather too ugly to unroll.
      Also note the removal of the redundant NULL setting (it is already
      taken care of near the top of the function).
      Signed-off-by: Jacob Shin <jacob.shin@amd.com>
      Acked-by: Mark Langsdorf <mark.langsdorf@amd.com>
      Reviewed-by: Thomas Renninger <trenn@suse.de>
      Signed-off-by: Dave Jones <davej@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      1c6d0873
    • Catalin Marinas's avatar
      kmemleak: Do not return a pointer to an object that kmemleak did not get · b2300b3b
      Catalin Marinas authored
      commit 52c3ce4e upstream.
      
      The kmemleak_seq_next() function tries to get an object (and increment
      its use count) before returning it. If it could not get the last object
      during list traversal (because it may have been freed), the function
      should return NULL rather than a pointer to such object that it did not
      get.
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Reported-by: Phil Carmody <ext-phil.2.carmody@nokia.com>
      Acked-by: Phil Carmody <ext-phil.2.carmody@nokia.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      b2300b3b
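      The traversal rule described in the kmemleak entry above can be
      modelled in a few lines. This is only a sketch in Python, not the
      kernel code: TrackedObject, get_object() and seq_next() are
      hypothetical names, and a use count of 0 stands in for "already freed".

```python
class TrackedObject:
    """Toy stand-in for a tracked object with a use count (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.use_count = 1  # 0 models an object that has already been freed


def get_object(obj):
    """Try to take a reference; fail if the object is already gone."""
    if obj.use_count == 0:
        return False
    obj.use_count += 1
    return True


def seq_next(objects, index):
    """Return the next object we managed to get, or None."""
    for candidate in objects[index + 1:]:
        if get_object(candidate):
            return candidate
    # Nothing could be pinned: report the end of the sequence instead of
    # handing back the last candidate, which we never got a reference on.
    return None
```

      The point of the final return is that the caller only ever sees
      objects whose use count was actually incremented on its behalf.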
    • Steven Rostedt's avatar
      ftrace: Only update the function code on write to filter files · 66e69865
      Steven Rostedt authored
      commit 058e297d upstream.
      
      If function tracing is enabled, a read of the filter files will
      cause the call to stop_machine to update the function trace sites.
      It should only call stop_machine on write.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      66e69865
  2. 23 May, 2011 12 commits
    • Greg Kroah-Hartman's avatar
      Linux 2.6.32.41 · f9a11ede
      Greg Kroah-Hartman authored
      f9a11ede
    • Ben Hutchings's avatar
      netxen: Remove references to unified firmware file · 59b85444
      Ben Hutchings authored
      Commit c23a103f wrongly introduced
      references to the unified firmware file "phanfw.bin", which is not
      supported by netxen in 2.6.32.  The driver reports this filename when
      loading firmware from flash, and includes a MODULE_FIRMWARE hint for
      the filename even though it will never use it.
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      59b85444
    • Thomas Jarosch's avatar
      vmxnet3: Fix inconsistent LRO state after initialization · a7c1523f
      Thomas Jarosch authored
      commit ebde6f8a upstream.
      
      During initialization of vmxnet3, the state of LRO
      gets out of sync with netdev->features.
      
      This leads to very poor TCP performance in an IP forwarding
      setup and is hitting many VMware users.
      
      Simplified call sequence:
      1. vmxnet3_declare_features() initializes "adapter->lro" to true.
      
      2. The kernel automatically disables LRO if IP forwarding is enabled,
      so vmxnet3_set_flags() gets called. This also updates netdev->features.
      
      3. Now vmxnet3_setup_driver_shared() is called. "adapter->lro" is still
      set to true and LRO gets enabled again, even though
      netdev->features shows it's disabled.
      
      Fix it by updating "adapter->lro", too.
      
      The private vmxnet3 adapter flags are scheduled for removal
      in net-next, see commit a0d2730c
      "net: vmxnet3: convert to hw_features".
      
      Patch applies to 2.6.37 / 2.6.38 and 2.6.39-rc6.
      
      Please CC: comments.
      Signed-off-by: Thomas Jarosch <thomas.jarosch@intra2net.com>
      Acked-by: Stephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      a7c1523f
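      The three-step call sequence in the vmxnet3 entry above can be
      replayed with a small model. This is a hedged sketch in Python rather
      than driver code: the Adapter class, LRO_FLAG and the method names are
      illustrative assumptions that only mirror the roles of
      netdev->features, "adapter->lro", vmxnet3_set_flags() and
      vmxnet3_setup_driver_shared().

```python
LRO_FLAG = 0x1  # hypothetical feature bit, for illustration only


class Adapter:
    def __init__(self):
        self.features = LRO_FLAG  # public view (role of netdev->features)
        self.lro = True           # driver-private copy (role of adapter->lro)

    def set_flags_buggy(self, lro_on):
        # Updates only the public feature flags; the private copy goes stale.
        if lro_on:
            self.features |= LRO_FLAG
        else:
            self.features &= ~LRO_FLAG

    def set_flags_fixed(self, lro_on):
        self.set_flags_buggy(lro_on)
        self.lro = lro_on         # keep the private copy in sync

    def setup_driver_shared(self):
        # The device configuration is derived from the private copy.
        return self.lro
```

      With the buggy variant, disabling LRO clears the feature bit but
      setup_driver_shared() still reports LRO as enabled; with the fixed
      variant both views agree.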
    • Bjørn Mork's avatar
      megaraid_sas: Sanity check user supplied length before passing it to dma_alloc_coherent() · 1ff463a1
      Bjørn Mork authored
      commit 98cb7e44 upstream.
      
      The ioc->sgl[i].iov_len value is supplied by the ioctl caller, and can be
      zero in some cases.  Assume that's valid and continue without error.
      
      Fixes (multiple individual reports of the same problem for quite a while):
      
      http://marc.info/?l=linux-ide&m=128941801715301
      http://bugs.debian.org/604627
      http://www.mail-archive.com/linux-poweredge@dell.com/msg02575.html
      
      megasas: Failed to alloc kernel SGL buffer for IOCTL
      
      and
      
      [   69.162538] ------------[ cut here ]------------
      [   69.162806] kernel BUG at /build/buildd/linux-2.6.32/lib/swiotlb.c:368!
      [   69.163134] invalid opcode: 0000 [#1] SMP
      [   69.163570] last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
      [   69.163975] CPU 0
      [   69.164227] Modules linked in: fbcon tileblit font bitblit softcursor vga16fb vgastate ioatdma radeon ttm drm_kms_helper shpchp drm i2c_algo_bit lp parport floppy pata_jmicron megaraid_sas igb dca
      [   69.167419] Pid: 1206, comm: smartctl Tainted: G        W  2.6.32-25-server #45-Ubuntu X8DTN
      [   69.167843] RIP: 0010:[<ffffffff812c4dc5>]  [<ffffffff812c4dc5>] map_single+0x255/0x260
      [   69.168370] RSP: 0018:ffff88081c0ebc58  EFLAGS: 00010246
      [   69.168655] RAX: 000000000003bffc RBX: 00000000ffffffff RCX: 0000000000000002
      [   69.169000] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88001dffe000
      [   69.169346] RBP: ffff88081c0ebcb8 R08: 0000000000000000 R09: ffff880000030840
      [   69.169691] R10: 0000000000100000 R11: 0000000000000000 R12: 0000000000000000
      [   69.170036] R13: 00000000ffffffff R14: 0000000000000001 R15: 0000000000200000
      [   69.170382] FS:  00007fb8de189720(0000) GS:ffff88001de00000(0000) knlGS:0000000000000000
      [   69.170794] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   69.171094] CR2: 00007fb8dd59237c CR3: 000000081a790000 CR4: 00000000000006f0
      [   69.171439] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   69.171784] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      [   69.172130] Process smartctl (pid: 1206, threadinfo ffff88081c0ea000, task ffff88081a760000)
      [   69.194513] Stack:
      [   69.205788]  0000000000000034 00000002817e3390 0000000000000000 ffff88081c0ebe00
      [   69.217739] <0> 0000000000000000 000000000003bffc 0000000000000000 0000000000000000
      [   69.241250] <0> 0000000000000000 00000000ffffffff ffff88081c5b4080 ffff88081c0ebe00
      [   69.277310] Call Trace:
      [   69.289278]  [<ffffffff812c52ac>] swiotlb_alloc_coherent+0xec/0x130
      [   69.301118]  [<ffffffff81038b31>] x86_swiotlb_alloc_coherent+0x61/0x70
      [   69.313045]  [<ffffffffa002d0ce>] megasas_mgmt_fw_ioctl+0x1ae/0x690 [megaraid_sas]
      [   69.336399]  [<ffffffffa002d748>] megasas_mgmt_ioctl_fw+0x198/0x240 [megaraid_sas]
      [   69.359346]  [<ffffffffa002f695>] megasas_mgmt_ioctl+0x35/0x50 [megaraid_sas]
      [   69.370902]  [<ffffffff81153b12>] vfs_ioctl+0x22/0xa0
      [   69.382322]  [<ffffffff8115da2a>] ? alloc_fd+0x10a/0x150
      [   69.393622]  [<ffffffff81153cb1>] do_vfs_ioctl+0x81/0x410
      [   69.404696]  [<ffffffff8155cc13>] ? do_page_fault+0x153/0x3b0
      [   69.415761]  [<ffffffff811540c1>] sys_ioctl+0x81/0xa0
      [   69.426640]  [<ffffffff810121b2>] system_call_fastpath+0x16/0x1b
      [   69.437491] Code: fe ff ff 48 8b 3d 74 38 76 00 41 bf 00 00 20 00 e8 51 f5 d7 ff 83 e0 ff 48 05 ff 07 00 00 48 c1 e8 0b 48 89 45 c8 e9 13 fe ff ff <0f> 0b eb fe 0f 1f 80 00 00 00 00 55 48 89 e5 48 83 ec 20 4c 89
      [   69.478216] RIP  [<ffffffff812c4dc5>] map_single+0x255/0x260
      [   69.489668]  RSP <ffff88081c0ebc58>
      [   69.500975] ---[ end trace 6a2181b634e2abc7 ]---
      Reported-by: Bokhan Artem <aptem@ngs.ru>
      Reported by: Marc-Christian Petersen <m.c.p@gmx.de>
      Signed-off-by: Bjørn Mork <bjorn@mork.no>
      Cc: Michael Benz <Michael.Benz@lsi.com>
      Signed-off-by: James Bottomley <James.Bottomley@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      1ff463a1
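      The megaraid_sas entry above treats a zero iov_len as valid and
      continues without error. A minimal sketch of that idea in Python
      follows; alloc_sgl_buffers() and strict_alloc() are hypothetical
      names, with strict_alloc() standing in for an allocator that must
      never be called with a zero length.

```python
def strict_alloc(size):
    """Hypothetical allocator that rejects zero-length requests."""
    if size <= 0:
        raise ValueError("zero-length allocation")
    return bytearray(size)


def alloc_sgl_buffers(iov_lens, alloc=strict_alloc):
    """Allocate one buffer per non-empty entry of a caller-supplied list."""
    buffers = []
    for length in iov_lens:
        if length == 0:
            # A zero-length entry is accepted: there is nothing to map,
            # so skip the allocator instead of failing the whole request.
            buffers.append(None)
            continue
        buffers.append(alloc(length))
    return buffers
```

      The check sits before the allocation call, so the user-supplied
      length is validated before it can reach the allocator.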
    • Julia Lawall's avatar
      x86, mce, AMD: Fix leaving freed data in a list · c114f7a7
      Julia Lawall authored
      commit d9a5ac9e upstream.
      
      b may be added to a list, but is not removed before being freed
      in the case of an error.  This is done in the corresponding
      deallocation function, so the code here has been changed to
      follow that.
      
      The semantic match that finds this problem is as follows:
      (http://coccinelle.lip6.fr/)
      
      // <smpl>
      @@
      expression E,E1,E2;
      identifier l;
      @@
      
      *list_add(&E->l,E1);
      ... when != E1
          when != list_del(&E->l)
          when != list_del_init(&E->l)
          when != E = E2
      *kfree(E);
      // </smpl>
      Signed-off-by: Julia Lawall <julia@diku.dk>
      Cc: Borislav Petkov <borislav.petkov@amd.com>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
      Link: http://lkml.kernel.org/r/1305294731-12127-1-git-send-email-julia@diku.dk
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      c114f7a7
    • Youquan Song's avatar
      x86, apic: Fix spurious error interrupts triggering on all non-boot APs · 8e743dbf
      Youquan Song authored
      commit e503f9e4 upstream.
      
      This patch fixes a bug reported by a customer, who found
      that many unreasonable error interrupts were reported on all
      non-boot CPUs (APs) during the system boot stage.
      
      According to Chapter 10 of Intel Software Developer Manual
      Volume 3A, Local APIC may signal an illegal vector error when
      an LVT entry is set as an illegal vector value (0~15) under
      FIXED delivery mode (bits 8-11 are 0), regardless of whether
      the mask bit is set or an interrupt actually happens. These
      errors are seen as error interrupts.
      
      The initial value of thermal LVT entries on all APs always reads
      0x10000 because APs are woken up by BSP issuing INIT-SIPI-SIPI
      sequence to them and LVT registers are reset to 0s except for
      the mask bits which are set to 1s when APs receive INIT IPI.
      
      When the BIOS takes over the thermal throttling interrupt,
      the LVT thermal delivery mode should be SMI, and the kernel is
      required to keep the AP's LVT thermal monitoring register
      programmed as such as well.
      
      This issue happens when the BIOS does not take over the thermal
      throttling interrupt: the AP's LVT thermal monitor register is
      restored to 0x10000, which means vector 0 and fixed delivery mode,
      so all APs signal illegal vector error interrupts.
      
      This patch checks that the interrupt delivery mode is not fixed mode
      before restoring the AP's LVT thermal monitor register.
      Signed-off-by: Youquan Song <youquan.song@intel.com>
      Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Acked-by: Yong Wang <yong.y.wang@intel.com>
      Cc: hpa@linux.intel.com
      Cc: joe@perches.com
      Cc: jbaron@redhat.com
      Cc: trenn@suse.de
      Cc: kent.liu@intel.com
      Cc: chaohong.guo@intel.com
      Link: http://lkml.kernel.org/r/1303402963-17738-1-git-send-email-youquan.song@intel.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      8e743dbf
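      To see why the value 0x10000 discussed in the apic entry above reads
      as "vector 0, fixed delivery mode, masked", the register value can be
      decoded by hand. The sketch below is Python, not kernel code; the
      constant and function names are made up for illustration, and it
      assumes the usual LVT layout with the vector in bits 0-7, the
      delivery mode in bits 8-10 and the mask in bit 16.

```python
VECTOR_MASK = 0xFF          # bits 0-7
DELIVERY_MODE_SHIFT = 8
DELIVERY_MODE_MASK = 0x7    # bits 8-10 (assumed layout, see lead-in)
MASKED_BIT = 1 << 16

FIXED = 0b000
SMI = 0b010


def decode_lvt(value):
    """Split an LVT register value into the fields discussed above."""
    return {
        "vector": value & VECTOR_MASK,
        "delivery_mode": (value >> DELIVERY_MODE_SHIFT) & DELIVERY_MODE_MASK,
        "masked": bool(value & MASKED_BIT),
    }


def should_restore(saved_value):
    """Restore the saved value only if it does not select fixed mode."""
    return decode_lvt(saved_value)["delivery_mode"] != FIXED
```

      Under these assumptions should_restore(0x10000) is false, so the
      reset value is never written back, while a value programmed for SMI
      delivery (for example 0x10200) is restored.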
    • Thomas Gleixner's avatar
      tick: Clear broadcast active bit when switching to oneshot · f56541cf
      Thomas Gleixner authored
      commit 07f4beb0 upstream.
      
      The first cpu which switches from periodic to oneshot mode switches
      also the broadcast device into oneshot mode. The broadcast device
      serves as a backup for per cpu timers which stop in deeper
      C-states. To avoid starvation of the cpus which might be in idle and
      depend on broadcast mode it marks the other cpus as broadcast active
      and sets the broadcast expiry value of those cpus to the next tick.
      
      The oneshot mode broadcast bit for the other cpus is sticky and only
      gets cleared when those cpus exit idle. If a cpu was not idle while
      the bit got set, the bit consequently prevents the broadcast device
      from being armed on behalf of that cpu when it enters idle for the
      first time after it switched to oneshot mode.
      
      In most cases that goes unnoticed as one of the other cpus has usually
      a timer pending which keeps the broadcast device armed with a short
      timeout. Now if the only cpu which has a short timer active has the
      bit set then the broadcast device will not be armed on behalf of that
      cpu and will fire way after the expected timer expiry. In the case of
      Christian's bug report it took ~145 seconds which is about half of the
      wrap around time of HPET (the limit for that device) due to the fact
      that all other cpus had no timers armed which expired before the 145
      seconds timeframe.
      
      The solution is simply to clear the broadcast active bit
      unconditionally when a cpu switches to oneshot mode after the first
      cpu switched the broadcast device over. It's not idle at that point
      otherwise it would not be executing that code.
      
      [ I fundamentally hate that broadcast crap. Why the heck thought some
        folks that when going into deep idle it's a brilliant concept to
        switch off the last device which brings the cpu back from that
        state? ]
      
      Thanks to Christian for providing all the valuable debug information!
      Reported-and-tested-by: Christian Hoffmann <email@christianhoffmann.info>
      Cc: John Stultz <johnstul@us.ibm.com>
      Link: http://lkml.kernel.org/r/%3Calpine.LFD.2.02.1105161105170.3078%40ionos%3E
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      f56541cf
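      The "~145 seconds, about half of the HPET wrap around time" figure in
      the tick entry above can be sanity-checked with a little arithmetic.
      The numbers below are assumptions, not taken from the bug report: a
      32-bit HPET counter running at the commonly seen 14.31818 MHz.

```python
HPET_HZ = 14_318_180   # assumed typical HPET frequency, not from the report
COUNTER_BITS = 32      # assumed 32-bit counter

# Time until a free-running counter of this width wraps around.
wrap_seconds = 2 ** COUNTER_BITS / HPET_HZ
half_wrap_seconds = wrap_seconds / 2

print(f"wrap: {wrap_seconds:.1f} s, half: {half_wrap_seconds:.1f} s")
```

      Under these assumptions the counter wraps after roughly 300 seconds,
      so half of that is about 150 seconds, which is in the same range as
      the delay reported above.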
    • john stultz's avatar
      clocksource: Install completely before selecting · a83b90b7
      john stultz authored
      commit e05b2efb upstream.
      
      Christian Hoffmann reported that the command line clocksource override
      with acpi_pm timer fails:
      
       Kernel command line: <SNIP> clocksource=acpi_pm
       hpet clockevent registered
       Switching to clocksource hpet
       Override clocksource acpi_pm is not HRT compatible.
       Cannot switch while in HRT/NOHZ mode.
      
      The watchdog code is what enables CLOCK_SOURCE_VALID_FOR_HRES, but we
      actually end up selecting the clocksource before we enqueue it into
      the watchdog list, so that's why we see the warning and fail to switch
      to acpi_pm timer as requested. That's particularly bad when we want to
      debug timekeeping related problems in early boot.
      
      Put the selection call last.
      Reported-by: Christian Hoffmann <email@christianhoffmann.info>
      Signed-off-by: John Stultz <johnstul@us.ibm.com>
      Link: http://lkml.kernel.org/r/%3C1304558210.2943.24.camel%40work-vm%3E
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      a83b90b7
    • Borislav Petkov's avatar
      x86, AMD: Fix ARAT feature setting again · d9f6223c
      Borislav Petkov authored
      commit 14fb57dc upstream.
      
      Trying to enable the local APIC timer on early K8 revisions
      uncovers a number of other issues with it, in conjunction with
      the C1E enter path on AMD. Fixing those causes much more churn
      and trouble than the benefit of using that timer brings, so
      don't enable it on K8 at all, falling back to the original
      functionality the kernel had wrt that.
      Reported-and-bisected-by: Nick Bowler <nbowler@elliptictech.com>
      Cc: Boris Ostrovsky <Boris.Ostrovsky@amd.com>
      Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
      Cc: Greg Kroah-Hartman <greg@kroah.com>
      Cc: Hans Rosenfeld <hans.rosenfeld@amd.com>
      Cc: Nick Bowler <nbowler@elliptictech.com>
      Cc: Joerg-Volker-Peetz <jvpeetz@web.de>
      Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
      Link: http://lkml.kernel.org/r/1305636919-31165-3-git-send-email-bp@amd64.org
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      d9f6223c
    • Borislav Petkov's avatar
      Revert "x86, AMD: Fix APIC timer erratum 400 affecting K8 Rev.A-E processors" · 88c38c2b
      Borislav Petkov authored
      commit 328935e6 upstream.
      
      This reverts commit e20a2d20, as it crashes
      certain boxes with specific AMD CPU models.
      
      Moving the lower endpoint of the Erratum 400 check to accommodate
      earlier K8 revisions (A-E) opens a can of worms which is simply
      not worth fixing properly by tweaking the errata checking
      framework:
      
      * missing IntPending MSR on revisions < CG causes #GP:
      
      http://marc.info/?l=linux-kernel&m=130541471818831
      
      * makes earlier revisions use the LAPIC timer instead of the C1E
      idle routine which switches to HPET, thus not waking up in
      deeper C-states:
      
      http://lkml.org/lkml/2011/4/24/20
      
      Therefore, leave the original boundary starting with K8-revF.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      88c38c2b
    • Jeff Layton's avatar
      cifs: add fallback in is_path_accessible for old servers · 46042172
      Jeff Layton authored
      commit 221d1d79 upstream.
      
      The is_path_accessible check uses a QPathInfo call, which isn't
      supported by ancient win9x era servers. Fall back to an older
      SMBQueryInfo call if it fails with the magic error codes.
      Reported-and-Tested-by: Sandro Bonazzola <sandro.bonazzola@gmail.com>
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Steve French <sfrench@us.ibm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      46042172
    • Geert Uytterhoeven's avatar
      zorro8390: Fix regression caused during net_device_ops conversion · 6518ee48
      Geert Uytterhoeven authored
      commit cf7e032f upstream.
      
      Changeset b6114794 ("zorro8390: convert to
      net_device_ops") broke zorro8390 by adding 8390.o to the link. That
      meant that lib8390.c was included twice, once in zorro8390.c and once in
      8390.c, subject to different macros. This patch reverts that by
      avoiding the wrappers in 8390.c.
      
      Fix based on commits 217cbfa8 ("mac8390:
      fix regression caused during net_device_ops conversion") and
      4e0168fa ("mac8390: fix build with
      NET_POLL_CONTROLLER").
      Reported-by: Christian T. Steigies <cts@debian.org>
      Suggested-by: Finn Thain <fthain@telegraphics.com.au>
      Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Tested-by: Christian T. Steigies <cts@debian.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      6518ee48