1. 23 Sep, 2012 5 commits
    • Use get_online_cpus to avoid races involving CPU hotplug · a2db672a
      Silas Boyd-Wickizer authored
      If arch/x86/kernel/msr.c is a module, a CPU might offline or online
      between the for_each_online_cpu(i) loop and the call to
      register_hotcpu_notifier in msr_init or the call to
      unregister_hotcpu_notifier in msr_exit. The potential races can lead
      to leaks/duplicates, attempts to destroy non-existent devices, or
      random pointer dereferences.
      
      For example, in msr_init if:
      
              for_each_online_cpu(i) {
                      err = msr_device_create(i);
                      if (err != 0)
                              goto out_class;
              }
              <----- CPU offlines
              register_hotcpu_notifier(&msr_class_cpu_notifier);
      
      and the CPU never onlines before msr_exit, then the module will never
      call msr_device_destroy for the associated CPU.
      
      This fix surrounds for_each_online_cpu and register_hotcpu_notifier or
      unregister_hotcpu_notifier with get_online_cpus+put_online_cpus.
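      
      The race and the fix can be sketched in user space. This is a toy,
      single-threaded model, not the msr.c code: the struct, the deferred
      offline request and every function name below are assumptions made for
      illustration. Holding the model's "online CPUs" lock delays the hotplug
      event until after the notifier is registered, so the notifier sees it.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Toy model of the module state; all names here are illustrative. */
struct model {
	bool online[NR_CPUS];     /* which CPUs are online */
	bool device[NR_CPUS];     /* which CPUs have a device node */
	bool notifier_registered; /* hotplug notifier installed? */
	int hotplug_readers;      /* >0 while the online-CPUs lock is held */
	int pending_offline;      /* CPU whose offline is deferred, or -1 */
};

/* Take a CPU offline; the notifier only sees events after registration. */
void cpu_offline_apply(struct model *m, int cpu)
{
	m->online[cpu] = false;
	if (m->notifier_registered)
		m->device[cpu] = false;
}

/* A hotplug request is deferred while a reader holds the lock. */
void cpu_offline(struct model *m, int cpu)
{
	if (m->hotplug_readers > 0)
		m->pending_offline = cpu;
	else
		cpu_offline_apply(m, cpu);
}

void model_get_online_cpus(struct model *m)
{
	m->hotplug_readers++;
}

void model_put_online_cpus(struct model *m)
{
	assert(m->hotplug_readers > 0);
	if (--m->hotplug_readers == 0 && m->pending_offline >= 0) {
		int cpu = m->pending_offline;

		m->pending_offline = -1;
		cpu_offline_apply(m, cpu);
	}
}

/* Module init; offline_cpu goes away between the loop and registration. */
void module_init_model(struct model *m, bool use_lock, int offline_cpu)
{
	if (use_lock)
		model_get_online_cpus(m);
	for (int i = 0; i < NR_CPUS; i++)
		if (m->online[i])
			m->device[i] = true;
	cpu_offline(m, offline_cpu); /* the racing hotplug event */
	m->notifier_registered = true;
	if (use_lock)
		model_put_online_cpus(m);
}

/* Module exit: destroy the devices of CPUs that are online right now. */
void module_exit_model(struct model *m)
{
	for (int i = 0; i < NR_CPUS; i++)
		if (m->online[i])
			m->device[i] = false;
	m->notifier_registered = false;
}

/* Returns the number of device nodes left behind after init + exit. */
int leaked_devices(bool use_lock)
{
	struct model m = { .pending_offline = -1 };
	int leaked = 0;

	for (int i = 0; i < NR_CPUS; i++)
		m.online[i] = true;
	module_init_model(&m, use_lock, 2);
	module_exit_model(&m);
	for (int i = 0; i < NR_CPUS; i++)
		leaked += m.device[i];
	return leaked;
}
```

      In this model leaked_devices(false) leaves one device behind, while
      leaked_devices(true) leaves none.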
      
      Tested on a VM.
      Signed-off-by: Silas Boyd-Wickizer <sbw@mit.edu>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      a2db672a
    • sched: Fix load avg vs cpu-hotplug · 5d180232
      Peter Zijlstra authored
      Rakib and Paul reported two different issues related to the same few
      lines of code.
      
      Rakib's issue is that the nr_uninterruptible migration code is wrong in
      that he sees artifacts due to this (Rakib, please do expand in more
      detail).
      
      Paul's issue is that this code as it stands relies on us using
      stop_machine() for unplug. We would all like to remove this assumption
      so that eventually we can remove this stop_machine() usage altogether.
      
      The only reason we'd have to migrate nr_uninterruptible is so that we
      could use for_each_online_cpu() loops in favour of
      for_each_possible_cpu() loops. However, since nr_uninterruptible() is the
      only such loop and it's using possible, let's not bother at all.
      
      The problem Rakib sees is (probably) caused by the fact that by
      migrating nr_uninterruptible we screw up rq->calc_load_active for both rqs
      involved.
      
      So don't bother with fancy migration schemes (meaning we now have to
      keep using for_each_possible_cpu()) and instead fold any nr_active delta
      after we migrate all tasks away to make sure we don't have any skewed
      nr_active accounting.
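      
      The "fold the delta" idea can be sketched as follows. This is a
      simplified user-space model: the struct, the global counter and the
      function names are assumptions for illustration and omit the locking
      and atomics the scheduler would need.

```c
#include <assert.h>

/* Simplified stand-in for a per-CPU runqueue (illustrative only). */
struct rq_model {
	long nr_running;
	long nr_uninterruptible;
	long calc_load_active; /* contribution last folded into the global sum */
};

long calc_load_tasks; /* global count of active tasks */

/* Return the change in this runqueue's active count since the last fold. */
long fold_active(struct rq_model *rq)
{
	long nr_active = rq->nr_running + rq->nr_uninterruptible;
	long delta = nr_active - rq->calc_load_active;

	rq->calc_load_active = nr_active;
	return delta;
}

/* Called once the dying CPU's tasks have been migrated away. */
void fold_on_cpu_dead(struct rq_model *rq)
{
	calc_load_tasks += fold_active(rq);
}

/* Worked example: returns the global count after the CPU goes down. */
long example_after_unplug(void)
{
	struct rq_model dying = { .nr_running = 3, .nr_uninterruptible = 1 };

	calc_load_tasks = 0;
	fold_on_cpu_dead(&dying); /* regular fold while tasks are present: +4 */
	dying.nr_running = 0;     /* all runnable tasks migrated away */
	fold_on_cpu_dead(&dying); /* final fold removes the stale 3 */
	return calc_load_tasks;
}
```

      In this sketch nr_uninterruptible stays on the dead CPU's runqueue, so
      a sum over all possible CPUs still accounts for it.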
      
      [ paulmck: Move call to calc_load_migration to CPU_DEAD to avoid
      miscounting noted by Rakib. ]
      Reported-by: Rakib Mullick <rakib.mullick@gmail.com>
      Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      5d180232
    • rcu: Disallow callback registry on offline CPUs · 0d8ee37e
      Paul E. McKenney authored
      Posting a callback after the CPU_DEAD notifier effectively leaks
      that callback unless/until that CPU comes back online.  Silence is
      unhelpful when attempting to track down such leaks, so this commit emits
      a WARN_ON_ONCE() and unconditionally leaks the callback when an offline
      CPU attempts to register a callback.  The rdp->nxttail[RCU_NEXT_TAIL] is
      set to NULL in the CPU_DEAD notifier and restored in the CPU_UP_PREPARE
      notifier, allowing _call_rcu() to determine exactly when posting callbacks
      is illegal.
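      
      A minimal user-space sketch of the NULL-tail-pointer check described
      above. The types and function names are hypothetical and the model keeps
      only what the check needs; the warned flag stands in for
      WARN_ON_ONCE().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct cb {
	struct cb *next;
};

enum { DONE_TAIL, WAIT_TAIL, NEXT_READY_TAIL, NEXT_TAIL, NR_TAILS };

/* Per-CPU callback list with one tail pointer per segment (illustrative). */
struct cpu_cbs {
	struct cb *list;
	struct cb **tail[NR_TAILS];
	int warned; /* stands in for WARN_ON_ONCE() having fired */
};

/* Models the CPU_UP_PREPARE step: an empty list, all tails valid again. */
void cbs_online(struct cpu_cbs *c)
{
	c->list = NULL;
	for (int i = 0; i < NR_TAILS; i++)
		c->tail[i] = &c->list;
}

/* Models the CPU_DEAD step: a NULL tail marks posting as illegal. */
void cbs_dead(struct cpu_cbs *c)
{
	c->list = NULL;
	c->tail[NEXT_TAIL] = NULL;
}

/* Returns false, and leaks cb on purpose, if the CPU is marked dead. */
bool post_callback(struct cpu_cbs *c, struct cb *cb)
{
	cb->next = NULL;
	if (c->tail[NEXT_TAIL] == NULL) {
		c->warned = 1;
		return false;
	}
	*c->tail[NEXT_TAIL] = cb;
	c->tail[NEXT_TAIL] = &cb->next;
	return true;
}

/* Post while online, while dead, and after coming back online. */
bool demo_offline_post_rejected(void)
{
	struct cpu_cbs c = {0};
	struct cb a, b, d;
	bool ok1, ok2, ok3;

	cbs_online(&c);
	ok1 = post_callback(&c, &a);
	cbs_dead(&c);
	ok2 = post_callback(&c, &b);
	cbs_online(&c);
	ok3 = post_callback(&c, &d);
	return ok1 && !ok2 && ok3 && c.warned;
}
```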
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
      0d8ee37e
    • rcu: Remove _rcu_barrier() dependency on __stop_machine() · 1331e7a1
      Paul E. McKenney authored
      Currently, _rcu_barrier() relies on preempt_disable() to prevent
      any CPU from going offline, which in turn depends on CPU hotplug's
      use of __stop_machine().
      
      This patch therefore makes _rcu_barrier() use get_online_cpus() to
      block CPU-hotplug operations.  This has the added benefit of removing
      the need for _rcu_barrier() to adopt callbacks:  Because CPU-hotplug
      operations are excluded, there can be no callbacks to adopt.  This
      commit simplifies the code accordingly.
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
      1331e7a1
    • rcu: Fix day-one dyntick-idle stall-warning bug · a10d206e
      Paul E. McKenney authored
      Each grace period is supposed to have at least one callback waiting
      for that grace period to complete.  However, if CONFIG_NO_HZ=n, an
      extra callback-free grace period is no big problem -- it will chew up
      a tiny bit of CPU time, but it will complete normally.  In contrast,
      CONFIG_NO_HZ=y kernels have the potential for all the CPUs to go to
      sleep indefinitely, in turn indefinitely delaying completion of the
      callback-free grace period.  Given that nothing is waiting on this grace
      period, this is also not a problem.
      
      That is, unless RCU CPU stall warnings are also enabled, as they are
      in recent kernels.  In this case, if a CPU wakes up after at least one
      minute of inactivity, an RCU CPU stall warning will result.  The reason
      that no one noticed until quite recently is that most systems have enough
      OS noise that they will never remain absolutely idle for a full minute.
      But there are some embedded systems with cut-down userspace configurations
      that consistently get into this situation.
      
      All this raises the question of exactly how a callback-free grace period
      gets started in the first place.  This can happen due to the fact that
      CPUs do not necessarily agree on which grace period is in progress.
      If a CPU still believes that the grace period that just completed is
      still ongoing, it will believe that it has callbacks that need to wait for
      another grace period, never mind the fact that the grace period that they
      were waiting for just completed.  This CPU can therefore erroneously
      decide to start a new grace period.  Note that this can happen in
      TREE_RCU and TREE_PREEMPT_RCU even on a single-CPU system:  Deadlock
      considerations mean that the CPU that detected the end of the grace
      period is not necessarily officially informed of this fact for some time.
      
      Once this CPU notices that the earlier grace period completed, it will
      invoke its callbacks.  It then won't have any callbacks left.  If no
      other CPU has any callbacks, we now have a callback-free grace period.
      
      This commit therefore makes CPUs check more carefully before starting a
      new grace period.  This new check relies on an array of tail pointers
      into each CPU's list of callbacks.  If the CPU is up to date on which
      grace periods have completed, it checks to see if any callbacks follow
      the RCU_DONE_TAIL segment, otherwise it checks to see if any callbacks
      follow the RCU_WAIT_TAIL segment.  The reason that this works is that
      the RCU_WAIT_TAIL segment will be promoted to the RCU_DONE_TAIL segment
      as soon as the CPU is officially notified that the old grace period
      has ended.
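      
      The segment-selection rule described above can be sketched like this.
      It is a simplified user-space model with hypothetical names, not the
      kernel's cpu_needs_another_gp(): an up-to-date CPU looks past the
      DONE segment, a stale CPU looks past the WAIT segment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct cb {
	struct cb *next;
};

enum { DONE_TAIL, WAIT_TAIL, NEXT_READY_TAIL, NEXT_TAIL, NR_TAILS };

/* Per-CPU view: segmented callback list plus last known completed GP. */
struct cpu_state {
	struct cb *list;
	struct cb **tail[NR_TAILS];
	unsigned long completed;
};

/* Does this CPU have callbacks that need a grace period not yet started? */
bool needs_another_gp(const struct cpu_state *c,
		      unsigned long global_completed, bool gp_in_progress)
{
	int seg = (c->completed == global_completed) ? DONE_TAIL : WAIT_TAIL;

	assert(c->tail[seg] != NULL);
	return *c->tail[seg] != NULL && !gp_in_progress;
}

/* Build a list holding a single callback in the WAIT segment. */
void one_waiting_callback(struct cpu_state *c, struct cb *a)
{
	a->next = NULL;
	c->list = a;
	c->tail[DONE_TAIL] = &c->list;
	c->tail[WAIT_TAIL] = &a->next;
	c->tail[NEXT_READY_TAIL] = &a->next;
	c->tail[NEXT_TAIL] = &a->next;
}

/* Stale CPU: its one callback waited for the GP that already ended. */
bool stale_cpu_starts_gp(void)
{
	struct cpu_state c;
	struct cb a;

	one_waiting_callback(&c, &a);
	c.completed = 4;
	return needs_another_gp(&c, 5, false);
}

/* Up-to-date CPU: the same callback really does need a new GP. */
bool current_cpu_starts_gp(void)
{
	struct cpu_state c;
	struct cb a;

	one_waiting_callback(&c, &a);
	c.completed = 5;
	return needs_another_gp(&c, 5, false);
}
```

      In this model the stale CPU declines to start a grace period, which
      avoids the callback-free grace period described above.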
      
      This change is to cpu_needs_another_gp(), which is called in a number
      of places.  The only one that really matters is in rcu_start_gp(), where
      the root rcu_node structure's ->lock is held, which prevents any
      other CPU from starting or completing a grace period, so that the
      comparison that determines whether the CPU is missing the completion
      of a grace period is stable.
      Reported-by: Becky Bruce <bgillbruce@gmail.com>
      Reported-by: Subodh Nijsure <snijsure@grid-net.com>
      Reported-by: Paul Walmsley <paul@pwsan.com>
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: Paul Walmsley <paul@pwsan.com>  # OMAP3730, OMAP4430
      Cc: stable@vger.kernel.org
      a10d206e
  2. 12 Sep, 2012 1 commit
  3. 08 Sep, 2012 3 commits
    • Linux 3.6-rc5 · 55d512e2
      Linus Torvalds authored
      55d512e2
    • Merge branch 'fixes-for-3.6' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping · 32d687ca
      Linus Torvalds authored
      Pull DMA-mapping fixes from Marek Szyprowski:
       "Another set of fixes for ARM dma-mapping subsystem.
      
        Commit e9da6e99 replaced custom consistent buffer remapping code
        with generic vmalloc areas.  It however introduced some regressions
        caused by limited support for allocations in atomic context.  This
        series contains fixes for those regressions.
      
        For some subplatforms the default, pre-allocated pool for atomic
        allocations turned out to be too small, so a function for setting its
        size has been added.
      
        Another set of patches adds support for atomic allocations to
        IOMMU-aware DMA-mapping implementation.
      
        The last part of this pull request contains two fixes for Contiguous
        Memory Allocator, which relax too strict requirements."
      
      * 'fixes-for-3.6' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping:
        ARM: dma-mapping: IOMMU allocates pages from atomic_pool with GFP_ATOMIC
        ARM: dma-mapping: Introduce __atomic_get_pages() for __iommu_get_pages()
        ARM: dma-mapping: Refactor out to introduce __in_atomic_pool
        ARM: dma-mapping: atomic_pool with struct page **pages
        ARM: Kirkwood: increase atomic coherent pool size
        ARM: DMA-Mapping: print warning when atomic coherent allocation fails
        ARM: DMA-Mapping: add function for setting coherent pool size from platform code
        ARM: relax conditions required for enabling Contiguous Memory Allocator
        mm: cma: fix alignment requirements for contiguous regions
      32d687ca
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · 11be4bc6
      Linus Torvalds authored
      Pull input subsystem updates from Dmitry Torokhov.
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
        Input: wacom - add support for EMR on Cintiq 24HD touch
        Input: i8042 - add Gigabyte T1005 series netbooks to noloop table
        Input: imx_keypad - reset the hardware before enabling
        Input: edt-ft5x06 - fix build error when compiling without CONFIG_DEBUG_FS
      11be4bc6
  4. 07 Sep, 2012 4 commits
  5. 06 Sep, 2012 14 commits
    • Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · eeea3ac9
      Linus Torvalds authored
      Pull ARM SoC bug fixes from Olof Johansson:
       "Mostly Renesas and Atmel bugfixes this time, targeting boot and build
        problems.  A couple of patches for gemini and kirkwood as well.  On a
        whole nothing very controversial."
      
      * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
        ARM: gemini: fix the gemini build
        ARM: shmobile: armadillo800eva: enable rw rootfs mount
        ARM: Kirkwood: Fix 'SZ_1M' undeclared here for db88f6281-bp-setup.c
        ARM: shmobile: mackerel: fixup usb module order
        ARM: shmobile: armadillo800eva: fixup: sound card detection order
        ARM: shmobile: marzen: fixup smsc911x id for regulator
        ARM: at91/feature-removal-schedule: delay at91_mci removal
        ARM: mach-shmobile: armadillo800eva: Enable power button as wakeup source
        ARM: mach-shmobile: armadillo800eva: Fix GPIO buttons descriptions
        ARM: at91/dts: remove partial parameter in at91sam9g25ek.dts
        ARM: at91/clock: fix PLLA overclock warning
        ARM: at91: fix rtc-at91sam9 irq issue due to sparse irq support
        ARM: at91: fix system timer irq issue due to sparse irq support
        ARM: shmobile: sh73a0: fixup RELOC_BASE of intca_irq_pins_desc
      eeea3ac9
    • Merge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging · c7c6bf1e
      Linus Torvalds authored
      Pull a hwmon fix from Guenter Roeck:
       "One patch, fixing DIV_ROUND_CLOSEST to support negative dividends.
      
        While the changes are not in the drivers/hwmon directory, the problem
        primarily affects hwmon drivers, and it makes sense to push the patch
        through the hwmon tree."
      
      * tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
        linux/kernel.h: Fix DIV_ROUND_CLOSEST to support negative dividends
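      
      A short sketch of why negative dividends need separate handling. The
      kernel macro is type-generic; the int-only functions below are a
      simplification with illustrative names, and they assume a positive
      divisor.

```c
#include <assert.h>

/* Naive rounding: always adds half the divisor, wrong for negative x. */
int div_round_closest_naive(int x, int d)
{
	return (x + d / 2) / d;
}

/*
 * Sign-aware rounding: move away from zero by half the divisor before the
 * truncating division. Assumes d > 0.
 */
int div_round_closest(int x, int d)
{
	assert(d > 0);
	return x >= 0 ? (x + d / 2) / d : (x - d / 2) / d;
}
```

      For example, -5 / 3 is about -1.67: the naive form yields -1, while the
      sign-aware form yields -2.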
      c7c6bf1e
    • Merge branch 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild · bd12ce8c
      Linus Torvalds authored
      Pull kbuild fixes from Michal Marek:
       "These are two fixes that should go into 3.6.  The link-vmlinux.sh one
        is obvious.
      
        The other one fixes make firmware_install with certain configurations,
        where a file in the toplevel firmware tree gets installed first, and
        $(INSTALL_FW_PATH)/$$(dir <file>) results in /lib/firmware/./, which
        confuses make 3.82 for some reason."
      
      * 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
        firmware: fix directory creation rule matching with make 3.82
        link-vmlinux.sh: Fix stray "echo" in error message
      bd12ce8c
    • Remove user-triggerable BUG from mpol_to_str · 80de7c31
      Dave Jones authored
      Trivially triggerable, found by trinity:
      
        kernel BUG at mm/mempolicy.c:2546!
        Process trinity-child2 (pid: 23988, threadinfo ffff88010197e000, task ffff88007821a670)
        Call Trace:
          show_numa_map+0xd5/0x450
          show_pid_numa_map+0x13/0x20
          traverse+0xf2/0x230
          seq_read+0x34b/0x3e0
          vfs_read+0xac/0x180
          sys_pread64+0xa2/0xc0
          system_call_fastpath+0x1a/0x1f
        RIP: mpol_to_str+0x156/0x360
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Dave Jones <davej@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      80de7c31
    • xen/pciback: Fix proper FLR steps. · 80ba77df
      Konrad Rzeszutek Wilk authored
      When we did the FLR and saved the PCI config, we did it in the wrong order.
      The end result was that if a PCI device was unbound from
      its driver, then bound to xen-pciback, and then bound back to its
      driver, we would get:
      
      > lspci -s 04:00.0
      04:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
      13:42:12 # 4 :~/
      > echo "0000:04:00.0" > /sys/bus/pci/drivers/pciback/unbind
      > modprobe e1000e
      e1000e: Intel(R) PRO/1000 Network Driver - 2.0.0-k
      e1000e: Copyright(c) 1999 - 2012 Intel Corporation.
      e1000e 0000:04:00.0: Disabling ASPM L0s L1
      e1000e 0000:04:00.0: enabling device (0000 -> 0002)
      xen: registering gsi 48 triggering 0 polarity 1
      Already setup the GSI :48
      e1000e 0000:04:00.0: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
      e1000e: probe of 0000:04:00.0 failed with error -2
      
      This fixes it by first saving the PCI configuration space, then
      doing the FLR.
      Reported-by: Ren, Yongjie <yongjie.ren@intel.com>
      Reported-and-Tested-by: Tobias Geiger <tobias.geiger@vido.info>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      CC: stable@vger.kernel.org
      80ba77df
    • Merge tag 'mmc-fixes-for-3.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc · 08090950
      Linus Torvalds authored
      Pull MMC fixes from Chris Ball:
       - a firmware bug on several Samsung MoviNAND eMMC models causes
         permanent corruption on the device when secure erase and secure trim
         requests are made, so we disable those requests on these eMMC devices.
       - atmel-mci: fix a hang with some SD cards by waiting for not-busy flag.
       - dw_mmc: low-power mode breaks SDIO interrupts; fix PIO error handling;
         fix handling of error interrupts.
       - mxs-mmc: fix deadlocks; fix compile error due to dma.h arch change.
       - omap: fix broken PIO mode causing memory corruption.
       - sdhci-esdhc: fix card detection.
      
      * tag 'mmc-fixes-for-3.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc:
        mmc: omap: fix broken PIO mode
        mmc: card: Skip secure erase on MoviNAND; causes unrecoverable corruption.
        mmc: dw_mmc: Disable low power mode if SDIO interrupts are used
        mmc: dw_mmc: fix error handling in PIO mode
        mmc: dw_mmc: correct mishandling error interrupt
        mmc: dw_mmc: amend using error interrupt status
        mmc: atmel-mci: not busy flag has also to be used for read operations
        mmc: sdhci-esdhc: break out early if clock is 0
        mmc: mxs-mmc: fix deadlock caused by recursion loop
        mmc: mxs-mmc: fix deadlock in SDIO IRQ case
        mmc: bfin_sdh: fix dma_desc_array build error
      08090950
    • uml: fix compile error in deliver_alarm() · bc6c8364
      Miklos Szeredi authored
      Fix the following compile error on UML.
      
        arch/um/os-Linux/time.c: In function 'deliver_alarm':
        arch/um/os-Linux/time.c:117:3: error: too few arguments to function 'alarm_handler'
        arch/um/os-Linux/internal.h:1:6: note: declared here
      
      The error was introduced by commit d3c1cfcd ("um: pass siginfo to guest
      process") in 3.6-rc1.
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      CC: Martin Pärtel <martin.partel@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      bc6c8364
    • dj: memory scribble in logi_dj · 8a55ade7
      Alan Cox authored
      Allocate a structure, not a pointer to it!
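      
      The bug class is sizing an allocation by the pointer instead of by the
      object it points to. A minimal sketch with a hypothetical structure
      (the struct and function names are not taken from the driver):

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical device structure; the fields are illustrative only. */
struct dj_dev_model {
	int id;
	char name[32];
	unsigned long flags;
};

/* Bug pattern: this is the size of the pointer, not of the structure. */
size_t pointer_alloc_size(void)
{
	struct dj_dev_model *dev = NULL;

	return sizeof(dev);
}

/* Fix: size the allocation by what the pointer refers to. */
size_t struct_alloc_size(void)
{
	struct dj_dev_model *dev = NULL;

	return sizeof(*dev);
}

/* Zeroed allocation of the full structure; the caller must free() it. */
struct dj_dev_model *alloc_dev(void)
{
	struct dj_dev_model *dev = calloc(1, sizeof(*dev));

	return dev;
}
```

      Writing to any field beyond the first few bytes of an undersized
      allocation scribbles over neighbouring memory, which matches the
      "memory scribble" in the subject line.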
      Signed-off-by: Alan Cox <alan@linux.intel.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      8a55ade7
    • Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc · cb4f9a29
      Linus Torvalds authored
      Pull powerpc fixes from Benjamin Herrenschmidt:
       "Here are a few fixes for 3.6 that were piling up while I was away or
        busy (I was mostly MIA a week or two before San Diego).
      
        Some fixes from Anton fixing up issues with our relatively new DSCR
        control feature, and a few other fixes that are either regressions or
        bugs nasty enough to warrant not waiting."
      
      * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
        powerpc: Don't use __put_user() in patch_instruction
        powerpc: Make sure IPI handlers see data written by IPI senders
        powerpc: Restore correct DSCR in context switch
        powerpc: Fix DSCR inheritance in copy_thread()
        powerpc: Keep thread.dscr and thread.dscr_inherit in sync
        powerpc: Update DSCR on all CPUs when writing sysfs dscr_default
        powerpc/powernv: Always go into nap mode when CPU is offline
        powerpc: Give hypervisor decrementer interrupts their own handler
        powerpc/vphn: Fix arch_update_cpu_topology() return value
      cb4f9a29
    • Merge tag 'gpio-fixes-for-v3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio · 813e6438
      Linus Torvalds authored
      Pull GPIO fixes from Linus Walleij:
       "These are some GPIO regression fixes for v3.6:
         - Erroneous debug message from of_get_named_gpio_flags()
         - Make sure the MC9S08DZ60 GPIO driver depend on I2C being compiled
           in (not module) or allmodconfig breaks.
         - Check return value from irq_alloc_descs() in the Emma Mobile GPIO
           driver.
         - Assign the owner field for the rdc321x driver so the module won't
           be removed if it has active GPIOs."
      
      * tag 'gpio-fixes-for-v3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
        gpio: rdc321x: Prevent removal of modules exporting active GPIOs
        gpio: em: Fix checking return value of irq_alloc_descs
        gpio: mc9s08dz60: Fix build error if I2C=m
        gpio: Fix debug message in of_get_named_gpio_flags()
      813e6438
    • Merge tag 'sound-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 5e682c0e
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "Nothing scary here, only small fixes for HD-audio and
        USB-audio:
         - EPSS regression fix and GPIO fix for HD-audio IDT codecs
         - A series of USB-audio regression fixes for problems found since the
           3.5 kernel"
      
      * tag 'sound-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: snd-usb: fix cross-interface streaming devices
        ALSA: snd-usb: fix calls to next_packet_size
        ALSA: snd-usb: restore delay information
        ALSA: snd-usb: use list_for_each_safe for endpoint resources
        ALSA: snd-usb: Fix URB cancellation at stream start
        ALSA: hda - Don't trust codec EPSS bit for IDT 92HD83xx & co
        ALSA: hda - Avoid unnecessary parameter read for EPSS
        ALSA: hda - Do not set GPIOs for speakers on IDT if there are no speakers
      5e682c0e
    • Merge tag 'fbdev-fixes-for-3.6-1' of git://github.com/schandinat/linux-2.6 · 6d1a0503
      Linus Torvalds authored
      Pull fbdev fixes from Florian Tobias Schandinat:
       - a fix by Paul Cercueil to prevent a possible buffer overflow
       - a fix by Bruno Prémont to prevent a rare sleep in invalid context
       - a fix by Julia Lawall for a double free in auo_k190x
       - a fix by Dan Carpenter to prevent a division by zero in mb862xxfb
       - a regression fix by Tomi Valkeinen for the SDI output in OMAP
       - a fix by Grazvydas Ignotas to fix the console colors in OMAP
      
      * tag 'fbdev-fixes-for-3.6-1' of git://github.com/schandinat/linux-2.6:
        OMAPFB: fix framebuffer console colors
        OMAPDSS: Fix SDI PLL locking
        video: mb862xxfb: prevent divide by zero bug
        drivers/video/auo_k190x.c: drop kfree of devm_kzalloc's data
        fbcon: Fix bit_putcs() call to kmalloc(s, GFP_KERNEL)
        fbcon: prevent possible buffer overflow.
      6d1a0503
    • Merge tag 'upstream-3.6-rc5' of git://git.infradead.org/linux-ubi · 50234c58
      Linus Torvalds authored
      Pull ubi fix from Artem Bityutskiy:
       "A single small fix for memory deallocation: we allocated memory using
        'kmem_cache_alloc()' but were freeing it using 'kfree()' in some
        cases.  Now we fix this by using 'kmem_cache_free()' instead."
      
      * tag 'upstream-3.6-rc5' of git://git.infradead.org/linux-ubi:
        UBI: fix a horrible memory deallocation bug
      50234c58
    • Fix order of arguments to compat_put_time[spec|val] · ed6fe9d6
      Mikulas Patocka authored
      Commit 644595f8 ("compat: Handle COMPAT_USE_64BIT_TIME in
      net/socket.c") introduced a bug where the helper functions to take
      either a 64-bit or compat time[spec|val] got the arguments in the wrong
      order, passing the kernel stack pointer off as a user pointer (and vice
      versa).
      
      Because of the user address range check, that in turn causes an
      EFAULT due to the user pointer range check failing for the kernel
      address, incorrectly resulting in a failed system call for 32-bit
      processes with a 64-bit kernel.
      
      On odder architectures like HP-PA (with separate user/kernel address
      spaces), it can be used to read kernel memory.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ed6fe9d6
  6. 05 Sep, 2012 13 commits
    • xen: Use correct masking in xen_swiotlb_alloc_coherent. · b5031ed1
      Ronny Hegewald authored
      When running a 32-bit pvops dom0 and a driver tries to allocate coherent
      DMA memory, the Xen swiotlb implementation returned memory beyond 4GB.
      
      The underlying reason is that if the supplied driver passes in a
      DMA_BIT_MASK(64) (hwdev->coherent_dma_mask is set to 0xffffffffffffffff),
      our dma_mask will be a u64 set to 0xffffffffffffffff even if we set it to
      DMA_BIT_MASK(32) previously, meaning we do not reset the upper bits.
      Using the dma_alloc_coherent_mask function does the proper casting,
      and we get 0xffffffff.
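      
      The effect of the cast can be sketched in user space. This assumes the
      truncation comes from passing the mask through a 32-bit wide type, as
      an unsigned long would be on a 32-bit kernel; the typedef and function
      names are illustrative, and dma_bit_mask() only mirrors what
      DMA_BIT_MASK(n) computes.

```c
#include <assert.h>
#include <stdint.h>

/* The low n bits set, mirroring what DMA_BIT_MASK(n) computes. */
uint64_t dma_bit_mask(unsigned int n)
{
	return n >= 64 ? ~0ULL : (1ULL << n) - 1;
}

/* Stand-in for "unsigned long" on a 32-bit kernel (an assumption). */
typedef uint32_t ulong32_t;

/* Copying the device's 64-bit mask unchanged keeps the upper bits set. */
uint64_t mask_without_cast(uint64_t coherent_dma_mask)
{
	return coherent_dma_mask;
}

/* Passing the mask through a 32-bit type clears the upper 32 bits. */
uint64_t mask_with_cast(uint64_t coherent_dma_mask)
{
	return (ulong32_t)coherent_dma_mask;
}
```

      With a 64-bit device mask, mask_with_cast() yields 0xffffffff, so an
      allocation constrained by it stays below 4GB.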
      
      This broke sound on a system with 4 GB of RAM and a 64-bit
      compatible sound card which sets the DMA mask to 64 bits.
      
      On bare metal and with the forward-ported xen-dom0 patches from OpenSuse,
      coherent DMA memory is always allocated inside the 32-bit address range by
      calling dma_alloc_coherent_mask.
      
      This patch adds the same functionality to xen swiotlb and is a rebase of the
      original patch from Ronny Hegewald, which never got upstream because the
      underlying reason was not understood until now.
      
      The original email with the original patch is in:
      http://old-list-archives.xen.org/archives/html/xen-devel/2010-02/msg00038.html
      the original thread from where the discussion started is in:
      http://old-list-archives.xen.org/archives/html/xen-devel/2010-01/msg00928.html
      Signed-off-by: Ronny Hegewald <ronny.hegewald@online.de>
      Signed-off-by: Stefano Panella <stefano.panella@citrix.com>
      Acked-By: David Vrabel <david.vrabel@citrix.com>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      CC: stable@vger.kernel.org
      b5031ed1
    • xen: fix logical error in tlb flushing · ce7184bd
      Alex Shi authored
      While TLB_FLUSH_ALL gets passed as the 'end' argument to
      flush_tlb_others(), the Xen code was made to check its 'start'
      parameter. That may set op.cmd incorrectly to MMUEXT_INVLPG_MULTI
      instead of MMUEXT_TLB_FLUSH_MULTI, so that some pages are not
      flushed from the TLB.
      
      This patch fixes the issue.
      Reported-by: Jan Beulich <jbeulich@suse.com>
      Signed-off-by: Alex Shi <alex.shi@intel.com>
      Acked-by: Jan Beulich <jbeulich@suse.com>
      Tested-by: Yongjie Ren <yongjie.ren@intel.com>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      ce7184bd
    • Merge commit '4cb38750' into stable/for-linus-3.6 · 593d0a3e
      Konrad Rzeszutek Wilk authored
      * commit '4cb38750': (6849 commits)
        bcma: fix invalid PMU chip control masks
        [libata] pata_cmd64x: whitespace cleanup
        libata-acpi: fix up for acpi_pm_device_sleep_state API
        sata_dwc_460ex: device tree may specify dma_channel
        ahci, trivial: fixed coding style issues related to braces
        ahci_platform: add hibernation callbacks
        libata-eh.c: local functions should not be exposed globally
        libata-transport.c: local functions should not be exposed globally
        sata_dwc_460ex: support hardreset
        ata: use module_pci_driver
        drivers/ata/pata_pcmcia.c: adjust suspicious bit operation
        pata_imx: Convert to clk_prepare_enable/clk_disable_unprepare
        ahci: Enable SB600 64bit DMA on MSI K9AGM2 (MS-7327) v2
        [libata] Prevent interface errors with Seagate FreeAgent GoFlex
        drivers/acpi/glue: revert accidental license-related 6b66d958 bits
        libata-acpi: add missing inlines in libata.h
        i2c-omap: Add support for I2C_M_STOP message flag
        i2c: Fall back to emulated SMBus if the operation isn't supported natively
        i2c: Add SCCB support
        i2c-tiny-usb: Add support for the Robofuzz OSIF USB/I2C converter
        ...
      593d0a3e
    • xen/p2m: Fix one-off error in checking the P2M tree directory. · 50e90041
      Konrad Rzeszutek Wilk authored
      We would traverse the full P2M top directory (from 0->MAX_DOMAIN_PAGES
      inclusive) when trying to figure out whether we can re-use some of the
      P2M middle leafs.
      
      Which meant that if the kernel was compiled with MAX_DOMAIN_PAGES=512
      we would try to use the 512th entry. Fortunately for us the p2m_top_index
      has a check for this:
      
       BUG_ON(pfn >= MAX_P2M_PFN);
      
      which we hit and saw this:
      
      (XEN) domain_crash_sync called from entry.S
      (XEN) Domain 0 (vcpu#0) crashed on cpu#0:
      (XEN) ----[ Xen-4.1.2-OVM  x86_64  debug=n  Tainted:    C ]----
      (XEN) CPU:    0
      (XEN) RIP:    e033:[<ffffffff819cadeb>]
      (XEN) RFLAGS: 0000000000000212   EM: 1   CONTEXT: pv guest
      (XEN) rax: ffffffff81db5000   rbx: ffffffff81db4000   rcx: 0000000000000000
      (XEN) rdx: 0000000000480211   rsi: 0000000000000000   rdi: ffffffff81db4000
      (XEN) rbp: ffffffff81793db8   rsp: ffffffff81793d38   r8:  0000000008000000
      (XEN) r9:  4000000000000000   r10: 0000000000000000   r11: ffffffff81db7000
      (XEN) r12: 0000000000000ff8   r13: ffffffff81df1ff8   r14: ffffffff81db6000
      (XEN) r15: 0000000000000ff8   cr0: 000000008005003b   cr4: 00000000000026f0
      (XEN) cr3: 0000000661795000   cr2: 0000000000000000
      
      Fixes-Oracle-Bug: 14570662
      CC: stable@vger.kernel.org # only for v3.5
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      50e90041
    • powerpc: Don't use __put_user() in patch_instruction · 636802ef
      Benjamin Herrenschmidt authored
      patch_instruction() can be called very early on ppc32, when the kernel
      isn't yet running at its linked address. That can cause the
      !is_kernel_addr() test in __put_user() to trip and call might_sleep(),
      which is very bad at that point during boot.
      
      Use a lower level function instead for now, at least until we get to
      rework ppc32 boot process to do the code patching later, like ppc64
      does.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      636802ef
    • powerpc: Make sure IPI handlers see data written by IPI senders · 9fb1b36c
      Paul Mackerras authored
      We have been observing hangs, both of KVM guest vcpu tasks and more
      generally, where a process that is woken doesn't properly wake up and
      continue to run, but instead sticks in TASK_WAKING state.  This
      happens because the update of rq->wake_list in ttwu_queue_remote()
      is not ordered with the update of ipi_message in
      smp_muxed_ipi_message_pass(), and the reading of rq->wake_list in
      scheduler_ipi() is not ordered with the reading of ipi_message in
      smp_ipi_demux().  Thus it is possible for the IPI receiver not to see
      the updated rq->wake_list and therefore conclude that there is nothing
      for it to do.
      
      In order to make sure that anything done before smp_send_reschedule()
      is ordered before anything done in the resulting call to scheduler_ipi(),
      this adds barriers in smp_muxed_ipi_message_pass() and smp_ipi_demux().
      The barrier in smp_muxed_ipi_message_pass() is a full barrier to ensure that
      there is a full ordering between the smp_send_reschedule() caller and
      scheduler_ipi().  In smp_ipi_demux(), we use xchg() rather than
      xchg_local() because xchg() includes release and acquire barriers.
      Using xchg() rather than xchg_local() makes sense given that
      ipi_message is not just accessed locally.
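      
      The ordering described above can be sketched in user space with C11
      atomics. This is only an illustration under stated assumptions, not
      the kernel code: struct ipi_mailbox, post_message() and
      claim_messages() are hypothetical names, a seq_cst fence stands in for
      the full barrier on the sending side, and atomic_exchange with
      acquire/release ordering stands in for xchg() on the receiving side.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the state shared by the sender and the IPI handler. */
struct ipi_mailbox {
    int wake_list;             /* plain data written before the message is posted */
    atomic_ulong ipi_message;  /* bitmask of pending message types */
};

/* Sender side: publish the data, then post the message. */
static void post_message(struct ipi_mailbox *mb, int work, unsigned int msg)
{
    assert(msg < sizeof(unsigned long) * CHAR_BIT);
    mb->wake_list = work;
    /* Full fence: the write above is ordered before the message store below. */
    atomic_thread_fence(memory_order_seq_cst);
    atomic_fetch_or_explicit(&mb->ipi_message, 1UL << msg, memory_order_relaxed);
}

/* Receiver side: atomically claim every pending message. */
static unsigned long claim_messages(struct ipi_mailbox *mb, int *work_out)
{
    /* The acquire half pairs with the sender's fence, so wake_list is visible. */
    unsigned long pending =
        atomic_exchange_explicit(&mb->ipi_message, 0UL, memory_order_acq_rel);
    if (pending)
        *work_out = mb->wake_list;
    return pending;
}
```

      With a single sender, a receiver that observes a non-zero message mask
      is guaranteed to also observe the data written before the fence, which
      is the property the hang analysis says was missing.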
      
      This moves the barrier between setting the message and calling the
      cause_ipi() function into the individual cause_ipi implementations.
      Most of them -- those that used outb, out_8 or similar -- already had
      a full barrier because out_8 etc. include a sync before the MMIO
      store.  This adds an explicit barrier in the two remaining cases.
      
      These changes made no measurable difference to the speed of IPIs as
      measured using a simple ping-pong latency test across two CPUs on
      different cores of a POWER7 machine.
      
      The analysis of the reason why processes were not waking up properly
      is due to Milton Miller.
      
      Cc: stable@vger.kernel.org # v3.0+
      Reported-by: Milton Miller <miltonm@bga.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      9fb1b36c
    • Anton Blanchard's avatar
      powerpc: Restore correct DSCR in context switch · 71433285
      Anton Blanchard authored
      During a context switch we always restore the per-thread DSCR value.
      If we aren't doing explicit DSCR management
      (i.e. thread.dscr_inherit == 0) and the default DSCR changed while
      the process was sleeping, we end up with the wrong value.
      
      Check thread.dscr_inherit and select the default DSCR or per thread
      DSCR as required.
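      
      As a rough sketch of that selection (not the actual patch, which
      touches the context switch path): struct thread_state and
      dscr_for_switch() are hypothetical names, and the fields simply mirror
      thread.dscr and thread.dscr_inherit from the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's per-thread state. */
struct thread_state {
    unsigned long dscr;   /* per-thread DSCR value */
    bool dscr_inherit;    /* true when the thread manages its DSCR explicitly */
};

/* Pick the DSCR value to load when switching to thread t. */
static unsigned long dscr_for_switch(const struct thread_state *t,
                                     unsigned long dscr_default)
{
    assert(t != NULL);
    /* Explicit management keeps the thread's own value; otherwise follow the default. */
    return t->dscr_inherit ? t->dscr : dscr_default;
}
```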
      
      This was found with the following test case, when running with
      more threads than CPUs (i.e. forcing context switches):
      
      http://ozlabs.org/~anton/junkcode/dscr_default_test.c
      
      With the four patches applied I can run a combination of all
      test cases successfully at the same time:
      
      http://ozlabs.org/~anton/junkcode/dscr_default_test.c
      http://ozlabs.org/~anton/junkcode/dscr_explicit_test.c
      http://ozlabs.org/~anton/junkcode/dscr_inherit_test.c
      
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Cc: <stable@kernel.org> # 3.0+
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      71433285
    • Anton Blanchard's avatar
      powerpc: Fix DSCR inheritance in copy_thread() · 1021cb26
      Anton Blanchard authored
      If the default DSCR is non-zero we set thread.dscr_inherit in
      copy_thread(), meaning the new thread and all its children will ignore
      future updates to the default DSCR. This is not intended and is
      a change in behaviour that a number of our users have hit.
      
      We just need to inherit thread.dscr and thread.dscr_inherit from
      the parent which ends up being much simpler.
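      
      A minimal sketch of that inheritance, assuming a simplified thread
      structure; struct dscr_state and inherit_dscr() are hypothetical names
      and this is not the body of copy_thread() itself:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of the per-thread fields named in the text. */
struct dscr_state {
    unsigned long dscr;  /* per-thread DSCR value */
    int dscr_inherit;    /* non-zero when the thread set its DSCR explicitly */
};

/* The child copies both fields unchanged, whatever the system default is. */
static void inherit_dscr(struct dscr_state *child, const struct dscr_state *parent)
{
    assert(child != NULL && parent != NULL);
    child->dscr = parent->dscr;
    child->dscr_inherit = parent->dscr_inherit;
}
```

      Because dscr_inherit is copied rather than set from the default, a
      child of a thread that never set its DSCR explicitly keeps following
      later updates to the default.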
      
      This was found with the following test case:
      
      http://ozlabs.org/~anton/junkcode/dscr_default_test.c
      
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Cc: <stable@kernel.org> # 3.0+
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      1021cb26
    • Anton Blanchard's avatar
      powerpc: Keep thread.dscr and thread.dscr_inherit in sync · 00ca0de0
      Anton Blanchard authored
      When we update the DSCR either via emulation of mtspr(DSCR) or via
      a change to dscr_default in sysfs we don't update thread.dscr.
      We will eventually update it at context switch time but there is
      a period where thread.dscr is incorrect.
      
      If we fork at this point we will copy the old value of thread.dscr
      into the child. To avoid this, always keep thread.dscr in sync with
      reality.
      
      This issue was found with the following testcase:
      
      http://ozlabs.org/~anton/junkcode/dscr_inherit_test.c
      
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Cc: <stable@kernel.org> # 3.0+
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      00ca0de0
    • Anton Blanchard's avatar
      powerpc: Update DSCR on all CPUs when writing sysfs dscr_default · 1b6ca2a6
      Anton Blanchard authored
      Writing to dscr_default in sysfs doesn't actually change the DSCR -
      we rely on a context switch on each CPU to do the work. There is no
      guarantee we will get a context switch in a reasonable amount of time
      so fire off an IPI to force an immediate change.
      
      This issue was found with the following test case:
      
      http://ozlabs.org/~anton/junkcode/dscr_explicit_test.c
      
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Cc: <stable@kernel.org> # 3.0+
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      1b6ca2a6
    • Paul Mackerras's avatar
      powerpc/powernv: Always go into nap mode when CPU is offline · 375f561a
      Paul Mackerras authored
      The CPU hotplug code for the powernv platform currently only puts
      offline CPUs into nap mode if the powersave_nap variable is set.
      However, HV-style KVM on this platform requires secondary CPU threads
      to be offline and in nap mode.  Since we know nap mode works just
      fine on all POWER7 machines, and the only machines that support the
      powernv platform are POWER7 machines, this changes the code to
      always put offline CPUs into nap mode, regardless of powersave_nap.
      Powersave_nap still controls whether or not CPUs go into nap mode
      when idle, as before.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      375f561a
    • Paul Mackerras's avatar
      powerpc: Give hypervisor decrementer interrupts their own handler · dabe859e
      Paul Mackerras authored
      At the moment the handler for hypervisor decrementer interrupts is
      the same as for decrementer interrupts, i.e. timer_interrupt().
      This is bogus; if we ever do get a hypervisor decrementer interrupt
      it won't have anything to do with the next timer event.  In fact
      the only time we get hypervisor decrementer interrupts is when one
      is left pending on exit from a KVM guest.
      
      When we get a hypervisor decrementer interrupt we don't need to do
      anything special to clear it, since they are edge-triggered on the
      transition of HDEC from 0 to -1.  Thus this adds an empty handler
      function for them.  We don't need to have them masked when interrupts
      are soft-disabled, so we use STD_EXCEPTION_HV instead of
      MASKABLE_EXCEPTION_HV.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      dabe859e
    • Jesse Larrew's avatar
      powerpc/vphn: Fix arch_update_cpu_topology() return value · 79c5fceb
      Jesse Larrew authored
      arch_update_cpu_topology() should only return 1 when the topology has
      actually changed, and should return 0 otherwise.
      
      This patch fixes a potential bug where rebuild_sched_domains() would
      reinitialize the sched domains even when the topology hasn't changed.
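      
      The return convention can be illustrated with a small sketch. The
      array-based CPU-to-node mapping and the name update_cpu_topology() are
      assumptions made for the example; the real function works on VPHN
      associativity data instead.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical sketch: apply the new CPU-to-node mapping and report
 * 1 only if at least one CPU actually moved, 0 otherwise.
 */
static int update_cpu_topology(int *cpu_to_node, const int *new_node, size_t ncpus)
{
    int changed = 0;

    assert(cpu_to_node != NULL && new_node != NULL);
    for (size_t i = 0; i < ncpus; i++) {
        if (cpu_to_node[i] != new_node[i]) {
            cpu_to_node[i] = new_node[i];
            changed = 1;
        }
    }
    return changed;
}
```

      A caller that rebuilds the sched domains only on a return value of 1
      then skips the rebuild when nothing moved.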
      Signed-off-by: Jesse Larrew <jlarrew@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      79c5fceb