Commits · c856bafae7f5b3f59ac1d99279a9b99b3b36ad12 · Kirill Smelkov / linux

23 Sep, 2012 6 commits

rcu: Allow RCU grace-period cleanup to be preempted · c856bafa

Paul E. McKenney authored Jun 21, 2012

RCU grace-period cleanup is currently carried out with interrupts
disabled, which can result in excessive latency spikes on large systems
(many hundreds or thousands of CPUs).  This patch therefore makes the
RCU grace-period cleanup be preemptible, including voluntary preemption
points, which should eliminate those latency spikes.  Similar spikes from
forcing of quiescent states will be dealt with similarly by later patches.

Updated to replace uses of spin_lock_irqsave() with spin_lock_irq(), as
suggested by Peter Zijlstra.
Reported-by: Mike Galbraith <mgalbraith@suse.de>
Reported-by: Dimitri Sivanich <sivanich@sgi.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>

c856bafa

rcu: Move RCU grace-period cleanup into kthread · cabc49c1

Paul E. McKenney authored Jun 20, 2012

As a first step towards allowing grace-period cleanup to be preemptible,
this commit moves the RCU grace-period cleanup into the same kthread
that is now used to initialize grace periods.  This is needed to keep
scheduling latency down to a dull roar.

[ paulmck: Get rid of stray spin_lock_irqsave() calls. ]
Reported-by: Mike Galbraith <mgalbraith@suse.de>
Reported-by: Dimitri Sivanich <sivanich@sgi.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>

cabc49c1

rcu: Allow RCU grace-period initialization to be preempted · 755609a9

Paul E. McKenney authored Jun 19, 2012

RCU grace-period initialization is currently carried out with interrupts
disabled, which can result in 200-microsecond latency spikes on systems
on which RCU has been configured for 4096 CPUs.  This patch therefore
makes the RCU grace-period initialization be preemptible, which should
eliminate those latency spikes.  Similar spikes from grace-period cleanup
and the forcing of quiescent states will be dealt with similarly by later
patches.
Reported-by: Mike Galbraith <mgalbraith@suse.de>
Reported-by: Dimitri Sivanich <sivanich@sgi.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>

755609a9

rcu: Prevent initialization-time quiescent-state race · 79bce672

Paul E. McKenney authored Sep 17, 2012

The next step in reducing RCU's grace-period initialization latency on
large systems will make this initialization preemptible.  Unfortunately,
making the grace-period initialization subject to interrupts (let alone
preemption) exposes the following race on systems whose rcu_node tree
contains more than one node:

1.	CPU 31 starts initializing the grace period, including the
    	first leaf rcu_node structures, and is then preempted.

2.	CPU 0 refers to the first leaf rcu_node structure, and notes
    	that a new grace period has started.  It passes through a
    	quiescent state shortly thereafter, and informs the RCU core
    	of this rite of passage.

3.	CPU 0 enters an RCU read-side critical section, acquiring
    	a pointer to an RCU-protected data item.

4.	CPU 31 takes an interrupt whose handler removes the data item
	referenced by CPU 0 from the data structure, and registers an
	RCU callback in order to free it.

5.	CPU 31 resumes initializing the grace period, including its
    	own rcu_node structure.  In invokes rcu_start_gp_per_cpu(),
    	which advances all callbacks, including the one registered
    	in #4 above, to be handled by the current grace period.

6.	The remaining CPUs pass through quiescent states and inform
    	the RCU core, but CPU 0 remains in its RCU read-side critical
    	section, still referencing the now-removed data item.

7.	The grace period completes and all the callbacks are invoked,
    	including the one that frees the data item that CPU 0 is still
    	referencing.  Oops!!!

One way to avoid this race is to remove grace-period acceleration from
rcu_start_gp_per_cpu().  Now, the only reason for this acceleration was
to allow CPUs bringing RCU out of idle state to have their callbacks
invoked after only one grace period, rather than the two grace periods
that would otherwise be required.  But this acceleration does not
work when RCU grace-period initialization is moved to a kthread because
the CPU posting the callback is no longer necessarily the CPU that is
initializing the resulting grace period.

This commit therefore removes this now-pointless (and soon to be dangerous)
grace-period acceleration, thus avoiding the above race.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

79bce672

rcu: Move RCU grace-period initialization into a kthread · b3dbec76

Paul E. McKenney authored Jun 18, 2012

As the first step towards allowing grace-period initialization to be
preemptible, this commit moves the RCU grace-period initialization
into its own kthread.  This is needed to keep large-system scheduling
latency at reasonable levels.

Also change raw_spin_lock_irqsave() to raw_spin_lock_irq() as suggested
by Peter Zijlstra in review comments.
Reported-by: Mike Galbraith <mgalbraith@suse.de>
Reported-by: Dimitri Sivanich <sivanich@sgi.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>

b3dbec76

rcu: Fix day-one dyntick-idle stall-warning bug · a10d206e

Paul E. McKenney authored Sep 22, 2012

Each grace period is supposed to have at least one callback waiting
for that grace period to complete. However, if CONFIG_NO_HZ=n, an
extra callback-free grace period is no big problem -- it will chew up
a tiny bit of CPU time, but it will complete normally. In contrast,
CONFIG_NO_HZ=y kernels have the potential for all the CPUs to go to
sleep indefinitely, in turn indefinitely delaying completion of the
callback-free grace period. Given that nothing is waiting on this grace
period, this is also not a problem.

That is, unless RCU CPU stall warnings are also enabled, as they are
in recent kernels. In this case, if a CPU wakes up after at least one
minute of inactivity, an RCU CPU stall warning will result. The reason
that no one noticed until quite recently is that most systems have enough
OS noise that they will never remain absolutely idle for a full minute.
But there are some embedded systems with cut-down userspace configurations
that consistently get into this situation.

All this begs the question of exactly how a callback-free grace period
gets started in the first place. This can happen due to the fact that
CPUs do not necessarily agree on which grace period is in progress.
If a CPU still believes that the grace period that just completed is
still ongoing, it will believe that it has callbacks that need to wait for
another grace period, never mind the fact that the grace period that they
were waiting for just completed. This CPU can therefore erroneously
decide to start a new grace period. Note that this can happen in
TREE_RCU and TREE_PREEMPT_RCU even on a single-CPU system: Deadlock
considerations mean that the CPU that detected the end of the grace
period is not necessarily officially informed of this fact for some time.

Once this CPU notices that the earlier grace period completed, it will
invoke its callbacks. It then won't have any callbacks left. If no
other CPU has any callbacks, we now have a callback-free grace period.

This commit therefore makes CPUs check more carefully before starting a
new grace period. This new check relies on an array of tail pointers
into each CPU's list of callbacks. If the CPU is up to date on which
grace periods have completed, it checks to see if any callbacks follow
the RCU_DONE_TAIL segment, otherwise it checks to see if any callbacks
follow the RCU_WAIT_TAIL segment. The reason that this works is that
the RCU_WAIT_TAIL segment will be promoted to the RCU_DONE_TAIL segment
as soon as the CPU is officially notified that the old grace period
has ended.

This change is to cpu_needs_another_gp(), which is called in a number
of places. The only one that really matters is in rcu_start_gp(), where
the root rcu_node structure's ->lock is held, which prevents any
other CPU from starting or completing a grace period, so that the
comparison that determines whether the CPU is missing the completion
of a grace period is stable.
Reported-by: Becky Bruce <bgillbruce@gmail.com>
Reported-by: Subodh Nijsure <snijsure@grid-net.com>
Reported-by: Paul Walmsley <paul@pwsan.com>
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Paul Walmsley <paul@pwsan.com> # OMAP3730, OMAP4430
Cc: stable@vger.kernel.org

a10d206e

12 Sep, 2012 1 commit

trace: Don't declare trace_*_rcuidle functions in modules · 7ece55a4

Josh Triplett authored Sep 04, 2012

Tracepoints declare a static inline trace_*_rcuidle variant of the trace
function, to support safely generating trace events from the idle loop.
Module code never actually uses that variant of trace functions, because
modules don't run code that needs tracing with RCU idled. However, the
declaration of those otherwise unused functions causes the module to
reference rcu_idle_exit and rcu_idle_enter, which RCU does not export to
modules.

To avoid this, don't generate trace_*_rcuidle functions for tracepoints
declared in module code.

Link: http://lkml.kernel.org/r/20120905062306.GA14756@leafReported-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

7ece55a4

08 Sep, 2012 3 commits

Linux 3.6-rc5 · 55d512e2
Linus Torvalds authored Sep 08, 2012

55d512e2

Merge branch 'fixes-for-3.6' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping · 32d687ca

Linus Torvalds authored Sep 08, 2012

Pull DMA-mapping fixes from Marek Szyprowski:
"Another set of fixes for ARM dma-mapping subsystem.

Commit e9da6e99 replaced custom consistent buffer remapping code
with generic vmalloc areas. It however introduced some regressions
caused by limited support for allocations in atomic context. This
series contains fixes for those regressions.

For some subplatforms the default, pre-allocated pool for atomic
allocations turned out to be too small, so a function for setting its
size has been added.

Another set of patches adds support for atomic allocations to
IOMMU-aware DMA-mapping implementation.

The last part of this pull request contains two fixes for Contiguous
Memory Allocator, which relax too strict requirements."

* 'fixes-for-3.6' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping:
ARM: dma-mapping: IOMMU allocates pages from atomic_pool with GFP_ATOMIC
ARM: dma-mapping: Introduce __atomic_get_pages() for __iommu_get_pages()
ARM: dma-mapping: Refactor out to introduce __in_atomic_pool
ARM: dma-mapping: atomic_pool with struct page **pages
ARM: Kirkwood: increase atomic coherent pool size
ARM: DMA-Mapping: print warning when atomic coherent allocation fails
ARM: DMA-Mapping: add function for setting coherent pool size from platform code
ARM: relax conditions required for enabling Contiguous Memory Allocator
mm: cma: fix alignment requirements for contiguous regions

32d687ca

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · 11be4bc6

Linus Torvalds authored Sep 08, 2012

Pull input subsystem updates from Dmitry Torokhov.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: wacom - add support for EMR on Cintiq 24HD touch
  Input: i8042 - add Gigabyte T1005 series netbooks to noloop table
  Input: imx_keypad - reset the hardware before enabling
  Input: edt-ft5x06 - fix build error when compiling wthout CONFIG_DEBUG_FS

11be4bc6

07 Sep, 2012 4 commits

Merge branch 'upstream-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid · 5b6e7f1c

Linus Torvalds authored Sep 07, 2012

Pull HID updates from Jiri Kosina:
 "It contains a fix for Eaton Ellipse MAX UPS from Alan Stern,
  performance improvement (not processing debug data if noone is
  interested), by Henrik Rydberg, and allowing tpkbd-driven devices to
  work even with generic driver in a crippled mode, by Andres Freund."

* 'upstream-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
  HID: tpkbd: work even if the new Lenovo Keyboard driver is not configured
  HID: Only dump input if someone is listening
  HID: add NOGET quirk for Eaton Ellipse MAX UPS

5b6e7f1c

HID: tpkbd: work even if the new Lenovo Keyboard driver is not configured · aad932e7

Andres Freund authored Aug 30, 2012

c1dcad2d added a new driver configured by
HID_LENOVO_TPKBD but made the hid_have_special_driver entry non-optional which
lead to a recognized but non-working device if the new driver wasn't
configured (which is the correct default).
Signed-off-by: Andres Freund <andres@anarazel.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

aad932e7

Merge tag 'stable/for-linus-3.6-rc4-tag' of... · bf71d0e1

Linus Torvalds authored Sep 06, 2012

Merge tag 'stable/for-linus-3.6-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen

Pull Xen bug-fixes from Konrad Rzeszutek Wilk:
 * Fix for TLB flushing introduced in v3.6
 * Fix Xen-SWIOTLB not using proper DMA mask - device had 64bit but
   in a 32-bit kernel we need to allocate for coherent pages from a
   32-bit pool.
 * When trying to re-use P2M nodes we had a one-off error and triggered
   a BUG_ON check with specific CONFIG_ option.
 * When doing FLR in Xen-PCI-backend we would first do FLR then save the
   PCI configuration space. We needed to do it the other way around.

* tag 'stable/for-linus-3.6-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen/pciback: Fix proper FLR steps.
  xen: Use correct masking in xen_swiotlb_alloc_coherent.
  xen: fix logical error in tlb flushing
  xen/p2m: Fix one-off error in checking the P2M tree directory.

bf71d0e1

Merge tag '3.6-pci-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci · f8b9cf0f

Linus Torvalds authored Sep 06, 2012

Pull PCI updates from Bjorn Helgaas:
 "Power management
    - PCI/PM: Enable D3/D3cold by default for most devices
    - PCI/PM: Keep parent bridge active when probing device
    - PCI/PM: Fix config reg access for D3cold and bridge suspending
    - PCI/PM: Add ABI document for sysfs file d3cold_allowed
  Core
    - PCI: Don't print anything while decoding is disabled"

* tag '3.6-pci-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI: Don't print anything while decoding is disabled
  PCI/PM: Add ABI document for sysfs file d3cold_allowed
  PCI/PM: Fix config reg access for D3cold and bridge suspending
  PCI/PM: Keep parent bridge active when probing device
  PCI/PM: Enable D3/D3cold by default for most devices

f8b9cf0f

06 Sep, 2012 14 commits

Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · eeea3ac9

Linus Torvalds authored Sep 06, 2012

Pull ARM SoC bug fixes from Olof Johansson:
 "Mostly Renesas and Atmel bugfixes this time, targeting boot and build
  problems.  A couple of patches for gemini and kirkwood as well.  On a
  whole nothing very controversial."

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: gemini: fix the gemini build
  ARM: shmobile: armadillo800eva: enable rw rootfs mount
  ARM: Kirkwood: Fix 'SZ_1M' undeclared here for db88f6281-bp-setup.c
  ARM: shmobile: mackerel: fixup usb module order
  ARM: shmobile: armadillo800eva: fixup: sound card detection order
  ARM: shmobile: marzen: fixup smsc911x id for regulator
  ARM: at91/feature-removal-schedule: delay at91_mci removal
  ARM: mach-shmobile: armadillo800eva: Enable power button as wakeup source
  ARM: mach-shmobile: armadillo800eva: Fix GPIO buttons descriptions
  ARM: at91/dts: remove partial parameter in at91sam9g25ek.dts
  ARM: at91/clock: fix PLLA overclock warning
  ARM: at91: fix rtc-at91sam9 irq issue due to sparse irq support
  ARM: at91: fix system timer irq issue due to sparse irq support
  ARM: shmobile: sh73a0: fixup RELOC_BASE of intca_irq_pins_desc

eeea3ac9

Merge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging · c7c6bf1e

Linus Torvalds authored Sep 06, 2012

Pull a hwmon fix from Guenter Roeck:
 "One patch, fixing DIV_ROUND_CLOSEST to support negative dividends.

  While the changes are not in the drivers/hwmon directory, the problem
  primarily affects hwmon drivers, and it makes sense to push the patch
  through the hwmon tree."

* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  linux/kernel.h: Fix DIV_ROUND_CLOSEST to support negative dividends

c7c6bf1e

Merge branch 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild · bd12ce8c

Linus Torvalds authored Sep 06, 2012

Pull kbuild fixes from Michal Marek:
 "These are two fixes that should go into 3.6.  The link-vmlinux.sh one
  is obvious.

  The other one fixes make firmware_install with certain configurations,
  where a file in the toplevel firmware tree gets installed first, and
  $(INSTALL_FW_PATH)/$$(dir <file>) results in /lib/firmware/./, which
  confuses make 3.82 for some reason."

* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
  firmware: fix directory creation rule matching with make 3.82
  link-vmlinux.sh: Fix stray "echo" in error message

bd12ce8c

Remove user-triggerable BUG from mpol_to_str · 80de7c31

Dave Jones authored Sep 06, 2012

Trivially triggerable, found by trinity:

  kernel BUG at mm/mempolicy.c:2546!
  Process trinity-child2 (pid: 23988, threadinfo ffff88010197e000, task ffff88007821a670)
  Call Trace:
    show_numa_map+0xd5/0x450
    show_pid_numa_map+0x13/0x20
    traverse+0xf2/0x230
    seq_read+0x34b/0x3e0
    vfs_read+0xac/0x180
    sys_pread64+0xa2/0xc0
    system_call_fastpath+0x1a/0x1f
  RIP: mpol_to_str+0x156/0x360

Cc: stable@vger.kernel.org
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

80de7c31

xen/pciback: Fix proper FLR steps. · 80ba77df

Konrad Rzeszutek Wilk authored Sep 05, 2012

When we do FLR and save PCI config we did it in the wrong order.
The end result was that if a PCI device was unbind from
its driver, then binded to xen-pciback, and then back to its
driver we would get:

> lspci -s 04:00.0
04:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
13:42:12 # 4 :~/
> echo "0000:04:00.0" > /sys/bus/pci/drivers/pciback/unbind
> modprobe e1000e
e1000e: Intel(R) PRO/1000 Network Driver - 2.0.0-k
e1000e: Copyright(c) 1999 - 2012 Intel Corporation.
e1000e 0000:04:00.0: Disabling ASPM L0s L1
e1000e 0000:04:00.0: enabling device (0000 -> 0002)
xen: registering gsi 48 triggering 0 polarity 1
Already setup the GSI :48
e1000e 0000:04:00.0: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
e1000e: probe of 0000:04:00.0 failed with error -2

This fixes it by first saving the PCI configuration space, then
doing the FLR.
Reported-by: Ren, Yongjie <yongjie.ren@intel.com>
Reported-and-Tested-by: Tobias Geiger <tobias.geiger@vido.info>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: stable@vger.kernel.org

80ba77df

Merge tag 'mmc-fixes-for-3.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc · 08090950

Linus Torvalds authored Sep 05, 2012

Pull MMC fixes from Chris Ball:
 - a firmware bug on several Samsung MoviNAND eMMC models causes
   permanent corruption on the device when secure erase and secure trim
   requests are made, so we disable those requests on these eMMC devices.
 - atmel-mci: fix a hang with some SD cards by waiting for not-busy flag.
 - dw_mmc: low-power mode breaks SDIO interrupts; fix PIO error handling;
   fix handling of error interrupts.
 - mxs-mmc: fix deadlocks; fix compile error due to dma.h arch change.
 - omap: fix broken PIO mode causing memory corruption.
 - sdhci-esdhc: fix card detection.

* tag 'mmc-fixes-for-3.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc:
  mmc: omap: fix broken PIO mode
  mmc: card: Skip secure erase on MoviNAND; causes unrecoverable corruption.
  mmc: dw_mmc: Disable low power mode if SDIO interrupts are used
  mmc: dw_mmc: fix error handling in PIO mode
  mmc: dw_mmc: correct mishandling error interrupt
  mmc: dw_mmc: amend using error interrupt status
  mmc: atmel-mci: not busy flag has also to be used for read operations
  mmc: sdhci-esdhc: break out early if clock is 0
  mmc: mxs-mmc: fix deadlock caused by recursion loop
  mmc: mxs-mmc: fix deadlock in SDIO IRQ case
  mmc: bfin_sdh: fix dma_desc_array build error

08090950

uml: fix compile error in deliver_alarm() · bc6c8364

Miklos Szeredi authored Sep 05, 2012

Fix the following compile error on UML.

  arch/um/os-Linux/time.c: In function 'deliver_alarm':
  arch/um/os-Linux/time.c:117:3: error: too few arguments to function 'alarm_handler'
  arch/um/os-Linux/internal.h:1:6: note: declared here

The error was introduced by commit d3c1cfcd ("um: pass siginfo to guest
process") in 3.6-rc1.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: Martin Pärtel <martin.partel@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

bc6c8364

dj: memory scribble in logi_dj · 8a55ade7

Alan Cox authored Sep 04, 2012

Allocate a structure not a pointer to it !
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

8a55ade7

Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc · cb4f9a29

Linus Torvalds authored Sep 05, 2012

Pull powerpc fixes from Benjamin Herrenschmidt:
 "Here are a few fixes for 3.6 that were piling up while I was away or
  busy (I was mostly MIA a week or two before San Diego).

  Some fixes from Anton fixing up issues with our relatively new DSCR
  control feature, and a few other fixes that are either regressions or
  bugs nasty enough to warrant not waiting."

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc: Don't use __put_user() in patch_instruction
  powerpc: Make sure IPI handlers see data written by IPI senders
  powerpc: Restore correct DSCR in context switch
  powerpc: Fix DSCR inheritance in copy_thread()
  powerpc: Keep thread.dscr and thread.dscr_inherit in sync
  powerpc: Update DSCR on all CPUs when writing sysfs dscr_default
  powerpc/powernv: Always go into nap mode when CPU is offline
  powerpc: Give hypervisor decrementer interrupts their own handler
  powerpc/vphn: Fix arch_update_cpu_topology() return value

cb4f9a29

Merge tag 'gpio-fixes-for-v3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio · 813e6438

Linus Torvalds authored Sep 05, 2012

Pull GPIO fixes from Linus Walleij:
 "These are some GPIO regression fixes for v3.6:
   - Erroneous debug message from of_get_named_gpio_flags()
   - Make sure the MC9S08DZ60 GPIO driver depend on I2C being compiled
     in (not module) or allmodconfig breaks.
   - Check return value from irq_alloc_descs() in the Emma Mobile GPIO
     driver.
   - Assign the owner field for the rdc321x driver so the module won't
     be removed if it has active GPIOs."

* tag 'gpio-fixes-for-v3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
  gpio: rdc321x: Prevent removal of modules exporting active GPIOs
  gpio: em: Fix checking return value of irq_alloc_descs
  gpio: mc9s08dz60: Fix build error if I2C=m
  gpio: Fix debug message in of_get_named_gpio_flags()

813e6438

Merge tag 'sound-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 5e682c0e

Linus Torvalds authored Sep 05, 2012

Pull sound fixes from Takashi Iwai:
 "There are nothing scaring, contains only small fixes for HD-audio and
  USB-audio:
   - EPSS regression fix and GPIO fix for HD-audio IDT codecs
   - A series of USB-audio regression fixes that are found since 3.5
     kernel"

* tag 'sound-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: snd-usb: fix cross-interface streaming devices
  ALSA: snd-usb: fix calls to next_packet_size
  ALSA: snd-usb: restore delay information
  ALSA: snd-usb: use list_for_each_safe for endpoint resources
  ALSA: snd-usb: Fix URB cancellation at stream start
  ALSA: hda - Don't trust codec EPSS bit for IDT 92HD83xx & co
  ALSA: hda - Avoid unnecessary parameter read for EPSS
  ALSA: hda - Do not set GPIOs for speakers on IDT if there are no speakers

5e682c0e

Merge tag 'fbdev-fixes-for-3.6-1' of git://github.com/schandinat/linux-2.6 · 6d1a0503

Linus Torvalds authored Sep 05, 2012

Pull fbdev fixes from Florian Tobias Schandinat:
 - a fix by Paul Cercueil to prevent a possible buffer overflow
 - a fix by Bruno Prémont to prevent a rare sleep in invalid context
 - a fix by Julia Lawall for a double free in auo_k190x
 - a fix by Dan Carpenter to prevent a division by zero in mb862xxfb
 - a regression fix by Tomi Valkeinen for the SDI output in OMAP
 - a fix by Grazvydas Ignotas to fix the console colors in OMAP

* tag 'fbdev-fixes-for-3.6-1' of git://github.com/schandinat/linux-2.6:
  OMAPFB: fix framebuffer console colors
  OMAPDSS: Fix SDI PLL locking
  video: mb862xxfb: prevent divide by zero bug
  drivers/video/auo_k190x.c: drop kfree of devm_kzalloc's data
  fbcon: Fix bit_putcs() call to kmalloc(s, GFP_KERNEL)
  fbcon: prevent possible buffer overflow.

6d1a0503

Merge tag 'upstream-3.6-rc5' of git://git.infradead.org/linux-ubi · 50234c58

Linus Torvalds authored Sep 05, 2012

Pull ubi fix from Artem Bityutskiy:
 "A single small fix for memory deallocation: we allocated memory using
  'kmem_cache_alloc()' but were freeing it using 'kfree()' in some
  cases.  Now we fix this by using 'kmem_cache_free()' instead."

* tag 'upstream-3.6-rc5' of git://git.infradead.org/linux-ubi:
  UBI: fix a horrible memory deallocation bug

50234c58

Fix order of arguments to compat_put_time[spec|val] · ed6fe9d6

Mikulas Patocka authored Sep 01, 2012

Commit 644595f8 ("compat: Handle COMPAT_USE_64BIT_TIME in
net/socket.c") introduced a bug where the helper functions to take
either a 64-bit or compat time[spec|val] got the arguments in the wrong
order, passing the kernel stack pointer off as a user pointer (and vice
versa).

Because of the user address range check, that in turn then causes an
EFAULT due to the user pointer range checking failing for the kernel
address.  Incorrectly resuling in a failed system call for 32-bit
processes with a 64-bit kernel.

On odder architectures like HP-PA (with separate user/kernel address
spaces), it can be used read kernel memory.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ed6fe9d6

05 Sep, 2012 12 commits

xen: Use correct masking in xen_swiotlb_alloc_coherent. · b5031ed1

Ronny Hegewald authored Aug 31, 2012

When running 32-bit pvops-dom0 and a driver tries to allocate a coherent
DMA-memory the xen swiotlb-implementation returned memory beyond 4GB.

The underlaying reason is that if the supplied driver passes in a
DMA_BIT_MASK(64) ( hwdev->coherent_dma_mask is set to 0xffffffffffffffff)
our dma_mask will be u64 set to 0xffffffffffffffff even if we set it to
DMA_BIT_MASK(32) previously. Meaning we do not reset the upper bits.
By using the dma_alloc_coherent_mask function - it does the proper casting
and we get 0xfffffffff.

This caused not working sound on a system with 4 GB and a 64-bit
compatible sound-card with sets the DMA-mask to 64bit.

On bare-metal and the forward-ported xen-dom0 patches from OpenSuse a coherent
DMA-memory is always allocated inside the 32-bit address-range by calling
dma_alloc_coherent_mask.

This patch adds the same functionality to xen swiotlb and is a rebase of the
original patch from Ronny Hegewald which never got upstream b/c the
underlaying reason was not understood until now.

The original email with the original patch is in:
http://old-list-archives.xen.org/archives/html/xen-devel/2010-02/msg00038.html
the original thread from where the discussion started is in:
http://old-list-archives.xen.org/archives/html/xen-devel/2010-01/msg00928.htmlSigned-off-by: Ronny Hegewald <ronny.hegewald@online.de>
Signed-off-by: Stefano Panella <stefano.panella@citrix.com>
Acked-By: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: stable@vger.kernel.org

b5031ed1

xen: fix logical error in tlb flushing · ce7184bd

Alex Shi authored Aug 24, 2012

While TLB_FLUSH_ALL gets passed as 'end' argument to
flush_tlb_others(), the Xen code was made to check its 'start'
parameter. That may give a incorrect op.cmd to MMUEXT_INVLPG_MULTI
instead of MMUEXT_TLB_FLUSH_MULTI. Then it causes some page can not
be flushed from TLB.

This patch fixed this issue.
Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Alex Shi <alex.shi@intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Tested-by: Yongjie Ren <yongjie.ren@intel.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

ce7184bd

Merge commit '' into stable/for-linus-3.6 · 593d0a3e

Konrad Rzeszutek Wilk authored Sep 05, 2012

* commit '4cb38750': (6849 commits)
  bcma: fix invalid PMU chip control masks
  [libata] pata_cmd64x: whitespace cleanup
  libata-acpi: fix up for acpi_pm_device_sleep_state API
  sata_dwc_460ex: device tree may specify dma_channel
  ahci, trivial: fixed coding style issues related to braces
  ahci_platform: add hibernation callbacks
  libata-eh.c: local functions should not be exposed globally
  libata-transport.c: local functions should not be exposed globally
  sata_dwc_460ex: support hardreset
  ata: use module_pci_driver
  drivers/ata/pata_pcmcia.c: adjust suspicious bit operation
  pata_imx: Convert to clk_prepare_enable/clk_disable_unprepare
  ahci: Enable SB600 64bit DMA on MSI K9AGM2 (MS-7327) v2
  [libata] Prevent interface errors with Seagate FreeAgent GoFlex
  drivers/acpi/glue: revert accidental license-related 6b66d958 bits
  libata-acpi: add missing inlines in libata.h
  i2c-omap: Add support for I2C_M_STOP message flag
  i2c: Fall back to emulated SMBus if the operation isn't supported natively
  i2c: Add SCCB support
  i2c-tiny-usb: Add support for the Robofuzz OSIF USB/I2C converter
  ...

593d0a3e

xen/p2m: Fix one-off error in checking the P2M tree directory. · 50e90041

Konrad Rzeszutek Wilk authored Sep 04, 2012

We would traverse the full P2M top directory (from 0->MAX_DOMAIN_PAGES
inclusive) when trying to figure out whether we can re-use some of the
P2M middle leafs.

Which meant that if the kernel was compiled with MAX_DOMAIN_PAGES=512
we would try to use the 512th entry. Fortunately for us the p2m_top_index
has a check for this:

 BUG_ON(pfn >= MAX_P2M_PFN);

which we hit and saw this:

(XEN) domain_crash_sync called from entry.S
(XEN) Domain 0 (vcpu#0) crashed on cpu#0:
(XEN) ----[ Xen-4.1.2-OVM  x86_64  debug=n  Tainted:    C ]----
(XEN) CPU:    0
(XEN) RIP:    e033:[<ffffffff819cadeb>]
(XEN) RFLAGS: 0000000000000212   EM: 1   CONTEXT: pv guest
(XEN) rax: ffffffff81db5000   rbx: ffffffff81db4000   rcx: 0000000000000000
(XEN) rdx: 0000000000480211   rsi: 0000000000000000   rdi: ffffffff81db4000
(XEN) rbp: ffffffff81793db8   rsp: ffffffff81793d38   r8:  0000000008000000
(XEN) r9:  4000000000000000   r10: 0000000000000000   r11: ffffffff81db7000
(XEN) r12: 0000000000000ff8   r13: ffffffff81df1ff8   r14: ffffffff81db6000
(XEN) r15: 0000000000000ff8   cr0: 000000008005003b   cr4: 00000000000026f0
(XEN) cr3: 0000000661795000   cr2: 0000000000000000

Fixes-Oracle-Bug: 14570662
CC: stable@vger.kernel.org # only for v3.5
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

50e90041

powerpc: Don't use __put_user() in patch_instruction · 636802ef

Benjamin Herrenschmidt authored Sep 04, 2012

patch_instruction() can be called very early on ppc32, when the kernel
isn't yet running at it's linked address. That can cause the !
is_kernel_addr() test in __put_user() to trip and call might_sleep()
which is very bad at that point during boot.

Use a lower level function instead for now, at least until we get to
rework ppc32 boot process to do the code patching later, like ppc64
does.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

636802ef

powerpc: Make sure IPI handlers see data written by IPI senders · 9fb1b36c

Paul Mackerras authored Sep 04, 2012

We have been observing hangs, both of KVM guest vcpu tasks and more
generally, where a process that is woken doesn't properly wake up and
continue to run, but instead sticks in TASK_WAKING state.  This
happens because the update of rq->wake_list in ttwu_queue_remote()
is not ordered with the update of ipi_message in
smp_muxed_ipi_message_pass(), and the reading of rq->wake_list in
scheduler_ipi() is not ordered with the reading of ipi_message in
smp_ipi_demux().  Thus it is possible for the IPI receiver not to see
the updated rq->wake_list and therefore conclude that there is nothing
for it to do.

In order to make sure that anything done before smp_send_reschedule()
is ordered before anything done in the resulting call to scheduler_ipi(),
this adds barriers in smp_muxed_message_pass() and smp_ipi_demux().
The barrier in smp_muxed_message_pass() is a full barrier to ensure that
there is a full ordering between the smp_send_reschedule() caller and
scheduler_ipi().  In smp_ipi_demux(), we use xchg() rather than
xchg_local() because xchg() includes release and acquire barriers.
Using xchg() rather than xchg_local() makes sense given that
ipi_message is not just accessed locally.

This moves the barrier between setting the message and calling the
cause_ipi() function into the individual cause_ipi implementations.
Most of them -- those that used outb, out_8 or similar -- already had
a full barrier because out_8 etc. include a sync before the MMIO
store.  This adds an explicit barrier in the two remaining cases.

These changes made no measurable difference to the speed of IPIs as
measured using a simple ping-pong latency test across two CPUs on
different cores of a POWER7 machine.

The analysis of the reason why processes were not waking up properly
is due to Milton Miller.

Cc: stable@vger.kernel.org # v3.0+
Reported-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

9fb1b36c

powerpc: Restore correct DSCR in context switch · 71433285

Anton Blanchard authored Sep 03, 2012

During a context switch we always restore the per thread DSCR value.
If we aren't doing explicit DSCR management
(ie thread.dscr_inherit == 0) and the default DSCR changed while
the process has been sleeping we end up with the wrong value.

Check thread.dscr_inherit and select the default DSCR or per thread
DSCR as required.

This was found with the following test case, when running with
more threads than CPUs (ie forcing context switching):

http://ozlabs.org/~anton/junkcode/dscr_default_test.c

With the four patches applied I can run a combination of all
test cases successfully at the same time:

http://ozlabs.org/~anton/junkcode/dscr_default_test.c
http://ozlabs.org/~anton/junkcode/dscr_explicit_test.c
http://ozlabs.org/~anton/junkcode/dscr_inherit_test.cSigned-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org> # 3.0+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

71433285

powerpc: Fix DSCR inheritance in copy_thread() · 1021cb26

Anton Blanchard authored Sep 03, 2012

If the default DSCR is non zero we set thread.dscr_inherit in
copy_thread() meaning the new thread and all its children will ignore
future updates to the default DSCR. This is not intended and is
a change in behaviour that a number of our users have hit.

We just need to inherit thread.dscr and thread.dscr_inherit from
the parent which ends up being much simpler.

This was found with the following test case:

http://ozlabs.org/~anton/junkcode/dscr_default_test.cSigned-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org> # 3.0+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

1021cb26

powerpc: Keep thread.dscr and thread.dscr_inherit in sync · 00ca0de0

Anton Blanchard authored Sep 03, 2012

When we update the DSCR either via emulation of mtspr(DSCR) or via
a change to dscr_default in sysfs we don't update thread.dscr.
We will eventually update it at context switch time but there is
a period where thread.dscr is incorrect.

If we fork at this point we will copy the old value of thread.dscr
into the child. To avoid this, always keep thread.dscr in sync with
reality.

This issue was found with the following testcase:

http://ozlabs.org/~anton/junkcode/dscr_inherit_test.cSigned-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org> # 3.0+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

00ca0de0

powerpc: Update DSCR on all CPUs when writing sysfs dscr_default · 1b6ca2a6

Anton Blanchard authored Sep 03, 2012

Writing to dscr_default in sysfs doesn't actually change the DSCR -
we rely on a context switch on each CPU to do the work. There is no
guarantee we will get a context switch in a reasonable amount of time
so fire off an IPI to force an immediate change.

This issue was found with the following test case:

http://ozlabs.org/~anton/junkcode/dscr_explicit_test.cSigned-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org> # 3.0+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

1b6ca2a6

powerpc/powernv: Always go into nap mode when CPU is offline · 375f561a

Paul Mackerras authored Jul 26, 2012

The CPU hotplug code for the powernv platform currently only puts
offline CPUs into nap mode if the powersave_nap variable is set.
However, HV-style KVM on this platform requires secondary CPU threads
to be offline and in nap mode.  Since we know nap mode works just
fine on all POWER7 machines, and the only machines that support the
powernv platform are POWER7 machines, this changes the code to
always put offline CPUs into nap mode, regardless of powersave_nap.
Powersave_nap still controls whether or not CPUs go into nap mode
when idle, as before.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

375f561a

powerpc: Give hypervisor decrementer interrupts their own handler · dabe859e

Paul Mackerras authored Jul 26, 2012

At the moment the handler for hypervisor decrementer interrupts is
the same as for decrementer interrupts, i.e. timer_interrupt().
This is bogus; if we ever do get a hypervisor decrementer interrupt
it won't have anything to do with the next timer event.  In fact
the only time we get hypervisor decrementer interrupts is when one
is left pending on exit from a KVM guest.

When we get a hypervisor decrementer interrupt we don't need to do
anything special to clear it, since they are edge-triggered on the
transition of HDEC from 0 to -1.  Thus this adds an empty handler
function for them.  We don't need to have them masked when interrupts
are soft-disabled, so we use STD_EXCEPTION_HV instead of
MASKABLE_EXCEPTION_HV.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

dabe859e