Commits · 3008ddbe061b0f1d5c8ffbb599f105b67cf06637 · Kirill Smelkov / linux

21 Jan, 2014 5 commits

mfd: max14577: Add max14577 MFD driver core · 3008ddbe

Chanwoo Choi authored Nov 22, 2013

This patch adds max14577 core/irq driver to support MUIC(Micro USB IC)
device and charger device and support irq domain method to control
internal interrupt of max14577 device. Also, this patch supports DT
binding with max14577_i2c_parse_dt().

The MAXIM 14577 chip contains Micro-USB Interface Circuit and Li+ Battery
Charger. It contains accessory and USB charger detection logic. It supports
USB 2.0 Hi-Speed, UART and stereo audio signals over Micro-USB connector.

The battery charger is compliant with the USB Battery Charging Specification
Revision 1.1. It has also SFOUT LDO output for powering USB devices.
Reviewed-by: Mark Brown <broonie@linaro.org>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>

3008ddbe

mfd: sec: Add PM ops and make it a wake up source · f6d6daaf

Krzysztof Kozlowski authored Nov 26, 2013

Add PM suspend/resume ops to the sec MFD core driver and make it a wake
up source. This allows proper waking from suspend to RAM and also fixes
broken interrupts after resuming:
[   42.705703] sec_pmic 7-0066: Failed to read IRQ status: -5

Interrupts stop working after first resume initiated by them (e.g. by
RTC Alarm interrupt) because interrupt registers were not cleared properly.

When device is woken up from suspend by RTC Alarm, an interrupt occurs
before resuming I2C bus controller. The interrupt is handled by
regmap_irq_thread which tries to read RTC registers. This read fails
(I2C is still suspended) and RTC Alarm interrupt is disabled.

Disable the S5M8767 interrupts during suspend (disable_irq()) and enable
them during resume so the device will be still woken up but the interrupt
won't happen before resuming I2C bus.
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>

f6d6daaf

mfd: cros ec: i2c: Use consistent function names · 192afe5e

Thierry Reding authored Nov 25, 2013

Rename cros_ec_{probe,remove}_i2c() to cros_ec_i2c_{probe,remove}() for
consistency.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>

192afe5e

mfd: cros ec: spi: Use consistent function names · 9922b412

Thierry Reding authored Nov 25, 2013

Rename cros_ec_{probe,remove}_spi() to cros_ec_spi_{probe,remove}() for
consistency.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>

9922b412

mfd: mc13xxx: Remove unnecessary spi_set_drvdata() · e553fa6e

Jingoo Han authored Nov 25, 2013

The driver core clears the driver data to NULL after device_release
or on probe failure. Thus, it is not needed to manually clear the
device driver data to NULL.
Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>

e553fa6e

06 Jan, 2014 17 commits

mfd: as3722: Add watchdog support · 603ab143

Bibek Basu authored Nov 21, 2013

Add watchdog device support for as3722
Signed-off-by: Bibek Basu <bbasu@nvidia.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>

603ab143

mfd: ti: Constify struct mfd_cell where possible · 30fe2b5b

Geert Uytterhoeven authored Nov 18, 2013

As of commit 03e361b2 ("mfd: Stop setting
refcounting pointers in original mfd_cell arrays"), the "cell" parameter of
mfd_add_devices() is "const" again. Hence make all cell data passed to
mfd_add_devices() const where possible.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>

30fe2b5b

mfd: Constify struct mfd_cell where possible · 5ac98553

Geert Uytterhoeven authored Nov 18, 2013

As of commit 03e361b2 ("mfd: Stop setting
refcounting pointers in original mfd_cell arrays"), the "cell" parameter of
mfd_add_devices() is "const" again. Hence make all cell data passed to
mfd_add_devices() const where possible.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>

5ac98553

mfd: staging: Constify struct mfd_cell where possible · 2977dc92

Geert Uytterhoeven authored Nov 18, 2013

As of commit 03e361b2 ("mfd: Stop setting
refcounting pointers in original mfd_cell arrays"), the "cell" parameter of
mfd_add_devices() is "const" again. Hence make all cell data passed to
mfd_add_devices() const where possible.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: devel@driverdev.osuosl.org
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>

2977dc92

mfd: intel: Constify struct mfd_cell where possible · cf3c7cf6

Geert Uytterhoeven authored Nov 18, 2013

As of commit 03e361b2 ("mfd: Stop setting
refcounting pointers in original mfd_cell arrays"), the "cell" parameter of
mfd_add_devices() is "const" again. Hence make all cell data passed to
mfd_add_devices() const where possible.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>

cf3c7cf6

mfd: maxim: Constify struct mfd_cell where possible · 7c0517b1

Geert Uytterhoeven authored Nov 18, 2013

As of commit 03e361b2 ("mfd: Stop setting
refcounting pointers in original mfd_cell arrays"), the "cell" parameter of
mfd_add_devices() is "const" again. Hence make all cell data passed to
mfd_add_devices() const where possible.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>

7c0517b1

mfd: marvell: Constify struct mfd_cell where possible · 04e02417

Geert Uytterhoeven authored Nov 18, 2013

As of commit 03e361b2 ("mfd: Stop setting
refcounting pointers in original mfd_cell arrays"), the "cell" parameter of
mfd_add_devices() is "const" again. Hence make all cell data passed to
mfd_add_devices() const where possible.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>

04e02417

mfd: toshiba: Constify struct mfd_cell where possible · afb580a9

Geert Uytterhoeven authored Nov 18, 2013

As of commit 03e361b2 ("mfd: Stop setting
refcounting pointers in original mfd_cell arrays"), the "cell" parameter of
mfd_add_devices() is "const" again. Hence make all cell data passed to
mfd_add_devices() const where possible.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>

afb580a9

mfd: wolfson: Constify struct mfd_cell where possible · ad59de48

Geert Uytterhoeven authored Nov 18, 2013

As of commit 03e361b2 ("mfd: Stop setting
refcounting pointers in original mfd_cell arrays"), the "cell" parameter of
mfd_add_devices() is "const" again. Hence make all cell data passed to
mfd_add_devices() const where possible.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>

ad59de48

mfd: dialog: Constify struct mfd_cell where possible · c8f675ff

Geert Uytterhoeven authored Nov 18, 2013

As of commit 03e361b2 ("mfd: Stop setting
refcounting pointers in original mfd_cell arrays"), the "cell" parameter of
mfd_add_devices() is "const" again. Hence make all cell data passed to
mfd_add_devices() const where possible.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>

c8f675ff

mfd: stmicro: Constify struct mfd_cell where possible · 6bbb3c4c

Geert Uytterhoeven authored Nov 18, 2013

As of commit 03e361b2 ("mfd: Stop setting
refcounting pointers in original mfd_cell arrays"), the "cell" parameter of
mfd_add_devices() is "const" again. Hence make all cell data passed to
mfd_add_devices() const where possible.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>

6bbb3c4c

mfd: cros ec: spi: Fix debug output · 9e146f43

Thierry Reding authored Nov 18, 2013

The dev_cont() symbol doesn't exist, so replace it with pr_cont(). While
at it, also append a newline to the debug output to make it look nicer.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>

9e146f43

mfd: cros ec: spi: Increase EC transaction delay · 49f91ac3

Derek Basehore authored Nov 18, 2013

50 us is not a long enough delay between EC transactions. At least 70 us
are needed for the 16 MHz STM32L part. Increase the delay to 200 us for
an extra safety margin.
Reviewed-by: Randall Spangler <rspangler@chromium.org>
Signed-off-by: Derek Basehore <dbasehore@chromium.org>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>

49f91ac3

mfd: omap-usb-host: Raw read and write endian fix · 9981a314

Victor Kamensky authored Nov 16, 2013

All OMAP IP blocks expect LE data, but CPU may operate in BE mode.
Need to use endian neutral functions to read/write h/w registers.
I.e instead of __raw_read[lw] and __raw_write[lw] functions code
need to use read[lw]_relaxed and write[lw]_relaxed functions.
If the first simply reads/writes register, the second will byteswap
it if host operates in BE mode.

Changes are trivial sed like replacement of __raw_xxx functions
with xxx_relaxed variant.
Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org>
Signed-off-by: Taras Kondratiuk <taras.kondratiuk@linaro.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>

9981a314

mfd: omap-usb-tll: Raw read and write endian fix · dd6eb26f

Victor Kamensky authored Nov 16, 2013

All OMAP IP blocks expect LE data, but CPU may operate in BE mode.
Need to use endian neutral functions to read/write h/w registers.
I.e instead of __raw_read[lw] and __raw_write[lw] functions code
need to use read[lw]_relaxed and write[lw]_relaxed functions.
If the first simply reads/writes register, the second will byteswap
it if host operates in BE mode.

Changes are trivial sed like replacement of __raw_xxx functions
with xxx_relaxed variant.
Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org>
Signed-off-by: Taras Kondratiuk <taras.kondratiuk@linaro.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>

dd6eb26f

mfd: wm5110: Correct default for register LDO2 Control 1 · c94d37c0

Charles Keepax authored Nov 15, 2013

Signed-off-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>

c94d37c0

mfd: ab8500-debugfs: Move dereference after check for NULL · 7c0b2380

Dan Carpenter authored Nov 13, 2013

We dereference "desc" before check if it is NULL. I've shifted it
around so we check first before dereferencing.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>

7c0b2380

04 Jan, 2014 1 commit
- Linux 3.13-rc7 · d6e0a2dd
  Linus Torvalds authored Jan 04, 2014
  
  d6e0a2dd
03 Jan, 2014 2 commits

Merge tag 'for-v3.13-fixes' of git://git.infradead.org/battery-2.6 · 9a2f1aad

Linus Torvalds authored Jan 03, 2014

Pull battery fixes from Anton Vorontsov:
 "Two fixes:

   - fix build error caused by max17042_battery conversion to the regmap
     API.

   - fix kernel oops when booting with wakeup_source_activate enabled"

* tag 'for-v3.13-fixes' of git://git.infradead.org/battery-2.6:
  max17042_battery: Fix build errors caused by missing REGMAP_I2C config
  power_supply: Fix Oops from NULL pointer dereference from wakeup_source_activate

9a2f1aad

Merge tag 'pm+acpi-3.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 23e8e590

Linus Torvalds authored Jan 03, 2014

Pull ACPI and PM fixes and new device IDs from Rafael Wysocki:
"These commits, except for one, are regression fixes and the remaining
one fixes a divide error leading to a kernel panic. The majority of
the regressions fixed here were introduced during the 3.12 cycle, one
of them is from this cycle and one is older.

Specifics:

- VGA switcheroo was broken for some users as a result of the
ACPI-based PCI hotplug (ACPIPHP) changes in 3.12, because some
previously ignored hotplug events started to be handled. The fix
causes them to be ignored again.

- There are two more issues related to cpufreq's suspend/resume
handling changes from the 3.12 cycle addressed by Viresh Kumar's
fixes.

- intel_pstate triggers a divide error in a timer function if the
P-state information it needs is missing during initialization.
This leads to kernel panics on nested KVM clients and is fixed by
failing the initialization cleanly in those cases.

- PCI initalization code changes during the 3.9 cycle uncovered BIOS
issues related to ACPI wakeup notifications (some BIOSes send them
for devices that aren't supposed to support ACPI wakeup). Work
around them by installing an ACPI wakeup notify handler for all PCI
devices with ACPI support.

- The Calxeda cpuilde driver's probe function is tagged as __init,
which is incorrect and causes a section mismatch to occur during
build. Fix from Andre Przywara removes the __init tag from there.

- During the 3.12 cycle ACPIPHP started to print warnings about
missing _ADR for devices that legitimately don't have it. Fix from
Toshi Kani makes it only print the warnings where they make sense"

* tag 'pm+acpi-3.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPIPHP / radeon / nouveau: Fix VGA switcheroo problem related to hotplug
intel_pstate: Fail initialization if P-state information is missing
ARM/cpuidle: remove __init tag from Calxeda cpuidle probe function
PCI / ACPI: Install wakeup notify handlers for all PCI devs with ACPI
cpufreq: preserve user_policy across suspend/resume
cpufreq: Clean up after a failing light-weight initialization
ACPI / PCI / hotplug: Avoid warning when _ADR not present

23e8e590

02 Jan, 2014 15 commits

Merge git://git.kernel.org/pub/scm/virt/kvm/kvm · 7a262d2e

Linus Torvalds authored Jan 02, 2014

Pull kvm bugfixes from Marcelo Tosatti.

* git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: nVMX: Unconditionally uninit the MMU on nested vmexit
  KVM: x86: Fix APIC map calculation after re-enabling

7a262d2e

Merge branch 'akpm' (incoming from Andrew) · 06f055f3

Linus Torvalds authored Jan 02, 2014

Merge patches from Andrew Morton:
 "Ten fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  epoll: do not take the nested ep->mtx on EPOLL_CTL_DEL
  sh: add EXPORT_SYMBOL(min_low_pfn) and EXPORT_SYMBOL(max_low_pfn) to sh_ksyms_32.c
  drivers/dma/ioat/dma.c: check DMA mapping error in ioat_dma_self_test()
  mm/memory-failure.c: transfer page count from head page to tail page after split thp
  MAINTAINERS: set up proper record for Xilinx Zynq
  mm: remove bogus warning in copy_huge_pmd()
  memcg: fix memcg_size() calculation
  mm: fix use-after-free in sys_remap_file_pages
  mm: munlock: fix deadlock in __munlock_pagevec()
  mm: munlock: fix a bug where THP tail page is encountered

06f055f3

epoll: do not take the nested ep->mtx on EPOLL_CTL_DEL · 4ff36ee9

Jason Baron authored Jan 02, 2014

The EPOLL_CTL_DEL path of epoll contains a classic, ab-ba deadlock.
That is, epoll_ctl(a, EPOLL_CTL_DEL, b, x), will deadlock with
epoll_ctl(b, EPOLL_CTL_DEL, a, x).  The deadlock was introduced with
commmit 67347fe4 ("epoll: do not take global 'epmutex' for simple
topologies").

The acquistion of the ep->mtx for the destination 'ep' was added such
that a concurrent EPOLL_CTL_ADD operation would see the correct state of
the ep (Specifically, the check for '!list_empty(&f.file->f_ep_links')

However, by simply not acquiring the lock, we do not serialize behind
the ep->mtx from the add path, and thus may perform a full path check
when if we had waited a little longer it may not have been necessary.
However, this is a transient state, and performing the full loop
checking in this case is not harmful.

The important point is that we wouldn't miss doing the full loop
checking when required, since EPOLL_CTL_ADD always locks any 'ep's that
its operating upon.  The reason we don't need to do lock ordering in the
add path, is that we are already are holding the global 'epmutex'
whenever we do the double lock.  Further, the original posting of this
patch, which was tested for the intended performance gains, did not
perform this additional locking.
Signed-off-by: Jason Baron <jbaron@akamai.com>
Cc: Nathan Zimmer <nzimmer@sgi.com>
Cc: Eric Wong <normalperson@yhbt.net>
Cc: Nelson Elhage <nelhage@nelhage.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Davide Libenzi <davidel@xmailserver.org>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

4ff36ee9

sh: add EXPORT_SYMBOL(min_low_pfn) and EXPORT_SYMBOL(max_low_pfn) to sh_ksyms_32.c · ad70b029

Nobuhiro Iwamatsu authored Jan 02, 2014

Min_low_pfn and max_low_pfn were used in pfn_valid macro if defined
CONFIG_FLATMEM.  When the functions that use the pfn_valid is used in
driver module, max_low_pfn and min_low_pfn is to undefined, and fail to
build.

  ERROR: "min_low_pfn" [drivers/block/aoe/aoe.ko] undefined!
  ERROR: "max_low_pfn" [drivers/block/aoe/aoe.ko] undefined!
  make[2]: *** [__modpost] Error 1
  make[1]: *** [modules] Error 2

This patch fix this problem.
Signed-off-by: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
Cc: Kuninori Morimoto <kuninori.morimoto.gx@gmail.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ad70b029

drivers/dma/ioat/dma.c: check DMA mapping error in ioat_dma_self_test() · 3532e566

Jiang Liu authored Jan 02, 2014

Check DMA mapping return values in function ioat_dma_self_test() to get
rid of following warning message.

  ------------[ cut here ]------------
  WARNING: CPU: 0 PID: 1203 at lib/dma-debug.c:937 check_unmap+0x4c0/0x9a0()
  ioatdma 0000:00:04.0: DMA-API: device driver failed to check map error[device address=0x000000085191b000] [size=2000 bytes] [mapped as single]
  Modules linked in: ioatdma(+) mac_hid wmi acpi_pad lp parport hidd_generic usbhid hid ixgbe isci dca libsas ahci ptp libahci scsi_transport_sas meegaraid_sas pps_core mdio
  CPU: 0 PID: 1203 Comm: systemd-udevd Not tainted 3.13.0-rc4+ #8
  Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRIVTIIN1.86B.0044.L09.1311181644 11/18/2013
  Call Trace:
    dump_stack+0x4d/0x66
    warn_slowpath_common+0x7d/0xa0
    warn_slowpath_fmt+0x4c/0x50
    check_unmap+0x4c0/0x9a0
    debug_dma_unmap_page+0x81/0x90
    ioat_dma_self_test+0x3d2/0x680 [ioatdma]
    ioat3_dma_self_test+0x12/0x30 [ioatdma]
    ioat_probe+0xf4/0x110 [ioatdma]
    ioat3_dma_probe+0x268/0x410 [ioatdma]
    ioat_pci_probe+0x122/0x1b0 [ioatdma]
    local_pci_probe+0x45/0xa0
    pci_device_probe+0xd9/0x130
    driver_probe_device+0x171/0x490
    __driver_attach+0x93/0xa0
    bus_for_each_dev+0x6b/0xb0
    driver_attach+0x1e/0x20
    bus_add_driver+0x1f8/0x2b0
    driver_register+0x81/0x110
    __pci_register_driver+0x60/0x70
    ioat_init_module+0x89/0x1000 [ioatdma]
    do_one_initcall+0xe2/0x250
    load_module+0x2313/0x2a00
    SyS_init_module+0xd9/0x130
    system_call_fastpath+0x1a/0x1f
  ---[ end trace 990c591681d27c31 ]---
  Mapped at:
    debug_dma_map_page+0xbe/0x180
    ioat_dma_self_test+0x1ab/0x680 [ioatdma]
    ioat3_dma_self_test+0x12/0x30 [ioatdma]
    ioat_probe+0xf4/0x110 [ioatdma]
    ioat3_dma_probe+0x268/0x410 [ioatdma]
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Vinod Koul <vinod.koul@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

3532e566

mm/memory-failure.c: transfer page count from head page to tail page after split thp · a3e0f9e4

Naoya Horiguchi authored Jan 02, 2014

Memory failures on thp tail pages cause kernel panic like below:

   mce: [Hardware Error]: Machine check events logged
   MCE exception done on CPU 7
   BUG: unable to handle kernel NULL pointer dereference at 0000000000000058
   IP: [<ffffffff811b7cd1>] dequeue_hwpoisoned_huge_page+0x131/0x1e0
   PGD bae42067 PUD ba47d067 PMD 0
   Oops: 0000 [#1] SMP
  ...
   CPU: 7 PID: 128 Comm: kworker/7:2 Tainted: G   M       O 3.13.0-rc4-131217-1558-00003-g83b7df08e462 #25
  ...
   Call Trace:
     me_huge_page+0x3e/0x50
     memory_failure+0x4bb/0xc20
     mce_process_work+0x3e/0x70
     process_one_work+0x171/0x420
     worker_thread+0x11b/0x3a0
     ? manage_workers.isra.25+0x2b0/0x2b0
     kthread+0xe4/0x100
     ? kthread_create_on_node+0x190/0x190
     ret_from_fork+0x7c/0xb0
     ? kthread_create_on_node+0x190/0x190
  ...
   RIP   dequeue_hwpoisoned_huge_page+0x131/0x1e0
   CR2: 0000000000000058

The reasoning of this problem is shown below:
 - when we have a memory error on a thp tail page, the memory error
   handler grabs a refcount of the head page to keep the thp under us.
 - Before unmapping the error page from processes, we split the thp,
   where page refcounts of both of head/tail pages don't change.
 - Then we call try_to_unmap() over the error page (which was a tail
   page before). We didn't pin the error page to handle the memory error,
   this error page is freed and removed from LRU list.
 - We never have the error page on LRU list, so the first page state
   check returns "unknown page," then we move to the second check
   with the saved page flag.
 - The saved page flag have PG_tail set, so the second page state check
   returns "hugepage."
 - We call me_huge_page() for freed error page, then we hit the above panic.

The root cause is that we didn't move refcount from the head page to the
tail page after split thp.  So this patch suggests to do this.

This panic was introduced by commit 524fca1e ("HWPOISON: fix
misjudgement of page_action() for errors on mlocked pages").  Note that we
did have the same refcount problem before this commit, but it was just
ignored because we had only first page state check which returned "unknown
page." The commit changed the refcount problem from "doesn't work" to
"kernel panic."
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: <stable@vger.kernel.org>	[3.9+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

a3e0f9e4

MAINTAINERS: set up proper record for Xilinx Zynq · c2fd4e38

Michal Simek authored Jan 02, 2014

Setup correct zynq entry.
 - Add missing cadence_ttc_timer maintainership
 - Add zynq wildcard
 - Add xilinx wildcard
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

c2fd4e38

mm: remove bogus warning in copy_huge_pmd() · d0319bd5

Mel Gorman authored Jan 02, 2014

Sasha Levin reported the following warning being triggered

  WARNING: CPU: 28 PID: 35287 at mm/huge_memory.c:887 copy_huge_pmd+0x145/ 0x3a0()
  Call Trace:
    copy_huge_pmd+0x145/0x3a0
    copy_page_range+0x3f2/0x560
    dup_mmap+0x2c9/0x3d0
    dup_mm+0xad/0x150
    copy_process+0xa68/0x12e0
    do_fork+0x96/0x270
    SyS_clone+0x16/0x20
    stub_clone+0x69/0x90

This warning was introduced by "mm: numa: Avoid unnecessary disruption
of NUMA hinting during migration" for paranoia reasons but the warning
is bogus.  I was thinking of parallel races between NUMA hinting faults
and forks but this warning would also be triggered by a parallel reclaim
splitting a THP during a fork.  Remote the bogus warning.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d0319bd5

memcg: fix memcg_size() calculation · 695c6083

Vladimir Davydov authored Jan 02, 2014

The mem_cgroup structure contains nr_node_ids pointers to
mem_cgroup_per_node objects, not the objects themselves.
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Glauber Costa <glommer@openvz.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

695c6083

mm: fix use-after-free in sys_remap_file_pages · 4eb91982

Rik van Riel authored Jan 02, 2014

remap_file_pages calls mmap_region, which may merge the VMA with other
existing VMAs, and free "vma".  This can lead to a use-after-free bug.
Avoid the bug by remembering vm_flags before calling mmap_region, and
not trying to dereference vma later.
Signed-off-by: Rik van Riel <riel@redhat.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: PaX Team <pageexec@freemail.hu>
Cc: Kees Cook <keescook@chromium.org>
Cc: Michel Lespinasse <walken@google.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

4eb91982

mm: munlock: fix deadlock in __munlock_pagevec() · 3b25df93

Vlastimil Babka authored Jan 02, 2014

Commit 7225522b ("mm: munlock: batch non-THP page isolation and
munlock+putback using pagevec" introduced __munlock_pagevec() to speed
up munlock by holding lru_lock over multiple isolated pages.  Pages that
fail to be isolated are put_page()d immediately, also within the lock.

This can lead to deadlock when __munlock_pagevec() becomes the holder of
the last page pin and put_page() leads to __page_cache_release() which
also locks lru_lock.  The deadlock has been observed by Sasha Levin
using trinity.

This patch avoids the deadlock by deferring put_page() operations until
lru_lock is released.  Another pagevec (which is also used by later
phases of the function is reused to gather the pages for put_page()
operation.
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

3b25df93

mm: munlock: fix a bug where THP tail page is encountered · c424be1c

Vlastimil Babka authored Jan 02, 2014

Since commit ff6a6da6 ("mm: accelerate munlock() treatment of THP
pages") munlock skips tail pages of a munlocked THP page.  However, when
the head page already has PageMlocked unset, it will not skip the tail
pages.

Commit 7225522b ("mm: munlock: batch non-THP page isolation and
munlock+putback using pagevec") has added a PageTransHuge() check which
contains VM_BUG_ON(PageTail(page)).  Sasha Levin found this triggered
using trinity, on the first tail page of a THP page without PageMlocked
flag.

This patch fixes the issue by skipping tail pages also in the case when
PageMlocked flag is unset.  There is still a possibility of race with
THP page split between clearing PageMlocked and determining how many
pages to skip.  The race might result in former tail pages not being
skipped, which is however no longer a bug, as during the skip the
PageTail flags are cleared.

However this race also affects correctness of NR_MLOCK accounting, which
is to be fixed in a separate patch.
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Bob Liu <bob.liu@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

c424be1c

Merge tag 'gfs2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-fixes · 152b734a

Linus Torvalds authored Jan 02, 2014

Pull GFS2 fixes from Steven Whitehouse:
 "Here is a set of small fixes for GFS2.  There is a fix to drop
  s_umount which is copied in from the core vfs, two patches relate to a
  hard to hit "use after free" and memory leak.  Two patches related to
  using DIO and buffered I/O on the same file to ensure correct
  operation in relation to glock state changes.  The final patch adds an
  RCU read lock to ensure correct locking on an error path"

* tag 'gfs2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-fixes:
  GFS2: Fix unsafe dereference in dump_holder()
  GFS2: Wait for async DIO in glock state changes
  GFS2: Fix incorrect invalidation for DIO/buffered I/O
  GFS2: Fix slab memory leak in gfs2_bufdata
  GFS2: Fix use-after-free race when calling gfs2_remove_from_ail
  GFS2: don't hold s_umount over blkdev_put

152b734a

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · b4796679

Linus Torvalds authored Jan 02, 2014

Pull s390 fixes from Martin Schwidefsky:
 "Two small bug fixes and a follow-up to the CONFIG_NR_CPUS change.

  A kernel compiled with CONFIG_NR_CPUS=256 will waste quite a bit of
  memory for the per-cpu arrays.  Under z/VM the maximum number of CPUs
  is 64, the code now limits the possible cpu mask to 64 if running
  under z/VM"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/pci: obtain function handle in hotplug notifier
  s390/3270: fix allocation of tty3270_screen structure
  s390/smp: improve setup of possible cpu mask

b4796679

KVM: nVMX: Unconditionally uninit the MMU on nested vmexit · 29bf08f1

Jan Kiszka authored Dec 28, 2013

Three reasons for doing this: 1. arch.walk_mmu points to arch.mmu anyway
in case nested EPT wasn't in use. 2. this aligns VMX with SVM. But 3. is
most important: nested_cpu_has_ept(vmcs12) queries the VMCS page, and if
one guest VCPU manipulates the page of another VCPU in L2, we may be
fooled to skip over the nested_ept_uninit_mmu_context, leaving mmu in
nested state. That can crash the host later on if nested_ept_get_cr3 is
invoked while L1 already left vmxon and nested.current_vmcs12 became
NULL therefore.

Cc: stable@kernel.org
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

29bf08f1