Commits · 9823d713bf257e2e0dc5b6ea64f270d33e1212ee · Kirill Smelkov / linux

16 Dec, 2014 5 commits

net: mvneta: fix Tx interrupt delay · 9823d713

willy tarreau authored Dec 02, 2014

[ Upstream commit aebea2ba ]

The mvneta driver sets the amount of Tx coalesce packets to 16 by
default. Normally that does not cause any trouble since the driver
uses a much larger Tx ring size (532 packets). But some sockets
might run with very small buffers, much smaller than the equivalent
of 16 packets. This is what ping is doing for example, by setting
SNDBUF to 324 bytes rounded up to 2kB by the kernel.

The problem is that there is no documented method to force a specific
packet to emit an interrupt (eg: the last of the ring) nor is it
possible to make the NIC emit an interrupt after a given delay.

In this case, it causes trouble, because when ping sends packets over
its raw socket, the few first packets leave the system, and the first
15 packets will be emitted without an IRQ being generated, so without
the skbs being freed. And since the socket's buffer is small, there's
no way to reach that amount of packets, and the ping ends up with
"send: no buffer available" after sending 6 packets. Running with 3
instances of ping in parallel is enough to hide the problem, because
with 6 packets per instance, that's 18 packets total, which is enough
to grant a Tx interrupt before all are sent.

The original driver in the LSP kernel worked around this design flaw
by using a software timer to clean up the Tx descriptors. This timer
was slow and caused terrible network performance on some Tx-bound
workloads (such as routing) but was enough to make tools like ping
work correctly.

Instead here, we simply set the packet counts before interrupt to 1.
This ensures that each packet sent will produce an interrupt. NAPI
takes care of coalescing interrupts since the interrupt is disabled
once generated.

No measurable performance impact nor CPU usage were observed on small
nor large packets, including when saturating the link on Tx, and this
fixes tools like ping which rely on too small a send buffer. If one
wants to increase this value for certain workloads where it is safe
to do so, "ethtool -C $dev tx-frames" will override this default
setting.

This fix needs to be applied to stable kernels starting with 3.10.
Tested-By: Maggie Mae Roxas <maggie.mae.roxas@gmail.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

9823d713

mips: bpf: Fix broken BPF_MOD · 6c2f1fef

Denis Kirjanov authored Dec 01, 2014

[ Upstream commit 2e46477a ]

Remove optimize_div() from BPF_MOD | BPF_K case
since we don't know the dividend and fix the
emit_mod() by reading the mod operation result from HI register
Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org>
Reviewed-by: Markos Chandras <markos.chandras@imgtec.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6c2f1fef

openvswitch: Fix flow mask validation. · 3e496d49

Pravin B Shelar authored Nov 30, 2014

[ Upstream commit f2a01517 ]

Following patch fixes typo in the flow validation. This prevented
installation of ARP and IPv6 flows.

Fixes: 19e7a3df ("openvswitch: Fix NDP flow mask validation")
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Reviewed-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

3e496d49

gre: Set inner mac header in gro complete · 435dcf66

Tom Herbert authored Nov 29, 2014

[ Upstream commit 6fb2a756 ]

Set the inner mac header to point to the GRE payload when
doing GRO. This is needed if we proceed to send the packet
through GRE GSO which now uses the inner mac header instead
of inner network header to determine the length of encapsulation
headers.

Fixes: 14051f04 ("gre: Use inner mac length when computing tunnel length")
Reported-by: Wolfgang Walter <linux@stwm.de>
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

435dcf66

Fix race condition between vxlan_sock_add and vxlan_sock_release · 8407165b

Marcelo Leitner authored Dec 11, 2014

[ Upstream commit 00c83b01 ]

Currently, when trying to reuse a socket, vxlan_sock_add will grab
vn->sock_lock, locate a reusable socket, inc refcount and release
vn->sock_lock.

But vxlan_sock_release() will first decrement refcount, and then grab
that lock. refcnt operations are atomic but as currently we have
deferred works which hold vs->refcnt each, this might happen, leading to
a use after free (specially after vxlan_igmp_leave):

  CPU 1                            CPU 2

deferred work                    vxlan_sock_add
  ...                              ...
                                   spin_lock(&vn->sock_lock)
                                   vs = vxlan_find_sock();
  vxlan_sock_release
    dec vs->refcnt, reaches 0
    spin_lock(&vn->sock_lock)
                                   vxlan_sock_hold(vs), refcnt=1
                                   spin_unlock(&vn->sock_lock)
    hlist_del_rcu(&vs->hlist);
    vxlan_notify_del_rx_port(vs)
    spin_unlock(&vn->sock_lock)

So when we look for a reusable socket, we check if it wasn't freed
already before reusing it.
Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Fixes: 7c47cedf ("vxlan: move IGMP join/leave to work queue")
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

8407165b

07 Dec, 2014 2 commits

Linux 3.18 · b2776bf7
Linus Torvalds authored Dec 07, 2014

b2776bf7

Merge branch 'for-3.18-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata · 820b688b

Linus Torvalds authored Dec 07, 2014

Pull libata fixes from Tejun Heo:
 "Three libata fixes for v3.18.  Nothing too interesting.  PCI ID ID and
  quirk additions to ahci and an error handling path fix in sata_fsl"

* 'for-3.18-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
  ahci: disable MSI on SAMSUNG 0xa800 SSD
  sata_fsl: fix error handling of irq_of_parse_and_map
  AHCI: Add DeviceIDs for Sunrise Point-LP SATA controller

820b688b

06 Dec, 2014 2 commits

Merge git://www.linux-watchdog.org/linux-watchdog · 19b02257

Linus Torvalds authored Dec 06, 2014

Pull watchdog fix from Wim Van Sebroeck:
 "Fix the watchdog mask bit offset for Exynos7"

* git://www.linux-watchdog.org/linux-watchdog:
  watchdog: s3c2410_wdt: Fix the mask bit offset for Exynos7

19b02257

Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 15bd1e5c

Linus Torvalds authored Dec 06, 2014

Pull i2c fixes from Wolfram Sang:
 "Here are two more driver bugfixes for I2C which would be good to have"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: cadence: Set the hardware time-out register to maximum value
  i2c: davinci: generate STP always when NACK is received

15bd1e5c

05 Dec, 2014 7 commits

watchdog: s3c2410_wdt: Fix the mask bit offset for Exynos7 · 5476b2b7

Abhilash Kesavan authored Oct 17, 2014

The watchdog mask bit offset listed for Exynos7 is incorrect.
Fix this.
Signed-off-by: Abhilash Kesavan <a.kesavan@samsung.com>
Acked-by: Naveen Krishna Chatradhi <naveenkrishna.ch@gmail.com
Reviewd-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>

5476b2b7

Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · beb5af40

Linus Torvalds authored Dec 05, 2014

Pull x86 fixes from Thomas Gleixner:
 "Two final fixlets for 3.18:
   - Prevent microcode reload wreckage on 32bit
   - Unbreak cross compilation"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, microcode: Limit the microcode reloading to 64-bit for now
  x86: Use $(OBJDUMP) instead of plain objdump

beb5af40

Merge tag 'sound-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 32f0880b

Linus Torvalds authored Dec 05, 2014

Pull sound fixlet from Takashi Iwai:
 "Just one commit for adding a copule of HD-audio quirk entries"

* tag 'sound-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda/realtek - Add headset Mic support for new Dell machine

32f0880b

Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · ba2cb64b

Linus Torvalds authored Dec 04, 2014

Pull drm intel fixes from Dave Airlie:
 "Two intel stable fixes, that should be it from me for this round"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/i915: Unlock panel even when LVDS is disabled
  drm/i915: More cautious with pch fifo underruns

ba2cb64b

Merge tag 'pm+acpi-3.18-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 56c67ce1

Linus Torvalds authored Dec 04, 2014

Pull ACPI backlight fix from Rafael Wysocki:
 "This is a simple fix for an ACPI backlight regression introduced by a
  recent commit that overlooked a corner case which should have been
  taken into account"

* tag 'pm+acpi-3.18-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI / video: update condition to check if device is in _DOD list

56c67ce1

Merge tag 'drm-intel-fixes-2014-12-04' of git://anongit.freedesktop.org/drm-intel into drm-fixes · 3e3282c0

Dave Airlie authored Dec 05, 2014

Silence some pch fifo underrun reports and panel locking backtraces,
both cc: stable.

* tag 'drm-intel-fixes-2014-12-04' of git://anongit.freedesktop.org/drm-intel:
  drm/i915: Unlock panel even when LVDS is disabled
  drm/i915: More cautious with pch fifo underruns

3e3282c0

Merge tag 'media/v3.18-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media · ebea76f5

Linus Torvalds authored Dec 04, 2014

Pull media fixes from Mauro Carvalho Chehab:
 "A core fix and some driver fixes:
   - regression fix in Remote Controller core affecting RC6 protocol
     handling
   - fix video buffer handling in cx23885
   - race fix in solo6x10
   - fix image selection in smiapp
   - fix reported payload size on s2255drv
   - two updates for MAINTAINERS file"

* tag 'media/v3.18-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  [media] rc-core: fix toggle handling in the rc6 decoder
  MAINTAINERS: Update mchehab's addresses
  [media] cx23885: use sg = sg_next(sg) instead of sg++
  [media] s2255drv: fix payload size for JPG, MJPEG
  [media] Update MAINTAINERS for solo6x10
  [media] solo6x10: fix a race in IRQ handler
  [media] smiapp: Only some selection targets are settable

ebea76f5

04 Dec, 2014 5 commits

uapi: fix to export linux/vm_sockets.h · d0747f10

Masahiro Yamada authored Dec 04, 2014

A typo "header=y" was introduced by commit 7071cf7f ("uapi: add
missing network related headers to kbuild").
Signed-off-by: Masahiro Yamada <yamada.m@jp.panasonic.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d0747f10

i2c: cadence: Set the hardware time-out register to maximum value · 681d15a0

Vishnu Motghare authored Dec 03, 2014

Cadence I2C controller has bug wherein it generates invalid read transactions
after timeout in master receiver mode. This driver does not use the HW
timeout and this interrupt is disabled but the feature itself cannot be
disabled. Hence, this patch writes the maximum value (0xFF) to this register.
This is one of the workarounds to this bug and it will not avoid the issue
completely but reduces the chances of error.
Signed-off-by: Vishnu Motghare <vishnum@xilinx.com>
Signed-off-by: Harini Katakam <harinik@xilinx.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: stable@kernel.org

681d15a0

i2c: davinci: generate STP always when NACK is received · 9ea359f7

Grygorii Strashko authored Dec 01, 2014

According to I2C specification the NACK should be handled as follows:
"When SDA remains HIGH during this ninth clock pulse, this is defined as the Not
Acknowledge signal. The master can then generate either a STOP condition to
abort the transfer, or a repeated START condition to start a new transfer."
[I2C spec Rev. 6, 3.1.6: http://www.nxp.com/documents/user_manual/UM10204.pdf]

Currently the Davinci i2c driver interrupts the transfer on receipt of a
NACK but fails to send a STOP in some situations and so makes the bus
stuck until next I2C IP reset (idle/enable).

For example, the issue will happen during SMBus read transfer which
consists from two i2c messages write command/address and read data:

S Slave Address Wr A Command Code A Sr Slave Address Rd A D1..Dn A P
<--- write -----------------------> <--- read --------------------->

The I2C client device will send NACK if it can't recognize "Command Code"
and it's expected from I2C master to generate STP in this case.
But now, Davinci i2C driver will just exit with -EREMOTEIO and STP will
not be generated.

Hence, fix it by generating Stop condition (STP) always when NACK is received.

This patch fixes Davinci I2C in the same way it was done for OMAP I2C
commit cda2109a ("i2c: omap: query STP always when NACK is received").
Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reported-by: Hein Tibosch <hein_tibosch@yahoo.es>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: stable@kernel.org

9ea359f7

ahci: disable MSI on SAMSUNG 0xa800 SSD · 2b21ef0a

Tejun Heo authored Dec 04, 2014

Just like 0x1600 which got blacklisted by 66a7cbc3 ("ahci: disable
MSI instead of NCQ on Samsung pci-e SSDs on macbooks"), 0xa800 chokes
on NCQ commands if MSI is enabled.  Disable MSI.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Dominik Mierzejewski <dominik@greysector.net>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=89171
Cc: stable@vger.kernel.org

2b21ef0a

context_tracking: Restore previous state in schedule_user · 7cc78f8f

Andy Lutomirski authored Dec 03, 2014

It appears that some SCHEDULE_USER (asm for schedule_user) callers
in arch/x86/kernel/entry_64.S are called from RCU kernel context,
and schedule_user will return in RCU user context.  This causes RCU
warnings and possible failures.

This is intended to be a minimal fix suitable for 3.18.
Reported-and-tested-by: Dave Jones <davej@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

7cc78f8f

03 Dec, 2014 19 commits

Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · ebcd241a

Linus Torvalds authored Dec 03, 2014

Pull i2c bugfixes from Wolfram Sang:
 "A few driver bugfixes for 3.18"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: omap: fix i207 errata handling
  i2c: designware: prevent early stop on TX FIFO empty
  i2c: omap: fix NACK and Arbitration Lost irq handling

ebcd241a

Merge tag 'pci-v3.18-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci · 5dc62635

Linus Torvalds authored Dec 03, 2014

Pull PCI fix from Bjorn Helgaas:
 "This fixes a Tegra20 regression that we introduced during the v3.18
  merge window"

* tag 'pci-v3.18-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI: tegra: Use physical range for I/O mapping

5dc62635

Merge tag 'devicetree-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/glikely/linux · b48a20a5

Linus Torvalds authored Dec 03, 2014

Pull devicetree bugfix from Grant Likely:
 "One more bug fix for v3.18.  I debated whether or not to send you this
  merge request because we're at such a late rc.  The bug isn't critical
  in that there is only one system known to be affected and the patch is
  easy to backport.  The codepath is used by pretty much every DT based
  system, so there is risk a of regression (it /should/ be safe, but
  I've been bitten by stuff that should be safe before).  I've had it in
  linux-next for a week and haven't received any complaints.

  I think it probably should just be merged right away rather than
  waiting for the merge window and backporting.  It does fix a real bug
  and the code is theoretically safer after the change.  I can't think
  of any situation where it would be dangerous to reserve the DT memory
  an extra time.

  Summary from tag:

    Single bugfix for boot failure seen in the wild.  The memory reserve
    code tries to be clever about reserving the FDT, but it should just
    go ahead and reserve it unconditionally to avoid the problem of
    partial overlap described in the patch"

* tag 'devicetree-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/glikely/linux:
  of/fdt: memblock_reserve /memreserve/ regions in the case of partial overlap

b48a20a5

Merge branch 'for-linus' of git://git.kernel.dk/linux-block · 93bd38b3

Linus Torvalds authored Dec 03, 2014

Pull block core regression fix from Jens Axboe:
 "Single fix for a regression introduced in this development cycle,
  where dm on top of dif/dix is broken.  From Darrick Wong"

* 'for-linus' of git://git.kernel.dk/linux-block:
  block: fix regression where bio_integrity_process uses wrong bio_vec iterator

93bd38b3

Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · 46d967ae

Linus Torvalds authored Dec 03, 2014

Pull drm fixes from Dave Airlie:
 "Radeon and Nouveau fixes:

  So nouveau had a few regression introduced, Ben and Maarten finally
  tracked down the one that was causing problems on my MacBookPro, also
  nvidia gave some info on the an engine we were using incorrectly, so
  disable our use of it, and one regresion with pci hotplug affecting
  optimus users.

  Radeon has an oops fixs, sync fix, and one workaround to avoid broken
  functionality on 32-bit x86, this needs better root causing and a
  better fix, but the bandaid is a lot safer at this point"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/radeon: kernel panic in drm_calc_vbltimestamp_from_scanoutpos with 3.18.0-rc6
  drm/radeon: Ignore RADEON_GEM_GTT_WC on 32-bit x86
  drm/radeon: sync all BOs involved in a CS v2
  nouveau: move the hotplug ignore to correct place.
  drm/nouveau/gf116: remove copy1 engine
  drm/nouveau: prevent stale fence->channel pointers, and protect with rcu
  drm/nouveau/fifo/g84-: ack non-stall interrupt before handling it

46d967ae

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 9044f940

Linus Torvalds authored Dec 03, 2014

Pull networking fixes from David Miller:

 1) Fill in ethtool link parameters for all link types in cxgb4, from
    Hariprasad Shenai.

 2) Fix probe regressions in stmmac driver, from Huacai Chen.

 3) Network namespace leaks on errirs in rtnetlink, from Nicolas
    Dichtel.

 4) Remove erroneous BUG check which can actually trigger legitimately,
    in xen-netfront.  From Seth Forshee.

 5) Validate length of IFLA_BOND_ARP_IP_TARGET netlink attributes, from
    Thomas Grag.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  cxgb4: Fill in supported link mode for SFP modules
  xen-netfront: Remove BUGs on paged skb data which crosses a page boundary
  sh_eth: Fix sleeping function called from invalid context
  stmmac: platform: Move plat_dat checking earlier
  sh_eth: Fix skb alloc size and alignment adjust rule.
  rtnetlink: release net refcnt on error in do_setlink()
  bond: Check length of IFLA_BOND_ARP_IP_TARGET attributes

9044f940

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security · 23c836ce

Linus Torvalds authored Dec 03, 2014

Pull keyring/nfs fixes from James Morris:
 "From David Howells:

  The first one fixes the handling of maximum buffer size for key
  descriptions, fixing the size at 4095 + NUL char rather than whatever
  PAGE_SIZE happens to be and permits you to read back the full
  description without it getting clipped because some extra information
  got prepended.

  The second and third fix a bug in NFS idmapper handling whereby a key
  representing a mapping between an id and a name expires and causing
  EKEYEXPIRED to be seen internally in NFS (which prevents the mapping
  from happening) rather than re-looking up the mapping"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  KEYS: request_key() should reget expired keys rather than give EKEYEXPIRED
  KEYS: Simplify KEYRING_SEARCH_{NO,DO}_STATE_CHECK flags
  KEYS: Fix the size of the key description passed to/from userspace

23c836ce

Merge branch 'akpm' (patches from Andrew Morton) · 1dd909af

Linus Torvalds authored Dec 03, 2014

Merge misc fixes from Andrew Morton:
 "10 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  slab: fix nodeid bounds check for non-contiguous node IDs
  lib/genalloc.c: export devm_gen_pool_create() for modules
  mm: fix anon_vma_clone() error treatment
  mm: fix swapoff hang after page migration and fork
  fat: fix oops on corrupted vfat fs
  ipc/sem.c: fully initialize sem_array before making it visible
  drivers/input/evdev.c: don't kfree() a vmalloc address
  mm/vmpressure.c: fix race in vmpressure_work_fn()
  mm: frontswap: invalidate expired data on a dup-store failure
  mm: do not overwrite reserved pages counter at show_mem()

1dd909af

slab: fix nodeid bounds check for non-contiguous node IDs · 7c3fbbdd

Paul Mackerras authored Dec 02, 2014

The bounds check for nodeid in ____cache_alloc_node gives false
positives on machines where the node IDs are not contiguous, leading to
a panic at boot time.  For example, on a POWER8 machine the node IDs are
typically 0, 1, 16 and 17.  This means that num_online_nodes() returns
4, so when ____cache_alloc_node is called with nodeid = 16 the VM_BUG_ON
triggers, like this:

  kernel BUG at /home/paulus/kernel/kvm/mm/slab.c:3079!
  Call Trace:
    .____cache_alloc_node+0x5c/0x270 (unreliable)
    .kmem_cache_alloc_node_trace+0xdc/0x360
    .init_list+0x3c/0x128
    .kmem_cache_init+0x1dc/0x258
    .start_kernel+0x2a0/0x568
    start_here_common+0x20/0xa8

To fix this, we instead compare the nodeid with MAX_NUMNODES, and
additionally make sure it isn't negative (since nodeid is an int).  The
check is there mainly to protect the array dereference in the get_node()
call in the next line, and the array being dereferenced is of size
MAX_NUMNODES.  If the nodeid is in range but invalid (for example if the
node is off-line), the BUG_ON in the next line will catch that.

Fixes: 14e50c6a ("mm: slab: Verify the nodeid passed to ____cache_alloc_node")
Signed-off-by: Paul Mackerras <paulus@samba.org>
Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

7c3fbbdd

lib/genalloc.c: export devm_gen_pool_create() for modules · b724aa21

Michal Simek authored Dec 02, 2014

Modules can use this function for creating pool.
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Acked-by: Lad, Prabhakar <prabhakar.csengg@gmail.com>
Cc: Laura Abbott <lauraa@codeaurora.org>
Cc: Olof Johansson <olof@lixom.net>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
Cc: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

b724aa21

mm: fix anon_vma_clone() error treatment · c4ea95d7

Daniel Forrest authored Dec 02, 2014

Andrew Morton noticed that the error return from anon_vma_clone() was
being dropped and replaced with -ENOMEM (which is not itself a bug
because the only error return value from anon_vma_clone() is -ENOMEM).

I did an audit of callers of anon_vma_clone() and discovered an actual
bug where the error return was being lost.  In __split_vma(), between
Linux 3.11 and 3.12 the code was changed so the err variable is used
before the call to anon_vma_clone() and the default initial value of
-ENOMEM is overwritten.  So a failure of anon_vma_clone() will return
success since err at this point is now zero.

Below is a patch which fixes this bug and also propagates the error
return value from anon_vma_clone() in all cases.

Fixes: ef0855d3 ("mm: mempolicy: turn vma_set_policy() into vma_dup_policy()")
Signed-off-by: Daniel Forrest <dan.forrest@ssec.wisc.edu>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Tim Hartrick <tim@edgecast.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>	[3.12+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

c4ea95d7

mm: fix swapoff hang after page migration and fork · 2022b4d1

Hugh Dickins authored Dec 02, 2014

I've been seeing swapoff hangs in recent testing: it's cycling around
trying unsuccessfully to find an mm for some remaining pages of swap.

I have been exercising swap and page migration more heavily recently,
and now notice a long-standing error in copy_one_pte(): it's trying to
add dst_mm to swapoff's mmlist when it finds a swap entry, but is doing
so even when it's a migration entry or an hwpoison entry.

Which wouldn't matter much, except it adds dst_mm next to src_mm,
assuming src_mm is already on the mmlist: which may not be so.  Then if
pages are later swapped out from dst_mm, swapoff won't be able to find
where to replace them.

There's already a !non_swap_entry() test for stats: move that up before
the swap_duplicate() and the addition to mmlist.
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Kelley Nielsen <kelleynnn@gmail.com>
Cc: <stable@vger.kernel.org>	[2.6.18+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

2022b4d1

fat: fix oops on corrupted vfat fs · 1ead0e79

Al Viro authored Dec 02, 2014

a) don't bother with ->d_time for positives - we only check it for
   negatives anyway.

b) make sure to set it at unlink and rmdir time - at *that* point
   soon-to-be negative dentry matches then-current directory contents

c) don't go into renaming of old alias in vfat_lookup() unless it
   has the same parent (which it will, unless we are seeing corrupted
   image)

[hirofumi@mail.parknet.co.jp: make change minimum, don't call d_move() for dir]
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: <stable@vger.kernel.org>	[3.17.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

1ead0e79

ipc/sem.c: fully initialize sem_array before making it visible · e8577d1f

Manfred Spraul authored Dec 02, 2014

ipc_addid() makes a new ipc identifier visible to everyone.  New objects
start as locked, so that the caller can complete the initialization
after the call.  Within struct sem_array, at least sma->sem_base and
sma->sem_nsems are accessed without any locks, therefore this approach
doesn't work.

Thus: Move the ipc_addid() to the end of the initialization.
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Reported-by: Rik van Riel <riel@redhat.com>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Acked-by: Rafael Aquini <aquini@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

e8577d1f

drivers/input/evdev.c: don't kfree() a vmalloc address · 92788ac1

Andrew Morton authored Dec 02, 2014

If kzalloc() failed and then evdev_open_device() fails, evdev_open()
will pass a vmalloc'ed pointer to kfree.

This might fix https://bugzilla.kernel.org/show_bug.cgi?id=88401, where
there was a crash in kfree().
Reported-by: Christian Casteyde <casteyde.christian@free.fr>
Belatedly-Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Henrik Rydberg <rydberg@euromail.se>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

92788ac1

cxgb4: Fill in supported link mode for SFP modules · 4c2d5186

Hariprasad Shenai authored Nov 28, 2014

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4c2d5186

xen-netfront: Remove BUGs on paged skb data which crosses a page boundary · 8d609725

Seth Forshee authored Nov 25, 2014

These BUGs can be erroneously triggered by frags which refer to
tail pages within a compound page. The data in these pages may
overrun the hardware page while still being contained within the
compound page, but since compound_order() evaluates to 0 for tail
pages the assertion fails. The code already iterates through
subsequent pages correctly in this scenario, so the BUGs are
unnecessary and can be removed.

Fixes: f36c3747 ("xen/netfront: handle compound page fragments on transmit")
Cc: <stable@vger.kernel.org> # 3.7+
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8d609725

mm/vmpressure.c: fix race in vmpressure_work_fn() · 91b57191

Andrew Morton authored Dec 02, 2014

In some android devices, there will be a "divide by zero" exception.
vmpr->scanned could be zero before spin_lock(&vmpr->sr_lock).

Addresses https://bugzilla.kernel.org/show_bug.cgi?id=88051

[akpm@linux-foundation.org: neaten]
Reported-by: ji_ang <ji_ang@163.com>
Cc: Anton Vorontsov <anton.vorontsov@linaro.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

91b57191

mm: frontswap: invalidate expired data on a dup-store failure · fb993fa1

Weijie Yang authored Dec 02, 2014

If a frontswap dup-store failed, it should invalidate the expired page
in the backend, or it could trigger some data corruption issue.
Such as:
 1. use zswap as the frontswap backend with writeback feature
 2. store a swap page(version_1) to entry A, success
 3. dup-store a newer page(version_2) to the same entry A, fail
 4. use __swap_writepage() write version_2 page to swapfile, success
 5. zswap do shrink, writeback version_1 page to swapfile
 6. version_2 page is overwrited by version_1, data corrupt.

This patch fixes this issue by invalidating expired data immediately
when meet a dup-store failure.
Signed-off-by: Weijie Yang <weijie.yang@samsung.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Seth Jennings <sjennings@variantweb.net>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Bob Liu <bob.liu@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fb993fa1