Commits · b92c4a09f58934e95871e8e4b98f184bfaab0196 · Kirill Smelkov / linux

15 Mar, 2017 22 commits

IB/IPoIB: Add destination address when re-queue packet · b92c4a09

Erez Shitrit authored Feb 01, 2017

commit 2b084176 upstream.

When sending packet to destination that was not resolved yet
via path query, the driver keeps the skb and tries to re-send it
again when the path is resolved.

But when re-sending via dev_queue_xmit the kernel doesn't call
to dev_hard_header, so IPoIB needs to keep 20 bytes in the skb
and to put the destination address inside them.

In that way the dev_start_xmit will have the correct destination,
and the driver won't take the destination from the skb->data, while
nothing exists there, which causes to packet be be dropped.

The test flow is:
1. Run the SM on remote node,
2. Restart the driver.
4. Ping some destination,
3. Observe that first ICMP request will be dropped.

Fixes: fc791b63 ("IB/ipoib: move back IB LL address into the hard header")
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Tested-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

b92c4a09

IB/ipoib: Fix deadlock between rmmod and set_mode · 91948b09

Feras Daoud authored Dec 28, 2016

commit 0a0007f2 upstream.

When calling set_mode from sys/fs, the call flow locks the sys/fs lock
first and then tries to lock rtnl_lock (when calling ipoib_set_mod).
On the other hand, the rmmod call flow takes the rtnl_lock first
(when calling unregister_netdev) and then tries to take the sys/fs
lock. Deadlock a->b, b->a.

The problem starts when ipoib_set_mod frees it's rtnl_lck and tries
to get it after that.

    set_mod:
    [<ffffffff8104f2bd>] ? check_preempt_curr+0x6d/0x90
    [<ffffffff814fee8e>] __mutex_lock_slowpath+0x13e/0x180
    [<ffffffff81448655>] ? __rtnl_unlock+0x15/0x20
    [<ffffffff814fed2b>] mutex_lock+0x2b/0x50
    [<ffffffff81448675>] rtnl_lock+0x15/0x20
    [<ffffffffa02ad807>] ipoib_set_mode+0x97/0x160 [ib_ipoib]
    [<ffffffffa02b5f5b>] set_mode+0x3b/0x80 [ib_ipoib]
    [<ffffffff8134b840>] dev_attr_store+0x20/0x30
    [<ffffffff811f0fe5>] sysfs_write_file+0xe5/0x170
    [<ffffffff8117b068>] vfs_write+0xb8/0x1a0
    [<ffffffff8117ba81>] sys_write+0x51/0x90
    [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b

    rmmod:
    [<ffffffff81279ffc>] ? put_dec+0x10c/0x110
    [<ffffffff8127a2ee>] ? number+0x2ee/0x320
    [<ffffffff814fe6a5>] schedule_timeout+0x215/0x2e0
    [<ffffffff8127cc04>] ? vsnprintf+0x484/0x5f0
    [<ffffffff8127b550>] ? string+0x40/0x100
    [<ffffffff814fe323>] wait_for_common+0x123/0x180
    [<ffffffff81060250>] ? default_wake_function+0x0/0x20
    [<ffffffff8119661e>] ? ifind_fast+0x5e/0xb0
    [<ffffffff814fe43d>] wait_for_completion+0x1d/0x20
    [<ffffffff811f2e68>] sysfs_addrm_finish+0x228/0x270
    [<ffffffff811f2fb3>] sysfs_remove_dir+0xa3/0xf0
    [<ffffffff81273f66>] kobject_del+0x16/0x40
    [<ffffffff8134cd14>] device_del+0x184/0x1e0
    [<ffffffff8144e59b>] netdev_unregister_kobject+0xab/0xc0
    [<ffffffff8143c05e>] rollback_registered+0xae/0x130
    [<ffffffff8143c102>] unregister_netdevice+0x22/0x70
    [<ffffffff8143c16e>] unregister_netdev+0x1e/0x30
    [<ffffffffa02a91b0>] ipoib_remove_one+0xe0/0x120 [ib_ipoib]
    [<ffffffffa01ed95f>] ib_unregister_device+0x4f/0x100 [ib_core]
    [<ffffffffa021f5e1>] mlx4_ib_remove+0x41/0x180 [mlx4_ib]
    [<ffffffffa01ab771>] mlx4_remove_device+0x71/0x90 [mlx4_core]

Fixes: 862096a8 ("IB/ipoib: Add more rtnl_link_ops callbacks")
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

91948b09

mnt: Tuck mounts under others instead of creating shadow/side mounts. · 6de9d08a

Eric W. Biederman authored Jan 20, 2017

commit 1064f874 upstream.

Ever since mount propagation was introduced in cases where a mount in
propagated to parent mount mountpoint pair that is already in use the
code has placed the new mount behind the old mount in the mount hash
table.

This implementation detail is problematic as it allows creating
arbitrary length mount hash chains.

Furthermore it invalidates the constraint maintained elsewhere in the
mount code that a parent mount and a mountpoint pair will have exactly
one mount upon them.  Making it hard to deal with and to talk about
this special case in the mount code.

Modify mount propagation to notice when there is already a mount at
the parent mount and mountpoint where a new mount is propagating to
and place that preexisting mount on top of the new mount.

Modify unmount propagation to notice when a mount that is being
unmounted has another mount on top of it (and no other children), and
to replace the unmounted mount with the mount on top of it.

Move the MNT_UMUONT test from __lookup_mnt_last into
__propagate_umount as that is the only call of __lookup_mnt_last where
MNT_UMOUNT may be set on any mount visible in the mount hash table.

These modifications allow:
 - __lookup_mnt_last to be removed.
 - attach_shadows to be renamed __attach_mnt and its shadow
   handling to be removed.
 - commit_tree to be simplified
 - copy_tree to be simplified

The result is an easier to understand tree of mounts that does not
allow creation of arbitrary length hash chains in the mount hash table.

The result is also a very slight userspace visible difference in semantics.
The following two cases now behave identically, where before order
mattered:

case 1: (explicit user action)
	B is a slave of A
	mount something on A/a , it will propagate to B/a
	and than mount something on B/a

case 2: (tucked mount)
	B is a slave of A
	mount something on B/a
	and than mount something on A/a

Histroically umount A/a would fail in case 1 and succeed in case 2.
Now umount A/a succeeds in both configurations.

This very small change in semantics appears if anything to be a bug
fix to me and my survey of userspace leads me to believe that no programs
will notice or care of this subtle semantic change.

v2: Updated to mnt_change_mountpoint to not call dput or mntput
and instead to decrement the counts directly.  It is guaranteed
that there will be other references when mnt_change_mountpoint is
called so this is safe.

v3: Moved put_mountpoint under mount_lock in attach_recursive_mnt
    As the locking in fs/namespace.c changed between v2 and v3.

v4: Reworked the logic in propagate_mount_busy and __propagate_umount
    that detects when a mount completely covers another mount.

v5: Removed unnecessary tests whose result is alwasy true in
    find_topper and attach_recursive_mnt.

v6: Document the user space visible semantic difference.

Fixes: b90fa9ae ("[PATCH] shared mount handling: bind and rbind")
Tested-by: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6de9d08a

brcmfmac: fix incorrect event channel deduction · f03d5078

Gavin Li authored Jan 17, 2017

commit 8e290cec upstream.

brcmf_sdio_fromevntchan() was being called on the the data frame
rather than the software header, causing some frames to be
mischaracterized as on the event channel rather than the data channel.

This fixes a major performance regression (due to dropped packets). With
this patch the download speed jumped from 1Mbit/s back up to 40MBit/s due
to the sheer amount of packets being incorrectly processed.

Fixes: c56caa9d ("brcmfmac: screening firmware event packet")
Signed-off-by: Gavin Li <git@thegavinli.com>
Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com>
[kvalo@codeaurora.org: improve commit logs based on email discussion]
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

f03d5078

cxl: fix nested locking hang during EEH hotplug · 8cdfa0d8

Andrew Donnellan authored Feb 06, 2017

commit 171ed0fc upstream.

Commit 14a3ae34 ("cxl: Prevent read/write to AFU config space while AFU
not configured") introduced a rwsem to fix an invalid memory access that
occurred when someone attempts to access the config space of an AFU on a
vPHB whilst the AFU is deconfigured, such as during EEH recovery.

It turns out that it's possible to run into a nested locking issue when EEH
recovery fails and a full device hotplug is required.
cxl_pci_error_detected() deconfigures the AFU, taking a writer lock on
configured_rwsem. When EEH recovery fails, the EEH code calls
pci_hp_remove_devices() to remove the device, which in turn calls
cxl_remove() -> cxl_pci_remove_afu() -> pci_deconfigure_afu(), which tries
to grab the writer lock that's already held.

Standard rwsem semantics don't express what we really want to do here and
don't allow for nested locking. Fix this by replacing the rwsem with an
atomic_t which we can control more finely. Allow the AFU to be locked
multiple times so long as there are no readers.

Fixes: 14a3ae34 ("cxl: Prevent read/write to AFU config space while AFU not configured")
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

8cdfa0d8

cxl: Prevent read/write to AFU config space while AFU not configured · e5603a5c

Andrew Donnellan authored Dec 09, 2016

commit 14a3ae34 upstream.

During EEH recovery, we deconfigure all AFUs whilst leaving the
corresponding vPHB and virtual PCI device in place.

If something attempts to interact with the AFU's PCI config space (e.g.
running lspci) after the AFU has been deconfigured and before it's
reconfigured, cxl_pcie_{read,write}_config() will read invalid values from
the deconfigured struct cxl_afu and proceed to Oops when they try to
dereference pointers that have been set to NULL during deconfiguration.

Add a rwsem to struct cxl_afu so we can prevent interaction with config
space while the AFU is deconfigured.
Reported-by: Pradipta Ghosh <pradghos@in.ibm.com>
Suggested-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

e5603a5c

net: mvpp2: fix DMA address calculation in mvpp2_txq_inc_put() · 4144a307

Thomas Petazzoni authored Feb 21, 2017

commit 239a3b66 upstream.

When TX descriptors are filled in, the buffer DMA address is split
between the tx_desc->buf_phys_addr field (high-order bits) and
tx_desc->packet_offset field (5 low-order bits).

However, when we re-calculate the DMA address from the TX descriptor in
mvpp2_txq_inc_put(), we do not take tx_desc->packet_offset into
account. This means that when the DMA address is not aligned on a 32
bytes boundary, we end up calling dma_unmap_single() with a DMA address
that was not the one returned by dma_map_single().

This inconsistency is detected by the kernel when DMA_API_DEBUG is
enabled. We fix this problem by properly calculating the DMA address in
mvpp2_txq_inc_put().
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

4144a307

s390: use correct input data address for setup_randomness · c9ac3e94

Heiko Carstens authored Feb 05, 2017

commit 4920e3cf upstream.

The current implementation of setup_randomness uses the stack address
and therefore the pointer to the SYSIB 3.2.2 block as input data
address. Furthermore the length of the input data is the number of
virtual-machine description blocks which is typically one.

This means that typically a single zero byte is fed to
add_device_randomness.

Fix both of these and use the address of the first virtual machine
description block as input data address and also use the correct
length.

Fixes: bcfcbb6b ("s390: add system information as device randomness")
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

c9ac3e94

s390: make setup_randomness work · 0075504d

Heiko Carstens authored Feb 04, 2017

commit da8fd820 upstream.

Commit bcfcbb6b ("s390: add system information as device
randomness") intended to add some virtual machine specific information
to the randomness pool.

Unfortunately it uses the page allocator before it is ready to use. In
result the page allocator always returns NULL and the setup_randomness
function never adds anything to the randomness pool.

To fix this use memblock_alloc and memblock_free instead.

Fixes: bcfcbb6b ("s390: add system information as device randomness")
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

0075504d

s390/topology: correct allocation of topology information · ca54585d

Martin Schwidefsky authored Feb 04, 2017

commit 8676caa4 upstream.

The data stored by the STSI instruction can be up to a page in size
but the memblock_virt_alloc allocation for tl_info only specifies
16 bytes. The memory after the short allocation is overwritten
every time arch_update_cpu_topology is called.

Fixes: 8c910580 "s390/numa: establish cpu to node mapping early"
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ca54585d

s390: TASK_SIZE for kernel threads · c61a874e

Martin Schwidefsky authored Feb 24, 2017

commit fb94a687 upstream.

Return a sensible value if TASK_SIZE if called from a kernel thread.

This gets us around an issue with copy_mount_options that does a magic
size calculation "TASK_SIZE - (unsigned long)data" while in a kernel
thread and data pointing to kernel space.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

c61a874e

s390/chsc: Add exception handler for CHSC instruction · 162668c0

Peter Oberparleiter authored Feb 20, 2017

commit 77759137 upstream.

Prevent kernel crashes due to unhandled exceptions raised by the CHSC
instruction which may for example be triggered by invalid ioctl data.

Fixes: 64150adf ("s390/cio: Introduce generic synchronous CHSC IOCTL")
Signed-off-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Reviewed-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

162668c0

s390/kdump: Use "LINUX" ELF note name instead of "CORE" · 836f9814

Michael Holzheu authored Feb 07, 2017

commit a4a81d8e upstream.

In binutils/libbfd (bfd/elf.c) it is enforced that all s390 specific ELF
notes like e.g. NT_S390_PREFIX or NT_S390_CTRS have "LINUX" specified
as note name. Otherwise the notes are ignored.

For /proc/vmcore we currently use "CORE" for these notes.

Up to now this has not been a real problem because the dump analysis tool
"crash" does not check the note name. But it will break all programs that
use libbfd for processing ELF notes.

So fix this and use "LINUX" for all s390 specific notes to comply with
libbfd.
Reported-by: Philipp Rudo <prudo@linux.vnet.ibm.com>
Reviewed-by: Philipp Rudo <prudo@linux.vnet.ibm.com>
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

836f9814

s390/dcssblk: fix device size calculation in dcssblk_direct_access() · 1f2659aa

Gerald Schaefer authored Jan 30, 2017

commit a63f53e3 upstream.

Since commit dd22f551 "block: Change direct_access calling convention",
the device size calculation in dcssblk_direct_access() is off-by-one.
This results in bdev_direct_access() always returning -ENXIO because the
returned value is not page aligned.

Fix this by adding 1 to the dev_sz calculation.

Fixes: dd22f551 ("block: Change direct_access calling convention")
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

1f2659aa

s390/qdio: clear DSCI prior to scanning multiple input queues · 3c3c4d25

Julian Wiedmann authored Nov 21, 2016

commit 1e4a382f upstream.

For devices with multiple input queues, tiqdio_call_inq_handlers()
iterates over all input queues and clears the device's DSCI
during each iteration. If the DSCI is re-armed during one
of the later iterations, we therefore do not scan the previous
queues again.
The re-arming also raises a new adapter interrupt. But its
handler does not trigger a rescan for the device, as the DSCI
has already been erroneously cleared.
This can result in queue stalls on devices with multiple
input queues.

Fix it by clearing the DSCI just once, prior to scanning the queues.

As the code is moved in front of the loop, we also need to access
the DSCI directly (ie irq->dsci) instead of going via each queue's
parent pointer to the same irq. This is not a functional change,
and a follow-up patch will clean up the other users.

In practice, this bug only affects CQ-enabled HiperSockets devices,
ie. devices with sysfs-attribute "hsuid" set. Setting a hsuid is
needed for AF_IUCV socket applications that use HiperSockets
communication.

Fixes: 104ea556 ("qdio: support asynchronous delivery of storage blocks")
Reviewed-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

3c3c4d25

phy: qcom-ufs: Fix misplaced jump label · ac7c6461

Vivek Gautam authored Jan 27, 2017

commit 0b10f64d upstream.

We want to skip only tx/rx_iface clocks and not ref_clk_src
as well. Fix the jump label accordingly.

Fixes: 300f9677 ("phy: qcom-ufs: Skip obtaining rx/tx_iface_clk for msm8996 based phy")
Cc: Subhash Jadavani <subhashj@codeaurora.org>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ac7c6461

phy: qcom-ufs: Don't kfree devres resource · 04b51010

Bjorn Andersson authored Jan 22, 2017

commit e7d5e412 upstream.

Upon failing to acquire regulator supplies the qcom-ufs driver calls
kfree() on the devm allocated memory used to store the name of the
regulator, leading to devres corruption.

Rather than switching to using the appropriate free function the patch
acknowledge the fact that "name" is always a constant string and we
don't actually need to create a local copy of it, but rather just
reference the constant string.

Fixes: add78fc0 ("phy: qcom-ufs: Use devm sibling of kstrdup for regulator names")
Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

04b51010

Bluetooth: Add another AR3012 04ca:3018 device · a6ed492d

Dmitry Tunin authored Jan 05, 2017

commit 441ad62d upstream.

T:  Bus=01 Lev=01 Prnt=01 Port=07 Cnt=04 Dev#=  5 Spd=12  MxCh= 0
D:  Ver= 1.10 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs=  1
P:  Vendor=04ca ProdID=3018 Rev=00.01
C:  #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=100mA
I:  If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
I:  If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
Signed-off-by: Dmitry Tunin <hanipouspilot@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

a6ed492d

KVM: VMX: use correct vmcs_read/write for guest segment selector/base · 3904b32c

Chao Peng authored Feb 21, 2017

commit 96794e4e upstream.

Guest segment selector is 16 bit field and guest segment base is natural
width field. Fix two incorrect invocations accordingly.

Without this patch, build fails when aggressive inlining is used with ICC.
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

3904b32c

KVM: s390: Disable dirty log retrieval for UCONTROL guests · f89d6db0

Janosch Frank authored Feb 02, 2017

commit e1e8a962 upstream.

User controlled KVM guests do not support the dirty log, as they have
no single gmap that we can check for changes.

As they have no single gmap, kvm->arch.gmap is NULL and all further
referencing to it for dirty checking will result in a NULL
dereference.

Let's return -EINVAL if a caller tries to sync dirty logs for a
UCONTROL guest.

Fixes: 15f36ebd ("KVM: s390: Add proper dirty bitmap support to S390 kvm.")
Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Reported-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

f89d6db0

serial: 8250_pci: Add MKS Tenta SCOM-0800 and SCOM-0801 cards · c9dc3873

Ian Abbott authored Feb 03, 2017

commit 1c9c858e upstream.

The MKS Instruments SCOM-0800 and SCOM-0801 cards (originally by Tenta
Technologies) are 3U CompactPCI serial cards with 4 and 8 serial ports,
respectively. The first 4 ports are implemented by an OX16PCI954 chip,
and the second 4 ports are implemented by an OX16C954 chip on a local
bus, bridged by the second PCI function of the OX16PCI954. The ports
are jumper-selectable as RS-232 and RS-422/485, and the UARTs use a
non-standard oscillator frequency of 20 MHz (base_baud = 1250000).
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

c9dc3873

tty: n_hdlc: get rid of racy n_hdlc.tbuf · 72e54402

Alexander Popov authored Feb 28, 2017

commit 82f2341c upstream.

Currently N_HDLC line discipline uses a self-made singly linked list for
data buffers and has n_hdlc.tbuf pointer for buffer retransmitting after
an error.

The commit be10eb75
("tty: n_hdlc add buffer flushing") introduced racy access to n_hdlc.tbuf.
After tx error concurrent flush_tx_queue() and n_hdlc_send_frames() can put
one data buffer to tx_free_buf_list twice. That causes double free in
n_hdlc_release().

Let's use standard kernel linked list and get rid of n_hdlc.tbuf:
in case of tx error put current data buffer after the head of tx_buf_list.
Signed-off-by: Alexander Popov <alex.popov@linux.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

72e54402

12 Mar, 2017 18 commits

Linux 4.10.2 · 1e4d4778
Greg Kroah-Hartman authored Mar 12, 2017

1e4d4778

ceph: update readpages osd request according to size of pages · 92d90f08

Yan, Zheng authored Jan 19, 2017

commit d641df81 upstream.

add_to_page_cache_lru() can fails, so the actual pages to read
can be smaller than the initial size of osd request. We need to
update osd request size in that case.
Signed-off-by: Yan, Zheng <zyan@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

92d90f08

scsi: lpfc: Correct WQ creation for pagesize · 519f6fa2

James Smart authored Feb 12, 2017

commit 8ea73db4 upstream.

Correct WQ creation for pagesize

The driver was calculating the adapter command pagesize indicator from
the system pagesize. However, the buffers the driver allocates are only
one size (SLI4_PAGE_SIZE), so no calculation was necessary.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

519f6fa2

MIPS: IP22: Fix build error due to binutils 2.25 uselessnes. · 209cf1f2

Ralf Baechle authored Dec 15, 2016

commit ae2f5e5e upstream.

Fix the following build error with binutils 2.25.

  CC      arch/mips/mm/sc-ip22.o
{standard input}: Assembler messages:
{standard input}:132: Error: number (0x9000000080000000) larger than 32 bits
{standard input}:159: Error: number (0x9000000080000000) larger than 32 bits
{standard input}:200: Error: number (0x9000000080000000) larger than 32 bits
scripts/Makefile.build:293: recipe for target 'arch/mips/mm/sc-ip22.o' failed
make[1]: *** [arch/mips/mm/sc-ip22.o] Error 1

MIPS has used .set mips3 to temporarily switch the assembler to 64 bit
mode in 64 bit kernels virtually forever.  Binutils 2.25 broke this
behavious partially by happily accepting 64 bit instructions in .set mips3
mode but puking on 64 bit constants when generating 32 bit ELF.  Binutils
2.26 restored the old behaviour again.

Fix build with binutils 2.25 by open coding the offending

	dli $1, 0x9000000080000000

as

	li	$1, 0x9000
	dsll	$1, $1, 48

which is ugly be the only thing that will build on all binutils vintages.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

209cf1f2

MIPS: IP22: Reformat inline assembler code to modern standards. · b6472849

Ralf Baechle authored Dec 15, 2016

commit f9f1c8db upstream.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

b6472849

module: fix memory leak on early load_module() failures · 84c131c8

Luis R. Rodriguez authored Feb 10, 2017

commit a5544880 upstream.

While looking for early possible module loading failures I was
able to reproduce a memory leak possible with kmemleak. There
are a few rare ways to trigger a failure:

  o we've run into a failure while processing kernel parameters
    (parse_args() returns an error)
  o mod_sysfs_setup() fails
  o we're a live patch module and copy_module_elf() fails

Chances of running into this issue is really low.

kmemleak splat:

unreferenced object 0xffff9f2c4ada1b00 (size 32):
  comm "kworker/u16:4", pid 82, jiffies 4294897636 (age 681.816s)
  hex dump (first 32 bytes):
    6d 65 6d 73 74 69 63 6b 30 00 00 00 00 00 00 00  memstick0.......
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff8c6cfeba>] kmemleak_alloc+0x4a/0xa0
    [<ffffffff8c200046>] __kmalloc_track_caller+0x126/0x230
    [<ffffffff8c1bc581>] kstrdup+0x31/0x60
    [<ffffffff8c1bc5d4>] kstrdup_const+0x24/0x30
    [<ffffffff8c3c23aa>] kvasprintf_const+0x7a/0x90
    [<ffffffff8c3b5481>] kobject_set_name_vargs+0x21/0x90
    [<ffffffff8c4fbdd7>] dev_set_name+0x47/0x50
    [<ffffffffc07819e5>] memstick_check+0x95/0x33c [memstick]
    [<ffffffff8c09c893>] process_one_work+0x1f3/0x4b0
    [<ffffffff8c09cb98>] worker_thread+0x48/0x4e0
    [<ffffffff8c0a2b79>] kthread+0xc9/0xe0
    [<ffffffff8c6dab5f>] ret_from_fork+0x1f/0x40
    [<ffffffffffffffff>] 0xffffffffffffffff

Fixes: e180a6b7 ("param: fix charp parameters set via sysfs")
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Reviewed-by: Aaron Tomlin <atomlin@redhat.com>
Reviewed-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Jessica Yu <jeyu@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

84c131c8

powerpc/mm/hash: Always clear UPRT and Host Radix bits when setting up CPU · cf1c6bea

Aneesh Kumar K.V authored Feb 22, 2017

commit fda2d27d upstream.

We will set LPCR with correct value for radix during int. This make sure we
start with a sanitized value of LPCR. In case of kexec, cpus can have LPCR
value based on the previous translation mode we were running.

Fixes: fe036a06 ("powerpc/64/kexec: Fix MMU cleanup on radix")
Acked-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

cf1c6bea

powerpc/mm: Add MMU_FTR_KERNEL_RO to possible feature mask · 543fd2ab

Aneesh Kumar K.V authored Feb 07, 2017

commit a5ecdad4 upstream.

Without this we will always find the feature disabled.

Fixes: 984d7a1e ("powerpc/mm: Fixup kernel read only mapping")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

543fd2ab

powerpc/xmon: Fix data-breakpoint · 4ffde229

Ravi Bangoria authored Nov 22, 2016

commit c21a493a upstream.

Currently xmon data-breakpoint feature is broken.

Whenever there is a watchpoint match occurs, hw_breakpoint_handler will
be called by do_break via notifier chains mechanism. If watchpoint is
registered by xmon, hw_breakpoint_handler won't find any associated
perf_event and returns immediately with NOTIFY_STOP. Similarly, do_break
also returns without notifying to xmon.

Solve this by returning NOTIFY_DONE when hw_breakpoint_handler does not
find any perf_event associated with matched watchpoint, rather than
NOTIFY_STOP, which tells the core code to continue calling the other
breakpoint handlers including the xmon one.
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

4ffde229

xprtrdma: Reduce required number of send SGEs · 737af93c

Chuck Lever authored Feb 08, 2017

commit 16f906d6 upstream.

The MAX_SEND_SGES check introduced in commit 655fec69
("xprtrdma: Use gathered Send for large inline messages") fails
for devices that have a small max_sge.

Instead of checking for a large fixed maximum number of SGEs,
check for a minimum small number. RPC-over-RDMA will switch to
using a Read chunk if an xdr_buf has more pages than can fit in
the device's max_sge limit. This is considerably better than
failing all together to mount the server.

This fix supports devices that have as few as three send SGEs
available.
Reported-by: Selvin Xavier <selvin.xavier@broadcom.com>
Reported-by: Devesh Sharma <devesh.sharma@broadcom.com>
Reported-by: Honggang Li <honli@redhat.com>
Reported-by: Ram Amrani <Ram.Amrani@cavium.com>
Fixes: 655fec69 ("xprtrdma: Use gathered Send for large ...")
Tested-by: Honggang Li <honli@redhat.com>
Tested-by: Ram Amrani <Ram.Amrani@cavium.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

737af93c

xprtrdma: Disable pad optimization by default · 387fb7dc

Chuck Lever authored Feb 08, 2017

commit c95a3c6b upstream.

Commit d5440e27 ("xprtrdma: Enable pad optimization") made the
Linux client omit XDR round-up padding in normal Read and Write
chunks so that the client doesn't have to register and invalidate
3-byte memory regions that contain no real data.

Unfortunately, my cheery 2014 assessment that this optimization "is
supported now by both Linux and Solaris servers" was premature.
We've found bugs in Solaris in this area since commit d5440e27
("xprtrdma: Enable pad optimization") was merged (SYMLINK is the
main offender).

So for maximum interoperability, I'm disabling this optimization
again. If a CM private message is exchanged when connecting, the
client recognizes that the server is Linux, and enables the
optimization for that connection.

Until now the Solaris server bugs did not impact common operations,
and were thus largely benign. Soon, less capable devices on Linux
NFS/RDMA clients will make use of Read chunks more often, and these
Solaris bugs will prevent interoperation in more cases.

Fixes: 677eb17e ("xprtrdma: Fix XDR tail buffer marshalling")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

387fb7dc

xprtrdma: Per-connection pad optimization · 5d53884b

Chuck Lever authored Feb 08, 2017

commit b5f0afbe upstream.

Pad optimization is changed by echoing into
/proc/sys/sunrpc/rdma_pad_optimize. This is a global setting,
affecting all RPC-over-RDMA connections to all servers.

The marshaling code picks up that value and uses it for decisions
about how to construct each RPC-over-RDMA frame. Having it change
suddenly in mid-operation can result in unexpected failures. And
some servers a client mounts might need chunk round-up, while
others don't.

So instead, copy the pad_optimize setting into each connection's
rpcrdma_ia when the transport is created, and use the copy, which
can't change during the life of the connection, instead.

This also removes a hack: rpcrdma_convert_iovs was using
the remote-invalidation-expected flag to predict when it could leave
out Write chunk padding. This is because the Linux server handles
implicit XDR padding on Write chunks correctly, and only Linux
servers can set the connection's remote-invalidation-expected flag.

It's more sensible to use the pad optimization setting instead.

Fixes: 677eb17e ("xprtrdma: Fix XDR tail buffer marshalling")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

5d53884b

xprtrdma: Fix Read chunk padding · 921fe03a

Chuck Lever authored Feb 08, 2017

commit 24abdf1b upstream.

When pad optimization is disabled, rpcrdma_convert_iovs still
does not add explicit XDR round-up padding to a Read chunk.

Commit 677eb17e ("xprtrdma: Fix XDR tail buffer marshalling")
incorrectly short-circuited the test for whether round-up padding
is needed that appears later in rpcrdma_convert_iovs.

However, if this is indeed a regular Read chunk (and not a
Position-Zero Read chunk), the tail iovec _always_ contains the
chunk's padding, and never anything else.

So, it's easy to just skip the tail when padding optimization is
enabled, and add the tail in a subsequent Read chunk segment, if
disabled.

Fixes: 677eb17e ("xprtrdma: Fix XDR tail buffer marshalling")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

921fe03a

dmaengine: ipu: Make sure the interrupt routine checks all interrupts. · 143ac52c

Magnus Lilja authored Dec 21, 2016

commit adee40b2 upstream.

Commit 3d8cc000 ("dmaengine: ipu: Consolidate duplicated irq handlers")
consolidated the two interrupts routines into one, but the remaining
interrupt routine only checks the status of the error interrupts, not the
normal interrupts.

This patch fixes that problem (tested on i.MX31 PDK board).

Fixes: 3d8cc000 ("dmaengine: ipu: Consolidate duplicated irq handlers")
Cc: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Magnus Lilja <lilja.magnus@gmail.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

143ac52c

mtd: nand: ifc: Fix location of eccstat registers for IFC V1.0 · 700c30c5

Mark Marshall authored Jan 26, 2017

commit 65644147 upstream.

The commit 7a654172 ("mtd/ifc: Add support for IFC controller
version 2.0") added support for version 2.0 of the IFC controller.
The version 2.0 controller has the ECC status registers at a different
location to the previous versions.

Correct the fsl_ifc_nand structure so that the ECC status can be read
from the correct location for both version 1.0 and 2.0 of the controller.

Fixes: 7a654172 ("mtd/ifc: Add support for IFC controller version 2.0")
Signed-off-by: Mark Marshall <mark.marshall@omicronenergy.com>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

700c30c5

bcma: use (get|put)_device when probing/removing device driver · 6c12c1ce

Rafał Miłecki authored Jan 28, 2017

commit a971df0b upstream.

This allows tracking device state and e.g. makes devm work as expected.
Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6c12c1ce

md linear: fix a race between linear_add() and linear_congested() · fe83da69

colyli@suse.de authored Jan 28, 2017

commit 03a9e24e upstream.

Recently I receive a bug report that on Linux v3.0 based kerenl, hot add
disk to a md linear device causes kernel crash at linear_congested(). From
the crash image analysis, I find in linear_congested(), mddev->raid_disks
contains value N, but conf->disks[] only has N-1 pointers available. Then
a NULL pointer deference crashes the kernel.

There is a race between linear_add() and linear_congested(), RCU stuffs
used in these two functions cannot avoid the race. Since Linuv v4.0
RCU code is replaced by introducing mddev_suspend().  After checking the
upstream code, it seems linear_congested() is not called in
generic_make_request() code patch, so mddev_suspend() cannot provent it
from being called. The possible race still exists.

Here I explain how the race still exists in current code.  For a machine
has many CPUs, on one CPU, linear_add() is called to add a hard disk to a
md linear device; at the same time on other CPU, linear_congested() is
called to detect whether this md linear device is congested before issuing
an I/O request onto it.

Now I use a possible code execution time sequence to demo how the possible
race happens,

seq    linear_add()                linear_congested()
 0                                 conf=mddev->private
 1   oldconf=mddev->private
 2   mddev->raid_disks++
 3                              for (i=0; i<mddev->raid_disks;i++)
 4                                bdev_get_queue(conf->disks[i].rdev->bdev)
 5   mddev->private=newconf

In linear_add() mddev->raid_disks is increased in time seq 2, and on
another CPU in linear_congested() the for-loop iterates conf->disks[i] by
the increased mddev->raid_disks in time seq 3,4. But conf with one more
element (which is a pointer to struct dev_info type) to conf->disks[] is
not updated yet, accessing its structure member in time seq 4 will cause a
NULL pointer deference fault.

To fix this race, there are 2 parts of modification in the patch,
 1) Add 'int raid_disks' in struct linear_conf, as a copy of
    mddev->raid_disks. It is initialized in linear_conf(), always being
    consistent with pointers number of 'struct dev_info disks[]'. When
    iterating conf->disks[] in linear_congested(), use conf->raid_disks to
    replace mddev->raid_disks in the for-loop, then NULL pointer deference
    will not happen again.
 2) RCU stuffs are back again, and use kfree_rcu() in linear_add() to
    free oldconf memory. Because oldconf may be referenced as mddev->private
    in linear_congested(), kfree_rcu() makes sure that its memory will not
    be released until no one uses it any more.
Also some code comments are added in this patch, to make this modification
to be easier understandable.

This patch can be applied for kernels since v4.0 after commit:
3be260cc ("md/linear: remove rcu protections in favour of
suspend/resume"). But this bug is reported on Linux v3.0 based kernel, for
people who maintain kernels before Linux v4.0, they need to do some back
back port to this patch.

Changelog:
 - V3: add 'int raid_disks' in struct linear_conf, and use kfree_rcu() to
       replace rcu_call() in linear_add().
 - v2: add RCU stuffs by suggestion from Shaohua and Neil.
 - v1: initial effort.
Signed-off-by: Coly Li <colyli@suse.de>
Cc: Shaohua Li <shli@fb.com>
Cc: Neil Brown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

fe83da69

rtc: sun6i: Switch to the external oscillator · 3c1afb4c

Maxime Ripard authored Jan 23, 2017

commit fb61bb82 upstream.

The RTC is clocked from either an internal, imprecise, oscillator or an
external one, which is usually much more accurate.

The difference perceived between the time elapsed and the time reported by
the RTC is in a 10% scale, which prevents the RTC from being useful at all.

Fortunately, the external oscillator is reported to be mandatory in the
Allwinner datasheet, so we can just switch to it.

Fixes: 9765d2d9 ("rtc: sun6i: Add sun6i RTC driver")
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

3c1afb4c