Commits · d4d783d829cd3ade06c6b39fd7d02f00026ba699 · Kirill Smelkov / linux

17 Aug, 2016 6 commits

ext4: validate s_reserved_gdt_blocks on mount · d4d783d8

Theodore Ts'o authored Jul 05, 2016

[ Upstream commit 5b9554dc ]

If s_reserved_gdt_blocks is extremely large, it's possible for
ext4_init_block_bitmap(), which is called when ext4 sets up an
uninitialized block bitmap, to corrupt random kernel memory.  Add the
same checks which e2fsck has --- it must never be larger than
blocksize / sizeof(__u32) --- and then add a backup check in
ext4_init_block_bitmap() in case the superblock gets modified after
the file system is mounted.
Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>

d4d783d8

iwlwifi: add new 8260 PCI IDs · d579705e

Sasha Levin authored Aug 14, 2016

[ Upstream commit 4b79deec ]

Add 3 new 8260 series PCI IDs:
  - (0x24F3, 0x10B0)
  - (0x24F3, 0xD0B0)
  - (0x24F3, 0xB0B0)

CC: <stable@vger.kernel.org> [4.1+]
Signed-off-by: Oren Givon <oren.givon@intel.com>
Signed-off-by: David Spinadel <david.spinadel@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>

d579705e

ARM: dts: sunxi: Add a startup delay for fixed regulator enabled phys · 2fc0cdf6

Hans de Goede authored Jun 04, 2016

[ Upstream commit fc51b632 ]

It seems that recent kernels have a shorter timeout when scanning for
ethernet phys causing us to hit a timeout on boards where the phy's
regulator gets enabled just before scanning, which leads to non working
ethernet.

A 10ms startup delay seems to be enough to fix it, this commit adds a
20ms startup delay just to be safe.

This has been tested on a sun4i-a10-a1000 and sun5i-a10s-wobo-i5 board,
both of which have non-working ethernet on recent kernels without this
fix.

Cc: stable@vger.kernel.org
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>

2fc0cdf6

ext4: don't call ext4_should_journal_data() on the journal inode · e19f0ec5

Vegard Nossum authored Jul 04, 2016

[ Upstream commit 6a7fd522 ]

If ext4_fill_super() fails early, it's possible for ext4_evict_inode()
to call ext4_should_journal_data() before superblock options and flags
are fully set up.  In that case, the iput() on the journal inode can
end up causing a BUG().

Work around this problem by reordering the tests so we only call
ext4_should_journal_data() after we know it's not the journal inode.

Fixes: 2d859db3 ("ext4: fix data corruption in inodes with journalled data")
Fixes: 2b405bfa ("ext4: fix data=journal fast mount/umount hang")
Cc: Jan Kara <jack@suse.cz>
Cc: stable@vger.kernel.org
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>

e19f0ec5

ext4: fix deadlock during page writeback · 906d6f4d

Jan Kara authored Jul 04, 2016

[ Upstream commit 646caa9c ]

Commit 06bd3c36 (ext4: fix data exposure after a crash) uncovered a
deadlock in ext4_writepages() which was previously much harder to hit.
After this commit xfstest generic/130 reproduces the deadlock on small
filesystems.

The problem happens when ext4_do_update_inode() sets LARGE_FILE feature
and marks current inode handle as synchronous. That subsequently results
in ext4_journal_stop() called from ext4_writepages() to block waiting for
transaction commit while still holding page locks, reference to io_end,
and some prepared bio in mpd structure each of which can possibly block
transaction commit from completing and thus results in deadlock.

Fix the problem by releasing page locks, io_end reference, and
submitting prepared bio before calling ext4_journal_stop().

[ Changed to defer the call to ext4_journal_stop() only if the handle
  is synchronous.  --tytso ]
Reported-and-tested-by: Eryu Guan <eguan@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>

906d6f4d

SUNRPC: Don't allocate a full sockaddr_storage for tracing · 547df965

Trond Myklebust authored Jun 24, 2016

[ Upstream commit db1bb44c ]

We're always tracing IPv4 or IPv6 addresses, so we can save a lot
of space on the ringbuffer by allocating the correct sockaddr size.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: stable@vger.kernel.org
Fixes: 83a712e0 "sunrpc: add some tracepoints around ..."
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>

547df965

16 Aug, 2016 19 commits

ext4: check for extents that wrap around · c580d82e

Vegard Nossum authored Jun 30, 2016

[ Upstream commit f70749ca ]

An extent with lblock = 4294967295 and len = 1 will pass the
ext4_valid_extent() test:

	ext4_lblk_t last = lblock + len - 1;

	if (len == 0 || lblock > last)
		return 0;

since last = 4294967295 + 1 - 1 = 4294967295. This would later trigger
the BUG_ON(es->es_lblk + es->es_len < es->es_lblk) in ext4_es_end().

We can simplify it by removing the - 1 altogether and changing the test
to use lblock + len <= lblock, since now if len = 0, then lblock + 0 ==
lblock and it fails, and if len > 0 then lblock + len > lblock in order
to pass (i.e. it doesn't overflow).

Fixes: 5946d089 ("ext4: check for overlapping extents in ext4_valid_extent_entries()")
Fixes: 2f974865 ("ext4: check for zero length extent explicitly")
Cc: Eryu Guan <guaneryu@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Phil Turnbull <phil.turnbull@oracle.com>
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>

c580d82e

mfd: qcom_rpm: Parametrize also ack selector size · 988777b0

Linus Walleij authored Jun 22, 2016

[ Upstream commit f37be01e ]

The RPM has two sets of selectors (IPC bit fields): request and
acknowledge. Apparently, some models use 4*32 bit words for select
and some use 7*32 bit words for request, but all use 7*32 words
for acknowledge bits.

So apparently you can on the models with requests of 4*32 select
bits send 4*32 messages and get 7*32 different replies, so on ACK
interrupt, 7*32 bit words need to be read. This is how the vendor
code apparently works.

Cc: stable@vger.kernel.org
Reported-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>

988777b0

mfd: qcom_rpm: Fix offset error for msm8660 · e6b04eb1

Linus Walleij authored Jun 15, 2016

[ Upstream commit 9835f1b7 ]

The RPM in MSM8660/APQ8060 has different offsets to the selector
ACK and request context ACK registers. Make all these register
offsets part of the per-SoC data and assign the right values.

The bug was found by verifying backwards to the vendor tree in
the out-of-tree files <mach/rpm-[8660|8064|8960]>: all were using
offsets 3,11,15,23 and a select size of 4, except the MSM8660/APQ8060
which was using offsets 3,11,19,27 and a select size of 7.

All other platforms apart from msm8660 were affected by reading
excess registers, since 7 was hardcoded as the number of select
words, this patch makes also this part dynamic so we only write/read
as many select words as the platform actually use.

Symptoms of this bug when using msm8660: the first RPM transaction
would work, but the next would stall or raise an error since the
previous transaction was not properly ACKed as the ACK words were
read at the wrong offset.

Cc: stable@vger.kernel.org
Fixes: 58e21438 ("mfd: qcom-rpm: Driver for the Qualcomm RPM")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Björn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>

e6b04eb1

usb: renesas_usbhs: protect the CFIFOSEL setting in usbhsg_ep_enable() · 69974963

Yoshihiro Shimoda authored Jun 08, 2016

[ Upstream commit 15e4292a ]

This patch fixes an issue that the CFIFOSEL register value is possible
to be changed by usbhsg_ep_enable() wrongly. And then, a data transfer
using CFIFO may not work correctly.

For example:
 # modprobe g_multi file=usb-storage.bin
 # ifconfig usb0 192.168.1.1 up
 (During the USB host is sending file to the mass storage)
 # ifconfig usb0 down

In this case, since the u_ether.c may call usb_ep_enable() in
eth_stop(), if the renesas_usbhs driver is also using CFIFO for
mass storage, the mass storage may not work correctly.

So, this patch adds usbhs_lock() and usbhs_unlock() calling in
usbhsg_ep_enable() to protect CFIFOSEL register. This is because:
 - CFIFOSEL.CURPIPE = 0 is also needed for the pipe configuration
 - The CFIFOSEL (fifo->sel) is already protected by usbhs_lock()

Fixes: 97664a20 ("usb: renesas_usbhs: shrink spin lock area")
Cc: <stable@vger.kernel.org> # v3.1+
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>

69974963

usb: renesas_usbhs: fix NULL pointer dereference in xfer_work() · 83335cb7

Yoshihiro Shimoda authored Jun 08, 2016

[ Upstream commit 4fdef698 ]

This patch fixes an issue that the xfer_work() is possible to cause
NULL pointer dereference if the usb cable is disconnected while data
transfer is running.

In such case, a gadget driver may call usb_ep_disable()) before
xfer_work() is actually called. In this case, the usbhs_pkt_pop()
will call usbhsf_fifo_unselect(), and then usbhs_pipe_to_fifo()
in xfer_work() will return NULL.

Fixes: e73a9891 ("usb: renesas_usbhs: add DMAEngine support")
Cc: <stable@vger.kernel.org> # v3.1+
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>

83335cb7

hp-wmi: Fix wifi cannot be hard-unblocked · 542330f7

Alex Hung authored Jun 13, 2016

[ Upstream commit fc8a601e ]

Several users reported wifi cannot be unblocked as discussed in [1].
This patch removes the use of the 2009 flag by BIOS but uses the actual
WMI function calls - it will be skipped if WMI reports unsupported.

[1] https://bugzilla.kernel.org/show_bug.cgi?id=69131Signed-off-by: Alex Hung <alex.hung@canonical.com>
Tested-by: Evgenii Shatokhin <eugene.shatokhin@yandex.ru>
Cc: stable@vger.kernel.org
Signed-off-by: Darren Hart <dvhart@linux.intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>

542330f7

serial: samsung: Fix ERR pointer dereference on deferred probe · 230dc3e9

Krzysztof Kozlowski authored Jun 16, 2016

[ Upstream commit e51e4d8a ]

When the clk_get() of "uart" clock returns EPROBE_DEFER, the next re-probe
finishes with success but uses invalid (ERR_PTR) values.  This leads to
dereferencing of ERR_PTR stored under ourport->clk:

	12c30000.serial: Controller clock not found
	(...)
	12c30000.serial: ttySAC3 at MMIO 0x12c30000 (irq = 61, base_baud = 0) is a S3C6400/10
	Unable to handle kernel paging request at virtual address fffffdfb

	(clk_prepare) from [<c039f7d0>] (s3c24xx_serial_pm+0x20/0x128)
	(s3c24xx_serial_pm) from [<c0395414>] (uart_change_pm+0x38/0x40)
	(uart_change_pm) from [<c039689c>] (uart_add_one_port+0x31c/0x44c)
	(uart_add_one_port) from [<c03a035c>] (s3c24xx_serial_probe+0x2a8/0x418)
	(s3c24xx_serial_probe) from [<c03ee110>] (platform_drv_probe+0x50/0xb0)
	(platform_drv_probe) from [<c03ecb44>] (driver_probe_device+0x1f4/0x2b0)
	(driver_probe_device) from [<c03eb0c0>] (bus_for_each_drv+0x44/0x8c)
	(bus_for_each_drv) from [<c03ec8c8>] (__device_attach+0x9c/0x100)
	(__device_attach) from [<c03ebf54>] (bus_probe_device+0x84/0x8c)
	(bus_probe_device) from [<c03ec388>] (deferred_probe_work_func+0x60/0x8c)
	(deferred_probe_work_func) from [<c012fee4>] (process_one_work+0x120/0x328)
	(process_one_work) from [<c0130150>] (worker_thread+0x2c/0x4ac)
	(worker_thread) from [<c0135320>] (kthread+0xd8/0xf4)
	(kthread) from [<c0107978>] (ret_from_fork+0x14/0x3c)

The first unsuccessful clk_get() causes s3c24xx_serial_init_port() to
exit with failure but the s3c24xx_uart_port is left half-configured
(e.g. port->mapbase is set, clk contains ERR_PTR).  On next re-probe,
the function s3c24xx_serial_init_port() will exit early with success
because of configured port->mapbase and driver will use old values,
including the ERR_PTR as clock.

Fix this by cleaning the port->mapbase on error path so each re-probe
will initialize all of the port settings.

Fixes: 60e93575 ("serial: samsung: enable clock before clearing pending interrupts during init")
Cc: <stable@vger.kernel.org>
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Reviewed-by: Javier Martinez Canillas <javier@osg.samsung.com>
Tested-by: Javier Martinez Canillas <javier@osg.samsung.com>
Tested-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>

230dc3e9

tty/serial: atmel: fix RS485 half duplex with DMA · cf61375f

Alexandre Belloni authored May 28, 2016

[ Upstream commit 0058f087 ]

When using DMA, half duplex doesn't work properly because rx is not stopped
before starting tx. Ensure we call atmel_stop_rx() in the DMA case.
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>

cf61375f

tty/serial: at91: remove bunch of macros to access UART registers · ebe13983

Cyrille Pitchen authored Jul 02, 2015

[ Upstream commit 4e7decda ]

This patch replaces the UART_PUT_*, resp. UART_GET_*, macros by
atmel_uart_writel(), resp. atmel_uart_readl(), inline function calls.
Signed-off-by: Cyrille Pitchen <cyrille.pitchen@atmel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>

ebe13983

of: fix memory leak related to safe_name() · 343c4efc

Frank Rowand authored Jun 16, 2016

[ Upstream commit d9fc8807 ]

Fix a memory leak resulting from memory allocation in safe_name().
This patch fixes all call sites of safe_name().

Mathieu Malaterre reported the memory leak on boot:

On my PowerMac device-tree would generate a duplicate name:

[    0.023043] device-tree: Duplicate name in PowerPC,G4@0, renamed to "l2-cache#1"

in this case a newly allocated name is generated by `safe_name`. However
in this case it is never deallocated.

The bug was found using kmemleak reported as:

unreferenced object 0xdf532e60 (size 32):
  comm "swapper", pid 1, jiffies 4294892300 (age 1993.532s)
  hex dump (first 32 bytes):
    6c 32 2d 63 61 63 68 65 23 31 00 dd e4 dd 1e c2  l2-cache#1......
    ec d4 ba ce 04 ec cc de 8e 85 e9 ca c4 ec cc 9e  ................
  backtrace:
    [<c02d3350>] kvasprintf+0x64/0xc8
    [<c02d3400>] kasprintf+0x4c/0x5c
    [<c0453814>] safe_name.isra.1+0x80/0xc4
    [<c04545d8>] __of_attach_node_sysfs+0x6c/0x11c
    [<c075f21c>] of_core_init+0x8c/0xf8
    [<c0729594>] kernel_init_freeable+0xd4/0x208
    [<c00047e8>] kernel_init+0x24/0x11c
    [<c00158ec>] ret_from_kernel_thread+0x5c/0x64

Link: https://bugzilla.kernel.org/show_bug.cgi?id=120331Signed-off-by: Frank Rowand <frank.rowand@am.sony.com>
Reported-by: mathieu.malaterre@gmail.com
Tested-by: Mathieu Malaterre <mathieu.malaterre@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>

343c4efc

crypto: gcm - Filter out async ghash if necessary · 84313784

Herbert Xu authored Jun 15, 2016

[ Upstream commit b30bdfa8 ]

As it is if you ask for a sync gcm you may actually end up with
an async one because it does not filter out async implementations
of ghash.

This patch fixes this by adding the necessary filter when looking
for ghash.

Cc: stable@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>

84313784

usb: dwc3: fix for the isoc transfer EP_BUSY flag · b42d3789

Konrad Leszczynski authored Feb 08, 2016

[ Upstream commit 9cad39fe ]

commit f3af3651 ("usb: dwc3: gadget: always
enable IOC on bulk/interrupt transfers") ended up
regressing Isochronous endpoints by clearing
DWC3_EP_BUSY flag too early, which resulted in
choppy audio playback over USB.

Fix that by partially reverting original commit and
making sure that we check for isochronous endpoints.

Fixes: f3af3651 ("usb: dwc3: gadget: always enable IOC
		on bulk/interrupt transfers")
Cc: <stable@vger.kernel.org>
Signed-off-by: Konrad Leszczynski <konrad.leszczynski@intel.com>
Signed-off-by: Rafal Redzimski <rafal.f.redzimski@intel.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>

b42d3789

pinctrl: cherryview: prevent concurrent access to GPIO controllers · 4426cc65

Dan O'Donovan authored Jun 10, 2016

[ Upstream commit 0bd50d71 ]

Due to a silicon issue on the Atom X5-Z8000 "Cherry Trail" processor
series, a common lock must be used to prevent concurrent accesses
across the 4 GPIO controllers managed by this driver.

See Intel Atom Z8000 Processor Series Specification Update
(Rev. 005), errata #CHT34, for further information.

Cc: stable <stable@vger.kernel.org>
Signed-off-by: Dan O'Donovan <dan@emutex.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>

4426cc65

pinctrl: cherryview: Use raw_spinlock for locking · 40357fd7

Mika Westerberg authored Aug 17, 2015

[ Upstream commit 109fdf15 ]

When running -rt kernel and an interrupt happens on a GPIO line controlled by
Intel Cherryview/Braswell pinctrl driver we get:

 BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:917
 in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/0
 Preemption disabled at:[<ffffffff81092e9f>] cpu_startup_entry+0x17f/0x480

 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.1.5-rt5 #16
  ...
 Call Trace:
  <IRQ>  [<ffffffff816283c6>] dump_stack+0x4a/0x61
  [<ffffffff81077e17>] ___might_sleep+0xe7/0x170
  [<ffffffff8162d6cf>] rt_spin_lock+0x1f/0x50
  [<ffffffff812e52ed>] chv_gpio_irq_ack+0x3d/0xa0
  [<ffffffff810a72f5>] handle_edge_irq+0x75/0x180
  [<ffffffff810a3457>] generic_handle_irq+0x27/0x40
  [<ffffffff812e57de>] chv_gpio_irq_handler+0x7e/0x110
  [<ffffffff810050aa>] handle_irq+0xaa/0x190
  ...

This is because desc->lock is raw_spinlock and is held when chv_gpio_irq_ack()
is called by the genirq core. chv_gpio_irq_ack() in turn takes pctrl->lock
which in -rt is an rt-mutex causing might_sleep() rightfully to complain about
sleeping function called from invalid context.

In order to keep -rt happy but at the same time make sure that register
accesses get serialized, convert the driver to use raw_spinlock instead.
Suggested-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>

40357fd7

pinctrl: cherryview: Serialize all register access · ba42156d

Mika Westerberg authored Aug 03, 2015

[ Upstream commit 4585b000 ]

There is a hardware issue in Intel Braswell/Cherryview where concurrent
GPIO register access might results reads of 0xffffffff and writes might get
dropped.

Prevent this from happening by taking the serializing lock for all places
where it is possible that more than one thread might be accessing the
hardware concurrently.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>

ba42156d

Update my main e-mails at the Kernel tree · e0577738

Mauro Carvalho Chehab authored Jun 14, 2016

[ Upstream commit dc19ed15 ]

For the third time in three years, I'm changing my e-mail at
Samsung. That's bad, as it may stop communications with me for
a while. So, this time, I'll also the mchehab@kernel.org e-mail,
as it remains stable since ever.

Cc: stable@vger.kernel.org
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>

e0577738

gpio: pca953x: Fix NBANK calculation for PCA9536 · 10dd8c10

Vignesh R authored Jun 09, 2016

[ Upstream commit a246b819 ]

NBANK() macro assumes that ngpios is a multiple of 8(BANK_SZ) and
hence results in 0 banks for PCA9536 which has just 4 gpios. This is
wrong as PCA9356 has 1 bank with 4 gpios. This results in uninitialized
PCA953X_INVERT register. Fix this by using DIV_ROUND_UP macro in
NBANK().

Cc: stable@vger.kernel.org
Signed-off-by: Vignesh R <vigneshr@ti.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>

10dd8c10

PCI: Mark Atheros AR9485 and QCA9882 to avoid bus reset · 27b54905

Chris Blake authored May 30, 2016

[ Upstream commit 9ac0108c ]

Similar to the AR93xx series, the AR94xx and the Qualcomm QCA988x also have
the same quirk for the Bus Reset.

Fixes: c3e59ee4 ("PCI: Mark Atheros AR93xx to avoid bus reset")
Signed-off-by: Chris Blake <chrisrblake93@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org  # v3.14+
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>

27b54905

Revert "drm/i915/ilk: Don't disable SSC source if it's in use" · b3780073

Daniel Vetter authored Jun 09, 2016

[ Upstream commit 8c07eb68 ]

This reverts commit f165d283.

It breaks one of our CI systems. Quoting from Ville:

[   13.100979] [drm:ironlake_init_pch_refclk] has_panel 1 has_lvds 1 has_ck505 0 using_ssc_source 1
[   13.101413] ------------[ cut here ]------------
[   13.101429] kernel BUG at drivers/gpu/drm/i915/intel_display.c:8528!

"which is the 'BUG_ON(val != final)' at the end of ironlake_init_pch_refclk()."

Cc: stable@vger.kernel.org
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Lyude <cpaul@redhat.com>
Cc: marius.c.vlad@intel.com
References: https://www.spinics.net/lists/dri-devel/msg109557.htmlAcked-by: Lyude <cpaul@redhat.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>

b3780073

14 Aug, 2016 5 commits

netlabel: add address family checks to netlbl_{sock,req}_delattr() · 7a85bf9e

Paul Moore authored Jun 06, 2016

[ Upstream commit 0e0e3677 ]

It seems risky to always rely on the caller to ensure the socket's
address family is correct before passing it to the NetLabel kAPI,
especially since we see at least one LSM which didn't. Add address
family checks to the *_delattr() functions to help prevent future
problems.

Cc: <stable@vger.kernel.org>
Reported-by: Maninder Singh <maninder1.s@samsung.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>

7a85bf9e

s5p-mfc: Add release callback for memory region devs · 85470a56

Javier Martinez Canillas authored May 03, 2016

[ Upstream commit 6311f126 ]

When s5p_mfc_remove() calls put_device() for the reserved memory region
devs, the driver core warns that the dev doesn't have a release callback:

WARNING: CPU: 0 PID: 591 at drivers/base/core.c:251 device_release+0x8c/0x90
Device 's5p-mfc-l' does not have a release() function, it is broken and must be fixed.

Also, the declared DMA memory using dma_declare_coherent_memory() isn't
relased so add a dev .release that calls dma_release_declared_memory().

Cc: <stable@vger.kernel.org>
Fixes: 6e83e6e2 ("[media] s5p-mfc: Fix kernel warning on memory init")
Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>

85470a56

s5p-mfc: Set device name for reserved memory region devs · 1dd12c31

Javier Martinez Canillas authored May 03, 2016

[ Upstream commit 29debab0 ]

The devices don't have a name set, so makes dev_name() returns NULL which
makes harder to identify the devices that are causing issues, for example:

WARNING: CPU: 2 PID: 616 at drivers/base/core.c:251 device_release+0x8c/0x90
Device '(null)' does not have a release() function, it is broken and must be fixed.

And after setting the device name:

WARNING: CPU: 0 PID: 591 at drivers/base/core.c:251 device_release+0x8c/0x90
Device 's5p-mfc-l' does not have a release() function, it is broken and must be fixed.

Cc: <stable@vger.kernel.org>
Fixes: 6e83e6e2 ("[media] s5p-mfc: Fix kernel warning on memory init")
Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>

1dd12c31

HID: uhid: fix timeout when probe races with IO · fce67161

Roderick Colenbrander authored May 18, 2016

[ Upstream commit 67f8ecc5 ]

Many devices use userspace bluetooth stacks like BlueZ or Bluedroid in combination
with uhid. If any of these stacks is used with a HID device for which the driver
performs a HID request as part .probe (or technically another HID operation),
this results in a deadlock situation. The deadlock results in a 5 second timeout
for I/O operations in HID drivers, so isn't fatal, but none of the I/O operations
have a chance of succeeding.

The root cause for the problem is that uhid only allows for one request to be
processed at a time per uhid instance and locks out other operations. This means
that if a user space is creating a new HID device through 'UHID_CREATE', which
ultimately triggers '.probe' through the HID layer. Then any HID request e.g. a
read for calibration data would trigger a HID operation on uhid again, but it
won't go out to userspace, because it is still stuck in UHID_CREATE.
In addition bluetooth stacks are typically single threaded, so they wouldn't be
able to handle any requests while waiting on uhid.

Lucikly the UHID spec is somewhat flexible and allows for fixing the issue,
without breaking user space. The idea which the patch implements as discussed
with David Herrmann is to decouple adding of a hid device (which triggers .probe)
from UHID_CREATE. The work will kick off roughly once UHID_CREATE completed (or
else will wait a tiny bit of time in .probe for a lock). A HID driver has to call
HID to call 'hid_hw_start()' as part of .probe once it is ready for I/O, which
triggers UHID_START to user space. Any HID operations should function now within
.probe and won't deadlock because userspace is stuck on UHID_CREATE.

We verified this patch on Bluedroid with Android 6.0 and on desktop Linux with
BlueZ stacks. Prior to the patch they had the deadlock issue.

[jkosina@suse.cz: reword subject]
Signed-off-by: Roderick Colenbrander <roderick.colenbrander@sony.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>

fce67161

arm64: kernel: Save and restore addr_limit on exception entry · 223b3917

James Morse authored Aug 12, 2016

commit e19a6ee2 upstream.

If we take an exception while at EL1, the exception handler inherits
the original context's addr_limit value. To be consistent always reset
addr_limit and PSTATE.UAO on (re-)entry to EL1. This prevents accidental
re-use of the original context's addr_limit.

Based on a similar patch for arm from Russell King.
Acked-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
[ backport to stop perf misusing inherited addr_limit.
  Removed code interacting with UAO and the irqstack ]
Link: https://bugs.chromium.org/p/project-zero/issues/detail?id=822Signed-off-by: James Morse <james.morse@arm.com>
Cc: <stable@vger.kernel.org> #4.1
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>

223b3917

12 Aug, 2016 1 commit

fs/proc/task_mmu.c: fix mm_access() mode parameter in pagemap_read() · 5c576457

Kenny Keslar authored Aug 12, 2016

Backport of caaee623 ("ptrace: use fsuid,
fsgid, effective creds for fs access checks") to v4.1 failed to update the
mode parameter in the mm_access() call in pagemap_read() to have one of the
new PTRACE_MODE_*CREDS flags.

Attempting to read any other process' pagemap results in a WARN()

WARNING: CPU: 0 PID: 883 at kernel/ptrace.c:229 __ptrace_may_access+0x14a/0x160()
denying ptrace access check without PTRACE_MODE_*CREDS
Modules linked in: loop sg e1000 i2c_piix4 ppdev virtio_balloon virtio_pci parport_pc i2c_core virtio_ring ata_generic serio_raw pata_acpi virtio parport pcspkr floppy acpi_cpufreq ip_tables ext3 mbcache jbd sd_mod ata_piix crc32c_intel libata
CPU: 0 PID: 883 Comm: cat Tainted: G        W       4.1.12-51.el7uek.x86_64 #2
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
  0000000000000286 00000000619f225a ffff88003b6fbc18 ffffffff81717021
  ffff88003b6fbc70 ffffffff819be870 ffff88003b6fbc58 ffffffff8108477a
  000000003b6fbc58 0000000000000001 ffff88003d287000 0000000000000001
Call Trace:
  [<ffffffff81717021>] dump_stack+0x63/0x81
  [<ffffffff8108477a>] warn_slowpath_common+0x8a/0xc0
  [<ffffffff81084805>] warn_slowpath_fmt+0x55/0x70
  [<ffffffff8108e57a>] __ptrace_may_access+0x14a/0x160
  [<ffffffff8108f372>] ptrace_may_access+0x32/0x50
  [<ffffffff81081bad>] mm_access+0x6d/0xb0
  [<ffffffff81278c81>] pagemap_read+0xe1/0x360
  [<ffffffff811a046b>] ? lru_cache_add_active_or_unevictable+0x2b/0xa0
  [<ffffffff8120d2e7>] __vfs_read+0x37/0x100
  [<ffffffff812b9ab4>] ? security_file_permission+0x84/0xa0
  [<ffffffff8120d8b6>] ? rw_verify_area+0x56/0xe0
  [<ffffffff8120d9c6>] vfs_read+0x86/0x140
  [<ffffffff8120e945>] SyS_read+0x55/0xd0
  [<ffffffff8171eb6e>] system_call_fastpath+0x12/0x71

Fixes: ab88ce5f (ptrace: use fsuid, fsgid, effective creds for fs access checks)
Signed-off-by: Kenny Keslar <kenny.keslar@oracle.com>
Cc: Roland McGrath <roland@hack.frob.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>

5c576457

11 Aug, 2016 1 commit

netfilter: nf_nat_redirect: add missing NULL pointer check · 6a468737

Munehisa Kamata authored Oct 26, 2015

[ Upstream commit 94f9cd81 ]

Commit 8b13eddf ("netfilter: refactor NAT
redirect IPv4 to use it from nf_tables") has introduced a trivial logic
change which can result in the following crash.

BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
IP: [<ffffffffa033002d>] nf_nat_redirect_ipv4+0x2d/0xa0 [nf_nat_redirect]
PGD 3ba662067 PUD 3ba661067 PMD 0
Oops: 0000 [#1] SMP
Modules linked in: ipv6(E) xt_REDIRECT(E) nf_nat_redirect(E) xt_tcpudp(E) iptable_nat(E) nf_conntrack_ipv4(E) nf_defrag_ipv4(E) nf_nat_ipv4(E) nf_nat(E) nf_conntrack(E) ip_tables(E) x_tables(E) binfmt_misc(E) xfs(E) libcrc32c(E) evbug(E) evdev(E) psmouse(E) i2c_piix4(E) i2c_core(E) acpi_cpufreq(E) button(E) ext4(E) crc16(E) jbd2(E) mbcache(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mod(E)
CPU: 0 PID: 2536 Comm: ip Tainted: G            E   4.1.7-15.23.amzn1.x86_64 #1
Hardware name: Xen HVM domU, BIOS 4.2.amazon 05/06/2015
task: ffff8800eb438000 ti: ffff8803ba664000 task.ti: ffff8803ba664000
[...]
Call Trace:
 <IRQ>
 [<ffffffffa0334065>] redirect_tg4+0x15/0x20 [xt_REDIRECT]
 [<ffffffffa02e2e99>] ipt_do_table+0x2b9/0x5e1 [ip_tables]
 [<ffffffffa0328045>] iptable_nat_do_chain+0x25/0x30 [iptable_nat]
 [<ffffffffa031777d>] nf_nat_ipv4_fn+0x13d/0x1f0 [nf_nat_ipv4]
 [<ffffffffa0328020>] ? iptable_nat_ipv4_fn+0x20/0x20 [iptable_nat]
 [<ffffffffa031785e>] nf_nat_ipv4_in+0x2e/0x90 [nf_nat_ipv4]
 [<ffffffffa03280a5>] iptable_nat_ipv4_in+0x15/0x20 [iptable_nat]
 [<ffffffff81449137>] nf_iterate+0x57/0x80
 [<ffffffff814491f7>] nf_hook_slow+0x97/0x100
 [<ffffffff814504d4>] ip_rcv+0x314/0x400

unsigned int
nf_nat_redirect_ipv4(struct sk_buff *skb,
...
{
...
		rcu_read_lock();
		indev = __in_dev_get_rcu(skb->dev);
		if (indev != NULL) {
			ifa = indev->ifa_list;
			newdst = ifa->ifa_local; <---
		}
		rcu_read_unlock();
...
}

Before the commit, 'ifa' had been always checked before access. After the
commit, however, it could be accessed even if it's NULL. Interestingly,
this was once fixed in 2003.

http://marc.info/?l=netfilter-devel&m=106668497403047&w=2

In addition to the original one, we have seen the crash when packets that
need to be redirected somehow arrive on an interface which hasn't been
yet fully configured.

This change just reverts the logic to the old behavior to avoid the crash.

Fixes: 8b13eddf ("netfilter: refactor NAT redirect IPv4 to use it from nf_tables")
Signed-off-by: Munehisa Kamata <kamatam@amazon.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>

6a468737

09 Aug, 2016 1 commit
- Linux 4.1.30 · 558ba5fd
  Sasha Levin authored Aug 08, 2016
```
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
```
  558ba5fd
08 Aug, 2016 4 commits

x86/quirks: Reintroduce scanning of secondary buses · 629d0452

Lukas Wunner authored Jun 12, 2016

[ Upstream commit 850c3210 ]

We used to scan secondary buses until the following commit that
was applied in 2009:

  8659c406 ("x86: only scan the root bus in early PCI quirks")

which commit constrained early quirks to the root bus only. Its
motivation was to prevent application of the nvidia_bugs quirk
on secondary buses.

We're about to add a quirk to reset the Broadcom 4331 wireless card on
2011/2012 Macs, which is located on a secondary bus behind a PCIe root
port. To facilitate that, reintroduce scanning of secondary buses.

The commit message of 8659c406 notes that scanning only the root bus
"saves quite some unnecessary scanning work". The algorithm used prior
to 8659c406 was particularly time consuming because it scanned
buses 0 to 31 brute force. To avoid lengthening boot time, employ a
recursive strategy which only scans buses that are actually reachable
from the root bus.

Yinghai Lu pointed out that the secondary bus number read from a
bridge's config space may be invalid, in particular a value of 0 would
cause an infinite loop. The PCI core goes beyond that and recurses to a
child bus only if its bus number is greater than the parent bus number
(see pci_scan_bridge()). Since the root bus is numbered 0, this implies
that secondary buses may not be 0. Do the same on early scanning.

If this algorithm is found to significantly impact boot time or cause
infinite loops on broken hardware, it would be possible to limit its
recursion depth: The Broadcom 4331 quirk applies at depth 1, all others
at depth 0, so the bus need not be scanned deeper than that for now. An
alternative approach would be to revert to scanning only the root bus,
and apply the Broadcom 4331 quirk to the root ports 8086:1c12, 8086:1e12
and 8086:1e16. Apple always positioned the card behind either of these
three ports. The quirk would then check presence of the card in slot 0
below the root port and do its deed.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: linux-pci@vger.kernel.org
Link: http://lkml.kernel.org/r/f0daa70dac1a9b2483abdb31887173eb6ab77bdf.1465690253.git.lukas@wunner.deSigned-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>

629d0452

x86/quirks: Apply nvidia_bugs quirk only on root bus · f2da7dfd

Lukas Wunner authored Jun 12, 2016

[ Upstream commit 447d29d1 ]

Since the following commit:

  8659c406 ("x86: only scan the root bus in early PCI quirks")

... early quirks are only applied to devices on the root bus.

The motivation was to prevent application of the nvidia_bugs quirk on
secondary buses.

We're about to reintroduce scanning of secondary buses for a quirk to
reset the Broadcom 4331 wireless card on 2011/2012 Macs. To prevent
regressions, open code the requirement to apply nvidia_bugs only on the
root bus.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/4d5477c1d76b2f0387a780f2142bbcdd9fee869b.1465690253.git.lukas@wunner.deSigned-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>

f2da7dfd

Revert "MIPS: Reserve nosave data for hibernation" · 6264b577
Sasha Levin authored Aug 07, 2016
```
This reverts commit e8ebd0cf.
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
```
6264b577

Revert "sparc64: Fix numa node distance initialization" · 84d08218

Sasha Levin authored Aug 07, 2016

This reverts commit bfbe327d556707c59c5c0536d831078b41a68429.
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>

84d08218

06 Aug, 2016 3 commits

pps: do not crash when failed to register · bd6d85d6

Jiri Slaby authored Jul 20, 2016

[ Upstream commit 368301f2 ]

With this command sequence:

  modprobe plip
  modprobe pps_parport
  rmmod pps_parport

the partport_pps modules causes this crash:

  BUG: unable to handle kernel NULL pointer dereference at (null)
  IP: parport_detach+0x1d/0x60 [pps_parport]
  Oops: 0000 [#1] SMP
  ...
  Call Trace:
    parport_unregister_driver+0x65/0xc0 [parport]
    SyS_delete_module+0x187/0x210

The sequence that builds up to this is:

 1) plip is loaded and takes the parport device for exclusive use:

    plip0: Parallel port at 0x378, using IRQ 7.

 2) pps_parport then fails to grab the device:

    pps_parport: parallel port PPS client
    parport0: cannot grant exclusive access for device pps_parport
    pps_parport: couldn't register with parport0

 3) rmmod of pps_parport is then killed because it tries to access
    pardev->name, but pardev (taken from port->cad) is NULL.

So add a check for NULL in the test there too.

Link: http://lkml.kernel.org/r/20160714115245.12651-1-jslaby@suse.czSigned-off-by: Jiri Slaby <jslaby@suse.cz>
Acked-by: Rodolfo Giometti <giometti@enneenne.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>

bd6d85d6

radix-tree: fix radix_tree_iter_retry() for tagged iterators. · bea9acd8

Andrey Ryabinin authored Jul 20, 2016

[ Upstream commit 3cb9185c ]

radix_tree_iter_retry() resets slot to NULL, but it doesn't reset tags.
Then NULL slot and non-zero iter.tags passed to radix_tree_next_slot()
leading to crash:

  RIP: radix_tree_next_slot include/linux/radix-tree.h:473
    find_get_pages_tag+0x334/0x930 mm/filemap.c:1452
  ....
  Call Trace:
    pagevec_lookup_tag+0x3a/0x80 mm/swap.c:960
    mpage_prepare_extent_to_map+0x321/0xa90 fs/ext4/inode.c:2516
    ext4_writepages+0x10be/0x2b20 fs/ext4/inode.c:2736
    do_writepages+0x97/0x100 mm/page-writeback.c:2364
    __filemap_fdatawrite_range+0x248/0x2e0 mm/filemap.c:300
    filemap_write_and_wait_range+0x121/0x1b0 mm/filemap.c:490
    ext4_sync_file+0x34d/0xdb0 fs/ext4/fsync.c:115
    vfs_fsync_range+0x10a/0x250 fs/sync.c:195
    vfs_fsync fs/sync.c:209
    do_fsync+0x42/0x70 fs/sync.c:219
    SYSC_fdatasync fs/sync.c:232
    SyS_fdatasync+0x19/0x20 fs/sync.c:230
    entry_SYSCALL_64_fastpath+0x23/0xc1 arch/x86/entry/entry_64.S:207

We must reset iterator's tags to bail out from radix_tree_next_slot()
and go to the slow-path in radix_tree_next_chunk().

Fixes: 46437f9a ("radix-tree: fix race in gang lookup")
Link: http://lkml.kernel.org/r/1468495196-10604-1-git-send-email-aryabinin@virtuozzo.comSigned-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>

bea9acd8

libceph: apply new_state before new_up_client on incrementals · 6831c98c

Ilya Dryomov authored Jul 19, 2016

[ Upstream commit 930c5328 ]

Currently, osd_weight and osd_state fields are updated in the encoding
order.  This is wrong, because an incremental map may look like e.g.

    new_up_client: { osd=6, addr=... } # set osd_state and addr
    new_state: { osd=6, xorstate=EXISTS } # clear osd_state

Suppose osd6's current osd_state is EXISTS (i.e. osd6 is down).  After
applying new_up_client, osd_state is changed to EXISTS | UP.  Carrying
on with the new_state update, we flip EXISTS and leave osd6 in a weird
"!EXISTS but UP" state.  A non-existent OSD is considered down by the
mapping code

2087    for (i = 0; i < pg->pg_temp.len; i++) {
2088            if (ceph_osd_is_down(osdmap, pg->pg_temp.osds[i])) {
2089                    if (ceph_can_shift_osds(pi))
2090                            continue;
2091
2092                    temp->osds[temp->size++] = CRUSH_ITEM_NONE;

and so requests get directed to the second OSD in the set instead of
the first, resulting in OSD-side errors like:

[WRN] : client.4239 192.168.122.21:0/2444980242 misdirected client.4239.1:2827 pg 2.5df899f2 to osd.4 not [1,4,6] in e680/680

and hung rbds on the client:

[  493.566367] rbd: rbd0: write 400000 at 11cc00000 (0)
[  493.566805] rbd: rbd0:   result -6 xferred 400000
[  493.567011] blk_update_request: I/O error, dev rbd0, sector 9330688

The fix is to decouple application from the decoding and:
- apply new_weight first
- apply new_state before new_up_client
- twiddle osd_state flags if marking in
- clear out some of the state if osd is destroyed

Fixes: http://tracker.ceph.com/issues/14901

Cc: stable@vger.kernel.org # 3.15+: 6dd74e44: libceph: set 'exists' flag for newly up osd
Cc: stable@vger.kernel.org # 3.15+
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>

6831c98c