Commits · f1bac1ba17822414d4031f840913b4ea27793ba8 · Kirill Smelkov / linux

02 Jul, 2014 29 commits

lz4: ensure length does not wrap · f1bac1ba

Greg Kroah-Hartman authored Jun 20, 2014

commit 206204a1 upstream.

Given some pathologically compressed data, lz4 could possibly decide to
wrap a few internal variables, causing unknown things to happen.  Catch
this before the wrapping happens and abort the decompression.
Reported-by: "Don A. Bailey" <donb@securitymouse.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

f1bac1ba

lzo: properly check for overruns · 83722ba4

Greg Kroah-Hartman authored Jun 20, 2014

commit 206a81c1 upstream.

The lzo decompressor can, if given some really crazy data, possibly
overrun some variable types.  Modify the checking logic to properly
detect overruns before they happen.
Reported-by: "Don A. Bailey" <donb@securitymouse.com>
Tested-by: "Don A. Bailey" <donb@securitymouse.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

83722ba4

iio: Fix endianness issue in ak8975_read_axis() · 2618b9ba

Peter Meerwald authored May 06, 2014

commit 8ba42fb7 upstream.

i2c_smbus_read_word_data() does host endian conversion already,
no need for le16_to_cpu()
Signed-off-by: Peter Meerwald <pmeerw@pmeerw.net>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

2618b9ba

iio: adc: at91: signedness bug in at91_adc_get_trigger_value_by_name() · f65aa47d

Dan Carpenter authored Nov 06, 2014

commit 4f3bcd87 upstream.

at91_adc_get_trigger_value_by_name() was returning -ENOMEM truncated to
a positive u8 and that doesn't work.  I've changed it to int and
refactored it to preserve the error code.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Tested-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

f65aa47d

staging: iio: tsl2x7x_core: fix proximity treshold · 7ea3386d

Mario Schuknecht authored May 27, 2014

commit c404618c upstream.

Consider high byte of proximity min and max treshold in function
'tsl2x7x_chip_on'. So far, the high byte was not set.
Signed-off-by: Mario Schuknecht <mario.schuknecht@dresearch-fe.de>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

7ea3386d

ASoC: tlv320aci3x: Fix custom snd_soc_dapm_put_volsw_aic3x() function · ed550318

Peter Ujfalusi authored May 30, 2014

commit e6c111fa upstream.

For some unknown reason the parameters for snd_soc_test_bits() were in wrong
order:
It was:
snd_soc_test_bits(codec, val, mask, reg); /* WRONG!!! */
while it should be:
snd_soc_test_bits(codec, reg, mask, val);
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

ed550318

ASoC: max98090: Fix reset at resume time · 0fa09d2e

Liam Girdwood authored May 16, 2014

commit 25b4ab43 upstream.

Reset needs to wait 20ms before other codec IO is performed. This wait
was not being performed. Fix this by making sure the reset register is not
restored with the cache, but use the manual reset method in resume with
the wait.
Signed-off-by: Liam Girdwood <liam.r.girdwood@linux.intel.com>
Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

0fa09d2e

Drivers: hv: balloon: Ensure pressure reports are posted regularly · 347b9fed

K. Y. Srinivasan authored Apr 23, 2014

commit ae339336 upstream.

The current code posts periodic memory pressure status from a dedicated thread.
Under some conditions, especially when we are releasing a lot of memory into
the guest, we may not send timely pressure reports back to the host. Fix this
issue by reporting pressure in all contexts that can be active in this driver.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

347b9fed

USB: cdc-acm: fix runtime PM imbalance at shutdown · 26c05be0

Johan Hovold authored May 26, 2014

commit 5292afa6 upstream.

Make sure only to decrement the PM counters if they were actually
incremented.

Note that the USB PM counter, but not necessarily the driver core PM
counter, is reset when the interface is unbound.

Fixes: 11ea859d ("USB: additional power savings for cdc-acm devices
that support remote wakeup")
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

26c05be0

USB: cdc-acm: fix I/O after failed open · ffc415d3

Johan Hovold authored May 26, 2014

commit e4c36076 upstream.

Make sure to kill any already submitted read urbs on read-urb submission
failures in open in order to prevent doing I/O for a closed port.

Fixes: 088c64f8 ("USB: cdc-acm: re-write read processing")
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

ffc415d3

USB: cdc-acm: fix potential urb leak and PM imbalance in write · 5d62cbeb

Johan Hovold authored May 26, 2014

commit 183a4508 upstream.

Make sure to check return value of autopm get in write() in order to
avoid urb leak and PM counter imbalance on errors.

Fixes: 11ea859d ("USB: additional power savings for cdc-acm devices
that support remote wakeup")
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

5d62cbeb

USB: cdc-acm: fix shutdown and suspend race · 4ed88943

Johan Hovold authored May 26, 2014

commit ed797074 upstream.

We should stop I/O unconditionally at suspend rather than rely on the
tty-port initialised flag (which is set prior to stopping I/O during
shutdown) in order to prevent suspend returning with URBs still active.

Fixes: 11ea859d ("USB: additional power savings for cdc-acm devices
that support remote wakeup")
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

4ed88943

USB: cdc-acm: fix runtime PM for control messages · 4d9aa622

Johan Hovold authored May 26, 2014

commit bae3f4c5 upstream.

Fix runtime PM handling of control messages by adding the required PM
counter operations.

Fixes: 11ea859d ("USB: additional power savings for cdc-acm devices
that support remote wakeup")
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

4d9aa622

USB: cdc-acm: fix broken runtime suspend · 870306cf

Johan Hovold authored May 26, 2014

commit 140cb81a upstream.

The current ACM runtime-suspend implementation is broken in several
ways:

Firstly, it buffers only the first write request being made while
suspended -- any further writes are silently dropped.

Secondly, writes being dropped also leak write urbs, which are never
reclaimed (until the device is unbound).

Thirdly, even the single buffered write is not cleared at shutdown
(which may happen before the device is resumed), something which can
lead to another urb leak as well as a PM usage-counter leak.

Fix this by implementing a delayed-write queue using urb anchors and
making sure to discard the queue properly at shutdown.

Fixes: 11ea859d ("USB: additional power savings for cdc-acm devices
that support remote wakeup")
Reported-by: Xiao Jin <jin.xiao@intel.com>
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

870306cf

USB: cdc-acm: fix write and resume race · 529ce0fc

Johan Hovold authored May 26, 2014

commit e144ed28 upstream.

Fix race between write() and resume() due to improper locking that could
lead to writes being reordered.

Resume must be done atomically and susp_count be protected by the
write_lock in order to prevent racing with write(). This could otherwise
lead to writes being reordered if write() grabs the write_lock after
susp_count is decremented, but before the delayed urb is submitted.

Fixes: 11ea859d ("USB: additional power savings for cdc-acm devices
that support remote wakeup")
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

529ce0fc

USB: cdc-acm: fix write and suspend race · 06c2f546

Johan Hovold authored May 26, 2014

commit 5a345c20 upstream.

Fix race between write() and suspend() which could lead to writes being
dropped (or I/O while suspended) if the device is runtime suspended
while a write request is being processed.

Specifically, suspend() releases the write_lock after determining the
device is idle but before incrementing the susp_count, thus leaving a
window where a concurrent write() can submit an urb.

Fixes: 11ea859d ("USB: additional power savings for cdc-acm devices
that support remote wakeup")
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

06c2f546

MIPS: KVM: Allocate at least 16KB for exception handlers · 263be4bc

James Hogan authored May 29, 2014

commit 7006e2df upstream.

Each MIPS KVM guest has its own copy of the KVM exception vector. This
contains the TLB refill exception handler at offset 0x000, the general
exception handler at offset 0x180, and interrupt exception handlers at
offset 0x200 in case Cause_IV=1. A common handler is copied to offset
0x2000 and offset 0x3000 is used for temporarily storing k1 during entry
from guest.

However the amount of memory allocated for this purpose is calculated as
0x200 rounded up to the next page boundary, which is insufficient if 4KB
pages are in use. This can lead to the common handler at offset 0x2000
being overwritten and infinitely recursive exceptions on the next exit
from the guest.

Increase the minimum size from 0x200 to 0x4000 to cover the full use of
the page.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

263be4bc

KVM: lapic: sync highest ISR to hardware apic on EOI · 331f2b02

Paolo Bonzini authored May 14, 2014

commit fc57ac2c upstream.

When Hyper-V enlightenments are in effect, Windows prefers to issue an
Hyper-V MSR write to issue an EOI rather than an x2apic MSR write.
The Hyper-V MSR write is not handled by the processor, and besides
being slower, this also causes bugs with APIC virtualization.  The
reason is that on EOI the processor will modify the highest in-service
interrupt (SVI) field of the VMCS, as explained in section 29.1.4 of
the SDM; every other step in EOI virtualization is already done by
apic_send_eoi or on VM entry, but this one is missing.

We need to do the same, and be careful not to muck with the isr_count
and highest_isr_cache fields that are unused when virtual interrupt
delivery is enabled.
Reviewed-by: Yang Zhang <yang.z.zhang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

331f2b02

mfd: sm501: dbg_regs attribute must be read-only · 7228e303

Guenter Roeck authored Sep 08, 2013

commit 8a8320c2 upstream.

Fix:

sm501 sm501: SM501 At b3e00000: Version 050100a0, 8 Mb, IRQ 100
Attribute dbg_regs: write permission without 'store'
------------[ cut here ]------------
WARNING: at drivers/base/core.c:620

dbg_regs does not have a write function and must therefore be marked
as read-only.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

7228e303

aio: fix aio request leak when events are reaped by userspace · 1c55a373

Benjamin LaHaise authored Jun 24, 2014

commit f8567a38 upstream.

The aio cleanups and optimizations by kmo that were merged into the 3.10
tree added a regression for userspace event reaping.  Specifically, the
reference counts are not decremented if the event is reaped in userspace,
leading to the application being unable to submit further aio requests.
This patch applies to 3.12+.  A separate backport is required for 3.10/3.11.
This issue was uncovered as part of CVE-2014-0206.
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Cc: stable@vger.kernel.org
Cc: Kent Overstreet <kmo@daterainc.com>
Cc: Mateusz Guzik <mguzik@redhat.com>
Cc: Petr Matousek <pmatouse@redhat.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

1c55a373

aio: fix kernel memory disclosure in io_getevents() introduced in v3.10 · bee3f7b8

Benjamin LaHaise authored Jun 24, 2014

commit edfbbf38 upstream.

A kernel memory disclosure was introduced in aio_read_events_ring() in v3.10
by commit a31ad380.  The changes made to
aio_read_events_ring() failed to correctly limit the index into
ctx->ring_pages[], allowing an attacked to cause the subsequent kmap() of
an arbitrary page with a copy_to_user() to copy the contents into userspace.
This vulnerability has been assigned CVE-2014-0206.  Thanks to Mateusz and
Petr for disclosing this issue.

This patch applies to v3.12+.  A separate backport is needed for 3.10/3.11.
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Cc: Mateusz Guzik <mguzik@redhat.com>
Cc: Petr Matousek <pmatouse@redhat.com>
Cc: Kent Overstreet <kmo@daterainc.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

bee3f7b8

serial: 8250_dw: Fix LCR workaround regression · be31bc4b

James Hogan authored Dec 10, 2013

commit 6979f8d2 upstream.

Commit c49436b6 (serial: 8250_dw: Improve unwritable LCR workaround)
caused a regression. It added a check that the LCR was written properly
to detect and workaround the busy quirk, but the behaviour of bit 5
(UART_LCR_SPAR) differs between IP versions 3.00a and 3.14c per the
docs. On older versions this caused the check to fail and it would
repeatedly force idle and rewrite the LCR register, causing delays and
preventing any input from serial being received.

This is fixed by masking out UART_LCR_SPAR before making the comparison.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Tim Kryger <tim.kryger@linaro.org>
Cc: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Cc: Matt Porter <matt.porter@linaro.org>
Cc: Markus Mayer <markus.mayer@linaro.org>
Tested-by: Tim Kryger <tim.kryger@linaro.org>
Tested-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Tested-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

be31bc4b

serial: 8250_dw: Improve unwritable LCR workaround · 24015775

Tim Kryger authored Oct 01, 2013

commit c49436b6 upstream.

When configured with UART_16550_COMPATIBLE=NO or in versions prior to
the introduction of this option, the Designware UART will ignore writes
to the LCR if the UART is busy.  The current workaround saves a copy of
the last written LCR and re-writes it in the ISR for a special interrupt
that is raised when a write was ignored.

Unfortunately, interrupts are typically disabled prior to performing a
sequence of register writes that include the LCR so the point at which
the retry occurs is too late.  An example is serial8250_do_set_termios()
where an ignored LCR write results in the baud divisor not being set and
instead a garbage character is sent out the transmitter.

Furthermore, since serial_port_out() offers no way to indicate failure,
a serious effort must be made to ensure that the LCR is actually updated
before returning back to the caller.  This is difficult, however, as a
UART that was busy during the first attempt is likely to still be busy
when a subsequent attempt is made unless some extra action is taken.

This updated workaround reads back the LCR after each write to confirm
that the new value was accepted by the hardware.  Should the hardware
ignore a write, the TX/RX FIFOs are cleared and the receive buffer read
before attempting to rewrite the LCR out of the hope that doing so will
force the UART into an idle state.  While this may seem unnecessarily
aggressive, writes to the LCR are used to change the baud rate, parity,
stop bit, or data length so the data that may be lost is likely not
important.  Admittedly, this is far from ideal but it seems to be the
best that can be done given the hardware limitations.

Lastly, the revised workaround doesn't touch the LCR in the ISR, so it
avoids the possibility of a "serial8250: too much work for irq" lock up.
This problem is rare in real situations but can be reproduced easily by
wiring up two UARTs and running the following commands.

  # stty -F /dev/ttyS1 echo
  # stty -F /dev/ttyS2 echo
  # cat /dev/ttyS1 &
  [1] 375
  # echo asdf > /dev/ttyS1
  asdf

  [   27.700000] serial8250: too much work for irq96
  [   27.700000] serial8250: too much work for irq96
  [   27.710000] serial8250: too much work for irq96
  [   27.710000] serial8250: too much work for irq96
  [   27.720000] serial8250: too much work for irq96
  [   27.720000] serial8250: too much work for irq96
  [   27.730000] serial8250: too much work for irq96
  [   27.730000] serial8250: too much work for irq96
  [   27.740000] serial8250: too much work for irq96
Signed-off-by: Tim Kryger <tim.kryger@linaro.org>
Reviewed-by: Matt Porter <matt.porter@linaro.org>
Reviewed-by: Markus Mayer <markus.mayer@linaro.org>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

Conflicts:
	drivers/tty/serial/8250/8250_dw.c

24015775

mm: add !pte_present() check on existing hugetlb_entry callbacks · 7032d5fb

Naoya Horiguchi authored Jun 06, 2014

commit d4c54919 upstream.

The age table walker doesn't check non-present hugetlb entry in common
path, so hugetlb_entry() callbacks must check it.  The reason for this
behavior is that some callers want to handle it in its own way.

[ I think that reason is bogus, btw - it should just do what the regular
  code does, which is to call the "pte_hole()" function for such hugetlb
  entries  - Linus]

However, some callers don't check it now, which causes unpredictable
result, for example when we have a race between migrating hugepage and
reading /proc/pid/numa_maps.  This patch fixes it by adding !pte_present
checks on buggy callbacks.

This bug exists for years and got visible by introducing hugepage
migration.

ChangeLog v2:
- fix if condition (check !pte_present() instead of pte_present())
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: <stable@vger.kernel.org> [3.12+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[ Backported to 3.15.  Signed-off-by: Josh Boyer <jwboyer@fedoraproject.org> ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

7032d5fb

nfsd: don't halt scanning the DRC LRU list when there's an RC_INPROG entry · 5b96c379

Jeff Layton authored Jun 05, 2014

commit 1b19453d upstream.

Currently, the DRC cache pruner will stop scanning the list when it
hits an entry that is RC_INPROG. It's possible however for a call to
take a *very* long time. In that case, we don't want it to block other
entries from being pruned if they are expired or we need to trim the
cache to get back under the limit.

Fix the DRC cache pruner to just ignore RC_INPROG entries.
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

5b96c379

aio: block io_destroy() until all context requests are completed · 04242f7d

Anatol Pomozov authored Apr 15, 2014

commit e02ba72a upstream.

deletes aio context and all resources related to. It makes sense that
no IO operations connected to the context should be running after the context
is destroyed. As we removed io_context we have no chance to
get requests status or call io_getevents().

man page for io_destroy says that this function may block until
all context's requests are completed. Before kernel 3.11 io_destroy()
blocked indeed, but since aio refactoring in 3.11 it is not true anymore.

Here is a pseudo-code that shows a testcase for a race condition discovered
in 3.11:

  initialize io_context
  io_submit(read to buffer)
  io_destroy()

  // context is destroyed so we can free the resources
  free(buffers);

  // if the buffer is allocated by some other user he'll be surprised
  // to learn that the buffer still filled by an outstanding operation
  // from the destroyed io_context

The fix is straight-forward - add a completion struct and wait on it
in io_destroy, complete() should be called when number of in-fligh requests
reaches zero.

If two or more io_destroy() called for the same context simultaneously then
only the first one waits for IO completion, other calls behaviour is undefined.

Tested: ran http://pastebin.com/LrPsQ4RL testcase for several hours and
  do not see the race condition anymore.
Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com>
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

04242f7d

nfsd: don't try to reuse an expired DRC entry off the list · 0c9a0cfb

Jeff Layton authored Dec 05, 2013

commit a0ef5e19 upstream.

Currently when we are processing a request, we try to scrape an expired
or over-limit entry off the list in preference to allocating a new one
from the slab.

This is unnecessarily complicated. Just use the slab layer.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

0c9a0cfb

drivers/video/fbdev/fb-puv3.c: Add header files for function unifb_mmap · 9e6f2084

Zhichuang SUN authored May 21, 2014

commit fbc6c4a1 upstream.

Function unifb_mmap calls functions which are defined in linux/mm.h
and asm/pgtable.h

The related error (for unicore32 with unicore32_defconfig):
	CC      drivers/video/fbdev/fb-puv3.o
	drivers/video/fbdev/fb-puv3.c: In function 'unifb_mmap':
	drivers/video/fbdev/fb-puv3.c:646: error: implicit declaration of
				      function 'vm_iomap_memory'
	drivers/video/fbdev/fb-puv3.c:646: error: implicit declaration of
				      function 'pgprot_noncached'
Signed-off-by: Zhichuang Sun <sunzc522@gmail.com>
Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Cc: Jingoo Han <jg1.han@samsung.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Joe Perches <joe@perches.com>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: linux-fbdev@vger.kernel.org
Acked-by: Xuetao Guan <gxt@mprc.pku.edu.cn>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

9e6f2084

arch/unicore32/mm/alignment.c: include "asm/pgtable.h" to avoid compiling error · 976c2ec4

Chen Gang authored Mar 24, 2014

commit 1ff38c56 upstream.

Need include "asm/pgtable.h" to include "asm-generic/pgtable-nopmd.h",
so can let 'pmd_t' defined. The related error with allmodconfig:

CC arch/unicore32/mm/alignment.o
In file included from arch/unicore32/mm/alignment.c:24:
arch/unicore32/include/asm/tlbflush.h:135: error: expected .). before .*. token
arch/unicore32/include/asm/tlbflush.h:154: error: expected .). before .*. token
In file included from arch/unicore32/mm/alignment.c:27:
arch/unicore32/mm/mm.h:15: error: expected .=., .,., .;., .sm. or ._attribute__. before .*. token
arch/unicore32/mm/mm.h:20: error: expected .=., .,., .;., .sm. or ._attribute__. before .*. token
arch/unicore32/mm/mm.h:25: error: expected .=., .,., .;., .sm. or ._attribute__. before .*. token
make[1]: *** [arch/unicore32/mm/alignment.o] Error 1
make: *** [arch/unicore32/mm] Error 2
Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Acked-by: Xuetao Guan <gxt@mprc.pku.edu.cn>
Signed-off-by: Xuetao Guan <gxt@mprc.pku.edu.cn>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

976c2ec4

27 Jun, 2014 11 commits

ocfs2: revert iput deferring code in ocfs2_drop_dentry_lock · fc597a30

Goldwyn Rodrigues authored Apr 03, 2014

commit 8ed6b237 upstream.

The following patches are reverted in this patch because these patches
caused performance regression in the remote unlink() calls.

  ea455f8a - ocfs2: Push out dropping of dentry lock to ocfs2_wq
  f7b1aa69 - ocfs2: Fix deadlock on umount
  5fd13189 - ocfs2: Don't oops in ocfs2_kill_sb on a failed mount

Previous patches in this series removed the possible deadlocks from
downconvert thread so the above patches shouldn't be needed anymore.

The regression is caused because these patches delay the iput() in case
of dentry unlocks.  This also delays the unlocking of the open lockres.
The open lockresource is required to test if the inode can be wiped from
disk or not.  When the deleting node does not get the open lock, it
marks it as orphan (even though it is not in use by another
node/process) and causes a journal checkpoint.  This delays operations
following the inode eviction.  This also moves the inode to the orphaned
inode which further causes more I/O and a lot of unneccessary orphans.

The following script can be used to generate the load causing issues:

  declare -a create
  declare -a remove
  declare -a iterations=(1 2 4 8 16 32 64 128 256 512 1024 2048 4096 8192 16384)
  unique="`mktemp -u XXXXX`"
  script="/tmp/idontknow-${unique}.sh"
  cat <<EOF > "${script}"
  for n in {1..8}; do mkdir -p test/dir\${n}
    eval touch test/dir\${n}/foo{1.."\$1"}
  done
  EOF
  chmod 700 "${script}"

  function fcreate ()
  {
    exec 2>&1 /usr/bin/time --format=%E "${script}" "$1"
  }

  function fremove ()
  {
    exec 2>&1 /usr/bin/time --format=%E ssh node2 "cd `pwd`; rm -Rf test*"
  }

  function fcp ()
  {
    exec 2>&1 /usr/bin/time --format=%E ssh node3 "cd `pwd`; cp -R test test.new"
  }

  echo -------------------------------------------------
  echo "| # files | create #s | copy #s | remove #s |"
  echo -------------------------------------------------
  for ((x=0; x < ${#iterations[*]} ; x++)) do
    create[$x]="`fcreate ${iterations[$x]}`"
    copy[$x]="`fcp ${iterations[$x]}`"
    remove[$x]="`fremove`"
    printf "| %8d | %9s | %9s | %9s |\n" ${iterations[$x]} ${create[$x]} ${copy[$x]} ${remove[$x]}
  done
  rm "${script}"
  echo "------------------------"
Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

fc597a30

ocfs2: avoid blocking in ocfs2_mark_lockres_freeing() in downconvert thread · 8f265718

Jan Kara authored Apr 03, 2014

commit 84d86f83 upstream.

If we are dropping last inode reference from downconvert thread, we will
end up calling ocfs2_mark_lockres_freeing() which can block if the lock
we are freeing is queued thus creating an A-A deadlock.  Luckily, since
we are the downconvert thread, we can immediately dequeue the lock and
thus avoid waiting in this case.
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

8f265718

ocfs2: implement delayed dropping of last dquot reference · cac99bee

Jan Kara authored Apr 03, 2014

commit e3a767b6 upstream.

We cannot drop last dquot reference from downconvert thread as that
creates the following deadlock:

NODE 1                                  NODE2
holds dentry lock for 'foo'
holds inode lock for GLOBAL_BITMAP_SYSTEM_INODE
                                        dquot_initialize(bar)
                                          ocfs2_dquot_acquire()
                                            ocfs2_inode_lock(USER_QUOTA_SYSTEM_INODE)
                                            ...
downconvert thread (triggered from another
node or a different process from NODE2)
  ocfs2_dentry_post_unlock()
    ...
    iput(foo)
      ocfs2_evict_inode(foo)
        ocfs2_clear_inode(foo)
          dquot_drop(inode)
            ...
	    ocfs2_dquot_release()
              ocfs2_inode_lock(USER_QUOTA_SYSTEM_INODE)
               - blocks
                                            finds we need more space in
                                            quota file
                                            ...
                                            ocfs2_extend_no_holes()
                                              ocfs2_inode_lock(GLOBAL_BITMAP_SYSTEM_INODE)
                                                - deadlocks waiting for
                                                  downconvert thread

We solve the problem by postponing dropping of the last dquot reference to
a workqueue if it happens from the downconvert thread.
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

cac99bee

quota: provide function to grab quota structure reference · b5258061

Jan Kara authored Apr 03, 2014

commit 9f985cb6 upstream.

Provide dqgrab() function to get quota structure reference when we are
sure it already has at least one active reference.  Make use of this
function inside quota code.
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

b5258061

ocfs2: move dquot_initialize() in ocfs2_delete_inode() somewhat later · 2eb0658f

Jan Kara authored Apr 03, 2014

commit bd62ad7a upstream.

Move dquot_initalize() call in ocfs2_delete_inode() after the moment we
verify inode is actually a sane one to delete.  We certainly don't want
to initialize quota for system inodes etc.  This also avoids calling
into quota code from downconvert thread.

Add more details into the comment why bailing out from
ocfs2_delete_inode() when we are in downconvert thread is OK.
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

2eb0658f

dlm: keep listening connection alive with sctp mode · 34b6c049

Lidong Zhong authored Jun 12, 2014

commit 883854c5 upstream.

The connection struct with nodeid 0 is the listening socket,
not a connection to another node.  The sctp resend function
was not checking that the nodeid was valid (non-zero), so it
would mistakenly get and resend on the listening connection
when nodeid was zero.
Signed-off-by: Lidong Zhong <lzhong@suse.com>
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

34b6c049

Btrfs: fix BUG_ON() casued by the reserved space migration · 9bf37c05

Miao Xie authored Sep 25, 2013

commit 20dd2cbf upstream.

When we did space balance and snapshot creation at the same time, we might
meet the following oops:
 kernel BUG at fs/btrfs/inode.c:3038!
 [SNIP]
 Call Trace:
 [<ffffffffa0411ec7>] btrfs_orphan_cleanup+0x293/0x407 [btrfs]
 [<ffffffffa042dc45>] btrfs_mksubvol.isra.28+0x259/0x373 [btrfs]
 [<ffffffffa042de85>] btrfs_ioctl_snap_create_transid+0x126/0x156 [btrfs]
 [<ffffffffa042dff1>] btrfs_ioctl_snap_create_v2+0xd0/0x121 [btrfs]
 [<ffffffffa0430b2c>] btrfs_ioctl+0x414/0x1854 [btrfs]
 [<ffffffff813b60b7>] ? __do_page_fault+0x305/0x379
 [<ffffffff811215a9>] vfs_ioctl+0x1d/0x39
 [<ffffffff81121d7c>] do_vfs_ioctl+0x32d/0x3e2
 [<ffffffff81057fe7>] ? finish_task_switch+0x80/0xb8
 [<ffffffff81121e88>] SyS_ioctl+0x57/0x83
 [<ffffffff813b39ff>] ? do_device_not_available+0x12/0x14
 [<ffffffff813b99c2>] system_call_fastpath+0x16/0x1b
 [SNIP]
 RIP  [<ffffffffa040da40>] btrfs_orphan_add+0xc3/0x126 [btrfs]

The reason of the problem is that the relocation root creation stole
the reserved space, which was reserved for orphan item deletion.

There are several ways to fix this problem, one is to increasing
the reserved space size of the space balace, and then we can use
that space to create the relocation tree for each fs/file trees.
But it is hard to calculate the suitable size because we doesn't
know how many fs/file trees we need relocate.

We fixed this problem by reserving the space for relocation root creation
actively since the space it need is very small (one tree block, used for
root node copy), then we use that reserved space to create the
relocation tree. If we don't reserve space for relocation tree creation,
we will use the reserved space of the balance.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

9bf37c05

Btrfs: fix two use-after-free bugs with transaction cleanup · 8b16b61c

Josef Bacik authored Sep 30, 2013

commit 724e2315 upstream.

I was noticing the slab redzone stuff going off every once and a while during
transaction aborts. This was caused by two things

1) We would walk the pending snapshots and set their error to -ECANCELED. We
don't need to do this, the snapshot stuff waits for a transaction commit and if
there is a problem we just free our pending snapshot object and exit. Doing
this was causing us to touch the pending snapshot object after the thing had
already been freed.

2) We were freeing the transaction manually with wanton disregard for it's
use_count reference counter. To fix this I cleaned up the transaction freeing
loop to either wait for the transaction commit to finish if it was in the middle
of that (since it will be cleaned and freed up there) or to do the cleanup
oursevles.

I also moved the global "kill all things dirty everywhere" stuff outside of the
transaction cleanup loop since that only needs to be done once. With this patch
I'm no longer seeing slab corruption because of use after frees. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

8b16b61c

Btrfs: don't delete ordered roots from list during cleanup · 445d1c3a

Josef Bacik authored Sep 27, 2013

commit 1de2cfde upstream.

During transaction cleanup after an abort we are just removing roots from the
ordered roots list which is incorrect.  We have a BUG_ON() to make sure that the
root is still part of the ordered roots list when we put our ordered extent
which we were tripping in this case.  So do like we do everywhere else and just
move it to the tail of the ordered roots list and allow the normal cleanup to
take care of stuff.  Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

445d1c3a

Btrfs: cleanup transaction on abort · bd32872f

Josef Bacik authored Sep 27, 2013

commit 4e121c06 upstream.

If we abort not during a transaction commit we won't clean up anything until we
unmount.  Unfortunately if we abort in the middle of writing out an ordered
extent we won't clean it up and if somebody is waiting on that ordered extent
they will wait forever.  To fix this just make the transaction kthread call the
cleanup transaction stuff if it notices theres an error, and make
btrfs_end_transaction wake up the transaction kthread if there is an error.
Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

bd32872f

Btrfs: do not release metadata for space cache inodes · 4b6d66d1

Josef Bacik authored Sep 27, 2013

commit b6d08f06 upstream.

I've been testing our error paths and I was tripping the BUG_ON() in
drop_outstanding_extent because our outstanding_extents is 0 for space cache
inodes. This is because we don't reserve metadata space for these inodes since
we depend on the global block reserve for our space. To fix this we need to
make sure the DO_ACCOUNTING stuff doesn't actually call release_metadata for
space cache inodes. With this patch I'm no longer panicing. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

4b6d66d1