Commits · 8ab2e3a2ae06bdd02b2a7269ef6ed896540a6170 · nexedi / linux

16 Aug, 2011 3 commits

cris: fix the prototype of sync_serial_ioctl() · 8ab2e3a2

WANG Cong authored Aug 03, 2011

commit b4bc2812 upstream.

Fix:

  arch/cris/arch-v10/drivers/sync_serial.c:961: error: conflicting types for 'sync_serial_ioctl'
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

8ab2e3a2

cris: fix a build error in sync_serial_open() · a2708fa2

WANG Cong authored Aug 03, 2011

commit 4b851d88 upstream.

Fix:

  arch/cris/arch-v10/drivers/sync_serial.c:628: error: 'ret' undeclared (first use in this function)

'ret' should be 'err'.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

a2708fa2

cris: fix a build error in kernel/fork.c · f5508a09

WANG Cong authored Aug 03, 2011

commit d4969213 upstream.

Fix this error:

  kernel/fork.c:267: error: implicit declaration of function 'alloc_thread_info_node'

This is due to renaming alloc_thread_info() to alloc_thread_info_node().

[akpm@linux-foundation.org: coding-style fixes]
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

f5508a09

05 Aug, 2011 37 commits

Linux 3.0.1 · 94ed5b47
Greg Kroah-Hartman authored Aug 04, 2011

94ed5b47

dm: fix idr leak on module removal · c2b49885

Alasdair G Kergon authored Aug 02, 2011

commit d15b774c upstream.

Destroy _minor_idr when unloading the core dm module.  (Found by kmemleak.)
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

c2b49885

dm mpath: fix potential NULL pointer in feature arg processing · eb81cf19

Mike Snitzer authored Aug 02, 2011

commit 286f367d upstream.

Avoid dereferencing a NULL pointer if the number of feature arguments
supplied is fewer than indicated.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

eb81cf19

dm snapshot: flush disk cache when merging · b41ed9c3

Mikulas Patocka authored Aug 02, 2011

commit 762a80d9 upstream.

This patch makes dm-snapshot flush disk cache when writing metadata for
merging snapshot.

Without cache flushing the disk may reorder metadata write and other
data writes and there is a possibility of data corruption in case of
power fault.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

b41ed9c3

dm io: flush cpu cache with vmapped io · ee607aa2

Mikulas Patocka authored Aug 02, 2011

commit bb91bc7b upstream.

For normal kernel pages, CPU cache is synchronized by the dma layer.
However, this is not done for pages allocated with vmalloc. If we do I/O
to/from vmallocated pages, we must synchronize CPU cache explicitly.

Prior to doing I/O on vmallocated page we must call
flush_kernel_vmap_range to flush dirty cache on the virtual address.
After finished read we must call invalidate_kernel_vmap_range to
invalidate cache on the virtual address, so that accesses to the virtual
address return newly read data and not stale data from CPU cache.

This patch fixes metadata corruption on dm-snapshots on PA-RISC and
possibly other architectures with caches indexed by virtual address.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ee607aa2

ALSA: sound/core/pcm_compat.c: adjust array index · f8c62dc2

Julia Lawall authored Jul 28, 2011

commit ca9380fd upstream.

Convert array index from the loop bound to the loop index.

A simplified version of the semantic patch that fixes this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
@@
expression e1,e2,ar;
@@

for(e1 = 0; e1 < e2; e1++) { <...
  ar[
- e2
+ e1
  ]
  ...> }
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

f8c62dc2

watchdog: shwdt: fix usage of mod_timer · f9e4715d

David Engraf authored Jul 20, 2011

commit bea19066 upstream.

Fix the usage of mod_timer() and make the driver usable. mod_timer() must
be called with an absolute timeout in jiffies. The old implementation
used a relative timeout thus the hardware watchdog was never triggered.
Signed-off-by: David Engraf <david.engraf@sysgo.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Wim Van sebroeck <wim@iguana.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

f9e4715d

GFS2: Fix mount hang caused by certain access pattern to sysfs files · 52880922

Steven Whitehouse authored Jul 26, 2011

commit 19237039 upstream.

Depending upon the order of userspace/kernel during the
mount process, this can result in a hang without the
_all version of the completion.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

52880922

rt2x00: Add device ID for RT539F device. · 2cd0312d

Gertjan van Wingerde authored Jul 06, 2011

commit 71e0b38c upstream.
Reported-by: Wim Vander Schelden <wim@fixnum.org>
Signed-off-by: Gertjan van Wingerde <gwingerde@gmail.com>
Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

2cd0312d

oom: task->mm == NULL doesn't mean the memory was freed · 84416db6

Oleg Nesterov authored Jul 30, 2011

commit c027a474 upstream.

exit_mm() sets ->mm == NULL then it does mmput()->exit_mmap() which
frees the memory.

However select_bad_process() checks ->mm != NULL before TIF_MEMDIE,
so it continues to kill other tasks even if we have the oom-killed
task freeing its memory.

Change select_bad_process() to check ->mm after TIF_MEMDIE, but skip
the tasks which have already passed exit_notify() to ensure a zombie
with TIF_MEMDIE set can't block oom-killer. Alternatively we could
probably clear TIF_MEMDIE after exit_mmap().
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

84416db6

AppArmor: Fix masking of capabilities in complain mode · 06b94385

John Johansen authored Jun 25, 2011

commit 25e75dff upstream.

AppArmor is masking the capabilities returned by capget against the
capabilities mask in the profile.  This is wrong, in complain mode the
profile has effectively all capabilities, as the profile restrictions are
not being enforced, merely tested against to determine if an access is
known by the profile.

This can result in the wrong behavior of security conscience applications
like sshd which examine their capability set, and change their behavior
accordingly.  In this case because of the masked capability set being
returned sshd fails due to DAC checks, even when the profile is in complain
mode.

Kernels affected: 2.6.36 - 3.0.
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

06b94385

AppArmor: Fix reference to rcu protected pointer outside of rcu_read_lock · 0635a74b

John Johansen authored Jun 28, 2011

commit 04fdc099 upstream.

The pointer returned from tracehook_tracer_task() is only valid inside
the rcu_read_lock.  However the tracer pointer obtained is being passed
to aa_may_ptrace outside of the rcu_read_lock critical section.

Mover the aa_may_ptrace test into the rcu_read_lock critical section, to
fix this.

Kernels affected: 2.6.36 - 3.0
Reported-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

0635a74b

ipc/sem.c: fix race with concurrent semtimedop() timeouts and IPC_RMID · e73ff290

Manfred Spraul authored Jul 25, 2011

commit d694ad62 upstream.

If a semaphore array is removed and in parallel a sleeping task is woken
up (signal or timeout, does not matter), then the woken up task does not
wait until wake_up_sem_queue_do() is completed. This will cause crashes,
because wake_up_sem_queue_do() will read from a stale pointer.

The fix is simple: Regardless of anything, always call get_queue_result().
This function waits until wake_up_sem_queue_do() has finished it's task.

Addresses https://bugzilla.kernel.org/show_bug.cgi?id=27142Reported-by: Yuriy Yevtukhov <yuriy@ucoz.com>
Reported-by: Harald Laabs <kernel@dasr.de>
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

e73ff290

hvc_console: Improve tty/console put_chars handling · 9f78aa15

Hendrik Brueckner authored Jul 05, 2011

commit 8c2381af upstream.

Currently, the hvc_console_print() function drops console output if the
hvc backend's put_chars() returns 0.  This patch changes this behavior
to allow a retry through returning -EAGAIN.

This change also affects the hvc_push() function.  Both functions are
changed to handle -EAGAIN and to retry the put_chars() operation.

If a hvc backend returns -EAGAIN, the retry handling differs:

  - hvc_console_print() spins to write the complete console output.
  - hvc_push() behaves the same way as for returning 0.

Now hvc backends can indirectly control the way how console output is
handled through the hvc console layer.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Acked-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

9f78aa15

powerpc/pseries/hvconsole: Fix dropped console output · dc96c181

Anton Blanchard authored Jul 05, 2011

commit 51d33021 upstream.

Return -EAGAIN when we get H_BUSY back from the hypervisor. This
makes the hvc console driver retry, avoiding dropped printks.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

dc96c181

SERIAL: SC26xx: Fix link error. · 86c361cd

Ralf Baechle authored Jun 27, 2011

commit f2eb3cdf upstream.

Kconfig allows enabling console support for the SC26xx driver even when
it's configured as a module resulting in a:

ERROR: "uart_console_device" [drivers/tty/serial/sc26xx.ko] undefined!

modpost error since the driver was merged in
eea63e0e [SC26XX: New serial driver for
SC2681 uarts] in 2.6.25.  Fixed by only allowing console support to be
enabled if the driver is builtin.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-serial@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@linux-mips.org
Acked-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

86c361cd

tty/serial: Fix XSCALE serial ports, e.g. ce4100 · c4b9902f

Stephen Warren authored Jun 17, 2011

commit 5568181f upstream.

Commit 4539c24f "tty/serial: Add
explicit PORT_TEGRA type" introduced separate flags describing the need
for IER bits UUE and RTOIE. Both bits are required for the XSCALE port
type. While that patch updated uart_config[] as required, the auto-probing
code wasn't updated to set the RTOIE flag when an XSCALE port type was
detected. This caused such ports to stop working. This patch rectifies
that.
Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

c4b9902f

memcg: fix behavior of mem_cgroup_resize_limit() · a6f0411f

Daisuke Nishimura authored Jul 26, 2011

commit 108b6a78 upstream.

Commit 22a668d7 ("memcg: fix behavior under memory.limit equals to
memsw.limit") introduced "memsw_is_minimum" flag, which becomes true
when mem_limit == memsw_limit.  The flag is checked at the beginning of
reclaim, and "noswap" is set if the flag is true, because using swap is
meaningless in this case.

This works well in most cases, but when we try to shrink mem_limit,
which is the same as memsw_limit now, we might fail to shrink mem_limit
because swap doesn't used.

This patch fixes this behavior by:
 - check MEM_CGROUP_RECLAIM_SHRINK at the begining of reclaim
 - If it is set, don't set "noswap" flag even if memsw_is_minimum is true.
Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <bsingharora@gmail.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Ying Han <yinghan@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

a6f0411f

cfg80211: really ignore the regulatory request · ea530dbf

Sven Neumann authored Jul 12, 2011

commit a203c2aa upstream.

At the beginning of wiphy_update_regulatory() a check is performed
whether the request is to be ignored. Then the request is sent to
the driver nevertheless. This happens even if last_request points
to NULL, leading to a crash in the driver:

 [<bf01d864>] (lbs_set_11d_domain_info+0x28/0x1e4 [libertas]) from [<c03b714c>] (wiphy_update_regulatory+0x4d0/0x4f4)
 [<c03b714c>] (wiphy_update_regulatory+0x4d0/0x4f4) from [<c03b4008>] (wiphy_register+0x354/0x420)
 [<c03b4008>] (wiphy_register+0x354/0x420) from [<bf01b17c>] (lbs_cfg_register+0x80/0x164 [libertas])
 [<bf01b17c>] (lbs_cfg_register+0x80/0x164 [libertas]) from [<bf020e64>] (lbs_start_card+0x20/0x88 [libertas])
 [<bf020e64>] (lbs_start_card+0x20/0x88 [libertas]) from [<bf02cbd8>] (if_sdio_probe+0x898/0x9c0 [libertas_sdio])

Fix this by returning early. Also remove the out: label as it is
not any longer needed.
Signed-off-by: Sven Neumann <s.neumann@raumfeld.com>
Cc: linux-wireless@vger.kernel.org
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Daniel Mack <daniel@zonque.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ea530dbf

EHCI: fix direction handling for interrupt data toggles · d10a6cb2

Alan Stern authored Jul 19, 2011

commit e04f5f7e upstream.

This patch (as1480) fixes a rather obscure bug in ehci-hcd.  The
qh_update() routine needs to know the number and direction of the
endpoint corresponding to its QH argument.  The number can be taken
directly from the QH data structure, but the direction isn't stored
there.  The direction is taken instead from the first qTD linked to
the QH.

However, it turns out that for interrupt transfers, qh_update() gets
called before the qTDs are linked to the QH.  As a result, qh_update()
computes a bogus direction value, which messes up the endpoint toggle
handling.  Under the right combination of circumstances this causes
usb_reset_endpoint() not to work correctly, which causes packets to be
dropped and communications to fail.

Now, it's silly for the QH structure not to have direct access to all
the descriptor information for the corresponding endpoint.  Ultimately
it may get a pointer to the usb_host_endpoint structure; for now,
adding a copy of the direction flag solves the immediate problem.

This allows the Spyder2 color-calibration system (a low-speed USB
device that sends all its interrupt data packets with the toggle set
to 0 and hance requires constant use of usb_reset_endpoint) to work
when connected through a high-speed hub.  Thanks to Graeme Gill for
supplying the hardware that allowed me to track down this bug.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: Graeme Gill <graeme@argyllcms.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

d10a6cb2

EHCI: only power off port if over-current is active · e151a2a6

Sergei Shtylyov authored Jul 06, 2011

commit 81463c1d upstream.

MAX4967 USB power supply chip we use on our boards signals over-current when
power is not enabled; once it's enabled, over-current signal returns to normal.
That unfortunately caused the endless stream of "over-current change on port"
messages. The EHCI root hub code reacts on every over-current signal change
with powering off the port -- such change event is generated the moment the
port power is enabled, so once enabled the power is immediately cut off.
I think we should only cut off power when we're seeing the active over-current
signal, so I'm adding such check to that code. I also think that the fact that
we've cut off the port power should be reflected in the result of GetPortStatus
request immediately, hence I'm adding a PORTSCn register readback after write...
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

e151a2a6

n_gsm: fix the wrong FCS handling · 569f3720

Du, Alek authored Jul 07, 2011

commit f086ced1 upstream.

FCS could be GSM0_SOF, so will break state machine...

[This byte isn't quoted in any way so a SOF here doesn't imply an error
 occurred.]
Signed-off-by: Alek Du <alek.du@intel.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
[Trivial but best backported once its in 3.1rc I think]
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

569f3720

proc: fix a race in do_io_accounting() · 8cd3f19d

Vasiliy Kulikov authored Jul 26, 2011

commit 293eb1e7 upstream.

If an inode's mode permits opening /proc/PID/io and the resulting file
descriptor is kept across execve() of a setuid or similar binary, the
ptrace_may_access() check tries to prevent using this fd against the
task with escalated privileges.

Unfortunately, there is a race in the check against execve().  If
execve() is processed after the ptrace check, but before the actual io
information gathering, io statistics will be gathered from the
privileged process.  At least in theory this might lead to gathering
sensible information (like ssh/ftp password length) that wouldn't be
available otherwise.

Holding task->signal->cred_guard_mutex while gathering the io
information should protect against the race.

The order of locking is similar to the one inside of ptrace_attach():
first goes cred_guard_mutex, then lock_task_sighand().
Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

8cd3f19d

NFS: Fix spurious readdir cookie loop messages · c14acb19

Trond Myklebust authored Jul 30, 2011

commit 0c030806 upstream.

If the directory contents change, then we have to accept that the
file->f_pos value may shrink if we do a 'search-by-cookie'. In that
case, we should turn off the loop detection and let the NFS client
try to recover.

The patch also fixes a second loop detection bug by ensuring
that after turning on the ctx->duped flag, we read at least one new
cookie into ctx->dir_cookie before attempting to match with
ctx->dup_cookie.
Reported-by: Petr Vandrovec <petr@vandrovec.name>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

c14acb19

NFSv4: Don't use the delegation->inode in nfs_mark_return_delegation() · 1fcb9d4b

Trond Myklebust authored Jul 25, 2011

commit ed1e6211 upstream.

nfs_mark_return_delegation() is usually called without any locking, and
so it is not safe to dereference delegation->inode. Since the inode is
only used to discover the nfs_client anyway, it makes more sense to
have the callers pass a valid pointer to the nfs_server as a parameter.
Reported-by: Ian Kent <raven@themaw.net>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

1fcb9d4b

svcrpc: fix list-corrupting race on nfsd shutdown · 83d20a07

J. Bruce Fields authored Jun 29, 2011

commit ebc63e53 upstream.

After commit 3262c816 "[PATCH] knfsd:
split svc_serv into pools", svc_delete_xprt (then svc_delete_socket) no
longer removed its xpt_ready (then sk_ready) field from whatever list it
was on, noting that there was no point since the whole list was about to
be destroyed anyway.

That was mostly true, but forgot that a few svc_xprt_enqueue()'s might
still be hanging around playing with the about-to-be-destroyed list, and
could get themselves into trouble writing to freed memory if we left
this xprt on the list after freeing it.

(This is actually functionally identical to a patch made first by Ben
Greear, but with more comments.)

Cc: gnb@fmeh.org
Reported-by: Ben Greear <greearb@candelatech.com>
Tested-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

83d20a07

nfsd4: fix file leak on open_downgrade · 4beae54c

J. Bruce Fields authored Jun 29, 2011

commit f197c271 upstream.

Stateid's hold a read reference for a read open, a write reference for a
write open, and an additional one of each for each read+write open.  The
latter wasn't getting put on a downgrade, so something like:

	open RW
	open R
	downgrade to R

was resulting in a file leak.

Also fix an imbalance in an error path.

Regression from 7d947842 "nfsd4: fix
downgrade/lock logic".
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

4beae54c

nfsd4: remember to put RW access on stateid destruction · ecf6c748

J. Bruce Fields authored Jun 27, 2011

commit 499f3edc upstream.

Without this, for example,

	open read
	open read+write
	close

will result in a struct file leak.

Regression from 7d947842 "nfsd4: fix
downgrade/lock logic".
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ecf6c748

nfsd: don't break lease on CLAIM_DELEGATE_CUR · f6d7de0e

Casey Bodley authored Jul 23, 2011

commit 0c12eaff upstream.

CLAIM_DELEGATE_CUR is used in response to a broken lease; allowing it
to break the lease and return EAGAIN leaves the client unable to make
progress in returning the delegation

nfs4_get_vfs_file() now takes struct nfsd4_open for access to the
claim type, and calls nfsd_open() with NFSD_MAY_NOT_BREAK_LEASE when
claim type is CLAIM_DELEGATE_CUR
Signed-off-by: Casey Bodley <cbodley@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

f6d7de0e

eCryptfs: Unlock keys needed by ecryptfsd · efc977be

Tyler Hicks authored Jul 26, 2011

commit b2987a5e upstream.

Fixes a regression caused by b5695d04

Kernel keyring keys containing eCryptfs authentication tokens should not
be write locked when calling out to ecryptfsd to wrap and unwrap file
encryption keys. The eCryptfs kernel code can not hold the key's write
lock because ecryptfsd needs to request the key after receiving such a
request from the kernel.

Without this fix, all file opens and creates will timeout and fail when
using the eCryptfs PKI infrastructure. This is not an issue when using
passphrase-based mount keys, which is the most widely deployed eCryptfs
configuration.
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
Acked-by: Roberto Sassu <roberto.sassu@polito.it>
Tested-by: Roberto Sassu <roberto.sassu@polito.it>
Tested-by: Alexis Hafner1 <haf@zurich.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

efc977be

ecryptfs: Make inode bdi consistent with superblock bdi · a21353ba

Thieu Le authored Jul 26, 2011

commit 985ca0e6 upstream.

Make the inode mapping bdi consistent with the superblock bdi so that
dirty pages are flushed properly.
Signed-off-by: Thieu Le <thieule@chromium.org>
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

a21353ba

ext3: Fix oops in ext3_try_to_allocate_with_rsv() · 57073d34

Jan Kara authored May 30, 2011

commit ad95c5e9 upstream.

Block allocation is called from two places: ext3_get_blocks_handle() and
ext3_xattr_block_set(). These two callers are not necessarily synchronized
because xattr code holds only xattr_sem and i_mutex, and
ext3_get_blocks_handle() may hold only truncate_mutex when called from
writepage() path. Block reservation code does not expect two concurrent
allocations to happen to the same inode and thus assertions can be triggered
or reservation structure corruption can occur.

Fix the problem by taking truncate_mutex in xattr code to serialize
allocations.

CC: Sage Weil <sage@newdream.net>
Reported-by: Fyodor Ustinov <ufm@ufm.su>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

57073d34

ext4: free allocated and pre-allocated blocks when check_eofblocks_fl fails · fac04f94

Jiaying Zhang authored Jul 10, 2011

commit 575a1d4b upstream.

Upon corrupted inode or disk failures, we may fail after we already
allocate some blocks from the inode or take some blocks from the
inode's preallocation list, but before we successfully insert the
corresponding extent to the extent tree. In this case, we should free
any allocated blocks and discard the inode's preallocated blocks
because the entries in the inode's preallocation list may be in an
inconsistent state.
Signed-off-by: Jiaying Zhang <jiayingz@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

fac04f94

ext4: fix i_blocks/quota accounting when extent insertion fails · 99cdf2a4

Maxim Patlasov authored Jul 10, 2011

commit 7132de74 upstream.

The current implementation of ext4_free_blocks() always calls
dquot_free_block This looks quite sensible in the most cases: blocks
to be freed are associated with inode and were accounted in quota and
i_blocks some time ago.

However, there is a case when blocks to free were not accounted by the
time calling ext4_free_blocks() yet:

1. delalloc is on, write_begin pre-allocated some space in quota
2. write-back happens, ext4 allocates some blocks in ext4_ext_map_blocks()
3. then ext4_ext_map_blocks() gets an error (e.g.  ENOSPC) from
   ext4_ext_insert_extent() and calls ext4_free_blocks().

In this scenario, ext4_free_blocks() calls dquot_free_block() who, in
turn, decrements i_blocks for blocks which were not accounted yet (due
to delalloc) After clean umount, e2fsck reports something like:

> Inode 21, i_blocks is 5080, should be 5128.  Fix<y>?
because i_blocks was erroneously decremented as explained above.

The patch fixes the problem by passing the new flag
EXT4_FREE_BLOCKS_NO_QUOT_UPDATE to ext4_free_blocks(), to request
that the dquot_free_block() call be skipped.
Signed-off-by: Maxim Patlasov <maxim.patlasov@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

99cdf2a4

xtensa: prevent arbitrary read in ptrace · f7ac7c5b

Dan Rosenberg authored Jul 25, 2011

commit 0d0138eb upstream.

Prevent an arbitrary kernel read.  Check the user pointer with access_ok()
before copying data in.

[akpm@linux-foundation.org: s/EIO/EFAULT/]
Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com>
Cc: Christian Zankel <chris@zankel.net>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

f7ac7c5b

mm/backing-dev.c: reset bdi min_ratio in bdi_unregister() · 650957da

Peter Zijlstra authored Jul 25, 2011

commit ccb6108f upstream.

Vito said:

: The system has many usb disks coming and going day to day, with their
: respective bdi's having min_ratio set to 1 when inserted.  It works for
: some time until eventually min_ratio can no longer be set, even when the
: active set of bdi's seen in /sys/class/bdi/*/min_ratio doesn't add up to
: anywhere near 100.
:
: This then leads to an unrelated starvation problem caused by write-heavy
: fuse mounts being used atop the usb disks, a problem the min_ratio setting
: at the underlying devices bdi effectively prevents.

Fix this leakage by resetting the bdi min_ratio when unregistering the
BDI.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Reported-by: Vito Caputo <lkml@pengaru.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

650957da

mm/futex: fix futex writes on archs with SW tracking of dirty & young · b045b9a2

Benjamin Herrenschmidt authored Jul 25, 2011

commit 2efaca92 upstream.

I haven't reproduced it myself but the fail scenario is that on such
machines (notably ARM and some embedded powerpc), if you manage to hit
that futex path on a writable page whose dirty bit has gone from the PTE,
you'll livelock inside the kernel from what I can tell.

It will go in a loop of trying the atomic access, failing, trying gup to
"fix it up", getting succcess from gup, go back to the atomic access,
failing again because dirty wasn't fixed etc...

So I think you essentially hang in the kernel.

The scenario is probably rare'ish because affected architecture are
embedded and tend to not swap much (if at all) so we probably rarely hit
the case where dirty is missing or young is missing, but I think Shan has
a piece of SW that can reliably reproduce it using a shared writable
mapping & fork or something like that.

On archs who use SW tracking of dirty & young, a page without dirty is
effectively mapped read-only and a page without young unaccessible in the
PTE.

Additionally, some architectures might lazily flush the TLB when relaxing
write protection (by doing only a local flush), and expect a fault to
invalidate the stale entry if it's still present on another processor.

The futex code assumes that if the "in_atomic()" access -EFAULT's, it can
"fix it up" by causing get_user_pages() which would then be equivalent to
taking the fault.

However that isn't the case.  get_user_pages() will not call
handle_mm_fault() in the case where the PTE seems to have the right
permissions, regardless of the dirty and young state.  It will eventually
update those bits ...  in the struct page, but not in the PTE.

Additionally, it will not handle the lazy TLB flushing that can be
required by some architectures in the fault case.

Basically, gup is the wrong interface for the job.  The patch provides a
more appropriate one which boils down to just calling handle_mm_fault()
since what we are trying to do is simulate a real page fault.

The futex code currently attempts to write to user memory within a
pagefault disabled section, and if that fails, tries to fix it up using
get_user_pages().

This doesn't work on archs where the dirty and young bits are maintained
by software, since they will gate access permission in the TLB, and will
not be updated by gup().

In addition, there's an expectation on some archs that a spurious write
fault triggers a local TLB flush, and that is missing from the picture as
well.

I decided that adding those "features" to gup() would be too much for this
already too complex function, and instead added a new simpler
fixup_user_fault() which is essentially a wrapper around handle_mm_fault()
which the futex code can call.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix some nits Darren saw, fiddle comment layout]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reported-by: Shan Hai <haishan.bai@gmail.com>
Tested-by: Shan Hai <haishan.bai@gmail.com>
Cc: David Laight <David.Laight@ACULAB.COM>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Darren Hart <darren.hart@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

b045b9a2