Commits · e85bfd3aa7a34fa963bb268a676b41694e6dcf96 · nexedi / linux

23 Sep, 2010 17 commits

oom: filter unkillable tasks from tasklist dump · e85bfd3a

David Rientjes authored Sep 22, 2010

/proc/sys/vm/oom_dump_tasks is enabled by default, so it's necessary to
limit as much information as possible that it should emit.

The tasklist dump should be filtered to only those tasks that are eligible
for oom kill.  This is already done for memcg ooms, but this patch extends
it to both cpuset and mempolicy ooms as well as init.

In addition to suppressing irrelevant information, this also reduces
confusion since users currently don't know which tasks in the tasklist
aren't eligible for kill (such as those attached to cpusets or bound to
mempolicies with a disjoint set of mems or nodes, respectively) since that
information is not shown.
Signed-off-by: David Rientjes <rientjes@google.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

e85bfd3a

drivers/video/sis/sis_main.c: prevent reading uninitialized stack memory · fd02db9d

Dan Rosenberg authored Sep 22, 2010

The FBIOGET_VBLANK device ioctl allows unprivileged users to read 16 bytes
of uninitialized stack memory, because the "reserved" member of the
fb_vblank struct declared on the stack is not altered or zeroed before
being copied back to the user.  This patch takes care of it.
Signed-off-by: Dan Rosenberg <dan.j.rosenberg@gmail.com>
Cc: Thomas Winischhofer <thomas@winischhofer.net>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fd02db9d

uml: fix compile warning · cb1dcc0f

Richard Weinberger authored Sep 22, 2010

This fixes:
incompatible pointer type:  => 89
       arch/um/kernel/exec.c: warning: passing argument 2 of 'execve1' from
incompatible pointer type:  => 69, 85
       arch/um/kernel/exec.c: warning: passing argument 3 of 'execve1' from
incompatible pointer type:  => 69, 85

which was introduced by d7627467 ("Make do_execve() take a const
filename pointer")
Signed-off-by: Richard Weinberger <richard@nod.at>
Cc: David Howells <dhowells@redhat.com>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

cb1dcc0f

/proc/pid/smaps: fix dirty pages accounting · 1c2499ae

KOSAKI Motohiro authored Sep 22, 2010

Currently, /proc/<pid>/smaps has wrong dirty pages accounting.
Shared_Dirty and Private_Dirty output only pte dirty pages and ignore
PG_dirty page flag.  It is difference against documentation, but also
inconsistent against Referenced field.  (Referenced checks both pte and
page flags)

This patch fixes it.

Test program:

 large-array.c
 ---------------------------------------------------
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <unistd.h>

 char array[1*1024*1024*1024L];

 int main(void)
 {
         memset(array, 1, sizeof(array));
         pause();

         return 0;
 }
 ---------------------------------------------------

Test case:
 1. run ./large-array
 2. cat /proc/`pidof large-array`/smaps
 3. swapoff -a
 4. cat /proc/`pidof large-array`/smaps again

Test result:
 <before patch>

00601000-40601000 rw-p 00000000 00:00 0
Size:            1048576 kB
Rss:             1048576 kB
Pss:             1048576 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:    218992 kB   <-- showed pages as clean incorrectly
Private_Dirty:    829584 kB
Referenced:       388364 kB
Swap:                  0 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB

 <after patch>

00601000-40601000 rw-p 00000000 00:00 0
Size:            1048576 kB
Rss:             1048576 kB
Pss:             1048576 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:   1048576 kB  <-- fixed
Referenced:       388480 kB
Swap:                  0 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

1c2499ae

fbcon: fix lockdep warning from fbcon_deinit() · 142092e5

Jarek Poplawski authored Sep 22, 2010

Fix the lockdep warning:

[   13.657164] INFO: trying to register non-static key.
[   13.657169] the code is fine but needs lockdep annotation.
[   13.657171] turning off the locking correctness validator.
[   13.657177] Pid: 622, comm: modprobe Not tainted 2.6.36-rc3c #8
[   13.657180] Call Trace:
[   13.657194]  [<c13002c8>] ? printk+0x18/0x20
[   13.657202]  [<c1056cf6>] register_lock_class+0x336/0x350
[   13.657208]  [<c1058bf9>] __lock_acquire+0x449/0x1180
[   13.657215]  [<c1059997>] lock_acquire+0x67/0x80
[   13.657222]  [<c1042bf1>] ? __cancel_work_timer+0x51/0x230
[   13.657227]  [<c1042c23>] __cancel_work_timer+0x83/0x230
[   13.657231]  [<c1042bf1>] ? __cancel_work_timer+0x51/0x230
[   13.657236]  [<c10582b2>] ? mark_held_locks+0x62/0x80
[   13.657243]  [<c10b3a2f>] ? kfree+0x7f/0xe0
[   13.657248]  [<c105853c>] ? trace_hardirqs_on_caller+0x11c/0x160
[   13.657253]  [<c105858b>] ? trace_hardirqs_on+0xb/0x10
[   13.657259]  [<c117f4cd>] ? fbcon_deinit+0x16d/0x1e0
[   13.657263]  [<c117f4cd>] ? fbcon_deinit+0x16d/0x1e0
[   13.657268]  [<c1042dea>] cancel_work_sync+0xa/0x10
[   13.657272]  [<c117f444>] fbcon_deinit+0xe4/0x1e0
...

The warning is caused by trying to cancel an uninitialized work from
fbcon_exit().  Fix it by adding a check for queue.func, similarly to other
places in this code.
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

142092e5

efifb: support the EFI framebuffer on more Apple hardware · a5757c2a

Luke Macken authored Sep 22, 2010

Enable the EFI framebuffer on 14 more Macs, including the iMac11,1
iMac10,1 iMac8,1 Macmini3,1 Macmini4,1 MacBook5,1 MacBook6,1 MacBook7,1
MacBookPro2,2 MacBookPro5,2 MacBookPro5,3 MacBookPro6,1 MacBookPro6,2 and
MacBookPro7,1

Information gathered from various user submissions.

    https://bugzilla.redhat.com/show_bug.cgi?id=528232
    http://ubuntuforums.org/showthread.php?t=1557326

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Luke Macken <lmacken@redhat.com>
Signed-off-by: Peter Jones <pjones@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

a5757c2a

efifb: check that the base address is plausible on pci systems · 85a00d9b

Peter Jones authored Sep 22, 2010

Some Apple machines have identical DMI data but different memory
configurations for the video.  Given that, check that the address in our
table is actually within the range of a PCI BAR on a VGA device in the
machine.

This also fixes up the return value from set_system(), which has always
been wrong, but never resulted in bad behavior since there's only ever
been one matching entry in the dmi table.

The patch

1) stops people's machines from crashing when we get their display wrong,
   which seems to be unfortunately inevitable,

2) allows us to support identical dmi data with differing video memory
   configurations

This also adds me as the efifb maintainer, since I've effectively been
acting as such for quite some time.
Signed-off-by: Peter Jones <pjones@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

85a00d9b

aio: do not return ERESTARTSYS as a result of AIO · a0c42bac

Jan Kara authored Sep 22, 2010

OCFS2 can return ERESTARTSYS from its write function when the process is
signalled while waiting for a cluster lock (and the filesystem is mounted
with intr mount option).  Generally, it seems reasonable to allow
filesystems to return this error code from its IO functions.  As we must
not leak ERESTARTSYS (and similar error codes) to userspace as a result of
an AIO operation, we have to properly convert it to EINTR inside AIO code
(restarting the syscall isn't really an option because other AIO could
have been already submitted by the same io_submit syscall).
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Zach Brown <zach.brown@oracle.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

a0c42bac

vmscan: check all_unreclaimable in direct reclaim path · d1908362

Minchan Kim authored Sep 22, 2010

M.  Vefa Bicakci reported 2.6.35 kernel hang up when hibernation on his
32bit 3GB mem machine.
(https://bugzilla.kernel.org/show_bug.cgi?id=16771). Also he bisected
the regression to

  commit bb21c7ce
  Author: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
  Date:   Fri Jun 4 14:15:05 2010 -0700

     vmscan: fix do_try_to_free_pages() return value when priority==0 reclaim failure

At first impression, this seemed very strange because the above commit
only chenged function return value and hibernate_preallocate_memory()
ignore return value of shrink_all_memory().  But it's related.

Now, page allocation from hibernation code may enter infinite loop if the
system has highmem.  The reasons are that vmscan don't care enough OOM
case when oom_killer_disabled.

The problem sequence is following as.

1. hibernation
2. oom_disable
3. alloc_pages
4. do_try_to_free_pages
       if (scanning_global_lru(sc) && !all_unreclaimable)
               return 1;

If kswapd is not freozen, it would set zone->all_unreclaimable to 1 and
then shrink_zones maybe return true(ie, all_unreclaimable is true).  So at
last, alloc_pages could go to _nopage_.  If it is, it should have no
problem.

This patch adds all_unreclaimable check to protect in direct reclaim path,
too.  It can care of hibernation OOM case and help bailout
all_unreclaimable case slightly.
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
Reported-by: M. Vefa Bicakci <bicave@superonline.com>
Reported-by: <caiqian@redhat.com>
Reviewed-by: Johannes Weiner <hannes@cmpxchg.org>
Tested-by: <caiqian@redhat.com>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d1908362

drivers/rtc/rtc-ab3100.c: add missing platform_set_drvdata() in ab3100_rtc_probe() · eba93fcc

Axel Lin authored Sep 22, 2010

Otherwise, calling platform_get_drvdata() in ab3100_rtc_remove() returns
NULL.
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Acked-by: Wan ZongShun <mcuos.com@gmail.com>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

eba93fcc

MAINTAINERS: change AVR32 and AT32AP maintainer · f5665518

Hans-Christian Egtvedt authored Sep 22, 2010

Alter the maintainer of the AVR32 architecture and the AVR32/AT32AP
machine support to me. Haavard is moving on to new challenges, and we've
found it better to transfer the maintainer part to me. I will have good
contact with Haavard anyway.
Signed-off-by: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
Acked-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

f5665518

vmware balloon: rename module · d544b7a4

Dmitry Torokhov authored Sep 22, 2010

In an effort to minimize customer confusion we want to unify naming
convention for VMware-provided kernel modules. This change renames the
balloon driver from vmware_ballon to vmw_balloon.

We expect to follow this naming convention (vmw_<module_name>) for all
modules that are part of mainline kernel and/or being distributed by
VMware, with the sole exception of vmxnet3 driver (since the name of
mainline driver happens to match with the name used in VMware Tools).
Signed-off-by: Dmitry Torokhov <dtor@vmware.com>
Acked-by: Bhavesh Davda <bhavesh@vmware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d544b7a4

arm: fix "arm: fix pci_set_consistent_dma_mask for dmabounce devices" · 710224fa

FUJITA Tomonori authored Sep 22, 2010

This fixes the regression caused by the commit 6fee48cd
("dma-mapping: arm: use generic pci_set_dma_mask and
pci_set_consistent_dma_mask").

ARM needs to clip the dma coherent mask for dmabounce devices. This
restores the old trick.

Note that strictly speaking, the DMA API doesn't allow architectures to do
such but I'm not sure it's worth adding the new API to set the dma mask
that allows architectures to clip it.
Reported-by: Krzysztof Halasa <khc@pm.waw.pl>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

710224fa

/proc/vmcore: fix seeking · c227e690

Arnd Bergmann authored Sep 22, 2010

Commit 73296bc6 ("procfs: Use generic_file_llseek in /proc/vmcore")
broke seeking on /proc/vmcore.  This changes it back to use default_llseek
in order to restore the original behaviour.

The problem with generic_file_llseek is that it only allows seeks up to
inode->i_sb->s_maxbytes, which is zero on procfs and some other virtual
file systems.  We should merge generic_file_llseek and default_llseek some
day and clean this up in a proper way, but for 2.6.35/36, reverting vmcore
is the safer solution.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Reported-by: CAI Qian <caiqian@redhat.com>
Tested-by: CAI Qian <caiqian@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

c227e690

ipmi: fix acpi probe print · a9e31765

Yinghai Lu authored Sep 22, 2010

After d9e1b6c4 ("ipmi: fix ACPI detection with regspacing") we get

[   11.026326] ipmi_si: probing via ACPI
[   11.030019] ipmi_si 00:09: (null) regsize 1 spacing 1 irq 0
[   11.035594] ipmi_si: Adding ACPI-specified kcs state machine

on an old system with only one range for ipmi kcs range.

Try to fix it by adding another res pointer.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

a9e31765

oom: always return a badness score of non-zero for eligible tasks · f19e8aa1

David Rientjes authored Sep 22, 2010

A task's badness score is roughly a proportion of its rss and swap
compared to the system's capacity.  The scale ranges from 0 to 1000 with
the highest score chosen for kill.  Thus, this scale operates on a
resolution of 0.1% of RAM + swap.  Admin tasks are also given a 3% bonus,
so the badness score of an admin task using 3% of memory, for example,
would still be 0.

It's possible that an exceptionally large number of tasks will combine to
exhaust all resources but never have a single task that uses more than
0.1% of RAM and swap (or 3.0% for admin tasks).

This patch ensures that the badness score of any eligible task is never 0
so the machine doesn't unnecessarily panic because it cannot find a task
to kill.
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

f19e8aa1

Prevent freeing uninitialized pointer in compat_do_readv_writev · 767b68e9

Dan Rosenberg authored Sep 22, 2010

In 32-bit compatibility mode, the error handling for
compat_do_readv_writev() may free an uninitialized pointer, potentially
leading to all sorts of ugly memory corruption.  This is reliably
triggerable by unprivileged users by invoking the readv()/writev()
syscalls with an invalid iovec pointer.  The below patch fixes this to
emulate the non-compat version.

Introduced by commit b8373363 ("compat: factor out
compat_rw_copy_check_uvector from compat_do_readv_writev")
Signed-off-by: Dan Rosenberg <dan.j.rosenberg@gmail.com>
Cc: stable@kernel.org (2.6.35)
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

767b68e9

22 Sep, 2010 11 commits

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6 · c79bd892

Linus Torvalds authored Sep 22, 2010

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
  sparc: Prevent no-handler signal syscall restart recursion.
  sparc: Don't mask signal when we can't setup signal frame.
  sparc64: Fix race in signal instruction flushing.
  sparc64: Support RAW perf events.

c79bd892

powerpc: fix double syscall restarts · 9a81c16b

Al Viro authored Sep 20, 2010

Make sigreturn zero regs->trap, make do_signal() do the same on all
paths.  As it is, signal interrupting e.g. read() from fd 512 (==
ERESTARTSYS) with another signal getting unblocked when the first
handler finishes will lead to restart one insn earlier than it ought
to.  Same for multiple signals with in-kernel handlers interrupting
that sucker at the same time.  Same for multiple signals of any kind
interrupting that sucker on 64bit...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

9a81c16b

Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block · b68e9d45

Linus Torvalds authored Sep 22, 2010

* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  bdi: Fix warnings in __mark_inode_dirty for /dev/zero and friends
  char: Mark /dev/zero and /dev/kmem as not capable of writeback
  bdi: Initialize noop_backing_dev_info properly
  cfq-iosched: fix a kernel OOPs when usb key is inserted
  block: fix blk_rq_map_kern bio direction flag
  cciss: freeing uninitialized data on error path

b68e9d45

bdi: Fix warnings in __mark_inode_dirty for /dev/zero and friends · 692ebd17

Jan Kara authored Sep 21, 2010

Inodes of devices such as /dev/zero can get dirty for example via
utime(2) syscall or due to atime update. Backing device of such inodes
(zero_bdi, etc.) is however unable to handle dirty inodes and thus
__mark_inode_dirty complains.  In fact, inode should be rather dirtied
against backing device of the filesystem holding it. This is generally a
good rule except for filesystems such as 'bdev' or 'mtd_inodefs'. Inodes
in these pseudofilesystems are referenced from ordinary filesystem
inodes and carry mapping with real data of the device. Thus for these
inodes we have to use inode->i_mapping->backing_dev_info as we did so
far. We distinguish these filesystems by checking whether sb->s_bdi
points to a non-trivial backing device or not.

Example: Assume we have an ext3 filesystem on /dev/sda1 mounted on /.
There's a device inode A described by a path "/dev/sdb" on this
filesystem. This inode will be dirtied against backing device "8:0"
after this patch. bdev filesystem contains block device inode B coupled
with our inode A. When someone modifies a page of /dev/sdb, it's B that
gets dirtied and the dirtying happens against the backing device "8:16".
Thus both inodes get filed to a correct bdi list.

Cc: stable@kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

692ebd17

char: Mark /dev/zero and /dev/kmem as not capable of writeback · 371d217e

Jan Kara authored Sep 21, 2010

These devices don't do any writeback but their device inodes still can get
dirty so mark bdi appropriately so that bdi code does the right thing and files
inodes to lists of bdi carrying the device inodes.

Cc: stable@kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

371d217e

bdi: Initialize noop_backing_dev_info properly · 976e48f8

Jan Kara authored Sep 21, 2010

Properly initialize this backing dev info so that writeback code does not
barf when getting to it e.g. via sb->s_bdi.

Cc: stable@kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

976e48f8

sparc: Prevent no-handler signal syscall restart recursion. · c2785259

David S. Miller authored Sep 21, 2010

Explicitly clear the "in-syscall" bit when we have no signal
handler and back up the program counters to back up the system
call.
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

c2785259

sparc: Don't mask signal when we can't setup signal frame. · 392c2180

David S. Miller authored Sep 21, 2010

Don't invoke the signal handler tracehook in that situation
either.
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

392c2180

Merge branch 'for-linus/i2c/2636-rc5' of git://git.fluff.org/bjdooks/linux · 62f1b494

Linus Torvalds authored Sep 21, 2010

* 'for-linus/i2c/2636-rc5' of git://git.fluff.org/bjdooks/linux:
  i2c-omap: Make sure i2c bus is free before setting it to idle

62f1b494

fs: {lock,unlock}_flocks() stubs to prepare for BKL removal · 8b15575c

Sage Weil authored Sep 21, 2010

The lock structs are currently protected by the BKL, but are accessed by
code in fs/locks.c and misc file system and DLM code.  These stubs will
allow all users to switch to the new interface before the implementation
is changed to a spinlock.
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

8b15575c

i2c-omap: Make sure i2c bus is free before setting it to idle · 5c64eb26

Mathias Nyman authored Aug 26, 2010

If the i2c bus receives an interrupt with both BB (bus busy) and
ARDY (register access ready) statuses set during the tranfer of the last message
the bus was put to idle while still busy.

This caused bus to timeout.
Signed-off-by: Mathias Nyman <mathias.nyman@nokia.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>

5c64eb26

21 Sep, 2010 12 commits

Merge branch 'sched-fixes-for-linus' of... · 1ce1e41c

Linus Torvalds authored Sep 21, 2010

Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: Fix nohz balance kick
  sched: Fix user time incorrectly accounted as system time on 32-bit

1ce1e41c

Merge branch 'perf-fixes-for-linus' of... · 87ac6fa2

Linus Torvalds authored Sep 21, 2010

Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  hw breakpoints: Fix pid namespace bug
  x86: Fix instruction breakpoint encoding
  oprofile: Add Support for Intel CPU Family 6 / Model 22 (Intel Celeron 540)
  kprobes: Fix Kconfig dependency

87ac6fa2

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client · 19746cad

Linus Torvalds authored Sep 21, 2010

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  ceph: select CRYPTO
  ceph: check mapping to determine if FILE_CACHE cap is used
  ceph: only send one flushsnap per cap_snap per mds session
  ceph: fix cap_snap and realm split
  ceph: stop sending FLUSHSNAPs when we hit a dirty capsnap
  ceph: correctly set 'follows' in flushsnap messages
  ceph: fix dn offset during readdir_prepopulate
  ceph: fix file offset wrapping at 4GB on 32-bit archs
  ceph: fix reconnect encoding for old servers
  ceph: fix pagelist kunmap tail
  ceph: fix null pointer deref on anon root dentry release

19746cad

Merge branch 'drm-intel-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ickle/drm-intel · 0ffe37de

Linus Torvalds authored Sep 21, 2010

* 'drm-intel-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ickle/drm-intel:
  drm/i915: Hold a reference to the object whilst unbinding the eviction list
  drm/i915,agp/intel: Add second set of PCI-IDs for B43
  drm/i915: Fix Sandybridge fence registers
  drm/i915/crt: Downgrade warnings for hotplug failures
  drm/i915: Ensure that the crtcinfo is populated during mode_fixup()

0ffe37de

Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus · 4e24db5b

Linus Torvalds authored Sep 21, 2010

* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
  lguest: update comments to reflect LHCALL_LOAD_GDT_ENTRY.
  virtio: console: Prevent userspace from submitting NULL buffers
  virtio: console: Fix poll blocking even though there is data to read

4e24db5b

sched: Fix nohz balance kick · f6c3f168

Suresh Siddha authored Sep 13, 2010

There's a situation where the nohz balancer will try to wake itself:

cpu-x is idle which is also ilb_cpu
got a scheduler tick during idle
and the nohz_kick_needed() in trigger_load_balance() checks for
rq_x->nr_running which might not be zero (because of someone waking a
task on this rq etc) and this leads to the situation of the cpu-x
sending a kick to itself.

And this can cause a lockup.

Avoid this by not marking ourself eligible for kicking.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1284400941.2684.19.camel@sbsiddha-MOBL3.sc.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

f6c3f168

cfq-iosched: fix a kernel OOPs when usb key is inserted · 180be2a0

Vivek Goyal authored Sep 14, 2010

Mike reported a kernel crash when a usb key hotplug is performed while all
kernel thrads are not in a root cgroup and are running in one of the child
cgroups of blkio controller.

	BUG: unable to handle kernel NULL pointer dereference at 0000002c
	IP: [<c11c7b08>] cfq_get_queue+0x232/0x412
	*pde = 00000000
	Oops: 0000 [#1] PREEMPT
	last sysfs file: /sys/devices/pci0000:00/0000:00:1d.7/usb2/2-1/2-1:1.0/host3/scsi_host/host3/uevent

	[..]
	Pid: 30039, comm: scsi_scan_3 Not tainted 2.6.35.2-fg.roam #1 Volvi2                         /Aspire 4315
	EIP: 0060:[<c11c7b08>] EFLAGS: 00010086 CPU: 0
	EIP is at cfq_get_queue+0x232/0x412
	EAX: f705f9c0 EBX: e977abac ECX: 00000000 EDX: 00000000
	ESI: f00da400 EDI: f00da4ec EBP: e977a800 ESP: dff8fd00
	 DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
	Process scsi_scan_3 (pid: 30039, ti=dff8e000 task=f6b6c9a0 task.ti=dff8e000)
	Stack:
	 00000000 00000000 00000001 01ff0000 f00da508 00000000 f00da524 f00da540
	<0> e7994940 dd631750 f705f9c0 e977a820 e977ac44 f00da4d0 00000001 f6b6c9a0
	<0> 00000010 00008010 0000000b 00000000 00000001 e977a800 dd76fac0 00000246
	Call Trace:
	 [<c11c7f10>] ? cfq_set_request+0x228/0x34c
	 [<c11c7ce8>] ? cfq_set_request+0x0/0x34c
	 [<c11bb3b9>] ? elv_set_request+0xf/0x1c
	 [<c11bdd51>] ? get_request+0x1ad/0x22f
	 [<c11bddf2>] ? get_request_wait+0x1f/0x11a
	 [<c11d013b>] ? kvasprintf+0x33/0x3b
	 [<c127b537>] ? scsi_execute+0x1d/0x103
	 [<c127b675>] ? scsi_execute_req+0x58/0x83
	 [<c127c391>] ? scsi_probe_and_add_lun+0x188/0x7c2
	 [<c12718c6>] ? attribute_container_add_device+0x15/0xfa
	 [<c11c95d1>] ? kobject_get+0xf/0x13
	 [<c126d1db>] ? get_device+0x10/0x14
	 [<c127be93>] ? scsi_alloc_target+0x217/0x24d
	 [<c127cbd8>] ? __scsi_scan_target+0x95/0x480
	 [<c10204eb>] ? dequeue_entity+0x14/0x1fe
	 [<c1020491>] ? update_curr+0x165/0x1ab
	 [<c1020491>] ? update_curr+0x165/0x1ab
	 [<c127d00d>] ? scsi_scan_channel+0x4a/0x76
	 [<c127d0b0>] ? scsi_scan_host_selected+0x77/0xad
	 [<c127d13c>] ? do_scan_async+0x0/0x11a
	 [<c127d137>] ? do_scsi_scan_host+0x51/0x56
	 [<c127d13c>] ? do_scan_async+0x0/0x11a
	 [<c127d14a>] ? do_scan_async+0xe/0x11a
	 [<c127d13c>] ? do_scan_async+0x0/0x11a
	 [<c10354c5>] ? kthread+0x5e/0x63
	 [<c1035467>] ? kthread+0x0/0x63
	 [<c1002af6>] ? kernel_thread_helper+0x6/0x10
	Code: 44 24 1c 54 83 44 24 18 54 83 fa 03 75 94 8b 06 c7 86 64 02 00 00 01 00 00 00 83 e0 03 09 f0 89 06 8b 44 24 28 8b 90 58 01 00 00 <8b> 42 2c 85 c0 75 03 8b 42 08 8d 54 24 48 52 8d 4c 24 50 51 68
	EIP: [<c11c7b08>] cfq_get_queue+0x232/0x412 SS:ESP 0068:dff8fd00
	CR2: 000000000000002c
	---[ end trace 9a88306573f69b12 ]---

The problem here is that we don't have bdi->dev information available when
thread does some IO.  Hence when dev_name() tries to access bdi->dev, it
crashes.

This problem does not happen if kernel threads are in root group as root
group is statically allocated at device initialization time and we don't
hit this piece of code.

Fix it by delaying the filling of major and minor number information of
device in blk_group.  Initially a blk_group is created with 0 as device
information and this information is filled later once some more IO comes
in from same group.
Reported-by: Mike Kazantsev <mk.fraggod@gmail.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

180be2a0

block: fix blk_rq_map_kern bio direction flag · a45dc2d2

Benny Halevy authored Sep 13, 2010

This bug was introduced in 7b6d91da
"block: unify flags for struct bio and struct request"

Cc: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

a45dc2d2

cciss: freeing uninitialized data on error path · b0722cb1

Dan Carpenter authored Sep 13, 2010

The "h->scatter_list" is allocated inside a for loop. If any of those
allocations fail, then the rest of the list is uninitialized data. When
we free it we should start from the top and free backwards so that we
don't call kfree() on uninitialized pointers.

Also if the allocation for "h->scatter_list" fails then we would get an
Oops here. I should have noticed this when I send: 4ee69851 "cciss:
handle allocation failure." but I didn't. Sorry about that.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

b0722cb1

Merge remote branch 'linus' into drm-intel-fixes · db8c076b
Chris Wilson authored Sep 21, 2010

db8c076b

sparc64: Fix race in signal instruction flushing. · 05c5e769

David S. Miller authored Sep 20, 2010

If another cpu does a very wide munmap() on the signal frame area,
it can tear down the page table hierarchy from underneath us.

Borrow an idea from the 64-bit fault path's get_user_insn(), and
disable cross call interrupts during the page table traversal
to lock them in place while we operate.
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

05c5e769

lguest: update comments to reflect LHCALL_LOAD_GDT_ENTRY. · 9b6efcd2

Rusty Russell authored Sep 21, 2010

We used to have a hypercall which reloaded the entire GDT, then we
switched to one which loaded a single entry (to match the IDT code).

Some comments were not updated, so fix them.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Reported by: Eviatar Khen <eviatarkhen@gmail.com>

9b6efcd2