1. 10 Jun, 2013 14 commits
    • kmod: make __request_module() killable · ec4808e7
      Oleg Nesterov authored
      commit 1cc684ab upstream
      
      As Tetsuo Handa pointed out, request_module() can stress the system
      while the oom-killed caller sleeps in TASK_UNINTERRUPTIBLE.
      
      The task T uses "almost all" memory, then it does something which
      triggers request_module().  Say, it can simply call sys_socket().  This
      in turn needs more memory and leads to OOM.  oom-killer correctly
      chooses T and kills it, but this can't help because it sleeps in
      TASK_UNINTERRUPTIBLE and after that oom-killer becomes "disabled" by the
      TIF_MEMDIE task T.
      
      Make __request_module() killable.  The only necessary change is that
      call_modprobe() should kmalloc argv and module_name; they can't live on
      the stack if we use UMH_KILLABLE.  This memory is freed via
      call_usermodehelper_freeinfo()->cleanup.
      Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      [dannf, bwh: backported to Debian's 2.6.32]
      Signed-off-by: Willy Tarreau <w@1wt.eu>
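The point about argv and module_name not living on the stack can be pictured with a small userspace sketch. It is not the kernel code: `sketch_build_argv()` and `sketch_free_argv()` are hypothetical names, `malloc()`/`free()` stand in for `kmalloc()`/`kfree()`, and the argv layout `{ path, "-q", "--", name, NULL }` is an assumption that the commit message itself does not state.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Build a heap-allocated argv of the (assumed) form
 * { path, "-q", "--", name, NULL }.  The module name is copied, so the
 * result no longer depends on the caller's stack frame.
 * Returns NULL on allocation failure.
 */
char **sketch_build_argv(const char *modprobe_path, const char *module_name)
{
    assert(modprobe_path != NULL && module_name != NULL);

    char **argv = malloc(5 * sizeof(*argv));
    if (argv == NULL)
        return NULL;

    size_t len = strlen(module_name) + 1;
    char *name = malloc(len);
    if (name == NULL) {
        free(argv);
        return NULL;
    }
    memcpy(name, module_name, len);

    argv[0] = (char *)modprobe_path; /* assumed to outlive the helper */
    argv[1] = "-q";
    argv[2] = "--";
    argv[3] = name;
    argv[4] = NULL;
    return argv;
}

/* Cleanup hook: plays the role the commit message assigns to ->cleanup. */
void sketch_free_argv(char **argv)
{
    if (argv == NULL)
        return;
    free(argv[3]); /* the copied module name */
    free(argv);
}
```

Because the memory is owned by the request rather than by the caller's frame, a killed caller can return early while the helper still reads argv, and whichever side finishes last runs the cleanup.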
    • kmod: introduce call_modprobe() helper · d5734d32
      Oleg Nesterov authored
      commit 3e63a93b upstream
      
      No functional changes.  Move the call_usermodehelper code from
      __request_module() into the new simple helper, call_modprobe().
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      [dannf: backported to Debian's 2.6.32]
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • usermodehelper: ____call_usermodehelper() doesn't need do_exit() · 67fd05cc
      Oleg Nesterov authored
      commit 5b9bd473 upstream
      
      Minor cleanup.  ____call_usermodehelper() can simply return, no need to
      call do_exit() explicitly.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      [dannf: adjusted to apply to Debian's 2.6.32]
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • usermodehelper: implement UMH_KILLABLE · e2a28e9a
      Oleg Nesterov authored
      commit d0bd587a upstream
      
      Implement UMH_KILLABLE, which should be used along with
      UMH_WAIT_EXEC/PROC.  The caller must ensure that
      subprocess_info->path/etc can not go away until
      call_usermodehelper_freeinfo().
      
      call_usermodehelper_exec(UMH_KILLABLE) does
      wait_for_completion_killable.  If it fails, it uses
      xchg(&sub_info->complete, NULL) to serialize with umh_complete() which
      does the same xchg() to access sub_info->complete.
      
      If call_usermodehelper_exec wins, it can safely return.  umh_complete()
      should get NULL and call call_usermodehelper_freeinfo().
      
      Otherwise we know that umh_complete() was already called, in this case
      call_usermodehelper_exec() falls back to wait_for_completion() which
      should succeed "very soon".
      
      Note: UMH_NO_WAIT == -1 but it obviously should not be used with
      UMH_KILLABLE.  We delay the necessary cleanup to simplify the
      backporting.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      [dannf: backported to Debian's 2.6.32]
      Signed-off-by: Willy Tarreau <w@1wt.eu>
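The xchg() handoff described in this commit can be modelled in userspace with C11 atomics. This is only a sketch under stated assumptions: `struct umh_sketch`, `sketch_helper_finish()` and `sketch_waiter_abandon()` are hypothetical names, an `int` flag stands in for the kernel completion, and a `freed` field merely records which side would call call_usermodehelper_freeinfo().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct subprocess_info; not the kernel type. */
struct umh_sketch {
    _Atomic(int *) complete; /* stands in for sub_info->complete */
    bool freed;              /* set when the helper side owns the cleanup */
};

void sketch_init(struct umh_sketch *info, int *done)
{
    assert(info != NULL && done != NULL);
    atomic_init(&info->complete, done);
    info->freed = false;
}

/* Helper side: the role the commit message gives to umh_complete(). */
void sketch_helper_finish(struct umh_sketch *info)
{
    int *comp = atomic_exchange(&info->complete, (int *)NULL);

    if (comp != NULL)
        *comp = 1;          /* waiter is still there: signal it */
    else
        info->freed = true; /* waiter already left: helper frees */
}

/*
 * Waiter side, after the killable wait failed.  Returns true if the
 * waiter won the exchange and may return at once; false means the helper
 * got there first, so the waiter must fall back to a plain wait.
 */
bool sketch_waiter_abandon(struct umh_sketch *info)
{
    return atomic_exchange(&info->complete, (int *)NULL) != NULL;
}
```

Exactly one side sees the non-NULL pointer, so the completion is never signalled after the waiter has gone and the cleanup happens once, whichever order the two exchanges run in.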
    • usermodehelper: introduce umh_complete(sub_info) · 13db7353
      Oleg Nesterov authored
      commit b3449922 upstream
      
      Preparation.  Add the new trivial helper, umh_complete().  Currently it
      simply does complete(sub_info->complete).
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      [dannf: Adjusted to apply to Debian's 2.6.32]
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • gen_init_cpio: avoid stack overflow when expanding · dbd3462b
      Kees Cook authored
      commit 20f1de65 upstream.
      
      Fix possible overflow of the buffer used for expanding environment
      variables when building file list.
      
      In the extremely unlikely case of an attacker having control over the
      environment variables visible to gen_init_cpio, control over the
      contents of the file gen_init_cpio parses, and gen_init_cpio was built
      without compiler hardening, the attacker can gain arbitrary execution
      control via a stack buffer overflow.
      
        $ cat usr/crash.list
        file foo ${BIG}${BIG}${BIG}${BIG}${BIG}${BIG} 0755 0 0
        $ BIG=$(perl -e 'print "A" x 4096;') ./usr/gen_init_cpio usr/crash.list
        *** buffer overflow detected ***: ./usr/gen_init_cpio terminated
      
      This also replaces the space-indenting with tabs.
      
      Patch based on existing fix extracted from grsecurity.
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Cc: Michal Marek <mmarek@suse.cz>
      Cc: Brad Spengler <spender@grsecurity.net>
      Cc: PaX Team <pageexec@freemail.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
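The class of bug fixed here, expanding `${VAR}` references into a fixed-size buffer, can be illustrated with a bounds-checked expander. This is a sketch, not the gen_init_cpio code: `sketch_expand_env()` is a hypothetical function, the 64-byte limit on variable names is arbitrary, and treating an unset variable as empty is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Expand ${NAME} references from src into dst, which holds dst_size bytes.
 * Returns 0 on success, or -1 if the input is malformed or the expanded
 * text (including the terminating NUL) would not fit.
 */
int sketch_expand_env(const char *src, char *dst, size_t dst_size)
{
    size_t used = 0;

    assert(src != NULL && dst != NULL);
    if (dst_size == 0)
        return -1;

    while (*src != '\0') {
        const char *chunk;
        size_t len;
        char name[64];

        if (src[0] == '$' && src[1] == '{') {
            const char *end = strchr(src + 2, '}');
            if (end == NULL)
                return -1; /* unterminated reference */

            size_t nlen = (size_t)(end - (src + 2));
            if (nlen >= sizeof(name))
                return -1; /* variable name too long for this sketch */
            memcpy(name, src + 2, nlen);
            name[nlen] = '\0';

            chunk = getenv(name);
            if (chunk == NULL)
                chunk = ""; /* assumption: unset expands to nothing */
            len = strlen(chunk);
            src = end + 1;
        } else {
            chunk = src; /* copy one literal character */
            len = 1;
            src++;
        }

        /* Refuse the copy unless the chunk and a final NUL both fit. */
        if (len >= dst_size - used)
            return -1;
        memcpy(dst + used, chunk, len);
        used += len;
    }

    dst[used] = '\0';
    return 0;
}
```

The check before each `memcpy()` keeps `used` strictly below `dst_size`, so input like the `${BIG}${BIG}...` line in the reproducer is rejected instead of running past the end of the buffer.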
    • kbuild: Fix gcc -x syntax · 56fb3d90
      Jean Delvare authored
      This is upstream commit b1e0d8b7
      backported to the 2.6.32.x stable branch.
      
      The correct syntax for gcc -x is "gcc -x assembler", not
      "gcc -xassembler". Even though the latter happens to work, the former
      is what is documented in the manual page and thus what gcc wrappers
      such as icecream do expect.
      
      This isn't a cosmetic change. The missing space prevents icecream from
      recognizing compilation tasks it can't handle, leading to silent kernel
      miscompilations.
      
      Besides me, credits go to Michael Matz and Dirk Mueller for
      investigating the miscompilation issue and tracking it down to this
      incorrect -x parameter syntax.
      Signed-off-by: Jean Delvare <jdelvare@suse.de>
      Cc: stable@vger.kernel.org
      Cc: Bernhard Walle <bernhard@bwalle.de>
      Cc: Michal Marek <mmarek@suse.cz>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • tick: Cleanup NOHZ per cpu data on cpu down · d31e3b9e
      Thomas Gleixner authored
      commit 4b0c0f29 upstream.
      
      Prarit reported a crash on CPU offline/online. The reason is that on
      CPU down the NOHZ related per cpu data of the dead cpu is not cleaned
      up. If at cpu online an interrupt happens before the per cpu tick
      device is registered the irq_enter() check potentially sees stale data
      and dereferences a NULL pointer.
      
      Cleanup the data after the cpu is dead.
      Reported-by: Prarit Bhargava <prarit@redhat.com>
      Cc: Mike Galbraith <bitbucket@online.de>
      Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1305031451561.2886@ionos
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • timer: Don't reinitialize the cpu base lock during CPU_UP_PREPARE · 34d91822
      Tirupathi Reddy authored
      commit 42a5cf46 upstream.
      
      An inactive timer's base can refer to an offline cpu's base.
      
      In the current code, cpu_base's lock is blindly reinitialized each
      time a CPU is brought up. If a CPU is brought online while another
      thread is trying to modify an inactive timer on that CPU and is
      holding its timer base lock, then the lock will be reinitialized
      under its feet. This leads to the following SPIN_BUG().
      
      <0> BUG: spinlock already unlocked on CPU#3, kworker/u:3/1466
      <0> lock: 0xe3ebe000, .magic: dead4ead, .owner: kworker/u:3/1466, .owner_cpu: 1
      <4> [<c0013dc4>] (unwind_backtrace+0x0/0x11c) from [<c026e794>] (do_raw_spin_unlock+0x40/0xcc)
      <4> [<c026e794>] (do_raw_spin_unlock+0x40/0xcc) from [<c076c160>] (_raw_spin_unlock+0x8/0x30)
      <4> [<c076c160>] (_raw_spin_unlock+0x8/0x30) from [<c009b858>] (mod_timer+0x294/0x310)
      <4> [<c009b858>] (mod_timer+0x294/0x310) from [<c00a5e04>] (queue_delayed_work_on+0x104/0x120)
      <4> [<c00a5e04>] (queue_delayed_work_on+0x104/0x120) from [<c04eae00>] (sdhci_msm_bus_voting+0x88/0x9c)
      <4> [<c04eae00>] (sdhci_msm_bus_voting+0x88/0x9c) from [<c04d8780>] (sdhci_disable+0x40/0x48)
      <4> [<c04d8780>] (sdhci_disable+0x40/0x48) from [<c04bf300>] (mmc_release_host+0x4c/0xb0)
      <4> [<c04bf300>] (mmc_release_host+0x4c/0xb0) from [<c04c7aac>] (mmc_sd_detect+0x90/0xfc)
      <4> [<c04c7aac>] (mmc_sd_detect+0x90/0xfc) from [<c04c2504>] (mmc_rescan+0x7c/0x2c4)
      <4> [<c04c2504>] (mmc_rescan+0x7c/0x2c4) from [<c00a6a7c>] (process_one_work+0x27c/0x484)
      <4> [<c00a6a7c>] (process_one_work+0x27c/0x484) from [<c00a6e94>] (worker_thread+0x210/0x3b0)
      <4> [<c00a6e94>] (worker_thread+0x210/0x3b0) from [<c00aad9c>] (kthread+0x80/0x8c)
      <4> [<c00aad9c>] (kthread+0x80/0x8c) from [<c000ea80>] (kernel_thread_exit+0x0/0x8)
      
      As an example, this particular crash occurred when CPU #3 was executing
      mod_timer() on an inactive timer whose base referred to the offlined CPU
      #2.  The code locked the timer_base corresponding to CPU #2. Before it
      could proceed, CPU #2 came online and reinitialized the spinlock
      corresponding to its base. Thus now CPU #3 held a lock which was
      reinitialized. When CPU #3 finally ended up unlocking the old cpu_base
      corresponding to CPU #2, we hit the above SPIN_BUG().
      
      CPU #0		CPU #3				       CPU #2
      ------		-------				       -------
      .....		 ......				      <Offline>
      		mod_timer()
      		 lock_timer_base
      		   spin_lock_irqsave(&base->lock)
      
      cpu_up(2)	 .....				        ......
      							init_timers_cpu()
      ....		 .....				    	spin_lock_init(&base->lock)
      .....		   spin_unlock_irqrestore(&base->lock)  ......
      		   <spin_bug>
      
      Allocation of per_cpu timer vector bases is done only once, under the
      "tvec_base_done[]" check. In the current code, the spinlock
      initialization of base->lock isn't under this check, so the base lock
      is reinitialized each time a CPU is brought up. Move the base spinlock
      initialization under the check.
      Signed-off-by: Tirupathi Reddy <tirupath@codeaurora.org>
      Link: http://lkml.kernel.org/r/1368520142-4136-1-git-send-email-tirupath@codeaurora.org
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
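The shape of this fix, initializing the lock only on the first bring-up of a CPU, can be sketched in plain C. Everything here is hypothetical and simplified: `struct base_sketch`, `sketch_bases`, `sketch_base_done` and `sketch_cpu_up_prepare()` are illustrative names, and a boolean stands in for the spinlock state.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NR_CPUS 4

/* Hypothetical stand-in for a per-CPU timer base. */
struct base_sketch {
    bool locked;    /* stands in for the state of base->lock */
    int lock_inits; /* how many times the lock was (re)initialized */
};

struct base_sketch sketch_bases[SKETCH_NR_CPUS];
bool sketch_base_done[SKETCH_NR_CPUS]; /* role of tvec_base_done[] */

/* Called every time a CPU is brought up. */
void sketch_cpu_up_prepare(int cpu)
{
    assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);

    if (!sketch_base_done[cpu]) {
        /* One-time setup: the lock is initialized only here. */
        sketch_bases[cpu].locked = false;
        sketch_bases[cpu].lock_inits++;
        sketch_base_done[cpu] = true;
    }

    /* Per-bring-up setup that does not touch the lock would go here. */
}
```

With the initialization under the once-only check, a lock that another CPU holds across the offline/online cycle keeps its state, which is the situation the diagram above walks through.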
    • posix-cpu-timers: Fix nanosleep task_struct leak · 820e8b30
      Stanislaw Gruszka authored
      commit e6c42c29 upstream.
      
      The trinity fuzzer triggered a task_struct reference leak via
      clock_nanosleep with CPU_TIMERs. do_cpu_nanosleep() calls
      posix_cpu_timer_create(), but misses a corresponding
      posix_cpu_timer_del() which leads to the task_struct reference leak.
      Reported-and-tested-by: Tommi Rantala <tt.rantala@gmail.com>
      Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Link: http://lkml.kernel.org/r/20130215100810.GF4392@redhat.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • clockevents: Don't allow dummy broadcast timers · 30145daa
      Mark Rutland authored
      commit a7dc19b8 upstream.
      
      Currently tick_check_broadcast_device doesn't reject clock_event_devices
      with CLOCK_EVT_FEAT_DUMMY, and may select them in preference to real
      hardware if they have a higher rating value. In this situation, the
      dummy timer is responsible for broadcasting to itself, and the core
      clockevents code may attempt to call non-existent callbacks for
      programming the dummy, eventually leading to a panic.
      
      This patch makes tick_check_broadcast_device always reject dummy timers,
      preventing this problem.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: Jon Medhurst (Tixy) <tixy@linaro.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
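The selection rule this commit describes can be sketched as a small predicate. The names are hypothetical (`struct clockdev_sketch`, `SKETCH_FEAT_DUMMY`, `sketch_accept_broadcast()`), the flag value is arbitrary, and the rating comparison is an assumption about how candidates are otherwise ranked. Only the dummy rejection comes from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_FEAT_DUMMY 0x1u /* hypothetical flag value */

/* Hypothetical, much-reduced stand-in for a clock event device. */
struct clockdev_sketch {
    unsigned int features;
    int rating;
};

/*
 * Decide whether cand should become the broadcast device.
 * cur is the current broadcast device, or NULL if there is none.
 */
bool sketch_accept_broadcast(const struct clockdev_sketch *cur,
                             const struct clockdev_sketch *cand)
{
    assert(cand != NULL);

    /* The fix: a dummy device is never eligible, whatever its rating. */
    if (cand->features & SKETCH_FEAT_DUMMY)
        return false;

    /* Assumed ranking rule: keep the current device unless cand rates higher. */
    if (cur != NULL && cur->rating >= cand->rating)
        return false;

    return true;
}
```

Checking the feature flag before the rating is what stops a highly rated dummy from displacing real hardware.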
    • 2.6.32.y: timekeeping: Fix nohz issue with commit 61b76840 · d556d326
      John Stultz authored
      Commit 61b76840 ("time: Avoid
      making adjustments if we haven't accumulated anything")
      introduced a regression with nohz.
      
      Basically with kernels from 2.6.20-something to 2.6.32,
      we accumulate time in half second chunks, rather than every
      timer-tick. This was added because when NOHZ landed, if you
      were idle for a few seconds, you had to spin for every tick
      we skipped in the accumulation loop, which created some bad
      latencies.
      
      However, this required that we create the xtime_cache() which
      was still updated each tick, so that filesystem timestamps,
      etc continued to see time increment normally.
      
      Of course, the xtime_cache is updated at the bottom of
      update_wall_time(). So the early return on
      (offset < timekeeper.cycle_interval), added by the problematic
      commit, causes the xtime_cache to not be updated.
      
      This can cause code using current_kernel_time() (like the mqueue
      code) or hrtimer_get_softirq_time(), which uses the non-updated
      xtime_cache, to see timers fire with very coarse half-second
      granularity.
      
      Many thanks to Romain for describing the issue clearly,
      providing a test case to reproduce it and helping with testing
      the solution.
      
      This change is for 2.6.32-stable ONLY!
      
      Cc: stable@vger.kernel.org
      Cc: Willy Tarreau <w@1wt.eu>
      Cc: Romain Francoise <romain@orebokech.com>
      Reported-by: Romain Francoise <romain@orebokech.com>
      Tested-by: Romain Francoise <romain@orebokech.com>
      Signed-off-by: John Stultz <john.stultz@linaro.org>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • Revert "block: improve queue_should_plug() by looking at IO depths" · ec2826bc
      Jens Axboe authored
      This reverts commit fb1e7538.
      
      "Benjamin S." <sbenni@gmx.de> reports that the patch in question
      causes a big drop in sequential throughput for him, dropping from
      200MB/sec down to only 70MB/sec.
      
      This needs to be investigated more fully; for now let's just revert the
      offending commit.
      
      Conflicts:
      
      	include/linux/blkdev.h
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      (cherry picked from commit 79da0644)
      Cc: Thomas Bork <tom@eisfair.net>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • Revert "pcdp: use early_ioremap/early_iounmap to access pcdp table" · 01ab25d5
      Ben Hutchings authored
      This reverts commit 2af3af56, which was
      commit 6c4088ac upstream.
      
      This broke compilation of the driver in 2.6.32.y as the
      early_io{remap,unmap}() functions are not defined for ia64.  The driver
      can *only* be built for ia64 (even in current mainline), so a fix for
      x86_64 is pointless.
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
  2. 07 Oct, 2012 26 commits