1. 15 May, 2013 4 commits
    • Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · cc51bf6e
      Linus Torvalds authored
      Pull timer fixes from Thomas Gleixner:
      
       - Cure for not using zalloc in the first place, which leads to random
         crashes with CPUMASK_OFF_STACK.
      
       - Revert a user space visible change which broke udev
      
       - Add a missing cpu_online early return introduced by the new full
         dyntick conversions
      
       - Plug a long standing race in the timer wheel cpu hotplug code.
         Sigh...
      
       - Cleanup NOHZ per cpu data on cpu down to prevent stale data on cpu
         up.
      
      * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        time: Revert ALWAYS_USE_PERSISTENT_CLOCK compile time optimizaitons
        timer: Don't reinitialize the cpu base lock during CPU_UP_PREPARE
        tick: Don't invoke tick_nohz_stop_sched_tick() if the cpu is offline
        tick: Cleanup NOHZ per cpu data on cpu down
        tick: Use zalloc_cpumask_var for allocating offstack cpumasks
    • Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 37cae5e2
      Linus Torvalds authored
      Pull core fixes from Thomas Gleixner:
      
       - Two fixlets for the fallout of the generic idle task conversion
      
       - Documentation update
      
      * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        rcu/idle: Wrap cpu-idle poll mode within rcu_idle_enter/exit
        idle: Fix hlt/nohlt command-line handling in new generic idle
        kthread: Document ways of reducing OS jitter due to per-CPU kthreads
    • Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-arm · d21572c5
      Linus Torvalds authored
      Pull ARM fixes from Russell King:
       "A small number of fixes for stuff from the last merge window, and in
        one case (IRQ time accounting) the previous merge window."
      
      * 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
        ARM: 7720/1: ARM v6/v7 cmpxchg64 shouldn't clear upper 32 bits of the old/new value
        ARM: 7715/1: MCPM: adapt to GIC changes after upstream merge
        ARM: 7714/1: mmc: mmci: Ensure return value of regulator_enable() is checked
        ARM: 7712/1: Remove trailing whitespace in arch/arm/Makefile
        ARM: 7711/1: dove: fix Dove cpu type from V7 to PJ4
        ARM: finally enable IRQ time accounting config
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client · 109c3c02
      Linus Torvalds authored
      Pull Ceph fixes from Sage Weil:
       "Yes, this is a much larger pull than I would like after -rc1.  There
        are a few things included:
      
         - a few fixes for leaks and incorrect assertions
         - a few patches fixing behavior when mapped images are resized
         - handling for cloned/layered images that are flattened out from
           underneath the client
      
        The last bit was non-trivial, and there is some code movement and
        associated cleanup mixed in.  This was ready and was meant to go in
        last week but I missed the boat on Friday.  My only excuse is that I
        was waiting for an all clear from the testing and there were many
        other shiny things to distract me.
      
        Strictly speaking, handling the flatten case isn't a regression and
        could wait, so if you like we can try to pull the series apart, but
        Alex and I would much prefer to have it all in as it is a case real
        users will hit with 3.10."
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (33 commits)
        rbd: re-submit flattened write request (part 2)
        rbd: re-submit write request for flattened clone
        rbd: re-submit read request for flattened clone
        rbd: detect when clone image is flattened
        rbd: reference count parent requests
        rbd: define parent image request routines
        rbd: define rbd_dev_unparent()
        rbd: don't release write request until necessary
        rbd: get parent info on refresh
        rbd: ignore zero-overlap parent
        rbd: support reading parent page data for writes
        rbd: fix parent request size assumption
        libceph: init sent and completed when starting
        rbd: kill rbd_img_request_get()
        rbd: only set up watch for mapped images
        rbd: set mapping read-only flag in rbd_add()
        rbd: support reading parent page data
        rbd: fix an incorrect assertion condition
        rbd: define rbd_dev_v2_header_info()
        rbd: get rid of trivial v1 header wrappers
        ...
  2. 14 May, 2013 29 commits
    • time: Revert ALWAYS_USE_PERSISTENT_CLOCK compile time optimizaitons · b4f711ee
      John Stultz authored
      Kay Sievers noted that the ALWAYS_USE_PERSISTENT_CLOCK config,
      which enables some minor compile time optimization to avoid
      unnecessary code, mostly in the suspend/resume path, could cause
      problems for userland.
      
      In particular, the dependency for RTC_HCTOSYS on
      !ALWAYS_USE_PERSISTENT_CLOCK, which avoids setting the time
      twice and simplifies suspend/resume, has the side effect
      of causing the /sys/class/rtc/rtcN/hctosys flag to always be
      zero, and this flag is commonly used by udev to set up the
      /dev/rtc symlink to /dev/rtcN, which can cause pain for
      older applications.
      
      While the udev rules could use some work to be less fragile,
      breaking userland should strongly be avoided. Additionally
      the compile time optimizations are fairly minor, and the code
      being optimized is likely to be reworked in the future, so
      let's revert this change.
      Reported-by: Kay Sievers <kay@vrfy.org>
      Signed-off-by: John Stultz <john.stultz@linaro.org>
      Cc: stable <stable@vger.kernel.org> #3.9
      Cc: Feng Tang <feng.tang@intel.com>
      Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
      Link: http://lkml.kernel.org/r/1366828376-18124-1-git-send-email-john.stultz@linaro.org
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · b973425c
      Linus Torvalds authored
      Pull ext4 update from Ted Ts'o:
       "Fixed regressions (two stability regressions and a performance
        regression) introduced during the 3.10-rc1 merge window.
      
        Also included is a bug fix relating to allocating blocks after
        resizing an ext3 file system when using the ext4 file system driver"
      
      * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
        jbd,jbd2: fix oops in jbd2_journal_put_journal_head()
        ext4: revert "ext4: use io_end for multiple bios"
        ext4: limit group search loop for non-extent files
        ext4: fix fio regression
    • Merge branch 'for-3.10-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq · 7fb30d2b
      Linus Torvalds authored
      Pull workqueue fix from Tejun Heo:
       "A fix for a workqueue_congested() regression that broke fscache"
      
      * 'for-3.10-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
        workqueue: workqueue_congested() shouldn't translate WORK_CPU_UNBOUND into node number
    • timer: Don't reinitialize the cpu base lock during CPU_UP_PREPARE · 42a5cf46
      Tirupathi Reddy authored
      An inactive timer's base can refer to an offline cpu's base.
      
      In the current code, cpu_base's lock is blindly reinitialized each
      time a CPU is brought up. If a CPU is brought online while another
      thread is trying to modify an inactive timer on that CPU and is
      holding its timer base lock, then the lock will be reinitialized
      under its feet. This leads to the following SPIN_BUG().
      
      <0> BUG: spinlock already unlocked on CPU#3, kworker/u:3/1466
      <0> lock: 0xe3ebe000, .magic: dead4ead, .owner: kworker/u:3/1466, .owner_cpu: 1
      <4> [<c0013dc4>] (unwind_backtrace+0x0/0x11c) from [<c026e794>] (do_raw_spin_unlock+0x40/0xcc)
      <4> [<c026e794>] (do_raw_spin_unlock+0x40/0xcc) from [<c076c160>] (_raw_spin_unlock+0x8/0x30)
      <4> [<c076c160>] (_raw_spin_unlock+0x8/0x30) from [<c009b858>] (mod_timer+0x294/0x310)
      <4> [<c009b858>] (mod_timer+0x294/0x310) from [<c00a5e04>] (queue_delayed_work_on+0x104/0x120)
      <4> [<c00a5e04>] (queue_delayed_work_on+0x104/0x120) from [<c04eae00>] (sdhci_msm_bus_voting+0x88/0x9c)
      <4> [<c04eae00>] (sdhci_msm_bus_voting+0x88/0x9c) from [<c04d8780>] (sdhci_disable+0x40/0x48)
      <4> [<c04d8780>] (sdhci_disable+0x40/0x48) from [<c04bf300>] (mmc_release_host+0x4c/0xb0)
      <4> [<c04bf300>] (mmc_release_host+0x4c/0xb0) from [<c04c7aac>] (mmc_sd_detect+0x90/0xfc)
      <4> [<c04c7aac>] (mmc_sd_detect+0x90/0xfc) from [<c04c2504>] (mmc_rescan+0x7c/0x2c4)
      <4> [<c04c2504>] (mmc_rescan+0x7c/0x2c4) from [<c00a6a7c>] (process_one_work+0x27c/0x484)
      <4> [<c00a6a7c>] (process_one_work+0x27c/0x484) from [<c00a6e94>] (worker_thread+0x210/0x3b0)
      <4> [<c00a6e94>] (worker_thread+0x210/0x3b0) from [<c00aad9c>] (kthread+0x80/0x8c)
      <4> [<c00aad9c>] (kthread+0x80/0x8c) from [<c000ea80>] (kernel_thread_exit+0x0/0x8)
      
      As an example, this particular crash occurred when CPU #3 was executing
      mod_timer() on an inactive timer whose base referred to offlined CPU
      #2.  The code locked the timer_base corresponding to CPU #2. Before it
      could proceed, CPU #2 came online and reinitialized the spinlock
      corresponding to its base. Thus CPU #3 now held a lock which had been
      reinitialized. When CPU #3 finally ended up unlocking the old cpu_base
      corresponding to CPU #2, we hit the above SPIN_BUG().
      
      CPU #0		CPU #3				       CPU #2
      ------		-------				       -------
      .....		 ......				      <Offline>
      		mod_timer()
      		 lock_timer_base
      		   spin_lock_irqsave(&base->lock)
      
      cpu_up(2)	 .....				        ......
      							init_timers_cpu()
      ....		 .....				    	spin_lock_init(&base->lock)
      .....		   spin_unlock_irqrestore(&base->lock)  ......
      		   <spin_bug>
      
      Allocation of per_cpu timer vector bases is done only once under
      "tvec_base_done[]" check. In the current code, spinlock initialization
      of base->lock isn't under this check, so the base lock is reinitialized
      each time a CPU is brought up. Move base spinlock initialization under
      the check.
      Signed-off-by: Tirupathi Reddy <tirupath@codeaurora.org>
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/1368520142-4136-1-git-send-email-tirupath@codeaurora.org
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
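A minimal user-space sketch of the fix described in the timer commit above: the lock is initialized only inside the one-time setup guard, so bringing a CPU up again cannot reset a lock that another CPU currently holds. All names (sketch_base, sketch_base_done, sketch_init_timers_cpu) are hypothetical stand-ins, not the kernel's code, and the lock is modeled as a plain flag rather than a real spinlock.

```c
#include <stdbool.h>

#define SKETCH_NR_CPUS 4

/* Hypothetical stand-in for a spinlock: 1 means held, 0 means free. */
struct sketch_lock {
    int locked;
};

/* Hypothetical stand-in for the per-CPU timer base. */
struct sketch_base {
    struct sketch_lock lock;
};

static struct sketch_base sketch_bases[SKETCH_NR_CPUS];
static bool sketch_base_done[SKETCH_NR_CPUS];

static void sketch_lock_init(struct sketch_lock *l)
{
    l->locked = 0;
}

/* Called every time a CPU is brought up. */
static void sketch_init_timers_cpu(int cpu)
{
    struct sketch_base *base = &sketch_bases[cpu];

    if (!sketch_base_done[cpu]) {
        /*
         * One-time setup. The lock is initialized here and never
         * again, so a holder on another CPU keeps a valid lock.
         */
        sketch_lock_init(&base->lock);
        sketch_base_done[cpu] = true;
    }
    /* State that is safe to reset on every bring-up would go here. */
}
```

In this model, a lock taken between two bring-ups of the same CPU is still held after the second call, which is the property the commit restores.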
    • rcu/idle: Wrap cpu-idle poll mode within rcu_idle_enter/exit · b47430d3
      Srivatsa S. Bhat authored
      Bjørn Mork reported the following warning when running powertop.
      
      [   49.289034] ------------[ cut here ]------------
      [   49.289055] WARNING: at kernel/rcutree.c:502 rcu_eqs_exit_common.isra.48+0x3d/0x125()
      [   49.289244] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.10.0-bisect-rcu-warn+ #107
      [   49.289251]  ffffffff8157d8c8 ffffffff81801e28 ffffffff8137e4e3 ffffffff81801e68
      [   49.289260]  ffffffff8103094f ffffffff81801e68 0000000000000000 ffff88023afcd9b0
      [   49.289268]  0000000000000000 0140000000000000 ffff88023bee7700 ffffffff81801e78
      [   49.289276] Call Trace:
      [   49.289285]  [<ffffffff8137e4e3>] dump_stack+0x19/0x1b
      [   49.289293]  [<ffffffff8103094f>] warn_slowpath_common+0x62/0x7b
      [   49.289300]  [<ffffffff8103097d>] warn_slowpath_null+0x15/0x17
      [   49.289306]  [<ffffffff810a9006>] rcu_eqs_exit_common.isra.48+0x3d/0x125
      [   49.289314]  [<ffffffff81079b49>] ? trace_hardirqs_off_caller+0x37/0xa6
      [   49.289320]  [<ffffffff810a9692>] rcu_idle_exit+0x85/0xa8
      [   49.289327]  [<ffffffff8107076e>] trace_cpu_idle_rcuidle+0xae/0xff
      [   49.289334]  [<ffffffff810708b1>] cpu_startup_entry+0x72/0x115
      [   49.289341]  [<ffffffff813689e5>] rest_init+0x149/0x150
      [   49.289347]  [<ffffffff8136889c>] ? csum_partial_copy_generic+0x16c/0x16c
      [   49.289355]  [<ffffffff81a82d34>] start_kernel+0x3f0/0x3fd
      [   49.289362]  [<ffffffff81a8274c>] ? repair_env_string+0x5a/0x5a
      [   49.289368]  [<ffffffff81a82481>] x86_64_start_reservations+0x2a/0x2c
      [   49.289375]  [<ffffffff81a82550>] x86_64_start_kernel+0xcd/0xd1
      [   49.289379] ---[ end trace 07a1cc95e29e9036 ]---
      
      The warning is that 'rdtp->dynticks' has an unexpected value, which roughly
      means that the calls to rcu_idle_enter() and rcu_idle_exit() were not
      made in the correct order, or were otherwise messed up.
      
      And Bjørn's painstaking debugging indicated that this happens when the idle
      loop enters the poll mode. Looking at the poll function cpu_idle_poll(), and
      the implementation of trace_cpu_idle_rcuidle(), the problem becomes very clear:
      cpu_idle_poll() lacks calls to rcu_idle_enter/exit(), and trace_cpu_idle_rcuidle()
      calls them in the reverse order - first rcu_idle_exit(), and then rcu_idle_enter().
      Hence the even/odd alternation of rdtp->dynticks is thrown off.
      
      And powertop readily triggers this because powertop uses the idle-tracing
      infrastructure extensively.
      
      So, to fix this, wrap the code in cpu_idle_poll() within rcu_idle_enter/exit(),
      so that it blends properly with the calls inside trace_cpu_idle_rcuidle() and
      thus gets the function ordering right.
      Reported-and-tested-by: Bjørn Mork <bjorn@mork.no>
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Dipankar Sarma <dipankar@in.ibm.com>
      Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Link: http://lkml.kernel.org/r/519169BF.4080208@linux.vnet.ibm.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
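The ordering problem in the rcu/idle commit above can be illustrated with a toy model. This is a simplified sketch, not the kernel's RCU code: it assumes a counter that is odd while the CPU is non-idle and even while it is idle, and it counts how often that expectation is violated. The sketch_* names are hypothetical.

```c
struct sketch_dynticks {
    unsigned int counter;     /* odd: non-idle, even: idle (assumed) */
    unsigned int violations;  /* how often the parity check failed */
};

static void sketch_idle_enter(struct sketch_dynticks *d)
{
    d->counter++;
    if (d->counter & 1)       /* must be even once idle */
        d->violations++;
}

static void sketch_idle_exit(struct sketch_dynticks *d)
{
    d->counter++;
    if (!(d->counter & 1))    /* must be odd once non-idle */
        d->violations++;
}

/* Models a tracepoint helper that leaves idle, traces, re-enters idle. */
static void sketch_trace_rcuidle(struct sketch_dynticks *d)
{
    sketch_idle_exit(d);
    /* tracing would run here */
    sketch_idle_enter(d);
}

/* Poll loop without the enter/exit wrapper (the reported bug). */
static unsigned int sketch_poll_unwrapped(void)
{
    struct sketch_dynticks d = { .counter = 1, .violations = 0 };

    sketch_trace_rcuidle(&d);
    return d.violations;
}

/* Poll loop wrapped in enter/exit (the shape of the fix). */
static unsigned int sketch_poll_wrapped(void)
{
    struct sketch_dynticks d = { .counter = 1, .violations = 0 };

    sketch_idle_enter(&d);
    sketch_trace_rcuidle(&d);
    sketch_idle_exit(&d);
    return d.violations;
}
```

In the unwrapped variant the helper's exit call runs while the model is still in the non-idle state, so the parity check fails; with the wrapper every call sees the parity it expects.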
    • tick: Don't invoke tick_nohz_stop_sched_tick() if the cpu is offline · f7ea0fd6
      Thomas Gleixner authored
      commit 5b39939a (nohz: Move ts->idle_calls incrementation into strict
      idle logic) moved code out of tick_nohz_stop_sched_tick() and failed
      to bail out when the cpu is offline. That causes subsequent
      failures, as an offline CPU is supposed to die and not to fiddle with
      nohz magic.
      
      Return false in can_stop_idle_tick() if the cpu is offline.
      Reported-and-tested-by: Jiri Kosina <jkosina@suse.cz>
      Reported-and-tested-by: Prarit Bhargava <prarit@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: x86@kernel.org
      Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1305132138160.2863@ionos
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
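The early return described in the tick commit above can be sketched as follows. This is an illustrative model only: the online map, the need_resched parameter and the sketch_* names are assumptions, and the real can_stop_idle_tick() performs further checks that are not shown.

```c
#include <stdbool.h>

#define SKETCH_NR_CPUS 4

/* Hypothetical online map: CPU 2 is offline in this example. */
static bool sketch_cpu_online[SKETCH_NR_CPUS] = { true, true, false, true };

static bool sketch_can_stop_idle_tick(int cpu, bool need_resched)
{
    /* An offline CPU is going down: do not touch its nohz state. */
    if (!sketch_cpu_online[cpu])
        return false;

    /* Work is pending, so the tick has to keep running. */
    if (need_resched)
        return false;

    return true;
}
```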
    • Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc · a2c7a54f
      Linus Torvalds authored
      Pull powerpc fixes from Benjamin Herrenschmidt:
       "This is mostly bug fixes (some of them regressions, some of them I
        deemed worth merging now) along with some patches from Li Zhong
        hooking up the new context tracking stuff (for the new full NO_HZ)"
      
      * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (25 commits)
        powerpc: Set show_unhandled_signals to 1 by default
        powerpc/perf: Fix setting of "to" addresses for BHRB
        powerpc/pmu: Fix order of interpreting BHRB target entries
        powerpc/perf: Move BHRB code into CONFIG_PPC64 region
        powerpc: select HAVE_CONTEXT_TRACKING for pSeries
        powerpc: Use the new schedule_user API on userspace preemption
        powerpc: Exit user context on notify resume
        powerpc: Exception hooks for context tracking subsystem
        powerpc: Syscall hooks for context tracking subsystem
        powerpc/booke64: Fix kernel hangs at kernel_dbg_exc
        powerpc: Fix irq_set_affinity() return values
        powerpc: Provide __bswapdi2
        powerpc/powernv: Fix starting of secondary CPUs on OPALv2 and v3
        powerpc/powernv: Detect OPAL v3 API version
        powerpc: Fix MAX_STACK_TRACE_ENTRIES too low warning again
        powerpc: Make CONFIG_RTAS_PROC depend on CONFIG_PROC_FS
        powerpc: Bring all threads online prior to migration/hibernation
        powerpc/rtas_flash: Fix validate_flash buffer overflow issue
        powerpc/kexec: Fix kexec when using VMX optimised memcpy
        powerpc: Fix build errors STRICT_MM_TYPECHECKS
        ...
    • powerpc: Set show_unhandled_signals to 1 by default · e34166ad
      Benjamin Herrenschmidt authored
      Just like other architectures
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/perf: Fix setting of "to" addresses for BHRB · 69123184
      Michael Neuling authored
      Currently we only set the "to" address in the branch stack when the CPU
      explicitly gives us a value.  Unfortunately it only does this for XL form
      branches (eg blr, bctr, bctar) and not I and B form branches (eg b, bc).
      
      Fortunately if we read the instruction from memory we can extract the offset of
      a branch and calculate the target address.
      
      This adds a function power_pmu_bhrb_to() to calculate the target/to address of
      the corresponding I and B form branches.  It handles branches in both user and
      kernel spaces.  It also plumbs this into the perf BHRB reading code.
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
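As a rough illustration of how a branch target can be recovered from the instruction word, the sketch below decodes PowerPC I-form (primary opcode 18) and B-form (primary opcode 16) branches. It is not the kernel's power_pmu_bhrb_to(); the function name and calling convention are hypothetical, and reading the instruction from user or kernel memory is left out.

```c
#include <stdint.h>

/*
 * Returns 1 and stores the branch target in *to when instr is an
 * I-form or B-form branch located at addr; returns 0 otherwise.
 */
static int sketch_branch_target(uint32_t instr, uint64_t addr, uint64_t *to)
{
    uint32_t opcode = instr >> 26;
    int64_t offset;

    if (opcode == 18) {
        /* I-form (b, ba, bl, bla): 24-bit LI field, shifted left by 2. */
        offset = instr & 0x03fffffc;
        if (offset & 0x02000000)      /* sign bit of the 26-bit value */
            offset -= 0x04000000;
    } else if (opcode == 16) {
        /* B-form (bc, bca, bcl, bcla): 14-bit BD field, shifted left by 2. */
        offset = instr & 0x0000fffc;
        if (offset & 0x00008000)      /* sign bit of the 16-bit value */
            offset -= 0x00010000;
    } else {
        return 0;                     /* e.g. XL-form blr, bctr */
    }

    if (instr & 0x2)                  /* AA bit: absolute address */
        *to = (uint64_t)offset;
    else                              /* relative to the branch itself */
        *to = addr + (uint64_t)offset;

    return 1;
}
```

For example, the word 0x48000008 (a relative `b` with offset 8) at address 0x1000 decodes to the target 0x1008, while 0x4e800020 (`blr`) is rejected because its target is not encoded in the instruction.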
    • powerpc/pmu: Fix order of interpreting BHRB target entries · 506e70d1
      Michael Neuling authored
      The current Branch History Rolling Buffer (BHRB) code misinterprets the order
      of entries in the hardware buffer.  It assumes that a branch target address
      will be read _after_ its corresponding branch.  In reality the branch target
      comes before (lower mfbhrb entry) its corresponding branch.
      
      This is a rewrite of the code to take this into account.
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/perf: Move BHRB code into CONFIG_PPC64 region · d52f2dc4
      Michael Neuling authored
      The new Branch History Rolling buffer (BHRB) code is only useful on 64bit
      processors, so move it into the #ifdef CONFIG_PPC64 region.
      
      This avoids code bloat on 32bit systems.
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc: select HAVE_CONTEXT_TRACKING for pSeries · a1797b2f
      Li Zhong authored
      Start context tracking support from pSeries.
      Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc: Use the new schedule_user API on userspace preemption · 5d1c5745
      Li Zhong authored
      This patch corresponds to
      [PATCH] x86: Use the new schedule_user API on userspace preemption
        commit 0430499c
      Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc: Exit user context on notify resume · 106ed886
      Li Zhong authored
      This patch allows RCU usage in do_notify_resume, e.g. signal handling.
      It corresponds to
      [PATCH] x86: Exit RCU extended QS on notify resume
        commit edf55fda
      Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc: Exception hooks for context tracking subsystem · ba12eede
      Li Zhong authored
      This is the exception hooks for context tracking subsystem, including
      data access, program check, single step, instruction breakpoint, machine check,
      alignment, fp unavailable, altivec assist, unknown exception, whose handlers
      might use RCU.
      
      This patch corresponds to
      [PATCH] x86: Exception hooks for userspace RCU extended QS
        commit 6ba3c97a
      
      But after the exception handling moved to generic code, and some changes in
      following two commits:
      56dd9470
        context_tracking: Move exception handling to generic code
      6c1e0256
        context_tracking: Restore correct previous context state on exception exit
      
      exception hooks can use the generic code above instead of a
      redundant arch implementation.
      Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc: Syscall hooks for context tracking subsystem · 22ecbe8d
      Li Zhong authored
      This is the syscall slow path hooks for context tracking subsystem,
      corresponding to
      [PATCH] x86: Syscall hooks for userspace RCU extended QS
        commit bf5a3c13
      
      TIF_MEMDIE is moved to the second 16 bits (with value 17), as it seems there
      is no asm code using it. TIF_NOHZ is added to _TIF_SYSCALL_T_OR_A, so it is
      better for it to be in the same 16 bits as the others in the group, so that
      an andi. against this group works in the asm code.
      Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
      Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
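The 16-bit constraint mentioned in the syscall hooks commit above follows from the andi. instruction, which carries a 16-bit unsigned immediate. A small sketch of the check, with hypothetical names and an illustrative low bit number (only the value 17 for TIF_MEMDIE comes from the commit message):

```c
#include <stdbool.h>

/* Hypothetical flag in the low half, chosen only for illustration. */
#define SKETCH_TIF_LOW_FLAG  9
/* The commit message moves TIF_MEMDIE to bit 17. */
#define SKETCH_TIF_MEMDIE    17

/*
 * A flag mask can be tested with a single andi. only if no bit
 * above bit 15 is set, because the immediate is 16 bits wide.
 */
static bool sketch_mask_fits_andi(unsigned long mask)
{
    return (mask & ~0xffffUL) == 0;
}
```

A group mask that mixes flags from both halves fails this check, which is why every flag tested together in asm needs to sit in the same 16 bits.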
    • powerpc/booke64: Fix kernel hangs at kernel_dbg_exc · 6cecf76b
      Scott Wood authored
      MSR_DE is not cleared on entry to the kernel, and we don't clear it
      explicitly outside of debug code.  If we have MSR_DE set in
      prime_debug_regs(), and the new thread has events enabled in DBCR0
      (e.g.  ICMP is set in thread->dbsr0, even though it was cleared in the
      real DBCR0 when the thread got scheduled out), we'll end up taking a
      debug exception in the kernel when DBCR0 is loaded.  DSRR0 will not
      point to an exception vector, and the kernel ends up hanging at
      kernel_dbg_exc.  Fix this by always clearing MSR_DE when we load new
      debug state.
      
      Another observed source of kernel_dbg_exc hangs is with the branch
      taken event.  If this event is active, but we take a non-debug trap
      (e.g. a TLB miss or an asynchronous interrupt) before the next branch,
      we end up taking a branch-taken debug exception on the initial branch
      instruction of the exception vector, but because the debug exception is
      DBSR_BT rather than DBSR_IC we branch to kernel_dbg_exc before even
      checking the DSRR0 address.  Fix this by checking for DBSR_BT as well
      as DBSR_IC, which is what 32-bit does and what the comments suggest was
      intended in the 64-bit code as well.
      Signed-off-by: Scott Wood <scottwood@freescale.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc: Provide __bswapdi2 · ca9d7aea
      David Woodhouse authored
      Some versions of GCC apparently expect this to be provided by libgcc.
      
      Updates from Mikey fix the 32-bit version and add "r" to registers.
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
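For readers unfamiliar with the helper, __bswapdi2 reverses the byte order of a 64-bit value. The commit adds a powerpc implementation for the kernel; the portable C sketch below only illustrates the operation, and the name sketch_bswapdi2 is hypothetical.

```c
#include <stdint.h>

/* Reverse the byte order of a 64-bit value in three swap steps. */
static uint64_t sketch_bswapdi2(uint64_t x)
{
    /* Swap the two 32-bit halves. */
    x = (x << 32) | (x >> 32);
    /* Swap the 16-bit halves inside each 32-bit word. */
    x = ((x & 0x0000ffff0000ffffULL) << 16) |
        ((x >> 16) & 0x0000ffff0000ffffULL);
    /* Swap the bytes inside each 16-bit halfword. */
    x = ((x & 0x00ff00ff00ff00ffULL) << 8) |
        ((x >> 8) & 0x00ff00ff00ff00ffULL);
    return x;
}
```

With this sketch, 0x0102030405060708 becomes 0x0807060504030201, and applying it twice returns the original value.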
    • powerpc/powernv: Fix starting of secondary CPUs on OPALv2 and v3 · b2b48584
      Benjamin Herrenschmidt authored
      The current code fails to handle kexec on OPALv2. This fixes it
      and adds code to improve the situation on OPALv3 where we can
      query the CPU status from the firmware and decide what to do
      based on that.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/powernv: Detect OPAL v3 API version · 75b93da4
      Benjamin Herrenschmidt authored
      Future firmwares will support that new version
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc: Fix MAX_STACK_TRACE_ENTRIES too low warning again · af945cf4
      Li Zhong authored
      Saw this warning again, and this time from the ret_from_fork path.
      
      It seems we could clear the back chain earlier in copy_thread(), which
      could cover both paths, and also fix potential lockdep usage in
      schedule_tail(), or exceptions that occur before we clear the back chain.
      Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc: Make CONFIG_RTAS_PROC depend on CONFIG_PROC_FS · b80ec3dc
      Michael Ellerman authored
      We are getting build errors with CONFIG_PROC_FS=n:
      
      arch/powerpc/kernel/rtas_flash.c
         In function 'rtas_flash_init':
        745:33: error: unused variable 'f' [-Werror=unused-variable]
      
      But rtas_flash.c should not be built when CONFIG_PROC_FS=n, because all
      it does is provide a /proc interface to the RTAS flash routines.
      
      CONFIG_RTAS_FLASH already depends on CONFIG_RTAS_PROC, to indicate that
      it depends on the RTAS proc support, but CONFIG_RTAS_PROC does not
      depend on CONFIG_PROC_FS. So fix that.
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc: Bring all threads online prior to migration/hibernation · 120496ac
      Robert Jennings authored
      This patch brings online all threads which are present but not online
      prior to migration/hibernation.  After migration/hibernation those
      threads are taken back offline.
      
      During migration/hibernation all online CPUs must call H_JOIN, as
      required by the hypervisor.  Without this patch, threads that are offline
      (H_CEDE'd) will not be woken to make the H_JOIN call and the OS will be
      deadlocked (all threads either JOIN'd or CEDE'd).
      
      Cc: <stable@kernel.org>
      Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
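The bring-online and take-offline sequence from the migration/hibernation commit above can be modeled as below. This is a sketch under stated assumptions: the CPU state is a set of plain arrays, and the sketch_* names are hypothetical rather than the kernel's hotplug API.

```c
#include <stdbool.h>

#define SKETCH_NR_CPUS 8

struct sketch_cpus {
    bool present[SKETCH_NR_CPUS];
    bool online[SKETCH_NR_CPUS];
    bool woken[SKETCH_NR_CPUS];   /* onlined only for the operation */
};

/* Before migration: online every present-but-offline thread. */
static int sketch_online_all_present(struct sketch_cpus *c)
{
    int cpu, count = 0;

    for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
        if (c->present[cpu] && !c->online[cpu]) {
            c->online[cpu] = true;
            c->woken[cpu] = true;  /* remember to undo this later */
            count++;
        }
    }
    return count;
}

/* After migration: take back offline only what was woken above. */
static void sketch_restore_offline(struct sketch_cpus *c)
{
    int cpu;

    for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
        if (c->woken[cpu]) {
            c->online[cpu] = false;
            c->woken[cpu] = false;
        }
    }
}
```

Tracking which threads were woken keeps threads that were already online untouched when the original state is restored.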
    • powerpc/rtas_flash: Fix validate_flash buffer overflow issue · a94a1472
      Vasant Hegde authored
      The ibm,validate-flash-image RTAS call output buffer contains 150 - 200
      bytes of data on the latest systems. Presently the output buffer
      size is 64 bytes and we use sprintf to copy data from the
      RTAS buffer to the local buffer. This causes a kernel oops (see the
      call trace below).
      
      This patch increases local buffer size to 256 and also uses
      snprintf instead of sprintf to copy data from RTAS buffer.
      
      Kernel call trace :
      -------------------
      Oops: Kernel access of bad area, sig: 11 [#1]
      SMP NR_CPUS=1024 NUMA pSeries
      Modules linked in: nfs fscache lockd auth_rpcgss nfs_acl sunrpc fuse loop dm_mod ipv6 ipv6_lib usb_storage ehea(X) sr_mod qlge ses cdrom enclosure st be2net sg ext3 jbd mbcache usbhid hid ohci_hcd ehci_hcd usbcore qla2xxx usb_common sd_mod crc_t10dif scsi_dh_hp_sw scsi_dh_rdac scsi_dh_alua scsi_dh_emc scsi_dh lpfc scsi_transport_fc scsi_tgt ipr(X) libata scsi_mod
      Supported: Yes
      NIP: 4520323031333130 LR: 4520323031333130 CTR: 0000000000000000
      REGS: c0000001b91779b0 TRAP: 0400   Tainted: G            X  (3.0.13-0.27-ppc64)
      MSR: 8000000040009032 <EE,ME,IR,DR>  CR: 44022488  XER: 20000018
      TASK = c0000001bca1aba0[4736] 'cat' THREAD: c0000001b9174000 CPU: 36
      GPR00: 4520323031333130 c0000001b9177c30 c000000000f87c98 000000000000009b
      GPR04: c0000001b9177c4a 000000000000000b 3520323031333130 2032303133313031
      GPR08: 3133313031350a4d 000000000000009b 0000000000000000 c0000000003664a4
      GPR12: 0000000022022448 c000000003ee6c00 0000000000000002 00000000100e8a90
      GPR16: 00000000100cb9d8 0000000010093370 000000001001d310 0000000000000000
      GPR20: 0000000000008000 00000000100fae60 000000000000005e 0000000000000000
      GPR24: 0000000010129350 46573738302e3030 2046573738302e30 300a4d4720323031
      GPR28: 333130313520554e 4b4e4f574e0a4d47 2032303133313031 3520323031333130
      NIP [4520323031333130] 0x4520323031333130
      LR [4520323031333130] 0x4520323031333130
      Call Trace:
      [c0000001b9177c30] [4520323031333130] 0x4520323031333130 (unreliable)
      Instruction dump:
      XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
      XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
      Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
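The sprintf to snprintf change in the rtas_flash commit above can be illustrated in user space. This sketch is not the driver code: the helper name is hypothetical and a plain string stands in for the RTAS output. It shows that snprintf never writes past the destination size and that its return value reveals truncation.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Copy src into dst without writing more than dst_size bytes.
 * Returns 1 if the whole string fit, 0 if it was truncated.
 */
static int sketch_copy_bounded(char *dst, size_t dst_size, const char *src)
{
    /* snprintf returns the length it would have needed. */
    int n = snprintf(dst, dst_size, "%s", src);

    return n >= 0 && (size_t)n < dst_size;
}
```

With an 8-byte destination and a 10-character source, the copy is cut to 7 characters plus the terminator and the helper reports truncation, where sprintf would have overrun the buffer.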
    • powerpc/kexec: Fix kexec when using VMX optimised memcpy · 79c66ce8
      Anton Blanchard authored
      commit b3f271e8 (powerpc: POWER7 optimised memcpy using VMX and
      enhanced prefetch) uses VMX when it is safe to do so (ie not in
      interrupt). It also looks at the task struct to decide if we have to
      save the current task's VMX state.
      
      kexec calls memcpy() at a point where the task struct may have been
      overwritten by the new kexec segments. If it has been overwritten
      then when memcpy -> enable_altivec looks up current->thread.regs->msr
      we get a cryptic oops or lockup.
      
      I also notice we aren't initialising thread_info->cpu, which means
      smp_processor_id is broken. Fix that too.
      Signed-off-by: default avatarAnton Blanchard <anton@samba.org>
      Cc: <stable@vger.kernel.org> # 3.6+
      Signed-off-by: default avatarBenjamin Herrenschmidt <benh@kernel.crashing.org>
      79c66ce8
    • Aneesh Kumar K.V's avatar
      powerpc/mm: Use the correct mask value when looking at pgtable address · 613e60a6
      Aneesh Kumar K.V authored
      Our pgtables are 2*sizeof(pte_t)*PTRS_PER_PTE, which is PTE_FRAG_SIZE.
      Instead of depending on frag size, mask with PMD_MASKED_BITS.
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      613e60a6
    • Linus Torvalds's avatar
      Merge tag 'fixes-for-3.10-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/sstabellini/xen · 674825d0
      Linus Torvalds authored
      Pull Xen/arm fixes from Stefano Stabellini:
       "This contains a couple of Xen on ARM initialization fixes and a patch
        to improve error handling"
      
      * tag 'fixes-for-3.10-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/sstabellini/xen:
        xen/arm: rename xen_secondary_init and run it on every online cpu
        xen/arm: do not handle VCPUOP_register_vcpu_info failures
        xen/arm: initialize pm functions later
      674825d0
  3. 13 May, 2013 7 commits
    • Linus Torvalds's avatar
      Merge branch 'parisc-for-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux · c83bb885
      Linus Torvalds authored
      Pull parisc update from Helge Deller:
       "The second round of parisc updates for 3.10 includes build fixes and
        enhancements to utilize irq stacks, fixes SMP races when updating PTE
        and TLB entries by proper locking and makes the search for the correct
        cross compiler more robust on Debian and Gentoo."
      
      * 'parisc-for-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
        parisc: make default cross compiler search more robust (v3)
        parisc: fix SMP races when updating PTE and TLB entries in entry.S
        parisc: implement irq stacks - part 2 (v2)
      c83bb885
    • Jaccon Bastiaansen's avatar
      ARM: 7720/1: ARM v6/v7 cmpxchg64 shouldn't clear upper 32 bits of the old/new value · 6eabb330
      Jaccon Bastiaansen authored
      The implementation of cmpxchg64() for the ARM v6 and v7 architectures
      casts parameters 2 and 3 (the old and new 64bit values) to an unsigned
      long before calling the atomic_cmpxchg64() function. This clears
      the top 32 bits of the old and new values, resulting in the wrong
      values being compare-exchanged. Luckily, this only appears to be used
      for 64-bit sched_clock, which we don't (yet) have on ARM.
      
      This bug was introduced by commit 3e0f5a15 ("ARM: 7404/1: cmpxchg64:
      use atomic64 and local64 routines for cmpxchg64").
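
The truncation described above can be reproduced in a small user-space model. This is only a sketch: `arm_ulong_t`, `model_cmpxchg64()` and both wrappers are hypothetical names invented for illustration, the compare-and-exchange is not atomic, and none of it is the kernel's actual code.

```c
#include <stdint.h>

/* Hypothetical stand-in for `unsigned long` on 32-bit ARM (32 bits wide). */
typedef uint32_t arm_ulong_t;

/*
 * Non-atomic model of a 64-bit compare-and-exchange: store new_val only if
 * *ptr equals old_val, and return the value that was found in *ptr.
 */
static uint64_t model_cmpxchg64(uint64_t *ptr, uint64_t old_val, uint64_t new_val)
{
	uint64_t found = *ptr;

	if (found == old_val)
		*ptr = new_val;
	return found;
}

/* Buggy wrapper: the casts discard the top 32 bits of both arguments. */
static uint64_t cmpxchg64_truncating(uint64_t *ptr, uint64_t old_val, uint64_t new_val)
{
	return model_cmpxchg64(ptr, (arm_ulong_t)old_val, (arm_ulong_t)new_val);
}

/* Fixed wrapper: the full 64-bit values are passed through unchanged. */
static uint64_t cmpxchg64_full(uint64_t *ptr, uint64_t old_val, uint64_t new_val)
{
	return model_cmpxchg64(ptr, old_val, new_val);
}
```

With the truncating wrapper, an exchange against a value whose upper half is non-zero never matches, and an exchange against a value that happens to share the lower 32 bits succeeds but stores a truncated result.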
      
      Cc: <stable@vger.kernel.org>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Jaccon Bastiaansen <jaccon.bastiaansen@gmail.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      6eabb330
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · dbbffe68
      Linus Torvalds authored
      Pull networking fixes from David Miller:
       "Several small bug fixes all over:
      
         1) be2net driver uses wrong payload length when submitting MAC list
            get requests to the chip.  From Sathya Perla.
      
         2) Fix mwifiex memory leak on driver unload, from Amitkumar Karwar.
      
         3) Prevent random memory access in batman-adv, from Marek Lindner.
      
         4) batman-adv doesn't check for pskb_trim_rcsum() errors, also from
            Marek Lindner.
      
         5) Fix fec crashes on rapid link up/down, from Frank Li.
      
         6) Fix inner protocol grovelling in GSO, from Pravin B Shelar.
      
         7) Link event validation fix in qlcnic from Rajesh Borundia.
      
         8) Not all FEC chips can support checksum offload, fix from Shawn
            Guo.
      
         9) EXPORT_SYMBOL + inline doesn't make any sense, from Denis Efremov.
      
        10) Fix race in passthru mode during device removal in macvlan, from
            Jiri Pirko.
      
        11) Fix RCU hash table lookup socket state race in ipv6, leading to
            NULL pointer derefs, from Eric Dumazet.
      
        12) Add several missing HAS_DMA kconfig dependencies, from Geert
            Uytterhoeven.
      
        13) Fix bogus PCI resource management in 3c59x driver, from Sergei
            Shtylyov.
      
        14) Fix info leak in ipv6 GRE tunnel driver, from Amerigo Wang.
      
        15) Fix device leak in ipv6 IPSEC policy layer, from Cong Wang.
      
        16) DMA mapping leak fix in qlge from Thadeu Lima de Souza Cascardo.
      
        17) Missing iounmap on probe failure in bna driver, from Wei Yongjun."
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (40 commits)
        bna: add missing iounmap() on error in bnad_init()
        qlge: fix dma map leak when the last chunk is not allocated
        xfrm6: release dev before returning error
        ipv6,gre: do not leak info to user-space
        virtio_net: use default napi weight by default
        emac: Fix EMAC soft reset on 460EX/GT
        3c59x: fix PCI resource management
        caif: CAIF_VIRTIO should depend on HAS_DMA
        net/ethernet: MACB should depend on HAS_DMA
        net/ethernet: ARM_AT91_ETHER should depend on HAS_DMA
        net/wireless: ATH9K should depend on HAS_DMA
        net/ethernet: STMMAC_ETH should depend on HAS_DMA
        net/ethernet: NET_CALXEDA_XGMAC should depend on HAS_DMA
        ipv6: do not clear pinet6 field
        macvlan: fix passthru mode race between dev removal and rx path
        ipv4: ip_output: remove inline marking of EXPORT_SYMBOL functions
        net/mlx4: Strengthen VLAN tags/priorities enforcement in VST mode
        net/mlx4_core: Add missing report on VST and spoof-checking dev caps
        net: fec: enable hardware checksum only on imx6q-fec
        qlcnic: Fix validation of link event command.
        ...
      dbbffe68
    • Helge Deller's avatar
      parisc: make default cross compiler search more robust (v3) · 6880b015
      Helge Deller authored
      People/distros vary in how they prefix the toolchain name for 64bit builds.
      Rather than enforce one convention over another, add a for loop which
      does a search for all the general prefixes.
      
      For 64bit builds, we now search for (in order):
      	hppa64-unknown-linux-gnu
      	hppa64-linux-gnu
      	hppa64-linux
      
      For 32bit builds, we look for:
      	hppa-unknown-linux-gnu
      	hppa-linux-gnu
      	hppa-linux
      	hppa2.0-unknown-linux-gnu
      	hppa2.0-linux-gnu
      	hppa2.0-linux
      	hppa1.1-unknown-linux-gnu
      	hppa1.1-linux-gnu
      	hppa1.1-linux
      
      This patch was initiated by Mike Frysinger, with feedback from Jeroen
      Roovers, John David Anglin and Helge Deller.
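
The ordered search described above amounts to "take the first candidate that exists". The following C sketch shows only that logic; it is not the build-system change itself. The probe callback and the two example probes are hypothetical, standing in for a check that `<prefix>-gcc` is installed.

```c
#include <stddef.h>
#include <string.h>

/* Candidate prefixes for 64-bit builds, in the search order listed above. */
static const char *const hppa64_prefixes[] = {
	"hppa64-unknown-linux-gnu",
	"hppa64-linux-gnu",
	"hppa64-linux",
};

/* Hypothetical probe: returns nonzero if "<prefix>-gcc" is installed. */
typedef int (*toolchain_probe_fn)(const char *prefix);

/* Return the first prefix the probe accepts, or NULL if none is found. */
static const char *find_cross_prefix(const char *const *prefixes, size_t count,
				     toolchain_probe_fn probe)
{
	for (size_t i = 0; i < count; i++) {
		if (probe(prefixes[i]))
			return prefixes[i];
	}
	return NULL;
}

/* Example probe for illustration: pretend only hppa64-linux-gcc exists. */
static int probe_only_hppa64_linux(const char *prefix)
{
	return strcmp(prefix, "hppa64-linux") == 0;
}

/* Example probe for illustration: pretend no toolchain is installed. */
static int probe_nothing(const char *prefix)
{
	(void)prefix;
	return 0;
}
```

Because the loop stops at the first hit, earlier entries in the list take priority when more than one toolchain is installed.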
      Signed-off-by: Mike Frysinger <vapier@gentoo.org>
      Signed-off-by: Jeroen Roovers <jer@gentoo.org>
      Signed-off-by: John David Anglin <dave.anglin@bell.net>
      Signed-off-by: Helge Deller <deller@gmx.de>
      6880b015
    • Alex Elder's avatar
      rbd: re-submit flattened write request (part 2) · 638f5abe
      Alex Elder authored
      Add code to rbd_img_obj_exists_callback() to detect when a clone's
      parent image has disappeared, and re-submit the original write
      request in that case.
      
      Kill off some redundant assertions.
      
      This completes the resolution for:
          http://tracker.ceph.com/issues/3763
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
      638f5abe
    • Alex Elder's avatar
      rbd: re-submit write request for flattened clone · bbea1c1a
      Alex Elder authored
      Add code to rbd_img_obj_parent_read_full_callback() to detect when a
      clone's parent image has disappeared, and re-submit the original
      write request in that case.  (See the previous commit for more
      reasoning about why this is appropriate.)
      
      Rename some variables in rbd_img_obj_parent_read_full_callback()
      to match the convention used in the previous patch.
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
      bbea1c1a
    • Alex Elder's avatar
      rbd: re-submit read request for flattened clone · 02c74fba
      Alex Elder authored
      If a clone image gets flattened while a parent read request is
      underway, the original rbd object request needs to be resubmitted.
      
      The reason is that by the time we get the response to the parent
      read request, the data read from the parent may be out of date.
      In other words, we could see this sequence of events:
      
          rbd client                      parent image/osd
          ----------                      ----------------
          original object ENOENT;
              issue parent read
                                          respond to parent read
                                          child image flattened
          original image header refresh
                   <--- original object written independently here
          parent read response received
      
      Add code to rbd_img_parent_read_callback() to detect when a clone's
      parent image has disappeared (as evidenced by its parent overlap
      becoming 0), and re-submit the original read request in that case.
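
The check described above can be sketched as a small decision function. The struct, enum and function names here are hypothetical simplifications for illustration and do not mirror the rbd driver's real data structures; only the "parent overlap is 0 means the parent is gone" test comes from the description.

```c
#include <stdint.h>

/* Hypothetical, simplified view of a clone's state. */
struct clone_state {
	uint64_t parent_overlap; /* bytes still backed by the parent; 0 once flattened */
};

enum parent_read_action {
	PARENT_READ_USE_DATA,   /* parent still attached: complete with parent data */
	PARENT_READ_RESUBMIT,   /* parent gone: re-issue the original read */
};

/* Decide what to do when the response to a parent read arrives. */
static enum parent_read_action
parent_read_completed(const struct clone_state *clone)
{
	if (clone->parent_overlap == 0)
		return PARENT_READ_RESUBMIT;
	return PARENT_READ_USE_DATA;
}
```

Re-submitting rather than using the parent's data avoids returning stale contents if the original object was written after the image was flattened.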
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
      02c74fba