- 10 Jun, 2013 18 commits
-
-
Paul E. McKenney authored
TINY_RCU's reset_cpu_stall_ticks() and check_cpu_stalls() functions are defined unconditionally, and are empty functions if CONFIG_RCU_TRACE is disabled (which in turns disables detection of RCU CPU stalls). This commit saves a few lines of source code by defining these functions only if CONFIG_RCU_TRACE=y. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
-
Paul E. McKenney authored
Now that TINY_PREEMPT_RCU is no more, exit_rcu() is always an empty function. But if TINY_RCU is going to have an empty function, it should be in include/linux/rcutiny.h, where it does not bloat the kernel. This commit therefore moves exit_rcu() out of kernel/rcupdate.c to kernel/rcutree_plugin.h, and places a static inline empty function in include/linux/rcutiny.h in order to shrink TINY_RCU a bit. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
-
Paul E. McKenney authored
Because TINY_PREEMPT_RCU is no more, this commit removes its tracing formats from the documentation. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
-
Paul E. McKenney authored
This commit rearranges code in order to allow ifdefs to be consolidated in kernel/rcutiny_plugin.h, simplifying the code. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
-
Paul E. McKenney authored
With the removal of CONFIG_TINY_PREEMPT_RCU, rcu_preempt_note_context_switch() is now an empty function. This commit therefore eliminates it by inlining it. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
-
Paul E. McKenney authored
Now that CONFIG_TINY_PREEMPT_RCU is no more, this commit removes the CONFIG_TINY_RCU ifdefs from include/linux/rcutiny.h in favor of unconditionally compiling the CONFIG_TINY_RCU legs of those ifdefs. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> [ paulmck: Moved removal of #else to "Remove TINY_PREEMPT_RCU" as suggested by Josh Triplett. ] Reviewed-by: Josh Triplett <josh@joshtriplett.org>
-
Paul E. McKenney authored
With the removal of CONFIG_TINY_PREEMPT_RCU, check_cpu_stall_preempt() is now an empty function. This commit therefore eliminates it by inlining it. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
-
Paul E. McKenney authored
TINY_PREEMPT_RCU could use a kthread to handle RCU callback invocation, which required an API to abstract kthread vs. softirq invocation. Now that TINY_PREEMPT_RCU is no longer with us, this commit retires this API in favor of direct use of the relevant softirq primitives. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
-
Paul E. McKenney authored
With the removal of CONFIG_TINY_PREEMPT_RCU, rcu_preempt_process_callbacks() is now an empty function. This commit therefore eliminates it by inlining it. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
-
Paul E. McKenney authored
With the removal of CONFIG_TINY_PREEMPT_RCU, rcu_preempt_remove_callbacks() is now an empty function. This commit therefore eliminates it by inlining it. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
-
Paul E. McKenney authored
With the removal of CONFIG_TINY_PREEMPT_RCU, rcu_preempt_check_callbacks() is now an empty function. This commit therefore eliminates it by inlining it. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
-
Paul E. McKenney authored
With the removal of CONFIG_TINY_PREEMPT_RCU, show_tiny_preempt_stats() is now an empty function. This commit therefore eliminates it by inlining it. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
-
Paul E. McKenney authored
TINY_PREEMPT_RCU adds significant code and complexity, but does not offer commensurate benefits. People currently using TINY_PREEMPT_RCU can get much better memory footprint with TINY_RCU, or, if they really need preemptible RCU, they can use TREE_PREEMPT_RCU with a relatively minor degradation in memory footprint. Please note that this move has been widely publicized on LKML (https://lkml.org/lkml/2012/11/12/545) and on LWN (http://lwn.net/Articles/541037/). This commit therefore removes TINY_PREEMPT_RCU. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> [ paulmck: Updated to eliminate #else in rcutiny.h as suggested by Josh ] Reviewed-by: Josh Triplett <josh@joshtriplett.org>
-
Paul E. McKenney authored
This commit converts printk() calls to the corresponding pr_*() calls. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
-
Paul E. McKenney authored
This commit converts printk() calls to the corresponding pr_*() calls. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
-
Paul E. McKenney authored
In Steven Rostedt's words: > I've been debugging the last couple of days why my tests have been > locking up. One of my tracing tests, runs all available tracers. The > lockup always happened with the mmiotrace, which is used to trace > interactions between priority drivers and the kernel. But to do this > easily, when the tracer gets registered, it disables all but the boot > CPUs. The lockup always happened after it got done disabling the CPUs. > > Then I decided to try this: > > while :; do > for i in 1 2 3; do > echo 0 > /sys/devices/system/cpu/cpu$i/online > done > for i in 1 2 3; do > echo 1 > /sys/devices/system/cpu/cpu$i/online > done > done > > Well, sure enough, that locked up too, with the same users. Doing a > sysrq-w (showing all blocked tasks): > > [ 2991.344562] task PC stack pid father > [ 2991.344562] rcu_preempt D ffff88007986fdf8 0 10 2 0x00000000 > [ 2991.344562] ffff88007986fc98 0000000000000002 ffff88007986fc48 0000000000000908 > [ 2991.344562] ffff88007986c280 ffff88007986ffd8 ffff88007986ffd8 00000000001d3c80 > [ 2991.344562] ffff880079248a40 ffff88007986c280 0000000000000000 00000000fffd4295 > [ 2991.344562] Call Trace: > [ 2991.344562] [<ffffffff815437ba>] schedule+0x64/0x66 > [ 2991.344562] [<ffffffff81541750>] schedule_timeout+0xbc/0xf9 > [ 2991.344562] [<ffffffff8154bec0>] ? ftrace_call+0x5/0x2f > [ 2991.344562] [<ffffffff81049513>] ? cascade+0xa8/0xa8 > [ 2991.344562] [<ffffffff815417ab>] schedule_timeout_uninterruptible+0x1e/0x20 > [ 2991.344562] [<ffffffff810c980c>] rcu_gp_kthread+0x502/0x94b > [ 2991.344562] [<ffffffff81062791>] ? __init_waitqueue_head+0x50/0x50 > [ 2991.344562] [<ffffffff810c930a>] ? rcu_gp_fqs+0x64/0x64 > [ 2991.344562] [<ffffffff81061cdb>] kthread+0xb1/0xb9 > [ 2991.344562] [<ffffffff81091e31>] ? lock_release_holdtime.part.23+0x4e/0x55 > [ 2991.344562] [<ffffffff81061c2a>] ? __init_kthread_worker+0x58/0x58 > [ 2991.344562] [<ffffffff8154c1dc>] ret_from_fork+0x7c/0xb0 > [ 2991.344562] [<ffffffff81061c2a>] ? __init_kthread_worker+0x58/0x58 > [ 2991.344562] kworker/0:1 D ffffffff81a30680 0 47 2 0x00000000 > [ 2991.344562] Workqueue: events cpuset_hotplug_workfn > [ 2991.344562] ffff880078dbbb58 0000000000000002 0000000000000006 00000000000000d8 > [ 2991.344562] ffff880078db8100 ffff880078dbbfd8 ffff880078dbbfd8 00000000001d3c80 > [ 2991.344562] ffff8800779ca5c0 ffff880078db8100 ffffffff81541fcf 0000000000000000 > [ 2991.344562] Call Trace: > [ 2991.344562] [<ffffffff81541fcf>] ? __mutex_lock_common+0x3d4/0x609 > [ 2991.344562] [<ffffffff815437ba>] schedule+0x64/0x66 > [ 2991.344562] [<ffffffff81543a39>] schedule_preempt_disabled+0x18/0x24 > [ 2991.344562] [<ffffffff81541fcf>] __mutex_lock_common+0x3d4/0x609 > [ 2991.344562] [<ffffffff8103d11b>] ? get_online_cpus+0x3c/0x50 > [ 2991.344562] [<ffffffff8103d11b>] ? get_online_cpus+0x3c/0x50 > [ 2991.344562] [<ffffffff815422ff>] mutex_lock_nested+0x3b/0x40 > [ 2991.344562] [<ffffffff8103d11b>] get_online_cpus+0x3c/0x50 > [ 2991.344562] [<ffffffff810af7e6>] rebuild_sched_domains_locked+0x6e/0x3a8 > [ 2991.344562] [<ffffffff810b0ec6>] rebuild_sched_domains+0x1c/0x2a > [ 2991.344562] [<ffffffff810b109b>] cpuset_hotplug_workfn+0x1c7/0x1d3 > [ 2991.344562] [<ffffffff810b0ed9>] ? cpuset_hotplug_workfn+0x5/0x1d3 > [ 2991.344562] [<ffffffff81058e07>] process_one_work+0x2d4/0x4d1 > [ 2991.344562] [<ffffffff81058d3a>] ? process_one_work+0x207/0x4d1 > [ 2991.344562] [<ffffffff8105964c>] worker_thread+0x2e7/0x3b5 > [ 2991.344562] [<ffffffff81059365>] ? rescuer_thread+0x332/0x332 > [ 2991.344562] [<ffffffff81061cdb>] kthread+0xb1/0xb9 > [ 2991.344562] [<ffffffff81061c2a>] ? __init_kthread_worker+0x58/0x58 > [ 2991.344562] [<ffffffff8154c1dc>] ret_from_fork+0x7c/0xb0 > [ 2991.344562] [<ffffffff81061c2a>] ? __init_kthread_worker+0x58/0x58 > [ 2991.344562] bash D ffffffff81a4aa80 0 2618 2612 0x10000000 > [ 2991.344562] ffff8800379abb58 0000000000000002 0000000000000006 0000000000000c2c > [ 2991.344562] ffff880077fea140 ffff8800379abfd8 ffff8800379abfd8 00000000001d3c80 > [ 2991.344562] ffff8800779ca5c0 ffff880077fea140 ffffffff81541fcf 0000000000000000 > [ 2991.344562] Call Trace: > [ 2991.344562] [<ffffffff81541fcf>] ? __mutex_lock_common+0x3d4/0x609 > [ 2991.344562] [<ffffffff815437ba>] schedule+0x64/0x66 > [ 2991.344562] [<ffffffff81543a39>] schedule_preempt_disabled+0x18/0x24 > [ 2991.344562] [<ffffffff81541fcf>] __mutex_lock_common+0x3d4/0x609 > [ 2991.344562] [<ffffffff81530078>] ? rcu_cpu_notify+0x2f5/0x86e > [ 2991.344562] [<ffffffff81530078>] ? rcu_cpu_notify+0x2f5/0x86e > [ 2991.344562] [<ffffffff815422ff>] mutex_lock_nested+0x3b/0x40 > [ 2991.344562] [<ffffffff81530078>] rcu_cpu_notify+0x2f5/0x86e > [ 2991.344562] [<ffffffff81091c99>] ? __lock_is_held+0x32/0x53 > [ 2991.344562] [<ffffffff81548912>] notifier_call_chain+0x6b/0x98 > [ 2991.344562] [<ffffffff810671fd>] __raw_notifier_call_chain+0xe/0x10 > [ 2991.344562] [<ffffffff8103cf64>] __cpu_notify+0x20/0x32 > [ 2991.344562] [<ffffffff8103cf8d>] cpu_notify_nofail+0x17/0x36 > [ 2991.344562] [<ffffffff815225de>] _cpu_down+0x154/0x259 > [ 2991.344562] [<ffffffff81522710>] cpu_down+0x2d/0x3a > [ 2991.344562] [<ffffffff81526351>] store_online+0x4e/0xe7 > [ 2991.344562] [<ffffffff8134d764>] dev_attr_store+0x20/0x22 > [ 2991.344562] [<ffffffff811b3c5f>] sysfs_write_file+0x108/0x144 > [ 2991.344562] [<ffffffff8114c5ef>] vfs_write+0xfd/0x158 > [ 2991.344562] [<ffffffff8114c928>] SyS_write+0x5c/0x83 > [ 2991.344562] [<ffffffff8154c494>] tracesys+0xdd/0xe2 > > As well as held locks: > > [ 3034.728033] Showing all locks held in the system: > [ 3034.728033] 1 lock held by rcu_preempt/10: > [ 3034.728033] #0: (rcu_preempt_state.onoff_mutex){+.+...}, at: [<ffffffff810c9471>] rcu_gp_kthread+0x167/0x94b > [ 3034.728033] 4 locks held by kworker/0:1/47: > [ 3034.728033] #0: (events){.+.+.+}, at: [<ffffffff81058d3a>] process_one_work+0x207/0x4d1 > [ 3034.728033] #1: (cpuset_hotplug_work){+.+.+.}, at: [<ffffffff81058d3a>] process_one_work+0x207/0x4d1 > [ 3034.728033] #2: (cpuset_mutex){+.+.+.}, at: [<ffffffff810b0ec1>] rebuild_sched_domains+0x17/0x2a > [ 3034.728033] #3: (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff8103d11b>] get_online_cpus+0x3c/0x50 > [ 3034.728033] 1 lock held by mingetty/2563: > [ 3034.728033] #0: (&ldata->atomic_read_lock){+.+...}, at: [<ffffffff8131e28a>] n_tty_read+0x252/0x7e8 > [ 3034.728033] 1 lock held by mingetty/2565: > [ 3034.728033] #0: (&ldata->atomic_read_lock){+.+...}, at: [<ffffffff8131e28a>] n_tty_read+0x252/0x7e8 > [ 3034.728033] 1 lock held by mingetty/2569: > [ 3034.728033] #0: (&ldata->atomic_read_lock){+.+...}, at: [<ffffffff8131e28a>] n_tty_read+0x252/0x7e8 > [ 3034.728033] 1 lock held by mingetty/2572: > [ 3034.728033] #0: (&ldata->atomic_read_lock){+.+...}, at: [<ffffffff8131e28a>] n_tty_read+0x252/0x7e8 > [ 3034.728033] 1 lock held by mingetty/2575: > [ 3034.728033] #0: (&ldata->atomic_read_lock){+.+...}, at: [<ffffffff8131e28a>] n_tty_read+0x252/0x7e8 > [ 3034.728033] 7 locks held by bash/2618: > [ 3034.728033] #0: (sb_writers#5){.+.+.+}, at: [<ffffffff8114bc3f>] file_start_write+0x2a/0x2c > [ 3034.728033] #1: (&buffer->mutex#2){+.+.+.}, at: [<ffffffff811b3b93>] sysfs_write_file+0x3c/0x144 > [ 3034.728033] #2: (s_active#54){.+.+.+}, at: [<ffffffff811b3c3e>] sysfs_write_file+0xe7/0x144 > [ 3034.728033] #3: (x86_cpu_hotplug_driver_mutex){+.+.+.}, at: [<ffffffff810217c2>] cpu_hotplug_driver_lock+0x17/0x19 > [ 3034.728033] #4: (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff8103d196>] cpu_maps_update_begin+0x17/0x19 > [ 3034.728033] #5: (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff8103cfd8>] cpu_hotplug_begin+0x2c/0x6d > [ 3034.728033] #6: (rcu_preempt_state.onoff_mutex){+.+...}, at: [<ffffffff81530078>] rcu_cpu_notify+0x2f5/0x86e > [ 3034.728033] 1 lock held by bash/2980: > [ 3034.728033] #0: (&ldata->atomic_read_lock){+.+...}, at: [<ffffffff8131e28a>] n_tty_read+0x252/0x7e8 > > Things looked a little weird. Also, this is a deadlock that lockdep did > not catch. But what we have here does not look like a circular lock > issue: > > Bash is blocked in rcu_cpu_notify(): > > 1961 /* Exclude any attempts to start a new grace period. */ > 1962 mutex_lock(&rsp->onoff_mutex); > > > kworker is blocked in get_online_cpus(), which makes sense as we are > currently taking down a CPU. > > But rcu_preempt is not blocked on anything. It is simply sleeping in > rcu_gp_kthread (really rcu_gp_init) here: > > 1453 #ifdef CONFIG_PROVE_RCU_DELAY > 1454 if ((prandom_u32() % (rcu_num_nodes * 8)) == 0 && > 1455 system_state == SYSTEM_RUNNING) > 1456 schedule_timeout_uninterruptible(2); > 1457 #endif /* #ifdef CONFIG_PROVE_RCU_DELAY */ > > And it does this while holding the onoff_mutex that bash is waiting for. > > Doing a function trace, it showed me where it happened: > > [ 125.940066] rcu_pree-10 3.... 28384115273: schedule_timeout_uninterruptible <-rcu_gp_kthread > [...] > [ 125.940066] rcu_pree-10 3d..3 28384202439: sched_switch: prev_comm=rcu_preempt prev_pid=10 prev_prio=120 prev_state=D ==> next_comm=watchdog/3 next_pid=38 next_prio=120 > > The watchdog ran, and then: > > [ 125.940066] watchdog-38 3d..3 28384692863: sched_switch: prev_comm=watchdog/3 prev_pid=38 prev_prio=120 prev_state=P ==> next_comm=modprobe next_pid=2848 next_prio=118 > > Not sure what modprobe was doing, but shortly after that: > > [ 125.940066] modprobe-2848 3d..3 28385041749: sched_switch: prev_comm=modprobe prev_pid=2848 prev_prio=118 prev_state=R+ ==> next_comm=migration/3 next_pid=40 next_prio=0 > > Where the migration thread took down the CPU: > > [ 125.940066] migratio-40 3d..3 28389148276: sched_switch: prev_comm=migration/3 prev_pid=40 prev_prio=0 prev_state=P ==> next_comm=swapper/3 next_pid=0 next_prio=120 > > which finally did: > > [ 125.940066] <idle>-0 3...1 28389282142: arch_cpu_idle_dead <-cpu_startup_entry > [ 125.940066] <idle>-0 3...1 28389282548: native_play_dead <-arch_cpu_idle_dead > [ 125.940066] <idle>-0 3...1 28389282924: play_dead_common <-native_play_dead > [ 125.940066] <idle>-0 3...1 28389283468: idle_task_exit <-play_dead_common > [ 125.940066] <idle>-0 3...1 28389284644: amd_e400_remove_cpu <-play_dead_common > > > CPU 3 is now offline, the rcu_preempt thread that ran on CPU 3 is still > doing a schedule_timeout_uninterruptible() and it registered it's > timeout to the timer base for CPU 3. You would think that it would get > migrated right? The issue here is that the timer migration happens at > the CPU notifier for CPU_DEAD. The problem is that the rcu notifier for > CPU_DOWN is blocked waiting for the onoff_mutex to be released, which is > held by the thread that just put itself into a uninterruptible sleep, > that wont wake up until the CPU_DEAD notifier of the timer > infrastructure is called, which wont happen until the rcu notifier > finishes. Here's our deadlock! This commit breaks this deadlock cycle by substituting a shorter udelay() for the previous schedule_timeout_uninterruptible(), while at the same time increasing the probability of the delay. This maintains the intensity of the testing. Reported-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Steven Rostedt <rostedt@goodmis.org>
-
Steven Rostedt authored
This commit fixes a lockdep-detected deadlock by moving a wake_up() call out from a rnp->lock critical section. Please see below for the long version of this story. On Tue, 2013-05-28 at 16:13 -0400, Dave Jones wrote: > [12572.705832] ====================================================== > [12572.750317] [ INFO: possible circular locking dependency detected ] > [12572.796978] 3.10.0-rc3+ #39 Not tainted > [12572.833381] ------------------------------------------------------- > [12572.862233] trinity-child17/31341 is trying to acquire lock: > [12572.870390] (rcu_node_0){..-.-.}, at: [<ffffffff811054ff>] rcu_read_unlock_special+0x9f/0x4c0 > [12572.878859] > but task is already holding lock: > [12572.894894] (&ctx->lock){-.-...}, at: [<ffffffff811390ed>] perf_lock_task_context+0x7d/0x2d0 > [12572.903381] > which lock already depends on the new lock. > > [12572.927541] > the existing dependency chain (in reverse order) is: > [12572.943736] > -> #4 (&ctx->lock){-.-...}: > [12572.960032] [<ffffffff810b9851>] lock_acquire+0x91/0x1f0 > [12572.968337] [<ffffffff816ebc90>] _raw_spin_lock+0x40/0x80 > [12572.976633] [<ffffffff8113c987>] __perf_event_task_sched_out+0x2e7/0x5e0 > [12572.984969] [<ffffffff81088953>] perf_event_task_sched_out+0x93/0xa0 > [12572.993326] [<ffffffff816ea0bf>] __schedule+0x2cf/0x9c0 > [12573.001652] [<ffffffff816eacfe>] schedule_user+0x2e/0x70 > [12573.009998] [<ffffffff816ecd64>] retint_careful+0x12/0x2e > [12573.018321] > -> #3 (&rq->lock){-.-.-.}: > [12573.034628] [<ffffffff810b9851>] lock_acquire+0x91/0x1f0 > [12573.042930] [<ffffffff816ebc90>] _raw_spin_lock+0x40/0x80 > [12573.051248] [<ffffffff8108e6a7>] wake_up_new_task+0xb7/0x260 > [12573.059579] [<ffffffff810492f5>] do_fork+0x105/0x470 > [12573.067880] [<ffffffff81049686>] kernel_thread+0x26/0x30 > [12573.076202] [<ffffffff816cee63>] rest_init+0x23/0x140 > [12573.084508] [<ffffffff81ed8e1f>] start_kernel+0x3f1/0x3fe > [12573.092852] [<ffffffff81ed856f>] x86_64_start_reservations+0x2a/0x2c > [12573.101233] [<ffffffff81ed863d>] x86_64_start_kernel+0xcc/0xcf > [12573.109528] > -> #2 (&p->pi_lock){-.-.-.}: > [12573.125675] [<ffffffff810b9851>] lock_acquire+0x91/0x1f0 > [12573.133829] [<ffffffff816ebe9b>] _raw_spin_lock_irqsave+0x4b/0x90 > [12573.141964] [<ffffffff8108e881>] try_to_wake_up+0x31/0x320 > [12573.150065] [<ffffffff8108ebe2>] default_wake_function+0x12/0x20 > [12573.158151] [<ffffffff8107bbf8>] autoremove_wake_function+0x18/0x40 > [12573.166195] [<ffffffff81085398>] __wake_up_common+0x58/0x90 > [12573.174215] [<ffffffff81086909>] __wake_up+0x39/0x50 > [12573.182146] [<ffffffff810fc3da>] rcu_start_gp_advanced.isra.11+0x4a/0x50 > [12573.190119] [<ffffffff810fdb09>] rcu_start_future_gp+0x1c9/0x1f0 > [12573.198023] [<ffffffff810fe2c4>] rcu_nocb_kthread+0x114/0x930 > [12573.205860] [<ffffffff8107a91d>] kthread+0xed/0x100 > [12573.213656] [<ffffffff816f4b1c>] ret_from_fork+0x7c/0xb0 > [12573.221379] > -> #1 (&rsp->gp_wq){..-.-.}: > [12573.236329] [<ffffffff810b9851>] lock_acquire+0x91/0x1f0 > [12573.243783] [<ffffffff816ebe9b>] _raw_spin_lock_irqsave+0x4b/0x90 > [12573.251178] [<ffffffff810868f3>] __wake_up+0x23/0x50 > [12573.258505] [<ffffffff810fc3da>] rcu_start_gp_advanced.isra.11+0x4a/0x50 > [12573.265891] [<ffffffff810fdb09>] rcu_start_future_gp+0x1c9/0x1f0 > [12573.273248] [<ffffffff810fe2c4>] rcu_nocb_kthread+0x114/0x930 > [12573.280564] [<ffffffff8107a91d>] kthread+0xed/0x100 > [12573.287807] [<ffffffff816f4b1c>] ret_from_fork+0x7c/0xb0 Notice the above call chain. rcu_start_future_gp() is called with the rnp->lock held. Then it calls rcu_start_gp_advance, which does a wakeup. You can't do wakeups while holding the rnp->lock, as that would mean that you could not do a rcu_read_unlock() while holding the rq lock, or any lock that was taken while holding the rq lock. This is because... (See below). > [12573.295067] > -> #0 (rcu_node_0){..-.-.}: > [12573.309293] [<ffffffff810b8d36>] __lock_acquire+0x1786/0x1af0 > [12573.316568] [<ffffffff810b9851>] lock_acquire+0x91/0x1f0 > [12573.323825] [<ffffffff816ebc90>] _raw_spin_lock+0x40/0x80 > [12573.331081] [<ffffffff811054ff>] rcu_read_unlock_special+0x9f/0x4c0 > [12573.338377] [<ffffffff810760a6>] __rcu_read_unlock+0x96/0xa0 > [12573.345648] [<ffffffff811391b3>] perf_lock_task_context+0x143/0x2d0 > [12573.352942] [<ffffffff8113938e>] find_get_context+0x4e/0x1f0 > [12573.360211] [<ffffffff811403f4>] SYSC_perf_event_open+0x514/0xbd0 > [12573.367514] [<ffffffff81140e49>] SyS_perf_event_open+0x9/0x10 > [12573.374816] [<ffffffff816f4dd4>] tracesys+0xdd/0xe2 Notice the above trace. perf took its own ctx->lock, which can be taken while holding the rq lock. While holding this lock, it did a rcu_read_unlock(). The perf_lock_task_context() basically looks like: rcu_read_lock(); raw_spin_lock(ctx->lock); rcu_read_unlock(); Now, what looks to have happened, is that we scheduled after taking that first rcu_read_lock() but before taking the spin lock. When we scheduled back in and took the ctx->lock, the following rcu_read_unlock() triggered the "special" code. The rcu_read_unlock_special() takes the rnp->lock, which gives us a possible deadlock scenario. CPU0 CPU1 CPU2 ---- ---- ---- rcu_nocb_kthread() lock(rq->lock); lock(ctx->lock); lock(rnp->lock); wake_up(); lock(rq->lock); rcu_read_unlock(); rcu_read_unlock_special(); lock(rnp->lock); lock(ctx->lock); **** DEADLOCK **** > [12573.382068] > other info that might help us debug this: > > [12573.403229] Chain exists of: > rcu_node_0 --> &rq->lock --> &ctx->lock > > [12573.424471] Possible unsafe locking scenario: > > [12573.438499] CPU0 CPU1 > [12573.445599] ---- ---- > [12573.452691] lock(&ctx->lock); > [12573.459799] lock(&rq->lock); > [12573.467010] lock(&ctx->lock); > [12573.474192] lock(rcu_node_0); > [12573.481262] > *** DEADLOCK *** > > [12573.501931] 1 lock held by trinity-child17/31341: > [12573.508990] #0: (&ctx->lock){-.-...}, at: [<ffffffff811390ed>] perf_lock_task_context+0x7d/0x2d0 > [12573.516475] > stack backtrace: > [12573.530395] CPU: 1 PID: 31341 Comm: trinity-child17 Not tainted 3.10.0-rc3+ #39 > [12573.545357] ffffffff825b4f90 ffff880219f1dbc0 ffffffff816e375b ffff880219f1dc00 > [12573.552868] ffffffff816dfa5d ffff880219f1dc50 ffff88023ce4d1f8 ffff88023ce4ca40 > [12573.560353] 0000000000000001 0000000000000001 ffff88023ce4d1f8 ffff880219f1dcc0 > [12573.567856] Call Trace: > [12573.575011] [<ffffffff816e375b>] dump_stack+0x19/0x1b > [12573.582284] [<ffffffff816dfa5d>] print_circular_bug+0x200/0x20f > [12573.589637] [<ffffffff810b8d36>] __lock_acquire+0x1786/0x1af0 > [12573.596982] [<ffffffff810918f5>] ? sched_clock_cpu+0xb5/0x100 > [12573.604344] [<ffffffff810b9851>] lock_acquire+0x91/0x1f0 > [12573.611652] [<ffffffff811054ff>] ? rcu_read_unlock_special+0x9f/0x4c0 > [12573.619030] [<ffffffff816ebc90>] _raw_spin_lock+0x40/0x80 > [12573.626331] [<ffffffff811054ff>] ? rcu_read_unlock_special+0x9f/0x4c0 > [12573.633671] [<ffffffff811054ff>] rcu_read_unlock_special+0x9f/0x4c0 > [12573.640992] [<ffffffff811390ed>] ? perf_lock_task_context+0x7d/0x2d0 > [12573.648330] [<ffffffff810b429e>] ? put_lock_stats.isra.29+0xe/0x40 > [12573.655662] [<ffffffff813095a0>] ? delay_tsc+0x90/0xe0 > [12573.662964] [<ffffffff810760a6>] __rcu_read_unlock+0x96/0xa0 > [12573.670276] [<ffffffff811391b3>] perf_lock_task_context+0x143/0x2d0 > [12573.677622] [<ffffffff81139070>] ? __perf_event_enable+0x370/0x370 > [12573.684981] [<ffffffff8113938e>] find_get_context+0x4e/0x1f0 > [12573.692358] [<ffffffff811403f4>] SYSC_perf_event_open+0x514/0xbd0 > [12573.699753] [<ffffffff8108cd9d>] ? get_parent_ip+0xd/0x50 > [12573.707135] [<ffffffff810b71fd>] ? trace_hardirqs_on_caller+0xfd/0x1c0 > [12573.714599] [<ffffffff81140e49>] SyS_perf_event_open+0x9/0x10 > [12573.721996] [<ffffffff816f4dd4>] tracesys+0xdd/0xe2 This commit delays the wakeup via irq_work(), which is what perf and ftrace use to perform wakeups in critical sections. Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
-
Paul E. McKenney authored
__DECLARE_TRACE_RCU() currently creates an _rcuidle() tracepoint which may safely be invoked from what RCU considers to be an idle CPU. However, these _rcuidle() tracepoints may -not- be invoked from the handler of an irq taken from idle, because rcu_idle_enter() zeroes RCU's nesting-level counter, so that the rcu_irq_exit() returning to idle will trigger a WARN_ON_ONCE(). This commit therefore substitutes rcu_irq_enter() for rcu_idle_exit() and rcu_irq_exit() for rcu_idle_enter() in order to make the _rcuidle() tracepoints usable from irq handlers as well as from process context. Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Steven Rostedt <rostedt@goodmis.org>
-
- 09 Jun, 2013 2 commits
-
-
Linus Torvalds authored
-
Mikulas Patocka authored
This patch fixes warnings due to missing lock on write error path. WARNING: at fs/hpfs/hpfs_fn.h:353 hpfs_truncate+0x75/0x80 [hpfs]() Hardware name: empty Pid: 26563, comm: dd Tainted: P O 3.9.4 #12 Call Trace: hpfs_truncate+0x75/0x80 [hpfs] hpfs_write_begin+0x84/0x90 [hpfs] _hpfs_bmap+0x10/0x10 [hpfs] generic_file_buffered_write+0x121/0x2c0 __generic_file_aio_write+0x1c7/0x3f0 generic_file_aio_write+0x7c/0x100 do_sync_write+0x98/0xd0 hpfs_file_write+0xd/0x50 [hpfs] vfs_write+0xa2/0x160 sys_write+0x51/0xa0 page_fault+0x22/0x30 system_call_fastpath+0x1a/0x1f Signed-off-by: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz> Cc: stable@kernel.org # 2.6.39+ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 08 Jun, 2013 18 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull timer fixes from Thomas Gleixner: - Trivial: unused variable removal - Posix-timers: Add the clock ID to the new proc interface to make it useful. The interface is new and should be functional when we reach the final 3.10 release. - Cure a false positive warning in the tick code introduced by the overhaul in 3.10 - Fix for a persistent clock detection regression introduced in this cycle * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: timekeeping: Correct run-time detection of persistent_clock. ntp: Remove unused variable flags in __hardpps posix-timers: Show clock ID in proc file tick: Cure broadcast false positive pending bit warning
-
git://git.secretlab.ca/git/linuxLinus Torvalds authored
Pull irqdomain bug fixes from Grant Likely: "This branch contains a set of straight forward bug fixes to the irqdomain code and to a couple of drivers that make use of it." * tag 'irqdomain-for-linus' of git://git.secretlab.ca/git/linux: irqchip: Return -EPERM for reserved IRQs irqdomain: document the simple domain first_irq kernel/irq/irqdomain.c: before use 'irq_data', need check it whether valid. irqdomain: export irq_domain_add_simple
-
Grant Likely authored
The irqdomain core will report a log message for any attempted map call that fails unless the error code is -EPERM. This patch changes the Versatile irq controller drivers to use -EPERM because it is normal for a subset of the IRQ inputs to be marked as reserved on the various Versatile platforms. Signed-off-by: Grant Likely <grant.likely@linaro.org>
-
Linus Walleij authored
The first_irq needs to be zero to get a linear domain and that comes with special semantics. We want to simplify this going forward but some documentation never hurts. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Grant Likely <grant.likely@linaro.org>
-
Chen Gang authored
Since irq_data may be NULL, if so, we WARN_ON(), and continue, 'hwirq' which related with 'irq_data' has to initialize later, or it will cause issue. Signed-off-by: Chen Gang <gang.chen@asianux.com> Signed-off-by: Grant Likely <grant.likely@linaro.org>
-
Arnd Bergmann authored
All other irq_domain_add_* functions are exported already, and apparently this one got left out by mistake, which causes build errors for ARM allmodconfig kernels: ERROR: "irq_domain_add_simple" [drivers/gpio/gpio-rcar.ko] undefined! ERROR: "irq_domain_add_simple" [drivers/gpio/gpio-em.ko] undefined! Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Simon Horman <horms+renesas@verge.net.au> Signed-off-by: Grant Likely <grant.likely@linaro.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-socLinus Torvalds authored
Pull ARM SoC fixes from Olof Johansson: "Another week, another batch of fixes for arm-soc platforms. Nothing controversial here, a handful of fixes for regressions and/or serious problems across several of the platforms. Things are slowing down nicely on fix rates for 3.10" * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: ARM: exynos: add debug_ll_io_init() call in exynos_init_io() ARM: EXYNOS: uncompress - print debug messages if DEBUG_LL is defined ARM: shmobile: sh73a0: Update CMT clockevent rating to 80 sh-pfc: r8a7779: Don't group USB OVC and PENC pins ARM: mxs: icoll: Fix interrupts gpio bank 0 ARM: imx: clk-imx6q: AXI clock select index is incorrect ARM: bcm2835: override the HW UART periphid ARM: mvebu: Fix bug in coherency fabric low level init function ARM: Kirkwood: TS219: Fix crash by double PCIe instantiation ARM: ux500: Provide supplies for AUX1, AUX2 and AUX3 ARM: ux500: Only configure wake-up reasons on ux500 based platforms ARM: dts: imx: fix clocks for cspi ARM i.MX6q: fix for ldb_di_sels
-
git://git.linux-mips.org/pub/scm/ralf/upstream-linusLinus Torvalds authored
Pull MIPS fixes from Ralf Baechle: "MIPS fixes across the field. The only area that's standing out is the exception handling which received it's dose of breakage as part of the microMIPS patchset" * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: MIPS: ralink: add missing SZ_1M multiplier MIPS: Compat: Fix cputime_to_timeval() arguments in compat binfmt_elf. MIPS: OCTEON: Improve _machine_halt implementation. MIPS: rtlx: Fix implicit declaration of function set_vi_handler() MIPS: Trap exception handling fixes MIPS: Quit exposing Kconfig symbols in uapi headers. MIPS: Remove duplicate definition of check_for_high_segbits.
-
git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommuLinus Torvalds authored
Pull m68knommu fix from Greg Ungerer: "A single fix for compilation breakage to many of the ColdFire CPU targets" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu: m68k: only use local gpio_request_one if not using GPIOLIB
-
git://people.freedesktop.org/~airlied/linuxLinus Torvalds authored
Pull drm fixes from Dave Airlie: "Regression fixers for the big 3: - nouveau: hdmi audio, dac load detect, s/r regressions fixed - radeon: long standing system hang fixed, hdmi audio and rs780 fast fb fixes - intel: one old regression, a WARN removal, and a stop X dying fix Otherwise one mgag200 fix, a couple of arm build fixes, and a core use after free fix." * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: drm/nv50/kms: use dac loadval from vbios, where it's available drm/nv50/disp: force dac power state during load detect drm/nv50-nv84/fifo: fix resume regression introduced by playlist race fix drm/nv84/disp: Fix HDMI audio regression drm/i915/sdvo: Use &intel_sdvo->ddc instead of intel_sdvo->i2c for DDC. drm/radeon: don't allow audio on DCE6 drm/radeon: Use direct mapping for fast fb access on RS780/RS880 (v2) radeon: Fix system hang issue when using KMS with older cards drm/i915: no lvds quirk for hp t5740 drm/i915: Quirk the pipe A quirk in the modeset state checker drm/i915: Fix spurious -EIO/SIGBUS on wedged gpus drm/mgag200: Add missing write to index before accessing data register drm/nouveau: use mdelay instead of large udelay constants drm/tilcd: select BACKLIGHT_LCD_SUPPORT drm: fix a use-after-free when GPU acceleration disabled
-
git://git.infradead.org/users/vkoul/slave-dmaLinus Torvalds authored
Pull slave-dmaengine fixes from Vinod Koul: "Fix from Andy is for dmatest regression reported by Will and Rabin has fixed runtime ref counting for st_dma40" * 'fixes' of git://git.infradead.org/users/vkoul/slave-dma: dmatest: do not allow to interrupt ongoing tests dmaengine: ste_dma40: fix pm runtime ref counting
-
Linus Torvalds authored
Merge tag 'trace-fixes-v3.10-rc3-v3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "This contains 4 fixes. The first two fix the case where full RCU debugging is enabled, enabling function tracing causes a live lock of the system. This is due to the added debug checks in rcu_dereference_raw() that is used by the function tracer. These checks are also traced by the function tracer as well as cause enough overhead to the function tracer to slow down the system enough that the time to finish an interrupt can take longer than when the next interrupt is triggered, causing a live lock from the timer interrupt. Talking this over with Paul McKenney, we came up with a fix that adds a new rcu_dereference_raw_notrace() that does not perform these added checks, and let the function tracer use that. The third commit fixes a failed compile when branch tracing is enabled, due to the conversion of the trace_test_buffer() selftest that the branch trace wasn't converted for. The forth patch fixes a bug caught by the RCU lockdep code where a rcu_read_lock() is performed when rcu is disabled (either going to or from idle, or user space). This happened on the irqsoff tracer as it calls task_uid(). The fix here was to use current_uid() when possible that doesn't use rcu locking. Which luckily, is always used when irqsoff calls this code." * tag 'trace-fixes-v3.10-rc3-v3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Use current_uid() for critical time tracing tracing: Fix bad parameter passed in branch selftest ftrace: Use the rcu _notrace variants for rcu_dereference_raw() and friends rcu: Add _notrace variation of rcu_dereference_raw() and hlist_for_each_entry_rcu()
-
Rafael J. Wysocki authored
Commit 9f29ab11 ("ACPI / scan: do not match drivers against objects having scan handlers") introduced a boot regression on Tony's ia64 HP rx2600. Tony says: "It panics with the message: Kernel panic - not syncing: Unable to find SBA IOMMU: Try a generic or DIG kernel [...] my problem comes from arch/ia64/hp/common/sba_iommu.c where the code in sba_init() says: acpi_bus_register_driver(&acpi_sba_ioc_driver); if (!ioc_list) { but because of this change we never managed to call ioc_init() so ioc_list doesn't get set up, and we die." Revert it to avoid this breakage and we'll fix the problem it attempted to address later. Reported-by: Tony Luck <tony.luck@gmail.com> Cc: 3.9+ <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
git://git.linaro.org/people/shawnguo/linux-2.6Olof Johansson authored
From Shawn Guo, mxs fixes for 3.10: - Since the time we move to MULTI_IRQ_HANDLER, the 0x7f polling for no interrupt in icoll_handle_irq() becomes insane, because 0x7f is an valid interrupt number, the irq of gpio bank 0. That unnecessary polling results in the driver not detecting when irq 0x7f is active which makes the machine effectively dead lock. The fix removes the interrupt poll loop and allows usage of gpio0 interrupt without an infinite loop. * tag 'mxs-fixes-3.10' of git://git.linaro.org/people/shawnguo/linux-2.6: ARM: mxs: icoll: Fix interrupts gpio bank 0 Signed-off-by: Olof Johansson <olof@lixom.net>
-
git://git.linaro.org/people/shawnguo/linux-2.6Olof Johansson authored
From Shawn Guo, imx fixes for 3.10, take 2: - One device tree fix for all spi node to have per clock added. The clock is needed by spi driver to calculate bit rate divisor. The spi node in the current device trees either does not have the clock or is defined as dummy clock, in which case the driver probe will fail or spi will run at a wrong bit rate. - Two imx6q clock fixes, which correct axi_sels and ldb_di_sels. * tag 'imx-fixes-3.10-2' of git://git.linaro.org/people/shawnguo/linux-2.6: ARM: imx: clk-imx6q: AXI clock select index is incorrect ARM: dts: imx: fix clocks for cspi ARM i.MX6q: fix for ldb_di_sels Signed-off-by: Olof Johansson <olof@lixom.net>
-
Doug Anderson authored
If the early MMU mapping of the UART happens to get booted out of the TLB between the start of paging_init() and when we finally re-add the UART at the very end of s3c_init_cpu(), we'll get a hang at bootup if we've got early_printk enabled. Avoid this hang by calling debug_ll_io_init() early. Without this patch, you can reliably reproduce a hang when early printk is enabled by adding flush_tlb_all() at the start of exynos_init_io(). After this patch the hang goes away. Signed-off-by: Doug Anderson <dianders@chromium.org> Acked-by: Kukjin Kim <kgene.kim@samsung.com> Signed-off-by: Olof Johansson <olof@lixom.net>
-
Olof Johansson authored
Merge tag 'renesas-fixes-for-v3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into fixes From Simon Horman, Renesas ARM based SoC fixes for v3.10: - Correction to USB OVC and PENC pin groupings on r8a7779 SoC. This avoids conflicts when the USB_OVCn pins are used by another function. This has been observed to be a problem in v3.10-rc1. - Update CMT clock rating for sh73a0 SoC to resolve boot failure on kzm9g-reference. This resolves a regression between v3.9 and v3.10-rc1. * tag 'renesas-fixes-for-v3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas: ARM: shmobile: sh73a0: Update CMT clockevent rating to 80 sh-pfc: r8a7779: Don't group USB OVC and PENC pins Signed-off-by: Olof Johansson <olof@lixom.net>
-
Tushar Behera authored
Printing low-level debug messages make an assumption that the specified UART port has been preconfigured by the bootloader. Incorrectly specified UART port results in system getting stalled while printing the message "Uncompressing Linux... done, booting the kernel" This UART port number is specified through S3C_LOWLEVEL_UART_PORT. Since the UART port might different for different board, it is not possible to specify it correctly for every board that use a common defconfig file. Calling this print subroutine only when DEBUG_LL fixes the problem. By disabling DEBUG_LL in default config file, we would be able to boot multiple boards with different default UART ports. With this current approach, we miss the print "Uncompressing Linux... done, booting the kernel." when DEBUG_LL is not defined. Signed-off-by: Tushar Behera <tushar.behera@linaro.org> Signed-off-by: Olof Johansson <olof@lixom.net>
-
- 07 Jun, 2013 2 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/roland/infinibandLinus Torvalds authored
Pull infiniband fixes from Roland Dreier: - qib RCU/lockdep fix - iser device removal fix, plus doc fixes * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: IB/qib: Fix lockdep splat in qib_alloc_lkey() MAINTAINERS: Add entry for iSCSI Extensions for RDMA (iSER) initiator IB/iser: Add Mellanox copyright IB/iser: Fix device removal flow
-
git://github.com/awilliam/linux-vfioLinus Torvalds authored
Pull vfio fix from Alex Williamson: "fix rmmod crash" * tag 'vfio-v3.10-rc5' of git://github.com/awilliam/linux-vfio: vfio: fix crash on rmmod
-