• Frederic Weisbecker's avatar
    rcu/nocb: Fix missed nocb_timer requeue · b2fcf210
    Frederic Weisbecker authored
    This sequence of events can lead to a failure to requeue a CPU's
    ->nocb_timer:
    
    1.	There are no callbacks queued for any CPU covered by CPU 0-2's
    	->nocb_gp_kthread.  Note that ->nocb_gp_kthread is associated
    	with CPU 0.
    
    2.	CPU 1 enqueues its first callback with interrupts disabled, and
    	thus must defer awakening its ->nocb_gp_kthread.  It therefore
    	queues its rcu_data structure's ->nocb_timer.  At this point,
    	CPU 1's rdp->nocb_defer_wakeup is RCU_NOCB_WAKE.
    
    3.	CPU 2, which shares the same ->nocb_gp_kthread, also enqueues a
    	callback, but with interrupts enabled, allowing it to directly
    	awaken the ->nocb_gp_kthread.
    
    4.	The newly awakened ->nocb_gp_kthread associates both CPU 1's
    	and CPU 2's callbacks with a future grace period and arranges
    	for that grace period to be started.
    
    5.	This ->nocb_gp_kthread goes to sleep waiting for the end of this
    	future grace period.
    
    6.	This grace period elapses before the CPU 1's timer fires.
    	This is normally improbably given that the timer is set for only
    	one jiffy, but timers can be delayed.  Besides, it is possible
    	that kernel was built with CONFIG_RCU_STRICT_GRACE_PERIOD=y.
    
    7.	The grace period ends, so rcu_gp_kthread awakens the
    	->nocb_gp_kthread, which in turn awakens both CPU 1's and
    	CPU 2's ->nocb_cb_kthread.  Then ->nocb_gb_kthread sleeps
    	waiting for more newly queued callbacks.
    
    8.	CPU 1's ->nocb_cb_kthread invokes its callback, then sleeps
    	waiting for more invocable callbacks.
    
    9.	Note that neither kthread updated any ->nocb_timer state,
    	so CPU 1's ->nocb_defer_wakeup is still set to RCU_NOCB_WAKE.
    
    10.	CPU 1 enqueues its second callback, this time with interrupts
     	enabled so it can wake directly	->nocb_gp_kthread.
    	It does so with calling wake_nocb_gp() which also cancels the
    	pending timer that got queued in step 2. But that doesn't reset
    	CPU 1's ->nocb_defer_wakeup which is still set to RCU_NOCB_WAKE.
    	So CPU 1's ->nocb_defer_wakeup and its ->nocb_timer are now
    	desynchronized.
    
    11.	->nocb_gp_kthread associates the callback queued in 10 with a new
    	grace period, arranges for that grace period to start and sleeps
    	waiting for it to complete.
    
    12.	The grace period ends, rcu_gp_kthread awakens ->nocb_gp_kthread,
    	which in turn wakes up CPU 1's ->nocb_cb_kthread which then
    	invokes the callback queued in 10.
    
    13.	CPU 1 enqueues its third callback, this time with interrupts
    	disabled so it must queue a timer for a deferred wakeup. However
    	the value of its ->nocb_defer_wakeup is RCU_NOCB_WAKE which
    	incorrectly indicates that a timer is already queued.  Instead,
    	CPU 1's ->nocb_timer was cancelled in 10.  CPU 1 therefore fails
    	to queue the ->nocb_timer.
    
    14.	CPU 1 has its pending callback and it may go unnoticed until
    	some other CPU ever wakes up ->nocb_gp_kthread or CPU 1 ever
    	calls an explicit deferred wakeup, for example, during idle entry.
    
    This commit fixes this bug by resetting rdp->nocb_defer_wakeup everytime
    we delete the ->nocb_timer.
    
    It is quite possible that there is a similar scenario involving
    ->nocb_bypass_timer and ->nocb_defer_wakeup.  However, despite some
    effort from several people, a failure scenario has not yet been located.
    However, that by no means guarantees that no such scenario exists.
    Finding a failure scenario is left as an exercise for the reader, and the
    "Fixes:" tag below relates to ->nocb_bypass_timer instead of ->nocb_timer.
    
    Fixes: d1b222c6 (rcu/nocb: Add bypass callback queueing)
    Cc: <stable@vger.kernel.org>
    Cc: Josh Triplett <josh@joshtriplett.org>
    Cc: Lai Jiangshan <jiangshanlai@gmail.com>
    Cc: Joel Fernandes <joel@joelfernandes.org>
    Cc: Boqun Feng <boqun.feng@gmail.com>
    Reviewed-by: default avatarNeeraj Upadhyay <neeraju@codeaurora.org>
    Signed-off-by: default avatarFrederic Weisbecker <frederic@kernel.org>
    Signed-off-by: default avatarPaul E. McKenney <paulmck@kernel.org>
    b2fcf210
tree_plugin.h 92 KB