    rcu: Fix long-grace-period race between forcing and initialization · 83f5b01f
    Paul E. McKenney authored
    Very long RCU read-side critical sections (50 milliseconds or
    so) can cause a race between force_quiescent_state() and
    rcu_start_gp() as follows on kernel builds with multi-level
    rcu_node hierarchies:
    
    1.	CPU 0 calls force_quiescent_state(), sees that there is a
    	grace period in progress, and acquires ->fqslock.
    
    2.	CPU 1 detects the end of the grace period, and so
    	cpu_quiet_msk_finish() sets rsp->completed to rsp->gpnum.
    	This operation is carried out under the root rnp->lock,
    	but CPU 0 has not yet acquired that lock.  Note that
    	rsp->signaled is still RCU_SAVE_DYNTICK from the last
    	grace period.
    
    3.	CPU 1 calls rcu_start_gp(), but no one wants a new grace
    	period, so it drops the root rnp->lock and returns.
    
    4.	CPU 0 acquires the root rnp->lock and picks up rsp->completed
    	and rsp->signaled, then drops rnp->lock.  It then enters the
    	RCU_SAVE_DYNTICK leg of the switch statement.
    
    5.	CPU 2 invokes call_rcu(), and now needs a new grace period.
    	It calls rcu_start_gp(), which acquires the root rnp->lock, sets
    	rsp->signaled to RCU_GP_INIT (too bad that CPU 0 is already in
    	the RCU_SAVE_DYNTICK leg of the switch statement!)  and starts
    	initializing the rcu_node hierarchy.  If there are multiple
    	levels to the hierarchy, it will drop the root rnp->lock and
    	initialize the lower levels of the hierarchy.
    
    6.	CPU 0 notes that rsp->completed has not changed since it
    	sampled it in step 4, and so treats its snapshot as still
    	valid.  This permits both CPU 2 and CPU 0 to try updating
    	rsp->signaled concurrently.  If CPU 0's update prevails, later
    	calls to force_quiescent_state() can count old quiescent states
    	against the new grace period, which can in turn result in
    	premature ending of grace periods.
    
    	Not good.
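
The interleaving in steps 2 through 6 can be replayed deterministically in a single-threaded userspace model. The sketch below is an illustration only, not kernel code: the struct, the model_* helpers, and replay_race() are hypothetical names, locking is omitted, and the only check modeled in step 6 is the comparison of rsp->completed described above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the pre-patch states named in the text. */
enum gp_state { RCU_GP_INIT, RCU_SAVE_DYNTICK, RCU_FORCE_QS };

struct rcu_state_model {
	long gpnum;		/* number of the most recently started GP */
	long completed;		/* number of the most recently completed GP */
	enum gp_state signaled;	/* force_quiescent_state() progress */
};

struct fqs_snapshot {
	long completed;
	enum gp_state signaled;
};

/* Step 2: the grace period ends, but ->signaled is left stale. */
static void model_gp_end(struct rcu_state_model *rsp)
{
	rsp->completed = rsp->gpnum;
}

/* Step 4: CPU 0 samples ->completed and ->signaled, then drops the lock. */
static struct fqs_snapshot model_fqs_sample(const struct rcu_state_model *rsp)
{
	struct fqs_snapshot snap = { rsp->completed, rsp->signaled };

	return snap;
}

/* Step 5: CPU 2 starts a new grace period. */
static void model_start_gp(struct rcu_state_model *rsp)
{
	assert(rsp->completed == rsp->gpnum);	/* no GP in progress */
	rsp->gpnum++;
	rsp->signaled = RCU_GP_INIT;
}

/* Step 6: CPU 0 checks only ->completed before advancing ->signaled. */
static bool model_fqs_advance(struct rcu_state_model *rsp,
			      struct fqs_snapshot snap)
{
	if (snap.signaled != RCU_SAVE_DYNTICK)
		return false;
	if (rsp->completed != snap.completed)
		return false;
	rsp->signaled = RCU_FORCE_QS;	/* overwrites CPU 2's RCU_GP_INIT */
	return true;
}

/* Returns true if the stale snapshot clobbered the initialization state. */
static bool replay_race(void)
{
	struct rcu_state_model rsp = {
		.gpnum = 1, .completed = 0, .signaled = RCU_SAVE_DYNTICK,
	};
	struct fqs_snapshot snap;

	model_gp_end(&rsp);			/* step 2 (CPU 1) */
	snap = model_fqs_sample(&rsp);		/* step 4 (CPU 0) */
	model_start_gp(&rsp);			/* step 5 (CPU 2) */
	return model_fqs_advance(&rsp, snap);	/* step 6 (CPU 0) */
}
```

In this model replay_race() returns true: starting a new grace period changes gpnum but not completed, so the step-6 comparison cannot detect that initialization has begun.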
    
    This patch adds an RCU_GP_IDLE state for rsp->signaled that is
    set initially at boot time and any time a grace period ends.
    This prevents CPU 0 from getting into the workings of
    force_quiescent_state() in step 4.  Additional locking and
    checks prevent the concurrent update of rsp->signaled in step 6.
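
The effect of the new state can be sketched in the same kind of single-threaded userspace model. This is an assumption-laden illustration rather than the patch itself: the model_* helpers, the interleave callback, and replay_with_fix() are hypothetical, locking is not modeled, and the re-check of rsp->signaled merely stands in for the "additional locking and checks", whose exact form the text above does not spell out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model; RCU_GP_IDLE is the state added by this patch. */
enum gp_state { RCU_GP_IDLE, RCU_GP_INIT, RCU_SAVE_DYNTICK, RCU_FORCE_QS };

struct rcu_state_model {
	long gpnum;
	long completed;
	enum gp_state signaled;
};

/* Grace-period end now also resets ->signaled. */
static void model_gp_end(struct rcu_state_model *rsp)
{
	rsp->completed = rsp->gpnum;
	rsp->signaled = RCU_GP_IDLE;
}

static void model_start_gp(struct rcu_state_model *rsp)
{
	assert(rsp->signaled == RCU_GP_IDLE);	/* only start from idle */
	rsp->gpnum++;
	rsp->signaled = RCU_GP_INIT;
}

/*
 * Model of one forcing pass.  The interleave callback stands in for
 * another CPU running between the sample and the update.  Returns true
 * if the pass did any forcing work.
 */
static bool model_force_qs(struct rcu_state_model *rsp,
			   void (*interleave)(struct rcu_state_model *))
{
	long lastcomp = rsp->completed;
	enum gp_state signaled = rsp->signaled;

	if (interleave != NULL)
		interleave(rsp);

	switch (signaled) {
	case RCU_GP_IDLE:
	case RCU_GP_INIT:
		return false;	/* idle or initializing: nothing to force */
	case RCU_SAVE_DYNTICK:
		/* Assumed re-check: state must be unchanged to advance. */
		if (rsp->completed == lastcomp && rsp->signaled == signaled) {
			rsp->signaled = RCU_FORCE_QS;
			return true;
		}
		return false;
	case RCU_FORCE_QS:
		return true;	/* quiescent-state scan not modeled */
	}
	return false;
}

/* Same interleaving as the race; true if initialization survives. */
static bool replay_with_fix(void)
{
	struct rcu_state_model rsp = {
		.gpnum = 1, .completed = 0, .signaled = RCU_SAVE_DYNTICK,
	};

	model_gp_end(&rsp);
	return !model_force_qs(&rsp, model_start_gp) &&
	       rsp.signaled == RCU_GP_INIT;
}
```

Here the forcing pass samples RCU_GP_IDLE instead of a stale RCU_SAVE_DYNTICK, so it returns without touching rsp->signaled and the RCU_GP_INIT written by the new grace period is preserved.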

    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    LKML-Reference: <1256742889199-git-send-email->
    Signed-off-by: Ingo Molnar <mingo@elte.hu>