    rcu/nocb: Provide separate no-CBs grace-period kthreads · 12f54c3a
    Paul E. McKenney authored
    Currently, there is one no-CBs rcuo kthread per CPU, and these kthreads
    are divided into groups.  The first rcuo kthread to come online in a
    given group is that group's leader, and the leader both waits for grace
    periods and invokes its CPU's callbacks.  The non-leader rcuo kthreads
    only invoke callbacks.
    
    This works well in the real-time/embedded environments for which it was
    intended because such environments tend not to generate all that many
    callbacks.  However, given huge floods of callbacks, it is possible for
    the leader kthread to be stuck invoking callbacks while its followers
    wait helplessly and their callbacks pile up.  This is a good recipe
    for an OOM, and rcutorture's new callback-flood capability does generate
    such OOMs.
    
    One strategy would be to wait until such OOMs start happening in
    production, but similar OOMs have in fact happened starting in 2018.
    It would therefore be wise to take a more proactive approach.
    
    This commit therefore features per-CPU rcuo kthreads that do nothing
    but invoke callbacks.  Instead of having one of these kthreads act as
    leader, each group has a separate rcuog kthread that handles grace periods
    for its group.  Because these rcuog kthreads do not invoke callbacks,
    callback floods on one CPU no longer block callbacks from reaching the
    rcuo callback-invocation kthreads on other CPUs.
    
    This change does introduce additional kthreads, however:
    
    1.	The number of additional kthreads is about the square root of
    	the number of CPUs, so that a 4096-CPU system would have only
    	about 64 additional kthreads.  Note that recent changes
    	decreased the number of rcuo kthreads by a factor of two
    	(CONFIG_PREEMPT=n) or even three (CONFIG_PREEMPT=y), so
    	this still represents a significant improvement on most systems.
    
    2.	The leading "rcuo" of the rcuog kthreads should allow existing
    	scripting to affinity these additional kthreads as needed, the
    	same as for the rcuop and rcuos kthreads.  (There are no longer
    	any rcuob kthreads.)
    
    3.	A state-machine approach was considered and rejected.  Although
    	this would allow the rcuo kthreads to continue their dual
    	leader/follower roles, it complicates callback invocation
    	and makes it more difficult to consolidate rcuo callback
    	invocation with existing softirq callback invocation.
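
The square-root estimate in item 1 can be checked with a small userspace sketch. It assumes that CPUs are split into groups of roughly sqrt(nr_cpus) CPUs each, with one grace-period kthread per group. The helper names isqrt() and nr_gp_kthreads() are hypothetical and exist only for this illustration; the grouping rule is an assumption, not code quoted from the patch.

```c
/* Integer square root: largest r such that r * r <= n. */
static unsigned int isqrt(unsigned int n)
{
	unsigned int r = 0;

	/* Widen the product so it cannot overflow for large n. */
	while ((unsigned long long)(r + 1) * (r + 1) <= n)
		r++;
	return r;
}

/*
 * Hypothetical helper: number of grace-period (rcuog) kthreads,
 * assuming groups of about sqrt(nr_cpus) CPUs, one kthread per group.
 */
static unsigned int nr_gp_kthreads(unsigned int nr_cpus)
{
	unsigned int stride;

	if (nr_cpus == 0)
		return 0;
	stride = nr_cpus / isqrt(nr_cpus);	/* CPUs per group */
	return (nr_cpus + stride - 1) / stride;	/* round up to whole groups */
}
```

Under these assumptions, nr_gp_kthreads(4096) evaluates to 64, which matches the figure given in item 1, and nr_gp_kthreads(64) evaluates to 8.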
    
    The introduction of rcuog kthreads should thus be acceptable.
    Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>