    rcu: Make rcu_barrier() understand about missing rcuo kthreads · d7e29933
    Paul E. McKenney authored
    Commit 35ce7f29 (rcu: Create rcuo kthreads only for onlined CPUs)
    avoids creating rcuo kthreads for CPUs that never come online.  This
    fixes a bug in many instances of firmware: Instead of lying about their
    age, these systems lie about the number of CPUs that they have.
    Before commit 35ce7f29, this could result in huge numbers of useless
    rcuo kthreads being created.
    
    Experience indicates that I should have told the people suffering
    from this problem to fix their broken firmware, but I instead
    produced what turned out to be a partial fix.  The missing
    piece supplied by this commit makes sure that rcu_barrier() knows not to
    post callbacks for no-CBs CPUs that have not yet come online, because
    otherwise rcu_barrier() will hang on systems having firmware that lies
    about the number of CPUs.
    
    It is tempting to simply have rcu_barrier() refuse to post a callback on
    any no-CBs CPU that does not have an rcuo kthread.  This unfortunately
    does not work because rcu_barrier() is required to wait for all pending
    callbacks.  It is therefore required to wait even for those callbacks
    that cannot possibly be invoked.  Even if doing so hangs the system.
    
    Given that posting a callback to a no-CBs CPU that does not yet have an
    rcuo kthread can hang rcu_barrier(), it is tempting to report an error
    in this case.  Unfortunately, this will result in false positives at
    boot time, when it is perfectly legal to post callbacks to the boot CPU
    before the scheduler has started, in other words, before it is legal
    to invoke rcu_barrier().
    
    So this commit instead has rcu_barrier() avoid posting callbacks to
    CPUs having neither rcuo kthread nor pending callbacks, and has it
    complain bitterly if it finds CPUs having no rcuo kthread but some
    pending callbacks.  And when rcu_barrier() does find CPUs having no rcuo
    kthread but pending callbacks, as noted earlier, it has no choice but
    to hang indefinitely.
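
The per-CPU decision described above can be sketched as a small userspace model. This is only an illustration of the three cases named in this message, not the kernel code: struct nocb_cpu, enum barrier_action, barrier_decide() and barrier_scan() are hypothetical names invented for the sketch, and the real implementation works on RCU's own per-CPU data.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of one no-CBs CPU; not a kernel structure. */
struct nocb_cpu {
	bool has_rcuo_kthread;	/* rcuo kthread has been created */
	bool has_pending_cbs;	/* callbacks are already queued */
};

enum barrier_action {
	BARRIER_POST,		/* post the barrier callback as usual */
	BARRIER_SKIP,		/* nothing queued, nobody to invoke it */
	BARRIER_POST_AND_WARN,	/* must wait anyway; complain, may hang */
};

/* Decide what the barrier does for a single no-CBs CPU. */
static enum barrier_action barrier_decide(const struct nocb_cpu *cpu)
{
	if (cpu->has_rcuo_kthread)
		return BARRIER_POST;
	if (!cpu->has_pending_cbs)
		return BARRIER_SKIP;
	/* No kthread but pending callbacks: cannot be skipped. */
	return BARRIER_POST_AND_WARN;
}

/*
 * Walk an array of no-CBs CPUs.  Returns how many barrier callbacks
 * would be posted and stores the number of complaints in *nwarn.
 */
static size_t barrier_scan(const struct nocb_cpu *cpus, size_t n,
			   size_t *nwarn)
{
	size_t nposted = 0;

	assert(cpus != NULL && nwarn != NULL);
	*nwarn = 0;
	for (size_t i = 0; i < n; i++) {
		switch (barrier_decide(&cpus[i])) {
		case BARRIER_SKIP:
			break;
		case BARRIER_POST_AND_WARN:
			(*nwarn)++;
			nposted++;
			break;
		case BARRIER_POST:
			nposted++;
			break;
		}
	}
	return nposted;
}
```

In this model a never-onlined CPU invented by lying firmware has neither kthread nor callbacks, so it lands in BARRIER_SKIP and no longer stalls the barrier. Only the case that cannot be waited out safely produces a complaint.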
    
    Reported-by: Yanko Kaneti <yaneti@declera.com>
    Reported-by: Jay Vosburgh <jay.vosburgh@canonical.com>
    Reported-by: Meelis Roos <mroos@linux.ee>
    Reported-by: Eric B Munson <emunson@akamai.com>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    Tested-by: Eric B Munson <emunson@akamai.com>
    Tested-by: Jay Vosburgh <jay.vosburgh@canonical.com>
    Tested-by: Yanko Kaneti <yaneti@declera.com>
    Tested-by: Kevin Fenzi <kevin@scrye.com>
    Tested-by: Meelis Roos <mroos@linux.ee>