    powerpc/rtas: retry when cpu offline races with suspend/migration · 9fb60305
    Nathan Lynch authored
    The protocol for suspending or migrating an LPAR requires all present
    processor threads to enter H_JOIN. So if we have threads offline, we
    have to temporarily bring them up. This can race with administrator
    actions such as SMT state changes. As of dfd718a2 ("powerpc/rtas:
    Fix a potential race between CPU-Offline & Migration"),
    rtas_ibm_suspend_me() accounts for this, but errors out with -EBUSY
    for what almost certainly is a transient condition in any reasonable
    scenario.
    
    Callers of rtas_ibm_suspend_me() already retry when -EAGAIN is
    returned, and it is typical during a migration for that to happen
    repeatedly for several minutes polling the H_VASI_STATE hcall result
    before proceeding to the next stage.
    
    So return -EAGAIN instead of -EBUSY when this race is
    encountered. Additionally, logging this event is still appropriate,
    but use pr_info() instead of pr_err(), and drop the unlikely()
    annotation while here, since this is not a hot path at all.
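
    The effect on callers can be illustrated with a small userspace
    sketch. This is not the kernel code: sim_state, sim_suspend_me() and
    sim_migrate() are hypothetical names invented for the illustration,
    and fprintf() stands in for the informational log message. It only
    shows why returning -EAGAIN lets an existing retry loop absorb the
    transient race, where -EBUSY would have ended the operation.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for system state; not a real kernel structure. */
struct sim_state {
	int offline_races_left;	/* attempts that will still observe the race */
};

/*
 * Models the behavior described above: when a CPU offline operation
 * races with the suspend attempt, report a retryable condition
 * (-EAGAIN) instead of a hard failure (-EBUSY).
 */
static int sim_suspend_me(struct sim_state *s)
{
	if (s->offline_races_left > 0) {
		s->offline_races_left--;
		/* Informational only: the condition is expected to clear. */
		fprintf(stderr, "info: cpu offline raced with suspend, retry\n");
		return -EAGAIN;
	}
	return 0;
}

/*
 * Hypothetical caller: keeps retrying while -EAGAIN is returned, up to
 * max_attempts. Any other return value ends the loop immediately.
 */
static int sim_migrate(struct sim_state *s, int max_attempts, int *attempts_used)
{
	int rc = -EAGAIN;
	int n = 0;

	while (n < max_attempts) {
		n++;
		rc = sim_suspend_me(s);
		if (rc != -EAGAIN)
			break;
	}
	*attempts_used = n;
	return rc;
}
```

    With two racing attempts and a budget of ten, sim_migrate() succeeds
    on the third try. Had sim_suspend_me() returned -EBUSY, the same
    loop would have stopped after the first attempt.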
    
    Fixes: dfd718a2 ("powerpc/rtas: Fix a potential race between CPU-Offline & Migration")
    Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>