    SUNRPC in case of backlog, hand free slots directly to waiting task · e877a88d
    NeilBrown authored
    If sunrpc.tcp_max_slot_table_entries is small and there are tasks
    on the backlog queue, then when a request completes it is freed and the
    first task on the queue is woken.  The expectation is that it will wake
    and claim that request.  However if it was a sync task and the waiting
    process was killed at just that moment, it will wake and NOT claim the
    request.
    
    As long as XPRT_CONGESTED remains set, requests can only be claimed by
    tasks woken from the backlog, and they are woken only as requests are
    freed, so when a task doesn't claim a request, no other task can ever
    get that request until XPRT_CONGESTED is cleared.  Each time this
    happens the number of available requests is decreased by one.
    
    With a sufficiently high workload and sufficiently low setting of
    max_slot (16 in the case where this was seen), XPRT_CONGESTED can remain
    set for an extended period, and the above scenario (of a process being
    killed just as its task was woken) can repeat until no requests can be
    allocated.  Then traffic stops.
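
    The slot loss described above can be illustrated with a small
    user-space model.  This is only a sketch, not kernel code: the
    struct, its fields and both functions are hypothetical names, and
    the backlog is reduced to a counter that is kept topped up to stand
    for a sustained workload, so the congested flag never clears.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the pre-patch behaviour; all names are hypothetical. */
struct toy_xprt {
    int free_slots;   /* slots sitting on the free list */
    int in_use;       /* slots attached to running tasks */
    int backlog;      /* tasks sleeping on the backlog queue */
    bool congested;   /* stand-in for the transport's congested flag */
};

/* A request completes: free its slot, then wake the first backlog task.
 * If that task was killed, it wakes but does not claim the slot. */
static void toy_complete(struct toy_xprt *x, bool waiter_killed)
{
    assert(x->in_use > 0);
    x->in_use--;
    x->free_slots++;
    if (x->backlog == 0) {
        x->congested = false;   /* nobody waiting: congestion ends */
        return;
    }
    x->backlog--;               /* first waiter is woken */
    if (waiter_killed)
        return;                 /* slot is free but unclaimable while congested */
    x->free_slots--;            /* a live waiter claims the slot */
    x->in_use++;
}

/* Return how many slots still carry traffic after `kills` completions
 * whose woken waiter had just been killed.  The backlog is refilled
 * before each completion so the congested flag stays set. */
int toy_usable_after_kills(int max_slot, int kills)
{
    struct toy_xprt x = { 0, max_slot, 0, true };

    for (int i = 0; i < kills && x.in_use > 0; i++) {
        x.backlog = 4;          /* sustained load: always someone waiting */
        toy_complete(&x, true);
    }
    return x.in_use;
}
```

    In this model toy_usable_after_kills(16, 5) leaves 11 slots in use,
    and toy_usable_after_kills(16, 16) leaves none: every slot is on the
    free list, but with no request left to complete nothing ever wakes
    the backlog again.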
    
    This patch addresses the problem by introducing a positive handover of a
    request from a completing task to a backlog task - the request is never
    freed when there is a backlog.
    
    When a task is woken it might now already have a request attached, in
    which case it is *not* freed (as with current code) but is initialised
    (if needed) and used.  If it isn't used it will eventually be freed by
    rpc_exit_task().  xprt_release() is enhanced to be able to correctly
    release an uninitialised request.
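
    The handover can be sketched in the same user-space style.  Again
    this is an illustration under assumptions rather than the patch
    itself: toy_release(), toy_wake() and the structs are hypothetical,
    and a killed waiter releases its handed-over slot immediately
    instead of going through rpc_exit_task().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_WAITERS 8

struct toy_slot { bool initialised; };

struct toy_task {
    struct toy_slot *slot;  /* request handed to this task, if any */
    bool killed;
};

struct toy_xprt {
    struct toy_task *backlog[TOY_MAX_WAITERS];
    int head, tail;         /* simple FIFO of waiting tasks */
    int free_slots;         /* slots returned to the free list */
};

static void toy_release(struct toy_xprt *x, struct toy_slot *slot);

/* A woken task either uses the slot it was handed or, if it was killed,
 * releases it again even though it was never initialised. */
static void toy_wake(struct toy_xprt *x, struct toy_task *t)
{
    if (t->killed) {
        struct toy_slot *s = t->slot;

        t->slot = NULL;
        toy_release(x, s);
        return;
    }
    t->slot->initialised = true;
}

/* Release a slot: hand it straight to the first waiter when there is a
 * backlog, and only put it on the free list when nobody is waiting. */
static void toy_release(struct toy_xprt *x, struct toy_slot *slot)
{
    slot->initialised = false;
    if (x->head == x->tail) {
        x->free_slots++;
        return;
    }
    struct toy_task *t = x->backlog[x->head++];
    t->slot = slot;
    toy_wake(x, t);
}

/* Queue `killed_first` killed waiters followed by one live waiter, then
 * complete one request.  Returns true if the live waiter ends up holding
 * the initialised slot and nothing was left on the free list. */
bool toy_handover_reaches_live_task(int killed_first)
{
    struct toy_xprt x = { {0}, 0, 0, 0 };
    struct toy_task tasks[TOY_MAX_WAITERS] = {{0}};
    struct toy_slot slot = { true };

    assert(killed_first >= 0 && killed_first < TOY_MAX_WAITERS);
    for (int i = 0; i <= killed_first; i++) {
        tasks[i].killed = (i < killed_first);
        x.backlog[x.tail++] = &tasks[i];
    }
    toy_release(&x, &slot);
    return tasks[killed_first].slot == &slot && slot.initialised
           && x.free_slots == 0;
}
```

    Because a slot that is released while tasks are waiting always
    moves to the next waiter, a killed task can only pass it along; in
    this model it can never be stranded on the free list.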
    
    Fixes: ba60eb25 ("SUNRPC: Fix a livelock problem in the xprt->backlog queue")
    Signed-off-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>