    rxrpc: Fix lockup due to no error backoff after ack transmit error · c7e86acf
    David Howells authored
    If the network becomes (partially) unavailable, say by disabling IPv6, the
    background ACK transmission routine can get itself into a tizzy by
    proposing immediate ACK retransmission.  Since we're in the call event
    processor, that happens immediately without returning to the workqueue
    manager.
    
    The condition should clear after a while when either the network comes back
    or the call times out.
    
    Fix this by:
    
     (1) When re-proposing an ACK on failed Tx, don't schedule it immediately.
         This will allow a certain amount of time to elapse before we try
         again.
    
     (2) Enforce a return to the workqueue manager after a certain number of
         iterations of the call processing loop.
    
     (3) Add a backoff delay that increases the delay on deferred ACKs by a
         jiffy per failed transmission to a limit of HZ.  The backoff delay is
         cleared on a successful return from kernel_sendmsg().
    
     (4) Cancel calls immediately if the opening sendmsg fails.  The layer
         above can arrange retransmission or rotate to another server.
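
    Points (2) and (3) can be sketched in user-space C roughly as follows.
    This is only an illustration of the idea, not the patch itself: the
    struct, the function names, the HZ value and the loop cap MAX_LOOPS are
    all assumptions made for the example.

```c
#include <stdbool.h>

#define HZ        250  /* assumed tick rate; the real value is config-dependent */
#define MAX_LOOPS 4    /* hypothetical cap on call-processing iterations */

/* Hypothetical stand-in for the per-call state. */
struct call_sketch {
    unsigned long tx_backoff;  /* extra jiffies added to deferred ACKs */
};

/*
 * Record the outcome of a transmit attempt: grow the backoff by one jiffy
 * per failure, up to HZ, and clear it again on success.
 */
static void note_tx_result(struct call_sketch *call, int ret)
{
    if (ret < 0) {
        if (call->tx_backoff < HZ)
            call->tx_backoff++;
    } else {
        call->tx_backoff = 0;
    }
}

/* Delay to use when re-proposing a deferred ACK: base delay plus backoff. */
static unsigned long ack_delay(const struct call_sketch *call,
                               unsigned long base)
{
    return base + call->tx_backoff;
}

/*
 * Handle pending events, but give up after MAX_LOOPS iterations so that
 * control returns to the caller (the workqueue manager in the real code).
 * Returns true if work remains and the call should be requeued.
 */
static bool process_events(int pending, int *handled)
{
    int loops = 0;

    *handled = 0;
    while (pending > 0) {
        if (loops++ >= MAX_LOOPS)
            return true;   /* too many iterations: requeue instead */
        pending--;
        (*handled)++;
    }
    return false;
}
```

    With these assumptions, repeated failures push the ACK delay up by one
    jiffy at a time until it saturates at base + HZ, and a single successful
    send resets it to the base delay.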
    
    Fixes: 248f219c ("rxrpc: Rewrite the data and ack handling code")
    Signed-off-by: David Howells <dhowells@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>