• David Howells's avatar
    rxrpc: Fix missing notification · 5ac0d622
    David Howells authored
    Under some circumstances, rxrpc will fail a transmit a packet through the
    underlying UDP socket (ie. UDP sendmsg returns an error).  This may result
    in a call getting stuck.
    
    In the instance being seen, where AFS tries to send a probe to the Volume
    Location server, tracepoints show the UDP Tx failure (in this case returing
    error 99 EADDRNOTAVAIL) and then nothing more:
    
     afs_make_vl_call: c=0000015d VL.GetCapabilities
     rxrpc_call: c=0000015d NWc u=1 sp=rxrpc_kernel_begin_call+0x106/0x170 [rxrpc] a=00000000dd89ee8a
     rxrpc_call: c=0000015d Gus u=2 sp=rxrpc_new_client_call+0x14f/0x580 [rxrpc] a=00000000e20e4b08
     rxrpc_call: c=0000015d SEE u=2 sp=rxrpc_activate_one_channel+0x7b/0x1c0 [rxrpc] a=00000000e20e4b08
     rxrpc_call: c=0000015d CON u=2 sp=rxrpc_kernel_begin_call+0x106/0x170 [rxrpc] a=00000000e20e4b08
     rxrpc_tx_fail: c=0000015d r=1 ret=-99 CallDataNofrag
    
    The problem is that if the initial packet fails and the retransmission
    timer hasn't been started, the call is set to completed and an error is
    returned from rxrpc_send_data_packet() to rxrpc_queue_packet().  Though
    rxrpc_instant_resend() is called, this does nothing because the call is
    marked completed.
    
    So rxrpc_notify_socket() isn't called and the error is passed back up to
    rxrpc_send_data(), rxrpc_kernel_send_data() and thence to afs_make_call()
    and afs_vl_get_capabilities() where it is simply ignored because it is
    assumed that the result of a probe will be collected asynchronously.
    
    Fileserver probing is similarly affected via afs_fs_get_capabilities().
    
    Fix this by always issuing a notification in __rxrpc_set_call_completion()
    if it shifts a call to the completed state, even if an error is also
    returned to the caller through the function return value.
    
    Also put in a little bit of optimisation to avoid taking the call
    state_lock and disabling softirqs if the call is already in the completed
    state and remove some now redundant rxrpc_notify_socket() calls.
    
    Fixes: f5c17aae ("rxrpc: Calls should only have one terminal state")
    Reported-by: default avatarGerry Seidman <gerry@auristor.com>
    Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
    Reviewed-by: default avatarMarc Dionne <marc.dionne@auristor.com>
    5ac0d622
conn_event.c 12.3 KB