    xprtrdma: Fix disconnect regression · 8d4fb8ff
    Chuck Lever authored
    I found that injecting disconnects with v4.18-rc resulted in
    random failures of the multi-threaded git regression test.
    
    The root cause appears to be that, after a reconnect, the
    RPC/RDMA transport is waking pending RPCs before the transport has
    posted enough Receive buffers to receive the Replies. If a Reply
    arrives before enough Receive buffers are posted, the connection
    is dropped. A few connection drops happen in quick succession as
    the client and server struggle to regain credit synchronization.
    
    This regression was introduced with commit 7c8d9e7c ("xprtrdma:
    Move Receive posting to Receive handler"). The client is supposed to
    post a single Receive when a connection is established because
    it's not supposed to send more than one RPC Call before it gets
    a fresh credit grant in the first RPC Reply [RFC 8166, Section
    3.3.3].
    
    Unfortunately, there appears to be a longstanding bug in the Linux
    client's credit accounting mechanism. On connect, it simply dumps
    all pending RPC Calls onto the new connection. It's possible it has
    done this ever since the RPC/RDMA transport was added to the kernel
    ten years ago.
    
    Servers have so far been tolerant of this bad behavior. Currently no
    server implementation ever changes its credit grant over reconnects,
    and servers always repost enough Receives before connections are
    fully established.
    
    The Linux client implementation used to post a Receive before each
    of these Calls. This has covered up the flooding send behavior.
    
    I could try to correct this old bug so that the client sends exactly
    one RPC Call and waits for a Reply. Since we are so close to the
    next merge window, I'm going to instead provide a simple patch to
    post enough Receives before a reconnect completes (based on the
    number of credits granted to the previous connection).
    
    The spurious disconnects will be gone, but the client will still
    send multiple RPC Calls immediately after a reconnect.
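
    As a rough illustration of the accounting described above, the
    sketch below models a reconnect in plain C. It is not the kernel
    code: struct conn_model, post_receives(), deliver_reply(), and
    simulate_reconnect() are hypothetical names invented for this
    example. It assumes every pending Call is sent at once and that
    all Replies arrive before any further Receives are posted. In the
    real transport the first Reply that finds no Receive buffer drops
    the connection; the model just counts such Replies.

```c
/* Hypothetical model of Receive accounting across a reconnect.
 * Names and structure are illustrative only, not taken from verbs.c. */
struct conn_model {
    unsigned int posted_receives; /* Receive buffers currently posted */
    unsigned int unreceivable;    /* Replies that found no buffer */
};

/* Post `count` Receives before pending RPCs are woken. */
static void post_receives(struct conn_model *c, unsigned int count)
{
    c->posted_receives += count;
}

/* Deliver one Reply. Returns 0 if a Receive buffer was available,
 * -1 if none was posted (the real transport would disconnect here). */
static int deliver_reply(struct conn_model *c)
{
    if (c->posted_receives == 0) {
        c->unreceivable++;
        return -1;
    }
    c->posted_receives--;
    return 0;
}

/* Simulate a reconnect: post `initial_receives`, then let one Reply
 * arrive for each of `pending_calls` Calls flooded onto the new
 * connection. Returns the number of Replies that had no buffer. */
static unsigned int simulate_reconnect(unsigned int initial_receives,
                                       unsigned int pending_calls)
{
    struct conn_model c = { 0, 0 };
    unsigned int i;

    post_receives(&c, initial_receives);
    for (i = 0; i < pending_calls; i++)
        deliver_reply(&c);
    return c.unreceivable;
}
```

    With a single Receive posted and 8 pending Calls,
    simulate_reconnect(1, 8) reports 7 Replies with no buffer. Posting
    as many Receives as the previous connection's credit grant (say
    32) gives simulate_reconnect(32, 8) == 0, provided the number of
    flooded Calls does not exceed that grant.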
    
    Addressing the latter problem will wait for a merge window because
    a) I expect it to be a large change requiring lots of testing, and
    b) obviously the Linux client has interoperated successfully since
    day zero while still being broken.
    
    Fixes: 7c8d9e7c ("xprtrdma: Move Receive posting to ... ")
    Cc: stable@vger.kernel.org # v4.18+
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
verbs.c 38.8 KB