    xprtrdma: Fix FRWR invalidation error recovery · 8d75483a
    Chuck Lever authored
    When ib_post_send() fails, all LOCAL_INV WRs past @bad_wr have to be
    examined, and the MRs reset by hand.
    
    I'm not sure how the existing code can work by comparing R_keys.
    Restructure the logic so that instead it walks the chain of WRs,
    starting from the first bad one.
    
    Make sure to wait for completion if at least one WR was actually
    posted. Otherwise, if ib_post_send() fails, we can end up
    DMA-unmapping the MR while LOCAL_INV operations are in flight.
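
The recovery flow described above can be sketched in user-space C. This is
only an illustration, not the kernel code: mock_wr, mock_mr,
mock_post_send(), invalidate_all() and demo() are hypothetical stand-ins,
and the "wait" is modeled as a flag rather than a real completion. It
shows the two rules from the message: wait if at least one WR was posted
(that is, if the first bad WR is not the first WR in the chain), and on
failure walk the chain from the first bad WR to reset each MR by hand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_WRS 8

/* Hypothetical stand-ins for the real MR and WR types. */
enum mr_state { MR_VALID, MR_IN_FLIGHT, MR_RESET };

struct mock_mr { enum mr_state state; };

struct mock_wr {
	struct mock_wr *next;
	struct mock_mr *mr;
};

struct result {
	size_t reset_by_hand;	/* MRs reset because their WR was never posted */
	bool waited;		/* true if we would wait for a completion */
};

/* Mock post call (assumption): accepts the first 'accept' WRs, then fails.
 * On failure, *bad_wr points at the first WR that was not posted. */
static int mock_post_send(struct mock_wr *first, size_t accept,
			  struct mock_wr **bad_wr)
{
	struct mock_wr *wr = first;
	size_t i;

	for (i = 0; wr; wr = wr->next, i++) {
		if (i == accept) {
			*bad_wr = wr;
			return -1;
		}
		wr->mr->state = MR_IN_FLIGHT;
	}
	*bad_wr = NULL;
	return 0;
}

static struct result invalidate_all(struct mock_wr *first, size_t accept)
{
	struct result res = { 0, false };
	struct mock_wr *bad_wr = NULL;
	struct mock_wr *wr;
	int rc = mock_post_send(first, accept, &bad_wr);

	/* At least one WR was posted, so invalidations may be in flight:
	 * wait before anything is DMA-unmapped. */
	if (bad_wr != first)
		res.waited = true;

	/* On failure, walk the chain starting at the first bad WR. */
	if (rc) {
		for (wr = bad_wr; wr; wr = wr->next) {
			wr->mr->state = MR_RESET;
			res.reset_by_hand++;
		}
	}
	return res;
}

/* Build a chain of n WRs and let the mock accept only the first 'accept'. */
static struct result demo(size_t n, size_t accept)
{
	struct mock_mr mrs[MAX_WRS];
	struct mock_wr wrs[MAX_WRS];
	size_t i;

	assert(n > 0 && n <= MAX_WRS);
	for (i = 0; i < n; i++) {
		mrs[i].state = MR_VALID;
		wrs[i].mr = &mrs[i];
		wrs[i].next = (i + 1 < n) ? &wrs[i + 1] : NULL;
	}
	return invalidate_all(&wrs[0], accept);
}
```

For example, demo(4, 2) models a post that fails at the third of four WRs:
the sketch waits for the two posted WRs and resets the remaining two MRs.
With demo(4, 0) nothing was posted, so it does not wait and resets all four.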
    
    Commit 7a89f9c6 ("xprtrdma: Honor ->send_request API contract")
    added the rdma_disconnect() call site. The disconnect actually
    causes more problems than it solves, and SQ overruns happen only as
    a result of software bugs. So remove it.
    
    Fixes: d7a21c1b ("xprtrdma: Reset MRs in frwr_op_unmap_sync()")
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
frwr_ops.c 15.7 KB