    xprtrdma: Faster server reboot recovery · b2dde94b
    Chuck Lever authored
    In a cluster failover scenario, it is desirable for the client to
    attempt to reconnect quickly, as an alternate NFS server is already
    waiting to take over for the down server. The client can't see that
    a server IP address has moved to a new server until the existing
    connection is gone.
    
    For fabrics and devices where it is meaningful, set a definite upper
    bound on the amount of time before it is determined that a
    connection is no longer valid. This allows the RPC client to detect
    connection loss in a timely manner, then perform a fresh resolution
    of the server GUID in case it has changed (cluster failover).
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Tested-by: Steve Wise <swise@opengridcomputing.com>
    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
verbs.c 31.1 KB