    xprtrdma: Allocate RPC send buffer separately from struct rpcrdma_req · 0ca77dc3
    Chuck Lever authored
    Because internal memory registration is an expensive and synchronous
    operation, xprtrdma pre-registers send and receive buffers at mount
    time, and then re-uses them for each RPC.
    
    A "hardway" allocation is a memory allocation and registration that
    replaces a send buffer during the processing of an RPC. Hardway must
    be done if the RPC send buffer is too small to accommodate an RPC's
    call and reply headers.
    
    For xprtrdma, each RPC send buffer is currently part of struct
    rpcrdma_req so that xprt_rdma_free(), which is passed nothing but
    the address of an RPC send buffer, can find its matching struct
    rpcrdma_req and rpcrdma_rep quickly via container_of / offsetof.
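
The container_of / offsetof lookup described above can be illustrated with a minimal user-space sketch. The struct demo_req layout, its field names, and the demo_req_from_buffer() helper are hypothetical stand-ins, not the actual xprtrdma definitions, and the macro is a simplified version of the kernel's container_of().

```c
#include <assert.h>
#include <stddef.h>

/* Simplified user-space stand-in for the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical layout: the send buffer is embedded in the request. */
struct demo_req {
	unsigned int	nsegs;		/* stand-in for bookkeeping fields */
	char		send_buf[1024];	/* stand-in for the embedded send buffer */
};

/* Given only the buffer address, recover the enclosing request. */
static struct demo_req *demo_req_from_buffer(void *buffer)
{
	return container_of(buffer, struct demo_req, send_buf);
}
```

The lookup is cheap because it is pure pointer arithmetic, but it only works while the buffer lives inside the request structure, which is the coupling this commit removes.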
    
    That means that hardway currently has to replace a whole rpcrdma_req
    when it replaces an RPC send buffer. This is often a fairly hefty
    chunk of contiguous memory due to the size of the rl_segments array
    and the fact that both the send and receive buffers are part of
    struct rpcrdma_req.
    
    Some obscure re-use of fields in rpcrdma_req is done so that
    xprt_rdma_free() can detect replaced rpcrdma_req structs, and
    restore the original.
    
    This commit breaks apart the RPC send buffer and struct rpcrdma_req
    so that increasing the size of the rl_segments array does not change
    the alignment of each RPC send buffer. (Increasing rl_segments is
    needed to bump up the maximum r/wsize for NFS/RDMA).
    
    This change opens up some interesting possibilities for improving
    the design of xprt_rdma_allocate().
    
    xprt_rdma_allocate() is now the one place where RPC send buffers
    are allocated or re-allocated, and they are now always left in place
    by xprt_rdma_free().
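
A rough sketch of the behavior described above, under stated assumptions: the send buffer is a separate allocation that carries a back pointer to its request, the allocate path replaces only that buffer when it is too small, and the free path leaves it in place. All names here (demo_sendbuf, demo_req, demo_allocate, demo_free, the owner field) are hypothetical, and malloc() stands in for allocation plus memory registration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct demo_req;

/* Hypothetical separately allocated send buffer with a back pointer. */
struct demo_sendbuf {
	size_t		size;	/* usable bytes in data[] */
	struct demo_req	*owner;	/* request this buffer belongs to */
	char		data[];	/* address handed to the RPC layer */
};

struct demo_req {
	struct demo_sendbuf *sendbuf;	/* NULL until first allocation */
};

/* Reuse the current buffer if it is large enough; otherwise replace
 * only the send buffer, never the request itself. */
static void *demo_allocate(struct demo_req *req, size_t size)
{
	struct demo_sendbuf *sb = req->sendbuf;

	if (sb == NULL || sb->size < size) {
		struct demo_sendbuf *nb = malloc(sizeof(*nb) + size);

		if (nb == NULL)
			return NULL;
		nb->size = size;
		nb->owner = req;
		free(sb);
		req->sendbuf = nb;
		sb = nb;
	}
	return sb->data;
}

/* Map a buffer address back to its request. Nothing is released:
 * the buffer stays attached to the request for the next RPC. */
static struct demo_req *demo_free(void *buffer)
{
	struct demo_sendbuf *sb = (struct demo_sendbuf *)
		((char *)buffer - offsetof(struct demo_sendbuf, data));

	return sb->owner;
}
```

With this shape, a request whose buffer has already grown keeps the larger buffer, so later RPCs of the same size do not trigger another replacement.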
    
    A large re-allocation that includes both the rl_segments array and
    the RPC send buffer is no longer needed. Send buffer re-allocation
    becomes quite rare. Good send buffer alignment is guaranteed no
    matter what the size of the rl_segments array is.
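
The alignment point can be sketched as follows. When the buffer is embedded after a segment array, its offset moves whenever the array grows; a separately allocated buffer can request a fixed alignment instead. The demo_* names, the 16-byte segment layout, and the 256-byte alignment are assumptions chosen for illustration, and C11 aligned_alloc() stands in for whatever allocator the kernel code uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_ALIGN 256	/* assumed alignment, for illustration only */

/* Hypothetical segment descriptor (16 bytes on common ABIs). */
struct demo_seg {
	uint64_t addr;
	uint32_t len;
	uint32_t handle;
};

/* Embedded layouts: the buffer offset depends on the array size. */
struct demo_embedded_8 {
	struct demo_seg	segs[8];
	char		send_buf[1024];
};

struct demo_embedded_9 {
	struct demo_seg	segs[9];
	char		send_buf[1024];
};

/* Separate allocation: alignment is independent of any segment array.
 * aligned_alloc() needs a size that is a multiple of the alignment,
 * so round the request up first. */
static void *demo_alloc_sendbuf(size_t size)
{
	size_t rounded = (size + DEMO_ALIGN - 1) / DEMO_ALIGN * DEMO_ALIGN;

	return aligned_alloc(DEMO_ALIGN, rounded);
}
```

Comparing offsetof(struct demo_embedded_8, send_buf) with offsetof(struct demo_embedded_9, send_buf) shows the embedded buffer shifting as the array grows, while demo_alloc_sendbuf() returns the same alignment regardless.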
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Reviewed-by: Steve Wise <swise@opengridcomputing.com>
    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>