    svcrdma: Use rdma_rw API in RPC reply path · 9a6a180b
    Chuck Lever authored
    The current svcrdma sendto code path posts one RDMA Write WR at a
    time. Each of these Writes typically carries a small number of pages
    (for instance, up to 30 pages for mlx4 devices). That means a 1MB
    NFS READ reply requires 9 ib_post_send() calls for the Write WRs,
    and one for the Send WR carrying the actual RPC Reply message.
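    
    The count above can be reproduced with a short calculation. This is a
    minimal sketch, not code from the patch: it assumes a 4KB page size
    (ASSUMED_PAGE_SIZE), and the helper names write_wrs_needed() and
    posts_one_wr_at_a_time() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4KB pages. The commit message does not state the page size. */
#define ASSUMED_PAGE_SIZE 4096u

/* Number of Write WRs needed to carry payload_bytes when each WR
 * can carry at most pages_per_wr pages (ceiling division). */
static size_t write_wrs_needed(size_t payload_bytes, size_t pages_per_wr)
{
    size_t bytes_per_wr;

    assert(pages_per_wr > 0);
    bytes_per_wr = pages_per_wr * ASSUMED_PAGE_SIZE;
    return (payload_bytes + bytes_per_wr - 1) / bytes_per_wr;
}

/* Posting one WR per call costs one call per Write WR, plus one
 * more call for the Send WR that carries the RPC Reply message. */
static size_t posts_one_wr_at_a_time(size_t payload_bytes, size_t pages_per_wr)
{
    return write_wrs_needed(payload_bytes, pages_per_wr) + 1;
}
```

    Under these assumptions, write_wrs_needed(1024 * 1024, 30) evaluates
    to 9: each Write WR carries 30 * 4096 = 122880 bytes, and 1048576
    bytes do not fit in 8 of them.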
    
    Instead, use the new rdma_rw API. The details of Write WR chain
    construction and memory registration are taken care of in the RDMA
    core. svcrdma can focus on the details of the RPC-over-RDMA
    protocol. This gives three main benefits:
    
    1. All Write WRs for one RDMA segment are posted in a single chain,
    requiring as few as one ib_post_send() call for each Write chunk.
    
    2. The Write path can now use FRWR to register the Write buffers.
    If the device's maximum page list depth is large, this means only a
    single Write WR is needed for each RPC's Write chunk data.
    
    3. The new code introduces support for RPCs that carry both a Write
    list and a Reply chunk. This combination can be used for an NFSv4
    READ where the data payload is large, and thus is removed from the
    Payload Stream, but the Payload Stream is still larger than the
    inline threshold.
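    
    The effect of posting a chain rather than individual WRs can be
    illustrated with a toy model. This is a sketch only: struct toy_wr,
    struct toy_qp, toy_chain() and toy_post_send() are hypothetical
    stand-ins, not the rdma_rw API or the verbs API. It models just the
    "next" link that lets one post call hand over several WRs.

```c
#include <stddef.h>

/* Hypothetical stand-in for a work request; only the chain link is modeled. */
struct toy_wr {
    struct toy_wr *next;
};

/* Hypothetical queue pair that merely counts what it is given. */
struct toy_qp {
    unsigned int post_calls;   /* number of toy_post_send() invocations */
    unsigned int wrs_posted;   /* number of WRs handed over in total */
};

/* Post every WR reachable from 'first' in a single call. */
static void toy_post_send(struct toy_qp *qp, struct toy_wr *first)
{
    qp->post_calls++;
    for (; first != NULL; first = first->next)
        qp->wrs_posted++;
}

/* Link wrs[0..n-1] into one chain and return its head (NULL if n == 0). */
static struct toy_wr *toy_chain(struct toy_wr *wrs, size_t n)
{
    size_t i;

    if (n == 0)
        return NULL;
    for (i = 0; i + 1 < n; i++)
        wrs[i].next = &wrs[i + 1];
    wrs[n - 1].next = NULL;
    return &wrs[0];
}
```

    In this model, chaining nine WRs with toy_chain() and passing the
    head to toy_post_send() once leaves post_calls at 1 and wrs_posted
    at 9, whereas posting each WR separately would take nine calls.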
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
svc_rdma_transport.c 37.5 KB