1. 10 Feb, 2017 5 commits
    • Chuck Lever's avatar
      xprtrdma: Shrink send SGEs array · c6f5b47f
      Chuck Lever authored
      We no longer need to accommodate an xdr_buf whose pages start at an
      offset and cross extra page boundaries. If there are more partial or
      whole pages to send than there are available SGEs, the marshaling
      logic is now smart enough to use a Read chunk instead of failing.
      Signed-off-by: default avatarChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: default avatarAnna Schumaker <Anna.Schumaker@Netapp.com>
      c6f5b47f
    • Chuck Lever's avatar
      xprtrdma: Reduce required number of send SGEs · 16f906d6
      Chuck Lever authored
      The MAX_SEND_SGES check introduced in commit 655fec69
      ("xprtrdma: Use gathered Send for large inline messages") fails
      for devices that have a small max_sge.
      
      Instead of checking for a large fixed maximum number of SGEs,
      check for a minimum small number. RPC-over-RDMA will switch to
      using a Read chunk if an xdr_buf has more pages than can fit in
      the device's max_sge limit. This is considerably better than
      failing all together to mount the server.
      
      This fix supports devices that have as few as three send SGEs
      available.
      Reported-by: default avatarSelvin Xavier <selvin.xavier@broadcom.com>
      Reported-by: default avatarDevesh Sharma <devesh.sharma@broadcom.com>
      Reported-by: default avatarHonggang Li <honli@redhat.com>
      Reported-by: default avatarRam Amrani <Ram.Amrani@cavium.com>
      Fixes: 655fec69 ("xprtrdma: Use gathered Send for large ...")
      Cc: stable@vger.kernel.org # v4.9+
      Tested-by: default avatarHonggang Li <honli@redhat.com>
      Tested-by: default avatarRam Amrani <Ram.Amrani@cavium.com>
      Tested-by: default avatarSteve Wise <swise@opengridcomputing.com>
      Reviewed-by: default avatarParav Pandit <parav@mellanox.com>
      Signed-off-by: default avatarChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: default avatarAnna Schumaker <Anna.Schumaker@Netapp.com>
      16f906d6
    • Chuck Lever's avatar
      xprtrdma: Disable pad optimization by default · c95a3c6b
      Chuck Lever authored
      Commit d5440e27 ("xprtrdma: Enable pad optimization") made the
      Linux client omit XDR round-up padding in normal Read and Write
      chunks so that the client doesn't have to register and invalidate
      3-byte memory regions that contain no real data.
      
      Unfortunately, my cheery 2014 assessment that this optimization "is
      supported now by both Linux and Solaris servers" was premature.
      We've found bugs in Solaris in this area since commit d5440e27
      ("xprtrdma: Enable pad optimization") was merged (SYMLINK is the
      main offender).
      
      So for maximum interoperability, I'm disabling this optimization
      again. If a CM private message is exchanged when connecting, the
      client recognizes that the server is Linux, and enables the
      optimization for that connection.
      
      Until now the Solaris server bugs did not impact common operations,
      and were thus largely benign. Soon, less capable devices on Linux
      NFS/RDMA clients will make use of Read chunks more often, and these
      Solaris bugs will prevent interoperation in more cases.
      
      Fixes: 677eb17e ("xprtrdma: Fix XDR tail buffer marshalling")
      Cc: stable@vger.kernel.org # v4.9+
      Signed-off-by: default avatarChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: default avatarAnna Schumaker <Anna.Schumaker@Netapp.com>
      c95a3c6b
    • Chuck Lever's avatar
      xprtrdma: Per-connection pad optimization · b5f0afbe
      Chuck Lever authored
      Pad optimization is changed by echoing into
      /proc/sys/sunrpc/rdma_pad_optimize. This is a global setting,
      affecting all RPC-over-RDMA connections to all servers.
      
      The marshaling code picks up that value and uses it for decisions
      about how to construct each RPC-over-RDMA frame. Having it change
      suddenly in mid-operation can result in unexpected failures. And
      some servers a client mounts might need chunk round-up, while
      others don't.
      
      So instead, copy the pad_optimize setting into each connection's
      rpcrdma_ia when the transport is created, and use the copy, which
      can't change during the life of the connection, instead.
      
      This also removes a hack: rpcrdma_convert_iovs was using
      the remote-invalidation-expected flag to predict when it could leave
      out Write chunk padding. This is because the Linux server handles
      implicit XDR padding on Write chunks correctly, and only Linux
      servers can set the connection's remote-invalidation-expected flag.
      
      It's more sensible to use the pad optimization setting instead.
      
      Fixes: 677eb17e ("xprtrdma: Fix XDR tail buffer marshalling")
      Cc: stable@vger.kernel.org # v4.9+
      Signed-off-by: default avatarChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: default avatarAnna Schumaker <Anna.Schumaker@Netapp.com>
      b5f0afbe
    • Chuck Lever's avatar
      xprtrdma: Fix Read chunk padding · 24abdf1b
      Chuck Lever authored
      When pad optimization is disabled, rpcrdma_convert_iovs still
      does not add explicit XDR round-up padding to a Read chunk.
      
      Commit 677eb17e ("xprtrdma: Fix XDR tail buffer marshalling")
      incorrectly short-circuited the test for whether round-up padding
      is needed that appears later in rpcrdma_convert_iovs.
      
      However, if this is indeed a regular Read chunk (and not a
      Position-Zero Read chunk), the tail iovec _always_ contains the
      chunk's padding, and never anything else.
      
      So, it's easy to just skip the tail when padding optimization is
      enabled, and add the tail in a subsequent Read chunk segment, if
      disabled.
      
      Fixes: 677eb17e ("xprtrdma: Fix XDR tail buffer marshalling")
      Cc: stable@vger.kernel.org # v4.9+
      Signed-off-by: default avatarChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: default avatarAnna Schumaker <Anna.Schumaker@Netapp.com>
      24abdf1b
  2. 09 Feb, 2017 4 commits
  3. 08 Feb, 2017 10 commits
  4. 30 Jan, 2017 21 commits