1. 22 Sep, 2020 5 commits
      net/mlx5e: Small improvements for XDP TX MPWQE logic · 388a2b56
      Maxim Mikityanskiy authored
      Use MLX5E_XDP_MPW_MAX_WQEBBS to reserve space for an MPWQE, because it's
      actually the maximum size an MPWQE can take.
      
      Reorganize the logic that checks when to close the MPWQE session:
      
      1. Put all checks into a single function.
      
      2. When inline is on, make only one comparison: if the stricter check
      fails, the less strict one would fail too. The compiler probably
      optimized this out anyway, but it's clearer to reflect it in the code
      as well.
      
      The MLX5E_XDP_INLINE_WQE_* defines are also changed to make the
      calculations more correct from the logical point of view. Though
      MLX5E_XDP_INLINE_WQE_MAX_DS_CNT used to be 16 and didn't change its
      value, the calculation used to be DIV_ROUND_UP(max inline packet size,
      MLX5_SEND_WQE_DS), and the numerator should have included sizeof(struct
      mlx5_wqe_inline_seg).
      Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      net/mlx5e: Refactor xmit functions · 8e4b53f6
      Maxim Mikityanskiy authored
      The huge function mlx5e_sq_xmit was split into several smaller
      functions to achieve multiple goals:
      
      1. Reuse the code in IPoIB.
      
      2. Better integrate with TLS, IPSEC, GENEVE and checksum offloads. Now
      it's possible to reserve space in the WQ before running eseg-based
      offloads, so:
      
      2.1. It's no longer necessary to copy cseg and eseg after
      mlx5e_fill_sq_frag_edge.
      
      2.2. mlx5e_txqsq_get_next_pi will be used instead of the legacy
      mlx5e_fill_sq_frag_edge for better code maintainability and reuse.
      
      3. Prepare for the upcoming TX MPWQE for SKBs. It will intervene after
      mlx5e_sq_calc_wqe_attr to check if it's possible to use MPWQE, and the
      code flow will split into two paths: MPWQE and non-MPWQE.
      
      Two high-level functions are provided to send packets:
      
      * mlx5e_xmit is called by the networking stack, runs offloads and sends
      the packet. In one of the following patches, MPWQE support will be added
      to this flow.
      
      * mlx5e_sq_xmit_simple is called by the TLS offload, runs only the
      checksum offload and sends the packet.
      
      This change has no performance impact in the TCP single-stream test or
      the XDP_TX single-stream test.
      
      When compiled with a recent GCC, this change shows no visible
      performance impact on UDP pktgen (burst 32) single stream test either:
        Packet rate: 16.86 Mpps (±0.15 Mpps) -> 16.95 Mpps (±0.15 Mpps)
        Instructions per packet: 434 -> 429
        Cycles per packet: 158 -> 160
        Instructions per cycle: 2.75 -> 2.69
      
      CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz (x86_64)
      NIC: Mellanox ConnectX-6 Dx
      GCC 10.2.0
      Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      net/mlx5e: Move mlx5e_tx_wqe_inline_mode to en_tx.c · d02dfcd5
      Maxim Mikityanskiy authored
      Move mlx5e_tx_wqe_inline_mode from en/txrx.h to en_tx.c as it's only
      used there.
      Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      net/mlx5e: Use struct assignment to initialize mlx5e_tx_wqe_info · 8ba6f183
      Maxim Mikityanskiy authored
      Struct assignment guarantees that all fields of the structure are
      initialized (those that are not mentioned are zeroed). It makes the
      code more robust and reduces the chance of unpredictable behavior when
      one forgets to reset a field and it keeps a stale value from a
      previous use of the structure.
      Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      net/mlx5e: Refactor inline header size calculation in the TX path · 6d55af43
      Maxim Mikityanskiy authored
      As preparation for the next patch, don't increase ihs to calculate
      ds_cnt and then decrease it back, but rather compute the adjusted
      value in a temporary. This code performs the same number of arithmetic
      operations, but now allows the ds_cnt calculation to be split out,
      which will be done in the next patch.
      Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
  2. 21 Sep, 2020 35 commits