  04 Sep, 2018 (3 commits)
    • IB/mlx5: Change TX affinity assignment in RoCE LAG mode · c6a21c38
      Majd Dibbiny authored
      In the current code, the TX affinity is per RoCE device, which can cause
      unfairness between different contexts. For example, if we open two
      contexts and each opens 10 QPs concurrently, all of the QPs of the first
      context might end up on the first port instead of being distributed
      across the two ports as expected.

      To overcome this unfairness between processes, we maintain both a
      per-device TX affinity and a per-process (per-ucontext) TX affinity.
      
      The allocation algorithm is as follows (a short sketch is given after
      the list):
      
      1. Hold two tx_port_affinity atomic variables, one per RoCE device and one
         per ucontext. Both initialized to 0.
      
      2. In mlx5_ib_alloc_ucontext do:
       2.1. ucontext.tx_port_affinity = device.tx_port_affinity
       2.2. device.tx_port_affinity += 1
      
      3. In modify QP RST2INIT:
       3.1. qp.tx_port_affinity = ucontext.tx_port_affinity % MLX5_PORT_NUM
       3.2. ucontext.tx_port_affinity += 1
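
      As an illustration only, here is a minimal userspace C sketch of the two
      counters described above. The names roce_dev, ucontext, alloc_ucontext,
      qp_tx_affinity and NUM_PORTS are stand-ins invented for this sketch; the
      actual mlx5 structures and helpers differ.

          #include <stdatomic.h>
          #include <stdio.h>

          #define NUM_PORTS 2                /* stands in for MLX5_PORT_NUM */

          /* Simplified stand-ins for the per-device and per-ucontext state. */
          struct roce_dev { atomic_uint tx_port_affinity; };
          struct ucontext { atomic_uint tx_port_affinity; };

          /* Step 2: seed the per-process counter from the per-device one and
           * advance the device counter. */
          static void alloc_ucontext(struct roce_dev *dev, struct ucontext *uctx)
          {
                  atomic_init(&uctx->tx_port_affinity,
                              atomic_fetch_add(&dev->tx_port_affinity, 1));
          }

          /* Step 3: pick a port round-robin within the process and advance
           * the per-process counter. */
          static unsigned int qp_tx_affinity(struct ucontext *uctx)
          {
                  return atomic_fetch_add(&uctx->tx_port_affinity, 1) % NUM_PORTS;
          }

          int main(void)
          {
                  struct roce_dev dev = { 0 };
                  struct ucontext a, b;

                  alloc_ucontext(&dev, &a);  /* context a starts at 0 */
                  alloc_ucontext(&dev, &b);  /* context b starts at 1 */

                  /* Both contexts alternate ports but start on different ones,
                   * so concurrent QP creation spreads across the two ports. */
                  for (int i = 0; i < 4; i++)
                          printf("ctx a -> port %u, ctx b -> port %u\n",
                                 qp_tx_affinity(&a), qp_tx_affinity(&b));
                  return 0;
          }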
      Signed-off-by: Majd Dibbiny <majd@mellanox.com>
      Reviewed-by: Moni Shoua <monis@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
    • iw_cxgb4: only allow 1 flush on user qps · 308aa2b8
      Steve Wise authored
      Once the qp has been flushed, it cannot be flushed again.  The user qp
      flush logic, however, wasn't enforcing this, and the bug can cause
      touch-after-free crashes like:
      
      Unable to handle kernel paging request for data at address 0x000001ec
      Faulting instruction address: 0xc008000016069100
      Oops: Kernel access of bad area, sig: 11 [#1]
      ...
      NIP [c008000016069100] flush_qp+0x80/0x480 [iw_cxgb4]
      LR [c00800001606cd6c] c4iw_modify_qp+0x71c/0x11d0 [iw_cxgb4]
      Call Trace:
      [c00800001606cd6c] c4iw_modify_qp+0x71c/0x11d0 [iw_cxgb4]
      [c00800001606e868] c4iw_ib_modify_qp+0x118/0x200 [iw_cxgb4]
      [c0080000119eae80] ib_security_modify_qp+0xd0/0x3d0 [ib_core]
      [c0080000119c4e24] ib_modify_qp+0xc4/0x2c0 [ib_core]
      [c008000011df0284] iwcm_modify_qp_err+0x44/0x70 [iw_cm]
      [c008000011df0fec] destroy_cm_id+0xcc/0x370 [iw_cm]
      [c008000011ed4358] rdma_destroy_id+0x3c8/0x520 [rdma_cm]
      [c0080000134b0540] ucma_close+0x90/0x1b0 [rdma_ucm]
      [c000000000444da4] __fput+0xe4/0x2f0
      
      So fix flush_qp() to only flush the wq once.
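
      For illustration, here is a minimal userspace C sketch of a flush-once
      guard of the kind this fix describes. The names user_qp, flush_qp_once
      and flush_wq are hypothetical; the real c4iw_qp locking and flush paths
      are more involved.

          #include <pthread.h>
          #include <stdbool.h>
          #include <stdio.h>

          /* Hypothetical, simplified QP; the real iw_cxgb4 structures differ. */
          struct user_qp {
                  pthread_mutex_t lock;
                  bool flushed;      /* set the first time the wq is flushed */
          };

          static void flush_wq(struct user_qp *qp)
          {
                  printf("flushing wq of qp %p\n", (void *)qp);
          }

          /* Flush at most once: later callers observe qp->flushed under the
           * lock and return without touching the wq state again. */
          static void flush_qp_once(struct user_qp *qp)
          {
                  pthread_mutex_lock(&qp->lock);
                  if (qp->flushed) {
                          pthread_mutex_unlock(&qp->lock);
                          return;
                  }
                  qp->flushed = true;
                  flush_wq(qp);
                  pthread_mutex_unlock(&qp->lock);
          }

          int main(void)
          {
                  static struct user_qp qp = { PTHREAD_MUTEX_INITIALIZER, false };

                  flush_qp_once(&qp);     /* flushes */
                  flush_qp_once(&qp);     /* no-op: already flushed */
                  return 0;
          }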
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Steve Wise <swise@opengridcomputing.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
    • IB/core: Release object lock if destroy failed · e4ff3d22
      Artemy Kovalyov authored
      The object lock was supposed to always be released during destroy, but
      when the destruction retry series was integrated with the destroy series
      it created a failure path that missed the unlock.
      
      Keep to the convention: if destroy fails, the caller must undo all
      locking.
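
      As a sketch only, the convention can be illustrated with a simplified
      error path. The names uobject, destroy_object and try_destroy are
      invented for this example and are not the ib_core API.

          #include <errno.h>
          #include <pthread.h>
          #include <stdio.h>

          /* Hypothetical, simplified object; real ib_core uobjects differ. */
          struct uobject {
                  pthread_mutex_t lock;
          };

          /* Stand-in for a destroy handler that may refuse to destroy. */
          static int destroy_object(struct uobject *uobj)
          {
                  (void)uobj;
                  return -EBUSY;
          }

          /* Convention: the caller holds the object lock for the destroy
           * attempt; if destroy fails, it must drop the lock before returning
           * so the object can still be used or retried later. */
          static int try_destroy(struct uobject *uobj)
          {
                  int ret;

                  pthread_mutex_lock(&uobj->lock);
                  ret = destroy_object(uobj);
                  if (ret) {
                          pthread_mutex_unlock(&uobj->lock); /* don't skip this */
                          return ret;
                  }
                  /* Success: in the kernel the lock is consumed along with the
                   * object; here we simply release it. */
                  pthread_mutex_unlock(&uobj->lock);
                  return 0;
          }

          int main(void)
          {
                  static struct uobject uobj = { PTHREAD_MUTEX_INITIALIZER };

                  if (try_destroy(&uobj))
                          printf("destroy failed; lock released for retry\n");
                  return 0;
          }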
      
      Fixes: 87ad80ab ("IB/uverbs: Consolidate uobject destruction")
      Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>