1. 06 Sep, 2018 4 commits
  2. 05 Sep, 2018 30 commits
  3. 04 Sep, 2018 6 commits
    • Majd Dibbiny's avatar
      IB/mlx5: Change TX affinity assignment in RoCE LAG mode · c6a21c38
      Majd Dibbiny authored
      In the current code, the TX affinity is per RoCE device, which can cause
      unfairness between different contexts. e.g. if we open two contexts, and
      each open 10 QPs concurrently, all of the QPs of the first context might
      end up on the first port instead of distributed on the two ports as
      expected
      
      To overcome this unfairness between processes, we maintain per device TX
      affinity, and per process TX affinity.
      
      The allocation algorithm is as follow:
      
      1. Hold two tx_port_affinity atomic variables, one per RoCE device and one
         per ucontext. Both initialized to 0.
      
      2. In mlx5_ib_alloc_ucontext do:
       2.1. ucontext.tx_port_affinity = device.tx_port_affinity
       2.2. device.tx_port_affinity += 1
      
      3. In modify QP INIT2RST:
       3.1. qp.tx_port_affinity = ucontext.tx_port_affinity % MLX5_PORT_NUM
       3.2. ucontext.tx_port_affinity += 1
      Signed-off-by: default avatarMajd Dibbiny <majd@mellanox.com>
      Reviewed-by: default avatarMoni Shoua <monis@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leonro@mellanox.com>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      c6a21c38
    • Steve Wise's avatar
      iw_cxgb4: only allow 1 flush on user qps · 308aa2b8
      Steve Wise authored
      Once the qp has been flushed, it cannot be flushed again.  The user qp
      flush logic wasn't enforcing it however.  The bug can cause
      touch-after-free crashes like:
      
      Unable to handle kernel paging request for data at address 0x000001ec
      Faulting instruction address: 0xc008000016069100
      Oops: Kernel access of bad area, sig: 11 [#1]
      ...
      NIP [c008000016069100] flush_qp+0x80/0x480 [iw_cxgb4]
      LR [c00800001606cd6c] c4iw_modify_qp+0x71c/0x11d0 [iw_cxgb4]
      Call Trace:
      [c00800001606cd6c] c4iw_modify_qp+0x71c/0x11d0 [iw_cxgb4]
      [c00800001606e868] c4iw_ib_modify_qp+0x118/0x200 [iw_cxgb4]
      [c0080000119eae80] ib_security_modify_qp+0xd0/0x3d0 [ib_core]
      [c0080000119c4e24] ib_modify_qp+0xc4/0x2c0 [ib_core]
      [c008000011df0284] iwcm_modify_qp_err+0x44/0x70 [iw_cm]
      [c008000011df0fec] destroy_cm_id+0xcc/0x370 [iw_cm]
      [c008000011ed4358] rdma_destroy_id+0x3c8/0x520 [rdma_cm]
      [c0080000134b0540] ucma_close+0x90/0x1b0 [rdma_ucm]
      [c000000000444da4] __fput+0xe4/0x2f0
      
      So fix flush_qp() to only flush the wq once.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarSteve Wise <swise@opengridcomputing.com>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      308aa2b8
    • Artemy Kovalyov's avatar
      IB/core: Release object lock if destroy failed · e4ff3d22
      Artemy Kovalyov authored
      The object lock was supposed to always be released during destroy, but
      when the destruction retry series was integrated with the destroy series
      it created a failure path that missed the unlock.
      
      Keep with convention, if destroy fails the caller must undo all locking.
      
      Fixes: 87ad80ab ("IB/uverbs: Consolidate uobject destruction")
      Signed-off-by: default avatarArtemy Kovalyov <artemyko@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leonro@mellanox.com>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      e4ff3d22
    • Jann Horn's avatar
      RDMA/ucma: check fd type in ucma_migrate_id() · 0d23ba60
      Jann Horn authored
      The current code grabs the private_data of whatever file descriptor
      userspace has supplied and implicitly casts it to a `struct ucma_file *`,
      potentially causing a type confusion.
      
      This is probably fine in practice because the pointer is only used for
      comparisons, it is never actually dereferenced; and even in the
      comparisons, it is unlikely that a file from another filesystem would have
      a ->private_data pointer that happens to also be valid in this context.
      But ->private_data is not always guaranteed to be a valid pointer to an
      object owned by the file's filesystem; for example, some filesystems just
      cram numbers in there.
      
      Check the type of the supplied file descriptor to be safe, analogous to how
      other places in the kernel do it.
      
      Fixes: 88314e4d ("RDMA/cma: add support for rdma_migrate_id()")
      Signed-off-by: default avatarJann Horn <jannh@google.com>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      0d23ba60
    • Ariel Levkovich's avatar
      net/mlx5: Add memic command opcode to command checker · 09adbb5d
      Ariel Levkovich authored
      Adding the alloc/dealloc memic FW command opcodes to
      avoid "unknown command" prints in the command string
      converter and internal error status handler.
      Signed-off-by: default avatarAriel Levkovich <lariel@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leonro@mellanox.com>
      09adbb5d
    • Moni Shoua's avatar
      net/mlx5: Fix atomic_mode enum values · aa7e80b2
      Moni Shoua authored
      The field atomic_mode is 4 bits wide and therefore can hold values
      from 0x0 to 0xf. Remove the unnecessary 20 bit shift that made the values
      be incorrect. While that, remove unused enum values.
      
      Fixes: 57cda166 ("net/mlx5: Add DCT command interface")
      Signed-off-by: default avatarMoni Shoua <monis@mellanox.com>
      Reviewed-by: default avatarArtemy Kovalyov <artemyko@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leonro@mellanox.com>
      aa7e80b2