1. 01 Dec, 2022 5 commits
  2. 30 Nov, 2022 2 commits
  3. 29 Nov, 2022 1 commit
  4. 28 Nov, 2022 2 commits
  5. 24 Nov, 2022 4 commits
  6. 22 Nov, 2022 5 commits
  7. 21 Nov, 2022 1 commit
  8. 19 Nov, 2022 3 commits
    • RDMA/hns: Fix incorrect sge nums calculation · 0c5e259b
      Luoyouming authored
      Users usually configure the number of sge through the max_send_sge
      parameter when creating a QP, and configure the maximum size of inline
      data that can be sent through max_inline_data. Inline data is written
      into the sge space. The expected behavior is:
      
      1) When the sge space cannot hold inline data, the sge space needs to be
         expanded to accommodate all inline data
      
      2) When the sge space is enough to accommodate inline data, the upper
         limit of inline data can be increased so that users can send larger
         inline data
      
      Currently the first case is not implemented: when the inline data is
      larger than the sge space, an insufficient sge space error occurs. This
      part of the code needs to be reimplemented according to the expected
      rules. To solve the problem, the sge num calculation now takes the
      maximum of max_send_sge and the number of sge needed for max_inline_data.
      
      Fixes: 05201e01 ("RDMA/hns: Refactor process of setting extended sge")
      Fixes: 30b70788 ("RDMA/hns: Support inline data in extented sge space for RC")
      Link: https://lore.kernel.org/r/20221108133847.2304539-3-xuhaoyue1@hisilicon.com
      Signed-off-by: Luoyouming <luoyouming@huawei.com>
      Signed-off-by: Haoyue Xu <xuhaoyue1@hisilicon.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
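The rule described in this message (use whichever is larger: max_send_sge or the sge needed for max_inline_data) can be sketched roughly as follows. This is an illustrative sketch, not the driver's code: the function names and the 16-byte entry size are assumptions, and the real driver also has to account for standard versus extended sge.

```c
#include <stdint.h>

/* Assumed size in bytes of one sge entry; illustrative value, not from the commit. */
#define SKETCH_SGE_SIZE 16u

/* Hypothetical helper: sge entries needed to hold max_inline_data bytes, rounded up. */
static uint32_t sketch_sge_for_inline(uint32_t max_inline_data)
{
	return max_inline_data / SKETCH_SGE_SIZE +
	       (max_inline_data % SKETCH_SGE_SIZE != 0);
}

/* Hypothetical helper: take the maximum of the two requirements. */
static uint32_t sketch_sge_num(uint32_t max_send_sge, uint32_t max_inline_data)
{
	uint32_t inline_sge = sketch_sge_for_inline(max_inline_data);

	return max_send_sge > inline_sge ? max_send_sge : inline_sge;
}
```

Under these assumptions, a QP created with max_send_sge = 2 and max_inline_data = 64 would get room for 4 sge (case 1 above), while max_send_sge = 8 with the same inline size keeps 8 (case 2).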
    • RDMA/hns: Fix ext_sge num error when post send · 8eaa6f7d
      Luoyouming authored
      In the HNS RoCE driver, sge are divided into standard sge and extended
      sge. RC/XRC has 2 standard sge, and UD has 0. In the RC SQ inline
      scenario, data that does not exceed 32 bytes is placed in the standard
      sge. Larger data is placed only in the extended sge.
      
      Currently, when filling the extended sge, max_gs is used directly as the
      number of extended sge, without subtracting the number of standard sge.
      This is a logic error. The new algorithm subtracts the number of
      standard sge from max_gs to get the actual number of extended sge.
      
      Fixes: 30b70788 ("RDMA/hns: Support inline data in extented sge space for RC")
      Link: https://lore.kernel.org/r/20221108133847.2304539-2-xuhaoyue1@hisilicon.com
      Signed-off-by: Luoyouming <luoyouming@huawei.com>
      Signed-off-by: Haoyue Xu <xuhaoyue1@hisilicon.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
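The subtraction described in this message might look something like the sketch below. The standard sge counts (2 for RC/XRC, 0 for UD) come from the message; the function name, the is_ud flag, and the clamp to zero are assumptions made for illustration.

```c
#include <stdint.h>

/* Standard sge counts as stated in the commit message. */
#define SKETCH_STD_SGE_RC 2u
#define SKETCH_STD_SGE_UD 0u

/*
 * Hypothetical helper: number of extended sge is the total (max_gs)
 * minus the standard sge. Clamping at zero is an assumption of this
 * sketch, added to avoid unsigned wrap-around.
 */
static uint32_t sketch_ext_sge_num(uint32_t max_gs, int is_ud)
{
	uint32_t std_sge = is_ud ? SKETCH_STD_SGE_UD : SKETCH_STD_SGE_RC;

	return max_gs > std_sge ? max_gs - std_sge : 0;
}
```

With max_gs = 6, an RC QP would have 4 extended sge in this sketch, whereas using max_gs directly would count 6.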
    • RDMA/rxe: Fix mr->map double free · 7d984dac
      Li Zhijian authored
      When rxe_mr_init_user() fails, rxe_mr_cleanup() is called and tries to
      free mr->map a second time:
      
         CPU: 0 PID: 4917 Comm: rdma_flush_serv Kdump: loaded Not tainted 6.1.0-rc1-roce-flush+ #25
         Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
         Call Trace:
          <TASK>
          dump_stack_lvl+0x45/0x5d
          panic+0x19e/0x349
          end_report.part.0+0x54/0x7c
          kasan_report.cold+0xa/0xf
          rxe_mr_cleanup+0x9d/0xf0 [rdma_rxe]
          __rxe_cleanup+0x10a/0x1e0 [rdma_rxe]
          rxe_reg_user_mr+0xb7/0xd0 [rdma_rxe]
          ib_uverbs_reg_mr+0x26a/0x480 [ib_uverbs]
          ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0x1a2/0x250 [ib_uverbs]
          ib_uverbs_cmd_verbs+0x1397/0x15a0 [ib_uverbs]
      
      This issue was first exposed by commit b18c7da6 ("RDMA/rxe: Fix
      memory leak in error path code"). It was fixed in commit
      8ff5f5d9 ("RDMA/rxe: Prevent double freeing rxe_map_set()"), but that
      fix was later reverted as part of commit 1e755506 ("Revert
      "RDMA/rxe: Create duplicate mapping tables for FMRs"").
      
      Simply let rxe_mr_cleanup() always handle freeing the mr->map once it is
      successfully allocated.
      
      Fixes: 1e755506 ("Revert "RDMA/rxe: Create duplicate mapping tables for FMRs"")
      Link: https://lore.kernel.org/r/1667099073-2-1-git-send-email-lizhijian@fujitsu.com
      Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
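The single-owner pattern this message describes (the init path never frees the map on failure, and the cleanup routine frees it exactly once) can be sketched as below. The struct and function names are hypothetical stand-ins, not the rxe driver's, and the free counter exists only so the behavior can be observed.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a memory region object. */
struct sketch_mr {
	void **map;   /* owned by the object once allocated */
	int num_map;
};

/* Counts how often the map was actually freed; illustration only. */
static int sketch_free_calls;

/* Hypothetical init: allocates the map, then optionally fails in a later step. */
static int sketch_mr_init(struct sketch_mr *mr, int num_map, int fail_later)
{
	mr->map = calloc((size_t)num_map, sizeof(*mr->map));
	if (!mr->map)
		return -1;
	mr->num_map = num_map;

	if (fail_later)
		return -1; /* no free here: cleanup owns mr->map */

	return 0;
}

/* Cleanup is the only place that frees the map, and it is idempotent. */
static void sketch_mr_cleanup(struct sketch_mr *mr)
{
	if (mr->map) {
		free(mr->map);
		mr->map = NULL;
		sketch_free_calls++;
	}
	mr->num_map = 0;
}
```

Because the error path leaves the allocation alone and the cleanup clears the pointer after freeing, calling the cleanup after a failed init (or even twice) releases the map only once.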
  9. 18 Nov, 2022 2 commits
  10. 17 Nov, 2022 11 commits
  11. 15 Nov, 2022 4 commits
    • RDMA/siw: Set defined status for work completion with undefined status · 60da2d11
      Bernard Metzler authored
      A malicious user may write undefined values into the status or opcode
      fields of memory-mapped completion queue elements. When reaping CQ
      elements, undefined status or opcode values result in out-of-bounds
      access to the arrays that map the siw internal representation of opcode
      and status to the RDMA core representation. While siw detects those
      undefined values, it did not correctly set the completion status to a
      defined value, thus defeating the whole purpose of the check.
      
      This bug leads to the following Smatch static checker warning:
      
      	drivers/infiniband/sw/siw/siw_cq.c:96 siw_reap_cqe()
      	error: buffer overflow 'map_cqe_status' 10 <= 21
      
      Fixes: bdf1da5d ("RDMA/siw: Fix immediate work request flush to completion queue")
      Link: https://lore.kernel.org/r/20221115170747.1263298-1-bmt@zurich.ibm.com
      Reported-by: Dan Carpenter <error27@gmail.com>
      Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
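The kind of check this message refers to (validate an untrusted index before using it in a mapping table, and substitute a defined value when it is out of range) can be sketched as follows. The enum values, table contents, and function name are hypothetical; in particular, using a generic error as the fallback is an assumption of this sketch.

```c
#include <stddef.h>

/* Hypothetical "core" status values for illustration. */
enum sketch_core_status {
	SKETCH_WC_SUCCESS,
	SKETCH_WC_LOC_LEN_ERR,
	SKETCH_WC_GENERAL_ERR
};

/* Hypothetical mapping table indexed by the provider-internal status. */
static const enum sketch_core_status sketch_status_map[] = {
	SKETCH_WC_SUCCESS,
	SKETCH_WC_LOC_LEN_ERR,
};

#define SKETCH_NUM_STATUS \
	(sizeof(sketch_status_map) / sizeof(sketch_status_map[0]))

/*
 * raw comes from memory that user space can write, so it is checked
 * before indexing. An undefined value is replaced with a defined
 * status instead of being passed through to the table lookup.
 */
static enum sketch_core_status sketch_map_status(unsigned int raw)
{
	if (raw >= SKETCH_NUM_STATUS)
		return SKETCH_WC_GENERAL_ERR;

	return sketch_status_map[raw];
}
```

Detecting the bad value is not enough on its own: the lookup must never run with the unchecked index, which is the gap the commit closes.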
    • RDMA/nldev: Return "-EAGAIN" if the cm_id isn't from expected port · ecacb375
      Mark Zhang authored
      When filling a cm_id entry, return "-EAGAIN" instead of 0 if the cm_id
      doesn't have the same port as requested, otherwise an incomplete entry
      may be returned, which causes "rdma res show cm_id" to return an error.
      
      For example, on a machine with two rdma devices and "rping -C 1 -v -s"
      running in the background, the "rdma" command fails:
        $ rdma -V
        rdma utility, iproute2-5.19.0
        $ rdma res show cm_id
        link mlx5_0/- cm-idn 0 state LISTEN ps TCP pid 28056 comm rping src-addr 0.0.0.0:7174
        error: Protocol not available
      
      While with this fix it succeeds:
        $ rdma res show cm_id
        link mlx5_0/- cm-idn 0 state LISTEN ps TCP pid 26395 comm rping src-addr 0.0.0.0:7174
        link mlx5_1/- cm-idn 0 state LISTEN ps TCP pid 26395 comm rping src-addr 0.0.0.0:7174
      
      Fixes: 00313983 ("RDMA/nldev: provide detailed CM_ID information")
      Signed-off-by: Mark Zhang <markzhang@nvidia.com>
      Link: https://lore.kernel.org/r/a08e898cdac5e28428eb749a99d9d981571b8ea7.1667810736.git.leonro@nvidia.com
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
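The difference between returning 0 and returning -EAGAIN from a fill callback can be illustrated with a simplified dump loop. This is a sketch under assumed names: the structs and functions are hypothetical, and the real nldev code builds netlink attributes rather than writing ids into an array.

```c
#include <errno.h>

/* Hypothetical cm_id record: the port it belongs to and an id to report. */
struct sketch_cm_id {
	int port;
	int id;
};

/*
 * Hypothetical fill callback. Returning 0 means "entry written";
 * returning -EAGAIN means "nothing written, skip this entry".
 * want_port == 0 is treated as "any port" in this sketch.
 */
static int sketch_fill_entry(const struct sketch_cm_id *e, int want_port,
			     int *out, int *n)
{
	if (want_port && e->port != want_port)
		return -EAGAIN;

	out[(*n)++] = e->id;
	return 0;
}

/* Hypothetical dump loop: skips -EAGAIN entries, aborts on other errors. */
static int sketch_dump(const struct sketch_cm_id *tbl, int cnt,
		       int want_port, int *out)
{
	int n = 0;

	for (int i = 0; i < cnt; i++) {
		int ret = sketch_fill_entry(&tbl[i], want_port, out, &n);

		if (ret == -EAGAIN)
			continue;
		if (ret)
			return ret;
	}
	return n;
}
```

If the callback returned 0 for a mismatched port, the caller would treat the untouched entry as complete, which corresponds to the incomplete entry described in the message.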
    • RDMA/core: Make sure "ib_port" is valid when access sysfs node · 5e15ff29
      Mark Zhang authored
      The "ib_port" structure must be set before adding the sysfs kobject,
      and reset after removing it, otherwise the kernel may crash when the
      sysfs node is accessed:
        Unable to handle kernel NULL pointer dereference at virtual address 0000000000000050
        Mem abort info:
          ESR = 0x96000006
          Exception class = DABT (current EL), IL = 32 bits
          SET = 0, FnV = 0
          EA = 0, S1PTW = 0
        Data abort info:
          ISV = 0, ISS = 0x00000006
          CM = 0, WnR = 0
        user pgtable: 4k pages, 48-bit VAs, pgdp = 00000000e85f5ba5
        [0000000000000050] pgd=0000000848fd9003, pud=000000085b387003, pmd=0000000000000000
        Internal error: Oops: 96000006 [#2] PREEMPT SMP
        Modules linked in: ib_umad(O) mlx5_ib(O) nfnetlink_cttimeout(E) nfnetlink(E) act_gact(E) cls_flower(E) sch_ingress(E) openvswitch(E) nsh(E) nf_nat_ipv6(E) nf_nat_ipv4(E) nf_conncount(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) mst_pciconf(O) ipmi_devintf(E) ipmi_msghandler(E) ipmb_dev_int(OE) mlx5_core(O) mlxfw(O) mlxdevm(O) auxiliary(O) ib_uverbs(O) ib_core(O) mlx_compat(O) psample(E) sbsa_gwdt(E) uio_pdrv_genirq(E) uio(E) mlxbf_pmc(OE) mlxbf_gige(OE) mlxbf_tmfifo(OE) gpio_mlxbf2(OE) pwr_mlxbf(OE) mlx_trio(OE) i2c_mlxbf(OE) mlx_bootctl(OE) bluefield_edac(OE) knem(O) ip_tables(E) ipv6(E) crc_ccitt(E) [last unloaded: mst_pci]
        Process grep (pid: 3372, stack limit = 0x0000000022055c92)
        CPU: 5 PID: 3372 Comm: grep Tainted: G      D    OE     4.19.161-mlnx.47.gadcd9e3 #1
        Hardware name: https://www.mellanox.com BlueField SoC/BlueField SoC, BIOS BlueField:3.9.2-15-ga2403ab Sep  8 2022
        pstate: 40000005 (nZcv daif -PAN -UAO)
        pc : hw_stat_port_show+0x4c/0x80 [ib_core]
        lr : port_attr_show+0x40/0x58 [ib_core]
        sp : ffff000029f43b50
        x29: ffff000029f43b50 x28: 0000000019375000
        x27: ffff8007b821a540 x26: ffff000029f43e30
        x25: 0000000000008000 x24: ffff000000eaa958
        x23: 0000000000001000 x22: ffff8007a4ce3000
        x21: ffff8007baff8000 x20: ffff8007b9066ac0
        x19: ffff8007bae97578 x18: 0000000000000000
        x17: 0000000000000000 x16: 0000000000000000
        x15: 0000000000000000 x14: 0000000000000000
        x13: 0000000000000000 x12: 0000000000000000
        x11: 0000000000000000 x10: 0000000000000000
        x9 : 0000000000000000 x8 : ffff8007a4ce4000
        x7 : 0000000000000000 x6 : 000000000000003f
        x5 : ffff000000e6a280 x4 : ffff8007a4ce3000
        x3 : 0000000000000000 x2 : aaaaaaaaaaaaaaab
        x1 : ffff8007b9066a10 x0 : ffff8007baff8000
        Call trace:
         hw_stat_port_show+0x4c/0x80 [ib_core]
         port_attr_show+0x40/0x58 [ib_core]
         sysfs_kf_seq_show+0x8c/0x150
         kernfs_seq_show+0x44/0x50
         seq_read+0x1b4/0x45c
         kernfs_fop_read+0x148/0x1d8
         __vfs_read+0x58/0x180
         vfs_read+0x94/0x154
         ksys_read+0x68/0xd8
         __arm64_sys_read+0x28/0x34
         el0_svc_common+0x88/0x18c
         el0_svc_handler+0x78/0x94
         el0_svc+0x8/0xe8
        Code: f2955562 aa1603e4 aa1503e0 f9405683 (f9402861)
      
      Fixes: d8a58838 ("RDMA/core: Replace the ib_port_data hw_stats pointers with a ib_port pointer")
      Signed-off-by: Mark Zhang <markzhang@nvidia.com>
      Reviewed-by: Michael Guralnik <michaelgur@nvidia.com>
      Link: https://lore.kernel.org/r/88867e705c42c1cd2011e45201c25eecdb9fef94.1667810736.git.leonro@nvidia.com
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
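The ordering rule in this message (set the pointer before the node becomes visible, reset it only after the node is gone) can be sketched with a single-threaded model. All names here are hypothetical, the published flag merely stands in for the sysfs kobject being registered, and real kernel code would also need proper synchronization, which this sketch ignores.

```c
#include <stddef.h>

/* Hypothetical port object holding a value a reader wants to show. */
struct sketch_port {
	int counter;
};

/* Hypothetical per-port data: pointer plus a flag modelling visibility. */
struct sketch_port_data {
	struct sketch_port *port;
	int published;
};

/* Reader: may run whenever the node is published, so port must be valid then. */
static int sketch_show(const struct sketch_port_data *pd, int *val)
{
	if (!pd->published)
		return -1;

	*val = pd->port->counter;
	return 0;
}

/* Add: set the pointer first, then expose the node. */
static void sketch_add(struct sketch_port_data *pd, struct sketch_port *p)
{
	pd->port = p;
	pd->published = 1;
}

/* Remove: hide the node first, then reset the pointer. */
static void sketch_remove(struct sketch_port_data *pd)
{
	pd->published = 0;
	pd->port = NULL;
}
```

Reversing either pair of statements would leave a window in which the reader can run while the pointer is NULL, which matches the NULL pointer dereference in the trace above.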
    • RDMA/restrack: Release MR restrack when delete · dac153f2
      Mark Zhang authored
      The MR restrack entry also needs to be released when the MR is deleted,
      otherwise memory leaks because the task struct is never released.
      
      Fixes: 13ef5539 ("RDMA/restrack: Count references to the verbs objects")
      Signed-off-by: Mark Zhang <markzhang@nvidia.com>
      Reviewed-by: Michael Guralnik <michaelgur@nvidia.com>
      Link: https://lore.kernel.org/r/703db18e8d4ef628691fb93980a709be673e62e3.1667810736.git.leonro@nvidia.com
      Signed-off-by: Leon Romanovsky <leon@kernel.org>