- 21 Jun, 2023 8 commits
-
-
Selvin Xavier authored
Introduce driver specific uapi functionalites. Added a alloc_page functionality for user library to allocate specific pages. Currently added support for allocating write combine pages for push functinality. This interface shall be extended for other page allocations. Allocate a WC page using the uapi hook for enabling the low latency push in Gen P5 adapters for small packets. This is supported only for the user space QPs. Link: https://lore.kernel.org/r/1686679943-17117-8-git-send-email-selvin.xavier@broadcom.comSigned-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
-
Selvin Xavier authored
Reorganize the code for allocation and mapping of Doorbell pages. Implements new HW command to get the BAR length used by L2 driver. These changes are used by the future patch which maps the WC Doorbell pages. Also, introduced a new lock dpi_tbl_lock for synchronize the DB page allocation from users. Link: https://lore.kernel.org/r/1686679943-17117-7-git-send-email-selvin.xavier@broadcom.comSigned-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
-
Selvin Xavier authored
FW interface version check is required for multiple features. Moving the interface version to chip context structure. Link: https://lore.kernel.org/r/1686679943-17117-6-git-send-email-selvin.xavier@broadcom.comSigned-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
-
Selvin Xavier authored
Query Function capabilities to enable advanced features. Link: https://lore.kernel.org/r/1686679943-17117-5-git-send-email-selvin.xavier@broadcom.comSigned-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
-
Selvin Xavier authored
As of now bnxt_re_init_hwrm_hdr is taking only the opcode from the caller. compl_ring and target_id field is always -1. These fields might be changed when newer features are added. For now, removing these parameters as they are hard coded. Also, remove the rdev field which is not used. Also, initialize the structure bnxt_fw_msg during declaration itself. Link: https://lore.kernel.org/r/1686679943-17117-4-git-send-email-selvin.xavier@broadcom.comSuggested-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
-
Selvin Xavier authored
Add driver disassociation support. Driver uses the APIs rdma_user_mmap_io api while mapping the IO pages to user space. Add empty stub for disassociate ucontext. Link: https://lore.kernel.org/r/1686679943-17117-3-git-send-email-selvin.xavier@broadcom.comSigned-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
-
Selvin Xavier authored
Replace the mmap handling function with common code in IB core. Create rdma_user_mmap_entry for each mmap resource and add to the ib_core mmap list. Add mmap_free verb support. Also, use rdma_user_mmap_io while mapping Doorbell pages. Link: https://lore.kernel.org/r/1686679943-17117-2-git-send-email-selvin.xavier@broadcom.comSigned-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
-
Leon Romanovsky authored
Fix compilation warning: drivers/infiniband/hw/bnxt_re/qplib_rcfw.c:325:18: error: variable 'opcode' is uninitialized when used here [-Werror,-Wuninitialized] crsqe->opcode = opcode; ^~~~~~ drivers/infiniband/hw/bnxt_re/qplib_rcfw.c:291:11: note: initialize the variable 'opcode' to silence this warning u8 opcode; ^ = '\0' Fixes: bcfee4ce ("RDMA/bnxt_re: remove redundant cmdq_bitmap") Link: https://lore.kernel.org/r/6ad1e44be2b560986da6fdc6b68da606413e9026.1686644105.git.leonro@nvidia.comAcked-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
-
- 20 Jun, 2023 3 commits
-
-
Yang Li authored
The call netdev_{put, hold} of dev_{put, hold} will check NULL, so there is no need to check before using dev_{put, hold}, remove it to silence the warning: ./drivers/infiniband/core/cma.c:4812:2-9: WARNING: NULL check before dev_{put, hold} functions is not needed. Link: https://lore.kernel.org/r/20230614014328.14007-1-yang.lee@linux.alibaba.comReported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=5521Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
-
Bob Pearson authored
The flags parameter to the request notify verb is a bitmask. But, rxe driver treats cq->notify as an int. If someone ever set both the IB_CQ_SOLICITED and the IB_CQ_NEXT_COMP bits rxe_cq_post could fail to generate a completion event. This patch treats the notify flags as a bit mask consistently and can handle the above case correctly. Link: https://lore.kernel.org/r/20230612162244.20038-1-rpearsonhpe@gmail.comSigned-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
-
Bob Pearson authored
A recent patch incorrectly did not include IB_ACCESS_RELAXED_ORDERING in the list of supported access flags for the rxe driver. The driver actually does nothing related to relaxed ordering but it causes no problems to include it as supported but with no effect. This change caused ib_send_bw and friends to not run correctly. The correct approach is for the driver to allow any of the optional access flags and otherwise ignore them. This patch adds IB_ACCESS_OPTIONAL to the list of rxe supported flags. Fixes: 02ed2537 ("RDMA/rxe: Introduce rxe access supported flags") Link: https://lore.kernel.org/r/20230613171654.19334-1-rpearsonhpe@gmail.comSigned-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
-
- 12 Jun, 2023 17 commits
-
-
Kashyap Desai authored
Avoid passing arguments like Opcode which can be retrieved from bnxt_qplib_crsqe structure. Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/1686308514-11996-18-git-send-email-selvin.xavier@broadcom.comSigned-off-by: Leon Romanovsky <leon@kernel.org>
-
Kashyap Desai authored
cmdq_bitmap is used to derive the next available index in the CMDQ. This is not required as the we can get the next index using the existing bnxt_qplib_crsqe array. Driver will use bnxt_qplib_crsqe array and flag is_in_used to derive valid entries. is_in_used is replacement of cmdq_bitmap. There is no change in the existing mechanism of the circular buffer used to get index. Added opcode field in bnxt_qplib_crsqe array so that it is easy to map opcode associated with pending rcfw command. Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/1686308514-11996-17-git-send-email-selvin.xavier@broadcom.comSigned-off-by: Leon Romanovsky <leon@kernel.org>
-
Kashyap Desai authored
Firmware provides max request timeout value as part of hwrm_ver_get API. Driver gets the timeout from firmware and if that interface is not available then fall back to hardcoded timeout value. Also, Add a helper function to check the FW status. Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/1686308514-11996-16-git-send-email-selvin.xavier@broadcom.comSigned-off-by: Leon Romanovsky <leon@kernel.org>
-
Kashyap Desai authored
When an error is detected in FW, wake up all the waiters as the all of them need to be completed with timeout. Add the device error state also as a wait condition. Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/1686308514-11996-15-git-send-email-selvin.xavier@broadcom.comSigned-off-by: Leon Romanovsky <leon@kernel.org>
-
Kashyap Desai authored
If destroy_ah is timed out, it is likely to be destroyed by firmware but it is taking longer time due to temporary slowness in processing the rcfw command. In worst case, there might be AH resource leak in firmware. Sending timeout return value can dump warning message from ib_core which can be avoided if we map timeout of destroy_ah as success. Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/1686308514-11996-14-git-send-email-selvin.xavier@broadcom.comSigned-off-by: Leon Romanovsky <leon@kernel.org>
-
Kashyap Desai authored
AH create may be called from interrpt context and driver has a special timeout (8 sec) for this command. This is to avoid soft lockups when the FW command takes more time. Driver returns -ETIMEOUT and fail create AH, without waiting for actual completion from firmware. When FW completion is received, use is_waiter_alive flag to avoid a regular completion path. If create_ah opcode is detected in completion path which does not have waiter alive, driver will fetch ah_id from successful firmware completion in the interrupt context and sends destroy_ah command for same ah_id. This special post is done in quick manner using helper function __send_message_no_waiter. timeout_send is only used for debugging purposes. If timeout_send value keeps incrementing, it indicates out of sync active ah counter between driver and firmware. This is a limitation but graceful handling is possible in future. Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/1686308514-11996-13-git-send-email-selvin.xavier@broadcom.comSigned-off-by: Leon Romanovsky <leon@kernel.org>
-
Kashyap Desai authored
Every completion will update last_seen value in the unit of jiffies. last_seen field will be used to know if firmware is alive and is useful to detect firmware stall. Non blocking interface __wait_for_resp will have logic to detect firmware stall. After every 10 second interval if __wait_for_resp has not received completion for a given command it will check for firmware stall condition. If current jiffies is greater than last_seen jiffies + RCFW_FW_STALL_TIMEOUT_SEC * HZ, it is a firmware stall. Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/1686308514-11996-12-git-send-email-selvin.xavier@broadcom.comSigned-off-by: Leon Romanovsky <leon@kernel.org>
-
Kashyap Desai authored
If calling context detect command timeout, associated memory stored on stack will not be valid. If firmware complete the same command later, this causes incorrect memory access by driver. Added is_waiter_alive to handle delayed completion by firmware. is_waiter_alive is set and reset under command queue lock. Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/1686308514-11996-11-git-send-email-selvin.xavier@broadcom.comSigned-off-by: Leon Romanovsky <leon@kernel.org>
-
Kashyap Desai authored
This interface will be used if the driver has not enabled interrupt and/or interrupt is disabled for a short period of time. Completion is not possible from interrupt so this interface does self-polling. Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/1686308514-11996-10-git-send-email-selvin.xavier@broadcom.comSigned-off-by: Leon Romanovsky <leon@kernel.org>
-
Kashyap Desai authored
- Use __send_message_basic_sanity helper function. - Do not retry posting same command if there is a queue full detection. - ENXIO is used to indicate controller recovery. - In the case of ERR_DEVICE_DETACHED state, the driver should not post commands to the firmware, but also return fabricated written code. Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/1686308514-11996-9-git-send-email-selvin.xavier@broadcom.comSigned-off-by: Leon Romanovsky <leon@kernel.org>
-
Kashyap Desai authored
Whenever there is a fast path IO and create/destroy resources from the slow path is happening in parallel, we may notice high latency of slow path command completion. Introduces a shadow queue depth to prevent the outstanding requests to the FW. Driver will not allow more than #RCFW_CMD_NON_BLOCKING_SHADOW_QD non-blocking commands to the Firmware. Shadow queue depth is a soft limit only for non-blocking commands. Blocking commands will be posted to the firmware as long as there is a free slot. Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/1686308514-11996-8-git-send-email-selvin.xavier@broadcom.comSigned-off-by: Leon Romanovsky <leon@kernel.org>
-
Kashyap Desai authored
Add a check to avoid waiting if driver already detects a FW timeout. Return success for resource destroy in case the device is detached. Add helper function to map timeout error code to success. Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/1686308514-11996-7-git-send-email-selvin.xavier@broadcom.comSigned-off-by: Leon Romanovsky <leon@kernel.org>
-
Kashyap Desai authored
Use jiffies based timewait instead of counting iteration for commands that block for FW response. Also add a poll routine for control path commands. This is for polling completion if the waiting commands timeout. This avoids cases where the driver misses completion interrupts. Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/1686308514-11996-6-git-send-email-selvin.xavier@broadcom.comSigned-off-by: Leon Romanovsky <leon@kernel.org>
-
Kashyap Desai authored
There is no need of setting max command queue entries based on firmware version check. Removing deperecated code. Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/1686308514-11996-5-git-send-email-selvin.xavier@broadcom.comSigned-off-by: Leon Romanovsky <leon@kernel.org>
-
Kashyap Desai authored
There is a common FW communication offset for both PF and VF. Removed code around virt_fn check while creating FW channel. Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/1686308514-11996-4-git-send-email-selvin.xavier@broadcom.comSigned-off-by: Leon Romanovsky <leon@kernel.org>
-
Kashyap Desai authored
bnxt_qplib_service_creq can be called from interrupt or tasklet or process context. So the function take irq variant of spin_lock. But when wake_up is invoked with the lock held, it is putting the calling context to sleep. [exception RIP: __wake_up_common+190] RIP: ffffffffb7539d7e RSP: ffffa73300207ad8 RFLAGS: 00000083 RAX: 0000000000000001 RBX: ffff91fa295f69b8 RCX: dead000000000200 RDX: ffffa733344af940 RSI: ffffa73336527940 RDI: ffffa73336527940 RBP: 000000000000001c R8: 0000000000000002 R9: 00000000000299c0 R10: 0000017230de82c5 R11: 0000000000000002 R12: ffffa73300207b28 R13: 0000000000000000 R14: ffffa733341bf928 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 Call the wakeup after releasing the lock. Fixes: 1ac5a404 ("RDMA/bnxt_re: Add bnxt_re RoCE driver") Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/1686308514-11996-3-git-send-email-selvin.xavier@broadcom.comSigned-off-by: Leon Romanovsky <leon@kernel.org>
-
Kashyap Desai authored
Driver is not handling the wraparound of the mbox producer index correctly. Currently the wraparound happens once u32 max is reached. Bit 31 of the producer index register is special and should be set only once for the first command. Because the producer index overflow setting bit31 after a long time, FW goes to initialization sequence and this causes FW hang. Fix is to wraparound the mbox producer index once it reaches u16 max. Fixes: cee0c7bb ("RDMA/bnxt_re: Refactor command queue management code") Fixes: 1ac5a404 ("RDMA/bnxt_re: Add bnxt_re RoCE driver") Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/1686308514-11996-2-git-send-email-selvin.xavier@broadcom.comSigned-off-by: Leon Romanovsky <leon@kernel.org>
-
- 11 Jun, 2023 8 commits
-
-
Cheng Xu authored
The original doorbell allocation mechanism is complex and does not meet the isolation requirement. So we introduce a new doorbell mechanism and the original mechanism (only be used with CAP_SYS_RAWIO if hardware does not support the new mechanism) needs to be kept as simple as possible for compatibility. Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com> Link: https://lore.kernel.org/r/20230606055005.80729-5-chengyou@linux.alibaba.comSigned-off-by: Leon Romanovsky <leon@kernel.org>
-
Cheng Xu authored
For the isolation requirement, each QP/CQ can only issue doorbells from the allocated mmio space. Configure the relationship between QPs/CQs and mmio doorbell spaces to hardware in create_qp/create_cq interfaces. Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com> Link: https://lore.kernel.org/r/20230606055005.80729-4-chengyou@linux.alibaba.comSigned-off-by: Leon Romanovsky <leon@kernel.org>
-
Cheng Xu authored
Each ucontext will try to allocate doorbell resources in the extended bar space from hardware. For compatibility, we change nothing for the original bar space, and it will be used only for applications with CAP_SYS_RAWIO authority in the older HW/FW environments. Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com> Link: https://lore.kernel.org/r/20230606055005.80729-3-chengyou@linux.alibaba.comSigned-off-by: Leon Romanovsky <leon@kernel.org>
-
Cheng Xu authored
Add a new CMDQ message to configure hardware. Initially the page size (in the format of shift) will be passed to hardware, so that hardware can organize the mmio space properly. It's called only if hardware supports it. Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com> Link: https://lore.kernel.org/r/20230606055005.80729-2-chengyou@linux.alibaba.comSigned-off-by: Leon Romanovsky <leon@kernel.org>
-
Patrisious Haddad authored
Previously when destroying a QP/RQ, the result of the firmware destruction function was ignored and upper layers weren't informed about the failure. Which in turn could lead to various problems since when upper layer isn't aware of the failure it continues its operation thinking that the related QP/RQ was successfully destroyed while it actually wasn't, which could lead to the below kernel WARN. Currently, we return the correct firmware destruction status to upper layers which in case of the RQ would be mlx5_ib_destroy_wq() which was already capable of handling RQ destruction failure or in case of a QP to destroy_qp_common(), which now would actually warn upon qp destruction failure. WARNING: CPU: 3 PID: 995 at drivers/infiniband/core/rdma_core.c:940 uverbs_destroy_ufile_hw+0xcb/0xe0 [ib_uverbs] Modules linked in: xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi rdma_cm ib_umad ib_ipoib iw_cm ib_cm mlx5_ib ib_uverbs ib_core overlay mlx5_core fuse CPU: 3 PID: 995 Comm: python3 Not tainted 5.16.0-rc5+ #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:uverbs_destroy_ufile_hw+0xcb/0xe0 [ib_uverbs] Code: 41 5c 41 5d 41 5e e9 44 34 f0 e0 48 89 df e8 4c 77 ff ff 49 8b 86 10 01 00 00 48 85 c0 74 a1 4c 89 e7 ff d0 eb 9a 0f 0b eb c1 <0f> 0b be 04 00 00 00 48 89 df e8 b6 f6 ff ff e9 75 ff ff ff 90 0f RSP: 0018:ffff8881533e3e78 EFLAGS: 00010287 RAX: ffff88811b2cf3e0 RBX: ffff888106209700 RCX: 0000000000000000 RDX: ffff888106209780 RSI: ffff8881533e3d30 RDI: ffff888109b101a0 RBP: 0000000000000001 R08: ffff888127cb381c R09: 0de9890000000009 R10: ffff888127cb3800 R11: 0000000000000000 R12: ffff888106209780 R13: ffff888106209750 R14: ffff888100f20660 R15: 0000000000000000 FS: 00007f8be353b740(0000) GS:ffff88852c980000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f8bd5b117c0 CR3: 000000012cd8a004 CR4: 0000000000370ea0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> ib_uverbs_close+0x1a/0x90 [ib_uverbs] __fput+0x82/0x230 task_work_run+0x59/0x90 exit_to_user_mode_prepare+0x138/0x140 syscall_exit_to_user_mode+0x1d/0x50 ? __x64_sys_close+0xe/0x40 do_syscall_64+0x4a/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f8be3ae0abb Code: 03 00 00 00 0f 05 48 3d 00 f0 ff ff 77 41 c3 48 83 ec 18 89 7c 24 0c e8 83 43 f9 ff 8b 7c 24 0c 41 89 c0 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 35 44 89 c7 89 44 24 0c e8 c1 43 f9 ff 8b 44 RSP: 002b:00007ffdb51909c0 EFLAGS: 00000293 ORIG_RAX: 0000000000000003 RAX: 0000000000000000 RBX: 0000557bb7f7c020 RCX: 00007f8be3ae0abb RDX: 0000557bb7c74010 RSI: 0000557bb7f14ca0 RDI: 0000000000000005 RBP: 0000557bb7fbd598 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000293 R12: 0000557bb7fbd5b8 R13: 0000557bb7fbd5a8 R14: 0000000000001000 R15: 0000557bb7f7c020 </TASK> Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Link: https://lore.kernel.org/r/c6df677f931d18090bafbe7f7dbb9524047b7d9b.1685953497.git.leon@kernel.orgSigned-off-by: Leon Romanovsky <leon@kernel.org>
-
Patrisious Haddad authored
Previously when destroying a DCT, if the firmware function for the destruction failed, the common resource would have been destroyed either way, since it was destroyed before the firmware object. Which leads to kernel warning "refcount_t: underflow" which indicates possible use-after-free. Which is triggered when we try to destroy the common resource for the second time and execute refcount_dec_and_test(&common->refcount). So, let's fix the destruction order by factoring out the DCT QP logic to be in separate XArray database. refcount_t: underflow; use-after-free. WARNING: CPU: 8 PID: 1002 at lib/refcount.c:28 refcount_warn_saturate+0xd8/0xe0 Modules linked in: xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm mlx5_ib ib_uverbs ib_core overlay mlx5_core fuse CPU: 8 PID: 1002 Comm: python3 Not tainted 5.16.0-rc5+ #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:refcount_warn_saturate+0xd8/0xe0 Code: ff 48 c7 c7 18 f5 23 82 c6 05 60 70 ff 00 01 e8 d0 0a 45 00 0f 0b c3 48 c7 c7 c0 f4 23 82 c6 05 4c 70 ff 00 01 e8 ba 0a 45 00 <0f> 0b c3 0f 1f 44 00 00 8b 07 3d 00 00 00 c0 74 12 83 f8 01 74 13 RSP: 0018:ffff8881221d3aa8 EFLAGS: 00010286 RAX: 0000000000000000 RBX: ffff8881313e8d40 RCX: ffff88852cc1b5c8 RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ffff88852cc1b5c0 RBP: ffff888100f70000 R08: ffff88853ffd1ba8 R09: 0000000000000003 R10: 00000000fffff000 R11: 3fffffffffffffff R12: 0000000000000246 R13: ffff888100f71fa0 R14: ffff8881221d3c68 R15: 0000000000000020 FS: 00007efebbb13740(0000) GS:ffff88852cc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00005611aac29f80 CR3: 00000001313de004 CR4: 0000000000370ea0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> destroy_resource_common+0x6e/0x95 [mlx5_ib] mlx5_core_destroy_rq_tracked+0x38/0xbe [mlx5_ib] mlx5_ib_destroy_wq+0x22/0x80 [mlx5_ib] ib_destroy_wq_user+0x1f/0x40 [ib_core] uverbs_free_wq+0x19/0x40 [ib_uverbs] destroy_hw_idr_uobject+0x18/0x50 [ib_uverbs] uverbs_destroy_uobject+0x2f/0x190 [ib_uverbs] uobj_destroy+0x3c/0x80 [ib_uverbs] ib_uverbs_cmd_verbs+0x3e4/0xb80 [ib_uverbs] ? uverbs_free_wq+0x40/0x40 [ib_uverbs] ? ip_list_rcv+0xf7/0x120 ? netif_receive_skb_list_internal+0x1b6/0x2d0 ? task_tick_fair+0xbf/0x450 ? __handle_mm_fault+0x11fc/0x1450 ib_uverbs_ioctl+0xa4/0x110 [ib_uverbs] __x64_sys_ioctl+0x3e4/0x8e0 ? handle_mm_fault+0xb9/0x210 do_syscall_64+0x3d/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7efebc0be17b Code: 0f 1e fa 48 8b 05 1d ad 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ed ac 0c 00 f7 d8 64 89 01 48 RSP: 002b:00007ffe71813e78 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00007ffe71813fb8 RCX: 00007efebc0be17b RDX: 00007ffe71813fa0 RSI: 00000000c0181b01 RDI: 0000000000000005 RBP: 00007ffe71813f80 R08: 00005611aae96020 R09: 000000000000004f R10: 00007efebbf9ffa0 R11: 0000000000000246 R12: 00007ffe71813f80 R13: 00007ffe71813f4c R14: 00005611aae2eca0 R15: 00007efeae6c89d0 </TASK> Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/4470888466c8a898edc9833286967529cc5f3c0d.1685953497.git.leon@kernel.orgSigned-off-by: Leon Romanovsky <leon@kernel.org>
-
Leon Romanovsky authored
driver.h is common header to whole mlx5 code base, but struct mlx5_qp_table is used in mlx5_ib driver only. So move that struct to be under sole responsibility of mlx5_ib. Link: https://lore.kernel.org/r/bec0dc1158e795813b135d1143147977f26bf668.1685953497.git.leon@kernel.orgSigned-off-by: Leon Romanovsky <leonro@nvidia.com>
-
Patrisious Haddad authored
Nullifying qp->dbg is a preparation for the next patches from the series in which mlx5_core_destroy_qp() could actually fail, and then it can be called again which causes a kernel crash, since qp->dbg was not nullified in previous call. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Link: https://lore.kernel.org/r/1677e52bb642fd8d6062d73a5aa69083c0283dc9.1685953497.git.leon@kernel.orgSigned-off-by: Leon Romanovsky <leon@kernel.org>
-
- 09 Jun, 2023 4 commits
-
-
Bryan Tan authored
wr->opcode is unsigned; checking if it is negative is unnecessary. Fix this issue by removing the check. Fixes: 29c8d9eb ("IB: Add vmw_pvrdma driver") Link: https://lore.kernel.org/r/20230605183728.47021-1-bryantan@vmware.comReported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/r/07c48eee-0aca-48ee-897a-38588c341c41@kili.mountainSigned-off-by: Bryan Tan <bryantan@vmware.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
-
Bob Pearson authored
The IBA requires: o11-5.2.5: If the HCA supports SRQ, for RC and UD service, the CI shall generate a Last WQE Reached Affiliated Asynchronous Event on a QP that is in the Error State and is associated with an SRQ when either: • a CQE is generated for the last WQE, or • the QP gets in the Error State and there are no more WQEs on the RQ. This patch implements this behavior in flush_recv_queue() which is called as a result of rxe_qp_error() being called whenever the qp is put into the error state. The rxe responder executes SRQ WQEs directly from the SRQ so there are never more WQES on the RQ. Link: https://lore.kernel.org/r/20230602164229.9277-1-rpearsonhpe@gmail.comSigned-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
-
Bob Pearson authored
Implement the two easy cases of ib_rereg_user_mr. Link: https://lore.kernel.org/r/20230530221334.89432-7-rpearsonhpe@gmail.comSigned-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
-
Bob Pearson authored
In order to conform to other drivers stop using rkey == 0 as an indication that there are no remote access flags set. Set rkey == lkey by default for all MRs. Link: https://lore.kernel.org/r/20230530221334.89432-6-rpearsonhpe@gmail.comSigned-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
-