- 13 May, 2016 40 commits
-
-
Doug Ledford authored
-
Majd Dibbiny authored
Report Scatter FCS support when the Firmware supports as well. Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Majd Dibbiny authored
Enable Scatter FCS in the RQ context when the user passes Scatter FCS create flag. Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Majd Dibbiny authored
Raw Packet QPs that were created with Scatter FCS flag, will scatter the FCS into the receive buffers. Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Majd Dibbiny authored
Raw Scatter FCS device capability is set when the NIC supports scattering the FCS to the receive buffers of Raw Packet QPs. Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Majd Dibbiny authored
Since all the uverbs device_cap_flags are occupied, we need a place to expose more device capabilities. This patch adds a new 64 bit device_cap_flags_ex to expose new device capabilities. The lower 32 bits will be identical to the original device_cap_flags, The upper 32 bits will be new capabilities. Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Colin Ian King authored
passing hw_stats by value requires a 280 byte copy so instead pass it by reference is much more efficient. Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Chien Tin Tung <chien.tin.tung@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Lars-Peter Clausen authored
Calling synchronize_irq() right before free_irq() is quite useless. On one hand the IRQ can easily fire again before free_irq() is entered, on the other hand free_irq() itself calls synchronize_irq() internally (in a race condition free way), before any state associated with the IRQ is freed. Patch was generated using the following semantic patch: // <smpl> @@ expression irq; @@ -synchronize_irq(irq); free_irq(irq, ...); // </smpl> Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Acked-by: Faisal Latif <faisal.latif#intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Julia Lawall authored
The i40iw_vf_cqp_ops structure is never modified, so declare it as const. Done with the help of Coccinelle. Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Guy Levi authored
By this patch, the user space library will be able to improve performance using appropriate ringing DoorBell method according to the memory type it asked for. Currently only one mapping command is allowed for UARs: MLX5_IB_MMAP_REGULAR_PAGE. Using this mapping, the kernel maps the UARs to write-combining (WC) if the system supports it. If the system is not supporting WC the UARs are mapped to non-cached(NC). In this case the user space library can't tell which mapping is applied. This patch adds 2 new mapping commands: MLX5_IB_MMAP_WC_PAGE and MLX5_IB_MMAP_NC_PAGE. For these commands the kernel maps exactly as requested and fails if it can't. Since there is no generic way to check if the requested memory region can be mapped as WC, driver enables conclusive WC mapping only for x86, PowerPC and ARM which support WC for the device's memory region. Signed-off-by: Guy Levy <guyle@mellanox.com> Signed-off-by: Moshe Lazer <moshel@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Matan Barak authored
The current mlx5 code disallows mapping the free running counter of mlx5 based hardwares when PROT_EXEC is set. Although this behaviour is correct, Linux does add an implicit VM_EXEC to the vm_flags if the READ_IMPLIES_EXEC bit is set in the process personality. This happens for example if the process stack is executable. This causes libmlx5 to output a warning and prevents the user from reading the free running clock. Executing the init segment of the hardware isn't a security risk (at least no more than executing a process own stack), so we just prevent writes to there. Fixes: d69e3bcf ('IB/mlx5: Mmap the HCA's core clock register to user-space') Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Geliang Tang authored
Simplify the code in search_relocate_mgid0_group with by using list_for_each_entry_safe(). Signed-off-by: Geliang Tang <geliangtang@163.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Mark Bloch authored
Fixes a direct call to kfree_skb when nlmsg_free should be used. Fixes: 2ca546b9 ('IB/sa: Route SA pathrecord query through netlink') Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Mark Bloch authored
Fix array overrun when going over callback table. In declaration of callback table, the max size isn't provided and in registration phase, it is provided. There is potential scenario where a new operation is added and it is not supported by current client. The acceptance of such operation by ib_netlink will cause to array overrun. Fixes: 809d5fc9 ("infiniband: pass rdma_cm module to netlink_dump_start") Fixes: b493d91d ("iwcm: common code for port mapper") Fixes: 2ca546b9 ("IB/sa: Route SA pathrecord query through netlink") Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Mark Bloch authored
RDMA_NL_GET_OP is defined like this: (type & ((1 << 10) - 1)) which means op (defined as an int) can never be a negative number. Fixes: b2cbae2c ('RDMA: Add netlink infrastructure') Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Mark Bloch authored
In case ibnl_put_msg fails in send_nlmsg_done, the function returns with -ENOMEM without freeing. This patch fixes this behavior. Fixes: 30dc5e63 ("RDMA/core: Add support for iWARP Port Mapper user space service") Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Andy Shevchenko authored
There is no need to duplicate a lot of code that is in the kernel library for ages. Replace duplicating code by calling to print_hex_dump() directly. Note that output is slightly changed: - hex and ascii parts have just two spaces delimeter - there is no delimeter for ascii portions - file and line removed from prefix (they were redundant anyway since previous output shows same closer enough) Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Colin Ian King authored
fix spelling mistake, argumant -> argument Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Denys Vlasenko authored
This function compiles to 550 bytes of machine code. Three callsites, all in nes_create_qp. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> CC: Faisal Latif <faisal.latif@intel.com> CC: Doug Ledford <dledford@redhat.com> CC: linux-rdma@vger.kernel.org CC: linux-kernel@vger.kernel.org Reviewed-By: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Tatyana Nikolova authored
Adding sq and rq drain functions, which block until all previously posted wr-s in the specified queue have completed. A completion object is signaled to unblock the thread, when the last cqe for the corresponding queue is processed. Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com> Signed-off-by: Faisal Latif <faisal.latif@intel.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Bart Van Assche authored
Avoid that sparse complains about the comparison of s_addr with INADDR_ANY. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Steve Wise <swise@opengridcomputing.com> Cc: Faisal Latif <faisal.latif@intel.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Bart Van Assche authored
Avoid that the following BUG() is triggered against a debug kernel: kernel BUG at include/linux/scatterlist.h:92! RIP: 0010:[<ffffffffa0467199>] [<ffffffffa0467199>] srp_map_idb+0x199/0x1a0 [ib_srp] Call Trace: [<ffffffffa04685fa>] srp_map_data+0x84a/0x890 [ib_srp] [<ffffffffa0469674>] srp_queuecommand+0x1e4/0x610 [ib_srp] [<ffffffff813f5a5e>] scsi_dispatch_cmd+0x9e/0x180 [<ffffffff813f8b07>] scsi_request_fn+0x477/0x610 [<ffffffff81298ffe>] __blk_run_queue+0x2e/0x40 [<ffffffff81299070>] blk_delay_work+0x20/0x30 [<ffffffff81071f07>] process_one_work+0x197/0x480 [<ffffffff81072239>] worker_thread+0x49/0x490 [<ffffffff810787ea>] kthread+0xea/0x100 [<ffffffff8159b632>] ret_from_fork+0x22/0x40 Fixes: f7f7aab1 ("IB/srp: Convert to new registration API") Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Christoph Hellwig <hch@lst.de> Cc: <stable@vger.kernel.org> # v4.4+ Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Hans Westgaard Ry authored
IPoIB collects statistics of traffic including number of packets sent/received, number of bytes transferred, and certain errors. This patch makes these statistics available to be queried by ethtool. Signed-off-by: Hans Westgaard Ry <hans.westgaard.ry@oracle.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Tested-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Sebastian Andrzej Siewior authored
The last user of pkey_mutex was removed in db84f880 ("IB/ipoib: Use P_Key change event instead of P_Key polling mechanism") but the lock remained. This patch removes it. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Colin Ian King authored
passing hw_stats by value requires a 280 byte copy so instead pass it by reference is much more efficient. Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Chien Tin Tung <chien.tin.tung@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Lars-Peter Clausen authored
Calling synchronize_irq() right before free_irq() is quite useless. On one hand the IRQ can easily fire again before free_irq() is entered, on the other hand free_irq() itself calls synchronize_irq() internally (in a race condition free way), before any state associated with the IRQ is freed. Patch was generated using the following semantic patch: // <smpl> @@ expression irq; @@ -synchronize_irq(irq); free_irq(irq, ...); // </smpl> Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Acked-by: Faisal Latif <faisal.latif#intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Julia Lawall authored
The i40iw_vf_cqp_ops structure is never modified, so declare it as const. Done with the help of Coccinelle. Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Bart Van Assche authored
__force casts should be avoided if there is a better alternative. Hence modify the comparison of s_addr with INADDR_ANY such that the __force cast is no longer necessary. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Steve Wise <swise@opengridcomputing.com> Cc: Vipul Pandya <vipul@chelsio.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Hariprasad S authored
These handlers when called print error message to the kernel log, but the actual handling is done by _c4iw_free_ep() and process_timeout(). Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Hariprasad S authored
Currently c4iw_peer_abort_intr() does not wake up the waiter if the endpoint state indicates we're using MPAv2 and we're currently trying to connect. This was introduced with commit 7c0a33d6 ("RDMA/cxgb4: Don't wakeup threads for MPAv2") However, this original fix is flawed because it introduces a race that can cause a deadlock of the iwarp stack. Here is the race: ->local side sets up an active offload connection. ->local side sends MPA_START request. ->peer sends MPA_START response. ->local side ingress cpl thread begins processing the MPA_START response, but before it changes the state from MPA_REQ_SENT to FPDU_MODE: ->peer sends a RST which results in a ABORT_REQ_RSS. This triggers peer_abort_intr() which sees the state in MPA_REQ_SENT and since mpa_rev is 2, it will avoid waking up the endpoint with -ECONNRESET, assuming the stack will re-attempt the connection using MPAv1. ->Meanwhile, the cpl thread moves the state to FPDU_MODE and calls c4iw_modify_rc_qp() which calls rdma_init() which sends a RI_WR/INIT WR to firmware. But since HW sent an abort, FW correctly drops the RI_WR/INIT WR. ->So the cpl thread is stuck waiting for a reply and cannot process the ABORT_REQ_RSS cpl sitting in its input queue. Thus everything comes to a halt because no more ingress cpls are processed by the stack... The correct fix for the issue is to always do the wake up in c4iw_abort_intr() but reinitialize the wait object in c4iw_reconnect(). Fixes: 7c0a33d6 ("RDMA/cxgb4: Don't wakeup threads for MPAv2") Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Hariprasad S authored
Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Hariprasad S authored
Add get_ep_from_stid() which will atomically find and reference the endpoint struct if found. This avoids touch-after-free races between threads destroying listening endpoints and the CPL processing thread processing an incoming PASS_ACCEPT_REQ CPL. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Hariprasad S authored
c4iw_reject() and c4iw_accept() need to handle the case where the endpoint has timed out and is in the middle of ABORTING the connection. Here is the flow that causes the BUG_ON() to fire on the server side: 1) offload connection setup and endpoint timer started 2) MPA_START request received from peer, CONNECT_REQUEST passed to ULP 3) endpoint timer fires, and process_timeout() aborts the connection, this moves the endpoint state to ABORTING until HW sends up the ABORT_RPL_RSS. 4) application exits closing the CONNECT_REQUEST cm_id. The IWCM calls c4iw_reject_cr() to destroy this connection request. 5) WHAMO: BUG_ON() because the state is ABORTING. The fix is to change c4iw_reject_cr() and c4iw_accept_cr() to fail the operation if the state is not in MPA_REQ_RCVD vs in DEAD. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Hariprasad S authored
ARP failure may also happen when ep in FPDU_MODE and these failures need to be handled by process_timeout(). process_timeout() also has to handle case MPA_REQ_RCVD, setting abort to 1, leading to ep resource release. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Hariprasad S authored
Arp failure for send_mpa_reply/reject() is handled by freeing the mpa_skb in c4iw_free_ep() before releasing ep. Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Hariprasad S authored
There is a race between ULP threads calling c4iw_ep_disconnect() via c4iw_modify_rc_qp() and the ingress CPL thread where the ULP thread can free the endpoint just after the ingress CPL thread finds the ep pointer in the tid table. To avoid this, we now use the hwtid_idr table for lookups instead of the LLD tid table so we can lock around insert, remove, and lookup+get_ep to avoid the race. The CPL handlers now will either find the ep ptr and have a ref on it, or not find it and they can discard the CPL. Callers of get_ep_from_tid() will have a ref on the ep if found, and thus must deref when they are done. Negative advice in peer_abort_intr() need to dereference the ep. therefore peer_abort() is scheduled to dereference the ep later. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Hariprasad S authored
In abort_arp_failure(), the return value from c4iw_ofld_send() is ignored and thus if the CPL isn't sent, the endpoint is stuck and never gets aborted. Failure of c4iw_ofld_send() is treated as fatal error, and the ep resources are released in a safer context through process_work(). Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Hariprasad S authored
Moving the state to ABORTING causes the ep to get stuck because c4iw_ep_timeout() thinks the ABORT has already been done. So leave the state alone and let c4iw_ep_disconnect() do the right thing given the ep state. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Hariprasad S authored
->In act_open_rpl(), CPL_ERR_TCAM_FULL error handling branch, there is no handling of the return value of send_fw_act_open_req(). ->In send_fw_act_open_req(), there is no handling of return value of c4iw_l2t_send(), which may cause a ep leak and won't notify upper layers on connection establish failure. ->send_mpa_req() should act on the return from c4iw_l2t_send() and return the error to the caller. ->In case of c4iw_l2t_send() failure in send_mpa_req(), returns without starting the timer and not changing the ep state, which is further handled by act_establish() -> In act_establish()?if send_mpa_request's get_skb returns an error, may cause an ep leak. So handle return value of send_mpa_req() Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Hariprasad S authored
->Stop the ep timer after MPA negotiation so that the arp failures during send_mpa_reply/reject will be handled by process_timeout() after the ep timer expires. ->Added case MPA_REP_SENT in process_timeout(). ->For MPA reject, c4iw_ep_disconnect tries to start an already started timer, which leads to warning message "timer already started". -> In case of mpa reject stop the timer and call send_mpa_reject(). -> Added new ep flag STOP_MPA_TIMER to tell fw4_ack() to stop the timer only for send_mpa_reply(), which is set in c4iw_accept_cr(). Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-