- 31 Jul, 2017 4 commits
-
-
Kaike Wan authored
When an egress resource(SDMA descriptors, pio credits) is not available, a sending thread will be put on the resource's wait queue. When the resource becomes available again, up to a fixed number of sending threads can be awakened sequentially and removed from the wait queue, depending on the number of waiting threads and the number of free resources. Since each awakened sending thread will send as many packets as possible, it is highly likely that the first sending thread will consume all the egress resources. Subsequently, it will be put back to the end of the wait queue. Depending on the timing when the later sending threads wake up, they may not be able to send any packet and be again put back to the end of the wait queue sequentially, right behind the first sending thread. This starvation cycle continues until some sending threads exceed their retry limit and consequently fail. This patch fixes the issue by two simple approaches: (1) Any starved sending thread will be put to the head of the wait queue while a served sending thread will be put to the tail; (2) The most starved sending thread will be served first. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Mike Marciniszyn authored
When the debugpat kernel boot flag is turned on the following traces are printed: [ 1884.793168] x86/PAT: Overlap at 0x90000000-0x92000000 [ 1884.803510] x86/PAT: reserve_memtype added [mem 0x91200000-0x9127ffff], track uncached-minus, req write-combining, ret uncached-minus [ 1884.818167] hfi1 0000:05:00.0: hfi1_0: WC Remapped RcvArray: ffffc9000a980000 The ioremap_wc() clearly is not returning a write combining mapping due to an overlap where the RcvArray is mapped in a uncached mapping prior to creating the proposed write combining mapping. The patch replaces the single base register for uncached CSRs that used to overlap the RcvArray with two mappings. One, kregbase1, from the bar0 up to the RcvArray and another, kregbase2, from the end of the RcvArray to the pio send buffer space. A new dd field, base2_start, is used to convert the zero-based offset in the CSR routines to the correct kregbase1/kregbase2 mapping. A single direct write of the RcvArray CSRs is replaced with hfi1_put_tid() to insure correct access using the new disjoint mapping. Additionally, the kregend field is deleted since it is only ever written. patdebug now shows the RcvArray as write combining: [ 35.688990] x86/PAT: reserve_memtype added [mem 0x91200000-0x9127ffff], track write-combining, req write-combining, ret write-combining To insulate from any potential issues with write combining, all writeq are now flushed in hfi1_put_tid() and rcv_array_wc_fill(). Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Bartlomiej Dudek authored
Ensure that return values from kernel PCI config access API calls in HFI driver are checked and react properly if they are not expected (i.e. not successful). Reviewed-by: Jakub Byczkowski <jakub.byczkowski@intel.com> Signed-off-by: Bartlomiej Dudek <bartlomiej.dudek@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Arnd Bergmann authored
I ran into this build error on linux-next: drivers/infiniband/hw/hns/hns_roce_eq.c:477:8: error: unknown type name 'irqreturn_t' static irqreturn_t hns_roce_msi_x_interrupt(int irq, void *eq_ptr) ^~~~~~~~~~~ drivers/infiniband/hw/hns/hns_roce_eq.c: In function 'hns_roce_msi_x_interrupt': drivers/infiniband/hw/hns/hns_roce_eq.c:485:9: error: implicit declaration of function 'IRQ_RETVAL'; did you mean 'BPF_RVAL'? [-Werror=implicit-function-declaration] return IRQ_RETVAL(int_work); I have bisected this to a seemingly unrelated change that happened to remove some indirect header inclusions. Simply including the required header explicitly fixes the build failure. Fixes: 09c75704 ("xfrm: remove flow cache") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
- 28 Jul, 2017 6 commits
-
-
Doug Ledford authored
The 0day build system flags implicit includes as errors. A patch from Matan Barak to allow hns_roce, an aarch64 specific RDMA driver, to be built on other arches, but it resulted in build regressions. The problem is that hns_roce_device.h needs a definition for __raw_writeq but did not have an include to provide it. Add <linux/io.h> as an include to resolve the issue. Signed-off-by: Doug Ledford <dledford@redhat.com>
-
kbuild test robot authored
drivers/infiniband/hw/hns/hns_roce_eq.c:295:3-4: Unneeded semicolon Remove unneeded semicolon. Generated by: scripts/coccinelle/misc/semicolon.cocci Fixes: 7d1b6a678e0b ("IB/hns: Support compile test for hns RoCE driver") CC: Matan Barak <matanb@mellanox.com> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
kbuild test robot authored
drivers/infiniband/hw/hns/hns_roce_hw_v1.c:2026:5-8: Unneeded variable: "ret". Return "0" on line 2046 Remove unneeded variable used to store return value. Generated by: scripts/coccinelle/misc/returnvar.cocci Fixes: 7d1b6a678e0b ("IB/hns: Support compile test for hns RoCE driver") CC: Matan Barak <matanb@mellanox.com> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
kbuild test robot authored
drivers/infiniband/hw/hns/hns_roce_qp.c:802:9-10: WARNING: return of 0/1 in function 'hns_roce_wq_overflow' with return type bool Return statements in functions returning bool should use true/false instead of 1/0. Generated by: scripts/coccinelle/misc/boolreturn.cocci Fixes: 7d1b6a678e0b ("IB/hns: Support compile test for hns RoCE driver") CC: Matan Barak <matanb@mellanox.com> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Matan Barak authored
Compiling the hns RoCE driver requires ARM architecture. In order to simplify development of IB/core, support compile test. Add the necessary includes for that too. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Doug Ledford authored
The initial patch for changing the stack to use RoCEv2 GIDs by default set the CMA_PREFERRED_ROCE_GID_TYPE to an incorrect value. Instead of an absolute value, we needed to set the right bit in a bitmask. Correct the default setting so we use RoCEv2 by default. Fixes: 63a5f483 (IB/cma: Set default gid type to RoCEv2) Signed-off-by: Doug Ledford <dledford@redhat.com>
-
- 27 Jul, 2017 4 commits
-
-
Doug Ledford authored
-
Amrani, Ram authored
The number of supported WIDs, if they are supported at all, can be limited due to resources. Notifying the user space application the number of available WIDs allows it to utilize them correctly. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Amrani, Ram authored
Direct Packet Mode support may be disabled, e.g, due to limited resources. Notifying the user application prevents wasting cycles on attempting to send these kind of packets. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Doug Ledford authored
-
- 24 Jul, 2017 26 commits
-
-
Colin Ian King authored
Trivial fix to spelling mistake in mlx5_ib_dbg message Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Kaike Wan authored
This patch adds qp->s_ack_queue indices and size to qp_stats printout. This information will provide information about the receiving side. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Guy Levi authored
As a final step to support RSS feature, expose the RSS capabilities in query device verb. It includes: - Max rwq indirection tables. - Max rwq indirection table size. - Supported qp types. Signed-off-by: Guy Levi <guyle@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Guy Levi authored
Add support to work with a RSS QP by using an indirection table object upon QP creation. Other related QP verbs (e.g. modify/destroy/query) were updated as well for that QP mode. Notes: - The RX hash properties are supplied as driver private data. - The RSS QP port is used on the associated WQs in its indirection table. Applying different ports during WQ life time is not allowed. - The expected RSS QP flow is: create, modify(RST->INIT), modify(RST->RTR), destroy. Signed-off-by: Guy Levi <guyle@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Guy Levi authored
To enable RSS functionality the IB indirection table object (i.e. ib_rwq_ind_table) should be used. This patch implements the related verbs as of create and destroy an indirection table. In downstream patches the indirection table will be used as part of RSS QP creation. Signed-off-by: Guy Levi <guyle@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Guy Levi authored
Support create/modify/destroy WQ related verbs. The base IB object to enable RSS functionality is a WQ (i.e. ib_wq). This patch implements the related WQ verbs as of create, modify and destroy. In downstream patches the WQ will be used as part of an indirection table (i.e. ib_rwq_ind_table) to enable RSS QP creation. Notes: ConnectX-3 hardware requires consecutive WQNs list as receive descriptor queues for the RSS QP. Hence, the driver manages consecutive ranges lists per context which the user must respect. Destroying the WQ does not return its WQN back to its range for reusing. However, destroying all WQs from the same range releases the range and in turn releases its WQNs for reusing. Since the WQ object is not a natural object in the hardware, the driver implements the WQ by the hardware QP. As such, the WQ inherits its port from its RSS QP parent upon its RST->INIT transition and by that time its state is applied to the hardware. Signed-off-by: Guy Levi <guyle@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Moshe Shemesh authored
Adding visibility of resource usage of QPs, CQs and counters used by virtual functions. This feature will be used to give the PF administrator more data while debugging VF status. Usage info was added to ALLOC_RES command, to notify the PF if the resource which is being reserved or allocated for the VF will be used by kernel driver or by user verbs. Updated reservation and allocation functions of QP, CQ and counter with additional usage parameter. Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Maor Gottlieb authored
When inline-receive is enabled, the HCA may write received data into the receive WQE. Inline-receive is enabled by setting its matching bit in the QP context and each single-packet message with payload not exceeding the receive WQE size will be delivered to the WQE. The completion report will indicate that the payload was placed to the WQE. It includes: 1) Return maximum supported size of inline-receive by the hardware in query_device vendor's data part. 2) Enable the feature when requested by the vendor data input. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Parav Pandit authored
This patch adds below requester and responder side error counters, which will be exposed by hardware counters interface and are supported as part of query Q counters command extension. +---------------------------+-------------------------------------+ | Name | Description | |---------------------------+-------------------------------------| |resp_local_length_error | Number of times responder detected | | | local length errors | |---------------------------+-------------------------------------| |resp_cqe_error | Number of CQEs completed with error | | | at responder | |---------------------------+-------------------------------------| |req_cqe_error | Number of CQEs completed with error | | | at requester | |---------------------------+-------------------------------------| |req_remote_invalid_request | Number of times requester detected | | | remote invalid request error | |---------------------------+-------------------------------------| |req_remote_access_error | Number of times requester detected | | | remote access error | |---------------------------+-------------------------------------| |resp_remote_access_error | Number of times responder detected | | | remote access error | |---------------------------+-------------------------------------| |resp_cqe_flush_error | Number of CQEs completed with | | | flushed with error at responder | |---------------------------+-------------------------------------| |req_cqe_flush_error | Number of CQEs completed with | | | flushed with error at requester | +---------------------------+-------------------------------------+ Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Leon Romanovsky authored
The extended address vector is the highest bit in be32 variable, but it was compared with the lowest. This patch fixes the endianness of that check and removes already declared define. Fixes: 17d2f88f ("IB/mlx5: Add ODP atomics support") Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Majd Dibbiny authored
When we have a miss in one order of the mkey cache, we try to get an mkey from a higher order. We still need to check that the higher order can be used with UMR before using it. Otherwise, we will get an mkey with 0 entries and the post send operation that is used to fill it will complete with the following error: mlx5_0:dump_cqe:275:(pid 0): dump error cqe 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 0f007806 25000025 49ce59d2 Fixes: 49780d42 ("IB/mlx5: Expose MR cache for mlx5_ib") Cc: <stable@vger.kernel.org> # v4.10+ Signed-off-by: Majd Dibbiny <majd@mellanox.com> Reviewed-by: Ilya Lesokhin <ilyal@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Yishai Hadas authored
Report RX checksum capabilities when 'ipoib_enhanced_offloads' capabilities are set and support checksum. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Yishai Hadas authored
Report 'ipoib_enhanced_offloads' capabilities from the core layer, it will be used in the next patch from this series. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Yishai Hadas authored
In order to add multicast flow steering support, there is need to block the attaching of mcg flow for underlay QP, recognize multicast IB_FLOW_SPEC_IPV4 based on its IP and enable creating/destroying flow for IB layer. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Yishai Hadas authored
Allow user space applications to accelerate send and receive traffic which is typically handled by IPoIB ULP by creating a UD QP with a given source QPN of the IPoIB UD QP. UD QP with a given source QPN should basically be similar to RAW QP from point of view of its created resources. However, - Its TIS should point to the source QPN. - Modify can be done only on its state as the transport attributes are managed by its source QP. This patch manages below: - Creating/destroying/modifying UD QP with a given source QPN. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Yishai Hadas authored
Enable QP creation with a given source QP number, the created QP will use the source QPN as its wire QP number. To create such a QP, root privileges (i.e. CAP_NET_RAW) are required from the user application. This comes as a pre-patch for downstream patches in this series to allow user space applications to accelerate traffic which is typically handled by IPoIB ULP. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Yishai Hadas authored
Enable QP creation with a given source QP number. The created QP will use the source QPN as its wire QP number. This comes as a pre-patch for downstream patches in this series to allow user space applications to accelerate traffic which is typically handled by IPoIB ULP. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Noa Osherovich authored
When creating address handle from multicast GID, set MAC according to the appropriate formula instead of searching for it in the GID table: - For IPv4 multicast GID use ip_eth_mc_map(). - For IPv6 multicast GID use ipv6_eth_mc_map(). Signed-off-by: Noa Osherovich <noaos@mellanox.com> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Noa Osherovich authored
RoCEv2 Annex states that for RoCEv2 over IPv4, the corresponding IPv4 address is encoded into the GID according to the following rule: GID= :ffff:<IPv4 address> Remove the 0xff0e prefix for RoCEv2 packets with IPv4 and leave it zeroed and change rdma_is_multicast_addr() to consider the new logic. Signed-off-by: Noa Osherovich <noaos@mellanox.com> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Noa Osherovich authored
RoCE Annex (A16.9.10/11) declares that during attach (detach) QP to a multicast group, if the QP is associated with a RoCE port, the multicast group MLID is unused and is ignored. During attach or detach multicast, when the QP is associated with a port, it is enough to check the port's link layer and validate the LID only if it is Infiniband. Otherwise, avoid validating the multicast LID. Fixes: 8561eae6 ("IB/core: For multicast functions, verify that LIDs are multicast LIDs") Signed-off-by: Noa Osherovich <noaos@mellanox.com> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Maor Gottlieb authored
Add debugfs interface for monitor the number of delay drop timeout events and the number of existing dropless RQs in the system. In addition add debugfs interface for configuring the global timeout value which is used in the SET_DELAY_DROP command. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Maor Gottlieb authored
RQs that were configured for "delay drop" will prevent packet drops when their WQEs are depleted. Marking an RQ to be drop-less is done by setting delay_drop_en in RQ context using CREATE_RQ command. Since this feature is globally activated/deactivated by using the SET_DELAY_DROP command on all the marked RQs, we activated/deactivated it according to the number of RQs with 'delay_drop' enabled. When timeout is expired, then the feature is deactivated. Therefore the driver handles the delay drop timeout event and reactivate it. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Maor Gottlieb authored
When delay drop timeout is expired, the firmware raises general notification event of DELAY_DROP_TIMEOUT subtype. In addition the feature is disable so the driver have to reactivate the timeout. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Maor Gottlieb authored
Add support to SET_DELAY_DROP command. This command will be used in downstream patches for delay packet drop. The timeout value should be indicated by delay_drop_timeout field. Packet processing will be delayed till timeout value passed or until more WQEs are posted. Setting this value to 0 disables the feature. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Maor Gottlieb authored
Work queue which is created with IB_WQ_FLAGS_DELAY_DROP won't cause packet drops when there aren't receive WQEs, but will wait until posting of receive WQEs or for some period of time that the device was configured with. It includes: * Add a new creation flag to enable delay drop functionality in a WQ. * A new capability was introduced - IB_RAW_PACKET_CAP_DELAY_DROP, which is the device's ability to delay packet drops when there aren't receive WQEs. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Bodong Wang authored
When a user sets port_guid, node_guid or policy of an IB virtual function, save this information in "struct mlx5_vf_context". This information will be restored later when pci_resume is called. To make sure this works, one can use aer-inject to generate PCI errors on mlx5 devices and verify if relevant fields are restored after PCI resume. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-