Commit aa9d4648 authored by Linus Torvalds's avatar Linus Torvalds

Merge tag 'for-linus-ioctl' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma

Pull rdma updates from Doug Ledford:
 "This is a big pull request.

  Of note is that I'm sending you the new ioctl API for the rdma
  subsystem. We put it up on linux-api@, but didn't get much response.
  The API is complex, but it solves two different problems in one go:

   1) The bi-directional nature of the RDMA file write calls, which
      created the security hole we had to handle (and for which the fix
      is now causing problems for systems in production, we were a bit
      over zealous in the fix and the ability to open a device, then
      fork, then create new queue pairs on the device and use them is
      broken).

   2) The bloat caused by different vendors implementing extensions to
      the base verbs API. Each vendor's hardware is slightly different,
      and the hardware might be suitable for one extension but not
      another.

      By the time we add generic extensions for all the different ways
      that the different hardware can offload things, the API becomes
      bloated. Things like our completion structs have started to exceed
      a cache line in size because of all the elements needed to support
      this. That in turn shows up heavily in the performance graphs with
      a noticable drop in performance on 100Gigabit links as our
      completion structs go from occupying one cache line to 1+.

      This API makes things like the completion structs modular in a
      very similar way to netlink so that your structs can only include
      the items needed for the offloads/features you are actually using
      on a given queue pair. In that way we support everything, but only
      use what we need, and our structs stay smaller.

  The ioctl API is better explained by the posting on linux-api@ than I
  can explain it here, so I'll just leave it at that.

  The rest of the pull request is typical stuff.

  Updates for 4.14 kernel merge window

   - Lots of hfi1 driver updates (mixed with a few qib and core updates
     as well)

   - rxe updates

   - various mlx updates

   - Set default roce type to RoCEv2

   - Several larger fixes for bnxt_re that were too big for -rc

   - Several larger fixes for qedr that, likewise, were too big for -rc

   - Misc core changes

   - Make the hns_roce driver compilable on arches other than aarch64 so
     we can more easily debug build issues related to it

   - Add rdma-netlink infrastructure updates

   - Add automatic IRQ affinity infrastructure

   - Add 32bit lid support

   - Lots of misc fixes across the subsystem from random people

   - Autoloading of RDMA netlink modules

   - PCI pool cleanups from Romain Perier

   - mlx5 driver feature additions and fixes

   - Hardware tag matchine feature

   - Fix sleeping in atomic when resolving roce ah

   - Add experimental ioctl interface as posted to linux-api@"

* tag 'for-linus-ioctl' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (328 commits)
  IB/core: Expose ioctl interface through experimental Kconfig
  IB/core: Assign root to all drivers
  IB/core: Add completion queue (cq) object actions
  IB/core: Add legacy driver's user-data
  IB/core: Export ioctl enum types to user-space
  IB/core: Explicitly destroy an object while keeping uobject
  IB/core: Add macros for declaring methods and attributes
  IB/core: Add uverbs merge trees functionality
  IB/core: Add DEVICE object and root tree structure
  IB/core: Declare an object instead of declaring only type attributes
  IB/core: Add new ioctl interface
  RDMA/vmw_pvrdma: Fix a signedness
  RDMA/vmw_pvrdma: Report network header type in WC
  IB/core: Add might_sleep() annotation to ib_init_ah_from_wc()
  IB/cm: Fix sleeping in atomic when RoCE is used
  IB/core: Add support to finalize objects in one transaction
  IB/core: Add a generic way to execute an operation on a uobject
  Documentation: Hardware tag matching
  IB/mlx5: Support IB_SRQT_TM
  net/mlx5: Add XRQ support
  ...
parents 906dde0f 8eb19e8e
Tag matching logic
The MPI standard defines a set of rules, known as tag-matching, for matching
source send operations to destination receives. The following parameters must
match the following source and destination parameters:
* Communicator
* User tag - wild card may be specified by the receiver
* Source rank – wild car may be specified by the receiver
* Destination rank – wild
The ordering rules require that when more than one pair of send and receive
message envelopes may match, the pair that includes the earliest posted-send
and the earliest posted-receive is the pair that must be used to satisfy the
matching operation. However, this doesn’t imply that tags are consumed in
the order they are created, e.g., a later generated tag may be consumed, if
earlier tags can’t be used to satisfy the matching rules.
When a message is sent from the sender to the receiver, the communication
library may attempt to process the operation either after or before the
corresponding matching receive is posted. If a matching receive is posted,
this is an expected message, otherwise it is called an unexpected message.
Implementations frequently use different matching schemes for these two
different matching instances.
To keep MPI library memory footprint down, MPI implementations typically use
two different protocols for this purpose:
1. The Eager protocol- the complete message is sent when the send is
processed by the sender. A completion send is received in the send_cq
notifying that the buffer can be reused.
2. The Rendezvous Protocol - the sender sends the tag-matching header,
and perhaps a portion of data when first notifying the receiver. When the
corresponding buffer is posted, the responder will use the information from
the header to initiate an RDMA READ operation directly to the matching buffer.
A fin message needs to be received in order for the buffer to be reused.
Tag matching implementation
There are two types of matching objects used, the posted receive list and the
unexpected message list. The application posts receive buffers through calls
to the MPI receive routines in the posted receive list and posts send messages
using the MPI send routines. The head of the posted receive list may be
maintained by the hardware, with the software expected to shadow this list.
When send is initiated and arrives at the receive side, if there is no
pre-posted receive for this arriving message, it is passed to the software and
placed in the unexpected message list. Otherwise the match is processed,
including rendezvous processing, if appropriate, delivering the data to the
specified receive buffer. This allows overlapping receive-side MPI tag
matching with computation.
When a receive-message is posted, the communication library will first check
the software unexpected message list for a matching receive. If a match is
found, data is delivered to the user buffer, using a software controlled
protocol. The UCX implementation uses either an eager or rendezvous protocol,
depending on data size. If no match is found, the entire pre-posted receive
list is maintained by the hardware, and there is space to add one more
pre-posted receive to this list, this receive is passed to the hardware.
Software is expected to shadow this list, to help with processing MPI cancel
operations. In addition, because hardware and software are not expected to be
tightly synchronized with respect to the tag-matching operation, this shadow
list is used to detect the case that a pre-posted receive is passed to the
hardware, as the matching unexpected message is being passed from the hardware
to the software.
......@@ -206,4 +206,9 @@ config BLK_MQ_VIRTIO
depends on BLOCK && VIRTIO
default y
config BLK_MQ_RDMA
bool
depends on BLOCK && INFINIBAND
default y
source block/Kconfig.iosched
......@@ -29,6 +29,7 @@ obj-$(CONFIG_BLK_CMDLINE_PARSER) += cmdline-parser.o
obj-$(CONFIG_BLK_DEV_INTEGRITY) += bio-integrity.o blk-integrity.o t10-pi.o
obj-$(CONFIG_BLK_MQ_PCI) += blk-mq-pci.o
obj-$(CONFIG_BLK_MQ_VIRTIO) += blk-mq-virtio.o
obj-$(CONFIG_BLK_MQ_RDMA) += blk-mq-rdma.o
obj-$(CONFIG_BLK_DEV_ZONED) += blk-zoned.o
obj-$(CONFIG_BLK_WBT) += blk-wbt.o
obj-$(CONFIG_BLK_DEBUG_FS) += blk-mq-debugfs.o
......
/*
* Copyright (c) 2017 Sagi Grimberg.
*
* This program is free software; you can redistribute it and/or modify it
* under the terms and conditions of the GNU General Public License,
* version 2, as published by the Free Software Foundation.
*
* This program is distributed in the hope it will be useful, but WITHOUT
* ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
* FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
* more details.
*/
#include <linux/blk-mq.h>
#include <linux/blk-mq-rdma.h>
#include <rdma/ib_verbs.h>
/**
* blk_mq_rdma_map_queues - provide a default queue mapping for rdma device
* @set: tagset to provide the mapping for
* @dev: rdma device associated with @set.
* @first_vec: first interrupt vectors to use for queues (usually 0)
*
* This function assumes the rdma device @dev has at least as many available
* interrupt vetors as @set has queues. It will then query it's affinity mask
* and built queue mapping that maps a queue to the CPUs that have irq affinity
* for the corresponding vector.
*
* In case either the driver passed a @dev with less vectors than
* @set->nr_hw_queues, or @dev does not provide an affinity mask for a
* vector, we fallback to the naive mapping.
*/
int blk_mq_rdma_map_queues(struct blk_mq_tag_set *set,
struct ib_device *dev, int first_vec)
{
const struct cpumask *mask;
unsigned int queue, cpu;
for (queue = 0; queue < set->nr_hw_queues; queue++) {
mask = ib_get_vector_affinity(dev, first_vec + queue);
if (!mask)
goto fallback;
for_each_cpu(cpu, mask)
set->mq_map[cpu] = queue;
}
return 0;
fallback:
return blk_mq_map_queues(set);
}
EXPORT_SYMBOL_GPL(blk_mq_rdma_map_queues);
......@@ -34,6 +34,15 @@ config INFINIBAND_USER_ACCESS
libibverbs, libibcm and a hardware driver library from
<http://www.openfabrics.org/git/>.
config INFINIBAND_EXP_USER_ACCESS
bool "Allow experimental support for Infiniband ABI"
depends on INFINIBAND_USER_ACCESS
---help---
IOCTL based ABI support for Infiniband. This allows userspace
to invoke the experimental IOCTL based ABI.
These commands are parsed via per-device parsing tree and
enables per-device features.
config INFINIBAND_USER_MEM
bool
depends on INFINIBAND_USER_ACCESS != n
......
......@@ -11,7 +11,8 @@ ib_core-y := packer.o ud_header.o verbs.o cq.o rw.o sysfs.o \
device.o fmr_pool.o cache.o netlink.o \
roce_gid_mgmt.o mr_pool.o addr.o sa_query.o \
multicast.o mad.o smi.o agent.o mad_rmpp.o \
security.o
security.o nldev.o
ib_core-$(CONFIG_INFINIBAND_USER_MEM) += umem.o
ib_core-$(CONFIG_INFINIBAND_ON_DEMAND_PAGING) += umem_odp.o umem_rbtree.o
ib_core-$(CONFIG_CGROUP_RDMA) += cgroup.o
......@@ -31,4 +32,5 @@ ib_umad-y := user_mad.o
ib_ucm-y := ucm.o
ib_uverbs-y := uverbs_main.o uverbs_cmd.o uverbs_marshall.o \
rdma_core.o uverbs_std_types.o
rdma_core.o uverbs_std_types.o uverbs_ioctl.o \
uverbs_ioctl_merge.o
......@@ -130,13 +130,11 @@ static void ib_nl_process_good_ip_rsep(const struct nlmsghdr *nlh)
}
int ib_nl_handle_ip_res_resp(struct sk_buff *skb,
struct netlink_callback *cb)
struct nlmsghdr *nlh,
struct netlink_ext_ack *extack)
{
const struct nlmsghdr *nlh = (struct nlmsghdr *)cb->nlh;
if ((nlh->nlmsg_flags & NLM_F_REQUEST) ||
!(NETLINK_CB(skb).sk) ||
!netlink_capable(skb, CAP_NET_ADMIN))
!(NETLINK_CB(skb).sk))
return -EPERM;
if (ib_nl_is_good_ip_resp(nlh))
......@@ -186,7 +184,7 @@ static int ib_nl_ip_send_msg(struct rdma_dev_addr *dev_addr,
/* Repair the nlmsg header length */
nlmsg_end(skb, nlh);
ibnl_multicast(skb, nlh, RDMA_NL_GROUP_LS, GFP_KERNEL);
rdma_nl_multicast(skb, RDMA_NL_GROUP_LS, GFP_KERNEL);
/* Make the request retry, so when we get the response from userspace
* we will have something.
......@@ -326,7 +324,7 @@ static void queue_req(struct addr_req *req)
static int ib_nl_fetch_ha(struct dst_entry *dst, struct rdma_dev_addr *dev_addr,
const void *daddr, u32 seq, u16 family)
{
if (ibnl_chk_listeners(RDMA_NL_GROUP_LS))
if (rdma_nl_chk_listeners(RDMA_NL_GROUP_LS))
return -EADDRNOTAVAIL;
/* We fill in what we can, the response will fill the rest */
......
......@@ -1199,30 +1199,23 @@ int ib_cache_setup_one(struct ib_device *device)
device->cache.ports =
kzalloc(sizeof(*device->cache.ports) *
(rdma_end_port(device) - rdma_start_port(device) + 1), GFP_KERNEL);
if (!device->cache.ports) {
err = -ENOMEM;
goto out;
}
if (!device->cache.ports)
return -ENOMEM;
err = gid_table_setup_one(device);
if (err)
goto out;
if (err) {
kfree(device->cache.ports);
device->cache.ports = NULL;
return err;
}
for (p = 0; p <= rdma_end_port(device) - rdma_start_port(device); ++p)
ib_cache_update(device, p + rdma_start_port(device), true);
INIT_IB_EVENT_HANDLER(&device->cache.event_handler,
device, ib_cache_event);
err = ib_register_event_handler(&device->cache.event_handler);
if (err)
goto err;
ib_register_event_handler(&device->cache.event_handler);
return 0;
err:
gid_table_cleanup_one(device);
out:
return err;
}
void ib_cache_release_one(struct ib_device *device)
......
This diff is collapsed.
......@@ -72,6 +72,7 @@ MODULE_LICENSE("Dual BSD/GPL");
#define CMA_MAX_CM_RETRIES 15
#define CMA_CM_MRA_SETTING (IB_CM_MRA_FLAG_DELAY | 24)
#define CMA_IBOE_PACKET_LIFETIME 18
#define CMA_PREFERRED_ROCE_GID_TYPE IB_GID_TYPE_ROCE_UDP_ENCAP
static const char * const cma_events[] = {
[RDMA_CM_EVENT_ADDR_RESOLVED] = "address resolved",
......@@ -3998,7 +3999,8 @@ static void iboe_mcast_work_handler(struct work_struct *work)
kfree(mw);
}
static void cma_iboe_set_mgid(struct sockaddr *addr, union ib_gid *mgid)
static void cma_iboe_set_mgid(struct sockaddr *addr, union ib_gid *mgid,
enum ib_gid_type gid_type)
{
struct sockaddr_in *sin = (struct sockaddr_in *)addr;
struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)addr;
......@@ -4008,8 +4010,8 @@ static void cma_iboe_set_mgid(struct sockaddr *addr, union ib_gid *mgid)
} else if (addr->sa_family == AF_INET6) {
memcpy(mgid, &sin6->sin6_addr, sizeof *mgid);
} else {
mgid->raw[0] = 0xff;
mgid->raw[1] = 0x0e;
mgid->raw[0] = (gid_type == IB_GID_TYPE_IB) ? 0xff : 0;
mgid->raw[1] = (gid_type == IB_GID_TYPE_IB) ? 0x0e : 0;
mgid->raw[2] = 0;
mgid->raw[3] = 0;
mgid->raw[4] = 0;
......@@ -4050,7 +4052,9 @@ static int cma_iboe_join_multicast(struct rdma_id_private *id_priv,
goto out1;
}
cma_iboe_set_mgid(addr, &mc->multicast.ib->rec.mgid);
gid_type = id_priv->cma_dev->default_gid_type[id_priv->id.port_num -
rdma_start_port(id_priv->cma_dev->device)];
cma_iboe_set_mgid(addr, &mc->multicast.ib->rec.mgid, gid_type);
mc->multicast.ib->rec.pkey = cpu_to_be16(0xffff);
if (id_priv->id.ps == RDMA_PS_UDP)
......@@ -4066,8 +4070,6 @@ static int cma_iboe_join_multicast(struct rdma_id_private *id_priv,
mc->multicast.ib->rec.hop_limit = 1;
mc->multicast.ib->rec.mtu = iboe_get_mtu(ndev->mtu);
gid_type = id_priv->cma_dev->default_gid_type[id_priv->id.port_num -
rdma_start_port(id_priv->cma_dev->device)];
if (addr->sa_family == AF_INET) {
if (gid_type == IB_GID_TYPE_ROCE_UDP_ENCAP) {
mc->multicast.ib->rec.hop_limit = IPV6_DEFAULT_HOPLIMIT;
......@@ -4280,8 +4282,12 @@ static void cma_add_one(struct ib_device *device)
for (i = rdma_start_port(device); i <= rdma_end_port(device); i++) {
supported_gids = roce_gid_type_mask_support(device, i);
WARN_ON(!supported_gids);
cma_dev->default_gid_type[i - rdma_start_port(device)] =
find_first_bit(&supported_gids, BITS_PER_LONG);
if (supported_gids & (1 << CMA_PREFERRED_ROCE_GID_TYPE))
cma_dev->default_gid_type[i - rdma_start_port(device)] =
CMA_PREFERRED_ROCE_GID_TYPE;
else
cma_dev->default_gid_type[i - rdma_start_port(device)] =
find_first_bit(&supported_gids, BITS_PER_LONG);
cma_dev->default_roce_tos[i - rdma_start_port(device)] = 0;
}
......@@ -4452,9 +4458,8 @@ static int cma_get_id_stats(struct sk_buff *skb, struct netlink_callback *cb)
return skb->len;
}
static const struct ibnl_client_cbs cma_cb_table[] = {
[RDMA_NL_RDMA_CM_ID_STATS] = { .dump = cma_get_id_stats,
.module = THIS_MODULE },
static const struct rdma_nl_cbs cma_cb_table[] = {
[RDMA_NL_RDMA_CM_ID_STATS] = { .dump = cma_get_id_stats},
};
static int cma_init_net(struct net *net)
......@@ -4506,9 +4511,7 @@ static int __init cma_init(void)
if (ret)
goto err;
if (ibnl_add_client(RDMA_NL_RDMA_CM, ARRAY_SIZE(cma_cb_table),
cma_cb_table))
pr_warn("RDMA CMA: failed to add netlink callback\n");
rdma_nl_register(RDMA_NL_RDMA_CM, cma_cb_table);
cma_configfs_init();
return 0;
......@@ -4525,7 +4528,7 @@ static int __init cma_init(void)
static void __exit cma_cleanup(void)
{
cma_configfs_exit();
ibnl_remove_client(RDMA_NL_RDMA_CM);
rdma_nl_unregister(RDMA_NL_RDMA_CM);
ib_unregister_client(&cma_client);
unregister_netdevice_notifier(&cma_nb);
rdma_addr_unregister_client(&addr_client);
......@@ -4534,5 +4537,7 @@ static void __exit cma_cleanup(void)
destroy_workqueue(cma_wq);
}
MODULE_ALIAS_RDMA_NETLINK(RDMA_NL_RDMA_CM, 1);
module_init(cma_init);
module_exit(cma_cleanup);
......@@ -38,6 +38,7 @@
#include <linux/cgroup_rdma.h>
#include <rdma/ib_verbs.h>
#include <rdma/opa_addr.h>
#include <rdma/ib_mad.h>
#include "mad_priv.h"
......@@ -102,6 +103,14 @@ void ib_enum_all_roce_netdevs(roce_netdev_filter filter,
roce_netdev_callback cb,
void *cookie);
typedef int (*nldev_callback)(struct ib_device *device,
struct sk_buff *skb,
struct netlink_callback *cb,
unsigned int idx);
int ib_enum_all_devs(nldev_callback nldev_cb, struct sk_buff *skb,
struct netlink_callback *cb);
enum ib_cache_gid_default_mode {
IB_CACHE_GID_DEFAULT_MODE_SET,
IB_CACHE_GID_DEFAULT_MODE_DELETE
......@@ -179,8 +188,8 @@ void ib_mad_cleanup(void);
int ib_sa_init(void);
void ib_sa_cleanup(void);
int ibnl_init(void);
void ibnl_cleanup(void);
int rdma_nl_init(void);
void rdma_nl_exit(void);
/**
* Check if there are any listeners to the netlink group
......@@ -190,11 +199,14 @@ void ibnl_cleanup(void);
int ibnl_chk_listeners(unsigned int group);
int ib_nl_handle_resolve_resp(struct sk_buff *skb,
struct netlink_callback *cb);
struct nlmsghdr *nlh,
struct netlink_ext_ack *extack);
int ib_nl_handle_set_timeout(struct sk_buff *skb,
struct netlink_callback *cb);
struct nlmsghdr *nlh,
struct netlink_ext_ack *extack);
int ib_nl_handle_ip_res_resp(struct sk_buff *skb,
struct netlink_callback *cb);
struct nlmsghdr *nlh,
struct netlink_ext_ack *extack);
int ib_get_cached_subnet_prefix(struct ib_device *device,
u8 port_num,
......@@ -301,4 +313,9 @@ static inline int ib_mad_enforce_security(struct ib_mad_agent_private *map,
return 0;
}
#endif
struct ib_device *__ib_device_get_by_index(u32 ifindex);
/* RDMA device netlink */
void nldev_init(void);
void nldev_exit(void);
#endif /* _CORE_PRIV_H */
......@@ -134,6 +134,17 @@ static int ib_device_check_mandatory(struct ib_device *device)
return 0;
}
struct ib_device *__ib_device_get_by_index(u32 index)
{
struct ib_device *device;
list_for_each_entry(device, &device_list, core_list)
if (device->index == index)
return device;
return NULL;
}
static struct ib_device *__ib_device_get_by_name(const char *name)
{
struct ib_device *device;
......@@ -145,7 +156,6 @@ static struct ib_device *__ib_device_get_by_name(const char *name)
return NULL;
}
static int alloc_name(char *name)
{
unsigned long *inuse;
......@@ -326,10 +336,10 @@ static int read_port_immutable(struct ib_device *device)
return 0;
}
void ib_get_device_fw_str(struct ib_device *dev, char *str, size_t str_len)
void ib_get_device_fw_str(struct ib_device *dev, char *str)
{
if (dev->get_dev_fw_str)
dev->get_dev_fw_str(dev, str, str_len);
dev->get_dev_fw_str(dev, str);
else
str[0] = '\0';
}
......@@ -394,6 +404,30 @@ static int ib_security_change(struct notifier_block *nb, unsigned long event,
return NOTIFY_OK;
}
/**
* __dev_new_index - allocate an device index
*
* Returns a suitable unique value for a new device interface
* number. It assumes that there are less than 2^32-1 ib devices
* will be present in the system.
*/
static u32 __dev_new_index(void)
{
/*
* The device index to allow stable naming.
* Similar to struct net -> ifindex.
*/
static u32 index;
for (;;) {
if (!(++index))
index = 1;
if (!__ib_device_get_by_index(index))
return index;
}
}
/**
* ib_register_device - Register an IB device with IB core
* @device:Device to register
......@@ -489,9 +523,10 @@ int ib_register_device(struct ib_device *device,
device->reg_state = IB_DEV_REGISTERED;
list_for_each_entry(client, &client_list, list)
if (client->add && !add_client_context(device, client))
if (!add_client_context(device, client) && client->add)
client->add(device);
device->index = __dev_new_index();
down_write(&lists_rwsem);
list_add_tail(&device->core_list, &device_list);
up_write(&lists_rwsem);
......@@ -578,7 +613,7 @@ int ib_register_client(struct ib_client *client)
mutex_lock(&device_mutex);
list_for_each_entry(device, &device_list, core_list)
if (client->add && !add_client_context(device, client))
if (!add_client_context(device, client) && client->add)
client->add(device);
down_write(&lists_rwsem);
......@@ -712,7 +747,7 @@ EXPORT_SYMBOL(ib_set_client_data);
* chapter 11 of the InfiniBand Architecture Specification). This
* callback may occur in interrupt context.
*/
int ib_register_event_handler (struct ib_event_handler *event_handler)
void ib_register_event_handler(struct ib_event_handler *event_handler)
{
unsigned long flags;
......@@ -720,8 +755,6 @@ int ib_register_event_handler (struct ib_event_handler *event_handler)
list_add_tail(&event_handler->list,
&event_handler->device->event_handler_list);
spin_unlock_irqrestore(&event_handler->device->event_handler_lock, flags);
return 0;
}
EXPORT_SYMBOL(ib_register_event_handler);
......@@ -732,15 +765,13 @@ EXPORT_SYMBOL(ib_register_event_handler);
* Unregister an event handler registered with
* ib_register_event_handler().
*/
int ib_unregister_event_handler(struct ib_event_handler *event_handler)
void ib_unregister_event_handler(struct ib_event_handler *event_handler)
{
unsigned long flags;
spin_lock_irqsave(&event_handler->device->event_handler_lock, flags);
list_del(&event_handler->list);
spin_unlock_irqrestore(&event_handler->device->event_handler_lock, flags);
return 0;
}
EXPORT_SYMBOL(ib_unregister_event_handler);
......@@ -893,6 +924,31 @@ void ib_enum_all_roce_netdevs(roce_netdev_filter filter,
up_read(&lists_rwsem);
}
/**
* ib_enum_all_devs - enumerate all ib_devices
* @cb: Callback to call for each found ib_device
*
* Enumerates all ib_devices and calls callback() on each device.
*/
int ib_enum_all_devs(nldev_callback nldev_cb, struct sk_buff *skb,
struct netlink_callback *cb)
{
struct ib_device *dev;
unsigned int idx = 0;
int ret = 0;
down_read(&lists_rwsem);
list_for_each_entry(dev, &device_list, core_list) {
ret = nldev_cb(dev, skb, cb, idx);
if (ret)
break;
idx++;
}
up_read(&lists_rwsem);
return ret;
}
/**
* ib_query_pkey - Get P_Key table entry
* @device:Device to query
......@@ -945,14 +1001,17 @@ int ib_modify_port(struct ib_device *device,
u8 port_num, int port_modify_mask,
struct ib_port_modify *port_modify)
{
if (!device->modify_port)
return -ENOSYS;
int rc;
if (!rdma_is_port_valid(device, port_num))
return -EINVAL;
return device->modify_port(device, port_num, port_modify_mask,
port_modify);
if (device->modify_port)
rc = device->modify_port(device, port_num, port_modify_mask,
port_modify);
else
rc = rdma_protocol_roce(device, port_num) ? 0 : -ENOSYS;
return rc;
}
EXPORT_SYMBOL(ib_modify_port);
......@@ -1087,29 +1146,21 @@ struct net_device *ib_get_net_dev_by_params(struct ib_device *dev,
}
EXPORT_SYMBOL(ib_get_net_dev_by_params);
static struct ibnl_client_cbs ibnl_ls_cb_table[] = {
static const struct rdma_nl_cbs ibnl_ls_cb_table[] = {
[RDMA_NL_LS_OP_RESOLVE] = {
.dump = ib_nl_handle_resolve_resp,
.module = THIS_MODULE },
.doit = ib_nl_handle_resolve_resp,
.flags = RDMA_NL_ADMIN_PERM,
},
[RDMA_NL_LS_OP_SET_TIMEOUT] = {
.dump = ib_nl_handle_set_timeout,
.module = THIS_MODULE },
.doit = ib_nl_handle_set_timeout,
.flags = RDMA_NL_ADMIN_PERM,
},
[RDMA_NL_LS_OP_IP_RESOLVE] = {
.dump = ib_nl_handle_ip_res_resp,
.module = THIS_MODULE },
.doit = ib_nl_handle_ip_res_resp,
.flags = RDMA_NL_ADMIN_PERM,
},
};
static int ib_add_ibnl_clients(void)
{
return ibnl_add_client(RDMA_NL_LS, ARRAY_SIZE(ibnl_ls_cb_table),
ibnl_ls_cb_table);
}
static void ib_remove_ibnl_clients(void)
{
ibnl_remove_client(RDMA_NL_LS);
}
static int __init ib_core_init(void)
{
int ret;
......@@ -1131,9 +1182,9 @@ static int __init ib_core_init(void)
goto err_comp;
}
ret = ibnl_init();
ret = rdma_nl_init();
if (ret) {
pr_warn("Couldn't init IB netlink interface\n");
pr_warn("Couldn't init IB netlink interface: err %d\n", ret);
goto err_sysfs;
}
......@@ -1155,24 +1206,18 @@ static int __init ib_core_init(void)
goto err_mad;
}
ret = ib_add_ibnl_clients();
if (ret) {
pr_warn("Couldn't register ibnl clients\n");
goto err_sa;
}
ret = register_lsm_notifier(&ibdev_lsm_nb);
if (ret) {
pr_warn("Couldn't register LSM notifier. ret %d\n", ret);
goto err_ibnl_clients;
goto err_sa;
}
nldev_init();
rdma_nl_register(RDMA_NL_LS, ibnl_ls_cb_table);
ib_cache_setup();
return 0;
err_ibnl_clients:
ib_remove_ibnl_clients();
err_sa:
ib_sa_cleanup();
err_mad:
......@@ -1180,7 +1225,7 @@ static int __init ib_core_init(void)
err_addr:
addr_cleanup();
err_ibnl:
ibnl_cleanup();
rdma_nl_exit();
err_sysfs:
class_unregister(&ib_class);
err_comp:
......@@ -1192,18 +1237,21 @@ static int __init ib_core_init(void)
static void __exit ib_core_cleanup(void)
{
unregister_lsm_notifier(&ibdev_lsm_nb);
ib_cache_cleanup();
ib_remove_ibnl_clients();
nldev_exit();
rdma_nl_unregister(RDMA_NL_LS);
unregister_lsm_notifier(&ibdev_lsm_nb);
ib_sa_cleanup();
ib_mad_cleanup();
addr_cleanup();
ibnl_cleanup();
rdma_nl_exit();
class_unregister(&ib_class);
destroy_workqueue(ib_comp_wq);
/* Make sure that any pending umem accounting work is done. */
destroy_workqueue(ib_wq);
}
MODULE_ALIAS_RDMA_NETLINK(RDMA_NL_LS, 4);
module_init(ib_core_init);
module_exit(ib_core_cleanup);
......@@ -80,7 +80,7 @@ const char *__attribute_const__ iwcm_reject_msg(int reason)
}
EXPORT_SYMBOL(iwcm_reject_msg);
static struct ibnl_client_cbs iwcm_nl_cb_table[] = {
static struct rdma_nl_cbs iwcm_nl_cb_table[] = {
[RDMA_NL_IWPM_REG_PID] = {.dump = iwpm_register_pid_cb},
[RDMA_NL_IWPM_ADD_MAPPING] = {.dump = iwpm_add_mapping_cb},
[RDMA_NL_IWPM_QUERY_MAPPING] = {.dump = iwpm_add_and_query_mapping_cb},
......@@ -1175,13 +1175,9 @@ static int __init iw_cm_init(void)
ret = iwpm_init(RDMA_NL_IWCM);
if (ret)
pr_err("iw_cm: couldn't init iwpm\n");
ret = ibnl_add_client(RDMA_NL_IWCM, ARRAY_SIZE(iwcm_nl_cb_table),
iwcm_nl_cb_table);
if (ret)
pr_err("iw_cm: couldn't register netlink callbacks\n");
iwcm_wq = alloc_ordered_workqueue("iw_cm_wq", WQ_MEM_RECLAIM);
else
rdma_nl_register(RDMA_NL_IWCM, iwcm_nl_cb_table);
iwcm_wq = alloc_ordered_workqueue("iw_cm_wq", 0);
if (!iwcm_wq)
return -ENOMEM;
......@@ -1200,9 +1196,11 @@ static void __exit iw_cm_cleanup(void)
{
unregister_net_sysctl_table(iwcm_ctl_table_hdr);
destroy_workqueue(iwcm_wq);
ibnl_remove_client(RDMA_NL_IWCM);
rdma_nl_unregister(RDMA_NL_IWCM);
iwpm_exit(RDMA_NL_IWCM);
}
MODULE_ALIAS_RDMA_NETLINK(RDMA_NL_IWCM, 2);
module_init(iw_cm_init);
module_exit(iw_cm_cleanup);
......@@ -42,7 +42,6 @@ int iwpm_valid_pid(void)
{
return iwpm_user_pid > 0;
}
EXPORT_SYMBOL(iwpm_valid_pid);
/*
* iwpm_register_pid - Send a netlink query to user space
......@@ -104,7 +103,7 @@ int iwpm_register_pid(struct iwpm_dev_data *pm_msg, u8 nl_client)
pr_debug("%s: Multicasting a nlmsg (dev = %s ifname = %s iwpm = %s)\n",
__func__, pm_msg->dev_name, pm_msg->if_name, iwpm_ulib_name);
ret = ibnl_multicast(skb, nlh, RDMA_NL_GROUP_IWPM, GFP_KERNEL);
ret = rdma_nl_multicast(skb, RDMA_NL_GROUP_IWPM, GFP_KERNEL);
if (ret) {
skb = NULL; /* skb is freed in the netlink send-op handling */
iwpm_user_pid = IWPM_PID_UNAVAILABLE;
......@@ -122,7 +121,6 @@ int iwpm_register_pid(struct iwpm_dev_data *pm_msg, u8 nl_client)
iwpm_free_nlmsg_request(&nlmsg_request->kref);
return ret;
}
EXPORT_SYMBOL(iwpm_register_pid);
/*
* iwpm_add_mapping - Send a netlink add mapping message
......@@ -174,7 +172,7 @@ int iwpm_add_mapping(struct iwpm_sa_data *pm_msg, u8 nl_client)
goto add_mapping_error;
nlmsg_request->req_buffer = pm_msg;
ret = ibnl_unicast(skb, nlh, iwpm_user_pid);
ret = rdma_nl_unicast_wait(skb, iwpm_user_pid);
if (ret) {
skb = NULL; /* skb is freed in the netlink send-op handling */
iwpm_user_pid = IWPM_PID_UNDEFINED;
......@@ -191,7 +189,6 @@ int iwpm_add_mapping(struct iwpm_sa_data *pm_msg, u8 nl_client)
iwpm_free_nlmsg_request(&nlmsg_request->kref);
return ret;
}
EXPORT_SYMBOL(iwpm_add_mapping);
/*
* iwpm_add_and_query_mapping - Send a netlink add and query
......@@ -251,7 +248,7 @@ int iwpm_add_and_query_mapping(struct iwpm_sa_data *pm_msg, u8 nl_client)
goto query_mapping_error;
nlmsg_request->req_buffer = pm_msg;
ret = ibnl_unicast(skb, nlh, iwpm_user_pid);
ret = rdma_nl_unicast_wait(skb, iwpm_user_pid);
if (ret) {
skb = NULL; /* skb is freed in the netlink send-op handling */
err_str = "Unable to send a nlmsg";
......@@ -267,7 +264,6 @@ int iwpm_add_and_query_mapping(struct iwpm_sa_data *pm_msg, u8 nl_client)
iwpm_free_nlmsg_request(&nlmsg_request->kref);
return ret;
}
EXPORT_SYMBOL(iwpm_add_and_query_mapping);
/*
* iwpm_remove_mapping - Send a netlink remove mapping message
......@@ -312,7 +308,7 @@ int iwpm_remove_mapping(struct sockaddr_storage *local_addr, u8 nl_client)
if (ret)
goto remove_mapping_error;
ret = ibnl_unicast(skb, nlh, iwpm_user_pid);
ret = rdma_nl_unicast_wait(skb, iwpm_user_pid);
if (ret) {
skb = NULL; /* skb is freed in the netlink send-op handling */
iwpm_user_pid = IWPM_PID_UNDEFINED;
......@@ -328,7 +324,6 @@ int iwpm_remove_mapping(struct sockaddr_storage *local_addr, u8 nl_client)
dev_kfree_skb_any(skb);
return ret;
}
EXPORT_SYMBOL(iwpm_remove_mapping);
/* netlink attribute policy for the received response to register pid request */
static const struct nla_policy resp_reg_policy[IWPM_NLA_RREG_PID_MAX] = {
......@@ -397,7 +392,6 @@ int iwpm_register_pid_cb(struct sk_buff *skb, struct netlink_callback *cb)
up(&nlmsg_request->sem);
return 0;
}
EXPORT_SYMBOL(iwpm_register_pid_cb);
/* netlink attribute policy for the received response to add mapping request */
static const struct nla_policy resp_add_policy[IWPM_NLA_RMANAGE_MAPPING_MAX] = {
......@@ -466,7 +460,6 @@ int iwpm_add_mapping_cb(struct sk_buff *skb, struct netlink_callback *cb)
up(&nlmsg_request->sem);
return 0;
}
EXPORT_SYMBOL(iwpm_add_mapping_cb);
/* netlink attribute policy for the response to add and query mapping request
* and response with remote address info */
......@@ -558,7 +551,6 @@ int iwpm_add_and_query_mapping_cb(struct sk_buff *skb,
up(&nlmsg_request->sem);
return 0;
}
EXPORT_SYMBOL(iwpm_add_and_query_mapping_cb);
/*
* iwpm_remote_info_cb - Process a port mapper message, containing
......@@ -627,7 +619,6 @@ int iwpm_remote_info_cb(struct sk_buff *skb, struct netlink_callback *cb)
"remote_info: Mapped remote sockaddr:");
return ret;
}
EXPORT_SYMBOL(iwpm_remote_info_cb);
/* netlink attribute policy for the received request for mapping info */
static const struct nla_policy resp_mapinfo_policy[IWPM_NLA_MAPINFO_REQ_MAX] = {
......@@ -677,7 +668,6 @@ int iwpm_mapping_info_cb(struct sk_buff *skb, struct netlink_callback *cb)
ret = iwpm_send_mapinfo(nl_client, iwpm_user_pid);
return ret;
}
EXPORT_SYMBOL(iwpm_mapping_info_cb);
/* netlink attribute policy for the received mapping info ack */
static const struct nla_policy ack_mapinfo_policy[IWPM_NLA_MAPINFO_NUM_MAX] = {
......@@ -707,7 +697,6 @@ int iwpm_ack_mapping_info_cb(struct sk_buff *skb, struct netlink_callback *cb)
atomic_set(&echo_nlmsg_seq, cb->nlh->nlmsg_seq);
return 0;
}
EXPORT_SYMBOL(iwpm_ack_mapping_info_cb);
/* netlink attribute policy for the received port mapper error message */
static const struct nla_policy map_error_policy[IWPM_NLA_ERR_MAX] = {
......@@ -751,4 +740,3 @@ int iwpm_mapping_error_cb(struct sk_buff *skb, struct netlink_callback *cb)
up(&nlmsg_request->sem);
return 0;
}
EXPORT_SYMBOL(iwpm_mapping_error_cb);
......@@ -54,8 +54,6 @@ static struct iwpm_admin_data iwpm_admin;
int iwpm_init(u8 nl_client)
{
int ret = 0;
if (iwpm_valid_client(nl_client))
return -EINVAL;
mutex_lock(&iwpm_admin_lock);
if (atomic_read(&iwpm_admin.refcount) == 0) {
iwpm_hash_bucket = kzalloc(IWPM_MAPINFO_HASH_SIZE *
......@@ -83,7 +81,6 @@ int iwpm_init(u8 nl_client)
}
return ret;
}
EXPORT_SYMBOL(iwpm_init);
static void free_hash_bucket(void);
static void free_reminfo_bucket(void);
......@@ -109,7 +106,6 @@ int iwpm_exit(u8 nl_client)
iwpm_set_registration(nl_client, IWPM_REG_UNDEF);
return 0;
}
EXPORT_SYMBOL(iwpm_exit);
static struct hlist_head *get_mapinfo_hash_bucket(struct sockaddr_storage *,
struct sockaddr_storage *);
......@@ -148,7 +144,6 @@ int iwpm_create_mapinfo(struct sockaddr_storage *local_sockaddr,
spin_unlock_irqrestore(&iwpm_mapinfo_lock, flags);
return ret;
}
EXPORT_SYMBOL(iwpm_create_mapinfo);
int iwpm_remove_mapinfo(struct sockaddr_storage *local_sockaddr,
struct sockaddr_storage *mapped_local_addr)
......@@ -184,7 +179,6 @@ int iwpm_remove_mapinfo(struct sockaddr_storage *local_sockaddr,
spin_unlock_irqrestore(&iwpm_mapinfo_lock, flags);
return ret;
}
EXPORT_SYMBOL(iwpm_remove_mapinfo);
static void free_hash_bucket(void)
{
......@@ -297,7 +291,6 @@ int iwpm_get_remote_info(struct sockaddr_storage *mapped_loc_addr,
spin_unlock_irqrestore(&iwpm_reminfo_lock, flags);
return ret;
}
EXPORT_SYMBOL(iwpm_get_remote_info);
struct iwpm_nlmsg_request *iwpm_get_nlmsg_request(__u32 nlmsg_seq,
u8 nl_client, gfp_t gfp)
......@@ -383,15 +376,11 @@ int iwpm_get_nlmsg_seq(void)
int iwpm_valid_client(u8 nl_client)
{
if (nl_client >= RDMA_NL_NUM_CLIENTS)
return 0;
return iwpm_admin.client_list[nl_client];
}
void iwpm_set_valid(u8 nl_client, int valid)
{
if (nl_client >= RDMA_NL_NUM_CLIENTS)
return;
iwpm_admin.client_list[nl_client] = valid;
}
......@@ -608,7 +597,7 @@ static int send_mapinfo_num(u32 mapping_num, u8 nl_client, int iwpm_pid)
&mapping_num, IWPM_NLA_MAPINFO_SEND_NUM);
if (ret)
goto mapinfo_num_error;
ret = ibnl_unicast(skb, nlh, iwpm_pid);
ret = rdma_nl_unicast(skb, iwpm_pid);
if (ret) {
skb = NULL;
err_str = "Unable to send a nlmsg";
......@@ -637,7 +626,7 @@ static int send_nlmsg_done(struct sk_buff *skb, u8 nl_client, int iwpm_pid)
return -ENOMEM;
}
nlh->nlmsg_type = NLMSG_DONE;
ret = ibnl_unicast(skb, (struct nlmsghdr *)skb->data, iwpm_pid);
ret = rdma_nl_unicast(skb, iwpm_pid);
if (ret)
pr_warn("%s Unable to send a nlmsg\n", __func__);
return ret;
......
......@@ -64,7 +64,7 @@ struct mad_rmpp_recv {
__be64 tid;
u32 src_qp;
u16 slid;
u32 slid;
u8 mgmt_class;
u8 class_version;
u8 method;
......
This diff is collapsed.
/*
* Copyright (c) 2017 Mellanox Technologies. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
*
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
* 3. Neither the names of the copyright holders nor the names of its
* contributors may be used to endorse or promote products derived from
* this software without specific prior written permission.
*
* Alternatively, this software may be distributed under the terms of the
* GNU General Public License ("GPL") version 2 as published by the Free
* Software Foundation.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
* AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
* LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
* CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
* SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
* INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
* CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
* ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*/
#include <linux/module.h>
#include <net/netlink.h>
#include <rdma/rdma_netlink.h>
#include "core_priv.h"
static const struct nla_policy nldev_policy[RDMA_NLDEV_ATTR_MAX] = {
[RDMA_NLDEV_ATTR_DEV_INDEX] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING,
.len = IB_DEVICE_NAME_MAX - 1},
[RDMA_NLDEV_ATTR_PORT_INDEX] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_FW_VERSION] = { .type = NLA_NUL_STRING,
.len = IB_FW_VERSION_NAME_MAX - 1},
[RDMA_NLDEV_ATTR_NODE_GUID] = { .type = NLA_U64 },
[RDMA_NLDEV_ATTR_SYS_IMAGE_GUID] = { .type = NLA_U64 },
[RDMA_NLDEV_ATTR_SUBNET_PREFIX] = { .type = NLA_U64 },
[RDMA_NLDEV_ATTR_LID] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_SM_LID] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_LMC] = { .type = NLA_U8 },
[RDMA_NLDEV_ATTR_PORT_STATE] = { .type = NLA_U8 },
[RDMA_NLDEV_ATTR_PORT_PHYS_STATE] = { .type = NLA_U8 },
[RDMA_NLDEV_ATTR_DEV_NODE_TYPE] = { .type = NLA_U8 },
};
static int fill_dev_info(struct sk_buff *msg, struct ib_device *device)
{
char fw[IB_FW_VERSION_NAME_MAX];
if (nla_put_u32(msg, RDMA_NLDEV_ATTR_DEV_INDEX, device->index))
return -EMSGSIZE;
if (nla_put_string(msg, RDMA_NLDEV_ATTR_DEV_NAME, device->name))
return -EMSGSIZE;
if (nla_put_u32(msg, RDMA_NLDEV_ATTR_PORT_INDEX, rdma_end_port(device)))
return -EMSGSIZE;
BUILD_BUG_ON(sizeof(device->attrs.device_cap_flags) != sizeof(u64));
if (nla_put_u64_64bit(msg, RDMA_NLDEV_ATTR_CAP_FLAGS,
device->attrs.device_cap_flags, 0))
return -EMSGSIZE;
ib_get_device_fw_str(device, fw);
/* Device without FW has strlen(fw) */
if (strlen(fw) && nla_put_string(msg, RDMA_NLDEV_ATTR_FW_VERSION, fw))
return -EMSGSIZE;
if (nla_put_u64_64bit(msg, RDMA_NLDEV_ATTR_NODE_GUID,
be64_to_cpu(device->node_guid), 0))
return -EMSGSIZE;
if (nla_put_u64_64bit(msg, RDMA_NLDEV_ATTR_SYS_IMAGE_GUID,
be64_to_cpu(device->attrs.sys_image_guid), 0))
return -EMSGSIZE;
if (nla_put_u8(msg, RDMA_NLDEV_ATTR_DEV_NODE_TYPE, device->node_type))
return -EMSGSIZE;
return 0;
}
static int fill_port_info(struct sk_buff *msg,
struct ib_device *device, u32 port)
{
struct ib_port_attr attr;
int ret;
if (nla_put_u32(msg, RDMA_NLDEV_ATTR_DEV_INDEX, device->index))
return -EMSGSIZE;
if (nla_put_string(msg, RDMA_NLDEV_ATTR_DEV_NAME, device->name))
return -EMSGSIZE;
if (nla_put_u32(msg, RDMA_NLDEV_ATTR_PORT_INDEX, port))
return -EMSGSIZE;
ret = ib_query_port(device, port, &attr);
if (ret)
return ret;
BUILD_BUG_ON(sizeof(attr.port_cap_flags) > sizeof(u64));
if (nla_put_u64_64bit(msg, RDMA_NLDEV_ATTR_CAP_FLAGS,
(u64)attr.port_cap_flags, 0))
return -EMSGSIZE;
if (rdma_protocol_ib(device, port) &&
nla_put_u64_64bit(msg, RDMA_NLDEV_ATTR_SUBNET_PREFIX,
attr.subnet_prefix, 0))
return -EMSGSIZE;
if (rdma_protocol_ib(device, port)) {
if (nla_put_u32(msg, RDMA_NLDEV_ATTR_LID, attr.lid))
return -EMSGSIZE;
if (nla_put_u32(msg, RDMA_NLDEV_ATTR_SM_LID, attr.sm_lid))
return -EMSGSIZE;
if (nla_put_u8(msg, RDMA_NLDEV_ATTR_LMC, attr.lmc))
return -EMSGSIZE;
}
if (nla_put_u8(msg, RDMA_NLDEV_ATTR_PORT_STATE, attr.state))
return -EMSGSIZE;
if (nla_put_u8(msg, RDMA_NLDEV_ATTR_PORT_PHYS_STATE, attr.phys_state))
return -EMSGSIZE;
return 0;
}
static int nldev_get_doit(struct sk_buff *skb, struct nlmsghdr *nlh,
struct netlink_ext_ack *extack)
{
struct nlattr *tb[RDMA_NLDEV_ATTR_MAX];
struct ib_device *device;
struct sk_buff *msg;
u32 index;
int err;
err = nlmsg_parse(nlh, 0, tb, RDMA_NLDEV_ATTR_MAX - 1,
nldev_policy, extack);
if (err || !tb[RDMA_NLDEV_ATTR_DEV_INDEX])
return -EINVAL;
index = nla_get_u32(tb[RDMA_NLDEV_ATTR_DEV_INDEX]);
device = __ib_device_get_by_index(index);
if (!device)
return -EINVAL;
msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
if (!msg)
return -ENOMEM;
nlh = nlmsg_put(msg, NETLINK_CB(skb).portid, nlh->nlmsg_seq,
RDMA_NL_GET_TYPE(RDMA_NL_NLDEV, RDMA_NLDEV_CMD_GET),
0, 0);
err = fill_dev_info(msg, device);
if (err) {
nlmsg_free(msg);
return err;
}
nlmsg_end(msg, nlh);
return rdma_nl_unicast(msg, NETLINK_CB(skb).portid);
}
static int _nldev_get_dumpit(struct ib_device *device,
struct sk_buff *skb,
struct netlink_callback *cb,
unsigned int idx)
{
int start = cb->args[0];
struct nlmsghdr *nlh;
if (idx < start)
return 0;
nlh = nlmsg_put(skb, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq,
RDMA_NL_GET_TYPE(RDMA_NL_NLDEV, RDMA_NLDEV_CMD_GET),
0, NLM_F_MULTI);
if (fill_dev_info(skb, device)) {
nlmsg_cancel(skb, nlh);
goto out;
}
nlmsg_end(skb, nlh);
idx++;
out: cb->args[0] = idx;
return skb->len;
}
static int nldev_get_dumpit(struct sk_buff *skb, struct netlink_callback *cb)
{
/*
* There is no need to take lock, because
* we are relying on ib_core's lists_rwsem
*/
return ib_enum_all_devs(_nldev_get_dumpit, skb, cb);
}
static int nldev_port_get_doit(struct sk_buff *skb, struct nlmsghdr *nlh,
struct netlink_ext_ack *extack)
{
struct nlattr *tb[RDMA_NLDEV_ATTR_MAX];
struct ib_device *device;
struct sk_buff *msg;
u32 index;
u32 port;
int err;
err = nlmsg_parse(nlh, 0, tb, RDMA_NLDEV_ATTR_MAX - 1,
nldev_policy, extack);
if (err || !tb[RDMA_NLDEV_ATTR_PORT_INDEX])
return -EINVAL;
index = nla_get_u32(tb[RDMA_NLDEV_ATTR_DEV_INDEX]);
device = __ib_device_get_by_index(index);
if (!device)
return -EINVAL;
port = nla_get_u32(tb[RDMA_NLDEV_ATTR_PORT_INDEX]);
if (!rdma_is_port_valid(device, port))
return -EINVAL;
msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
if (!msg)
return -ENOMEM;
nlh = nlmsg_put(msg, NETLINK_CB(skb).portid, nlh->nlmsg_seq,
RDMA_NL_GET_TYPE(RDMA_NL_NLDEV, RDMA_NLDEV_CMD_GET),
0, 0);
err = fill_port_info(msg, device, port);
if (err) {
nlmsg_free(msg);
return err;
}
nlmsg_end(msg, nlh);
return rdma_nl_unicast(msg, NETLINK_CB(skb).portid);
}
static int nldev_port_get_dumpit(struct sk_buff *skb,
struct netlink_callback *cb)
{
struct nlattr *tb[RDMA_NLDEV_ATTR_MAX];
struct ib_device *device;
int start = cb->args[0];
struct nlmsghdr *nlh;
u32 idx = 0;
u32 ifindex;
int err;
u32 p;
err = nlmsg_parse(cb->nlh, 0, tb, RDMA_NLDEV_ATTR_MAX - 1,
nldev_policy, NULL);
if (err || !tb[RDMA_NLDEV_ATTR_DEV_INDEX])
return -EINVAL;
ifindex = nla_get_u32(tb[RDMA_NLDEV_ATTR_DEV_INDEX]);
device = __ib_device_get_by_index(ifindex);
if (!device)
return -EINVAL;
for (p = rdma_start_port(device); p <= rdma_end_port(device); ++p) {
/*
* The dumpit function returns all information from specific
* index. This specific index is taken from the netlink
* messages request sent by user and it is available
* in cb->args[0].
*
* Usually, the user doesn't fill this field and it causes
* to return everything.
*
*/
if (idx < start) {
idx++;
continue;
}
nlh = nlmsg_put(skb, NETLINK_CB(cb->skb).portid,
cb->nlh->nlmsg_seq,
RDMA_NL_GET_TYPE(RDMA_NL_NLDEV,
RDMA_NLDEV_CMD_PORT_GET),
0, NLM_F_MULTI);
if (fill_port_info(skb, device, p)) {
nlmsg_cancel(skb, nlh);
goto out;
}
idx++;
nlmsg_end(skb, nlh);
}
out: cb->args[0] = idx;
return skb->len;
}
static const struct rdma_nl_cbs nldev_cb_table[] = {
[RDMA_NLDEV_CMD_GET] = {
.doit = nldev_get_doit,
.dump = nldev_get_dumpit,
},
[RDMA_NLDEV_CMD_PORT_GET] = {
.doit = nldev_port_get_doit,
.dump = nldev_port_get_dumpit,
},
};
void __init nldev_init(void)
{
rdma_nl_register(RDMA_NL_NLDEV, nldev_cb_table);
}
void __exit nldev_exit(void)
{
rdma_nl_unregister(RDMA_NL_NLDEV);
}
MODULE_ALIAS_RDMA_NETLINK(RDMA_NL_NLDEV, 5);
......@@ -35,10 +35,57 @@
#include <rdma/ib_verbs.h>
#include <rdma/uverbs_types.h>
#include <linux/rcupdate.h>
#include <rdma/uverbs_ioctl.h>
#include <rdma/rdma_user_ioctl.h>
#include "uverbs.h"
#include "core_priv.h"
#include "rdma_core.h"
int uverbs_ns_idx(u16 *id, unsigned int ns_count)
{
int ret = (*id & UVERBS_ID_NS_MASK) >> UVERBS_ID_NS_SHIFT;
if (ret >= ns_count)
return -EINVAL;
*id &= ~UVERBS_ID_NS_MASK;
return ret;
}
const struct uverbs_object_spec *uverbs_get_object(const struct ib_device *ibdev,
uint16_t object)
{
const struct uverbs_root_spec *object_hash = ibdev->specs_root;
const struct uverbs_object_spec_hash *objects;
int ret = uverbs_ns_idx(&object, object_hash->num_buckets);
if (ret < 0)
return NULL;
objects = object_hash->object_buckets[ret];
if (object >= objects->num_objects)
return NULL;
return objects->objects[object];
}
const struct uverbs_method_spec *uverbs_get_method(const struct uverbs_object_spec *object,
uint16_t method)
{
const struct uverbs_method_spec_hash *methods;
int ret = uverbs_ns_idx(&method, object->num_buckets);
if (ret < 0)
return NULL;
methods = object->method_buckets[ret];
if (method >= methods->num_methods)
return NULL;
return methods->methods[method];
}
void uverbs_uobject_get(struct ib_uobject *uobject)
{
kref_get(&uobject->ref);
......@@ -404,6 +451,41 @@ int __must_check rdma_remove_commit_uobject(struct ib_uobject *uobj)
return ret;
}
static int null_obj_type_class_remove_commit(struct ib_uobject *uobj,
enum rdma_remove_reason why)
{
return 0;
}
static const struct uverbs_obj_type null_obj_type = {
.type_class = &((const struct uverbs_obj_type_class){
.remove_commit = null_obj_type_class_remove_commit,
/* be cautious */
.needs_kfree_rcu = true}),
};
int rdma_explicit_destroy(struct ib_uobject *uobject)
{
int ret;
struct ib_ucontext *ucontext = uobject->context;
/* Cleanup is running. Calling this should have been impossible */
if (!down_read_trylock(&ucontext->cleanup_rwsem)) {
WARN(true, "ib_uverbs: Cleanup is running while removing an uobject\n");
return 0;
}
lockdep_check(uobject, true);
ret = uobject->type->type_class->remove_commit(uobject,
RDMA_REMOVE_DESTROY);
if (ret)
return ret;
uobject->type = &null_obj_type;
up_read(&ucontext->cleanup_rwsem);
return 0;
}
static void alloc_commit_idr_uobject(struct ib_uobject *uobj)
{
uverbs_uobject_add(uobj);
......@@ -625,3 +707,100 @@ const struct uverbs_obj_type_class uverbs_fd_class = {
.needs_kfree_rcu = false,
};
struct ib_uobject *uverbs_get_uobject_from_context(const struct uverbs_obj_type *type_attrs,
struct ib_ucontext *ucontext,
enum uverbs_obj_access access,
int id)
{
switch (access) {
case UVERBS_ACCESS_READ:
return rdma_lookup_get_uobject(type_attrs, ucontext, id, false);
case UVERBS_ACCESS_DESTROY:
case UVERBS_ACCESS_WRITE:
return rdma_lookup_get_uobject(type_attrs, ucontext, id, true);
case UVERBS_ACCESS_NEW:
return rdma_alloc_begin_uobject(type_attrs, ucontext);
default:
WARN_ON(true);
return ERR_PTR(-EOPNOTSUPP);
}
}
int uverbs_finalize_object(struct ib_uobject *uobj,
enum uverbs_obj_access access,
bool commit)
{
int ret = 0;
/*
* refcounts should be handled at the object level and not at the
* uobject level. Refcounts of the objects themselves are done in
* handlers.
*/
switch (access) {
case UVERBS_ACCESS_READ:
rdma_lookup_put_uobject(uobj, false);
break;
case UVERBS_ACCESS_WRITE:
rdma_lookup_put_uobject(uobj, true);
break;
case UVERBS_ACCESS_DESTROY:
if (commit)
ret = rdma_remove_commit_uobject(uobj);
else
rdma_lookup_put_uobject(uobj, true);
break;
case UVERBS_ACCESS_NEW:
if (commit)
ret = rdma_alloc_commit_uobject(uobj);
else
rdma_alloc_abort_uobject(uobj);
break;
default:
WARN_ON(true);
ret = -EOPNOTSUPP;
}
return ret;
}
int uverbs_finalize_objects(struct uverbs_attr_bundle *attrs_bundle,
struct uverbs_attr_spec_hash * const *spec_hash,
size_t num,
bool commit)
{
unsigned int i;
int ret = 0;
for (i = 0; i < num; i++) {
struct uverbs_attr_bundle_hash *curr_bundle =
&attrs_bundle->hash[i];
const struct uverbs_attr_spec_hash *curr_spec_bucket =
spec_hash[i];
unsigned int j;
for (j = 0; j < curr_bundle->num_attrs; j++) {
struct uverbs_attr *attr;
const struct uverbs_attr_spec *spec;
if (!uverbs_attr_is_valid_in_hash(curr_bundle, j))
continue;
attr = &curr_bundle->attrs[j];
spec = &curr_spec_bucket->attrs[j];
if (spec->type == UVERBS_ATTR_TYPE_IDR ||
spec->type == UVERBS_ATTR_TYPE_FD) {
int current_ret;
current_ret = uverbs_finalize_object(attr->obj_attr.uobject,
spec->obj.access,
commit);
if (!ret)
ret = current_ret;
}
}
}
return ret;
}
......@@ -39,9 +39,15 @@
#include <linux/idr.h>
#include <rdma/uverbs_types.h>
#include <rdma/uverbs_ioctl.h>
#include <rdma/ib_verbs.h>
#include <linux/mutex.h>
int uverbs_ns_idx(u16 *id, unsigned int ns_count);
const struct uverbs_object_spec *uverbs_get_object(const struct ib_device *ibdev,
uint16_t object);
const struct uverbs_method_spec *uverbs_get_method(const struct uverbs_object_spec *object,
uint16_t method);
/*
* These functions initialize the context and cleanups its uobjects.
* The context has a list of objects which is protected by a mutex
......@@ -75,4 +81,40 @@ void uverbs_uobject_put(struct ib_uobject *uobject);
*/
void uverbs_close_fd(struct file *f);
/*
* Get an ib_uobject that corresponds to the given id from ucontext, assuming
* the object is from the given type. Lock it to the required access when
* applicable.
* This function could create (access == NEW), destroy (access == DESTROY)
* or unlock (access == READ || access == WRITE) objects if required.
* The action will be finalized only when uverbs_finalize_object or
* uverbs_finalize_objects are called.
*/
struct ib_uobject *uverbs_get_uobject_from_context(const struct uverbs_obj_type *type_attrs,
struct ib_ucontext *ucontext,
enum uverbs_obj_access access,
int id);
int uverbs_finalize_object(struct ib_uobject *uobj,
enum uverbs_obj_access access,
bool commit);
/*
* Note that certain finalize stages could return a status:
* (a) alloc_commit could return a failure if the object is committed at the
* same time when the context is destroyed.
* (b) remove_commit could fail if the object wasn't destroyed successfully.
* Since multiple objects could be finalized in one transaction, it is very NOT
* recommended to have several finalize actions which have side effects.
* For example, it's NOT recommended to have a certain action which has both
* a commit action and a destroy action or two destroy objects in the same
* action. The rule of thumb is to have one destroy or commit action with
* multiple lookups.
* The first non zero return value of finalize_object is returned from this
* function. For example, this could happen when we couldn't destroy an
* object.
*/
int uverbs_finalize_objects(struct uverbs_attr_bundle *attrs_bundle,
struct uverbs_attr_spec_hash * const *spec_hash,
size_t num,
bool commit);
#endif /* RDMA_CORE_H */
......@@ -44,6 +44,8 @@
static struct workqueue_struct *gid_cache_wq;
static struct workqueue_struct *gid_cache_wq;
enum gid_op_type {
GID_DEL = 0,
GID_ADD
......
......@@ -50,6 +50,7 @@
#include <uapi/rdma/ib_user_sa.h>
#include <rdma/ib_marshall.h>
#include <rdma/ib_addr.h>
#include <rdma/opa_addr.h>
#include "sa.h"
#include "core_priv.h"
......@@ -861,7 +862,7 @@ static int ib_nl_send_msg(struct ib_sa_query *query, gfp_t gfp_mask)
/* Repair the nlmsg header length */
nlmsg_end(skb, nlh);
ret = ibnl_multicast(skb, nlh, RDMA_NL_GROUP_LS, gfp_mask);
ret = rdma_nl_multicast(skb, RDMA_NL_GROUP_LS, gfp_mask);
if (!ret)
ret = len;
else
......@@ -1021,9 +1022,9 @@ static void ib_nl_request_timeout(struct work_struct *work)
}
int ib_nl_handle_set_timeout(struct sk_buff *skb,
struct netlink_callback *cb)
struct nlmsghdr *nlh,
struct netlink_ext_ack *extack)
{
const struct nlmsghdr *nlh = (struct nlmsghdr *)cb->nlh;
int timeout, delta, abs_delta;
const struct nlattr *attr;
unsigned long flags;
......@@ -1033,8 +1034,7 @@ int ib_nl_handle_set_timeout(struct sk_buff *skb,
int ret;
if (!(nlh->nlmsg_flags & NLM_F_REQUEST) ||
!(NETLINK_CB(skb).sk) ||
!netlink_capable(skb, CAP_NET_ADMIN))
!(NETLINK_CB(skb).sk))
return -EPERM;
ret = nla_parse(tb, LS_NLA_TYPE_MAX - 1, nlmsg_data(nlh),
......@@ -1098,9 +1098,9 @@ static inline int ib_nl_is_good_resolve_resp(const struct nlmsghdr *nlh)
}
int ib_nl_handle_resolve_resp(struct sk_buff *skb,
struct netlink_callback *cb)
struct nlmsghdr *nlh,
struct netlink_ext_ack *extack)
{
const struct nlmsghdr *nlh = (struct nlmsghdr *)cb->nlh;
unsigned long flags;
struct ib_sa_query *query;
struct ib_mad_send_buf *send_buf;
......@@ -1109,8 +1109,7 @@ int ib_nl_handle_resolve_resp(struct sk_buff *skb,
int ret;
if ((nlh->nlmsg_flags & NLM_F_REQUEST) ||
!(NETLINK_CB(skb).sk) ||
!netlink_capable(skb, CAP_NET_ADMIN))
!(NETLINK_CB(skb).sk))
return -EPERM;
spin_lock_irqsave(&ib_nl_request_lock, flags);
......@@ -1241,6 +1240,11 @@ int ib_init_ah_from_path(struct ib_device *device, u8 port_num,
ah_attr->type = rdma_ah_find_type(device, port_num);
rdma_ah_set_dlid(ah_attr, be32_to_cpu(sa_path_get_dlid(rec)));
if ((ah_attr->type == RDMA_AH_ATTR_TYPE_OPA) &&
(rdma_ah_get_dlid(ah_attr) == be16_to_cpu(IB_LID_PERMISSIVE)))
rdma_ah_set_make_grd(ah_attr, true);
rdma_ah_set_sl(ah_attr, rec->sl);
rdma_ah_set_path_bits(ah_attr, be32_to_cpu(sa_path_get_slid(rec)) &
get_src_path_mask(device, port_num));
......@@ -1420,7 +1424,7 @@ static int send_mad(struct ib_sa_query *query, int timeout_ms, gfp_t gfp_mask)
if ((query->flags & IB_SA_ENABLE_LOCAL_SERVICE) &&
(!(query->flags & IB_SA_QUERY_OPA))) {
if (!ibnl_chk_listeners(RDMA_NL_GROUP_LS)) {
if (!rdma_nl_chk_listeners(RDMA_NL_GROUP_LS)) {
if (!ib_nl_make_request(query, gfp_mask))
return id;
}
......@@ -2290,12 +2294,15 @@ static void update_sm_ah(struct work_struct *work)
rdma_ah_set_sl(&ah_attr, port_attr.sm_sl);
rdma_ah_set_port_num(&ah_attr, port->port_num);
if (port_attr.grh_required) {
rdma_ah_set_ah_flags(&ah_attr, IB_AH_GRH);
rdma_ah_set_subnet_prefix(&ah_attr,
cpu_to_be64(port_attr.subnet_prefix));
rdma_ah_set_interface_id(&ah_attr,
cpu_to_be64(IB_SA_WELL_KNOWN_GUID));
if (ah_attr.type == RDMA_AH_ATTR_TYPE_OPA) {
rdma_ah_set_make_grd(&ah_attr, true);
} else {
rdma_ah_set_ah_flags(&ah_attr, IB_AH_GRH);
rdma_ah_set_subnet_prefix(&ah_attr,
cpu_to_be64(port_attr.subnet_prefix));
rdma_ah_set_interface_id(&ah_attr,
cpu_to_be64(IB_SA_WELL_KNOWN_GUID));
}
}
new_ah->ah = rdma_create_ah(port->agent->qp->pd, &ah_attr);
......@@ -2410,8 +2417,7 @@ static void ib_sa_add_one(struct ib_device *device)
*/
INIT_IB_EVENT_HANDLER(&sa_dev->event_handler, device, ib_sa_event);
if (ib_register_event_handler(&sa_dev->event_handler))
goto err;
ib_register_event_handler(&sa_dev->event_handler);
for (i = 0; i <= e - s; ++i) {
if (rdma_cap_ib_sa(device, i + 1))
......
......@@ -1210,8 +1210,8 @@ static ssize_t show_fw_ver(struct device *device, struct device_attribute *attr,
{
struct ib_device *dev = container_of(device, struct ib_device, dev);
ib_get_device_fw_str(dev, buf, PAGE_SIZE);
strlcat(buf, "\n", PAGE_SIZE);
ib_get_device_fw_str(dev, buf);
strlcat(buf, "\n", IB_FW_VERSION_NAME_MAX);
return strlen(buf);
}
......
......@@ -618,7 +618,7 @@ static ssize_t ib_ucm_init_qp_attr(struct ib_ucm_file *file,
if (result)
goto out;
ib_copy_qp_attr_to_user(&resp, &qp_attr);
ib_copy_qp_attr_to_user(ctx->cm_id->device, &resp, &qp_attr);
if (copy_to_user((void __user *)(unsigned long)cmd.response,
&resp, sizeof(resp)))
......
......@@ -248,14 +248,15 @@ static void ucma_copy_conn_event(struct rdma_ucm_conn_param *dst,
dst->qp_num = src->qp_num;
}
static void ucma_copy_ud_event(struct rdma_ucm_ud_param *dst,
static void ucma_copy_ud_event(struct ib_device *device,
struct rdma_ucm_ud_param *dst,
struct rdma_ud_param *src)
{
if (src->private_data_len)
memcpy(dst->private_data, src->private_data,
src->private_data_len);
dst->private_data_len = src->private_data_len;
ib_copy_ah_attr_to_user(&dst->ah_attr, &src->ah_attr);
ib_copy_ah_attr_to_user(device, &dst->ah_attr, &src->ah_attr);
dst->qp_num = src->qp_num;
dst->qkey = src->qkey;
}
......@@ -335,7 +336,8 @@ static int ucma_event_handler(struct rdma_cm_id *cm_id,
uevent->resp.event = event->event;
uevent->resp.status = event->status;
if (cm_id->qp_type == IB_QPT_UD)
ucma_copy_ud_event(&uevent->resp.param.ud, &event->param.ud);
ucma_copy_ud_event(cm_id->device, &uevent->resp.param.ud,
&event->param.ud);
else
ucma_copy_conn_event(&uevent->resp.param.conn,
&event->param.conn);
......@@ -1157,7 +1159,7 @@ static ssize_t ucma_init_qp_attr(struct ucma_file *file,
if (ret)
goto out;
ib_copy_qp_attr_to_user(&resp, &qp_attr);
ib_copy_qp_attr_to_user(ctx->cm_id->device, &resp, &qp_attr);
if (copy_to_user((void __user *)(unsigned long)cmd.response,
&resp, sizeof(resp)))
ret = -EFAULT;
......
......@@ -229,7 +229,7 @@ static void recv_handler(struct ib_mad_agent *agent,
packet->mad.hdr.status = 0;
packet->mad.hdr.length = hdr_size(file) + mad_recv_wc->mad_len;
packet->mad.hdr.qpn = cpu_to_be32(mad_recv_wc->wc->src_qp);
packet->mad.hdr.lid = cpu_to_be16(mad_recv_wc->wc->slid);
packet->mad.hdr.lid = ib_lid_be16(mad_recv_wc->wc->slid);
packet->mad.hdr.sl = mad_recv_wc->wc->sl;
packet->mad.hdr.path_bits = mad_recv_wc->wc->dlid_path_bits;
packet->mad.hdr.pkey_index = mad_recv_wc->wc->pkey_index;
......
......@@ -100,6 +100,7 @@ struct ib_uverbs_device {
struct mutex lists_mutex; /* protect lists */
struct list_head uverbs_file_list;
struct list_head uverbs_events_file_list;
struct uverbs_root_spec *specs_root;
};
struct ib_uverbs_event_queue {
......@@ -218,6 +219,8 @@ int uverbs_dealloc_mw(struct ib_mw *mw);
void ib_uverbs_detach_umcast(struct ib_qp *qp,
struct ib_uqp_object *uobj);
long ib_uverbs_ioctl(struct file *filp, unsigned int cmd, unsigned long arg);
struct ib_uverbs_flow_spec {
union {
union {
......
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
......@@ -49,6 +49,7 @@
#include <linux/uaccess.h>
#include <rdma/ib.h>
#include <rdma/uverbs_std_types.h>
#include "uverbs.h"
#include "core_priv.h"
......@@ -595,7 +596,6 @@ struct file *ib_uverbs_alloc_async_event_file(struct ib_uverbs_file *uverbs_file
{
struct ib_uverbs_async_event_file *ev_file;
struct file *filp;
int ret;
ev_file = kzalloc(sizeof(*ev_file), GFP_KERNEL);
if (!ev_file)
......@@ -621,21 +621,11 @@ struct file *ib_uverbs_alloc_async_event_file(struct ib_uverbs_file *uverbs_file
INIT_IB_EVENT_HANDLER(&uverbs_file->event_handler,
ib_dev,
ib_uverbs_event_handler);
ret = ib_register_event_handler(&uverbs_file->event_handler);
if (ret)
goto err_put_file;
ib_register_event_handler(&uverbs_file->event_handler);
/* At that point async file stuff was fully set */
return filp;
err_put_file:
fput(filp);
kref_put(&uverbs_file->async_file->ref,
ib_uverbs_release_async_event_file);
uverbs_file->async_file = NULL;
return ERR_PTR(ret);
err_put_refs:
kref_put(&ev_file->uverbs_file->ref, ib_uverbs_release_file);
kref_put(&ev_file->ref, ib_uverbs_release_async_event_file);
......@@ -949,6 +939,9 @@ static const struct file_operations uverbs_fops = {
.open = ib_uverbs_open,
.release = ib_uverbs_close,
.llseek = no_llseek,
#if IS_ENABLED(CONFIG_INFINIBAND_EXP_USER_ACCESS)
.unlocked_ioctl = ib_uverbs_ioctl,
#endif
};
static const struct file_operations uverbs_mmap_fops = {
......@@ -958,6 +951,9 @@ static const struct file_operations uverbs_mmap_fops = {
.open = ib_uverbs_open,
.release = ib_uverbs_close,
.llseek = no_llseek,
#if IS_ENABLED(CONFIG_INFINIBAND_EXP_USER_ACCESS)
.unlocked_ioctl = ib_uverbs_ioctl,
#endif
};
static struct ib_client uverbs_client = {
......@@ -1108,6 +1104,18 @@ static void ib_uverbs_add_one(struct ib_device *device)
if (device_create_file(uverbs_dev->dev, &dev_attr_abi_version))
goto err_class;
if (!device->specs_root) {
const struct uverbs_object_tree_def *default_root[] = {
uverbs_default_get_objects()};
uverbs_dev->specs_root = uverbs_alloc_spec_tree(1,
default_root);
if (IS_ERR(uverbs_dev->specs_root))
goto err_class;
device->specs_root = uverbs_dev->specs_root;
}
ib_set_client_data(device, &uverbs_client, uverbs_dev);
return;
......@@ -1239,6 +1247,11 @@ static void ib_uverbs_remove_one(struct ib_device *device, void *client_data)
ib_uverbs_comp_dev(uverbs_dev);
if (wait_clients)
wait_for_completion(&uverbs_dev->comp);
if (uverbs_dev->specs_root) {
uverbs_free_spec_tree(uverbs_dev->specs_root);
device->specs_root = NULL;
}
kobject_put(&uverbs_dev->kobj);
}
......
......@@ -33,10 +33,47 @@
#include <linux/export.h>
#include <rdma/ib_marshall.h>
void ib_copy_ah_attr_to_user(struct ib_uverbs_ah_attr *dst,
struct rdma_ah_attr *src)
#define OPA_DEFAULT_GID_PREFIX cpu_to_be64(0xfe80000000000000ULL)
static int rdma_ah_conv_opa_to_ib(struct ib_device *dev,
struct rdma_ah_attr *ib,
struct rdma_ah_attr *opa)
{
struct ib_port_attr port_attr;
int ret = 0;
/* Do structure copy and the over-write fields */
*ib = *opa;
ib->type = RDMA_AH_ATTR_TYPE_IB;
rdma_ah_set_grh(ib, NULL, 0, 0, 1, 0);
if (ib_query_port(dev, opa->port_num, &port_attr)) {
/* Set to default subnet to indicate error */
rdma_ah_set_subnet_prefix(ib, OPA_DEFAULT_GID_PREFIX);
ret = -EINVAL;
} else {
rdma_ah_set_subnet_prefix(ib,
cpu_to_be64(port_attr.subnet_prefix));
}
rdma_ah_set_interface_id(ib, OPA_MAKE_ID(rdma_ah_get_dlid(opa)));
return ret;
}
void ib_copy_ah_attr_to_user(struct ib_device *device,
struct ib_uverbs_ah_attr *dst,
struct rdma_ah_attr *ah_attr)
{
struct rdma_ah_attr *src = ah_attr;
struct rdma_ah_attr conv_ah;
memset(&dst->grh.reserved, 0, sizeof(dst->grh.reserved));
if ((ah_attr->type == RDMA_AH_ATTR_TYPE_OPA) &&
(rdma_ah_get_dlid(ah_attr) >=
be16_to_cpu(IB_MULTICAST_LID_BASE)) &&
(!rdma_ah_conv_opa_to_ib(device, &conv_ah, ah_attr)))
src = &conv_ah;
dst->dlid = rdma_ah_get_dlid(src);
dst->sl = rdma_ah_get_sl(src);
dst->src_path_bits = rdma_ah_get_path_bits(src);
......@@ -57,7 +94,8 @@ void ib_copy_ah_attr_to_user(struct ib_uverbs_ah_attr *dst,
}
EXPORT_SYMBOL(ib_copy_ah_attr_to_user);
void ib_copy_qp_attr_to_user(struct ib_uverbs_qp_attr *dst,
void ib_copy_qp_attr_to_user(struct ib_device *device,
struct ib_uverbs_qp_attr *dst,
struct ib_qp_attr *src)
{
dst->qp_state = src->qp_state;
......@@ -76,8 +114,8 @@ void ib_copy_qp_attr_to_user(struct ib_uverbs_qp_attr *dst,
dst->max_recv_sge = src->cap.max_recv_sge;
dst->max_inline_data = src->cap.max_inline_data;
ib_copy_ah_attr_to_user(&dst->ah_attr, &src->ah_attr);
ib_copy_ah_attr_to_user(&dst->alt_ah_attr, &src->alt_ah_attr);
ib_copy_ah_attr_to_user(device, &dst->ah_attr, &src->ah_attr);
ib_copy_ah_attr_to_user(device, &dst->alt_ah_attr, &src->alt_ah_attr);
dst->pkey_index = src->pkey_index;
dst->alt_pkey_index = src->alt_pkey_index;
......
This diff is collapsed.
......@@ -180,39 +180,29 @@ EXPORT_SYMBOL(ib_rate_to_mbps);
__attribute_const__ enum rdma_transport_type
rdma_node_get_transport(enum rdma_node_type node_type)
{
switch (node_type) {
case RDMA_NODE_IB_CA:
case RDMA_NODE_IB_SWITCH:
case RDMA_NODE_IB_ROUTER:
return RDMA_TRANSPORT_IB;
case RDMA_NODE_RNIC:
return RDMA_TRANSPORT_IWARP;
case RDMA_NODE_USNIC:
if (node_type == RDMA_NODE_USNIC)
return RDMA_TRANSPORT_USNIC;
case RDMA_NODE_USNIC_UDP:
if (node_type == RDMA_NODE_USNIC_UDP)
return RDMA_TRANSPORT_USNIC_UDP;
default:
BUG();
return 0;
}
if (node_type == RDMA_NODE_RNIC)
return RDMA_TRANSPORT_IWARP;
return RDMA_TRANSPORT_IB;
}
EXPORT_SYMBOL(rdma_node_get_transport);
enum rdma_link_layer rdma_port_get_link_layer(struct ib_device *device, u8 port_num)
{
enum rdma_transport_type lt;
if (device->get_link_layer)
return device->get_link_layer(device, port_num);
switch (rdma_node_get_transport(device->node_type)) {
case RDMA_TRANSPORT_IB:
lt = rdma_node_get_transport(device->node_type);
if (lt == RDMA_TRANSPORT_IB)
return IB_LINK_LAYER_INFINIBAND;
case RDMA_TRANSPORT_IWARP:
case RDMA_TRANSPORT_USNIC:
case RDMA_TRANSPORT_USNIC_UDP:
return IB_LINK_LAYER_ETHERNET;
default:
return IB_LINK_LAYER_UNSPECIFIED;
}
return IB_LINK_LAYER_ETHERNET;
}
EXPORT_SYMBOL(rdma_port_get_link_layer);
......@@ -478,6 +468,8 @@ int ib_init_ah_from_wc(struct ib_device *device, u8 port_num,
union ib_gid dgid;
union ib_gid sgid;
might_sleep();
memset(ah_attr, 0, sizeof *ah_attr);
ah_attr->type = rdma_ah_find_type(device, port_num);
if (rdma_cap_eth_ah(device, port_num)) {
......@@ -632,11 +624,13 @@ struct ib_srq *ib_create_srq(struct ib_pd *pd,
srq->event_handler = srq_init_attr->event_handler;
srq->srq_context = srq_init_attr->srq_context;
srq->srq_type = srq_init_attr->srq_type;
if (ib_srq_has_cq(srq->srq_type)) {
srq->ext.cq = srq_init_attr->ext.cq;
atomic_inc(&srq->ext.cq->usecnt);
}
if (srq->srq_type == IB_SRQT_XRC) {
srq->ext.xrc.xrcd = srq_init_attr->ext.xrc.xrcd;
srq->ext.xrc.cq = srq_init_attr->ext.xrc.cq;
atomic_inc(&srq->ext.xrc.xrcd->usecnt);
atomic_inc(&srq->ext.xrc.cq->usecnt);
}
atomic_inc(&pd->usecnt);
atomic_set(&srq->usecnt, 0);
......@@ -677,18 +671,18 @@ int ib_destroy_srq(struct ib_srq *srq)
pd = srq->pd;
srq_type = srq->srq_type;
if (srq_type == IB_SRQT_XRC) {
if (ib_srq_has_cq(srq_type))
cq = srq->ext.cq;
if (srq_type == IB_SRQT_XRC)
xrcd = srq->ext.xrc.xrcd;
cq = srq->ext.xrc.cq;
}
ret = srq->device->destroy_srq(srq);
if (!ret) {
atomic_dec(&pd->usecnt);
if (srq_type == IB_SRQT_XRC) {
if (srq_type == IB_SRQT_XRC)
atomic_dec(&xrcd->usecnt);
if (ib_srq_has_cq(srq_type))
atomic_dec(&cq->usecnt);
}
}
return ret;
......@@ -1244,6 +1238,18 @@ int ib_resolve_eth_dmac(struct ib_device *device,
if (rdma_link_local_addr((struct in6_addr *)grh->dgid.raw)) {
rdma_get_ll_mac((struct in6_addr *)grh->dgid.raw,
ah_attr->roce.dmac);
return 0;
}
if (rdma_is_multicast_addr((struct in6_addr *)ah_attr->grh.dgid.raw)) {
if (ipv6_addr_v4mapped((struct in6_addr *)ah_attr->grh.dgid.raw)) {
__be32 addr = 0;
memcpy(&addr, ah_attr->grh.dgid.raw + 12, 4);
ip_eth_mc_map(addr, (char *)ah_attr->roce.dmac);
} else {
ipv6_eth_mc_map((struct in6_addr *)ah_attr->grh.dgid.raw,
(char *)ah_attr->roce.dmac);
}
} else {
union ib_gid sgid;
struct ib_gid_attr sgid_attr;
......@@ -1306,6 +1312,61 @@ int ib_modify_qp_with_udata(struct ib_qp *qp, struct ib_qp_attr *attr,
}
EXPORT_SYMBOL(ib_modify_qp_with_udata);
int ib_get_eth_speed(struct ib_device *dev, u8 port_num, u8 *speed, u8 *width)
{
int rc;
u32 netdev_speed;
struct net_device *netdev;
struct ethtool_link_ksettings lksettings;
if (rdma_port_get_link_layer(dev, port_num) != IB_LINK_LAYER_ETHERNET)
return -EINVAL;
if (!dev->get_netdev)
return -EOPNOTSUPP;
netdev = dev->get_netdev(dev, port_num);
if (!netdev)
return -ENODEV;
rtnl_lock();
rc = __ethtool_get_link_ksettings(netdev, &lksettings);
rtnl_unlock();
dev_put(netdev);
if (!rc) {
netdev_speed = lksettings.base.speed;
} else {
netdev_speed = SPEED_1000;
pr_warn("%s speed is unknown, defaulting to %d\n", netdev->name,
netdev_speed);
}
if (netdev_speed <= SPEED_1000) {
*width = IB_WIDTH_1X;
*speed = IB_SPEED_SDR;
} else if (netdev_speed <= SPEED_10000) {
*width = IB_WIDTH_1X;
*speed = IB_SPEED_FDR10;
} else if (netdev_speed <= SPEED_20000) {
*width = IB_WIDTH_4X;
*speed = IB_SPEED_DDR;
} else if (netdev_speed <= SPEED_25000) {
*width = IB_WIDTH_1X;
*speed = IB_SPEED_EDR;
} else if (netdev_speed <= SPEED_40000) {
*width = IB_WIDTH_4X;
*speed = IB_SPEED_FDR10;
} else {
*width = IB_WIDTH_4X;
*speed = IB_SPEED_EDR;
}
return 0;
}
EXPORT_SYMBOL(ib_get_eth_speed);
int ib_modify_qp(struct ib_qp *qp,
struct ib_qp_attr *qp_attr,
int qp_attr_mask)
......@@ -1573,15 +1634,53 @@ EXPORT_SYMBOL(ib_dealloc_fmr);
/* Multicast groups */
static bool is_valid_mcast_lid(struct ib_qp *qp, u16 lid)
{
struct ib_qp_init_attr init_attr = {};
struct ib_qp_attr attr = {};
int num_eth_ports = 0;
int port;
/* If QP state >= init, it is assigned to a port and we can check this
* port only.
*/
if (!ib_query_qp(qp, &attr, IB_QP_STATE | IB_QP_PORT, &init_attr)) {
if (attr.qp_state >= IB_QPS_INIT) {
if (qp->device->get_link_layer(qp->device, attr.port_num) !=
IB_LINK_LAYER_INFINIBAND)
return true;
goto lid_check;
}
}
/* Can't get a quick answer, iterate over all ports */
for (port = 0; port < qp->device->phys_port_cnt; port++)
if (qp->device->get_link_layer(qp->device, port) !=
IB_LINK_LAYER_INFINIBAND)
num_eth_ports++;
/* If we have at lease one Ethernet port, RoCE annex declares that
* multicast LID should be ignored. We can't tell at this step if the
* QP belongs to an IB or Ethernet port.
*/
if (num_eth_ports)
return true;
/* If all the ports are IB, we can check according to IB spec. */
lid_check:
return !(lid < be16_to_cpu(IB_MULTICAST_LID_BASE) ||
lid == be16_to_cpu(IB_LID_PERMISSIVE));
}
int ib_attach_mcast(struct ib_qp *qp, union ib_gid *gid, u16 lid)
{
int ret;
if (!qp->device->attach_mcast)
return -ENOSYS;
if (gid->raw[0] != 0xff || qp->qp_type != IB_QPT_UD ||
lid < be16_to_cpu(IB_MULTICAST_LID_BASE) ||
lid == be16_to_cpu(IB_LID_PERMISSIVE))
if (!rdma_is_multicast_addr((struct in6_addr *)gid->raw) ||
qp->qp_type != IB_QPT_UD || !is_valid_mcast_lid(qp, lid))
return -EINVAL;
ret = qp->device->attach_mcast(qp, gid, lid);
......@@ -1597,9 +1696,9 @@ int ib_detach_mcast(struct ib_qp *qp, union ib_gid *gid, u16 lid)
if (!qp->device->detach_mcast)
return -ENOSYS;
if (gid->raw[0] != 0xff || qp->qp_type != IB_QPT_UD ||
lid < be16_to_cpu(IB_MULTICAST_LID_BASE) ||
lid == be16_to_cpu(IB_LID_PERMISSIVE))
if (!rdma_is_multicast_addr((struct in6_addr *)gid->raw) ||
qp->qp_type != IB_QPT_UD || !is_valid_mcast_lid(qp, lid))
return -EINVAL;
ret = qp->device->detach_mcast(qp, gid, lid);
......
......@@ -3,4 +3,4 @@ ccflags-y := -Idrivers/net/ethernet/broadcom/bnxt
obj-$(CONFIG_INFINIBAND_BNXT_RE) += bnxt_re.o
bnxt_re-y := main.o ib_verbs.o \
qplib_res.o qplib_rcfw.o \
qplib_sp.o qplib_fp.o
qplib_sp.o qplib_fp.o hw_counters.o
......@@ -85,7 +85,7 @@ struct bnxt_re_sqp_entries {
};
#define BNXT_RE_MIN_MSIX 2
#define BNXT_RE_MAX_MSIX 16
#define BNXT_RE_MAX_MSIX 9
#define BNXT_RE_AEQ_IDX 0
#define BNXT_RE_NQ_IDX 1
......@@ -116,7 +116,7 @@ struct bnxt_re_dev {
struct bnxt_qplib_rcfw rcfw;
/* NQ */
struct bnxt_qplib_nq nq;
struct bnxt_qplib_nq nq[BNXT_RE_MAX_MSIX];
/* Device Resources */
struct bnxt_qplib_dev_attr dev_attr;
......@@ -140,6 +140,7 @@ struct bnxt_re_dev {
struct bnxt_re_qp *qp1_sqp;
struct bnxt_re_ah *sqp_ah;
struct bnxt_re_sqp_entries sqp_tbl[1024];
atomic_t nq_alloc_cnt;
};
#define to_bnxt_re_dev(ptr, member) \
......
/*
* Broadcom NetXtreme-E RoCE driver.
*
* Copyright (c) 2016 - 2017, Broadcom. All rights reserved. The term
* Broadcom refers to Broadcom Limited and/or its subsidiaries.
*
* This software is available to you under a choice of one of two
* licenses. You may choose to be licensed under the terms of the GNU
* General Public License (GPL) Version 2, available from the file
* COPYING in the main directory of this source tree, or the
* BSD license below:
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in
* the documentation and/or other materials provided with the
* distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS''
* AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
* THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
* PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS
* BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
* CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
* SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
* BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
* WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE
* OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN
* IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*
* Description: Statistics
*
*/
#include <linux/interrupt.h>
#include <linux/types.h>
#include <linux/spinlock.h>
#include <linux/sched.h>
#include <linux/slab.h>
#include <linux/pci.h>
#include <linux/prefetch.h>
#include <linux/delay.h>
#include <rdma/ib_addr.h>
#include "bnxt_ulp.h"
#include "roce_hsi.h"
#include "qplib_res.h"
#include "qplib_sp.h"
#include "qplib_fp.h"
#include "qplib_rcfw.h"
#include "bnxt_re.h"
#include "hw_counters.h"
static const char * const bnxt_re_stat_name[] = {
[BNXT_RE_ACTIVE_QP] = "active_qps",
[BNXT_RE_ACTIVE_SRQ] = "active_srqs",
[BNXT_RE_ACTIVE_CQ] = "active_cqs",
[BNXT_RE_ACTIVE_MR] = "active_mrs",
[BNXT_RE_ACTIVE_MW] = "active_mws",
[BNXT_RE_RX_PKTS] = "rx_pkts",
[BNXT_RE_RX_BYTES] = "rx_bytes",
[BNXT_RE_TX_PKTS] = "tx_pkts",
[BNXT_RE_TX_BYTES] = "tx_bytes",
[BNXT_RE_RECOVERABLE_ERRORS] = "recoverable_errors"
};
int bnxt_re_ib_get_hw_stats(struct ib_device *ibdev,
struct rdma_hw_stats *stats,
u8 port, int index)
{
struct bnxt_re_dev *rdev = to_bnxt_re_dev(ibdev, ibdev);
struct ctx_hw_stats *bnxt_re_stats = rdev->qplib_ctx.stats.dma;
if (!port || !stats)
return -EINVAL;
stats->value[BNXT_RE_ACTIVE_QP] = atomic_read(&rdev->qp_count);
stats->value[BNXT_RE_ACTIVE_SRQ] = atomic_read(&rdev->srq_count);
stats->value[BNXT_RE_ACTIVE_CQ] = atomic_read(&rdev->cq_count);
stats->value[BNXT_RE_ACTIVE_MR] = atomic_read(&rdev->mr_count);
stats->value[BNXT_RE_ACTIVE_MW] = atomic_read(&rdev->mw_count);
if (bnxt_re_stats) {
stats->value[BNXT_RE_RECOVERABLE_ERRORS] =
le64_to_cpu(bnxt_re_stats->tx_bcast_pkts);
stats->value[BNXT_RE_RX_PKTS] =
le64_to_cpu(bnxt_re_stats->rx_ucast_pkts);
stats->value[BNXT_RE_RX_BYTES] =
le64_to_cpu(bnxt_re_stats->rx_ucast_bytes);
stats->value[BNXT_RE_TX_PKTS] =
le64_to_cpu(bnxt_re_stats->tx_ucast_pkts);
stats->value[BNXT_RE_TX_BYTES] =
le64_to_cpu(bnxt_re_stats->tx_ucast_bytes);
}
return ARRAY_SIZE(bnxt_re_stat_name);
}
struct rdma_hw_stats *bnxt_re_ib_alloc_hw_stats(struct ib_device *ibdev,
u8 port_num)
{
BUILD_BUG_ON(ARRAY_SIZE(bnxt_re_stat_name) != BNXT_RE_NUM_COUNTERS);
/* We support only per port stats */
if (!port_num)
return NULL;
return rdma_alloc_hw_stats_struct(bnxt_re_stat_name,
ARRAY_SIZE(bnxt_re_stat_name),
RDMA_HW_STATS_DEFAULT_LIFESPAN);
}
/*
* Broadcom NetXtreme-E RoCE driver.
*
* Copyright (c) 2016 - 2017, Broadcom. All rights reserved. The term
* Broadcom refers to Broadcom Limited and/or its subsidiaries.
*
* This software is available to you under a choice of one of two
* licenses. You may choose to be licensed under the terms of the GNU
* General Public License (GPL) Version 2, available from the file
* COPYING in the main directory of this source tree, or the
* BSD license below:
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in
* the documentation and/or other materials provided with the
* distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS''
* AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
* THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
* PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS
* BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
* CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
* SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
* BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
* WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE
* OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN
* IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*
* Description: Statistics (header)
*
*/
#ifndef __BNXT_RE_HW_STATS_H__
#define __BNXT_RE_HW_STATS_H__
enum bnxt_re_hw_stats {
BNXT_RE_ACTIVE_QP,
BNXT_RE_ACTIVE_SRQ,
BNXT_RE_ACTIVE_CQ,
BNXT_RE_ACTIVE_MR,
BNXT_RE_ACTIVE_MW,
BNXT_RE_RX_PKTS,
BNXT_RE_RX_BYTES,
BNXT_RE_TX_PKTS,
BNXT_RE_TX_BYTES,
BNXT_RE_RECOVERABLE_ERRORS,
BNXT_RE_NUM_COUNTERS
};
struct rdma_hw_stats *bnxt_re_ib_alloc_hw_stats(struct ib_device *ibdev,
u8 port_num);
int bnxt_re_ib_get_hw_stats(struct ib_device *ibdev,
struct rdma_hw_stats *stats,
u8 port, int index);
#endif /* __BNXT_RE_HW_STATS_H__ */
This diff is collapsed.
......@@ -141,9 +141,6 @@ int bnxt_re_modify_device(struct ib_device *ibdev,
struct ib_device_modify *device_modify);
int bnxt_re_query_port(struct ib_device *ibdev, u8 port_num,
struct ib_port_attr *port_attr);
int bnxt_re_modify_port(struct ib_device *ibdev, u8 port_num,
int port_modify_mask,
struct ib_port_modify *port_modify);
int bnxt_re_get_port_immutable(struct ib_device *ibdev, u8 port_num,
struct ib_port_immutable *immutable);
int bnxt_re_query_pkey(struct ib_device *ibdev, u8 port_num,
......
This diff is collapsed.
This diff is collapsed.
......@@ -220,19 +220,20 @@ struct bnxt_qplib_q {
u16 q_full_delta;
u16 max_sge;
u32 psn;
bool flush_in_progress;
bool condition;
bool single;
bool send_phantom;
u32 phantom_wqe_cnt;
u32 phantom_cqe_cnt;
u32 next_cq_cons;
bool flushed;
};
struct bnxt_qplib_qp {
struct bnxt_qplib_pd *pd;
struct bnxt_qplib_dpi *dpi;
u64 qp_handle;
#define BNXT_QPLIB_QP_ID_INVALID 0xFFFFFFFF
u32 id;
u8 type;
u8 sig_type;
......@@ -296,6 +297,8 @@ struct bnxt_qplib_qp {
dma_addr_t sq_hdr_buf_map;
void *rq_hdr_buf;
dma_addr_t rq_hdr_buf_map;
struct list_head sq_flush;
struct list_head rq_flush;
};
#define BNXT_QPLIB_MAX_CQE_ENTRY_SIZE sizeof(struct cq_base)
......@@ -351,6 +354,7 @@ struct bnxt_qplib_cq {
u16 period;
struct bnxt_qplib_hwq hwq;
u32 cnq_hw_ring_id;
struct bnxt_qplib_nq *nq;
bool resize_in_progress;
struct scatterlist *sghead;
u32 nmap;
......@@ -360,6 +364,9 @@ struct bnxt_qplib_cq {
unsigned long flags;
#define CQ_FLAGS_RESIZE_IN_PROG 1
wait_queue_head_t waitq;
struct list_head sqf_head, rqf_head;
atomic_t arm_state;
spinlock_t compl_lock; /* synch CQ handlers */
};
#define BNXT_QPLIB_MAX_IRRQE_ENTRY_SIZE sizeof(struct xrrq_irrq)
......@@ -400,6 +407,7 @@ struct bnxt_qplib_nq {
struct pci_dev *pdev;
int vector;
cpumask_t mask;
int budget;
bool requested;
struct tasklet_struct worker;
......@@ -417,11 +425,19 @@ struct bnxt_qplib_nq {
(struct bnxt_qplib_nq *nq,
void *srq,
u8 event);
struct workqueue_struct *cqn_wq;
char name[32];
};
struct bnxt_qplib_nq_work {
struct work_struct work;
struct bnxt_qplib_nq *nq;
struct bnxt_qplib_cq *cq;
};
void bnxt_qplib_disable_nq(struct bnxt_qplib_nq *nq);
int bnxt_qplib_enable_nq(struct pci_dev *pdev, struct bnxt_qplib_nq *nq,
int msix_vector, int bar_reg_offset,
int nq_idx, int msix_vector, int bar_reg_offset,
int (*cqn_handler)(struct bnxt_qplib_nq *nq,
struct bnxt_qplib_cq *cq),
int (*srqn_handler)(struct bnxt_qplib_nq *nq,
......@@ -453,4 +469,13 @@ bool bnxt_qplib_is_cq_empty(struct bnxt_qplib_cq *cq);
void bnxt_qplib_req_notify_cq(struct bnxt_qplib_cq *cq, u32 arm_type);
void bnxt_qplib_free_nq(struct bnxt_qplib_nq *nq);
int bnxt_qplib_alloc_nq(struct pci_dev *pdev, struct bnxt_qplib_nq *nq);
void bnxt_qplib_add_flush_qp(struct bnxt_qplib_qp *qp);
void bnxt_qplib_del_flush_qp(struct bnxt_qplib_qp *qp);
void bnxt_qplib_acquire_cq_locks(struct bnxt_qplib_qp *qp,
unsigned long *flags);
void bnxt_qplib_release_cq_locks(struct bnxt_qplib_qp *qp,
unsigned long *flags);
int bnxt_qplib_process_flush_list(struct bnxt_qplib_cq *cq,
struct bnxt_qplib_cqe *cqe,
int num_cqes);
#endif /* __BNXT_QPLIB_FP_H__ */
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
......@@ -116,6 +116,7 @@ struct bnxt_qplib_sgid_tbl {
u16 max;
u16 active;
void *ctx;
u8 *vlan;
};
struct bnxt_qplib_pkey_tbl {
......@@ -188,6 +189,7 @@ struct bnxt_qplib_res {
struct bnxt_qplib_sgid_tbl sgid_tbl;
struct bnxt_qplib_pkey_tbl pkey_tbl;
struct bnxt_qplib_dpi_tbl dpi_tbl;
bool prio;
};
#define to_bnxt_qplib(ptr, type, member) \
......
This diff is collapsed.
......@@ -135,6 +135,8 @@ int bnxt_qplib_del_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl,
int bnxt_qplib_add_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl,
struct bnxt_qplib_gid *gid, u8 *mac, u16 vlan_id,
bool update, u32 *index);
int bnxt_qplib_update_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl,
struct bnxt_qplib_gid *gid, u16 gid_idx, u8 *smac);
int bnxt_qplib_get_pkey(struct bnxt_qplib_res *res,
struct bnxt_qplib_pkey_tbl *pkey_tbl, u16 index,
u16 *pkey);
......
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment