Commit c5b6c3ee authored by David S. Miller's avatar David S. Miller

Merge branch 'mlx5-connectx-4-sriov'

Or Gerlitz says:

====================
Introducing ConnectX-4 Ethernet SRIOV

This patchset introduces the support of Ethernet SRIOV in ConnectX-4
family of 100G Ethernet NICs.

Some features are still missing, but all the basic SRIOV functionalities
are there already.

Basic Introduction:
ConnectX-4 HW architecture provides two kinds of underlying HW switches.

MPFS (Multi Physical Function Switch) or L2 Table in Software terms:

The HCA has one MPFS switch per physical port, this switch is responsible
of forwarding Unicast traffic to the various overlying Physical Functions (PFs).
Multicast traffic is flooded amongst all the PFs, Each PF can request to
forward a unicast MAC to its E-Switch Uplink vport (which we will cover later)
through SET_L2_TABLE_ENTRY HW command.

MPFS has five ports, four are connected to PFs (one for each) and one is connected
directly to the Physical Port (Physical Link).

E-Switch (Ethernet Switch):

The HCA has one per physical function. The main responsibility of this component is
to forward Unicast/Multicast and vlan tagged/untagged traffic to the various
Virtual Functions (VFs) allocated by the PF. Unlike MPFS, the PF needs to explicitly
create the E-Switch FDB table, Which is a HW flow table managed by the PF driver
whenever vport_group_manager capability bit is set for this PF.

E-Switch has Virtual Ports (vports) entities as its ports, vport0 and uplink vport
are special kind of vports that represents PF vport (vport0) and uplink vport which
is connected to the MPFS switch (if exists) as the PF external link.
vport1..vportN represent VF0..VF(N-1) egress/ingress ports.

E-Switch FDB contains forwarding rules such as:
        UC MAC0 -> vport0(PF).
        UC MAC1 -> vport1.
        UC MAC2 -> vport2.
        MC MACX -> vport0, vport2, Uplink.
        MC MACY -> vport1, Uplink.

    For unmatched traffic FDB has the following default rules:
        Unmatched Traffic (src vport != Uplink) -> Uplink.
        Unmatched Traffic (src vport == Uplink) -> vport0(PF).

NIC VPort context:
Each NIC (VF/PF) has its own vport context which will be used to store the current
NIC vport context (UC/MC and vlan lists) and other NIC properties such as MTU, promisc
mode, etc.. NIC (VF/PF) driver is responsible of constantly updating this context.

FDB rules population:
Each NIC vport (VF/PF) will notify E-Switch manager of its UC/MC vport
context changes via modify vport context command, which will be
translated to an event that will be handled by E-Switch manager (PF)
which will update FDB table accordingly.

Both PF and VF use the same driver and submit commands directly to the firmware.
The PF sees the vport_group_manager capability bit and as such runs the code
to populate the embedded switches as explained above.

The patch goes as follows:

Patches 1-2 introduces the basic PCI SRIOV functionalities and the support of
Connectx4 to enable specific VFs via enable/disable HCA commands. These two
patches will be also in use later for the IB SRIOV flow.

Patches 3-8 Introduces the basic E-Switch capabilities and commands to be used later by
VF to modify and update its NIC vport context, and by PF (E-Switch Manager) driver to
Query the VF NIC context and acts accordingly.

Patches 9-10 Provide the needed functionality of a NIC driver VF/PF to support SRIOV,
mainly vport context update support.

Patch 11 ("net/mlx5: Introducing E-Switch and l2 table"), Introduces the basic
E-Switch support and infrastructure to read vport context events and to update
MPFS L2 Table of the UC mac addresses request by the PF.

Patches 12-18 Introduces SRIOV enablemenet and E-Switch FDB table management
It adds the Basic E-Swtich public API to set and get sriov properties to be used
in PF netdev sriov ndos.

Patchset was applied ontop of commit 3f8c0f7e "gianfar: use of_property_read_bool()"

Saeed, Eli and Or.

changes from V0, addressed feedback from Alex Duyck:
 - patch 09, remove the loop to seek the device address
 - patch 09, avoid using array as returned value from helper function
 - patch 10, fix possible buffer over-run

changes from V1, addressed feedback from and Julia Lawall and kbuild test robot
 - patch 11 check the right variable for allocation failure
 - patch 18 eliminated unneeded semicolon
====================
Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
parents 24e2416e 66e49ded
......@@ -2,7 +2,7 @@ obj-$(CONFIG_MLX5_CORE) += mlx5_core.o
mlx5_core-y := main.o cmd.o debugfs.o fw.o eq.o uar.o pagealloc.o \
health.o mcg.o cq.o srq.o alloc.o qp.o port.o mr.o pd.o \
mad.o transobj.o vport.o
mlx5_core-$(CONFIG_MLX5_CORE_EN) += wq.o flow_table.o \
mad.o transobj.o vport.o sriov.o
mlx5_core-$(CONFIG_MLX5_CORE_EN) += wq.o flow_table.o eswitch.o \
en_main.o en_flow_table.o en_ethtool.o en_tx.o en_rx.o \
en_txrx.o
......@@ -465,6 +465,7 @@ enum {
};
struct mlx5e_vlan_db {
unsigned long active_vlans[BITS_TO_LONGS(VLAN_N_VID)];
u32 active_vlans_ft_ix[VLAN_N_VID];
u32 untagged_rule_ft_ix;
u32 any_vlan_rule_ft_ix;
......
......@@ -502,6 +502,49 @@ static int mlx5e_add_eth_addr_rule(struct mlx5e_priv *priv,
return err;
}
static int mlx5e_vport_context_update_vlans(struct mlx5e_priv *priv)
{
struct net_device *ndev = priv->netdev;
int max_list_size;
int list_size;
u16 *vlans;
int vlan;
int err;
int i;
list_size = 0;
for_each_set_bit(vlan, priv->vlan.active_vlans, VLAN_N_VID)
list_size++;
max_list_size = 1 << MLX5_CAP_GEN(priv->mdev, log_max_vlan_list);
if (list_size > max_list_size) {
netdev_warn(ndev,
"netdev vlans list size (%d) > (%d) max vport list size, some vlans will be dropped\n",
list_size, max_list_size);
list_size = max_list_size;
}
vlans = kcalloc(list_size, sizeof(*vlans), GFP_KERNEL);
if (!vlans)
return -ENOMEM;
i = 0;
for_each_set_bit(vlan, priv->vlan.active_vlans, VLAN_N_VID) {
if (i >= list_size)
break;
vlans[i++] = vlan;
}
err = mlx5_modify_nic_vport_vlans(priv->mdev, vlans, list_size);
if (err)
netdev_err(ndev, "Failed to modify vport vlans list err(%d)\n",
err);
kfree(vlans);
return err;
}
enum mlx5e_vlan_rule_type {
MLX5E_VLAN_RULE_TYPE_UNTAGGED,
MLX5E_VLAN_RULE_TYPE_ANY_VID,
......@@ -552,6 +595,10 @@ static int mlx5e_add_vlan_rule(struct mlx5e_priv *priv,
1);
break;
default: /* MLX5E_VLAN_RULE_TYPE_MATCH_VID */
err = mlx5e_vport_context_update_vlans(priv);
if (err)
goto add_vlan_rule_out;
ft_ix = &priv->vlan.active_vlans_ft_ix[vid];
MLX5_SET(fte_match_param, match_value, outer_headers.vlan_tag,
1);
......@@ -588,6 +635,7 @@ static void mlx5e_del_vlan_rule(struct mlx5e_priv *priv,
case MLX5E_VLAN_RULE_TYPE_MATCH_VID:
mlx5_del_flow_table_entry(priv->ft.vlan,
priv->vlan.active_vlans_ft_ix[vid]);
mlx5e_vport_context_update_vlans(priv);
break;
}
}
......@@ -619,6 +667,8 @@ int mlx5e_vlan_rx_add_vid(struct net_device *dev, __always_unused __be16 proto,
{
struct mlx5e_priv *priv = netdev_priv(dev);
set_bit(vid, priv->vlan.active_vlans);
return mlx5e_add_vlan_rule(priv, MLX5E_VLAN_RULE_TYPE_MATCH_VID, vid);
}
......@@ -627,6 +677,8 @@ int mlx5e_vlan_rx_kill_vid(struct net_device *dev, __always_unused __be16 proto,
{
struct mlx5e_priv *priv = netdev_priv(dev);
clear_bit(vid, priv->vlan.active_vlans);
mlx5e_del_vlan_rule(priv, MLX5E_VLAN_RULE_TYPE_MATCH_VID, vid);
return 0;
......@@ -671,6 +723,91 @@ static void mlx5e_sync_netdev_addr(struct mlx5e_priv *priv)
netif_addr_unlock_bh(netdev);
}
static void mlx5e_fill_addr_array(struct mlx5e_priv *priv, int list_type,
u8 addr_array[][ETH_ALEN], int size)
{
bool is_uc = (list_type == MLX5_NVPRT_LIST_TYPE_UC);
struct net_device *ndev = priv->netdev;
struct mlx5e_eth_addr_hash_node *hn;
struct hlist_head *addr_list;
struct hlist_node *tmp;
int i = 0;
int hi;
addr_list = is_uc ? priv->eth_addr.netdev_uc : priv->eth_addr.netdev_mc;
if (is_uc) /* Make sure our own address is pushed first */
ether_addr_copy(addr_array[i++], ndev->dev_addr);
else if (priv->eth_addr.broadcast_enabled)
ether_addr_copy(addr_array[i++], ndev->broadcast);
mlx5e_for_each_hash_node(hn, tmp, addr_list, hi) {
if (ether_addr_equal(ndev->dev_addr, hn->ai.addr))
continue;
if (i >= size)
break;
ether_addr_copy(addr_array[i++], hn->ai.addr);
}
}
static void mlx5e_vport_context_update_addr_list(struct mlx5e_priv *priv,
int list_type)
{
bool is_uc = (list_type == MLX5_NVPRT_LIST_TYPE_UC);
struct mlx5e_eth_addr_hash_node *hn;
u8 (*addr_array)[ETH_ALEN] = NULL;
struct hlist_head *addr_list;
struct hlist_node *tmp;
int max_size;
int size;
int err;
int hi;
size = is_uc ? 0 : (priv->eth_addr.broadcast_enabled ? 1 : 0);
max_size = is_uc ?
1 << MLX5_CAP_GEN(priv->mdev, log_max_current_uc_list) :
1 << MLX5_CAP_GEN(priv->mdev, log_max_current_mc_list);
addr_list = is_uc ? priv->eth_addr.netdev_uc : priv->eth_addr.netdev_mc;
mlx5e_for_each_hash_node(hn, tmp, addr_list, hi)
size++;
if (size > max_size) {
netdev_warn(priv->netdev,
"netdev %s list size (%d) > (%d) max vport list size, some addresses will be dropped\n",
is_uc ? "UC" : "MC", size, max_size);
size = max_size;
}
if (size) {
addr_array = kcalloc(size, ETH_ALEN, GFP_KERNEL);
if (!addr_array) {
err = -ENOMEM;
goto out;
}
mlx5e_fill_addr_array(priv, list_type, addr_array, size);
}
err = mlx5_modify_nic_vport_mac_list(priv->mdev, list_type, addr_array, size);
out:
if (err)
netdev_err(priv->netdev,
"Failed to modify vport %s list err(%d)\n",
is_uc ? "UC" : "MC", err);
kfree(addr_array);
}
static void mlx5e_vport_context_update(struct mlx5e_priv *priv)
{
struct mlx5e_eth_addr_db *ea = &priv->eth_addr;
mlx5e_vport_context_update_addr_list(priv, MLX5_NVPRT_LIST_TYPE_UC);
mlx5e_vport_context_update_addr_list(priv, MLX5_NVPRT_LIST_TYPE_MC);
mlx5_modify_nic_vport_promisc(priv->mdev, 0,
ea->allmulti_enabled,
ea->promisc_enabled);
}
static void mlx5e_apply_netdev_addr(struct mlx5e_priv *priv)
{
struct mlx5e_eth_addr_hash_node *hn;
......@@ -748,6 +885,8 @@ void mlx5e_set_rx_mode_work(struct work_struct *work)
ea->promisc_enabled = promisc_enabled;
ea->allmulti_enabled = allmulti_enabled;
ea->broadcast_enabled = broadcast_enabled;
mlx5e_vport_context_update(priv);
}
void mlx5e_init_eth_addr(struct mlx5e_priv *priv)
......
......@@ -32,6 +32,7 @@
#include <linux/mlx5/flow_table.h>
#include "en.h"
#include "eswitch.h"
struct mlx5e_rq_param {
u32 rqc[MLX5_ST_SZ_DW(rqc)];
......@@ -63,7 +64,7 @@ static void mlx5e_update_carrier(struct mlx5e_priv *priv)
u8 port_state;
port_state = mlx5_query_vport_state(mdev,
MLX5_QUERY_VPORT_STATE_IN_OP_MOD_VNIC_VPORT);
MLX5_QUERY_VPORT_STATE_IN_OP_MOD_VNIC_VPORT, 0);
if (port_state == VPORT_STATE_UP)
netif_carrier_on(priv->netdev);
......@@ -1931,6 +1932,79 @@ static int mlx5e_change_mtu(struct net_device *netdev, int new_mtu)
return err;
}
static int mlx5e_set_vf_mac(struct net_device *dev, int vf, u8 *mac)
{
struct mlx5e_priv *priv = netdev_priv(dev);
struct mlx5_core_dev *mdev = priv->mdev;
return mlx5_eswitch_set_vport_mac(mdev->priv.eswitch, vf + 1, mac);
}
static int mlx5e_set_vf_vlan(struct net_device *dev, int vf, u16 vlan, u8 qos)
{
struct mlx5e_priv *priv = netdev_priv(dev);
struct mlx5_core_dev *mdev = priv->mdev;
return mlx5_eswitch_set_vport_vlan(mdev->priv.eswitch, vf + 1,
vlan, qos);
}
static int mlx5_vport_link2ifla(u8 esw_link)
{
switch (esw_link) {
case MLX5_ESW_VPORT_ADMIN_STATE_DOWN:
return IFLA_VF_LINK_STATE_DISABLE;
case MLX5_ESW_VPORT_ADMIN_STATE_UP:
return IFLA_VF_LINK_STATE_ENABLE;
}
return IFLA_VF_LINK_STATE_AUTO;
}
static int mlx5_ifla_link2vport(u8 ifla_link)
{
switch (ifla_link) {
case IFLA_VF_LINK_STATE_DISABLE:
return MLX5_ESW_VPORT_ADMIN_STATE_DOWN;
case IFLA_VF_LINK_STATE_ENABLE:
return MLX5_ESW_VPORT_ADMIN_STATE_UP;
}
return MLX5_ESW_VPORT_ADMIN_STATE_AUTO;
}
static int mlx5e_set_vf_link_state(struct net_device *dev, int vf,
int link_state)
{
struct mlx5e_priv *priv = netdev_priv(dev);
struct mlx5_core_dev *mdev = priv->mdev;
return mlx5_eswitch_set_vport_state(mdev->priv.eswitch, vf + 1,
mlx5_ifla_link2vport(link_state));
}
static int mlx5e_get_vf_config(struct net_device *dev,
int vf, struct ifla_vf_info *ivi)
{
struct mlx5e_priv *priv = netdev_priv(dev);
struct mlx5_core_dev *mdev = priv->mdev;
int err;
err = mlx5_eswitch_get_vport_config(mdev->priv.eswitch, vf + 1, ivi);
if (err)
return err;
ivi->linkstate = mlx5_vport_link2ifla(ivi->linkstate);
return 0;
}
static int mlx5e_get_vf_stats(struct net_device *dev,
int vf, struct ifla_vf_stats *vf_stats)
{
struct mlx5e_priv *priv = netdev_priv(dev);
struct mlx5_core_dev *mdev = priv->mdev;
return mlx5_eswitch_get_vport_stats(mdev->priv.eswitch, vf + 1,
vf_stats);
}
static struct net_device_ops mlx5e_netdev_ops = {
.ndo_open = mlx5e_open,
.ndo_stop = mlx5e_close,
......@@ -1941,7 +2015,7 @@ static struct net_device_ops mlx5e_netdev_ops = {
.ndo_vlan_rx_add_vid = mlx5e_vlan_rx_add_vid,
.ndo_vlan_rx_kill_vid = mlx5e_vlan_rx_kill_vid,
.ndo_set_features = mlx5e_set_features,
.ndo_change_mtu = mlx5e_change_mtu,
.ndo_change_mtu = mlx5e_change_mtu
};
static int mlx5e_check_required_hca_cap(struct mlx5_core_dev *mdev)
......@@ -2028,7 +2102,7 @@ static void mlx5e_set_netdev_dev_addr(struct net_device *netdev)
{
struct mlx5e_priv *priv = netdev_priv(netdev);
mlx5_query_nic_vport_mac_address(priv->mdev, netdev->dev_addr);
mlx5_query_nic_vport_mac_address(priv->mdev, 0, netdev->dev_addr);
}
static void mlx5e_build_netdev(struct net_device *netdev)
......@@ -2041,6 +2115,14 @@ static void mlx5e_build_netdev(struct net_device *netdev)
if (priv->params.num_tc > 1)
mlx5e_netdev_ops.ndo_select_queue = mlx5e_select_queue;
if (MLX5_CAP_GEN(mdev, vport_group_manager)) {
mlx5e_netdev_ops.ndo_set_vf_mac = mlx5e_set_vf_mac;
mlx5e_netdev_ops.ndo_set_vf_vlan = mlx5e_set_vf_vlan;
mlx5e_netdev_ops.ndo_get_vf_config = mlx5e_get_vf_config;
mlx5e_netdev_ops.ndo_set_vf_link_state = mlx5e_set_vf_link_state;
mlx5e_netdev_ops.ndo_get_vf_stats = mlx5e_get_vf_stats;
}
netdev->netdev_ops = &mlx5e_netdev_ops;
netdev->watchdog_timeo = 15 * HZ;
......
......@@ -35,6 +35,9 @@
#include <linux/mlx5/driver.h>
#include <linux/mlx5/cmd.h>
#include "mlx5_core.h"
#ifdef CONFIG_MLX5_CORE_EN
#include "eswitch.h"
#endif
enum {
MLX5_EQE_SIZE = sizeof(struct mlx5_eqe),
......@@ -287,6 +290,11 @@ static int mlx5_eq_int(struct mlx5_core_dev *dev, struct mlx5_eq *eq)
break;
#endif
#ifdef CONFIG_MLX5_CORE_EN
case MLX5_EVENT_TYPE_NIC_VPORT_CHANGE:
mlx5_eswitch_vport_event(dev->priv.eswitch, eqe);
break;
#endif
default:
mlx5_core_warn(dev, "Unhandled event 0x%x on EQ 0x%x\n",
eqe->type, eq->eqn);
......@@ -459,6 +467,11 @@ int mlx5_start_eqs(struct mlx5_core_dev *dev)
if (MLX5_CAP_GEN(dev, pg))
async_event_mask |= (1ull << MLX5_EVENT_TYPE_PAGE_FAULT);
if (MLX5_CAP_GEN(dev, port_type) == MLX5_CAP_PORT_TYPE_ETH &&
MLX5_CAP_GEN(dev, vport_group_manager) &&
mlx5_core_is_pf(dev))
async_event_mask |= (1ull << MLX5_EVENT_TYPE_NIC_VPORT_CHANGE);
err = mlx5_create_map_eq(dev, &table->cmd_eq, MLX5_EQ_VEC_CMD,
MLX5_NUM_CMD_EQE, 1ull << MLX5_EVENT_TYPE_CMD,
"mlx5_cmd_eq", &dev->priv.uuari.uars[0]);
......
This diff is collapsed.
/*
* Copyright (c) 2015, Mellanox Technologies, Ltd. All rights reserved.
*
* This software is available to you under a choice of one of two
* licenses. You may choose to be licensed under the terms of the GNU
* General Public License (GPL) Version 2, available from the file
* COPYING in the main directory of this source tree, or the
* OpenIB.org BSD license below:
*
* Redistribution and use in source and binary forms, with or
* without modification, are permitted provided that the following
* conditions are met:
*
* - Redistributions of source code must retain the above
* copyright notice, this list of conditions and the following
* disclaimer.
*
* - Redistributions in binary form must reproduce the above
* copyright notice, this list of conditions and the following
* disclaimer in the documentation and/or other materials
* provided with the distribution.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
* BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
* ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
* CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
* SOFTWARE.
*/
#ifndef __MLX5_ESWITCH_H__
#define __MLX5_ESWITCH_H__
#include <linux/if_ether.h>
#include <linux/if_link.h>
#include <linux/mlx5/device.h>
#define MLX5_MAX_UC_PER_VPORT(dev) \
(1 << MLX5_CAP_GEN(dev, log_max_current_uc_list))
#define MLX5_MAX_MC_PER_VPORT(dev) \
(1 << MLX5_CAP_GEN(dev, log_max_current_mc_list))
#define MLX5_L2_ADDR_HASH_SIZE (BIT(BITS_PER_BYTE))
#define MLX5_L2_ADDR_HASH(addr) (addr[5])
/* L2 -mac address based- hash helpers */
struct l2addr_node {
struct hlist_node hlist;
u8 addr[ETH_ALEN];
};
#define for_each_l2hash_node(hn, tmp, hash, i) \
for (i = 0; i < MLX5_L2_ADDR_HASH_SIZE; i++) \
hlist_for_each_entry_safe(hn, tmp, &hash[i], hlist)
#define l2addr_hash_find(hash, mac, type) ({ \
int ix = MLX5_L2_ADDR_HASH(mac); \
bool found = false; \
type *ptr = NULL; \
\
hlist_for_each_entry(ptr, &hash[ix], node.hlist) \
if (ether_addr_equal(ptr->node.addr, mac)) {\
found = true; \
break; \
} \
if (!found) \
ptr = NULL; \
ptr; \
})
#define l2addr_hash_add(hash, mac, type, gfp) ({ \
int ix = MLX5_L2_ADDR_HASH(mac); \
type *ptr = NULL; \
\
ptr = kzalloc(sizeof(type), gfp); \
if (ptr) { \
ether_addr_copy(ptr->node.addr, mac); \
hlist_add_head(&ptr->node.hlist, &hash[ix]);\
} \
ptr; \
})
#define l2addr_hash_del(ptr) ({ \
hlist_del(&ptr->node.hlist); \
kfree(ptr); \
})
struct mlx5_flow_rule {
void *ft;
u32 fi;
u8 match_criteria_enable;
u32 *match_criteria;
u32 *match_value;
u32 action;
u32 flow_tag;
bool valid;
atomic_t refcount;
struct mutex mutex; /* protect flow rule updates */
struct list_head dest_list;
};
struct mlx5_vport {
struct mlx5_core_dev *dev;
int vport;
struct hlist_head uc_list[MLX5_L2_ADDR_HASH_SIZE];
struct hlist_head mc_list[MLX5_L2_ADDR_HASH_SIZE];
struct work_struct vport_change_handler;
/* This spinlock protects access to vport data, between
* "esw_vport_disable" and ongoing interrupt "mlx5_eswitch_vport_event"
* once vport marked as disabled new interrupts are discarded.
*/
spinlock_t lock; /* vport events sync */
bool enabled;
u16 enabled_events;
};
struct mlx5_l2_table {
struct hlist_head l2_hash[MLX5_L2_ADDR_HASH_SIZE];
u32 size;
unsigned long *bitmap;
};
struct mlx5_eswitch_fdb {
void *fdb;
};
struct mlx5_eswitch {
struct mlx5_core_dev *dev;
struct mlx5_l2_table l2_table;
struct mlx5_eswitch_fdb fdb_table;
struct hlist_head mc_table[MLX5_L2_ADDR_HASH_SIZE];
struct workqueue_struct *work_queue;
struct mlx5_vport *vports;
int total_vports;
int enabled_vports;
};
/* E-Switch API */
int mlx5_eswitch_init(struct mlx5_core_dev *dev);
void mlx5_eswitch_cleanup(struct mlx5_eswitch *esw);
void mlx5_eswitch_vport_event(struct mlx5_eswitch *esw, struct mlx5_eqe *eqe);
int mlx5_eswitch_enable_sriov(struct mlx5_eswitch *esw, int nvfs);
void mlx5_eswitch_disable_sriov(struct mlx5_eswitch *esw);
int mlx5_eswitch_set_vport_mac(struct mlx5_eswitch *esw,
int vport, u8 mac[ETH_ALEN]);
int mlx5_eswitch_set_vport_state(struct mlx5_eswitch *esw,
int vport, int link_state);
int mlx5_eswitch_set_vport_vlan(struct mlx5_eswitch *esw,
int vport, u16 vlan, u8 qos);
int mlx5_eswitch_get_vport_config(struct mlx5_eswitch *esw,
int vport, struct ifla_vf_info *ivi);
int mlx5_eswitch_get_vport_stats(struct mlx5_eswitch *esw,
int vport,
struct ifla_vf_stats *vf_stats);
#endif /* __MLX5_ESWITCH_H__ */
......@@ -160,6 +160,30 @@ int mlx5_query_hca_caps(struct mlx5_core_dev *dev)
if (err)
return err;
}
if (MLX5_CAP_GEN(dev, vport_group_manager) &&
MLX5_CAP_GEN(dev, eswitch_flow_table)) {
err = mlx5_core_get_caps(dev, MLX5_CAP_ESWITCH_FLOW_TABLE,
HCA_CAP_OPMOD_GET_CUR);
if (err)
return err;
err = mlx5_core_get_caps(dev, MLX5_CAP_ESWITCH_FLOW_TABLE,
HCA_CAP_OPMOD_GET_MAX);
if (err)
return err;
}
if (MLX5_CAP_GEN(dev, vport_group_manager)) {
err = mlx5_core_get_caps(dev, MLX5_CAP_ESWITCH,
HCA_CAP_OPMOD_GET_CUR);
if (err)
return err;
err = mlx5_core_get_caps(dev, MLX5_CAP_ESWITCH,
HCA_CAP_OPMOD_GET_MAX);
if (err)
return err;
}
return 0;
}
......
......@@ -49,6 +49,9 @@
#include <linux/delay.h>
#include <linux/mlx5/mlx5_ifc.h>
#include "mlx5_core.h"
#ifdef CONFIG_MLX5_CORE_EN
#include "eswitch.h"
#endif
MODULE_AUTHOR("Eli Cohen <eli@mellanox.com>");
MODULE_DESCRIPTION("Mellanox Connect-IB, ConnectX-4 core driver");
......@@ -454,6 +457,9 @@ static int set_hca_ctrl(struct mlx5_core_dev *dev)
struct mlx5_reg_host_endianess he_out;
int err;
if (!mlx5_core_is_pf(dev))
return 0;
memset(&he_in, 0, sizeof(he_in));
he_in.he = MLX5_SET_HOST_ENDIANNESS;
err = mlx5_core_access_reg(dev, &he_in, sizeof(he_in),
......@@ -462,42 +468,39 @@ static int set_hca_ctrl(struct mlx5_core_dev *dev)
return err;
}
static int mlx5_core_enable_hca(struct mlx5_core_dev *dev)
int mlx5_core_enable_hca(struct mlx5_core_dev *dev, u16 func_id)
{
u32 out[MLX5_ST_SZ_DW(enable_hca_out)];
u32 in[MLX5_ST_SZ_DW(enable_hca_in)];
int err;
struct mlx5_enable_hca_mbox_in in;
struct mlx5_enable_hca_mbox_out out;
memset(&in, 0, sizeof(in));
memset(&out, 0, sizeof(out));
in.hdr.opcode = cpu_to_be16(MLX5_CMD_OP_ENABLE_HCA);
memset(in, 0, sizeof(in));
MLX5_SET(enable_hca_in, in, opcode, MLX5_CMD_OP_ENABLE_HCA);
MLX5_SET(enable_hca_in, in, function_id, func_id);
memset(out, 0, sizeof(out));
err = mlx5_cmd_exec(dev, &in, sizeof(in), &out, sizeof(out));
if (err)
return err;
if (out.hdr.status)
return mlx5_cmd_status_to_err(&out.hdr);
return 0;
return mlx5_cmd_status_to_err_v2(out);
}
static int mlx5_core_disable_hca(struct mlx5_core_dev *dev)
int mlx5_core_disable_hca(struct mlx5_core_dev *dev, u16 func_id)
{
u32 out[MLX5_ST_SZ_DW(disable_hca_out)];
u32 in[MLX5_ST_SZ_DW(disable_hca_in)];
int err;
struct mlx5_disable_hca_mbox_in in;
struct mlx5_disable_hca_mbox_out out;
memset(&in, 0, sizeof(in));
memset(&out, 0, sizeof(out));
in.hdr.opcode = cpu_to_be16(MLX5_CMD_OP_DISABLE_HCA);
err = mlx5_cmd_exec(dev, &in, sizeof(in), &out, sizeof(out));
memset(in, 0, sizeof(in));
MLX5_SET(disable_hca_in, in, opcode, MLX5_CMD_OP_DISABLE_HCA);
MLX5_SET(disable_hca_in, in, function_id, func_id);
memset(out, 0, sizeof(out));
err = mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out));
if (err)
return err;
if (out.hdr.status)
return mlx5_cmd_status_to_err(&out.hdr);
return 0;
return mlx5_cmd_status_to_err_v2(out);
}
static int mlx5_irq_set_affinity_hint(struct mlx5_core_dev *mdev, int i)
......@@ -942,7 +945,7 @@ static int mlx5_load_one(struct mlx5_core_dev *dev, struct mlx5_priv *priv)
mlx5_pagealloc_init(dev);
err = mlx5_core_enable_hca(dev);
err = mlx5_core_enable_hca(dev, 0);
if (err) {
dev_err(&pdev->dev, "enable hca failed\n");
goto err_pagealloc_cleanup;
......@@ -1052,6 +1055,20 @@ static int mlx5_load_one(struct mlx5_core_dev *dev, struct mlx5_priv *priv)
mlx5_init_srq_table(dev);
mlx5_init_mr_table(dev);
#ifdef CONFIG_MLX5_CORE_EN
err = mlx5_eswitch_init(dev);
if (err) {
dev_err(&pdev->dev, "eswitch init failed %d\n", err);
goto err_reg_dev;
}
#endif
err = mlx5_sriov_init(dev);
if (err) {
dev_err(&pdev->dev, "sriov init failed %d\n", err);
goto err_sriov;
}
err = mlx5_register_device(dev);
if (err) {
dev_err(&pdev->dev, "mlx5_register_device failed %d\n", err);
......@@ -1068,6 +1085,13 @@ static int mlx5_load_one(struct mlx5_core_dev *dev, struct mlx5_priv *priv)
return 0;
err_sriov:
if (mlx5_sriov_cleanup(dev))
dev_err(&dev->pdev->dev, "sriov cleanup failed\n");
#ifdef CONFIG_MLX5_CORE_EN
mlx5_eswitch_cleanup(dev->priv.eswitch);
#endif
err_reg_dev:
mlx5_cleanup_mr_table(dev);
mlx5_cleanup_srq_table(dev);
......@@ -1106,7 +1130,7 @@ static int mlx5_load_one(struct mlx5_core_dev *dev, struct mlx5_priv *priv)
mlx5_reclaim_startup_pages(dev);
err_disable_hca:
mlx5_core_disable_hca(dev);
mlx5_core_disable_hca(dev, 0);
err_pagealloc_cleanup:
mlx5_pagealloc_cleanup(dev);
......@@ -1123,6 +1147,13 @@ static int mlx5_unload_one(struct mlx5_core_dev *dev, struct mlx5_priv *priv)
{
int err = 0;
err = mlx5_sriov_cleanup(dev);
if (err) {
dev_warn(&dev->pdev->dev, "%s: sriov cleanup failed - abort\n",
__func__);
return err;
}
mutex_lock(&dev->intf_state_mutex);
if (dev->interface_state == MLX5_INTERFACE_STATE_DOWN) {
dev_warn(&dev->pdev->dev, "%s: interface is down, NOP\n",
......@@ -1130,6 +1161,10 @@ static int mlx5_unload_one(struct mlx5_core_dev *dev, struct mlx5_priv *priv)
goto out;
}
mlx5_unregister_device(dev);
#ifdef CONFIG_MLX5_CORE_EN
mlx5_eswitch_cleanup(dev->priv.eswitch);
#endif
mlx5_cleanup_mr_table(dev);
mlx5_cleanup_srq_table(dev);
mlx5_cleanup_qp_table(dev);
......@@ -1149,7 +1184,7 @@ static int mlx5_unload_one(struct mlx5_core_dev *dev, struct mlx5_priv *priv)
}
mlx5_pagealloc_stop(dev);
mlx5_reclaim_startup_pages(dev);
mlx5_core_disable_hca(dev);
mlx5_core_disable_hca(dev, 0);
mlx5_pagealloc_cleanup(dev);
mlx5_cmd_cleanup(dev);
......@@ -1195,6 +1230,7 @@ static int init_one(struct pci_dev *pdev,
return -ENOMEM;
}
priv = &dev->priv;
priv->pci_dev_data = id->driver_data;
pci_set_drvdata(pdev, dev);
......@@ -1366,11 +1402,11 @@ static const struct pci_error_handlers mlx5_err_handler = {
static const struct pci_device_id mlx5_core_pci_table[] = {
{ PCI_VDEVICE(MELLANOX, 0x1011) }, /* Connect-IB */
{ PCI_VDEVICE(MELLANOX, 0x1012) }, /* Connect-IB VF */
{ PCI_VDEVICE(MELLANOX, 0x1012), MLX5_PCI_DEV_IS_VF}, /* Connect-IB VF */
{ PCI_VDEVICE(MELLANOX, 0x1013) }, /* ConnectX-4 */
{ PCI_VDEVICE(MELLANOX, 0x1014) }, /* ConnectX-4 VF */
{ PCI_VDEVICE(MELLANOX, 0x1014), MLX5_PCI_DEV_IS_VF}, /* ConnectX-4 VF */
{ PCI_VDEVICE(MELLANOX, 0x1015) }, /* ConnectX-4LX */
{ PCI_VDEVICE(MELLANOX, 0x1016) }, /* ConnectX-4LX VF */
{ PCI_VDEVICE(MELLANOX, 0x1016), MLX5_PCI_DEV_IS_VF}, /* ConnectX-4LX VF */
{ 0, }
};
......@@ -1381,7 +1417,8 @@ static struct pci_driver mlx5_core_driver = {
.id_table = mlx5_core_pci_table,
.probe = init_one,
.remove = remove_one,
.err_handler = &mlx5_err_handler
.err_handler = &mlx5_err_handler,
.sriov_configure = mlx5_core_sriov_configure,
};
static int __init init(void)
......
......@@ -36,6 +36,7 @@
#include <linux/types.h>
#include <linux/kernel.h>
#include <linux/sched.h>
#include <linux/if_link.h>
#define DRIVER_NAME "mlx5_core"
#define DRIVER_VERSION "3.0-1"
......@@ -90,6 +91,10 @@ void mlx5_core_event(struct mlx5_core_dev *dev, enum mlx5_dev_event event,
unsigned long param);
void mlx5_enter_error_state(struct mlx5_core_dev *dev);
void mlx5_disable_device(struct mlx5_core_dev *dev);
int mlx5_core_sriov_configure(struct pci_dev *dev, int num_vfs);
int mlx5_core_enable_hca(struct mlx5_core_dev *dev, u16 func_id);
int mlx5_core_disable_hca(struct mlx5_core_dev *dev, u16 func_id);
int mlx5_wait_for_vf_pages(struct mlx5_core_dev *dev);
void mlx5e_init(void);
void mlx5e_cleanup(void);
......
......@@ -33,6 +33,7 @@
#include <linux/highmem.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/delay.h>
#include <linux/mlx5/driver.h>
#include <linux/mlx5/cmd.h>
#include "mlx5_core.h"
......@@ -95,6 +96,7 @@ struct mlx5_manage_pages_outbox {
enum {
MAX_RECLAIM_TIME_MSECS = 5000,
MAX_RECLAIM_VFS_PAGES_TIME_MSECS = 2 * 1000 * 60,
};
enum {
......@@ -352,6 +354,10 @@ static int give_pages(struct mlx5_core_dev *dev, u16 func_id, int npages,
goto out_4k;
}
dev->priv.fw_pages += npages;
if (func_id)
dev->priv.vfs_pages += npages;
mlx5_core_dbg(dev, "err %d\n", err);
kvfree(in);
......@@ -405,6 +411,12 @@ static int reclaim_pages(struct mlx5_core_dev *dev, u32 func_id, int npages,
}
num_claimed = be32_to_cpu(out->num_entries);
if (num_claimed > npages) {
mlx5_core_warn(dev, "fw returned %d, driver asked %d => corruption\n",
num_claimed, npages);
err = -EINVAL;
goto out_free;
}
if (nclaimed)
*nclaimed = num_claimed;
......@@ -412,6 +424,9 @@ static int reclaim_pages(struct mlx5_core_dev *dev, u32 func_id, int npages,
addr = be64_to_cpu(out->pas[i]);
free_4k(dev, addr);
}
dev->priv.fw_pages -= num_claimed;
if (func_id)
dev->priv.vfs_pages -= num_claimed;
out_free:
kvfree(out);
......@@ -548,3 +563,26 @@ void mlx5_pagealloc_stop(struct mlx5_core_dev *dev)
{
destroy_workqueue(dev->priv.pg_wq);
}
int mlx5_wait_for_vf_pages(struct mlx5_core_dev *dev)
{
unsigned long end = jiffies + msecs_to_jiffies(MAX_RECLAIM_VFS_PAGES_TIME_MSECS);
int prev_vfs_pages = dev->priv.vfs_pages;
mlx5_core_dbg(dev, "Waiting for %d pages from %s\n", prev_vfs_pages,
dev->priv.name);
while (dev->priv.vfs_pages) {
if (time_after(jiffies, end)) {
mlx5_core_warn(dev, "aborting while there are %d pending pages\n", dev->priv.vfs_pages);
return -ETIMEDOUT;
}
if (dev->priv.vfs_pages < prev_vfs_pages) {
end = jiffies + msecs_to_jiffies(MAX_RECLAIM_VFS_PAGES_TIME_MSECS);
prev_vfs_pages = dev->priv.vfs_pages;
}
msleep(50);
}
mlx5_core_dbg(dev, "All pages received from %s\n", dev->priv.name);
return 0;
}
/*
* Copyright (c) 2014, Mellanox Technologies inc. All rights reserved.
*
* This software is available to you under a choice of one of two
* licenses. You may choose to be licensed under the terms of the GNU
* General Public License (GPL) Version 2, available from the file
* COPYING in the main directory of this source tree, or the
* OpenIB.org BSD license below:
*
* Redistribution and use in source and binary forms, with or
* without modification, are permitted provided that the following
* conditions are met:
*
* - Redistributions of source code must retain the above
* copyright notice, this list of conditions and the following
* disclaimer.
*
* - Redistributions in binary form must reproduce the above
* copyright notice, this list of conditions and the following
* disclaimer in the documentation and/or other materials
* provided with the distribution.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
* BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
* ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
* CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
* SOFTWARE.
*/
#include <linux/pci.h>
#include <linux/mlx5/driver.h>
#include "mlx5_core.h"
#ifdef CONFIG_MLX5_CORE_EN
#include "eswitch.h"
#endif
static void enable_vfs(struct mlx5_core_dev *dev, int num_vfs)
{
struct mlx5_core_sriov *sriov = &dev->priv.sriov;
int err;
int vf;
for (vf = 1; vf <= num_vfs; vf++) {
err = mlx5_core_enable_hca(dev, vf);
if (err) {
mlx5_core_warn(dev, "failed to enable VF %d\n", vf - 1);
} else {
sriov->vfs_ctx[vf - 1].enabled = 1;
mlx5_core_dbg(dev, "successfully enabled VF %d\n", vf - 1);
}
}
}
static void disable_vfs(struct mlx5_core_dev *dev, int num_vfs)
{
struct mlx5_core_sriov *sriov = &dev->priv.sriov;
int vf;
for (vf = 1; vf <= num_vfs; vf++) {
if (sriov->vfs_ctx[vf - 1].enabled) {
if (mlx5_core_disable_hca(dev, vf))
mlx5_core_warn(dev, "failed to disable VF %d\n", vf - 1);
else
sriov->vfs_ctx[vf - 1].enabled = 0;
}
}
}
static int mlx5_core_create_vfs(struct pci_dev *pdev, int num_vfs)
{
struct mlx5_core_dev *dev = pci_get_drvdata(pdev);
int err;
if (pci_num_vf(pdev))
pci_disable_sriov(pdev);
enable_vfs(dev, num_vfs);
err = pci_enable_sriov(pdev, num_vfs);
if (err) {
dev_warn(&pdev->dev, "enable sriov failed %d\n", err);
goto ex;
}
return 0;
ex:
disable_vfs(dev, num_vfs);
return err;
}
static int mlx5_core_sriov_enable(struct pci_dev *pdev, int num_vfs)
{
struct mlx5_core_dev *dev = pci_get_drvdata(pdev);
struct mlx5_core_sriov *sriov = &dev->priv.sriov;
int err;
kfree(sriov->vfs_ctx);
sriov->vfs_ctx = kcalloc(num_vfs, sizeof(*sriov->vfs_ctx), GFP_ATOMIC);
if (!sriov->vfs_ctx)
return -ENOMEM;
sriov->enabled_vfs = num_vfs;
err = mlx5_core_create_vfs(pdev, num_vfs);
if (err) {
kfree(sriov->vfs_ctx);
sriov->vfs_ctx = NULL;
return err;
}
return 0;
}
static void mlx5_core_init_vfs(struct mlx5_core_dev *dev, int num_vfs)
{
struct mlx5_core_sriov *sriov = &dev->priv.sriov;
sriov->num_vfs = num_vfs;
}
static void mlx5_core_cleanup_vfs(struct mlx5_core_dev *dev)
{
struct mlx5_core_sriov *sriov;
sriov = &dev->priv.sriov;
disable_vfs(dev, sriov->num_vfs);
if (mlx5_wait_for_vf_pages(dev))
mlx5_core_warn(dev, "timeout claiming VFs pages\n");
sriov->num_vfs = 0;
}
int mlx5_core_sriov_configure(struct pci_dev *pdev, int num_vfs)
{
struct mlx5_core_dev *dev = pci_get_drvdata(pdev);
struct mlx5_core_sriov *sriov = &dev->priv.sriov;
int err;
mlx5_core_dbg(dev, "requsted num_vfs %d\n", num_vfs);
if (!mlx5_core_is_pf(dev))
return -EPERM;
mlx5_core_cleanup_vfs(dev);
if (!num_vfs) {
#ifdef CONFIG_MLX5_CORE_EN
mlx5_eswitch_disable_sriov(dev->priv.eswitch);
#endif
kfree(sriov->vfs_ctx);
sriov->vfs_ctx = NULL;
if (!pci_vfs_assigned(pdev))
pci_disable_sriov(pdev);
else
pr_info("unloading PF driver while leaving orphan VFs\n");
return 0;
}
err = mlx5_core_sriov_enable(pdev, num_vfs);
if (err) {
dev_warn(&pdev->dev, "mlx5_core_sriov_enable failed %d\n", err);
return err;
}
mlx5_core_init_vfs(dev, num_vfs);
#ifdef CONFIG_MLX5_CORE_EN
mlx5_eswitch_enable_sriov(dev->priv.eswitch, num_vfs);
#endif
return num_vfs;
}
static int sync_required(struct pci_dev *pdev)
{
struct mlx5_core_dev *dev = pci_get_drvdata(pdev);
struct mlx5_core_sriov *sriov = &dev->priv.sriov;
int cur_vfs = pci_num_vf(pdev);
if (cur_vfs != sriov->num_vfs) {
pr_info("current VFs %d, registered %d - sync needed\n", cur_vfs, sriov->num_vfs);
return 1;
}
return 0;
}
int mlx5_sriov_init(struct mlx5_core_dev *dev)
{
struct mlx5_core_sriov *sriov = &dev->priv.sriov;
struct pci_dev *pdev = dev->pdev;
int cur_vfs;
if (!mlx5_core_is_pf(dev))
return 0;
if (!sync_required(dev->pdev))
return 0;
cur_vfs = pci_num_vf(pdev);
sriov->vfs_ctx = kcalloc(cur_vfs, sizeof(*sriov->vfs_ctx), GFP_KERNEL);
if (!sriov->vfs_ctx)
return -ENOMEM;
sriov->enabled_vfs = cur_vfs;
mlx5_core_init_vfs(dev, cur_vfs);
#ifdef CONFIG_MLX5_CORE_EN
if (cur_vfs)
mlx5_eswitch_enable_sriov(dev->priv.eswitch, cur_vfs);
#endif
enable_vfs(dev, cur_vfs);
return 0;
}
int mlx5_sriov_cleanup(struct mlx5_core_dev *dev)
{
struct pci_dev *pdev = dev->pdev;
int err;
if (!mlx5_core_is_pf(dev))
return 0;
err = mlx5_core_sriov_configure(pdev, 0);
if (err)
return err;
return 0;
}
......@@ -251,6 +251,7 @@ enum mlx5_event {
MLX5_EVENT_TYPE_PAGE_REQUEST = 0xb,
MLX5_EVENT_TYPE_PAGE_FAULT = 0xc,
MLX5_EVENT_TYPE_NIC_VPORT_CHANGE = 0xd,
};
enum {
......@@ -520,6 +521,12 @@ struct mlx5_eqe_page_fault {
__be32 flags_qpn;
} __packed;
struct mlx5_eqe_vport_change {
u8 rsvd0[2];
__be16 vport_num;
__be32 rsvd1[6];
} __packed;
union ev_data {
__be32 raw[7];
struct mlx5_eqe_cmd cmd;
......@@ -532,6 +539,7 @@ union ev_data {
struct mlx5_eqe_stall_vl stall_vl;
struct mlx5_eqe_page_req req_pages;
struct mlx5_eqe_page_fault page_fault;
struct mlx5_eqe_vport_change vport_change;
} __packed;
struct mlx5_eqe {
......@@ -1066,6 +1074,12 @@ enum {
VPORT_STATE_UP = 0x1,
};
enum {
MLX5_ESW_VPORT_ADMIN_STATE_DOWN = 0x0,
MLX5_ESW_VPORT_ADMIN_STATE_UP = 0x1,
MLX5_ESW_VPORT_ADMIN_STATE_AUTO = 0x2,
};
enum {
MLX5_L3_PROT_TYPE_IPV4 = 0,
MLX5_L3_PROT_TYPE_IPV6 = 1,
......@@ -1102,6 +1116,12 @@ enum {
MLX5_FLOW_CONTEXT_DEST_TYPE_TIR = 2,
};
enum mlx5_list_type {
MLX5_NVPRT_LIST_TYPE_UC = 0x0,
MLX5_NVPRT_LIST_TYPE_MC = 0x1,
MLX5_NVPRT_LIST_TYPE_VLAN = 0x2,
};
enum {
MLX5_RQC_RQ_TYPE_MEMORY_RQ_INLINE = 0x0,
MLX5_RQC_RQ_TYPE_MEMORY_RQ_RPM = 0x1,
......@@ -1124,6 +1144,8 @@ enum mlx5_cap_type {
MLX5_CAP_IPOIB_OFFLOADS,
MLX5_CAP_EOIB_OFFLOADS,
MLX5_CAP_FLOW_TABLE,
MLX5_CAP_ESWITCH_FLOW_TABLE,
MLX5_CAP_ESWITCH,
/* NUM OF CAP Types */
MLX5_CAP_NUM
};
......@@ -1161,6 +1183,28 @@ enum mlx5_cap_type {
#define MLX5_CAP_FLOWTABLE_MAX(mdev, cap) \
MLX5_GET(flow_table_nic_cap, mdev->hca_caps_max[MLX5_CAP_FLOW_TABLE], cap)
#define MLX5_CAP_ESW_FLOWTABLE(mdev, cap) \
MLX5_GET(flow_table_eswitch_cap, \
mdev->hca_caps_cur[MLX5_CAP_ESWITCH_FLOW_TABLE], cap)
#define MLX5_CAP_ESW_FLOWTABLE_MAX(mdev, cap) \
MLX5_GET(flow_table_eswitch_cap, \
mdev->hca_caps_max[MLX5_CAP_ESWITCH_FLOW_TABLE], cap)
#define MLX5_CAP_ESW_FLOWTABLE_FDB(mdev, cap) \
MLX5_CAP_ESW_FLOWTABLE(mdev, flow_table_properties_nic_esw_fdb.cap)
#define MLX5_CAP_ESW_FLOWTABLE_FDB_MAX(mdev, cap) \
MLX5_CAP_ESW_FLOWTABLE_MAX(mdev, flow_table_properties_nic_esw_fdb.cap)
#define MLX5_CAP_ESW(mdev, cap) \
MLX5_GET(e_switch_cap, \
mdev->hca_caps_cur[MLX5_CAP_ESWITCH], cap)
#define MLX5_CAP_ESW_MAX(mdev, cap) \
MLX5_GET(e_switch_cap, \
mdev->hca_caps_max[MLX5_CAP_ESWITCH], cap)
#define MLX5_CAP_ODP(mdev, cap)\
MLX5_GET(odp_cap, mdev->hca_caps_cur[MLX5_CAP_ODP], cap)
......
......@@ -426,11 +426,23 @@ struct mlx5_mr_table {
struct radix_tree_root tree;
};
struct mlx5_vf_context {
int enabled;
};
struct mlx5_core_sriov {
struct mlx5_vf_context *vfs_ctx;
int num_vfs;
int enabled_vfs;
};
struct mlx5_irq_info {
cpumask_var_t mask;
char name[MLX5_MAX_IRQ_NAME];
};
struct mlx5_eswitch;
struct mlx5_priv {
char name[MLX5_MAX_NAME_LEN];
struct mlx5_eq_table eq_table;
......@@ -447,6 +459,7 @@ struct mlx5_priv {
int fw_pages;
atomic_t reg_pages;
struct list_head free_list;
int vfs_pages;
struct mlx5_core_health health;
......@@ -485,6 +498,10 @@ struct mlx5_priv {
struct list_head dev_list;
struct list_head ctx_list;
spinlock_t ctx_lock;
struct mlx5_eswitch *eswitch;
struct mlx5_core_sriov sriov;
unsigned long pci_dev_data;
};
enum mlx5_device_state {
......@@ -739,6 +756,8 @@ void mlx5_pagealloc_init(struct mlx5_core_dev *dev);
void mlx5_pagealloc_cleanup(struct mlx5_core_dev *dev);
int mlx5_pagealloc_start(struct mlx5_core_dev *dev);
void mlx5_pagealloc_stop(struct mlx5_core_dev *dev);
int mlx5_sriov_init(struct mlx5_core_dev *dev);
int mlx5_sriov_cleanup(struct mlx5_core_dev *dev);
void mlx5_core_req_pages_handler(struct mlx5_core_dev *dev, u16 func_id,
s32 npages);
int mlx5_satisfy_startup_pages(struct mlx5_core_dev *dev, int boot);
......@@ -884,6 +903,15 @@ struct mlx5_profile {
} mr_cache[MAX_MR_CACHE_ENTRIES];
};
enum {
MLX5_PCI_DEV_IS_VF = 1 << 0,
};
static inline int mlx5_core_is_pf(struct mlx5_core_dev *dev)
{
return !(dev->priv.pci_dev_data & MLX5_PCI_DEV_IS_VF);
}
static inline int mlx5_get_gid_table_len(u16 param)
{
if (param > 4) {
......
......@@ -41,6 +41,15 @@ struct mlx5_flow_table_group {
u32 match_criteria[MLX5_ST_SZ_DW(fte_match_param)];
};
struct mlx5_flow_destination {
enum mlx5_flow_destination_type type;
union {
u32 tir_num;
void *ft;
u32 vport_num;
};
};
void *mlx5_create_flow_table(struct mlx5_core_dev *dev, u8 level, u8 table_type,
u16 num_groups,
struct mlx5_flow_table_group *group);
......
......@@ -447,6 +447,29 @@ struct mlx5_ifc_flow_table_nic_cap_bits {
u8 reserved_3[0x7200];
};
struct mlx5_ifc_flow_table_eswitch_cap_bits {
u8 reserved_0[0x200];
struct mlx5_ifc_flow_table_prop_layout_bits flow_table_properties_nic_esw_fdb;
struct mlx5_ifc_flow_table_prop_layout_bits flow_table_properties_esw_acl_ingress;
struct mlx5_ifc_flow_table_prop_layout_bits flow_table_properties_esw_acl_egress;
u8 reserved_1[0x7800];
};
struct mlx5_ifc_e_switch_cap_bits {
u8 vport_svlan_strip[0x1];
u8 vport_cvlan_strip[0x1];
u8 vport_svlan_insert[0x1];
u8 vport_cvlan_insert_if_not_exist[0x1];
u8 vport_cvlan_insert_overwrite[0x1];
u8 reserved_0[0x1b];
u8 reserved_1[0x7e0];
};
struct mlx5_ifc_per_protocol_networking_offload_caps_bits {
u8 csum_cap[0x1];
u8 vlan_cap[0x1];
......@@ -665,7 +688,9 @@ struct mlx5_ifc_cmd_hca_cap_bits {
u8 reserved_17[0x1];
u8 ets[0x1];
u8 nic_flow_table[0x1];
u8 reserved_18[0x4];
u8 eswitch_flow_table[0x1];
u8 early_vf_enable;
u8 reserved_18[0x2];
u8 local_ca_ack_delay[0x5];
u8 reserved_19[0x6];
u8 port_type[0x2];
......@@ -787,27 +812,36 @@ struct mlx5_ifc_cmd_hca_cap_bits {
u8 reserved_60[0x1b];
u8 log_max_wq_sz[0x5];
u8 reserved_61[0xa0];
u8 nic_vport_change_event[0x1];
u8 reserved_61[0xa];
u8 log_max_vlan_list[0x5];
u8 reserved_62[0x3];
u8 log_max_current_mc_list[0x5];
u8 reserved_63[0x3];
u8 log_max_current_uc_list[0x5];
u8 reserved_64[0x80];
u8 reserved_65[0x3];
u8 log_max_l2_table[0x5];
u8 reserved_63[0x8];
u8 reserved_66[0x8];
u8 log_uar_page_sz[0x10];
u8 reserved_64[0x100];
u8 reserved_67[0xe0];
u8 reserved_65[0x1f];
u8 reserved_68[0x1f];
u8 cqe_zip[0x1];
u8 cqe_zip_timeout[0x10];
u8 cqe_zip_max_num[0x10];
u8 reserved_66[0x220];
u8 reserved_69[0x220];
};
enum {
MLX5_DEST_FORMAT_STRUCT_DESTINATION_TYPE_FLOW_TABLE_ = 0x1,
MLX5_DEST_FORMAT_STRUCT_DESTINATION_TYPE_TIR = 0x2,
enum mlx5_flow_destination_type {
MLX5_FLOW_DESTINATION_TYPE_VPORT = 0x0,
MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE = 0x1,
MLX5_FLOW_DESTINATION_TYPE_TIR = 0x2,
};
struct mlx5_ifc_dest_format_struct_bits {
......@@ -900,6 +934,13 @@ struct mlx5_ifc_mac_address_layout_bits {
u8 mac_addr_31_0[0x20];
};
struct mlx5_ifc_vlan_layout_bits {
u8 reserved_0[0x14];
u8 vlan[0x0c];
u8 reserved_1[0x20];
};
struct mlx5_ifc_cong_control_r_roce_ecn_np_bits {
u8 reserved_0[0xa0];
......@@ -1829,6 +1870,8 @@ union mlx5_ifc_hca_cap_union_bits {
struct mlx5_ifc_roce_cap_bits roce_cap;
struct mlx5_ifc_per_protocol_networking_offload_caps_bits per_protocol_networking_offload_caps;
struct mlx5_ifc_flow_table_nic_cap_bits flow_table_nic_cap;
struct mlx5_ifc_flow_table_eswitch_cap_bits flow_table_eswitch_cap;
struct mlx5_ifc_e_switch_cap_bits e_switch_cap;
u8 reserved_0[0x8000];
};
......@@ -2133,24 +2176,35 @@ struct mlx5_ifc_rmpc_bits {
struct mlx5_ifc_wq_bits wq;
};
enum {
MLX5_NIC_VPORT_CONTEXT_ALLOWED_LIST_TYPE_CURRENT_UC_MAC_ADDRESS = 0x0,
};
struct mlx5_ifc_nic_vport_context_bits {
u8 reserved_0[0x1f];
u8 roce_en[0x1];
u8 reserved_1[0x760];
u8 arm_change_event[0x1];
u8 reserved_1[0x1a];
u8 event_on_mtu[0x1];
u8 event_on_promisc_change[0x1];
u8 event_on_vlan_change[0x1];
u8 event_on_mc_address_change[0x1];
u8 event_on_uc_address_change[0x1];
u8 reserved_2[0x5];
u8 reserved_2[0xf0];
u8 mtu[0x10];
u8 reserved_3[0x640];
u8 promisc_uc[0x1];
u8 promisc_mc[0x1];
u8 promisc_all[0x1];
u8 reserved_4[0x2];
u8 allowed_list_type[0x3];
u8 reserved_3[0xc];
u8 reserved_5[0xc];
u8 allowed_list_size[0xc];
struct mlx5_ifc_mac_address_layout_bits permanent_address;
u8 reserved_4[0x20];
u8 reserved_6[0x20];
u8 current_uc_mac_address[0][0x40];
};
......@@ -2263,6 +2317,26 @@ struct mlx5_ifc_hca_vport_context_bits {
u8 reserved_6[0xca0];
};
struct mlx5_ifc_esw_vport_context_bits {
u8 reserved_0[0x3];
u8 vport_svlan_strip[0x1];
u8 vport_cvlan_strip[0x1];
u8 vport_svlan_insert[0x1];
u8 vport_cvlan_insert[0x2];
u8 reserved_1[0x18];
u8 reserved_2[0x20];
u8 svlan_cfi[0x1];
u8 svlan_pcp[0x3];
u8 svlan_id[0xc];
u8 cvlan_cfi[0x1];
u8 cvlan_pcp[0x3];
u8 cvlan_id[0xc];
u8 reserved_3[0x7a0];
};
enum {
MLX5_EQC_STATUS_OK = 0x0,
MLX5_EQC_STATUS_EQ_WRITE_FAILURE = 0xa,
......@@ -2940,6 +3014,7 @@ struct mlx5_ifc_query_vport_state_out_bits {
enum {
MLX5_QUERY_VPORT_STATE_IN_OP_MOD_VNIC_VPORT = 0x0,
MLX5_QUERY_VPORT_STATE_IN_OP_MOD_ESW_VPORT = 0x1,
};
struct mlx5_ifc_query_vport_state_in_bits {
......@@ -3700,6 +3775,64 @@ struct mlx5_ifc_query_flow_group_in_bits {
u8 reserved_5[0x120];
};
struct mlx5_ifc_query_esw_vport_context_out_bits {
u8 status[0x8];
u8 reserved_0[0x18];
u8 syndrome[0x20];
u8 reserved_1[0x40];
struct mlx5_ifc_esw_vport_context_bits esw_vport_context;
};
struct mlx5_ifc_query_esw_vport_context_in_bits {
u8 opcode[0x10];
u8 reserved_0[0x10];
u8 reserved_1[0x10];
u8 op_mod[0x10];
u8 other_vport[0x1];
u8 reserved_2[0xf];
u8 vport_number[0x10];
u8 reserved_3[0x20];
};
struct mlx5_ifc_modify_esw_vport_context_out_bits {
u8 status[0x8];
u8 reserved_0[0x18];
u8 syndrome[0x20];
u8 reserved_1[0x40];
};
struct mlx5_ifc_esw_vport_context_fields_select_bits {
u8 reserved[0x1c];
u8 vport_cvlan_insert[0x1];
u8 vport_svlan_insert[0x1];
u8 vport_cvlan_strip[0x1];
u8 vport_svlan_strip[0x1];
};
struct mlx5_ifc_modify_esw_vport_context_in_bits {
u8 opcode[0x10];
u8 reserved_0[0x10];
u8 reserved_1[0x10];
u8 op_mod[0x10];
u8 other_vport[0x1];
u8 reserved_2[0xf];
u8 vport_number[0x10];
struct mlx5_ifc_esw_vport_context_fields_select_bits field_select;
struct mlx5_ifc_esw_vport_context_bits esw_vport_context;
};
struct mlx5_ifc_query_eq_out_bits {
u8 status[0x8];
u8 reserved_0[0x18];
......@@ -4228,7 +4361,10 @@ struct mlx5_ifc_modify_nic_vport_context_out_bits {
};
struct mlx5_ifc_modify_nic_vport_field_select_bits {
u8 reserved_0[0x1c];
u8 reserved_0[0x19];
u8 mtu[0x1];
u8 change_event[0x1];
u8 promisc[0x1];
u8 permanent_address[0x1];
u8 addresses_list[0x1];
u8 roce_en[0x1];
......
......@@ -34,9 +34,17 @@
#define __MLX5_VPORT_H__
#include <linux/mlx5/driver.h>
#include <linux/mlx5/device.h>
u8 mlx5_query_vport_state(struct mlx5_core_dev *mdev, u8 opmod);
void mlx5_query_nic_vport_mac_address(struct mlx5_core_dev *mdev, u8 *addr);
u8 mlx5_query_vport_state(struct mlx5_core_dev *mdev, u8 opmod, u16 vport);
u8 mlx5_query_vport_admin_state(struct mlx5_core_dev *mdev, u8 opmod,
u16 vport);
int mlx5_modify_vport_admin_state(struct mlx5_core_dev *mdev, u8 opmod,
u16 vport, u8 state);
int mlx5_query_nic_vport_mac_address(struct mlx5_core_dev *mdev,
u16 vport, u8 *addr);
int mlx5_modify_nic_vport_mac_address(struct mlx5_core_dev *dev,
u16 vport, u8 *addr);
int mlx5_query_hca_vport_gid(struct mlx5_core_dev *dev, u8 other_vport,
u8 port_num, u16 vf_num, u16 gid_index,
union ib_gid *gid);
......@@ -51,5 +59,30 @@ int mlx5_query_hca_vport_system_image_guid(struct mlx5_core_dev *dev,
u64 *sys_image_guid);
int mlx5_query_hca_vport_node_guid(struct mlx5_core_dev *dev,
u64 *node_guid);
int mlx5_query_nic_vport_mac_list(struct mlx5_core_dev *dev,
u32 vport,
enum mlx5_list_type list_type,
u8 addr_list[][ETH_ALEN],
int *list_size);
int mlx5_modify_nic_vport_mac_list(struct mlx5_core_dev *dev,
enum mlx5_list_type list_type,
u8 addr_list[][ETH_ALEN],
int list_size);
int mlx5_query_nic_vport_promisc(struct mlx5_core_dev *mdev,
u32 vport,
int *promisc_uc,
int *promisc_mc,
int *promisc_all);
int mlx5_modify_nic_vport_promisc(struct mlx5_core_dev *mdev,
int promisc_uc,
int promisc_mc,
int promisc_all);
int mlx5_query_nic_vport_vlans(struct mlx5_core_dev *dev,
u32 vport,
u16 vlans[],
int *size);
int mlx5_modify_nic_vport_vlans(struct mlx5_core_dev *dev,
u16 vlans[],
int list_size);
#endif /* __MLX5_VPORT_H__ */
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment