Commit 91bca7f7 authored by Jakub Kicinski's avatar Jakub Kicinski

Merge branch 'devlink-add-reload-action-and-limit-options'

Moshe Shemesh says:

====================
Add devlink reload action and limit options

Introduce new options on devlink reload API to enable the user to select
the reload action required and constrains limits on these actions that he
may want to ensure. Complete support for reload actions in mlx5.
The following reload actions are supported:
  driver_reinit: driver entities re-initialization, applying devlink-param
                 and devlink-resource values.
  fw_activate: firmware activate.

The uAPI is backward compatible, if the reload action option is omitted
from the reload command, the driver reinit action will be used.
Note that when required to do firmware activation some drivers may need
to reload the driver. On the other hand some drivers may need to reset
the firmware to reinitialize the driver entities. Therefore, the devlink
reload command returns the actions which were actually performed.

By default reload actions are not limited and driver implementation may
include reset or downtime as needed to perform the actions.
However, if reload limit is selected, the driver should perform only if
it can do it while keeping the limit constraints.
Reload limit added:
  no_reset: No reset allowed, no down time allowed, no link flap and no
            configuration is lost.

Each driver which supports devlink reload command should expose the
reload actions and limits supported.

Add reload stats to hold the history per reload action per limit.
For example, the number of times fw_activate has been done on this
device since the driver module was added or if the firmware activation
was done with or without reset.

Patch 1 changes devlink_reload_supported() param type to enable using
        it before allocating devlink.
Patch 2-3 add the new API reload action and reload limit options to
          devlink reload.
Patch 4-5 add reload stats and remote reload stats. These stats are
          exposed through devlink dev get.
Patches 6-11 add support on mlx5 for devlink reload action fw_activate
            and handle the firmware reset events.
Patches 12-13 add devlink enable remote dev reset parameter and use it
             in mlx5.
Patches 14-15 mlx5 add devlink reload limit no_reset support for
              fw_activate reload action.
Patch 16 adds documentation file devlink-reload.rst
====================
Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
parents 846e463a eb79d754
......@@ -108,3 +108,9 @@ own name.
* - ``region_snapshot_enable``
- Boolean
- Enable capture of ``devlink-region`` snapshots.
* - ``enable_remote_dev_reset``
- Boolean
- Enable device reset by remote host. When cleared, the device driver
will NACK any attempt of other host to reset the device. This parameter
is useful for setups where a device is shared by different hosts, such
as multi-host setup.
.. SPDX-License-Identifier: GPL-2.0
==============
Devlink Reload
==============
``devlink-reload`` provides mechanism to reinit driver entities, applying
``devlink-params`` and ``devlink-resources`` new values. It also provides
mechanism to activate firmware.
Reload Actions
==============
User may select a reload action.
By default ``driver_reinit`` action is selected.
.. list-table:: Possible reload actions
:widths: 5 90
* - Name
- Description
* - ``driver-reinit``
- Devlink driver entities re-initialization, including applying
new values to devlink entities which are used during driver
load such as ``devlink-params`` in configuration mode
``driverinit`` or ``devlink-resources``
* - ``fw_activate``
- Firmware activate. Activates new firmware if such image is stored and
pending activation. If no limitation specified this action may involve
firmware reset. If no new image pending this action will reload current
firmware image.
Note that even though user asks for a specific action, the driver
implementation might require to perform another action alongside with
it. For example, some driver do not support driver reinitialization
being performed without fw activation. Therefore, the devlink reload
command returns the list of actions which were actrually performed.
Reload Limits
=============
By default reload actions are not limited and driver implementation may
include reset or downtime as needed to perform the actions.
However, some drivers support action limits, which limit the action
implementation to specific constraints.
.. list-table:: Possible reload limits
:widths: 5 90
* - Name
- Description
* - ``no_reset``
- No reset allowed, no down time allowed, no link flap and no
configuration is lost.
Change Namespace
================
The netns option allows user to be able to move devlink instances into
namespaces during devlink reload operation.
By default all devlink instances are created in init_net and stay there.
example usage
-------------
.. code:: shell
$ devlink dev reload help
$ devlink dev reload DEV [ netns { PID | NAME | ID } ] [ action { driver_reinit | fw_activate } ] [ limit no_reset ]
# Run reload command for devlink driver entities re-initialization:
$ devlink dev reload pci/0000:82:00.0 action driver_reinit
reload_actions_performed:
driver_reinit
# Run reload command to activate firmware:
# Note that mlx5 driver reloads the driver while activating firmware
$ devlink dev reload pci/0000:82:00.0 action fw_activate
reload_actions_performed:
driver_reinit fw_activate
......@@ -20,6 +20,7 @@ general.
devlink-params
devlink-region
devlink-resource
devlink-reload
devlink-trap
Driver-specific documentation
......
......@@ -3946,6 +3946,8 @@ static int mlx4_restart_one_up(struct pci_dev *pdev, bool reload,
struct devlink *devlink);
static int mlx4_devlink_reload_down(struct devlink *devlink, bool netns_change,
enum devlink_reload_action action,
enum devlink_reload_limit limit,
struct netlink_ext_ack *extack)
{
struct mlx4_priv *priv = devlink_priv(devlink);
......@@ -3962,7 +3964,8 @@ static int mlx4_devlink_reload_down(struct devlink *devlink, bool netns_change,
return 0;
}
static int mlx4_devlink_reload_up(struct devlink *devlink,
static int mlx4_devlink_reload_up(struct devlink *devlink, enum devlink_reload_action action,
enum devlink_reload_limit limit, u32 *actions_performed,
struct netlink_ext_ack *extack)
{
struct mlx4_priv *priv = devlink_priv(devlink);
......@@ -3970,6 +3973,7 @@ static int mlx4_devlink_reload_up(struct devlink *devlink,
struct mlx4_dev_persistent *persist = dev->persist;
int err;
*actions_performed = BIT(DEVLINK_RELOAD_ACTION_DRIVER_REINIT);
err = mlx4_restart_one_up(persist->pdev, true, devlink);
if (err)
mlx4_err(persist->dev, "mlx4_restart_one_up failed, ret=%d\n",
......@@ -3980,6 +3984,7 @@ static int mlx4_devlink_reload_up(struct devlink *devlink,
static const struct devlink_ops mlx4_devlink_ops = {
.port_type_set = mlx4_devlink_port_type_set,
.reload_actions = BIT(DEVLINK_RELOAD_ACTION_DRIVER_REINIT),
.reload_down = mlx4_devlink_reload_down,
.reload_up = mlx4_devlink_reload_up,
};
......
......@@ -16,7 +16,7 @@ mlx5_core-y := main.o cmd.o debugfs.o fw.o eq.o uar.o pagealloc.o \
transobj.o vport.o sriov.o fs_cmd.o fs_core.o pci_irq.o \
fs_counters.o rl.o lag.o dev.o events.o wq.o lib/gid.o \
lib/devcom.o lib/pci_vsc.o lib/dm.o diag/fs_tracepoint.o \
diag/fw_tracer.o diag/crdump.o devlink.o diag/rsc_dump.o
diag/fw_tracer.o diag/crdump.o devlink.o diag/rsc_dump.o fw_reset.o
#
# Netdev basic
......
......@@ -4,6 +4,7 @@
#include <devlink.h>
#include "mlx5_core.h"
#include "fw_reset.h"
#include "fs_core.h"
#include "eswitch.h"
......@@ -84,21 +85,96 @@ mlx5_devlink_info_get(struct devlink *devlink, struct devlink_info_req *req,
return 0;
}
static int mlx5_devlink_reload_fw_activate(struct devlink *devlink, struct netlink_ext_ack *extack)
{
struct mlx5_core_dev *dev = devlink_priv(devlink);
u8 reset_level, reset_type, net_port_alive;
int err;
err = mlx5_fw_reset_query(dev, &reset_level, &reset_type);
if (err)
return err;
if (!(reset_level & MLX5_MFRL_REG_RESET_LEVEL3)) {
NL_SET_ERR_MSG_MOD(extack, "FW activate requires reboot");
return -EINVAL;
}
net_port_alive = !!(reset_type & MLX5_MFRL_REG_RESET_TYPE_NET_PORT_ALIVE);
err = mlx5_fw_reset_set_reset_sync(dev, net_port_alive);
if (err)
goto out;
err = mlx5_fw_reset_wait_reset_done(dev);
out:
if (err)
NL_SET_ERR_MSG_MOD(extack, "FW activate command failed");
return err;
}
static int mlx5_devlink_trigger_fw_live_patch(struct devlink *devlink,
struct netlink_ext_ack *extack)
{
struct mlx5_core_dev *dev = devlink_priv(devlink);
u8 reset_level;
int err;
err = mlx5_fw_reset_query(dev, &reset_level, NULL);
if (err)
return err;
if (!(reset_level & MLX5_MFRL_REG_RESET_LEVEL0)) {
NL_SET_ERR_MSG_MOD(extack,
"FW upgrade to the stored FW can't be done by FW live patching");
return -EINVAL;
}
return mlx5_fw_reset_set_live_patch(dev);
}
static int mlx5_devlink_reload_down(struct devlink *devlink, bool netns_change,
enum devlink_reload_action action,
enum devlink_reload_limit limit,
struct netlink_ext_ack *extack)
{
struct mlx5_core_dev *dev = devlink_priv(devlink);
switch (action) {
case DEVLINK_RELOAD_ACTION_DRIVER_REINIT:
mlx5_unload_one(dev, false);
return 0;
case DEVLINK_RELOAD_ACTION_FW_ACTIVATE:
if (limit == DEVLINK_RELOAD_LIMIT_NO_RESET)
return mlx5_devlink_trigger_fw_live_patch(devlink, extack);
return mlx5_devlink_reload_fw_activate(devlink, extack);
default:
/* Unsupported action should not get to this function */
WARN_ON(1);
return -EOPNOTSUPP;
}
}
static int mlx5_devlink_reload_up(struct devlink *devlink,
static int mlx5_devlink_reload_up(struct devlink *devlink, enum devlink_reload_action action,
enum devlink_reload_limit limit, u32 *actions_performed,
struct netlink_ext_ack *extack)
{
struct mlx5_core_dev *dev = devlink_priv(devlink);
*actions_performed = BIT(action);
switch (action) {
case DEVLINK_RELOAD_ACTION_DRIVER_REINIT:
return mlx5_load_one(dev, false);
case DEVLINK_RELOAD_ACTION_FW_ACTIVATE:
if (limit == DEVLINK_RELOAD_LIMIT_NO_RESET)
break;
/* On fw_activate action, also driver is reloaded and reinit performed */
*actions_performed |= BIT(DEVLINK_RELOAD_ACTION_DRIVER_REINIT);
return mlx5_load_one(dev, false);
default:
/* Unsupported action should not get to this function */
WARN_ON(1);
return -EOPNOTSUPP;
}
return 0;
}
static const struct devlink_ops mlx5_devlink_ops = {
......@@ -114,6 +190,9 @@ static const struct devlink_ops mlx5_devlink_ops = {
#endif
.flash_update = mlx5_devlink_flash_update,
.info_get = mlx5_devlink_info_get,
.reload_actions = BIT(DEVLINK_RELOAD_ACTION_DRIVER_REINIT) |
BIT(DEVLINK_RELOAD_ACTION_FW_ACTIVATE),
.reload_limits = BIT(DEVLINK_RELOAD_LIMIT_NO_RESET),
.reload_down = mlx5_devlink_reload_down,
.reload_up = mlx5_devlink_reload_up,
};
......@@ -224,6 +303,24 @@ static int mlx5_devlink_large_group_num_validate(struct devlink *devlink, u32 id
}
#endif
static int mlx5_devlink_enable_remote_dev_reset_set(struct devlink *devlink, u32 id,
struct devlink_param_gset_ctx *ctx)
{
struct mlx5_core_dev *dev = devlink_priv(devlink);
mlx5_fw_reset_enable_remote_dev_reset_set(dev, ctx->val.vbool);
return 0;
}
static int mlx5_devlink_enable_remote_dev_reset_get(struct devlink *devlink, u32 id,
struct devlink_param_gset_ctx *ctx)
{
struct mlx5_core_dev *dev = devlink_priv(devlink);
ctx->val.vbool = mlx5_fw_reset_enable_remote_dev_reset_get(dev);
return 0;
}
static const struct devlink_param mlx5_devlink_params[] = {
DEVLINK_PARAM_DRIVER(MLX5_DEVLINK_PARAM_ID_FLOW_STEERING_MODE,
"flow_steering_mode", DEVLINK_PARAM_TYPE_STRING,
......@@ -239,6 +336,9 @@ static const struct devlink_param mlx5_devlink_params[] = {
NULL, NULL,
mlx5_devlink_large_group_num_validate),
#endif
DEVLINK_PARAM_GENERIC(ENABLE_REMOTE_DEV_RESET, BIT(DEVLINK_PARAM_CMODE_RUNTIME),
mlx5_devlink_enable_remote_dev_reset_get,
mlx5_devlink_enable_remote_dev_reset_set, NULL),
};
static void mlx5_devlink_set_params_init_values(struct devlink *devlink)
......
......@@ -1066,6 +1066,58 @@ void mlx5_fw_tracer_destroy(struct mlx5_fw_tracer *tracer)
kvfree(tracer);
}
static int mlx5_fw_tracer_recreate_strings_db(struct mlx5_fw_tracer *tracer)
{
struct mlx5_core_dev *dev;
int err;
cancel_work_sync(&tracer->read_fw_strings_work);
mlx5_fw_tracer_clean_ready_list(tracer);
mlx5_fw_tracer_clean_print_hash(tracer);
mlx5_fw_tracer_clean_saved_traces_array(tracer);
mlx5_fw_tracer_free_strings_db(tracer);
dev = tracer->dev;
err = mlx5_query_mtrc_caps(tracer);
if (err) {
mlx5_core_dbg(dev, "FWTracer: Failed to query capabilities %d\n", err);
return err;
}
err = mlx5_fw_tracer_allocate_strings_db(tracer);
if (err) {
mlx5_core_warn(dev, "FWTracer: Allocate strings DB failed %d\n", err);
return err;
}
mlx5_fw_tracer_init_saved_traces_array(tracer);
return 0;
}
int mlx5_fw_tracer_reload(struct mlx5_fw_tracer *tracer)
{
struct mlx5_core_dev *dev;
int err;
if (IS_ERR_OR_NULL(tracer))
return -EINVAL;
dev = tracer->dev;
mlx5_fw_tracer_cleanup(tracer);
err = mlx5_fw_tracer_recreate_strings_db(tracer);
if (err) {
mlx5_core_warn(dev, "Failed to recreate FW tracer strings DB\n");
return err;
}
err = mlx5_fw_tracer_init(tracer);
if (err) {
mlx5_core_warn(dev, "Failed to re-initialize FW tracer\n");
return err;
}
return 0;
}
static int fw_tracer_event(struct notifier_block *nb, unsigned long action, void *data)
{
struct mlx5_fw_tracer *tracer = mlx5_nb_cof(nb, struct mlx5_fw_tracer, nb);
......
......@@ -191,5 +191,6 @@ void mlx5_fw_tracer_destroy(struct mlx5_fw_tracer *tracer);
int mlx5_fw_tracer_trigger_core_dump_general(struct mlx5_core_dev *dev);
int mlx5_fw_tracer_get_saved_traces_objects(struct mlx5_fw_tracer *tracer,
struct devlink_fmsg *fmsg);
int mlx5_fw_tracer_reload(struct mlx5_fw_tracer *tracer);
#endif
// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
/* Copyright (c) 2020, Mellanox Technologies inc. All rights reserved. */
#include "fw_reset.h"
#include "diag/fw_tracer.h"
enum {
MLX5_FW_RESET_FLAGS_RESET_REQUESTED,
MLX5_FW_RESET_FLAGS_NACK_RESET_REQUEST,
MLX5_FW_RESET_FLAGS_PENDING_COMP
};
struct mlx5_fw_reset {
struct mlx5_core_dev *dev;
struct mlx5_nb nb;
struct workqueue_struct *wq;
struct work_struct fw_live_patch_work;
struct work_struct reset_request_work;
struct work_struct reset_reload_work;
struct work_struct reset_now_work;
struct work_struct reset_abort_work;
unsigned long reset_flags;
struct timer_list timer;
struct completion done;
int ret;
};
void mlx5_fw_reset_enable_remote_dev_reset_set(struct mlx5_core_dev *dev, bool enable)
{
struct mlx5_fw_reset *fw_reset = dev->priv.fw_reset;
if (enable)
clear_bit(MLX5_FW_RESET_FLAGS_NACK_RESET_REQUEST, &fw_reset->reset_flags);
else
set_bit(MLX5_FW_RESET_FLAGS_NACK_RESET_REQUEST, &fw_reset->reset_flags);
}
bool mlx5_fw_reset_enable_remote_dev_reset_get(struct mlx5_core_dev *dev)
{
struct mlx5_fw_reset *fw_reset = dev->priv.fw_reset;
return !test_bit(MLX5_FW_RESET_FLAGS_NACK_RESET_REQUEST, &fw_reset->reset_flags);
}
static int mlx5_reg_mfrl_set(struct mlx5_core_dev *dev, u8 reset_level,
u8 reset_type_sel, u8 sync_resp, bool sync_start)
{
u32 out[MLX5_ST_SZ_DW(mfrl_reg)] = {};
u32 in[MLX5_ST_SZ_DW(mfrl_reg)] = {};
MLX5_SET(mfrl_reg, in, reset_level, reset_level);
MLX5_SET(mfrl_reg, in, rst_type_sel, reset_type_sel);
MLX5_SET(mfrl_reg, in, pci_sync_for_fw_update_resp, sync_resp);
MLX5_SET(mfrl_reg, in, pci_sync_for_fw_update_start, sync_start);
return mlx5_core_access_reg(dev, in, sizeof(in), out, sizeof(out), MLX5_REG_MFRL, 0, 1);
}
static int mlx5_reg_mfrl_query(struct mlx5_core_dev *dev, u8 *reset_level, u8 *reset_type)
{
u32 out[MLX5_ST_SZ_DW(mfrl_reg)] = {};
u32 in[MLX5_ST_SZ_DW(mfrl_reg)] = {};
int err;
err = mlx5_core_access_reg(dev, in, sizeof(in), out, sizeof(out), MLX5_REG_MFRL, 0, 0);
if (err)
return err;
if (reset_level)
*reset_level = MLX5_GET(mfrl_reg, out, reset_level);
if (reset_type)
*reset_type = MLX5_GET(mfrl_reg, out, reset_type);
return 0;
}
int mlx5_fw_reset_query(struct mlx5_core_dev *dev, u8 *reset_level, u8 *reset_type)
{
return mlx5_reg_mfrl_query(dev, reset_level, reset_type);
}
int mlx5_fw_reset_set_reset_sync(struct mlx5_core_dev *dev, u8 reset_type_sel)
{
struct mlx5_fw_reset *fw_reset = dev->priv.fw_reset;
int err;
set_bit(MLX5_FW_RESET_FLAGS_PENDING_COMP, &fw_reset->reset_flags);
err = mlx5_reg_mfrl_set(dev, MLX5_MFRL_REG_RESET_LEVEL3, reset_type_sel, 0, true);
if (err)
clear_bit(MLX5_FW_RESET_FLAGS_PENDING_COMP, &fw_reset->reset_flags);
return err;
}
int mlx5_fw_reset_set_live_patch(struct mlx5_core_dev *dev)
{
return mlx5_reg_mfrl_set(dev, MLX5_MFRL_REG_RESET_LEVEL0, 0, 0, false);
}
static void mlx5_fw_reset_complete_reload(struct mlx5_core_dev *dev)
{
struct mlx5_fw_reset *fw_reset = dev->priv.fw_reset;
/* if this is the driver that initiated the fw reset, devlink completed the reload */
if (test_bit(MLX5_FW_RESET_FLAGS_PENDING_COMP, &fw_reset->reset_flags)) {
complete(&fw_reset->done);
} else {
mlx5_load_one(dev, false);
devlink_remote_reload_actions_performed(priv_to_devlink(dev), 0,
BIT(DEVLINK_RELOAD_ACTION_DRIVER_REINIT) |
BIT(DEVLINK_RELOAD_ACTION_FW_ACTIVATE));
}
}
static void mlx5_sync_reset_reload_work(struct work_struct *work)
{
struct mlx5_fw_reset *fw_reset = container_of(work, struct mlx5_fw_reset,
reset_reload_work);
struct mlx5_core_dev *dev = fw_reset->dev;
int err;
mlx5_enter_error_state(dev, true);
mlx5_unload_one(dev, false);
err = mlx5_health_wait_pci_up(dev);
if (err)
mlx5_core_err(dev, "reset reload flow aborted, PCI reads still not working\n");
fw_reset->ret = err;
mlx5_fw_reset_complete_reload(dev);
}
static void mlx5_stop_sync_reset_poll(struct mlx5_core_dev *dev)
{
struct mlx5_fw_reset *fw_reset = dev->priv.fw_reset;
del_timer(&fw_reset->timer);
}
static void mlx5_sync_reset_clear_reset_requested(struct mlx5_core_dev *dev, bool poll_health)
{
struct mlx5_fw_reset *fw_reset = dev->priv.fw_reset;
mlx5_stop_sync_reset_poll(dev);
clear_bit(MLX5_FW_RESET_FLAGS_RESET_REQUESTED, &fw_reset->reset_flags);
if (poll_health)
mlx5_start_health_poll(dev);
}
#define MLX5_RESET_POLL_INTERVAL (HZ / 10)
static void poll_sync_reset(struct timer_list *t)
{
struct mlx5_fw_reset *fw_reset = from_timer(fw_reset, t, timer);
struct mlx5_core_dev *dev = fw_reset->dev;
u32 fatal_error;
if (!test_bit(MLX5_FW_RESET_FLAGS_RESET_REQUESTED, &fw_reset->reset_flags))
return;
fatal_error = mlx5_health_check_fatal_sensors(dev);
if (fatal_error) {
mlx5_core_warn(dev, "Got Device Reset\n");
mlx5_sync_reset_clear_reset_requested(dev, false);
queue_work(fw_reset->wq, &fw_reset->reset_reload_work);
return;
}
mod_timer(&fw_reset->timer, round_jiffies(jiffies + MLX5_RESET_POLL_INTERVAL));
}
static void mlx5_start_sync_reset_poll(struct mlx5_core_dev *dev)
{
struct mlx5_fw_reset *fw_reset = dev->priv.fw_reset;
timer_setup(&fw_reset->timer, poll_sync_reset, 0);
fw_reset->timer.expires = round_jiffies(jiffies + MLX5_RESET_POLL_INTERVAL);
add_timer(&fw_reset->timer);
}
static int mlx5_fw_reset_set_reset_sync_ack(struct mlx5_core_dev *dev)
{
return mlx5_reg_mfrl_set(dev, MLX5_MFRL_REG_RESET_LEVEL3, 0, 1, false);
}
static int mlx5_fw_reset_set_reset_sync_nack(struct mlx5_core_dev *dev)
{
return mlx5_reg_mfrl_set(dev, MLX5_MFRL_REG_RESET_LEVEL3, 0, 2, false);
}
static void mlx5_sync_reset_set_reset_requested(struct mlx5_core_dev *dev)
{
struct mlx5_fw_reset *fw_reset = dev->priv.fw_reset;
mlx5_stop_health_poll(dev, true);
set_bit(MLX5_FW_RESET_FLAGS_RESET_REQUESTED, &fw_reset->reset_flags);
mlx5_start_sync_reset_poll(dev);
}
static void mlx5_fw_live_patch_event(struct work_struct *work)
{
struct mlx5_fw_reset *fw_reset = container_of(work, struct mlx5_fw_reset,
fw_live_patch_work);
struct mlx5_core_dev *dev = fw_reset->dev;
struct mlx5_fw_tracer *tracer;
mlx5_core_info(dev, "Live patch updated firmware version: %d.%d.%d\n", fw_rev_maj(dev),
fw_rev_min(dev), fw_rev_sub(dev));
tracer = dev->tracer;
if (IS_ERR_OR_NULL(tracer))
return;
if (mlx5_fw_tracer_reload(tracer))
mlx5_core_err(dev, "Failed to reload FW tracer\n");
}
static void mlx5_sync_reset_request_event(struct work_struct *work)
{
struct mlx5_fw_reset *fw_reset = container_of(work, struct mlx5_fw_reset,
reset_request_work);
struct mlx5_core_dev *dev = fw_reset->dev;
int err;
if (test_bit(MLX5_FW_RESET_FLAGS_NACK_RESET_REQUEST, &fw_reset->reset_flags)) {
err = mlx5_fw_reset_set_reset_sync_nack(dev);
mlx5_core_warn(dev, "PCI Sync FW Update Reset Nack %s",
err ? "Failed" : "Sent");
return;
}
mlx5_sync_reset_set_reset_requested(dev);
err = mlx5_fw_reset_set_reset_sync_ack(dev);
if (err)
mlx5_core_warn(dev, "PCI Sync FW Update Reset Ack Failed. Error code: %d\n", err);
else
mlx5_core_warn(dev, "PCI Sync FW Update Reset Ack. Device reset is expected.\n");
}
#define MLX5_PCI_LINK_UP_TIMEOUT 2000
static int mlx5_pci_link_toggle(struct mlx5_core_dev *dev)
{
struct pci_bus *bridge_bus = dev->pdev->bus;
struct pci_dev *bridge = bridge_bus->self;
u16 reg16, dev_id, sdev_id;
unsigned long timeout;
struct pci_dev *sdev;
int cap, err;
u32 reg32;
/* Check that all functions under the pci bridge are PFs of
* this device otherwise fail this function.
*/
err = pci_read_config_word(dev->pdev, PCI_DEVICE_ID, &dev_id);
if (err)
return err;
list_for_each_entry(sdev, &bridge_bus->devices, bus_list) {
err = pci_read_config_word(sdev, PCI_DEVICE_ID, &sdev_id);
if (err)
return err;
if (sdev_id != dev_id)
return -EPERM;
}
cap = pci_find_capability(bridge, PCI_CAP_ID_EXP);
if (!cap)
return -EOPNOTSUPP;
list_for_each_entry(sdev, &bridge_bus->devices, bus_list) {
pci_save_state(sdev);
pci_cfg_access_lock(sdev);
}
/* PCI link toggle */
err = pci_read_config_word(bridge, cap + PCI_EXP_LNKCTL, &reg16);
if (err)
return err;
reg16 |= PCI_EXP_LNKCTL_LD;
err = pci_write_config_word(bridge, cap + PCI_EXP_LNKCTL, reg16);
if (err)
return err;
msleep(500);
reg16 &= ~PCI_EXP_LNKCTL_LD;
err = pci_write_config_word(bridge, cap + PCI_EXP_LNKCTL, reg16);
if (err)
return err;
/* Check link */
err = pci_read_config_dword(bridge, cap + PCI_EXP_LNKCAP, &reg32);
if (err)
return err;
if (!(reg32 & PCI_EXP_LNKCAP_DLLLARC)) {
mlx5_core_warn(dev, "No PCI link reporting capability (0x%08x)\n", reg32);
msleep(1000);
goto restore;
}
timeout = jiffies + msecs_to_jiffies(MLX5_PCI_LINK_UP_TIMEOUT);
do {
err = pci_read_config_word(bridge, cap + PCI_EXP_LNKSTA, &reg16);
if (err)
return err;
if (reg16 & PCI_EXP_LNKSTA_DLLLA)
break;
msleep(20);
} while (!time_after(jiffies, timeout));
if (reg16 & PCI_EXP_LNKSTA_DLLLA) {
mlx5_core_info(dev, "PCI Link up\n");
} else {
mlx5_core_err(dev, "PCI link not ready (0x%04x) after %d ms\n",
reg16, MLX5_PCI_LINK_UP_TIMEOUT);
err = -ETIMEDOUT;
}
restore:
list_for_each_entry(sdev, &bridge_bus->devices, bus_list) {
pci_cfg_access_unlock(sdev);
pci_restore_state(sdev);
}
return err;
}
static void mlx5_sync_reset_now_event(struct work_struct *work)
{
struct mlx5_fw_reset *fw_reset = container_of(work, struct mlx5_fw_reset,
reset_now_work);
struct mlx5_core_dev *dev = fw_reset->dev;
int err;
mlx5_sync_reset_clear_reset_requested(dev, false);
mlx5_core_warn(dev, "Sync Reset now. Device is going to reset.\n");
err = mlx5_cmd_fast_teardown_hca(dev);
if (err) {
mlx5_core_warn(dev, "Fast teardown failed, no reset done, err %d\n", err);
goto done;
}
err = mlx5_pci_link_toggle(dev);
if (err) {
mlx5_core_warn(dev, "mlx5_pci_link_toggle failed, no reset done, err %d\n", err);
goto done;
}
mlx5_enter_error_state(dev, true);
mlx5_unload_one(dev, false);
done:
fw_reset->ret = err;
mlx5_fw_reset_complete_reload(dev);
}
static void mlx5_sync_reset_abort_event(struct work_struct *work)
{
struct mlx5_fw_reset *fw_reset = container_of(work, struct mlx5_fw_reset,
reset_abort_work);
struct mlx5_core_dev *dev = fw_reset->dev;
mlx5_sync_reset_clear_reset_requested(dev, true);
mlx5_core_warn(dev, "PCI Sync FW Update Reset Aborted.\n");
}
static void mlx5_sync_reset_events_handle(struct mlx5_fw_reset *fw_reset, struct mlx5_eqe *eqe)
{
struct mlx5_eqe_sync_fw_update *sync_fw_update_eqe;
u8 sync_event_rst_type;
sync_fw_update_eqe = &eqe->data.sync_fw_update;
sync_event_rst_type = sync_fw_update_eqe->sync_rst_state & SYNC_RST_STATE_MASK;
switch (sync_event_rst_type) {
case MLX5_SYNC_RST_STATE_RESET_REQUEST:
queue_work(fw_reset->wq, &fw_reset->reset_request_work);
break;
case MLX5_SYNC_RST_STATE_RESET_NOW:
queue_work(fw_reset->wq, &fw_reset->reset_now_work);
break;
case MLX5_SYNC_RST_STATE_RESET_ABORT:
queue_work(fw_reset->wq, &fw_reset->reset_abort_work);
break;
}
}
static int fw_reset_event_notifier(struct notifier_block *nb, unsigned long action, void *data)
{
struct mlx5_fw_reset *fw_reset = mlx5_nb_cof(nb, struct mlx5_fw_reset, nb);
struct mlx5_eqe *eqe = data;
switch (eqe->sub_type) {
case MLX5_GENERAL_SUBTYPE_FW_LIVE_PATCH_EVENT:
queue_work(fw_reset->wq, &fw_reset->fw_live_patch_work);
break;
case MLX5_GENERAL_SUBTYPE_PCI_SYNC_FOR_FW_UPDATE_EVENT:
mlx5_sync_reset_events_handle(fw_reset, eqe);
break;
default:
return NOTIFY_DONE;
}
return NOTIFY_OK;
}
#define MLX5_FW_RESET_TIMEOUT_MSEC 5000
int mlx5_fw_reset_wait_reset_done(struct mlx5_core_dev *dev)
{
unsigned long timeout = msecs_to_jiffies(MLX5_FW_RESET_TIMEOUT_MSEC);
struct mlx5_fw_reset *fw_reset = dev->priv.fw_reset;
int err;
if (!wait_for_completion_timeout(&fw_reset->done, timeout)) {
mlx5_core_warn(dev, "FW sync reset timeout after %d seconds\n",
MLX5_FW_RESET_TIMEOUT_MSEC / 1000);
err = -ETIMEDOUT;
goto out;
}
err = fw_reset->ret;
out:
clear_bit(MLX5_FW_RESET_FLAGS_PENDING_COMP, &fw_reset->reset_flags);
return err;
}
void mlx5_fw_reset_events_start(struct mlx5_core_dev *dev)
{
struct mlx5_fw_reset *fw_reset = dev->priv.fw_reset;
MLX5_NB_INIT(&fw_reset->nb, fw_reset_event_notifier, GENERAL_EVENT);
mlx5_eq_notifier_register(dev, &fw_reset->nb);
}
void mlx5_fw_reset_events_stop(struct mlx5_core_dev *dev)
{
mlx5_eq_notifier_unregister(dev, &dev->priv.fw_reset->nb);
}
int mlx5_fw_reset_init(struct mlx5_core_dev *dev)
{
struct mlx5_fw_reset *fw_reset = kzalloc(sizeof(*fw_reset), GFP_KERNEL);
if (!fw_reset)
return -ENOMEM;
fw_reset->wq = create_singlethread_workqueue("mlx5_fw_reset_events");
if (!fw_reset->wq) {
kfree(fw_reset);
return -ENOMEM;
}
fw_reset->dev = dev;
dev->priv.fw_reset = fw_reset;
INIT_WORK(&fw_reset->fw_live_patch_work, mlx5_fw_live_patch_event);
INIT_WORK(&fw_reset->reset_request_work, mlx5_sync_reset_request_event);
INIT_WORK(&fw_reset->reset_reload_work, mlx5_sync_reset_reload_work);
INIT_WORK(&fw_reset->reset_now_work, mlx5_sync_reset_now_event);
INIT_WORK(&fw_reset->reset_abort_work, mlx5_sync_reset_abort_event);
init_completion(&fw_reset->done);
return 0;
}
void mlx5_fw_reset_cleanup(struct mlx5_core_dev *dev)
{
struct mlx5_fw_reset *fw_reset = dev->priv.fw_reset;
destroy_workqueue(fw_reset->wq);
kfree(dev->priv.fw_reset);
}
/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
/* Copyright (c) 2020, Mellanox Technologies inc. All rights reserved. */
#ifndef __MLX5_FW_RESET_H
#define __MLX5_FW_RESET_H
#include "mlx5_core.h"
void mlx5_fw_reset_enable_remote_dev_reset_set(struct mlx5_core_dev *dev, bool enable);
bool mlx5_fw_reset_enable_remote_dev_reset_get(struct mlx5_core_dev *dev);
int mlx5_fw_reset_query(struct mlx5_core_dev *dev, u8 *reset_level, u8 *reset_type);
int mlx5_fw_reset_set_reset_sync(struct mlx5_core_dev *dev, u8 reset_type_sel);
int mlx5_fw_reset_set_live_patch(struct mlx5_core_dev *dev);
int mlx5_fw_reset_wait_reset_done(struct mlx5_core_dev *dev);
void mlx5_fw_reset_events_start(struct mlx5_core_dev *dev);
void mlx5_fw_reset_events_stop(struct mlx5_core_dev *dev);
int mlx5_fw_reset_init(struct mlx5_core_dev *dev);
void mlx5_fw_reset_cleanup(struct mlx5_core_dev *dev);
#endif
......@@ -110,7 +110,7 @@ static bool sensor_fw_synd_rfr(struct mlx5_core_dev *dev)
return rfr && synd;
}
static u32 check_fatal_sensors(struct mlx5_core_dev *dev)
u32 mlx5_health_check_fatal_sensors(struct mlx5_core_dev *dev)
{
if (sensor_pci_not_working(dev))
return MLX5_SENSOR_PCI_COMM_ERR;
......@@ -173,7 +173,7 @@ static bool reset_fw_if_needed(struct mlx5_core_dev *dev)
* Check again to avoid a redundant 2nd reset. If the fatal erros was
* PCI related a reset won't help.
*/
fatal_error = check_fatal_sensors(dev);
fatal_error = mlx5_health_check_fatal_sensors(dev);
if (fatal_error == MLX5_SENSOR_PCI_COMM_ERR ||
fatal_error == MLX5_SENSOR_NIC_DISABLED ||
fatal_error == MLX5_SENSOR_NIC_SW_RESET) {
......@@ -195,7 +195,7 @@ void mlx5_enter_error_state(struct mlx5_core_dev *dev, bool force)
bool err_detected = false;
/* Mark the device as fatal in order to abort FW commands */
if ((check_fatal_sensors(dev) || force) &&
if ((mlx5_health_check_fatal_sensors(dev) || force) &&
dev->state == MLX5_DEVICE_STATE_UP) {
dev->state = MLX5_DEVICE_STATE_INTERNAL_ERROR;
err_detected = true;
......@@ -208,7 +208,7 @@ void mlx5_enter_error_state(struct mlx5_core_dev *dev, bool force)
goto unlock;
}
if (check_fatal_sensors(dev) || force) { /* protected state setting */
if (mlx5_health_check_fatal_sensors(dev) || force) { /* protected state setting */
dev->state = MLX5_DEVICE_STATE_INTERNAL_ERROR;
mlx5_cmd_flush(dev);
}
......@@ -231,7 +231,7 @@ void mlx5_error_sw_reset(struct mlx5_core_dev *dev)
mlx5_core_err(dev, "start\n");
if (check_fatal_sensors(dev) == MLX5_SENSOR_FW_SYND_RFR) {
if (mlx5_health_check_fatal_sensors(dev) == MLX5_SENSOR_FW_SYND_RFR) {
/* Get cr-dump and reset FW semaphore */
lock = lock_sem_sw_reset(dev, true);
......@@ -308,26 +308,31 @@ static void mlx5_handle_bad_state(struct mlx5_core_dev *dev)
/* How much time to wait until health resetting the driver (in msecs) */
#define MLX5_RECOVERY_WAIT_MSECS 60000
static int mlx5_health_try_recover(struct mlx5_core_dev *dev)
int mlx5_health_wait_pci_up(struct mlx5_core_dev *dev)
{
unsigned long end;
mlx5_core_warn(dev, "handling bad device here\n");
mlx5_handle_bad_state(dev);
end = jiffies + msecs_to_jiffies(MLX5_RECOVERY_WAIT_MSECS);
while (sensor_pci_not_working(dev)) {
if (time_after(jiffies, end)) {
mlx5_core_err(dev,
"health recovery flow aborted, PCI reads still not working\n");
return -EIO;
}
if (time_after(jiffies, end))
return -ETIMEDOUT;
msleep(100);
}
return 0;
}
static int mlx5_health_try_recover(struct mlx5_core_dev *dev)
{
mlx5_core_warn(dev, "handling bad device here\n");
mlx5_handle_bad_state(dev);
if (mlx5_health_wait_pci_up(dev)) {
mlx5_core_err(dev, "health recovery flow aborted, PCI reads still not working\n");
return -EIO;
}
mlx5_core_err(dev, "starting health recovery flow\n");
mlx5_recover_device(dev);
if (!test_bit(MLX5_INTERFACE_STATE_UP, &dev->intf_state) ||
check_fatal_sensors(dev)) {
mlx5_health_check_fatal_sensors(dev)) {
mlx5_core_err(dev, "health recovery failed\n");
return -EIO;
}
......@@ -696,7 +701,7 @@ static void poll_health(struct timer_list *t)
if (dev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR)
goto out;
fatal_error = check_fatal_sensors(dev);
fatal_error = mlx5_health_check_fatal_sensors(dev);
if (fatal_error && !health->fatal_error) {
mlx5_core_err(dev, "Fatal error %u detected\n", fatal_error);
......
......@@ -57,6 +57,7 @@
#include "lib/mpfs.h"
#include "eswitch.h"
#include "devlink.h"
#include "fw_reset.h"
#include "lib/mlx5.h"
#include "fpga/core.h"
#include "fpga/ipsec.h"
......@@ -548,6 +549,9 @@ static int handle_hca_cap(struct mlx5_core_dev *dev, void *set_ctx)
if (MLX5_CAP_GEN_MAX(dev, dct))
MLX5_SET(cmd_hca_cap, set_hca_cap, dct, 1);
if (MLX5_CAP_GEN_MAX(dev, pci_sync_for_fw_update_event))
MLX5_SET(cmd_hca_cap, set_hca_cap, pci_sync_for_fw_update_event, 1);
if (MLX5_CAP_GEN_MAX(dev, num_vhca_ports))
MLX5_SET(cmd_hca_cap,
set_hca_cap,
......@@ -832,6 +836,12 @@ static int mlx5_init_once(struct mlx5_core_dev *dev)
goto err_eq_cleanup;
}
err = mlx5_fw_reset_init(dev);
if (err) {
mlx5_core_err(dev, "failed to initialize fw reset events\n");
goto err_events_cleanup;
}
mlx5_cq_debugfs_init(dev);
mlx5_init_reserved_gids(dev);
......@@ -893,6 +903,8 @@ static int mlx5_init_once(struct mlx5_core_dev *dev)
mlx5_geneve_destroy(dev->geneve);
mlx5_vxlan_destroy(dev->vxlan);
mlx5_cq_debugfs_cleanup(dev);
mlx5_fw_reset_cleanup(dev);
err_events_cleanup:
mlx5_events_cleanup(dev);
err_eq_cleanup:
mlx5_eq_table_cleanup(dev);
......@@ -920,6 +932,7 @@ static void mlx5_cleanup_once(struct mlx5_core_dev *dev)
mlx5_cleanup_clock(dev);
mlx5_cleanup_reserved_gids(dev);
mlx5_cq_debugfs_cleanup(dev);
mlx5_fw_reset_cleanup(dev);
mlx5_events_cleanup(dev);
mlx5_eq_table_cleanup(dev);
mlx5_irq_table_cleanup(dev);
......@@ -1078,6 +1091,7 @@ static int mlx5_load(struct mlx5_core_dev *dev)
goto err_fw_tracer;
}
mlx5_fw_reset_events_start(dev);
mlx5_hv_vhca_init(dev->hv_vhca);
err = mlx5_rsc_dump_init(dev);
......@@ -1139,6 +1153,7 @@ static int mlx5_load(struct mlx5_core_dev *dev)
mlx5_rsc_dump_cleanup(dev);
err_rsc_dump:
mlx5_hv_vhca_cleanup(dev->hv_vhca);
mlx5_fw_reset_events_stop(dev);
mlx5_fw_tracer_cleanup(dev->tracer);
err_fw_tracer:
mlx5_eq_table_destroy(dev);
......@@ -1161,6 +1176,7 @@ static void mlx5_unload(struct mlx5_core_dev *dev)
mlx5_fpga_device_stop(dev);
mlx5_rsc_dump_cleanup(dev);
mlx5_hv_vhca_cleanup(dev->hv_vhca);
mlx5_fw_reset_events_stop(dev);
mlx5_fw_tracer_cleanup(dev->tracer);
mlx5_eq_table_destroy(dev);
mlx5_irq_table_destroy(dev);
......
......@@ -128,6 +128,8 @@ int mlx5_cmd_force_teardown_hca(struct mlx5_core_dev *dev);
int mlx5_cmd_fast_teardown_hca(struct mlx5_core_dev *dev);
void mlx5_enter_error_state(struct mlx5_core_dev *dev, bool force);
void mlx5_error_sw_reset(struct mlx5_core_dev *dev);
u32 mlx5_health_check_fatal_sensors(struct mlx5_core_dev *dev);
int mlx5_health_wait_pci_up(struct mlx5_core_dev *dev);
void mlx5_disable_device(struct mlx5_core_dev *dev);
void mlx5_recover_device(struct mlx5_core_dev *dev);
int mlx5_sriov_init(struct mlx5_core_dev *dev);
......
......@@ -1414,7 +1414,8 @@ mlxsw_devlink_info_get(struct devlink *devlink, struct devlink_info_req *req,
static int
mlxsw_devlink_core_bus_device_reload_down(struct devlink *devlink,
bool netns_change,
bool netns_change, enum devlink_reload_action action,
enum devlink_reload_limit limit,
struct netlink_ext_ack *extack)
{
struct mlxsw_core *mlxsw_core = devlink_priv(devlink);
......@@ -1427,11 +1428,14 @@ mlxsw_devlink_core_bus_device_reload_down(struct devlink *devlink,
}
static int
mlxsw_devlink_core_bus_device_reload_up(struct devlink *devlink,
mlxsw_devlink_core_bus_device_reload_up(struct devlink *devlink, enum devlink_reload_action action,
enum devlink_reload_limit limit, u32 *actions_performed,
struct netlink_ext_ack *extack)
{
struct mlxsw_core *mlxsw_core = devlink_priv(devlink);
*actions_performed = BIT(DEVLINK_RELOAD_ACTION_DRIVER_REINIT) |
BIT(DEVLINK_RELOAD_ACTION_FW_ACTIVATE);
return mlxsw_core_bus_device_register(mlxsw_core->bus_info,
mlxsw_core->bus,
mlxsw_core->bus_priv, true,
......@@ -1564,6 +1568,8 @@ mlxsw_devlink_trap_policer_counter_get(struct devlink *devlink,
}
static const struct devlink_ops mlxsw_devlink_ops = {
.reload_actions = BIT(DEVLINK_RELOAD_ACTION_DRIVER_REINIT) |
BIT(DEVLINK_RELOAD_ACTION_FW_ACTIVATE),
.reload_down = mlxsw_devlink_core_bus_device_reload_down,
.reload_up = mlxsw_devlink_core_bus_device_reload_up,
.port_type_set = mlxsw_devlink_port_type_set,
......
......@@ -701,6 +701,7 @@ static int nsim_dev_reload_create(struct nsim_dev *nsim_dev,
static void nsim_dev_reload_destroy(struct nsim_dev *nsim_dev);
static int nsim_dev_reload_down(struct devlink *devlink, bool netns_change,
enum devlink_reload_action action, enum devlink_reload_limit limit,
struct netlink_ext_ack *extack)
{
struct nsim_dev *nsim_dev = devlink_priv(devlink);
......@@ -717,7 +718,8 @@ static int nsim_dev_reload_down(struct devlink *devlink, bool netns_change,
return 0;
}
static int nsim_dev_reload_up(struct devlink *devlink,
static int nsim_dev_reload_up(struct devlink *devlink, enum devlink_reload_action action,
enum devlink_reload_limit limit, u32 *actions_performed,
struct netlink_ext_ack *extack)
{
struct nsim_dev *nsim_dev = devlink_priv(devlink);
......@@ -730,6 +732,7 @@ static int nsim_dev_reload_up(struct devlink *devlink,
return -EINVAL;
}
*actions_performed = BIT(DEVLINK_RELOAD_ACTION_DRIVER_REINIT);
return nsim_dev_reload_create(nsim_dev, extack);
}
......@@ -886,6 +889,7 @@ nsim_dev_devlink_trap_policer_counter_get(struct devlink *devlink,
static const struct devlink_ops nsim_dev_devlink_ops = {
.supported_flash_update_params = DEVLINK_SUPPORT_FLASH_UPDATE_COMPONENT |
DEVLINK_SUPPORT_FLASH_UPDATE_OVERWRITE_MASK,
.reload_actions = BIT(DEVLINK_RELOAD_ACTION_DRIVER_REINIT),
.reload_down = nsim_dev_reload_down,
.reload_up = nsim_dev_reload_up,
.info_get = nsim_dev_info_get,
......
......@@ -366,6 +366,7 @@ enum {
enum {
MLX5_GENERAL_SUBTYPE_DELAY_DROP_TIMEOUT = 0x1,
MLX5_GENERAL_SUBTYPE_PCI_POWER_CHANGE_EVENT = 0x5,
MLX5_GENERAL_SUBTYPE_FW_LIVE_PATCH_EVENT = 0x7,
MLX5_GENERAL_SUBTYPE_PCI_SYNC_FOR_FW_UPDATE_EVENT = 0x8,
};
......
......@@ -501,6 +501,7 @@ struct mlx5_mpfs;
struct mlx5_eswitch;
struct mlx5_lag;
struct mlx5_devcom;
struct mlx5_fw_reset;
struct mlx5_eq_table;
struct mlx5_irq_table;
......@@ -578,6 +579,7 @@ struct mlx5_priv {
struct mlx5_core_sriov sriov;
struct mlx5_lag *lag;
struct mlx5_devcom *devcom;
struct mlx5_fw_reset *fw_reset;
struct mlx5_core_roce roce;
struct mlx5_fc_stats fc_stats;
struct mlx5_rl_table rl_table;
......
......@@ -20,6 +20,14 @@
#include <uapi/linux/devlink.h>
#include <linux/xarray.h>
#define DEVLINK_RELOAD_STATS_ARRAY_SIZE \
(__DEVLINK_RELOAD_LIMIT_MAX * __DEVLINK_RELOAD_ACTION_MAX)
struct devlink_dev_stats {
u32 reload_stats[DEVLINK_RELOAD_STATS_ARRAY_SIZE];
u32 remote_reload_stats[DEVLINK_RELOAD_STATS_ARRAY_SIZE];
};
struct devlink_ops;
struct devlink {
......@@ -38,6 +46,7 @@ struct devlink {
struct list_head trap_policer_list;
const struct devlink_ops *ops;
struct xarray snapshot_ids;
struct devlink_dev_stats stats;
struct device *dev;
possible_net_t _net;
struct mutex lock; /* Serializes access to devlink instance specific objects such as
......@@ -460,6 +469,7 @@ enum devlink_param_generic_id {
DEVLINK_PARAM_GENERIC_ID_FW_LOAD_POLICY,
DEVLINK_PARAM_GENERIC_ID_RESET_DEV_ON_DRV_PROBE,
DEVLINK_PARAM_GENERIC_ID_ENABLE_ROCE,
DEVLINK_PARAM_GENERIC_ID_ENABLE_REMOTE_DEV_RESET,
/* add new param generic ids above here*/
__DEVLINK_PARAM_GENERIC_ID_MAX,
......@@ -497,6 +507,9 @@ enum devlink_param_generic_id {
#define DEVLINK_PARAM_GENERIC_ENABLE_ROCE_NAME "enable_roce"
#define DEVLINK_PARAM_GENERIC_ENABLE_ROCE_TYPE DEVLINK_PARAM_TYPE_BOOL
#define DEVLINK_PARAM_GENERIC_ENABLE_REMOTE_DEV_RESET_NAME "enable_remote_dev_reset"
#define DEVLINK_PARAM_GENERIC_ENABLE_REMOTE_DEV_RESET_TYPE DEVLINK_PARAM_TYPE_BOOL
#define DEVLINK_PARAM_GENERIC(_id, _cmodes, _get, _set, _validate) \
{ \
.id = DEVLINK_PARAM_GENERIC_ID_##_id, \
......@@ -1150,9 +1163,14 @@ struct devlink_ops {
* implemementation.
*/
u32 supported_flash_update_params;
unsigned long reload_actions;
unsigned long reload_limits;
int (*reload_down)(struct devlink *devlink, bool netns_change,
enum devlink_reload_action action,
enum devlink_reload_limit limit,
struct netlink_ext_ack *extack);
int (*reload_up)(struct devlink *devlink,
int (*reload_up)(struct devlink *devlink, enum devlink_reload_action action,
enum devlink_reload_limit limit, u32 *actions_performed,
struct netlink_ext_ack *extack);
int (*port_type_set)(struct devlink_port *devlink_port,
enum devlink_port_type port_type);
......@@ -1554,6 +1572,9 @@ void
devlink_health_reporter_recovery_done(struct devlink_health_reporter *reporter);
bool devlink_is_reload_failed(const struct devlink *devlink);
void devlink_remote_reload_actions_performed(struct devlink *devlink,
enum devlink_reload_limit limit,
u32 actions_performed);
void devlink_flash_update_begin_notify(struct devlink *devlink);
void devlink_flash_update_end_notify(struct devlink *devlink);
......
......@@ -301,6 +301,29 @@ enum {
DEVLINK_ATTR_TRAP_METADATA_TYPE_FA_COOKIE,
};
enum devlink_reload_action {
DEVLINK_RELOAD_ACTION_UNSPEC,
DEVLINK_RELOAD_ACTION_DRIVER_REINIT, /* Driver entities re-instantiation */
DEVLINK_RELOAD_ACTION_FW_ACTIVATE, /* FW activate */
/* Add new reload actions above */
__DEVLINK_RELOAD_ACTION_MAX,
DEVLINK_RELOAD_ACTION_MAX = __DEVLINK_RELOAD_ACTION_MAX - 1
};
enum devlink_reload_limit {
DEVLINK_RELOAD_LIMIT_UNSPEC, /* unspecified, no constraints */
DEVLINK_RELOAD_LIMIT_NO_RESET, /* No reset allowed, no down time allowed,
* no link flap and no configuration is lost.
*/
/* Add new reload limit above */
__DEVLINK_RELOAD_LIMIT_MAX,
DEVLINK_RELOAD_LIMIT_MAX = __DEVLINK_RELOAD_LIMIT_MAX - 1
};
#define DEVLINK_RELOAD_LIMITS_VALID_MASK (BIT(__DEVLINK_RELOAD_LIMIT_MAX) - 1)
enum devlink_attr {
/* don't change the order or add anything between, this is ABI! */
DEVLINK_ATTR_UNSPEC,
......@@ -493,6 +516,17 @@ enum devlink_attr {
DEVLINK_ATTR_FLASH_UPDATE_STATUS_TIMEOUT, /* u64 */
DEVLINK_ATTR_FLASH_UPDATE_OVERWRITE_MASK, /* bitfield32 */
DEVLINK_ATTR_RELOAD_ACTION, /* u8 */
DEVLINK_ATTR_RELOAD_ACTIONS_PERFORMED, /* bitfield32 */
DEVLINK_ATTR_RELOAD_LIMITS, /* bitfield32 */
DEVLINK_ATTR_DEV_STATS, /* nested */
DEVLINK_ATTR_RELOAD_STATS, /* nested */
DEVLINK_ATTR_RELOAD_STATS_ENTRY, /* nested */
DEVLINK_ATTR_RELOAD_STATS_LIMIT, /* u8 */
DEVLINK_ATTR_RELOAD_STATS_VALUE, /* u32 */
DEVLINK_ATTR_REMOTE_RELOAD_STATS, /* nested */
/* add new attributes above here, update the policy in devlink.c */
__DEVLINK_ATTR_MAX,
......
......@@ -479,10 +479,115 @@ static int devlink_nl_put_handle(struct sk_buff *msg, struct devlink *devlink)
return 0;
}
struct devlink_reload_combination {
enum devlink_reload_action action;
enum devlink_reload_limit limit;
};
static const struct devlink_reload_combination devlink_reload_invalid_combinations[] = {
{
/* can't reinitialize driver with no down time */
.action = DEVLINK_RELOAD_ACTION_DRIVER_REINIT,
.limit = DEVLINK_RELOAD_LIMIT_NO_RESET,
},
};
static bool
devlink_reload_combination_is_invalid(enum devlink_reload_action action,
enum devlink_reload_limit limit)
{
int i;
for (i = 0; i < ARRAY_SIZE(devlink_reload_invalid_combinations); i++)
if (devlink_reload_invalid_combinations[i].action == action &&
devlink_reload_invalid_combinations[i].limit == limit)
return true;
return false;
}
static bool
devlink_reload_action_is_supported(struct devlink *devlink, enum devlink_reload_action action)
{
return test_bit(action, &devlink->ops->reload_actions);
}
static bool
devlink_reload_limit_is_supported(struct devlink *devlink, enum devlink_reload_limit limit)
{
return test_bit(limit, &devlink->ops->reload_limits);
}
static int devlink_reload_stat_put(struct sk_buff *msg, enum devlink_reload_action action,
enum devlink_reload_limit limit, u32 value)
{
struct nlattr *reload_stats_entry;
reload_stats_entry = nla_nest_start(msg, DEVLINK_ATTR_RELOAD_STATS_ENTRY);
if (!reload_stats_entry)
return -EMSGSIZE;
if (nla_put_u8(msg, DEVLINK_ATTR_RELOAD_ACTION, action) ||
nla_put_u8(msg, DEVLINK_ATTR_RELOAD_STATS_LIMIT, limit) ||
nla_put_u32(msg, DEVLINK_ATTR_RELOAD_STATS_VALUE, value))
goto nla_put_failure;
nla_nest_end(msg, reload_stats_entry);
return 0;
nla_put_failure:
nla_nest_cancel(msg, reload_stats_entry);
return -EMSGSIZE;
}
static int devlink_reload_stats_put(struct sk_buff *msg, struct devlink *devlink, bool is_remote)
{
struct nlattr *reload_stats_attr;
int i, j, stat_idx;
u32 value;
if (!is_remote)
reload_stats_attr = nla_nest_start(msg, DEVLINK_ATTR_RELOAD_STATS);
else
reload_stats_attr = nla_nest_start(msg, DEVLINK_ATTR_REMOTE_RELOAD_STATS);
if (!reload_stats_attr)
return -EMSGSIZE;
for (j = 0; j <= DEVLINK_RELOAD_LIMIT_MAX; j++) {
/* Remote stats are shown even if not locally supported. Stats
* of actions with unspecified limit are shown though drivers
* don't need to register unspecified limit.
*/
if (!is_remote && j != DEVLINK_RELOAD_LIMIT_UNSPEC &&
!devlink_reload_limit_is_supported(devlink, j))
continue;
for (i = 0; i <= DEVLINK_RELOAD_ACTION_MAX; i++) {
if ((!is_remote && !devlink_reload_action_is_supported(devlink, i)) ||
i == DEVLINK_RELOAD_ACTION_UNSPEC ||
devlink_reload_combination_is_invalid(i, j))
continue;
stat_idx = j * __DEVLINK_RELOAD_ACTION_MAX + i;
if (!is_remote)
value = devlink->stats.reload_stats[stat_idx];
else
value = devlink->stats.remote_reload_stats[stat_idx];
if (devlink_reload_stat_put(msg, i, j, value))
goto nla_put_failure;
}
}
nla_nest_end(msg, reload_stats_attr);
return 0;
nla_put_failure:
nla_nest_cancel(msg, reload_stats_attr);
return -EMSGSIZE;
}
static int devlink_nl_fill(struct sk_buff *msg, struct devlink *devlink,
enum devlink_command cmd, u32 portid,
u32 seq, int flags)
{
struct nlattr *dev_stats;
void *hdr;
hdr = genlmsg_put(msg, portid, seq, &devlink_nl_family, flags, cmd);
......@@ -494,9 +599,21 @@ static int devlink_nl_fill(struct sk_buff *msg, struct devlink *devlink,
if (nla_put_u8(msg, DEVLINK_ATTR_RELOAD_FAILED, devlink->reload_failed))
goto nla_put_failure;
dev_stats = nla_nest_start(msg, DEVLINK_ATTR_DEV_STATS);
if (!dev_stats)
goto nla_put_failure;
if (devlink_reload_stats_put(msg, devlink, false))
goto dev_stats_nest_cancel;
if (devlink_reload_stats_put(msg, devlink, true))
goto dev_stats_nest_cancel;
nla_nest_end(msg, dev_stats);
genlmsg_end(msg, hdr);
return 0;
dev_stats_nest_cancel:
nla_nest_cancel(msg, dev_stats);
nla_put_failure:
genlmsg_cancel(msg, hdr);
return -EMSGSIZE;
......@@ -2963,9 +3080,9 @@ static void devlink_reload_netns_change(struct devlink *devlink,
DEVLINK_CMD_PARAM_NEW);
}
static bool devlink_reload_supported(const struct devlink *devlink)
static bool devlink_reload_supported(const struct devlink_ops *ops)
{
return devlink->ops->reload_down && devlink->ops->reload_up;
return ops->reload_down && ops->reload_up;
}
static void devlink_reload_failed_set(struct devlink *devlink,
......@@ -2983,33 +3100,132 @@ bool devlink_is_reload_failed(const struct devlink *devlink)
}
EXPORT_SYMBOL_GPL(devlink_is_reload_failed);
static void
__devlink_reload_stats_update(struct devlink *devlink, u32 *reload_stats,
enum devlink_reload_limit limit, u32 actions_performed)
{
unsigned long actions = actions_performed;
int stat_idx;
int action;
for_each_set_bit(action, &actions, __DEVLINK_RELOAD_ACTION_MAX) {
stat_idx = limit * __DEVLINK_RELOAD_ACTION_MAX + action;
reload_stats[stat_idx]++;
}
devlink_notify(devlink, DEVLINK_CMD_NEW);
}
static void
devlink_reload_stats_update(struct devlink *devlink, enum devlink_reload_limit limit,
u32 actions_performed)
{
__devlink_reload_stats_update(devlink, devlink->stats.reload_stats, limit,
actions_performed);
}
/**
* devlink_remote_reload_actions_performed - Update devlink on reload actions
* performed which are not a direct result of devlink reload call.
*
* This should be called by a driver after performing reload actions in case it was not
* a result of devlink reload call. For example fw_activate was performed as a result
* of devlink reload triggered fw_activate on another host.
* The motivation for this function is to keep data on reload actions performed on this
* function whether it was done due to direct devlink reload call or not.
*
* @devlink: devlink
* @limit: reload limit
* @actions_performed: bitmask of actions performed
*/
void devlink_remote_reload_actions_performed(struct devlink *devlink,
enum devlink_reload_limit limit,
u32 actions_performed)
{
if (WARN_ON(!actions_performed ||
actions_performed & BIT(DEVLINK_RELOAD_ACTION_UNSPEC) ||
actions_performed >= BIT(__DEVLINK_RELOAD_ACTION_MAX) ||
limit > DEVLINK_RELOAD_LIMIT_MAX))
return;
__devlink_reload_stats_update(devlink, devlink->stats.remote_reload_stats, limit,
actions_performed);
}
EXPORT_SYMBOL_GPL(devlink_remote_reload_actions_performed);
static int devlink_reload(struct devlink *devlink, struct net *dest_net,
struct netlink_ext_ack *extack)
enum devlink_reload_action action, enum devlink_reload_limit limit,
u32 *actions_performed, struct netlink_ext_ack *extack)
{
u32 remote_reload_stats[DEVLINK_RELOAD_STATS_ARRAY_SIZE];
int err;
if (!devlink->reload_enabled)
return -EOPNOTSUPP;
err = devlink->ops->reload_down(devlink, !!dest_net, extack);
memcpy(remote_reload_stats, devlink->stats.remote_reload_stats,
sizeof(remote_reload_stats));
err = devlink->ops->reload_down(devlink, !!dest_net, action, limit, extack);
if (err)
return err;
if (dest_net && !net_eq(dest_net, devlink_net(devlink)))
devlink_reload_netns_change(devlink, dest_net);
err = devlink->ops->reload_up(devlink, extack);
err = devlink->ops->reload_up(devlink, action, limit, actions_performed, extack);
devlink_reload_failed_set(devlink, !!err);
if (err)
return err;
WARN_ON(!(*actions_performed & BIT(action)));
/* Catch driver on updating the remote action within devlink reload */
WARN_ON(memcmp(remote_reload_stats, devlink->stats.remote_reload_stats,
sizeof(remote_reload_stats)));
devlink_reload_stats_update(devlink, limit, *actions_performed);
return 0;
}
static int
devlink_nl_reload_actions_performed_snd(struct devlink *devlink, u32 actions_performed,
enum devlink_command cmd, struct genl_info *info)
{
struct sk_buff *msg;
void *hdr;
msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
if (!msg)
return -ENOMEM;
hdr = genlmsg_put(msg, info->snd_portid, info->snd_seq, &devlink_nl_family, 0, cmd);
if (!hdr)
goto free_msg;
if (devlink_nl_put_handle(msg, devlink))
goto nla_put_failure;
if (nla_put_bitfield32(msg, DEVLINK_ATTR_RELOAD_ACTIONS_PERFORMED, actions_performed,
actions_performed))
goto nla_put_failure;
genlmsg_end(msg, hdr);
return genlmsg_reply(msg, info);
nla_put_failure:
genlmsg_cancel(msg, hdr);
free_msg:
nlmsg_free(msg);
return -EMSGSIZE;
}
static int devlink_nl_cmd_reload(struct sk_buff *skb, struct genl_info *info)
{
struct devlink *devlink = info->user_ptr[0];
enum devlink_reload_action action;
enum devlink_reload_limit limit;
struct net *dest_net = NULL;
u32 actions_performed;
int err;
if (!devlink_reload_supported(devlink))
if (!devlink_reload_supported(devlink->ops))
return -EOPNOTSUPP;
err = devlink_resources_validate(devlink, NULL, info);
......@@ -3026,12 +3242,61 @@ static int devlink_nl_cmd_reload(struct sk_buff *skb, struct genl_info *info)
return PTR_ERR(dest_net);
}
err = devlink_reload(devlink, dest_net, info->extack);
if (info->attrs[DEVLINK_ATTR_RELOAD_ACTION])
action = nla_get_u8(info->attrs[DEVLINK_ATTR_RELOAD_ACTION]);
else
action = DEVLINK_RELOAD_ACTION_DRIVER_REINIT;
if (!devlink_reload_action_is_supported(devlink, action)) {
NL_SET_ERR_MSG_MOD(info->extack,
"Requested reload action is not supported by the driver");
return -EOPNOTSUPP;
}
limit = DEVLINK_RELOAD_LIMIT_UNSPEC;
if (info->attrs[DEVLINK_ATTR_RELOAD_LIMITS]) {
struct nla_bitfield32 limits;
u32 limits_selected;
limits = nla_get_bitfield32(info->attrs[DEVLINK_ATTR_RELOAD_LIMITS]);
limits_selected = limits.value & limits.selector;
if (!limits_selected) {
NL_SET_ERR_MSG_MOD(info->extack, "Invalid limit selected");
return -EINVAL;
}
for (limit = 0 ; limit <= DEVLINK_RELOAD_LIMIT_MAX ; limit++)
if (limits_selected & BIT(limit))
break;
/* UAPI enables multiselection, but currently it is not used */
if (limits_selected != BIT(limit)) {
NL_SET_ERR_MSG_MOD(info->extack,
"Multiselection of limit is not supported");
return -EOPNOTSUPP;
}
if (!devlink_reload_limit_is_supported(devlink, limit)) {
NL_SET_ERR_MSG_MOD(info->extack,
"Requested limit is not supported by the driver");
return -EOPNOTSUPP;
}
if (devlink_reload_combination_is_invalid(action, limit)) {
NL_SET_ERR_MSG_MOD(info->extack,
"Requested limit is invalid for this action");
return -EINVAL;
}
}
err = devlink_reload(devlink, dest_net, action, limit, &actions_performed, info->extack);
if (dest_net)
put_net(dest_net);
if (err)
return err;
/* For backward compatibility generate reply only if attributes used by user */
if (!info->attrs[DEVLINK_ATTR_RELOAD_ACTION] && !info->attrs[DEVLINK_ATTR_RELOAD_LIMITS])
return 0;
return devlink_nl_reload_actions_performed_snd(devlink, actions_performed,
DEVLINK_CMD_RELOAD, info);
}
static int devlink_nl_flash_update_fill(struct sk_buff *msg,
......@@ -3256,6 +3521,11 @@ static const struct devlink_param devlink_param_generic[] = {
.name = DEVLINK_PARAM_GENERIC_ENABLE_ROCE_NAME,
.type = DEVLINK_PARAM_GENERIC_ENABLE_ROCE_TYPE,
},
{
.id = DEVLINK_PARAM_GENERIC_ID_ENABLE_REMOTE_DEV_RESET,
.name = DEVLINK_PARAM_GENERIC_ENABLE_REMOTE_DEV_RESET_NAME,
.type = DEVLINK_PARAM_GENERIC_ENABLE_REMOTE_DEV_RESET_TYPE,
},
};
static int devlink_param_generic_verify(const struct devlink_param *param)
......@@ -7282,6 +7552,9 @@ static const struct nla_policy devlink_nl_policy[DEVLINK_ATTR_MAX + 1] = {
[DEVLINK_ATTR_TRAP_POLICER_RATE] = { .type = NLA_U64 },
[DEVLINK_ATTR_TRAP_POLICER_BURST] = { .type = NLA_U64 },
[DEVLINK_ATTR_PORT_FUNCTION] = { .type = NLA_NESTED },
[DEVLINK_ATTR_RELOAD_ACTION] = NLA_POLICY_RANGE(NLA_U8, DEVLINK_RELOAD_ACTION_DRIVER_REINIT,
DEVLINK_RELOAD_ACTION_MAX),
[DEVLINK_ATTR_RELOAD_LIMITS] = NLA_POLICY_BITFIELD32(DEVLINK_RELOAD_LIMITS_VALID_MASK),
};
static const struct genl_small_ops devlink_nl_ops[] = {
......@@ -7615,6 +7888,35 @@ static struct genl_family devlink_nl_family __ro_after_init = {
.n_mcgrps = ARRAY_SIZE(devlink_nl_mcgrps),
};
static bool devlink_reload_actions_valid(const struct devlink_ops *ops)
{
const struct devlink_reload_combination *comb;
int i;
if (!devlink_reload_supported(ops)) {
if (WARN_ON(ops->reload_actions))
return false;
return true;
}
if (WARN_ON(!ops->reload_actions ||
ops->reload_actions & BIT(DEVLINK_RELOAD_ACTION_UNSPEC) ||
ops->reload_actions >= BIT(__DEVLINK_RELOAD_ACTION_MAX)))
return false;
if (WARN_ON(ops->reload_limits & BIT(DEVLINK_RELOAD_LIMIT_UNSPEC) ||
ops->reload_limits >= BIT(__DEVLINK_RELOAD_LIMIT_MAX)))
return false;
for (i = 0; i < ARRAY_SIZE(devlink_reload_invalid_combinations); i++) {
comb = &devlink_reload_invalid_combinations[i];
if (ops->reload_actions == BIT(comb->action) &&
ops->reload_limits == BIT(comb->limit))
return false;
}
return true;
}
/**
* devlink_alloc - Allocate new devlink instance resources
*
......@@ -7631,6 +7933,9 @@ struct devlink *devlink_alloc(const struct devlink_ops *ops, size_t priv_size)
if (WARN_ON(!ops))
return NULL;
if (!devlink_reload_actions_valid(ops))
return NULL;
devlink = kzalloc(sizeof(*devlink) + priv_size, GFP_KERNEL);
if (!devlink)
return NULL;
......@@ -7679,7 +7984,7 @@ EXPORT_SYMBOL_GPL(devlink_register);
void devlink_unregister(struct devlink *devlink)
{
mutex_lock(&devlink_mutex);
WARN_ON(devlink_reload_supported(devlink) &&
WARN_ON(devlink_reload_supported(devlink->ops) &&
devlink->reload_enabled);
devlink_notify(devlink, DEVLINK_CMD_DEL);
list_del(&devlink->list);
......@@ -8720,7 +9025,7 @@ __devlink_param_driverinit_value_set(struct devlink *devlink,
int devlink_param_driverinit_value_get(struct devlink *devlink, u32 param_id,
union devlink_param_value *init_val)
{
if (!devlink_reload_supported(devlink))
if (!devlink_reload_supported(devlink->ops))
return -EOPNOTSUPP;
return __devlink_param_driverinit_value_get(&devlink->param_list,
......@@ -8767,7 +9072,7 @@ int devlink_port_param_driverinit_value_get(struct devlink_port *devlink_port,
{
struct devlink *devlink = devlink_port->devlink;
if (!devlink_reload_supported(devlink))
if (!devlink_reload_supported(devlink->ops))
return -EOPNOTSUPP;
return __devlink_param_driverinit_value_get(&devlink_port->param_list,
......@@ -9960,6 +10265,7 @@ int devlink_compat_switch_id_get(struct net_device *dev,
static void __net_exit devlink_pernet_pre_exit(struct net *net)
{
struct devlink *devlink;
u32 actions_performed;
int err;
/* In case network namespace is getting destroyed, reload
......@@ -9968,9 +10274,12 @@ static void __net_exit devlink_pernet_pre_exit(struct net *net)
mutex_lock(&devlink_mutex);
list_for_each_entry(devlink, &devlink_list, list) {
if (net_eq(devlink_net(devlink), net)) {
if (WARN_ON(!devlink_reload_supported(devlink)))
if (WARN_ON(!devlink_reload_supported(devlink->ops)))
continue;
err = devlink_reload(devlink, &init_net, NULL);
err = devlink_reload(devlink, &init_net,
DEVLINK_RELOAD_ACTION_DRIVER_REINIT,
DEVLINK_RELOAD_LIMIT_UNSPEC,
&actions_performed, NULL);
if (err && err != -EOPNOTSUPP)
pr_warn("Failed to reload devlink instance into init_net\n");
}
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment