Commit 1f492eab authored by Ido Schimmel, committed by Jakub Kicinski

mlxsw: core: Use variable timeout for EMAD retries

The driver sends Ethernet Management Datagram (EMAD) packets to the
device for configuration purposes and waits for up to 200ms for a reply.
A request is retried up to 5 times.

When the system is under heavy load, replies are not always processed in
time and EMAD transactions fail.

Make the process more robust to such delays by using exponential
backoff. First wait for up to 200ms, then retransmit and wait for up to
400ms and so on.

Fixes: caf7297e ("mlxsw: core: Introduce support for asynchronous EMAD register access")
Reported-by: Denis Yulevich <denisyu@nvidia.com>
Tested-by: Denis Yulevich <denisyu@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
parent fb738b99
@@ -571,7 +571,8 @@ static void mlxsw_emad_trans_timeout_schedule(struct mlxsw_reg_trans *trans)
 	if (trans->core->fw_flash_in_progress)
 		timeout = msecs_to_jiffies(MLXSW_EMAD_TIMEOUT_DURING_FW_FLASH_MS);
 
-	queue_delayed_work(trans->core->emad_wq, &trans->timeout_dw, timeout);
+	queue_delayed_work(trans->core->emad_wq, &trans->timeout_dw,
+			   timeout << trans->retries);
 }
 
 static int mlxsw_emad_transmit(struct mlxsw_core *mlxsw_core,
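
To make the effect of the shift concrete, here is a minimal standalone sketch of the resulting wait schedule. It is not driver code: the constants and function names are hypothetical, the values are in milliseconds rather than jiffies, and it assumes the initial transmission counts as retry 0 and is followed by up to 5 retransmissions, as the commit message describes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants mirroring the commit message, not driver macros. */
#define EMAD_BASE_TIMEOUT_MS 200u
#define EMAD_MAX_RETRIES 5u

/* Wait applied once `retries` retransmissions have happened: base << retries. */
static uint32_t emad_wait_ms(uint32_t base_ms, unsigned int retries)
{
	assert(retries < 16); /* keep the shift well within 32 bits */
	return base_ms << retries;
}

/* Worst-case total wait: the initial attempt plus max_retries retransmissions. */
static uint32_t emad_total_wait_ms(uint32_t base_ms, unsigned int max_retries)
{
	uint32_t total = 0;
	unsigned int i;

	for (i = 0; i <= max_retries; i++)
		total += emad_wait_ms(base_ms, i);
	return total;
}
```

Under these assumptions, emad_wait_ms(EMAD_BASE_TIMEOUT_MS, n) yields 200, 400, 800, 1600, 3200 and 6400 ms for n = 0..5, and emad_total_wait_ms(EMAD_BASE_TIMEOUT_MS, EMAD_MAX_RETRIES) comes to 12600 ms, compared with 1200 ms for six fixed 200 ms waits.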