Commit 97afc2bf authored by Daniel Jurgens, committed by Kamal Mostafa

net/mlx4_core: Do not BUG_ON during reset when PCI is offline

commit 22e3817e upstream.

The PCI channel could go offline during reset due to EEH.  Don't BUG_ON in
this case; the error is recoverable.

Fixes: f6bc11e4 ('net/mlx4_core: Enhance the catas flow to support device reset')
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
parent 0ca34014
@@ -182,10 +182,17 @@ void mlx4_enter_error_state(struct mlx4_dev_persistent *persist)
 		err = mlx4_reset_slave(dev);
 	else
 		err = mlx4_reset_master(dev);
-	BUG_ON(err != 0);
+	if (!err) {
+		mlx4_err(dev, "device was reset successfully\n");
+	} else {
+		/* EEH could have disabled the PCI channel during reset. That's
+		 * recoverable and the PCI error flow will handle it.
+		 */
+		if (!pci_channel_offline(dev->persist->pdev))
+			BUG_ON(1);
+	}
 	dev->persist->state |= MLX4_DEVICE_STATE_INTERNAL_ERROR;
-	mlx4_err(dev, "device was reset successfully\n");
 	mutex_unlock(&persist->device_state_mutex);
 	/* At that step HW was already reset, now notify clients */
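
The decision this patch introduces can be modeled outside the kernel. What follows is a minimal user-space sketch, not driver code: `reset_outcome`, `classify_reset`, and the enum values are hypothetical names invented for illustration, and the boolean `channel_offline` stands in for the result of `pci_channel_offline()`.

```c
#include <stdbool.h>

/* Hypothetical outcome labels, used only in this sketch. */
enum reset_outcome {
	RESET_OK,          /* reset returned 0 */
	RESET_RECOVERABLE, /* reset failed while the PCI channel was offline */
	RESET_FATAL        /* reset failed with the channel online */
};

/*
 * err:             return value of the reset attempt (0 on success)
 * channel_offline: stand-in for pci_channel_offline() on the device
 */
static enum reset_outcome classify_reset(int err, bool channel_offline)
{
	if (!err)
		return RESET_OK;
	/* A failure with the channel offline is left to the PCI error flow. */
	return channel_offline ? RESET_RECOVERABLE : RESET_FATAL;
}
```

In terms of the diff, only the `RESET_FATAL` case still reaches `BUG_ON(1)`. The success message is now logged only for `RESET_OK`, while `MLX4_DEVICE_STATE_INTERNAL_ERROR` is set in every case that does not hit the BUG.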