1. 25 Jul, 2012 1 commit
    • mlx4: Add support for EEH error recovery · 57dbf29a
      Kleber Sacilotto de Souza authored
      Currently the mlx4 drivers lack the callbacks needed to implement EEH
      error detection and recovery, so the PCI layer falls back to the probe
      and remove callbacks to try to recover the device after an error on
      the bus. However, these callbacks race with the internal catastrophic
      error recovery functions, which also detect the error, and the system
      can crash if both the EEH and catas functions try to reset the device.
      
      This patch adds the necessary error recovery callbacks and makes sure
      that the internal catastrophic error functions will not try to reset the
      device in such scenarios. It also adds some calls to
      pci_channel_offline() to suppress reads/writes on the bus when the slot
      cannot accept I/O operations, which prevents unnecessary accesses to
      the bus and speeds up device removal.
      Signed-off-by: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com>
      Acked-by: Shlomo Pongratz <shlomop@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
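
      The interaction described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the driver code from the patch: every name below (model_dev, model_catas_poll, model_error_detected, model_slot_reset, and the enums) is hypothetical and merely mirrors the roles of the kernel's error_detected and slot_reset callbacks in struct pci_error_handlers and of pci_channel_offline(). It shows the two ideas the message names: the internal catastrophic-error path declines to reset the device while the channel is offline, and the recovery callbacks tear the device down and bring it back after the slot reset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's channel states and recovery results. */
enum channel_state { CHANNEL_NORMAL, CHANNEL_FROZEN, CHANNEL_PERM_FAILURE };
enum recovery_result { RESULT_NEED_RESET, RESULT_RECOVERED, RESULT_DISCONNECT };

/* Hypothetical device model; not the real mlx4 device structure. */
struct model_dev {
    enum channel_state channel; /* bus state as reported by the platform */
    bool loaded;                /* driver resources currently set up */
    int driver_resets;          /* resets issued by the internal catas path */
};

/* Plays the role of pci_channel_offline(): true when the slot cannot take I/O. */
static bool model_channel_offline(const struct model_dev *dev)
{
    assert(dev != NULL);
    return dev->channel != CHANNEL_NORMAL;
}

/*
 * Internal catastrophic-error poll. If a fatal error was seen while the
 * channel is offline, leave recovery to the bus-level callbacks instead
 * of resetting the device a second time. Returns true if it reset.
 */
static bool model_catas_poll(struct model_dev *dev, bool fatal_seen)
{
    if (!fatal_seen || model_channel_offline(dev))
        return false;
    dev->driver_resets++;
    return true;
}

/* Role of an error_detected callback: quiesce and report what is needed. */
static enum recovery_result model_error_detected(struct model_dev *dev,
                                                 enum channel_state state)
{
    dev->channel = state;
    dev->loaded = false; /* tear down without touching the bus */
    return state == CHANNEL_PERM_FAILURE ? RESULT_DISCONNECT
                                         : RESULT_NEED_RESET;
}

/* Role of a slot_reset callback: the slot is usable again, re-initialize. */
static enum recovery_result model_slot_reset(struct model_dev *dev)
{
    dev->channel = CHANNEL_NORMAL;
    dev->loaded = true;
    return RESULT_RECOVERED;
}
```

      In this model, a fatal error seen while the channel is frozen leaves driver_resets unchanged, so only one recovery path acts on the device at a time.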
  2. 24 Jul, 2012 39 commits