    powerpc/eeh: Fix crash caused by null eeh_dev · 2ef822c5
    Gavin Shan authored
    The problem was reported by Anton Blanchard: the kernel crashed
    when an EEH error hit a PCI device that had no device driver
    bound to it. I reproduced the problem on a Firebird-L machine
    with the "errinjct" utility. First, the device driver for the
    Emulex ethernet MAC was disabled in .config. Then a data parity
    error was forced on the Emulex ethernet MAC with "errinjct".
    The kernel crashed after issuing a couple of "lspci -v"
    commands.
    
    The root cause is that the PCI device, including its reference
    to the corresponding eeh device, is removed from the system
    while EEH does recovery. Afterwards, the PCI device is probed
    again and added back into the system. So it's not safe to
    retrieve the eeh device from the PCI device after the PCI
    device has been removed and before it has been added again.
    
    The patch fixes the issue by retrieving the eeh device from the
    OF node instead of the PCI device after the PCI device has been
    removed.
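
    The lifetime problem described above can be illustrated with a
    small user-space model. This is only a sketch, not the kernel
    code: every struct and function name below (model_eeh_dev,
    model_of_node, model_pci_dev, edev_from_pci, edev_from_of_node,
    model_pci_remove, model_pci_add) is hypothetical and stands in
    for the real kernel structures. It assumes the eeh device stays
    reachable from the OF node for the whole recovery, while the
    reference held by the PCI device goes away on removal.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct model_eeh_dev {
    int config_addr;
};

struct model_of_node {
    struct model_eeh_dev *edev;     /* assumed to outlive PCI hotplug */
};

struct model_pci_dev {
    struct model_of_node *of_node;
    struct model_eeh_dev *edev;     /* dropped when the device is removed */
};

/* Lookup through the PCI device: NULL between removal and re-add. */
static struct model_eeh_dev *edev_from_pci(const struct model_pci_dev *pdev)
{
    return pdev ? pdev->edev : NULL;
}

/* Lookup through the OF node: still valid while the PCI device is gone. */
static struct model_eeh_dev *edev_from_of_node(const struct model_of_node *dn)
{
    return dn ? dn->edev : NULL;
}

/* Models EEH recovery removing the PCI device from the system. */
static void model_pci_remove(struct model_pci_dev *pdev)
{
    pdev->edev = NULL;
}

/* Models the re-probe that adds the PCI device back. */
static void model_pci_add(struct model_pci_dev *pdev)
{
    assert(pdev->of_node != NULL);
    pdev->edev = pdev->of_node->edev;
}
```

    In this model, a caller that runs between model_pci_remove() and
    model_pci_add() gets NULL from edev_from_pci() but a valid pointer
    from edev_from_of_node(), which mirrors why the lookup is moved
    to the OF node.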
    
    Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
    Tested-by: Anton Blanchard <anton@samba.org>
    Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
eeh.c 34.6 KB