Commit 275237a6 authored by Gavin Shan's avatar Gavin Shan Committed by Sasha Levin

powerpc/eeh: Enable IO path on permanent error

[ Upstream commit 387bbc97 ]

We give up recovery on permanent error, simply shutdown the affected
devices and remove them. If the devices can't be put into quiet state,
they spew more traffic that is likely to cause another unexpected EEH
error. This was observed on "p8dtu2u" machine:

   0002:00:00.0 PCI bridge: IBM Device 03dc
   0002:01:00.0 Ethernet controller: Intel Corporation \
                Ethernet Controller X710/X557-AT 10GBASE-T (rev 02)
   0002:01:00.1 Ethernet controller: Intel Corporation \
                Ethernet Controller X710/X557-AT 10GBASE-T (rev 02)
   0002:01:00.2 Ethernet controller: Intel Corporation \
                Ethernet Controller X710/X557-AT 10GBASE-T (rev 02)
   0002:01:00.3 Ethernet controller: Intel Corporation \
                Ethernet Controller X710/X557-AT 10GBASE-T (rev 02)

On P8 PowerNV platform, the IO path is frozen when shutdowning the
devices, meaning the memory registers are inaccessible. It is why
the devices can't be put into quiet state before removing them.
This fixes the issue by enabling IO path prior to putting the devices
into quiet state.
Reported-by: default avatarPridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com>
Signed-off-by: default avatarGavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: default avatarRussell Currey <ruscur@russell.cc>
Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
parent 6b306861
...@@ -306,9 +306,17 @@ void eeh_slot_error_detail(struct eeh_pe *pe, int severity) ...@@ -306,9 +306,17 @@ void eeh_slot_error_detail(struct eeh_pe *pe, int severity)
* *
* For pHyp, we have to enable IO for log retrieval. Otherwise, * For pHyp, we have to enable IO for log retrieval. Otherwise,
* 0xFF's is always returned from PCI config space. * 0xFF's is always returned from PCI config space.
*
* When the @severity is EEH_LOG_PERM, the PE is going to be
* removed. Prior to that, the drivers for devices included in
* the PE will be closed. The drivers rely on working IO path
* to bring the devices to quiet state. Otherwise, PCI traffic
* from those devices after they are removed is like to cause
* another unexpected EEH error.
*/ */
if (!(pe->type & EEH_PE_PHB)) { if (!(pe->type & EEH_PE_PHB)) {
if (eeh_has_flag(EEH_ENABLE_IO_FOR_LOG)) if (eeh_has_flag(EEH_ENABLE_IO_FOR_LOG) ||
severity == EEH_LOG_PERM)
eeh_pci_enable(pe, EEH_OPT_THAW_MMIO); eeh_pci_enable(pe, EEH_OPT_THAW_MMIO);
/* /*
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment