- 05 Sep, 2019 14 commits
-
-
Oliver O'Halloran authored
Use the new eeh_dev_check and eeh_dev_break interfaces to test EEH recovery. Historically this has been done manually using platform specific EEH error injection facilities (e.g. via RTAS). However, documentation on how to use these facilities is haphazard at best and non-existent at worst so it's hard to develop a cross-platform test. The new debugfs interfaces allow the kernel to handle the platform specific details so we can write a more generic set of tests.

This patch adds the most basic of recovery tests where:

a) Errors are injected and recovered from sequentially.
b) Errors are not injected into PCI-PCI bridges, such as PCIe switches.
c) Errors are only injected into device function zero.
d) No errors are injected into Virtual Functions.

a), b) and c) are largely due to limitations of Linux's EEH support. EEH recovery is serialised in the EEH recovery thread which forces a). Similarly for c), multi-function PCI devices are almost always grouped into the same PE so injecting an error on one function exercises the same code paths. b) is because we currently more or less ignore PCI bridges during recovery and assume that the recovered topology will be the same as the original.

d) is due to the limits of the eeh_dev_break interface. With the current implementation we can't inject an error into a specific VF without potentially causing additional errors on other VFs. Due to the serialised recovery process we might end up timing out waiting for another function to recover before the function of interest is recovered. The platform specific error injection facilities are finer-grained and allow this capability, but doing that requires working out how to use those facilities first.

Basically, it's better than nothing and it's a base to build on.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190903101605.2890-15-oohall@gmail.com
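The target selection rules b), c) and d) from the commit above can be illustrated with a small filter. This is only a sketch under stated assumptions: the `Device` record, its field names and the helper functions are hypothetical and are not taken from the actual selftest, which this listing does not show.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Device:
    """Hypothetical record describing one PCI function."""
    address: str             # "<domain>:<bus>:<dev>.<fn>", e.g. "0001:01:00.0"
    is_bridge: bool = False  # PCI-PCI bridge, such as a PCIe switch port
    is_virtfn: bool = False  # SR-IOV Virtual Function


def function_number(address: str) -> int:
    """Return the function number, the part after the final dot."""
    return int(address.rsplit(".", 1)[1], 16)


def is_injectable(dev: Device) -> bool:
    """Apply rules b), c) and d): no bridges, function zero only, no VFs."""
    if dev.is_bridge:
        return False
    if dev.is_virtfn:
        return False
    return function_number(dev.address) == 0


def select_targets(devs: Iterable[Device]) -> List[str]:
    """Addresses to test one after another, which is what rule a) asks for."""
    return [d.address for d in devs if is_injectable(d)]
```

Rule a) is not enforced by the filter itself. It is reflected in returning a plain list that a caller would walk sequentially, waiting for each recovery to finish before injecting the next error.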
-
Oliver O'Halloran authored
Add an interface to debugfs for generating an EEH event on a given device. This works by disabling memory accesses to and from the device by setting the PCI_COMMAND register (or the VF Memory Space Enable on the parent PF). This is a somewhat portable alternative to using the platform specific error injection mechanisms since those tend to be either hard to use, or straight up broken. For pseries the interfaces also require the use of /dev/mem which is probably going to go away in a post-LOCKDOWN world (and it's a horrific hack to begin with) so moving to a kernel-provided interface makes sense and provides a sane, cross-platform interface for userspace so we can write more generic testing scripts. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190903101605.2890-14-oohall@gmail.com
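As a rough illustration of the register manipulation described above, the sketch below clears the Memory Space Enable bit in a 16-bit PCI command word. The value `0x2` for that bit comes from the PCI specification. The function names are hypothetical, and the assumption that only this one bit is touched is mine, since the commit message does not spell out the exact write.

```python
PCI_COMMAND_MEMORY = 0x2  # Memory Space Enable, bit 1 of the PCI command register


def memory_enabled(command: int) -> bool:
    """True if the device currently responds to memory space accesses."""
    return bool(command & PCI_COMMAND_MEMORY)


def break_device(command: int) -> int:
    """Return the command word with memory decoding disabled.

    Assumption: only the Memory Space Enable bit is cleared and all
    other bits of the 16-bit register are preserved.
    """
    return command & ~PCI_COMMAND_MEMORY & 0xFFFF
```

With memory decoding off, later MMIO accesses to the device fail, which is the condition that leads to the PE being frozen and an EEH event being raised.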
-
Oliver O'Halloran authored
Detecting a frozen EEH PE usually occurs when an MMIO load returns a 0xFFs response. When performing EEH testing using the EEH error injection feature available on some platforms there is no simple way to kick-off the kernel's recovery process since any accesses from userspace (usually /dev/mem) will bypass the MMIO helpers in the kernel which check if a 0xFF response is due to an EEH freeze or not. If a device contains a 0xFF byte in its config space it's possible to trigger the recovery process via config space read from userspace, but this is not a reliable method. If a driver is bound to the device and in use it will frequently trigger the MMIO check, but this is also inconsistent. To solve these problems this patch adds a debugfs file called "eeh_dev_check" which accepts a <domain>:<bus>:<dev>.<fn> string and runs eeh_dev_check_failure() on it. This is the same check that's done when the kernel gets a 0xFF result from a config or MMIO read with the added benefit that it can be reliably triggered from userspace. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190903101605.2890-13-oohall@gmail.com
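A userspace helper for the "eeh_dev_check" file might look like the sketch below. It only shows how a `<domain>:<bus>:<dev>.<fn>` string could be built and checked before being written. The helper names are hypothetical, the strict field widths in the pattern are my assumption about a conventional address layout, and the location of the file under debugfs is not stated in the commit, so it is passed in by the caller.

```python
import re
from pathlib import Path

# Assumed layout: 4 hex digits of domain, 2 of bus, 2 of device, 1 of function.
_ADDRESS = re.compile(r"[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]")


def format_address(domain: int, bus: int, dev: int, fn: int) -> str:
    """Build a <domain>:<bus>:<dev>.<fn> string from numeric fields."""
    return f"{domain:04x}:{bus:02x}:{dev:02x}.{fn:x}"


def is_valid_address(address: str) -> bool:
    """Check the string against the assumed address layout."""
    return _ADDRESS.fullmatch(address) is not None


def request_check(address: str, debugfs_file: str) -> None:
    """Write the address to the caller-supplied eeh_dev_check path."""
    if not is_valid_address(address):
        raise ValueError(f"not a PCI address: {address!r}")
    Path(debugfs_file).write_text(address)
```

Validating before writing keeps a malformed string from reaching the kernel interface at all, which makes failures in a test script easier to attribute.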
-
Oliver O'Halloran authored
I am the RAS team. Hear me roar. Roar. On a more serious note, being able to locate failed devices can be helpful. Set the attention indicator if the slot supports it once we've determined the device is present and only clear it if the device is fully recovered. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190903101605.2890-12-oohall@gmail.com
-
Oliver O'Halloran authored
pnv_php is generally used with PCIe bridges which provide a native interface for setting the attention and power indicator LEDs. Wire up those interfaces even if firmware does not have support for them (yet...) Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190903101605.2890-11-oohall@gmail.com
-
Oliver O'Halloran authored
Currently we check that an IODA2 compatible PHB is upstream of this slot. This is mainly to avoid pnv_php creating slots for the various "virtual PHBs" that we create for NVLink. There's no real need for this restriction so allow it on IODA3. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190903101605.2890-10-oohall@gmail.com
-
Oliver O'Halloran authored
When performing EEH recovery of devices in a hotplug slot we need to use the slot driver's ->reset_slot() callback to prevent spurious hotplug events due to spurious DLActive and PresDet change interrupts. Add a reset_slot() callback to pnv_php so we can handle recovery of devices in pnv_php managed slots. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190903101605.2890-9-oohall@gmail.com
-
Oliver O'Halloran authored
When we reset PCI devices managed by a hotplug driver the reset may generate spurious hotplug events that cause the PCI device we're resetting to be torn down accidentally. This is a problem for EEH (when the driver is EEH aware) since we want to leave the OS PCI device state intact so that the device can be re-set without losing any resources (network, disks, etc) provided by the driver. Generic PCI code provides the pci_bus_error_reset() function to handle resetting a PCI Device (or bus) by using the reset method provided by the hotplug slot driver. We can use this function if the EEH core has requested a hot reset (common case) without tripping over the hotplug driver. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190903101605.2890-8-oohall@gmail.com
-
Oliver O'Halloran authored
Support for switching CAPI cards into and out of CAPI mode was removed a while ago. Drop the comment since it's no longer relevant. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190903101605.2890-7-oohall@gmail.com
-
Oliver O'Halloran authored
Currently we print a stack trace in the event handler to help with debugging EEH issues. In the case of surprise hot-unplug this is unneeded, so we want to prevent printing the stack trace unless we know it's due to an actual device error. To accomplish this, we can save a stack trace at the point of detection and only print it once the EEH recovery handler has determined the freeze was due to an actual error. Since the whole point of this is to prevent spurious EEH output we also move a few prints out of the detection thread, or mark them as pr_debug so anyone interested can get output from eeh_dev_check_failure() if they want. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190903101605.2890-6-oohall@gmail.com
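The "capture at detection, print only once confirmed" idea can be modelled in a few lines of userspace code. This is a toy analogue rather than the kernel implementation: `FreezeEvent` and `report` are hypothetical names, and Python's `traceback` module stands in for the kernel's stack saving facilities.

```python
import traceback
from typing import List


class FreezeEvent:
    """Toy event that records where the freeze was detected."""

    def __init__(self, device: str):
        self.device = device
        # Save the caller's stack now; drop the frame for __init__ itself.
        self.stack = traceback.extract_stack()[:-1]


def report(event: FreezeEvent, real_error: bool) -> List[str]:
    """Return formatted stack lines only for a confirmed device error.

    For a surprise hot-unplug (real_error is False) nothing is emitted,
    which is the spurious output the commit wants to avoid.
    """
    if not real_error:
        return []
    return traceback.format_list(event.stack)
```

The cost of capturing the trace is paid on every detection, but the noisy part, printing it, happens only after the handler has decided the freeze was a genuine error.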
-
Oliver O'Halloran authored
When a device is surprise removed while undergoing IO we will probably get an EEH PE freeze due to MMIO timeouts and other errors. When a freeze is detected we send a recovery event to the EEH worker thread which will notify drivers, and perform recovery as needed. In the event of a hot-remove we don't want recovery to occur since there isn't a device to recover. The recovery process is fairly long due to the number of wait states (required by PCIe) which causes problems when devices are removed and replaced (e.g. hot swapping of U.2 NVMe drives). To determine if we need to skip the recovery process we can use the get_adapter_state() operation of the hotplug_slot to determine if the slot contains a device or not, and if the slot is empty we can skip recovery entirely. One thing to note is that the slot being EEH frozen does not prevent the hotplug driver from working. We don't have the EEH recovery thread remove any of the devices since it's assumed that the hotplug driver will handle tearing down the slot state. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190903101605.2890-5-oohall@gmail.com
-
Oliver O'Halloran authored
If a device is torn down by a hotplug slot driver it's marked as removed and marked as permanently failed. There's no point in trying to recover a permanently failed device so it should be considered un-actionable. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190903101605.2890-4-oohall@gmail.com
-
Oliver O'Halloran authored
When hot-adding devices we rely on the hotplug driver to create pci_dn's for the devices under the hotplug slot. Conversely, when hot-removing the driver will remove the pci_dn's that it created. This is a problem because the pci_dev is still live until its refcount drops to zero. This can happen if the driver is slow to tear down its internal state. Ideally, the driver would not attempt to perform any config accesses to the device once it's been marked as removed, but sometimes it happens. As a result, we might attempt to access the pci_dn for a device that has been torn down and the kernel may crash as a result. To fix this, don't free the pci_dn unless the corresponding pci_dev has been released. If the pci_dev is still live, then we mark the pci_dn with a flag that indicates the pci_dev's release function should free it. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190903101605.2890-3-oohall@gmail.com
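The deferred free described above follows a common ownership hand-off pattern, sketched here as a toy model. The class and attribute names (`Dn`, `Dev`, `dead`, `freed`) are hypothetical and chosen for illustration. They are not the kernel's structures or flag names.

```python
from typing import Optional


class Dn:
    """Toy stand-in for a pci_dn."""

    def __init__(self) -> None:
        self.dead = False   # flag: the device's release function should free this
        self.freed = False  # models the memory having been freed


class Dev:
    """Toy stand-in for a pci_dev that may outlive hot-removal."""

    def __init__(self, dn: Dn) -> None:
        self.dn = dn
        self.released = False

    def release(self) -> None:
        """Called when the last reference to the device is dropped."""
        self.released = True
        if self.dn.dead:
            self.dn.freed = True


def remove_dn(dn: Dn, dev: Optional[Dev]) -> None:
    """Hot-remove path: free now only if no live device still uses the dn."""
    if dev is None or dev.released:
        dn.freed = True
    else:
        dn.dead = True  # defer the free to Dev.release()
```

Whichever side finishes last performs the free, so a driver that is slow to tear down can still issue a late config access without touching freed memory.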
-
Oliver O'Halloran authored
When the last device in an eeh_pe is removed the eeh_pe structure itself (and any empty parents) are freed since they are no longer needed. This results in a crash when a hotplug driver is involved since the following may occur:

1. Device is surprise removed.
2. Driver performs an MMIO, which fails and queues an eeh_event.
3. Hotplug driver receives a hotplug interrupt and removes any pci_devs that were under the slot.
4. pci_dev is torn down and the eeh_pe is freed.
5. The EEH event handler thread processes the eeh_event and crashes since the eeh_pe pointer in the eeh_event structure is no longer valid.

Crashing is generally considered poor form. Instead of doing that use the fact that PEs are marked as EEH_PE_INVALID to keep them around until the end of the recovery cycle, at which point we can safely prune any empty PEs.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190903101605.2890-2-oohall@gmail.com
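The fix for the sequence above amounts to deferring the pruning of empty PEs until the recovery cycle ends. The toy model below illustrates that ordering only. `Pe`, `PeTable` and their methods are hypothetical names, and the real code tracks this with the EEH_PE_INVALID flag on kernel structures rather than a dictionary.

```python
from typing import Dict, Set


class Pe:
    """Toy stand-in for an eeh_pe."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.devices: Set[str] = set()
        self.invalid = False  # models EEH_PE_INVALID


class PeTable:
    """Holds PEs and prunes empty ones only when asked to."""

    def __init__(self) -> None:
        self.pes: Dict[str, Pe] = {}

    def add_device(self, pe_name: str, dev: str) -> None:
        pe = self.pes.setdefault(pe_name, Pe(pe_name))
        pe.devices.add(dev)
        pe.invalid = False

    def remove_device(self, pe_name: str, dev: str) -> None:
        """Removing the last device marks the PE invalid but keeps it."""
        pe = self.pes[pe_name]
        pe.devices.discard(dev)
        if not pe.devices:
            pe.invalid = True

    def prune(self) -> None:
        """End of the recovery cycle: drop PEs that are invalid and empty."""
        stale = [n for n, pe in self.pes.items() if pe.invalid and not pe.devices]
        for name in stale:
            del self.pes[name]
```

In this model an event queued at step 2 can still look up its PE at step 5, because the hot-remove in steps 3 and 4 only marks the PE invalid. The entry disappears once `prune()` runs at the end of the cycle.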
-
- 30 Aug, 2019 26 commits
-
-
Nicholas Piggin authored
This avoids 3 loads in the radix page fault case, 1 load in the hash fault case, and 2 loads in the hash miss page fault case. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190802105709.27696-37-npiggin@gmail.com
-
Nicholas Piggin authored
Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190802105709.27696-36-npiggin@gmail.com
-
Nicholas Piggin authored
It is clever, but the small code saving is not worth the spaghetti of jumping to a label in an expanded macro, particularly when the label is just a number rather than a descriptive name. So expand the INT_COMMON macro twice, once for the stack and no stack cases, and branch to those. The slight code size increase is worth the improved clarity of branches for this non-performance critical code. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190802105709.27696-35-npiggin@gmail.com
-
Nicholas Piggin authored
This better reflects the order in which the code is executed. No generated code change except BUG line number constants. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190802105709.27696-34-npiggin@gmail.com
-
Nicholas Piggin authored
Move DAR and DSISR saving to pt_regs into INT_COMMON. Also add an option to expand RECONCILE_IRQ_STATE. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190802105709.27696-33-npiggin@gmail.com
-
Nicholas Piggin authored
No generated code change except BUG line number constants. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190802105709.27696-32-npiggin@gmail.com
-
Nicholas Piggin authored
No generated code change except BUG line number constants. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190802105709.27696-31-npiggin@gmail.com
-
Nicholas Piggin authored
No generated code change except BUG line number constants. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190802105709.27696-30-npiggin@gmail.com
-
Nicholas Piggin authored
Merge EXCEPTION_PROLOG_COMMON_3 into EXCEPTION_PROLOG_COMMON_2. No generated code change except BUG line number constants. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190802105709.27696-29-npiggin@gmail.com
-
Nicholas Piggin authored
Also change argument name (n -> vec) to match others. No generated code change. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190802105709.27696-28-npiggin@gmail.com
-
Nicholas Piggin authored
Replace the 4 variants of cpp macros with one gas macro. No generated code change except BUG line number constants. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190802105709.27696-27-npiggin@gmail.com
-
Nicholas Piggin authored
Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190802105709.27696-26-npiggin@gmail.com
-
Nicholas Piggin authored
All other virt handlers have the prolog code in the virt vector rather than branch to the real vector. Follow this pattern in the denorm virt handler. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190802105709.27696-25-npiggin@gmail.com
-
Nicholas Piggin authored
EXCEPTION_PROLOG_0 and _1 have only a single caller, so expand them into it. Rename EXCEPTION_PROLOG_2_REAL to INT_SAVE_SRR_AND_JUMP and EXCEPTION_PROLOG_2_VIRT to INT_VIRT_SAVE_SRR_AND_JUMP, which are more descriptive. No generated code change except BUG line number constants. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190802105709.27696-24-npiggin@gmail.com
-
Michael Ellerman authored
The argument lists for the INT_HANDLER macro are getting a bit unwieldy. Use keyword parameters with default values to shorten them. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190830011426.16810-1-mpe@ellerman.id.au
-
Nicholas Piggin authored
This creates a single macro that generates the exception prolog code, with variants specified by arguments, rather than assorted nested macros for different variants. The increasing length of macro argument list is not nice to read or modify, but this is a temporary condition that will be improved in later changes. No generated code change except BUG line number constants and label names. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190802105709.27696-23-npiggin@gmail.com
-
Nicholas Piggin authored
This vector is not used by any supported processor, and has been implemented as an unknown exception going back to 2.6. There is nothing special about 0xb00, so remove it like other unused vectors. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190802105709.27696-22-npiggin@gmail.com
-
Nicholas Piggin authored
The perf virt handler uses EXCEPTION_PROLOG_2_REAL rather than _VIRT. In practice this is okay because the _REAL variant is usable by virt mode interrupts, but should be fixed (and is a performance win). Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190802105709.27696-21-npiggin@gmail.com
-
Nicholas Piggin authored
Add EXC_HV_OR_STD and use it to consolidate the 0x500 external interrupt. Executed code is unchanged. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190802105709.27696-20-npiggin@gmail.com
-
Nicholas Piggin authored
The head-64.h code should deal only with the head code sections and offset calculations. No generated code change except BUG line number constants. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190802105709.27696-19-npiggin@gmail.com
-
Nicholas Piggin authored
This buglet goes back to before the 64/32 arch merge, but it does not seem to have had practical consequences because bad_page_fault does not use the 2nd argument, but rather regs->dar/nip. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190802105709.27696-18-npiggin@gmail.com
-
Nicholas Piggin authored
Short forward and backward branches can be given number labels, but larger significant divergences in code path are more readable if they're given descriptive names. Also adjusts a comment to account for guest delivery. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190802105709.27696-17-npiggin@gmail.com
-
Nicholas Piggin authored
machine_check_early_common now branches to machine_check_handle_early which is its only caller. Move interleaving code out of the way, and remove the branch. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190802105709.27696-16-npiggin@gmail.com
-
Nicholas Piggin authored
Similarly to the previous change, all callers of the unrecoverable handler run relocated so can reach it with a direct branch. This makes it easy to move out of line, which makes the "normal" path less cluttered and easier to follow. MSR[ME] manipulation still requires the rfi, so that is moved out of line to its own function. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190802105709.27696-15-npiggin@gmail.com
-
Nicholas Piggin authored
machine_check_handle_early_common can reach machine_check_handle_early directly now that it runs at the relocated address, so just branch directly. The rfi sequence is required to enable MSR[ME] but that step is moved into a helper function, making the code easier to follow. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190802105709.27696-14-npiggin@gmail.com
-
Nicholas Piggin authored
Following convention, move the tramp code (unrelocated) above the common handlers (relocated). Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190802105709.27696-13-npiggin@gmail.com
-