Commit cf1575d1 authored by Paul Mackerras, committed by Linus Torvalds

[PATCH] ppc64: provide notifier list for EEH slot isolations

When the EEH (enhanced i/o error handling) hardware on pSeries detects
various kinds of PCI errors, it immediately freezes and isolates the slot
of the offending PCI card.  We get to know about that by noticing that
reads from the device return all-1s, and then we have to do a firmware call
to find out whether the all-1s value was due to a slot isolation.
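
The two-step detection described above can be sketched in plain userspace C. This is only an illustration under stated assumptions: `sketch_slot`, `sketch_firmware_says_isolated()` and `sketch_read_is_eeh_failure()` are hypothetical names that do not appear in the patch, and the firmware call is replaced by a stub that reads a flag. It shows that the firmware is asked only after a read returns all-1s, because that value by itself does not prove the slot was isolated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a PCI slot; not a kernel structure. */
struct sketch_slot {
	bool isolated;   /* what the stubbed firmware would report */
	int fw_calls;    /* how many times the firmware stub was queried */
};

/* Stub standing in for the firmware call mentioned in the commit message. */
bool sketch_firmware_says_isolated(struct sketch_slot *slot)
{
	slot->fw_calls++;
	return slot->isolated;
}

/*
 * Returns true only if a 32-bit read came back as all-1s AND the
 * firmware stub confirms the slot is isolated. Any other value
 * never triggers the firmware query.
 */
bool sketch_read_is_eeh_failure(uint32_t value, struct sketch_slot *slot)
{
	assert(slot != NULL);
	if (value != UINT32_MAX)
		return false;
	return sketch_firmware_says_isolated(slot);
}
```

A caller would pass every value read from the device through `sketch_read_is_eeh_failure()`; only the rare all-1s result costs a firmware query.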

This patch adds a notifier so that other parts of the system (e.g.  the RPA
PCI hotplug driver) can know that a slot isolation event has occurred and
take whatever recovery action is appropriate.  The notifier is called in a
workqueue function, although the read from the device that noticed the
all-1s value may have been at interrupt level.  As a precaution, if we keep
trying to read from the device at interrupt level, and do 1000 reads
without the workqueue getting a chance to run, we panic, on the grounds
that we presumably have a badly-written driver which will spin forever in
its interrupt routine, e.g.  waiting for a bit in a device register to go
to 0.
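
The deferral and the 1000-read safety limit can be modelled roughly as follows. This is a userspace sketch, not the patch's code: `sketch_freeze_state`, `sketch_frozen_read()`, `sketch_run_worker()` and `SKETCH_MAX_FAILS` are hypothetical names, the panic is replaced by a flag, and the exact boundary (whether the 1000th repeated read is the fatal one) is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_FAILS 1000  /* the 1000-read limit from the commit message */

/* Hypothetical per-slot bookkeeping; not the patch's data structures. */
struct sketch_freeze_state {
	bool isolated;        /* plays the role of an "isolated" mode bit */
	bool event_pending;   /* work queued but not yet run */
	int fail_count;       /* repeated reads since the worker last ran */
	int notifications;    /* times the notifier chain would have been called */
	bool panicked;        /* set instead of actually panicking */
};

/* Called for each read confirmed to hit a frozen slot, possibly at interrupt level. */
void sketch_frozen_read(struct sketch_freeze_state *st)
{
	assert(st != NULL);
	if (!st->isolated) {
		/* First detection: mark the slot and defer the notification. */
		st->isolated = true;
		st->event_pending = true;
		st->fail_count = 0;
		return;
	}
	/* Repeated read of an already isolated slot. */
	if (++st->fail_count >= SKETCH_MAX_FAILS)
		st->panicked = true;
}

/* Stands in for the workqueue function, which runs in process context. */
void sketch_run_worker(struct sketch_freeze_state *st)
{
	assert(st != NULL);
	if (!st->event_pending)
		return;
	st->event_pending = false;
	st->fail_count = 0;      /* the worker got to run, so reset the counter */
	st->notifications++;     /* real code would call the notifier chain here */
}
```

In this model a driver that polls a frozen device in a tight loop never lets `sketch_run_worker()` execute, so `fail_count` climbs to the limit; a driver that returns from its interrupt handler lets the worker run and the counter reset.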

This patch is based on an earlier patch by Linas Vepstas <linas@linas.org>.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
parent ffddc0d0
@@ -20,8 +20,10 @@
 #ifndef _PPC64_EEH_H
 #define _PPC64_EEH_H
-#include <linux/string.h>
 #include <linux/init.h>
+#include <linux/list.h>
+#include <linux/string.h>
+#include <linux/notifier.h>
 struct pci_dev;
 struct device_node;
@@ -29,6 +31,7 @@ struct device_node;
 /* Values for eeh_mode bits in device_node */
 #define EEH_MODE_SUPPORTED (1<<0)
 #define EEH_MODE_NOCHECK (1<<1)
+#define EEH_MODE_ISOLATED (1<<2)
 #ifdef CONFIG_PPC_PSERIES
 extern void __init eeh_init(void);
@@ -68,7 +71,28 @@ void eeh_remove_device(struct pci_dev *);
 #define EEH_RELEASE_DMA 3
 int eeh_set_option(struct pci_dev *dev, int options);
-/*
+/**
+ * Notifier event flags.
+ */
+#define EEH_NOTIFY_FREEZE 1
+/** EEH event -- structure holding pci slot data that describes
+ * a change in the isolation status of a PCI slot. A pointer
+ * to this struct is passed as the data pointer in a notify callback.
+ */
+struct eeh_event {
+	struct list_head list;
+	struct pci_dev *dev;
+	struct device_node *dn;
+	int reset_state;
+};
+/** Register to find out about EEH events. */
+int eeh_register_notifier(struct notifier_block *nb);
+int eeh_unregister_notifier(struct notifier_block *nb);
+/**
  * EEH_POSSIBLE_ERROR() -- test for possible MMIO failure.
  *
  * If this macro yields TRUE, the caller relays to eeh_check_failure()
......
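
To illustrate how a subscriber such as a hotplug driver might use the new interface, here is a minimal userspace model of the register/notify/unregister cycle. Everything prefixed `sketch_` is a hypothetical stand-in written for this example; in the kernel the real types are `struct notifier_block` from `<linux/notifier.h>` and `struct eeh_event` from the header in this diff, and registration goes through `eeh_register_notifier()`. The callback signature mirrors the kernel's `notifier_call` (block pointer, `unsigned long` action, `void *` data). That `EEH_NOTIFY_FREEZE` arrives as the action value is an assumption based on the header comments.

```c
#include <assert.h>
#include <stddef.h>

#define EEH_NOTIFY_FREEZE 1  /* same value as in the header in this diff */

/* Userspace stand-in for the kernel's struct notifier_block. */
struct sketch_notifier_block {
	int (*notifier_call)(struct sketch_notifier_block *self,
			     unsigned long action, void *data);
	struct sketch_notifier_block *next;
};

/* Reduced stand-in for struct eeh_event (list, dev and dn omitted). */
struct sketch_eeh_event {
	int reset_state;
};

struct sketch_notifier_block *sketch_chain;  /* head of the subscriber list */

int sketch_register_notifier(struct sketch_notifier_block *nb)
{
	assert(nb != NULL && nb->notifier_call != NULL);
	nb->next = sketch_chain;
	sketch_chain = nb;
	return 0;
}

int sketch_unregister_notifier(struct sketch_notifier_block *nb)
{
	struct sketch_notifier_block **p;

	for (p = &sketch_chain; *p != NULL; p = &(*p)->next) {
		if (*p == nb) {
			*p = nb->next;
			return 0;
		}
	}
	return -1;  /* not registered */
}

/* Deliver one event to every subscriber, as the workqueue function would. */
void sketch_notify(unsigned long action, struct sketch_eeh_event *event)
{
	struct sketch_notifier_block *nb;

	for (nb = sketch_chain; nb != NULL; nb = nb->next)
		nb->notifier_call(nb, action, event);
}

/* Example subscriber: counts freeze events instead of recovering a slot. */
int sketch_freezes_seen;

int sketch_hotplug_callback(struct sketch_notifier_block *self,
			    unsigned long action, void *data)
{
	struct sketch_eeh_event *event = data;

	(void)self;
	if (action == EEH_NOTIFY_FREEZE && event != NULL)
		sketch_freezes_seen++;  /* real code would start slot recovery */
	return 0;
}
```

A subscriber fills in a block with `.notifier_call = sketch_hotplug_callback`, registers it once at initialisation, and unregisters it on teardown; the event pointer is only valid for the duration of the callback in this model.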