    [PATCH] ppc64: provide notifier list for EEH slot isolations · cf1575d1
    Paul Mackerras authored
    When the EEH (Enhanced I/O Error Handling) hardware on pSeries detects
    various kinds of PCI errors, it immediately freezes and isolates the slot
    of the offending PCI card.  We get to know about that by noticing that
    reads from the device return all-1s, and then we have to do a firmware call
    to find out whether the all-1s value was due to a slot isolation.
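
    The detection step described above can be sketched as a small
    user-space simulation.  This is not the code in eeh.c: every sim_*
    name is hypothetical, and the firmware call is replaced by a plain
    flag.  It only illustrates that an all-1s read is a hint, and that
    the firmware query is what decides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a PCI slot; not a kernel structure. */
struct sim_slot {
	bool isolated;	/* set when the simulated hardware froze the slot */
	uint32_t reg;	/* value a healthy device would return */
};

/* Reads from a frozen slot return all-1s, as the commit message says. */
static uint32_t sim_read32(const struct sim_slot *slot)
{
	assert(slot != NULL);
	return slot->isolated ? 0xffffffffu : slot->reg;
}

/* Stand-in for the firmware call (assumption: here it is just a flag). */
static bool sim_firmware_slot_isolated(const struct sim_slot *slot)
{
	return slot->isolated;
}

/* Returns true only if an all-1s value was caused by a slot isolation. */
static bool sim_check_failure(const struct sim_slot *slot, uint32_t val)
{
	if (val != 0xffffffffu)
		return false;	/* ordinary data, no firmware query needed */
	return sim_firmware_slot_isolated(slot);
}
```

    A healthy device whose register really holds 0xffffffff is not
    reported as isolated in this sketch, because the simulated firmware
    query says the slot is fine.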
    
    This patch adds a notifier so that other parts of the system (e.g. the RPA
    PCI hotplug driver) can know that a slot isolation event has occurred and
    take whatever recovery action is appropriate.  The notifier is called in a
    workqueue function, although the read from the device that noticed the
    all-1s value may have been at interrupt level.  As a precaution, if we keep
    trying to read from the device at interrupt level, and do 1000 reads
    without the workqueue getting a chance to run, we panic, on the grounds
    that we presumably have a badly-written driver which will spin forever in
    its interrupt routine, e.g. waiting for a bit in a device register to go
    to 0.
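
    As a rough illustration of the flow described above, here is a
    user-space sketch: the read path only records the event and defers
    the notification, the deferred function calls the registered
    handlers, and a counter trips a simulated panic if the deferred
    function never gets to run.  All sim_* names are hypothetical, and
    the kernel's notifier and workqueue APIs are deliberately not used.
    The exact comparison against 1000 is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SIM_MAX_FAILS     1000	/* threshold named in the commit message */
#define SIM_MAX_NOTIFIERS 8	/* arbitrary limit for this sketch */

typedef void (*sim_notifier_fn)(int slot_id);

static sim_notifier_fn sim_chain[SIM_MAX_NOTIFIERS];
static size_t sim_chain_len;
static bool sim_work_pending;	/* deferred work has been scheduled */
static int sim_pending_slot;
static int sim_fail_count;	/* reads seen since the deferred work last ran */
static bool sim_panicked;	/* stands in for a real panic */
static int sim_last_notified = -1;

/* Example handler, e.g. what a hotplug driver might register. */
static void sim_example_handler(int slot_id)
{
	sim_last_notified = slot_id;
}

static int sim_register_notifier(sim_notifier_fn fn)
{
	assert(fn != NULL);
	if (sim_chain_len == SIM_MAX_NOTIFIERS)
		return -1;
	sim_chain[sim_chain_len++] = fn;
	return 0;
}

/* Read path: may run at "interrupt level", so it calls no handlers. */
static void sim_report_isolation(int slot_id)
{
	if (++sim_fail_count >= SIM_MAX_FAILS) {
		sim_panicked = true;	/* driver appears to be spinning */
		return;
	}
	sim_pending_slot = slot_id;
	sim_work_pending = true;
}

/* Deferred "workqueue function": the only place handlers are called. */
static void sim_run_deferred_work(void)
{
	size_t i;

	if (!sim_work_pending)
		return;
	sim_work_pending = false;
	sim_fail_count = 0;	/* the deferred work got a chance to run */
	for (i = 0; i < sim_chain_len; i++)
		sim_chain[i](sim_pending_slot);
}
```

    In this sketch a caller would register sim_example_handler, report
    an isolation from the read path, and see the handler run only once
    sim_run_deferred_work() is called.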
    
    This patch is based on an earlier patch by Linas Vepstas <linas@linas.org>.
    Signed-off-by: Paul Mackerras <paulus@samba.org>
    Signed-off-by: Andrew Morton <akpm@osdl.org>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
eeh.c 27.1 KB