• Borislav Petkov's avatar
    x86/nmi, EDAC: Get rid of DRAM error reporting thru PCI SERR NMI · db47d5f8
    Borislav Petkov authored
    Apparently, some machines used to report DRAM errors through a PCI SERR
    NMI. This is why we have a call into EDAC in the NMI handler. See
    
      c0d12172 ("drivers/edac: add new nmi rescan").
    
    From looking at the patch above, that's two drivers: e752x_edac.c and
    e7xxx_edac.c. Now, I wanna say those are old machines which are probably
    decommissioned already.
    
    Tony says that "[t]the newest CPU supported by either of those drivers
    is the Xeon E7520 (a.k.a. "Nehalem") released in Q1'2010. Possibly some
    folks are still using these ... but people that hold onto h/w for 7
    years generally cling to old s/w too ... so I'd guess it unlikely that
    we will get complaints for breaking these in upstream."
    
    So even if there is a small number still in use, we did load EDAC with
    edac_op_state == EDAC_OPSTATE_POLL by default (we still do, in fact)
    which means a default EDAC setup without any parameters supplied on the
    command line or otherwise would never even log the error in the NMI
    handler because we're polling by default:
    
      inline int edac_handler_set(void)
      {
             if (edac_op_state == EDAC_OPSTATE_POLL)
                     return 0;
    
             return atomic_read(&edac_handlers);
      }
    
    So, long story short, I'd like to get rid of that nastiness called
    edac_stub.c and confine all the EDAC drivers solely to drivers/edac/. If
    we ever have to do stuff like that again, it should be notifiers we're
    using and not some insanity like this one.
    Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
    Acked-by: default avatarThomas Gleixner <tglx@linutronix.de>
    Cc: Tony Luck <tony.luck@intel.com>
    db47d5f8
nmi.c 15.5 KB