    x86/mce: Do not overwrite MCi_STATUS in mce_no_way_out() · 1f74c8a6
    Borislav Petkov authored
    mce_no_way_out() does a quick check during #MC to see whether any of
    the MCEs logged would require the kernel to panic immediately. It is
    passed a struct mce into which MCi_STATUS gets written.
    
    However, after a valid status value has been saved, the next iteration
    of the loop over the MCA banks on the CPU overwrites it, because the
    struct mce itself is used as storage instead of a temporary variable.
    
    This leads to MCE records with an empty status value:
    
      mce: [Hardware Error]: CPU 0: Machine Check Exception: 6 Bank 0: 0000000000000000
      mce: [Hardware Error]: RIP 10:<ffffffffbd42fbd7> {trigger_mce+0x7/0x10}
    
    In order to prevent the loss of the status register value, return
    immediately when severity is a panic one so that we can panic
    immediately with the first fatal MCE logged. This is also the intention
    of this function and not to noodle over the banks while a fatal MCE is
    already logged.
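
The difference between the old and the new loop can be sketched in a small userspace model. This is not the kernel code: `struct mce_model`, `is_panic_severity()`, `no_way_out_old()`, `no_way_out_new()`, the bank count and the choice of "fatal" bit are all assumptions made for illustration, and the bank contents are passed in as an array instead of being read from MSRs.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_BANKS   4              /* hypothetical bank count for the model */
#define STATUS_VAL  (1ULL << 63)   /* models the "valid" bit of MCi_STATUS */
#define STATUS_PCC  (1ULL << 57)   /* models a bit that makes the error fatal */

/* Hypothetical stand-in for struct mce; only the status field is modelled. */
struct mce_model {
    uint64_t status;
};

/* Hypothetical severity check: a valid status with the fatal bit set. */
static int is_panic_severity(uint64_t status)
{
    return (status & STATUS_VAL) && (status & STATUS_PCC);
}

/*
 * Old behaviour: every iteration stores into m->status, so a later bank
 * clobbers the fatal value that an earlier bank reported.
 */
static int no_way_out_old(const uint64_t banks[NUM_BANKS], struct mce_model *m)
{
    int ret = 0;

    for (int i = 0; i < NUM_BANKS; i++) {
        m->status = banks[i];
        if (is_panic_severity(m->status))
            ret = 1;            /* remembered, but the loop keeps going */
    }
    return ret;
}

/*
 * New behaviour: return on the first fatal bank, so m->status still holds
 * the value that caused the panic decision.
 */
static int no_way_out_new(const uint64_t banks[NUM_BANKS], struct mce_model *m)
{
    for (int i = 0; i < NUM_BANKS; i++) {
        m->status = banks[i];
        if (!(m->status & STATUS_VAL))
            continue;           /* nothing logged in this bank */
        if (is_panic_severity(m->status))
            return 1;           /* stop scanning, status is preserved */
    }
    return 0;
}
```

With a fatal error in bank 1 only, both functions report that there is no way out, but only the second leaves the fatal status in the record; the first ends with the empty status of the last bank, which matches the all-zero value in the log above. The sketch does not model reading the remaining registers of the bank.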
    
    Tony: read the rest of the MCA bank to populate the struct mce fully.
    Suggested-by: Tony Luck <tony.luck@intel.com>
    Signed-off-by: Borislav Petkov <bp@suse.de>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Cc: <stable@vger.kernel.org>
    Link: https://lkml.kernel.org/r/20180622095428.626-8-bp@alien8.de
mce.c 56.8 KB