    x86/MCE: Check a hw error's address to determine proper recovery action · e40879b6
    Yazen Ghannam authored
    Make sure that machine check errors with a usable address are properly
    marked as poison.
    
    This is needed for memory errors that have MCG_STATUS[RIPV] clear,
    i.e., errors after which the interrupted process cannot be restarted
    reliably. One example is data poison consumption through the
    instruction fetch units on AMD Zen-based systems.
    
    The MF_MUST_KILL flag is passed to memory_failure() when
    MCG_STATUS[RIPV] is not set, so the associated process will still be
    killed. Practically, this removes one more check of
    kill_current_task, with the eventual goal of removing it completely.
    
    Also, make the handling identical to what is done on the notifier path
    (uc_decode_notifier() does that address usability check too).
    
    The scenario described above occurs when hardware can precisely identify
    the address of poisoned memory, but execution cannot reliably continue
    for the interrupted hardware thread.
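
    The decision described above can be sketched as a small user-space
    model. This is not the kernel code: every model_* name and every
    MODEL_* constant is hypothetical, addr_usable stands in for the result
    of the address usability check, and the "kill immediately" fallback
    for errors without a usable address is an assumption made for
    illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constants for this model only. */
#define MODEL_MCG_STATUS_RIPV (1ULL << 0)  /* restart IP valid */
#define MODEL_MF_MUST_KILL    (1 << 2)     /* always kill the affected task */

enum model_action {
    MODEL_KILL_NOW,     /* assumed fallback: nothing to mark as poison */
    MODEL_POISON_PAGE   /* hand the address to the memory-failure path */
};

struct model_mce {
    uint64_t mcgstatus;  /* modeled MCG_STATUS value */
    bool addr_usable;    /* stand-in for the address usability check */
};

struct model_decision {
    enum model_action action;
    int mf_flags;        /* flags that would go to the memory-failure path */
};

/*
 * Choose the recovery action from the address, not from RIPV.
 * RIPV only decides whether the must-kill flag is added.
 */
static struct model_decision model_recover(const struct model_mce *m)
{
    struct model_decision d = { MODEL_KILL_NOW, 0 };

    if (!m->addr_usable)
        return d;

    d.action = MODEL_POISON_PAGE;

    /* RIPV clear: execution cannot resume, so the task must still die. */
    if (!(m->mcgstatus & MODEL_MCG_STATUS_RIPV))
        d.mf_flags |= MODEL_MF_MUST_KILL;

    return d;
}
```

    In this model, an error with a usable address and RIPV clear is still
    marked as poison and carries the must-kill flag, which is the case
    described above.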
    
      [ bp: Massage commit message. ]
    Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
    Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
    Reviewed-by: Tony Luck <tony.luck@intel.com>
    Link: https://lore.kernel.org/r/20230322005131.174499-1-tony.luck@intel.com
core.c 67.5 KB