Commit d93b0ac0 authored by Mahesh Salgaonkar's avatar Mahesh Salgaonkar Committed by Michael Ellerman

powerpc/book3s/mce: Move add_taint() later in virtual mode

machine_check_early() gets called in real mode. The very first time when
add_taint() is called, it prints a warning which ends up calling opal
call (that uses OPAL_CALL wrapper) for writing it to console. If we get a
very first machine check while we are in opal we are doomed. OPAL_CALL
overwrites the PACASAVEDMSR in r13 and in this case when we are done with
MCE handling the original opal call will use this new MSR on it's way
back to opal_return. This usually leads to unexpected behaviour or the
kernel to panic. Instead move the add_taint() call later in the virtual
mode where it is safe to call.

This is broken with current FW level. We got lucky so far for not getting
very first MCE hit while in OPAL. But easily reproducible on Mambo.

Fixes: 27ea2c42 ("powerpc: Set the correct kernel taint on machine check errors.")
Cc: stable@vger.kernel.org # v4.2+
Signed-off-by: default avatarMahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
parent 3f2290e1
...@@ -221,6 +221,8 @@ static void machine_check_process_queued_event(struct irq_work *work) ...@@ -221,6 +221,8 @@ static void machine_check_process_queued_event(struct irq_work *work)
{ {
int index; int index;
add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
/* /*
* For now just print it to console. * For now just print it to console.
* TODO: log this error event to FSP or nvram. * TODO: log this error event to FSP or nvram.
......
...@@ -323,8 +323,6 @@ long machine_check_early(struct pt_regs *regs) ...@@ -323,8 +323,6 @@ long machine_check_early(struct pt_regs *regs)
__this_cpu_inc(irq_stat.mce_exceptions); __this_cpu_inc(irq_stat.mce_exceptions);
add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
if (cur_cpu_spec && cur_cpu_spec->machine_check_early) if (cur_cpu_spec && cur_cpu_spec->machine_check_early)
handled = cur_cpu_spec->machine_check_early(regs); handled = cur_cpu_spec->machine_check_early(regs);
return handled; return handled;
...@@ -758,6 +756,8 @@ void machine_check_exception(struct pt_regs *regs) ...@@ -758,6 +756,8 @@ void machine_check_exception(struct pt_regs *regs)
__this_cpu_inc(irq_stat.mce_exceptions); __this_cpu_inc(irq_stat.mce_exceptions);
add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
/* See if any machine dependent calls. In theory, we would want /* See if any machine dependent calls. In theory, we would want
* to call the CPU first, and call the ppc_md. one if the CPU * to call the CPU first, and call the ppc_md. one if the CPU
* one returns a positive number. However there is existing code * one returns a positive number. However there is existing code
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment