Commit 966d713e authored by Mahesh Salgaonkar, committed by Paul Mackerras

KVM: PPC: Book3S HV: Deliver machine check with MSR(RI=0) to guest as MCE

When a machine check interrupt occurs while we are in the guest, the KVM
layer attempts recovery. If recovery succeeds, the guest resumes normal
execution; if it fails, KVM delivers the machine check interrupt directly to
the guest. However, a machine check can also occur with MSR(RI=0) while we
are in the guest. In that case the interrupt is unrecoverable, and we must
deliver a machine check to the guest because the interrupt may have trashed
valid values in SRR0/1. The current implementation does not handle this
case, so the guest crashes with "Bad kernel stack pointer" instead of a
machine check oops message:

[26281.490060] Bad kernel stack pointer 3fff9ccce5b0 at c00000000000490c
[26281.490434] Oops: Bad kernel stack pointer, sig: 6 [#1]
[26281.490472] SMP NR_CPUS=2048 NUMA pSeries

This patch fixes the issue by checking for MSR(RI=0) in the KVM layer and
forwarding the unrecoverable interrupt to the guest, which then panics with
a proper machine check oops message.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
parent 224f3632
@@ -2377,7 +2377,6 @@ machine_check_realmode:
 	mr	r3, r9		/* get vcpu pointer */
 	bl	kvmppc_realmode_machine_check
 	nop
-	cmpdi	r3, 0		/* Did we handle MCE ? */
 	ld	r9, HSTATE_KVM_VCPU(r13)
 	li	r12, BOOK3S_INTERRUPT_MACHINE_CHECK
 	/*
@@ -2390,13 +2389,18 @@ machine_check_realmode:
 	 * The old code used to return to host for unhandled errors which
 	 * was causing guest to hang with soft lockups inside guest and
 	 * makes it difficult to recover guest instance.
+	 *
+	 * if we receive machine check with MSR(RI=0) then deliver it to
+	 * guest as machine check causing guest to crash.
 	 */
-	ld	r10, VCPU_PC(r9)
 	ld	r11, VCPU_MSR(r9)
+	andi.	r10, r11, MSR_RI	/* check for unrecoverable exception */
+	beq	1f			/* Deliver a machine check to guest */
+	ld	r10, VCPU_PC(r9)
+	cmpdi	r3, 0		/* Did we handle MCE ? */
 	bne	2f	/* Continue guest execution. */
 	/* If not, deliver a machine check. SRR0/1 are already set */
-	li	r10, BOOK3S_INTERRUPT_MACHINE_CHECK
-	ld	r11, VCPU_MSR(r9)
+1:	li	r10, BOOK3S_INTERRUPT_MACHINE_CHECK
 	bl	kvmppc_msr_interrupt
 2:	b	fast_interrupt_c_return
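
For readers less familiar with PowerPC assembly, the decision the patched sequence makes can be modelled in C. This is only an illustrative sketch, not kernel code: `enum mce_action`, `struct vcpu_model` and `mce_decide()` are hypothetical names introduced here. `handled` stands in for the value `kvmppc_realmode_machine_check` returns in r3. `MSR_RI` is redefined locally, on the assumption that it is the 0x2 mask the kernel uses for the MSR recoverable-interrupt bit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed mask for MSR[RI]; redefined locally so the sketch is standalone. */
#define MSR_RI 0x2ULL

/* Hypothetical names for the two exits of the assembly sequence. */
enum mce_action {
    MCE_RESUME_GUEST,     /* label 2: branch to fast_interrupt_c_return */
    MCE_DELIVER_TO_GUEST, /* label 1: kvmppc_msr_interrupt, then label 2 */
};

/* Hypothetical stand-in for the one vcpu field the check depends on. */
struct vcpu_model {
    uint64_t msr; /* value the assembly loads from VCPU_MSR(r9) */
};

/*
 * Models the branch order after the patch:
 *   1. RI clear  -> unrecoverable, deliver the machine check
 *   2. handled   -> continue guest execution
 *   3. otherwise -> deliver the machine check
 */
static enum mce_action mce_decide(const struct vcpu_model *vcpu, long handled)
{
    assert(vcpu != NULL);

    /* andi. r10, r11, MSR_RI ; beq 1f */
    if ((vcpu->msr & MSR_RI) == 0)
        return MCE_DELIVER_TO_GUEST;

    /* cmpdi r3, 0 ; bne 2f */
    if (handled != 0)
        return MCE_RESUME_GUEST;

    /* Fall through to label 1. */
    return MCE_DELIVER_TO_GUEST;
}
```

The point the model makes explicit is ordering: the RI test comes before the "did we handle it" test, so a successful recovery no longer resumes the guest when RI was clear. It also shows why the hunk can drop the second `ld r11, VCPU_MSR(r9)`: `andi.` writes its result to r10, so r11 still holds the MSR value when label 1 is reached.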