Commit 810bc075 authored by Andy Lutomirski, committed by Ingo Molnar

x86/nmi/64: Use DF to avoid userspace RSP confusing nested NMI detection

We have a tricky bug in the nested NMI code: if we see RSP
pointing to the NMI stack on NMI entry from kernel mode, we
assume that we are executing a nested NMI.

This isn't quite true.  A malicious userspace program can point
RSP at the NMI stack, issue SYSCALL, and arrange for an NMI to
happen while RSP is still pointing at the NMI stack.

Fix it with a sneaky trick.  Set DF in the region of code that
the RSP check is intended to detect.  IRET will clear DF
atomically.
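
As a rough illustration of the check described above (not the kernel's code, which is assembly), the post-fix decision can be modeled in C. The struct layout, the function name, and the stack bounds are hypothetical. The sketch also leaves out the separate "NMI executing" variable test and assumes the NMI is already known to have interrupted kernel mode.

```c
#include <stdbool.h>
#include <stdint.h>

#define X86_EFLAGS_DF 0x400u	/* direction flag, bit 10 of RFLAGS */

/* Hypothetical view of the hardware frame saved on NMI entry. */
struct saved_frame {
	uint64_t rip;
	uint64_t cs;
	uint64_t rflags;
	uint64_t rsp;
	uint64_t ss;
};

/*
 * Returns true only if the interrupted RSP lies within the NMI stack
 * AND the interrupted context had DF set.  SYSCALL is programmed to
 * mask DF, so a userspace-chosen RSP arrives with DF clear and is
 * treated as a first NMI rather than a nested one.
 */
static bool looks_like_nested_nmi(const struct saved_frame *f,
				  uint64_t nmi_stack_lo,
				  uint64_t nmi_stack_hi)
{
	if (f->rsp < nmi_stack_lo || f->rsp > nmi_stack_hi)
		return false;	/* not on the NMI stack: normal NMI */

	return (f->rflags & X86_EFLAGS_DF) != 0;
}
```

Under these assumptions, a frame whose RSP lies in the NMI stack but whose saved flags have DF clear is classified as a first NMI, which is exactly the case a malicious SYSCALL would produce.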

( Note: other than paravirt, there's little need for all this
  complexity. We could check RIP instead of RSP. )
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
parent a27507ca
@@ -1388,7 +1388,14 @@ ENTRY(nmi)
 	/*
 	 * Now test if the previous stack was an NMI stack. This covers
 	 * the case where we interrupt an outer NMI after it clears
-	 * "NMI executing" but before IRET.
+	 * "NMI executing" but before IRET. We need to be careful, though:
+	 * there is one case in which RSP could point to the NMI stack
+	 * despite there being no NMI active: naughty userspace controls
+	 * RSP at the very beginning of the SYSCALL targets. We can
+	 * pull a fast one on naughty userspace, though: we program
+	 * SYSCALL to mask DF, so userspace cannot cause DF to be set
+	 * if it controls the kernel's RSP. We set DF before we clear
+	 * "NMI executing".
 	 */
 	lea	6*8(%rsp), %rdx
 	/* Compare the NMI stack (rdx) with the stack we came from (4*8(%rsp)) */
@@ -1400,7 +1407,13 @@ ENTRY(nmi)
 	cmpq	%rdx, 4*8(%rsp)
 	/* If it is below the NMI stack, it is a normal NMI */
 	jb	first_nmi
-	/* Ah, it is within the NMI stack, treat it as nested */
+	/* Ah, it is within the NMI stack. */
+	testb	$(X86_EFLAGS_DF >> 8), (3*8 + 1)(%rsp)
+	jz	first_nmi	/* RSP was user controlled. */
+	/* This is a nested NMI. */
 nested_nmi:
 	/*
@@ -1506,8 +1519,16 @@ nmi_restore:
 	/* Point RSP at the "iret" frame. */
 	REMOVE_PT_GPREGS_FROM_STACK 6*8
-	/* Clear "NMI executing". */
-	movq	$0, 5*8(%rsp)
+	/*
+	 * Clear "NMI executing". Set DF first so that we can easily
+	 * distinguish the remaining code between here and IRET from
+	 * the SYSCALL entry and exit paths. On a native kernel, we
+	 * could just inspect RIP, but, on paravirt kernels,
+	 * INTERRUPT_RETURN can translate into a jump into a
+	 * hypercall page.
+	 */
+	std
+	movq	$0, 5*8(%rsp)		/* clear "NMI executing" */
 	/*
 	 * INTERRUPT_RETURN reads the "iret" frame and exits the NMI
......
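
The `testb $(X86_EFLAGS_DF >> 8), (3*8 + 1)(%rsp)` line packs some byte arithmetic into one instruction. DF is bit 10 of RFLAGS (0x400), so it sits in bit 2 of the second-lowest byte of the saved flags word. Assuming the saved RFLAGS quadword starts at `3*8(%rsp)` at that point in the handler (an inference from the offsets in the diff, not something the diff states), the `+ 1` selects that byte and `X86_EFLAGS_DF >> 8` is the matching mask, 0x04. A small C sketch of the equivalence, with hypothetical helper names:

```c
#include <stdbool.h>
#include <stdint.h>

#define X86_EFLAGS_DF 0x400u	/* direction flag, bit 10 of RFLAGS */

/* Test DF the way the assembly does: look only at byte 1 of the
 * little-endian flags word, using the mask shifted down by 8 bits. */
static bool df_set_bytewise(uint64_t rflags)
{
	/* Byte at offset +1 of a little-endian 64-bit value. */
	uint8_t byte1 = (uint8_t)((rflags >> 8) & 0xff);

	return (byte1 & (X86_EFLAGS_DF >> 8)) != 0;
}

/* Test DF against the whole flags word, for comparison. */
static bool df_set_wordwise(uint64_t rflags)
{
	return (rflags & X86_EFLAGS_DF) != 0;
}
```

Both helpers give the same answer for any flags value; the byte-wide form simply lets the assembly use a one-byte immediate.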