1. 03 Aug, 2022 2 commits
    • Pawan Gupta's avatar
      x86/speculation: Add LFENCE to RSB fill sequence · ba6e31af
      Pawan Gupta authored
      RSB fill sequence does not have any protection for miss-prediction of
      conditional branch at the end of the sequence. CPU can speculatively
      execute code immediately after the sequence, while RSB filling hasn't
      completed yet.
      
        #define __FILL_RETURN_BUFFER(reg, nr, sp)       \
                mov     $(nr/2), reg;                   \
        771:                                            \
                ANNOTATE_INTRA_FUNCTION_CALL;           \
                call    772f;                           \
        773:    /* speculation trap */                  \
                UNWIND_HINT_EMPTY;                      \
                pause;                                  \
                lfence;                                 \
                jmp     773b;                           \
        772:                                            \
                ANNOTATE_INTRA_FUNCTION_CALL;           \
                call    774f;                           \
        775:    /* speculation trap */                  \
                UNWIND_HINT_EMPTY;                      \
                pause;                                  \
                lfence;                                 \
                jmp     775b;                           \
        774:                                            \
                add     $(BITS_PER_LONG/8) * 2, sp;     \
                dec     reg;                            \
                jnz     771b;        <----- CPU can miss-predict here.
      
      Before RSB is filled, RETs that come in program order after this macro
      can be executed speculatively, making them vulnerable to RSB-based
      attacks.
      
      Mitigate it by adding an LFENCE after the conditional branch to prevent
      speculation while RSB is being filled.
      Suggested-by: default avatarAndrew Cooper <andrew.cooper3@citrix.com>
      Signed-off-by: default avatarPawan Gupta <pawan.kumar.gupta@linux.intel.com>
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      ba6e31af
    • Daniel Sneddon's avatar
      x86/speculation: Add RSB VM Exit protections · 2b129932
      Daniel Sneddon authored
      tl;dr: The Enhanced IBRS mitigation for Spectre v2 does not work as
      documented for RET instructions after VM exits. Mitigate it with a new
      one-entry RSB stuffing mechanism and a new LFENCE.
      
      == Background ==
      
      Indirect Branch Restricted Speculation (IBRS) was designed to help
      mitigate Branch Target Injection and Speculative Store Bypass, i.e.
      Spectre, attacks. IBRS prevents software run in less privileged modes
      from affecting branch prediction in more privileged modes. IBRS requires
      the MSR to be written on every privilege level change.
      
      To overcome some of the performance issues of IBRS, Enhanced IBRS was
      introduced.  eIBRS is an "always on" IBRS, in other words, just turn
      it on once instead of writing the MSR on every privilege level change.
      When eIBRS is enabled, more privileged modes should be protected from
      less privileged modes, including protecting VMMs from guests.
      
      == Problem ==
      
      Here's a simplification of how guests are run on Linux' KVM:
      
      void run_kvm_guest(void)
      {
      	// Prepare to run guest
      	VMRESUME();
      	// Clean up after guest runs
      }
      
      The execution flow for that would look something like this to the
      processor:
      
      1. Host-side: call run_kvm_guest()
      2. Host-side: VMRESUME
      3. Guest runs, does "CALL guest_function"
      4. VM exit, host runs again
      5. Host might make some "cleanup" function calls
      6. Host-side: RET from run_kvm_guest()
      
      Now, when back on the host, there are a couple of possible scenarios of
      post-guest activity the host needs to do before executing host code:
      
      * on pre-eIBRS hardware (legacy IBRS, or nothing at all), the RSB is not
      touched and Linux has to do a 32-entry stuffing.
      
      * on eIBRS hardware, VM exit with IBRS enabled, or restoring the host
      IBRS=1 shortly after VM exit, has a documented side effect of flushing
      the RSB except in this PBRSB situation where the software needs to stuff
      the last RSB entry "by hand".
      
      IOW, with eIBRS supported, host RET instructions should no longer be
      influenced by guest behavior after the host retires a single CALL
      instruction.
      
      However, if the RET instructions are "unbalanced" with CALLs after a VM
      exit as is the RET in #6, it might speculatively use the address for the
      instruction after the CALL in #3 as an RSB prediction. This is a problem
      since the (untrusted) guest controls this address.
      
      Balanced CALL/RET instruction pairs such as in step #5 are not affected.
      
      == Solution ==
      
      The PBRSB issue affects a wide variety of Intel processors which
      support eIBRS. But not all of them need mitigation. Today,
      X86_FEATURE_RSB_VMEXIT triggers an RSB filling sequence that mitigates
      PBRSB. Systems setting RSB_VMEXIT need no further mitigation - i.e.,
      eIBRS systems which enable legacy IBRS explicitly.
      
      However, such systems (X86_FEATURE_IBRS_ENHANCED) do not set RSB_VMEXIT
      and most of them need a new mitigation.
      
      Therefore, introduce a new feature flag X86_FEATURE_RSB_VMEXIT_LITE
      which triggers a lighter-weight PBRSB mitigation versus RSB_VMEXIT.
      
      The lighter-weight mitigation performs a CALL instruction which is
      immediately followed by a speculative execution barrier (INT3). This
      steers speculative execution to the barrier -- just like a retpoline
      -- which ensures that speculation can never reach an unbalanced RET.
      Then, ensure this CALL is retired before continuing execution with an
      LFENCE.
      
      In other words, the window of exposure is opened at VM exit where RET
      behavior is troublesome. While the window is open, force RSB predictions
      sampling for RET targets to a dead end at the INT3. Close the window
      with the LFENCE.
      
      There is a subset of eIBRS systems which are not vulnerable to PBRSB.
      Add these systems to the cpu_vuln_whitelist[] as NO_EIBRS_PBRSB.
      Future systems that aren't vulnerable will set ARCH_CAP_PBRSB_NO.
      
        [ bp: Massage, incorporate review comments from Andy Cooper. ]
      Signed-off-by: default avatarDaniel Sneddon <daniel.sneddon@linux.intel.com>
      Co-developed-by: default avatarPawan Gupta <pawan.kumar.gupta@linux.intel.com>
      Signed-off-by: default avatarPawan Gupta <pawan.kumar.gupta@linux.intel.com>
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      2b129932
  2. 31 Jul, 2022 6 commits
  3. 30 Jul, 2022 2 commits
  4. 29 Jul, 2022 30 commits