  1. 26 Aug, 2020 1 commit
  2. 27 Jul, 2020 1 commit
  3. 08 Jul, 2020 1 commit
  4. 15 Jun, 2020 1 commit
  5. 20 Apr, 2020 1 commit
    • x86/speculation: Add Special Register Buffer Data Sampling (SRBDS) mitigation · 7e5b3c26
      Mark Gross authored
      SRBDS is an MDS-like speculative side channel that can leak bits from the
      random number generator (RNG) across cores and threads. New microcode
      serializes the processor access during the execution of RDRAND and
      RDSEED. This ensures that the shared buffer is overwritten before it is
      released for reuse.
      
      While it is present on all affected CPU models, the microcode mitigation
      is not needed on models that enumerate ARCH_CAPABILITIES[MDS_NO] in the
      cases where TSX is not supported or has been disabled with TSX_CTRL.
      
      The mitigation is activated by default on affected processors and it
      increases latency for RDRAND and RDSEED instructions. Among other
      effects this will reduce throughput from /dev/urandom.
      
      * Enable the administrator to turn the mitigation off when desired using
        either mitigations=off or srbds=off.
      
      * Export vulnerability status via sysfs
      
      * Rename file-scoped macros to apply for non-whitelist table initializations.
      
       [ bp: Massage,
         - s/VULNBL_INTEL_STEPPING/VULNBL_INTEL_STEPPINGS/g,
         - do not read arch cap MSR a second time in tsx_fused_off() - just pass it in,
         - flip check in cpu_set_bug_bits() to save an indentation level,
         - reflow comments.
         jpoimboe: s/Mitigated/Mitigation/ in user-visible strings
         tglx: Dropped the fused off magic for now
       ]
      Signed-off-by: Mark Gross <mgross@linux.intel.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Tony Luck <tony.luck@intel.com>
      Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
      Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
      7e5b3c26
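
      [ Illustration, not part of the commit: a minimal userspace sketch that
        reads the mitigation state this patch exports; it assumes the
        /sys/devices/system/cpu/vulnerabilities/srbds entry added by this
        series is present on the running kernel. ]

      #include <stdio.h>

      int main(void)
      {
              char buf[128];
              FILE *f = fopen("/sys/devices/system/cpu/vulnerabilities/srbds", "r");

              if (!f) {
                      perror("srbds entry missing (older kernel or unaffected arch)");
                      return 1;
              }
              if (fgets(buf, sizeof(buf), f))
                      printf("SRBDS status: %s", buf);  /* e.g. "Mitigation: Microcode" */
              fclose(f);
              return 0;
      }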
  6. 22 Mar, 2020 1 commit
  7. 12 Mar, 2020 1 commit
  8. 20 Feb, 2020 1 commit
    • x86/split_lock: Enable split lock detection by kernel · 6650cdd9
      Peter Zijlstra (Intel) authored
      A split-lock occurs when an atomic instruction operates on data that spans
      two cache lines. In order to maintain atomicity the core takes a global bus
      lock.
      
      This is typically >1000 cycles slower than an atomic operation within a
      cache line. It also disrupts performance on other cores (which must wait
      for the bus lock to be released before their memory operations can
      complete). For real-time systems this may mean missing deadlines. For other
      systems it may just be very annoying.
      
      Some CPUs have the capability to raise an #AC trap when a split lock is
      attempted.
      
      Provide a command line option to give the user choices on how to handle
      this:
      
      split_lock_detect=
      	off	- not enabled (no traps for split locks)
      	warn	- warn once when an application does a
      		  split lock, but allow it to continue
      		  running.
      	fatal	- Send SIGBUS to applications that cause split lock
      
      On systems that support split lock detection the default is "warn". Note
      that if the kernel hits a split lock in any mode other than "off" it will
      oops.
      
      One implementation wrinkle is that the MSR to control the split lock
      detection is per-core, not per thread. This might result in some short
      lived races on HT systems in "warn" mode if Linux tries to enable on one
      thread while disabling on the other. Race analysis by Sean Christopherson:
      
        - Toggling of split-lock is only done in "warn" mode.  Worst case
          scenario of a race is that a misbehaving task will generate multiple
          #AC exceptions on the same instruction.  And this race will only occur
          if both siblings are running tasks that generate split-lock #ACs, e.g.
          a race where sibling threads are writing different values will only
          occur if CPUx is disabling split-lock after an #AC and CPUy is
          re-enabling split-lock after *its* previous task generated an #AC.
        - Transitioning between off/warn/fatal modes at runtime isn't supported
          and disabling is tracked per task, so hardware will always reach a steady
          state that matches the configured mode.  I.e. split-lock is guaranteed to
          be enabled in hardware once all _TIF_SLD threads have been scheduled out.
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
      Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
      Co-developed-by: Tony Luck <tony.luck@intel.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Link: https://lore.kernel.org/r/20200126200535.GB30377@agluck-desk2.amr.corp.intel.com
      6650cdd9
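
      [ Illustration, not part of the commit: a minimal userspace sketch of the
        kind of access the detector flags, assuming 64-byte cache lines. With
        split_lock_detect=warn this should produce the one-time kernel warning;
        with =fatal the process receives SIGBUS. ]

      #include <stdint.h>
      #include <stdio.h>
      #include <stdlib.h>

      int main(void)
      {
              void *buf = aligned_alloc(64, 128);     /* cache-line aligned buffer */

              if (!buf)
                      return 1;

              /* A 4-byte object at offset 62 deliberately straddles two lines. */
              volatile uint32_t *split = (volatile uint32_t *)((char *)buf + 62);

              /* LOCK-prefixed RMW on x86; crossing the line makes it a split lock. */
              __atomic_fetch_add(split, 1, __ATOMIC_SEQ_CST);

              printf("value after split-locked add: %u\n", *split);
              free(buf);
              return 0;
      }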
  9. 13 Jan, 2020 1 commit
    • x86/cpufeatures: Add flag to track whether MSR IA32_FEAT_CTL is configured · 85c17291
      Sean Christopherson authored
      Add a new feature flag, X86_FEATURE_MSR_IA32_FEAT_CTL, to track whether
      IA32_FEAT_CTL has been initialized.  This will allow KVM, and any future
      subsystems that depend on IA32_FEAT_CTL, to rely purely on cpufeatures
      to query platform support, e.g. allows a future patch to remove KVM's
      manual IA32_FEAT_CTL MSR checks.
      
      Various features (on platforms that support IA32_FEAT_CTL) are dependent
      on IA32_FEAT_CTL being configured and locked, e.g. VMX and LMCE.  The
      MSR is always configured during boot, but only if the CPU vendor is
      recognized by the kernel.  Because CPUID doesn't incorporate the current
      IA32_FEAT_CTL value in its reporting of relevant features, it's possible
      for a feature to be reported as supported in cpufeatures but not truly
      enabled, e.g. if the CPU supports VMX but the kernel doesn't recognize
      the CPU.
      
      As a result, without the flag, KVM would see VMX as supported even if
      IA32_FEAT_CTL hasn't been initialized, and so would need to manually
      read the MSR and check the various enabling bits to avoid taking an
      unexpected #GP on VMXON.
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Link: https://lkml.kernel.org/r/20191221044513.21680-14-sean.j.christopherson@intel.com
      85c17291
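
      [ Illustration, not part of the commit: a sketch of the manual check the
        new flag lets callers avoid, reading IA32_FEAT_CTL (MSR 0x3a) via the
        msr driver and testing the lock and VMX-enable bits. Assumes root and a
        loaded msr module; bit layout per the SDM. ]

      #include <fcntl.h>
      #include <stdint.h>
      #include <stdio.h>
      #include <unistd.h>

      #define MSR_IA32_FEAT_CTL 0x3a  /* bit 0: lock, bit 2: VMX outside SMX */

      int main(void)
      {
              uint64_t val;
              int fd = open("/dev/cpu/0/msr", O_RDONLY);

              if (fd < 0 || pread(fd, &val, sizeof(val), MSR_IA32_FEAT_CTL) != sizeof(val)) {
                      perror("IA32_FEAT_CTL");
                      return 1;
              }
              printf("locked: %lu, VMX outside SMX enabled: %lu\n",
                     (unsigned long)(val & 1), (unsigned long)((val >> 2) & 1));
              close(fd);
              return 0;
      }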
  10. 08 Jan, 2020 1 commit
    • x86/cpufeatures: Add support for fast short REP MOVSB · f444a5ff
      Tony Luck authored
      From the Intel Optimization Reference Manual:
      
      3.7.6.1 Fast Short REP MOVSB
      Beginning with processors based on Ice Lake Client microarchitecture,
      REP MOVSB performance of short operations is enhanced. The enhancement
      applies to string lengths between 1 and 128 bytes long.  Support for
      fast-short REP MOVSB is enumerated by the CPUID feature flag:
      CPUID.(EAX=7H, ECX=0H):EDX.FAST_SHORT_REP_MOVSB[bit 4] = 1. There is no change
      in the REP STOS performance.
      
      Add an X86_FEATURE_FSRM flag for this.
      
      memmove() avoids REP MOVSB for short (< 32 byte) copies. Check FSRM and
      use REP MOVSB for short copies on systems that support it.
      
       [ bp: Massage and add comment. ]
      Signed-off-by: Tony Luck <tony.luck@intel.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Link: https://lkml.kernel.org/r/20191216214254.26492-1-tony.luck@intel.com
      f444a5ff
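
      [ Illustration, not part of the commit: the same enumeration check from
        userspace through the compiler's <cpuid.h> (GCC/Clang assumed); FSRM is
        CPUID.(EAX=7,ECX=0):EDX[4], matching the X86_FEATURE_FSRM bit added
        here. ]

      #include <cpuid.h>
      #include <stdio.h>

      int main(void)
      {
              unsigned int eax, ebx, ecx, edx;

              if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
                      return 1;
              printf("FSRM (fast short REP MOVSB): %s\n",
                     (edx & (1u << 4)) ? "supported" : "not supported");
              return 0;
      }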
  11. 04 Nov, 2019 1 commit
  12. 28 Oct, 2019 1 commit
    • x86/speculation/taa: Add mitigation for TSX Async Abort · 1b42f017
      Pawan Gupta authored
      TSX Async Abort (TAA) is a side channel vulnerability to the internal
      buffers in some Intel processors similar to Microarchitectural Data
      Sampling (MDS). In this case, certain loads may speculatively pass
      invalid data to dependent operations when an asynchronous abort
      condition is pending in a TSX transaction.
      
      This includes loads with no fault or assist condition. Such loads may
      speculatively expose stale data from the uarch data structures as in
      MDS. The scope of exposure covers both same-thread and cross-thread cases. This
      issue affects all current processors that support TSX, but do not have
      ARCH_CAP_TAA_NO (bit 8) set in MSR_IA32_ARCH_CAPABILITIES.
      
      On CPUs which have their IA32_ARCH_CAPABILITIES MSR bit MDS_NO=0,
      CPUID.MD_CLEAR=1 and the MDS mitigation is clearing the CPU buffers
      using VERW or L1D_FLUSH, there is no additional mitigation needed for
      TAA. On affected CPUs with MDS_NO=1 this issue can be mitigated by
      disabling the Transactional Synchronization Extensions (TSX) feature.
      
      A new MSR, IA32_TSX_CTRL, available on future processors and on current
      processors after a microcode update, can be used to control the TSX
      feature. There are two bits in that MSR:
      
      * TSX_CTRL_RTM_DISABLE disables the TSX sub-feature Restricted
      Transactional Memory (RTM).
      
      * TSX_CTRL_CPUID_CLEAR clears the RTM enumeration in CPUID. The other
      TSX sub-feature, Hardware Lock Elision (HLE), is unconditionally
      disabled with updated microcode but still enumerated as present by
      CPUID(EAX=7).EBX{bit4}.
      
      The second mitigation approach is similar to MDS which is clearing the
      affected CPU buffers on return to user space and when entering a guest.
      Relevant microcode update is required for the mitigation to work.  More
      details on this approach can be found here:
      
        https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html
      
      The TSX feature can be controlled by the "tsx" command line parameter.
      If it is force-enabled then "Clear CPU buffers" (MDS mitigation) is
      deployed. The effective mitigation state can be read from sysfs.
      
       [ bp:
         - massage + comments cleanup
         - s/TAA_MITIGATION_TSX_DISABLE/TAA_MITIGATION_TSX_DISABLED/g - Josh.
         - remove partial TAA mitigation in update_mds_branch_idle() - Josh.
         - s/tsx_async_abort_cmdline/tsx_async_abort_parse_cmdline/g
       ]
      Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
      1b42f017
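
      [ Illustration, not part of the commit: a sketch of the affectedness test
        described above, reading MSR_IA32_ARCH_CAPABILITIES (0x10a) through the
        msr driver and checking MDS_NO (bit 5) and TAA_NO (bit 8). Assumes root
        and a loaded msr module. ]

      #include <fcntl.h>
      #include <stdint.h>
      #include <stdio.h>
      #include <unistd.h>

      #define MSR_IA32_ARCH_CAPABILITIES 0x10a

      int main(void)
      {
              uint64_t caps;
              int fd = open("/dev/cpu/0/msr", O_RDONLY);

              if (fd < 0 || pread(fd, &caps, sizeof(caps), MSR_IA32_ARCH_CAPABILITIES) != sizeof(caps)) {
                      perror("ARCH_CAPABILITIES");
                      return 1;
              }
              printf("MDS_NO=%lu TAA_NO=%lu\n",
                     (unsigned long)((caps >> 5) & 1), (unsigned long)((caps >> 8) & 1));
              close(fd);
              return 0;
      }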
  13. 08 Oct, 2019 1 commit
    • x86/cpufeatures: Add feature bit RDPRU on AMD · 9d40b85b
      Babu Moger authored
      AMD Zen 2 introduces a new RDPRU instruction which is used to give
      access to some processor registers that are typically only accessible
      when the privilege level is zero.
      
      ECX is used as the implicit register to specify which register to read.
      RDPRU places the specified register’s value into EDX:EAX.
      
      For example, the RDPRU instruction can be used to read MPERF and APERF
      at CPL > 0.
      
      Add the feature bit so it is visible in /proc/cpuinfo.
      
      Details are available in the AMD64 Architecture Programmer’s Manual:
      https://www.amd.com/system/files/TechDocs/24594.pdf
      Signed-off-by: Babu Moger <babu.moger@amd.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Aaron Lewis <aaronlewis@google.com>
      Cc: ak@linux.intel.com
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
      Cc: robert.hu@linux.intel.com
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Hellstrom <thellstrom@vmware.com>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/20191007204839.5727.10803.stgit@localhost.localdomain
      9d40b85b
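
      [ Illustration, not part of the commit: a hypothetical userspace use of
        RDPRU, emitted as raw opcode bytes so it assembles with older
        toolchains. ECX=0 selects MPERF and ECX=1 selects APERF; the
        instruction raises #UD on CPUs without the RDPRU feature bit. ]

      #include <stdint.h>
      #include <stdio.h>

      static uint64_t rdpru(uint32_t reg)     /* 0 = MPERF, 1 = APERF */
      {
              uint32_t lo, hi;

              asm volatile(".byte 0x0f, 0x01, 0xfd"   /* RDPRU */
                           : "=a"(lo), "=d"(hi)
                           : "c"(reg));
              return ((uint64_t)hi << 32) | lo;
      }

      int main(void)
      {
              printf("MPERF: %llu\n", (unsigned long long)rdpru(0));
              printf("APERF: %llu\n", (unsigned long long)rdpru(1));
              return 0;
      }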
  14. 28 Aug, 2019 1 commit
    • x86/vmware: Add a header file for hypercall definitions · b4dd4f6e
      Thomas Hellstrom authored
      The new header is intended to be used by drivers using the backdoor.
      Follow the KVM example using alternatives self-patching to choose
      between vmcall, vmmcall and io instructions.
      
      Also define two new CPU feature flags to indicate hypervisor support
      for vmcall- and vmmcall instructions. The new X86_FEATURE_VMW_VMMCALL
      flag is needed because using X86_FEATURE_VMMCALL might break QEMU/KVM
      setups using the vmmouse driver. They rely on X86_FEATURE_VMMCALL
      on AMD to get the kvm_hypercall() right. But they do not yet implement
      vmmcall for the VMware hypercall used by the vmmouse driver.
      
       [ bp: reflow hypercall %edx usage explanation comment. ]
      Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Reviewed-by: Doug Covelli <dcovelli@vmware.com>
      Cc: Aaron Lewis <aaronlewis@google.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: linux-graphics-maintainer@vmware.com
      Cc: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Cc: Nicolas Ferre <nicolas.ferre@microchip.com>
      Cc: Robert Hoo <robert.hu@linux.intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: virtualization@lists.linux-foundation.org
      Cc: <pv-drivers@vmware.com>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/20190828080353.12658-3-thomas_os@shipmail.org
      b4dd4f6e
  15. 28 Jul, 2019 1 commit
  16. 22 Jul, 2019 2 commits
    • x86: Remove X86_FEATURE_MFENCE_RDTSC · be261ffc
      Josh Poimboeuf authored
      AMD and Intel both have serializing lfence (X86_FEATURE_LFENCE_RDTSC).
      They've both had it for a long time, and AMD has had it enabled in Linux
      since Spectre v1 was announced.
      
      Back then, there was a proposal to remove the serializing mfence feature
      bit (X86_FEATURE_MFENCE_RDTSC), since both AMD and Intel have
      serializing lfence.  At the time, it was (ahem) speculated that some
      hypervisors might not yet support its removal, so it remained for the
      time being.
      
      Now a year-and-a-half later, it should be safe to remove.
      
      I asked Andrew Cooper about whether it's still needed:
      
        So if you're virtualised, you've got no choice in the matter.  lfence
        is either dispatch-serialising or not on AMD, and you won't be able to
        change it.
      
        Furthermore, you can't accurately tell what state the bit is in, because
        the MSR might not be virtualised at all, or may not reflect the true
        state in hardware.  Worse still, attempting to set the bit may not be
        successful even if there isn't a fault for doing so.
      
        Xen sets the DE_CFG bit unconditionally, as does Linux by the looks of
        things (see MSR_F10H_DECFG_LFENCE_SERIALIZE_BIT).  ISTR other hypervisor
        vendors saying the same, but I don't have any information to hand.
      
        If you are running under a hypervisor which has been updated, then
        lfence will almost certainly be dispatch-serialising in practice, and
        you'll almost certainly see the bit already set in DE_CFG.  If you're
        running under a hypervisor which hasn't been patched since Spectre,
        you've already lost in many more ways.
      
        I'd argue that X86_FEATURE_MFENCE_RDTSC is not worth keeping.
      
      So remove it.  This will reduce some code rot, and also make it easier
      to hook barrier_nospec() up to a cmdline disable for performance
      raisins, without having to need an alternative_3() macro.
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Link: https://lkml.kernel.org/r/d990aa51e40063acb9888e8c1b688e41355a9588.1562255067.git.jpoimboe@redhat.com
      be261ffc
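
      [ Illustration, not part of the commit: the ordering idiom that remains
        after this removal, an LFENCE ahead of RDTSC, which
        X86_FEATURE_LFENCE_RDTSC documents as dispatch-serializing on both
        vendors. ]

      #include <stdint.h>
      #include <stdio.h>

      static uint64_t rdtsc_ordered(void)
      {
              uint32_t lo, hi;

              asm volatile("lfence; rdtsc" : "=a"(lo), "=d"(hi) : : "memory");
              return ((uint64_t)hi << 32) | lo;
      }

      int main(void)
      {
              uint64_t t0 = rdtsc_ordered();
              uint64_t t1 = rdtsc_ordered();

              printf("delta: %llu cycles\n", (unsigned long long)(t1 - t0));
              return 0;
      }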
    • x86/cpufeatures: Enable a new AVX512 CPU feature · 018ebca8
      Gayatri Kammela authored
      Add a new AVX512 instruction group/feature for enumeration in
      /proc/cpuinfo: AVX512_VP2INTERSECT.
      
      CPUID.(EAX=7,ECX=0):EDX[bit 8]  AVX512_VP2INTERSECT
      
      Detailed information of CPUID bits for this feature can be found in
      the Intel Architecture Instruction Set Extensions Programming Reference
      document (refer to Table 1-2). A copy of this document is available at
      https://bugzilla.kernel.org/show_bug.cgi?id=204215.
      Signed-off-by: Gayatri Kammela <gayatri.kammela@intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Link: https://lkml.kernel.org/r/20190717234632.32673-3-gayatri.kammela@intel.com
      018ebca8
  17. 09 Jul, 2019 1 commit
    • x86/speculation: Prepare entry code for Spectre v1 swapgs mitigations · 18ec54fd
      Josh Poimboeuf authored
      Spectre v1 isn't only about array bounds checks.  It can affect any
      conditional checks.  The kernel entry code interrupt, exception, and NMI
      handlers all have conditional swapgs checks.  Those may be problematic in
      the context of Spectre v1, as kernel code can speculatively run with a user
      GS.
      
      For example:
      
      	if (coming from user space)
      		swapgs
      	mov %gs:<percpu_offset>, %reg
      	mov (%reg), %reg1
      
      When coming from user space, the CPU can speculatively skip the swapgs, and
      then do a speculative percpu load using the user GS value.  So the user can
      speculatively force a read of any kernel value.  If a gadget exists which
      uses the percpu value as an address in another load/store, then the
      contents of the kernel value may become visible via an L1 side channel
      attack.
      
      A similar attack exists when coming from kernel space.  The CPU can
      speculatively do the swapgs, causing the user GS to get used for the rest
      of the speculative window.
      
      The mitigation is similar to a traditional Spectre v1 mitigation, except:
      
        a) index masking isn't possible because the index (percpu offset)
           isn't user-controlled; and
      
        b) an lfence is needed in both the "from user" swapgs path and the
           "from kernel" non-swapgs path (because of the two attacks described
           above).
      
      The user entry swapgs paths already have SWITCH_TO_KERNEL_CR3, which has a
      CR3 write when PTI is enabled.  Since CR3 writes are serializing, the
      lfences can be skipped in those cases.
      
      On the other hand, the kernel entry swapgs paths don't depend on PTI.
      
      To avoid unnecessary lfences for the user entry case, create two separate
      features for alternative patching:
      
        X86_FEATURE_FENCE_SWAPGS_USER
        X86_FEATURE_FENCE_SWAPGS_KERNEL
      
      Use these features in entry code to patch in lfences where needed.
      
      The features aren't enabled yet, so there's no functional change.
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Dave Hansen <dave.hansen@intel.com>
      18ec54fd
  18. 23 Jun, 2019 1 commit
    • x86/cpufeatures: Enumerate user wait instructions · 6dbbf5ec
      Fenghua Yu authored
      umonitor, umwait, and tpause are a set of user wait instructions.
      
      umonitor arms address monitoring hardware using an address. The
      address range is determined by using CPUID.0x5. A store to
      an address within the specified address range triggers the
      monitoring hardware to wake up the processor waiting in umwait.
      
      umwait instructs the processor to enter an implementation-dependent
      optimized state while monitoring a range of addresses. The optimized
      state may be either a light-weight power/performance optimized state
      (C0.1 state) or an improved power/performance optimized state
      (C0.2 state).
      
      tpause instructs the processor to enter an implementation-dependent
      optimized state (C0.1 or C0.2) and wake up when the time-stamp counter
      reaches the specified timeout.
      
      The three instructions may be executed at any privilege level.
      
      The instructions provide a power-saving method while waiting in
      user space. Additionally, they can allow a sibling hyperthread to
      make faster progress while this thread is waiting. One example of an
      application usage of umwait is when waiting for input data from another
      application, such as a user level multi-threaded packet processing
      engine.
      
      Availability of the user wait instructions is indicated by the presence
      of the CPUID feature flag WAITPKG CPUID.0x07.0x0:ECX[5].
      
      Detailed information on the instructions and CPUID feature WAITPKG flag
      can be found in the latest Intel Architecture Instruction Set Extensions
      and Future Features Programming Reference and Intel 64 and IA-32
      Architectures Software Developer's Manual.
      Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Ashok Raj <ashok.raj@intel.com>
      Reviewed-by: Andy Lutomirski <luto@kernel.org>
      Cc: "Borislav Petkov" <bp@alien8.de>
      Cc: "H Peter Anvin" <hpa@zytor.com>
      Cc: "Peter Zijlstra" <peterz@infradead.org>
      Cc: "Tony Luck" <tony.luck@intel.com>
      Cc: "Ravi V Shankar" <ravi.v.shankar@intel.com>
      Link: https://lkml.kernel.org/r/1560994438-235698-2-git-send-email-fenghua.yu@intel.com
      6dbbf5ec
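
      [ Illustration, not part of the commit: a sketch using the compiler's
        WAITPKG intrinsics (GCC/Clang, built with -mwaitpkg) to park the thread
        with tpause until a TSC deadline. It assumes a WAITPKG-capable CPU;
        elsewhere the instruction raises #UD. ]

      #include <immintrin.h>
      #include <stdint.h>
      #include <stdio.h>
      #include <x86intrin.h>

      int main(void)
      {
              uint64_t deadline = __rdtsc() + 10000000ULL;    /* ~10M TSC ticks from now */

              /* ctrl = 0 requests the deeper C0.2 state; 1 would request C0.1. */
              unsigned char os_limited = _tpause(0, deadline);

              printf("tpause returned %u (nonzero: OS time limit expired first)\n",
                     os_limited);
              return 0;
      }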
  19. 20 Jun, 2019 2 commits
    • x86/cpufeatures: Enumerate the new AVX512 BFLOAT16 instructions · b302e4b1
      Fenghua Yu authored
      AVX512 BFLOAT16 instructions support 16-bit BFLOAT16 floating-point
      format (BF16) for deep learning optimization.
      
      BF16 is a short version of 32-bit single-precision floating-point
      format (FP32) and has several advantages over 16-bit half-precision
      floating-point format (FP16). BF16 keeps FP32 accumulation after
      multiplication without loss of precision, offers more than enough
      range for deep learning training tasks, and doesn't need to handle
      hardware exceptions.
      
      AVX512 BFLOAT16 instructions are enumerated in CPUID.7.1:EAX[bit 5]
      AVX512_BF16.
      
      CPUID.7.1:EAX contains only feature bits. Reuse the currently empty
      word 12 as a pure features word to hold the feature bits including
      AVX512_BF16.
      
      Detailed information of the CPUID bit and AVX512 BFLOAT16 instructions
      can be found in the latest Intel Architecture Instruction Set Extensions
      and Future Features Programming Reference.
      
       [ bp: Check CPUID(7) subleaf validity before accessing subleaf 1. ]
      Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: "Chang S. Bae" <chang.seok.bae@intel.com>
      Cc: Frederic Weisbecker <frederic@kernel.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jann Horn <jannh@google.com>
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Nadav Amit <namit@vmware.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
      Cc: Peter Feiner <pfeiner@google.com>
      Cc: Radim Krcmar <rkrcmar@redhat.com>
      Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
      Cc: "Ravi V Shankar" <ravi.v.shankar@intel.com>
      Cc: Robert Hoo <robert.hu@linux.intel.com>
      Cc: "Sean J Christopherson" <sean.j.christopherson@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Lendacky <Thomas.Lendacky@amd.com>
      Cc: x86 <x86@kernel.org>
      Link: https://lkml.kernel.org/r/1560794416-217638-3-git-send-email-fenghua.yu@intel.com
      b302e4b1
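
      [ Illustration, not part of the commit: a userspace version of the
        subleaf-validity check noted in the massage comment, reading
        CPUID.(7,1):EAX for AVX512_BF16 (bit 5) only after leaf 7 subleaf 0
        reports that subleaf 1 exists. GCC/Clang <cpuid.h> assumed. ]

      #include <cpuid.h>
      #include <stdio.h>

      int main(void)
      {
              unsigned int eax, ebx, ecx, edx;

              /* CPUID.(7,0):EAX reports the maximum supported subleaf of leaf 7. */
              if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx) || eax < 1) {
                      puts("CPUID(7) subleaf 1 not enumerated");
                      return 1;
              }
              __get_cpuid_count(7, 1, &eax, &ebx, &ecx, &edx);
              printf("AVX512_BF16: %s\n",
                     (eax & (1u << 5)) ? "supported" : "not supported");
              return 0;
      }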
    • x86/cpufeatures: Combine word 11 and 12 into a new scattered features word · acec0ce0
      Fenghua Yu authored
      It's a waste for the four X86_FEATURE_CQM_* feature bits to occupy two
      whole feature words. To better utilize feature words, re-define
      word 11 to host scattered features and move the four X86_FEATURE_CQM_*
      features into Linux defined word 11. More scattered features can be
      added in word 11 in the future.
      
      Rename leaf 11 in cpuid_leafs to CPUID_LNX_4 to reflect it's a
      Linux-defined leaf.
      
      Rename leaf 12 as CPUID_DUMMY which will be replaced by a meaningful
      name in the next patch when CPUID.7.1:EAX occupies word 12.
      
      Maximum number of RMID and cache occupancy scale are retrieved from
      CPUID.0xf.1 after scattered CQM features are enumerated. Carve out the
      code into a separate function.
      
      KVM doesn't support resctrl now. So it's safe to move the
      X86_FEATURE_CQM_* features to scattered features word 11 for KVM.
      Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Aaron Lewis <aaronlewis@google.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Babu Moger <babu.moger@amd.com>
      Cc: "Chang S. Bae" <chang.seok.bae@intel.com>
      Cc: "Sean J Christopherson" <sean.j.christopherson@intel.com>
      Cc: Frederic Weisbecker <frederic@kernel.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jann Horn <jannh@google.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: kvm ML <kvm@vger.kernel.org>
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Nadav Amit <namit@vmware.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
      Cc: Peter Feiner <pfeiner@google.com>
      Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
      Cc: Ravi V Shankar <ravi.v.shankar@intel.com>
      Cc: Sherry Hurwitz <sherry.hurwitz@amd.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Lendacky <Thomas.Lendacky@amd.com>
      Cc: x86 <x86@kernel.org>
      Link: https://lkml.kernel.org/r/1560794416-217638-2-git-send-email-fenghua.yu@intel.com
      acec0ce0
  20. 14 Jun, 2019 1 commit
    • x86/cpufeatures: Add FDP_EXCPTN_ONLY and ZERO_FCS_FDS · cbb99c0f
      Aaron Lewis authored
      Add the CPUID enumeration for Intel's de-feature bits to accommodate
      passing these de-features through to kvm guests.
      
      These de-features are (from SDM vol 1, section 8.1.8):
       - X86_FEATURE_FDP_EXCPTN_ONLY: If CPUID.(EAX=07H,ECX=0H):EBX[bit 6] = 1, the
         data pointer (FDP) is updated only for the x87 non-control instructions that
         incur unmasked x87 exceptions.
       - X86_FEATURE_ZERO_FCS_FDS: If CPUID.(EAX=07H,ECX=0H):EBX[bit 13] = 1, the
         processor deprecates FCS and FDS; it saves each as 0000H.
      Signed-off-by: Aaron Lewis <aaronlewis@google.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Reviewed-by: Jim Mattson <jmattson@google.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Frederic Weisbecker <frederic@kernel.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: marcorr@google.com
      Cc: Peter Feiner <pfeiner@google.com>
      Cc: pshier@google.com
      Cc: Robert Hoo <robert.hu@linux.intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Lendacky <Thomas.Lendacky@amd.com>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/20190605220252.103406-1-aaronlewis@google.com
      cbb99c0f
  21. 06 Mar, 2019 3 commits
    • x86/speculation/mds: Add BUG_MSBDS_ONLY · e261f209
      Thomas Gleixner authored
      This bug bit is set on CPUs which are only affected by Microarchitectural
      Store Buffer Data Sampling (MSBDS) and not by any other MDS variant.
      
      This is important because the Store Buffers are partitioned between
      Hyper-Threads so cross thread forwarding is not possible. But if a thread
      enters or exits a sleep state the store buffer is repartitioned which can
      expose data from one thread to the other. This transition can be mitigated.
      
      That means that for CPUs which are only affected by MSBDS, SMT can be
      kept enabled if the CPU is not affected by other SMT-sensitive
      vulnerabilities, e.g. L1TF. The XEON PHI variants fall into that
      category, as do the Silvermont/Airmont ATOMs; for them it's not really
      relevant as they do not support SMT, but mark them for completeness' sake.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
      Reviewed-by: Jon Masters <jcm@redhat.com>
      Tested-by: Jon Masters <jcm@redhat.com>
      e261f209
    • x86/speculation/mds: Add basic bug infrastructure for MDS · ed5194c2
      Andi Kleen authored
      Microarchitectural Data Sampling (MDS) is a class of side channel attacks
      on internal buffers in Intel CPUs. The variants are:
      
       - Microarchitectural Store Buffer Data Sampling (MSBDS) (CVE-2018-12126)
       - Microarchitectural Fill Buffer Data Sampling (MFBDS) (CVE-2018-12130)
       - Microarchitectural Load Port Data Sampling (MLPDS) (CVE-2018-12127)
      
      MSBDS leaks Store Buffer Entries which can be speculatively forwarded to a
      dependent load (store-to-load forwarding) as an optimization. The forward
      can also happen to a faulting or assisting load operation for a different
      memory address, which can be exploited under certain conditions. Store
      buffers are partitioned between Hyper-Threads so cross thread forwarding is
      not possible. But if a thread enters or exits a sleep state the store
      buffer is repartitioned which can expose data from one thread to the other.
      
      MFBDS leaks Fill Buffer Entries. Fill buffers are used internally to manage
      L1 miss situations and to hold data which is returned or sent in response
      to a memory or I/O operation. Fill buffers can forward data to a load
      operation and also write data to the cache. When the fill buffer is
      deallocated it can retain the stale data of the preceding operations which
      can then be forwarded to a faulting or assisting load operation, which can
      be exploited under certain conditions. Fill buffers are shared between
      Hyper-Threads so cross thread leakage is possible.
      
      MLPDS leaks Load Port Data. Load ports are used to perform load operations
      from memory or I/O. The received data is then forwarded to the register
      file or a subsequent operation. In some implementations the Load Port can
      contain stale data from a previous operation which can be forwarded to
      faulting or assisting loads under certain conditions, which again can be
      exploited eventually. Load ports are shared between Hyper-Threads so cross
      thread leakage is possible.
      
      All variants have the same mitigation for single CPU thread case (SMT off),
      so the kernel can treat them as one MDS issue.
      
      Add the basic infrastructure to detect if the current CPU is affected by
      MDS.
      
      [ tglx: Rewrote changelog ]
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Borislav Petkov <bp@suse.de>
      Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
      Reviewed-by: Jon Masters <jcm@redhat.com>
      Tested-by: Jon Masters <jcm@redhat.com>
      ed5194c2
    • x86: Add TSX Force Abort CPUID/MSR · 52f64909
      Peter Zijlstra (Intel) authored
      Skylake systems will receive a microcode update to address a TSX
      erratum. This microcode will (by default) clobber PMC3 when TSX
      instructions are (speculatively or not) executed.
      
      It also provides an MSR to cause all TSX transactions to abort and
      preserve PMC3.
      
      Add the CPUID enumeration and MSR definition.
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      52f64909
  22. 21 Dec, 2018 1 commit
  23. 18 Dec, 2018 1 commit
    • x86/speculation: Add support for STIBP always-on preferred mode · 20c3a2c3
      Thomas Lendacky authored
      Different AMD processors may have different implementations of STIBP.
      When STIBP is conditionally enabled, some implementations would benefit
      from having STIBP always on instead of toggling the STIBP bit through MSR
      writes. This preference is advertised through a CPUID feature bit.
      
      When conditional STIBP support is requested at boot and the CPU advertises
      STIBP always-on mode as preferred, switch to STIBP "on" support. To show
      that this transition has occurred, create a new spectre_v2_user_mitigation
      value and a new spectre_v2_user_strings message. The new mitigation value
      is used in spectre_v2_user_select_mitigation() to print the new mitigation
      message as well as to return a new string from stibp_state().
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Link: https://lkml.kernel.org/r/20181213230352.6937.74943.stgit@tlendack-t1.amdoffice.net
      20c3a2c3
  24. 07 Nov, 2018 1 commit
  25. 25 Oct, 2018 2 commits
    • x86/cpufeatures: Enumerate MOVDIR64B instruction · ace6485a
      Fenghua Yu authored
      MOVDIR64B moves 64 bytes as a direct store with 64-byte write atomicity.
      Direct store is implemented by using write combining (WC) for writing
      data directly into memory without caching the data.
      
      In low latency offload (e.g. Non-Volatile Memory, etc), MOVDIR64B writes
      work descriptors (and data in some cases) to device-hosted work-queues
      atomically without cache pollution.
      
      Availability of the MOVDIR64B instruction is indicated by the
      presence of the CPUID feature flag MOVDIR64B (CPUID.0x07.0x0:ECX[bit 28]).
      
      Please check the latest Intel Architecture Instruction Set Extensions
      and Future Features Programming Reference for more details on the CPUID
      feature MOVDIR64B flag.
      Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Ashok Raj <ashok.raj@intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi V Shankar <ravi.v.shankar@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1540418237-125817-3-git-send-email-fenghua.yu@intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      ace6485a
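
      [ Illustration, not part of the commit: a sketch using the _movdir64b()
        intrinsic (GCC/Clang, built with -mmovdir64b) to move one 64-byte
        descriptor. Ordinary aligned memory stands in for the device
        work-queue portal a real user would target, and the CPU must
        enumerate MOVDIR64B or the instruction raises #UD. ]

      #include <immintrin.h>
      #include <stdint.h>
      #include <stdio.h>
      #include <stdlib.h>

      int main(void)
      {
              uint8_t *dst = aligned_alloc(64, 64);   /* destination must be 64-byte aligned */
              uint8_t src[64];

              if (!dst)
                      return 1;
              for (int i = 0; i < 64; i++)
                      src[i] = (uint8_t)i;

              _movdir64b(dst, src);   /* one 64-byte direct store with write atomicity */

              printf("dst[63] = %u\n", dst[63]);
              free(dst);
              return 0;
      }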
    • x86/cpufeatures: Enumerate MOVDIRI instruction · 33823f4d
      Fenghua Yu authored
      MOVDIRI moves doubleword or quadword from register to memory through
      direct store which is implemented by using write combining (WC) for
      writing data directly into memory without caching the data.
      
      Programmable agents can handle streaming offload (e.g. high speed packet
      processing in network). Hardware implements a doorbell (tail pointer)
      register that is updated by software when adding new work-elements to
      the streaming offload work-queue.
      
      MOVDIRI can be used as the doorbell write which is a 4-byte or 8-byte
      uncachable write to MMIO. MOVDIRI has lower overhead than other ways
      to write the doorbell.
      
      Availability of the MOVDIRI instruction is indicated by the presence of
      the CPUID feature flag MOVDIRI (CPUID.0x07.0x0:ECX[bit 27]).
      
      Please check the latest Intel Architecture Instruction Set Extensions
      and Future Features Programming Reference for more details on the CPUID
      feature MOVDIRI flag.
      Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Ashok Raj <ashok.raj@intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi V Shankar <ravi.v.shankar@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1540418237-125817-2-git-send-email-fenghua.yu@intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      33823f4d
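
      [ Illustration, not part of the commit: a sketch of the doorbell-write
        pattern with the _directstoreu_u32() intrinsic (GCC/Clang, built with
        -mmovdiri). A plain variable stands in for the MMIO tail-pointer
        register, and the CPU must enumerate MOVDIRI. ]

      #include <immintrin.h>
      #include <stdio.h>

      int main(void)
      {
              unsigned int doorbell = 0;              /* stand-in for an MMIO tail pointer */

              _directstoreu_u32(&doorbell, 42);       /* MOVDIRI: 4-byte direct store */

              printf("doorbell = %u\n", doorbell);
              return 0;
      }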
  26. 03 Aug, 2018 2 commits
    • x86/speculation: Support Enhanced IBRS on future CPUs · 706d5168
      Sai Praneeth authored
      Future Intel processors will support "Enhanced IBRS" which is an "always
      on" mode i.e. IBRS bit in SPEC_CTRL MSR is enabled once and never
      disabled.
      
      From the specification [1]:
      
       "With enhanced IBRS, the predicted targets of indirect branches
        executed cannot be controlled by software that was executed in a less
        privileged predictor mode or on another logical processor. As a
        result, software operating on a processor with enhanced IBRS need not
        use WRMSR to set IA32_SPEC_CTRL.IBRS after every transition to a more
        privileged predictor mode. Software can isolate predictor modes
        effectively simply by setting the bit once. Software need not disable
        enhanced IBRS prior to entering a sleep state such as MWAIT or HLT."
      
      If Enhanced IBRS is supported by the processor then use it as the
      preferred spectre v2 mitigation mechanism instead of Retpoline. Intel's
      Retpoline white paper [2] states:
      
       "Retpoline is known to be an effective branch target injection (Spectre
        variant 2) mitigation on Intel processors belonging to family 6
        (enumerated by the CPUID instruction) that do not have support for
        enhanced IBRS. On processors that support enhanced IBRS, it should be
        used for mitigation instead of retpoline."
      
      The reason why Enhanced IBRS is the recommended mitigation on processors
      which support it is that these processors also support CET which
      provides a defense against ROP attacks. Retpoline is very similar to ROP
      techniques and might trigger false positives in the CET defense.
      
      If Enhanced IBRS is selected as the mitigation technique for spectre v2,
      the IBRS bit in SPEC_CTRL MSR is set once at boot time and never
      cleared. Kernel also has to make sure that IBRS bit remains set after
      VMEXIT because the guest might have cleared the bit. This is already
      covered by the existing x86_spec_ctrl_set_guest() and
      x86_spec_ctrl_restore_host() speculation control functions.
      
      Enhanced IBRS still requires IBPB for full mitigation.
      
      [1] Speculative-Execution-Side-Channel-Mitigations.pdf
      [2] Retpoline-A-Branch-Target-Injection-Mitigation.pdf
      Both documents are available at:
      https://bugzilla.kernel.org/show_bug.cgi?id=199511
      Originally-by: David Woodhouse <dwmw@amazon.co.uk>
      Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tim C Chen <tim.c.chen@intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Ravi Shankar <ravi.v.shankar@intel.com>
      Link: https://lkml.kernel.org/r/1533148945-24095-1-git-send-email-sai.praneeth.prakhya@intel.com
      706d5168
    • x86/cpufeatures: Add EPT_AD feature bit · 301d328a
      Peter Feiner authored
      Some Intel processors have an EPT feature whereby the accessed & dirty bits
      in EPT entries can be updated by HW. MSR IA32_VMX_EPT_VPID_CAP exposes the
      presence of this capability.
      
      There is no point in trying to use that new feature bit in the VMX code as
      VMX needs to read the MSR anyway to access other bits, but having the
      feature bit for EPT_AD in place helps virtualization management as it
      exposes "ept_ad" in /proc/cpuinfo/$proc/flags if the feature is present.
      
      [ tglx: Amended changelog ]
      Signed-off-by: Peter Feiner <pfeiner@google.com>
      Signed-off-by: Peter Shier <pshier@google.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Jim Mattson <jmattson@google.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Link: https://lkml.kernel.org/r/20180801180657.138051-1-pshier@google.com
      301d328a
  27. 21 Jun, 2018 1 commit
  28. 20 Jun, 2018 1 commit
    • x86/speculation/l1tf: Add sysfs reporting for l1tf · 17dbca11
      Andi Kleen authored
      L1TF core kernel workarounds are cheap and normally always enabled. However,
      they should still be reported in sysfs if the system is vulnerable or
      mitigated. Add the necessary CPU feature/bug bits.
      
      - Extend the existing checks for Meltdowns to determine if the system is
        vulnerable. All CPUs which are not vulnerable to Meltdown are also not
        vulnerable to L1TF
      
      - Check for 32bit non PAE and emit a warning as there is no practical way
        for mitigation due to the limited physical address bits
      
      - If the system has more than MAX_PA/2 physical memory the invert page
        workarounds don't protect the system against the L1TF attack anymore,
        because an inverted physical address will also point to valid
        memory. Print a warning in this case and report that the system is
        vulnerable.
      
      Add a function which returns the PFN limit for the L1TF mitigation, which
      will be used in follow up patches for sanity and range checks.
      
      [ tglx: Renamed the CPU feature bit to L1TF_PTEINV ]
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Acked-by: Dave Hansen <dave.hansen@intel.com>
      
      17dbca11
  29. 06 Jun, 2018 2 commits
  30. 17 May, 2018 4 commits