1. 27 Mar, 2020 3 commits
    • ovl: avoid possible inode number collisions with xino=on · dfe51d47
      Amir Goldstein authored
      When xino feature is enabled and a real directory inode number overflows
      the lower xino bits, we cannot map this directory inode number to a unique
      and persistent inode number and we fall back to the real inode st_ino and
      overlay st_dev.
      
      The real inode st_ino with high bits may collide with a lower inode number
      on overlay st_dev that was mapped using xino.
      
      To avoid possible collision with legitimate xino values, map a non
      persistent inode number to a dedicated range in the xino address space.
      The dedicated range is created by adding one more bit to the number of
      reserved high xino bits.  We could have added just one more fsid, but that
      would have had the undesired effect of changing persistent overlay inode
      numbers on kernel or would have required more complex xino mapping code.
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
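
      The mapping described above can be sketched roughly as follows. This is
      an illustration only, not the overlayfs code: the bit widths, the
      SKETCH_* names and the caller-owned counter are all assumptions made for
      the example. The fsid sits in the reserved high bits, and the one extra
      reserved bit marks the dedicated non-persistent range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: 4 high bits are reserved. The topmost bit marks the
 * non-persistent range, the remaining 3 bits carry the fsid. */
#define SKETCH_XINO_BITS         4u
#define SKETCH_XINO_SHIFT        (64u - SKETCH_XINO_BITS)
#define SKETCH_NONPERSISTENT_BIT (UINT64_C(1) << 63)
#define SKETCH_MAX_FSID          ((1u << (SKETCH_XINO_BITS - 1)) - 1)

/* True when the real inode number leaves the reserved high bits clear. */
static bool sketch_ino_fits(uint64_t real_ino)
{
    return (real_ino >> SKETCH_XINO_SHIFT) == 0;
}

/*
 * Map a real inode number to an overlay inode number. *next_np is a
 * caller-owned counter used to hand out non-persistent numbers.
 */
static uint64_t sketch_map_ino(uint64_t real_ino, unsigned int fsid,
                               uint64_t *next_np)
{
    assert(fsid <= SKETCH_MAX_FSID);

    /* Common case: persistent number, fsid stored in the high bits. */
    if (sketch_ino_fits(real_ino))
        return real_ino | ((uint64_t)fsid << SKETCH_XINO_SHIFT);

    /* Overflow: use the dedicated range so no xino value can collide. */
    uint64_t low_mask = (UINT64_C(1) << SKETCH_XINO_SHIFT) - 1;
    return SKETCH_NONPERSISTENT_BIT | (++*next_np & low_mask);
}
```

      Because every persistent value keeps the top bit clear, a number from
      the non-persistent range cannot equal a legitimately mapped one.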
    • ovl: use a private non-persistent ino pool · 4d314f78
      Amir Goldstein authored
      There is no reason to deplete the system's global get_next_ino() pool for
      overlay non-persistent inode numbers and there is no reason at all to
      allocate non-persistent inode numbers for non-directories.
      
      For non-directories, it is much better to leave i_ino the same as real
      i_ino, to be consistent with st_ino/d_ino.
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
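
      A minimal sketch of the idea, assuming a per-mount counter: the struct
      and function names below are hypothetical, and the real overlayfs data
      structures differ. Directories draw from the private pool, while
      non-directories simply keep the real inode number.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-mount state: one counter per overlay instance, so the
 * system-wide pool is left untouched. */
struct sketch_ovl_fs {
    atomic_uint_fast64_t last_ino;
};

/* Pick the in-memory inode number for a new overlay inode. */
static uint64_t sketch_pick_i_ino(struct sketch_ovl_fs *ofs, bool is_dir,
                                  uint64_t real_ino)
{
    /* Non-directories stay consistent with st_ino/d_ino. */
    if (!is_dir)
        return real_ino;

    /* Directories get the next number from the private pool. */
    return (uint64_t)(atomic_fetch_add(&ofs->last_ino, 1) + 1);
}
```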
    • ovl: fix WARN_ON nlink drop to zero · 83552eac
      Miklos Szeredi authored
      Changes to underlying layers should not cause WARN_ON(), but this repro
      does:
      
       mkdir w l u mnt
       sudo mount -t overlay -o workdir=w,lowerdir=l,upperdir=u overlay mnt
       touch mnt/h
       ln u/h u/k
       rm -rf mnt/k
       rm -rf mnt/h
       dmesg
      
       ------------[ cut here ]------------
       WARNING: CPU: 1 PID: 116244 at fs/inode.c:302 drop_nlink+0x28/0x40
      
      After upper hardlinks were added while overlay is mounted, unlinking all
      overlay hardlinks drops overlay nlink to zero before all upper inodes
      are unlinked.
      
      After unlink/rename, prevent i_nlink from going to zero if there are still
      hashed aliases (i.e. cached hard links to the victim) remaining.
      Reported-by: Phasip <phasip@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
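
      The rule can be modelled with a toy structure. This is a sketch under
      assumptions, not the kernel code: it tracks only a link count and a
      count of cached names, and all identifiers are made up for the example.

```c
/* Toy model of an overlay inode. */
struct sketch_inode {
    unsigned int nlink;          /* link count as the overlay sees it */
    unsigned int hashed_aliases; /* cached names still pointing at it */
};

/* Called after one name of the inode has been unlinked and unhashed. */
static void sketch_unlink_update(struct sketch_inode *inode)
{
    if (inode->hashed_aliases > 0)
        inode->hashed_aliases--;

    if (inode->nlink == 0)
        return;

    /* Keep nlink at 1 while other cached hard links remain. */
    if (inode->nlink == 1 && inode->hashed_aliases > 0)
        return;

    inode->nlink--;
}
```

      In the repro, the overlay believes nlink is 1 while two names (h and k)
      are cached. Removing k leaves nlink at 1, and only removing h drops it
      to zero.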
  2. 17 Mar, 2020 15 commits
  3. 15 Mar, 2020 10 commits
    • Linux 5.6-rc6 · fb33c651
      Linus Torvalds authored
    • Merge tag 'irq-urgent-2020-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · a42a7bb6
      Linus Torvalds authored
      Pull irq fix from Thomas Gleixner:
       "A single commit to handle an erratum in Cavium ThunderX to prevent
        access to GIC registers which are broken in the implementation"
      
      * tag 'irq-urgent-2020-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        irqchip/gic-v3: Workaround Cavium erratum 38539 when reading GICD_TYPER2
    • Merge tag 'locking-urgent-2020-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 34d5a4b3
      Linus Torvalds authored
      Pull futex fix from Thomas Gleixner:
       "Fix for yet another subtle futex issue.
      
        The futex code used ihold() to prevent inodes from vanishing, but
        ihold() does not guarantee inode persistence. Replace the inode
        pointer with a per boot, machine wide, unique inode identifier.
      
        The second commit fixes the breakage of the hash mechanism which
        causes a 100% performance regression"
      
      * tag 'locking-urgent-2020-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        futex: Unbreak futex hashing
        futex: Fix inode life-time issue
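
      One way to obtain a per-boot unique inode identifier is to hand out
      numbers lazily from a global counter. The sketch below assumes exactly
      that; the pull message does not spell out the mechanism, and the names
      are hypothetical. It uses C11 atomics in place of kernel primitives.

```c
#include <stdatomic.h>
#include <stdint.h>

struct sketch_inode {
    _Atomic uint64_t i_sequence; /* 0 means "not assigned yet" */
};

/* Global counter; it starts at zero on every boot. */
static _Atomic uint64_t sketch_seq_counter;

/* Return the inode's identifier, assigning one on first use. */
static uint64_t sketch_inode_sequence(struct sketch_inode *inode)
{
    uint64_t old = atomic_load(&inode->i_sequence);

    if (old != 0)
        return old;

    for (;;) {
        uint64_t seq = atomic_fetch_add(&sketch_seq_counter, 1) + 1;

        if (seq == 0) /* counter wrapped; 0 is reserved */
            continue;

        old = 0;
        if (atomic_compare_exchange_strong(&inode->i_sequence, &old, seq))
            return seq;
        return old; /* another thread assigned a number first */
    }
}
```

      Unlike a pointer, such a number stays meaningful after the inode has
      been freed, so a stale key can never alias a new inode.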
    • Merge tag 'x86-urgent-2020-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · ec181b7f
      Linus Torvalds authored
      Pull x86 fixes from Thomas Gleixner:
       "Two fixes for x86:
      
         - Map EFI runtime service data as encrypted when SEV is enabled.
      
           Otherwise e.g. SMBIOS data cannot be properly decoded by dmidecode.
      
         - Remove the warning in the vector management code which triggered
           when a managed interrupt affinity changed outside of a CPU hotplug
           operation.
      
           The warning was correct until the recent core code change that
           introduced a CPU isolation feature which needs to migrate managed
           interrupts away from online CPUs under certain conditions to
           achieve the isolation"
      
      * tag 'x86-urgent-2020-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/vector: Remove warning on managed interrupt migration
        x86/ioremap: Map EFI runtime services data as encrypted for SEV
    • Merge tag 'perf-urgent-2020-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e99bc917
      Linus Torvalds authored
      Pull perf fixes from Thomas Gleixner:
       "A pile of perf fixes:
      
        Kernel side:
      
         - AMD uncore driver: Replace the open coded sanity check with the
           core variant, which provides the correct error code and also leaves
           a hint in dmesg
      
        Tooling:
      
         - Fix the stdio input handling with glibc versions >= 2.28
      
         - Unbreak the futex-wake benchmark which was reduced to 0 test
           threads due to the conversion to cpumaps
      
         - Initialize sigaction structs before invoking sys_sigaction()
      
         - Plug the mapfile memory leak in perf jevents
      
         - Fix off by one relative directory includes
      
         - Fix an undefined string comparison in perf diff"
      
      * tag 'perf-urgent-2020-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf/amd/uncore: Replace manual sampling check with CAP_NO_INTERRUPT flag
        tools: Fix off-by 1 relative directory includes
        perf jevents: Fix leak of mapfile memory
        perf bench: Clear struct sigaction before sigaction() syscall
        perf bench futex-wake: Restore thread count default to online CPU count
        perf top: Fix stdio interface input handling with glibc 2.28+
        perf diff: Fix undefined string comparision spotted by clang's -Wstring-compare
        perf symbols: Don't try to find a vmlinux file when looking for kernel modules
        perf bench: Share some global variables to fix build with gcc 10
        perf parse-events: Use asprintf() instead of strncpy() to read tracepoint files
        perf env: Do not return pointers to local variables
        perf tests bp_account: Make global variable static
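
      The sigaction item in the list above follows a common userspace
      pattern: clear the whole structure before filling in the fields you
      care about. A minimal sketch using only the POSIX API (the helper name
      is made up and this is not the perf bench code):

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Install a handler with every other sigaction field set to zero. */
static int sketch_install_handler(int signo, void (*handler)(int))
{
    struct sigaction act;

    /* Without this, sa_flags and sa_mask would hold stack garbage. */
    memset(&act, 0, sizeof(act));
    sigemptyset(&act.sa_mask);
    act.sa_handler = handler;

    return sigaction(signo, &act, NULL);
}
```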
    • Merge tag 'timers-urgent-2020-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · ffe6da91
      Linus Torvalds authored
      Pull timer fix from Thomas Gleixner:
       "A single fix adding the missing time namespace adjustment in
        sys/sysinfo which caused sys/sysinfo to be inconsistent with
        /proc/uptime when read from a task inside a time namespace"
      
      * tag 'timers-urgent-2020-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sys/sysinfo: Respect boottime inside time namespace
    • Merge tag 'ras-urgent-2020-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 52ac3777
      Linus Torvalds authored
      Pull RAS fixes from Thomas Gleixner:
       "Two RAS related fixes:
      
         - Shut down the per CPU thermal throttling poll work properly when a
           CPU goes offline.
      
           The missing shutdown caused the poll work to be migrated to an
           unbound worker which triggered warnings about the usage of
           smp_processor_id() in preemptible context

         - Fix the PPIN feature initialization which failed to enable the
           functionality when PPIN_CTL was enabled but the MSR locked against
           updates"
      
      * tag 'ras-urgent-2020-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/mce: Fix logic and comments around MSR_PPIN_CTL
        x86/mce/therm_throt: Undo thermal polling properly on CPU offline
    • Merge tag 'efi-urgent-2020-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · b67775e1
      Linus Torvalds authored
      Pull EFI fixes from Thomas Gleixner:
       "Two EFI fixes:
      
         - Prevent a race and buffer overflow in the sysfs efivars interface
           which causes kernel memory corruption.
      
         - Add the missing NULL pointer checks in efivar_store_raw()"
      
      * tag 'efi-urgent-2020-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        efi: Add a sanity check to efivar_store_raw()
        efi: Fix a race and a buffer overflow while reading efivars via sysfs
    • Merge tag 'iommu-fixes-v5.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu · de28a65c
      Linus Torvalds authored
      Pull IOMMU fixes from Joerg Roedel:
      
       - Intel VT-d fixes:
          - RCU list handling fixes
          - Replace WARN_TAINT with pr_warn + add_taint for reporting firmware
            issues
          - DebugFS fixes
          - Fix for hugepage handling in iova_to_phys implementation
          - Fix for handling VMD devices, which have a domain number which
            doesn't fit into 16 bits
          - Warning message fix
      
       - MSI allocation fix for iommu-dma code
      
       - Sign-extension fix for io page-table code
      
       - Fix for AMD-Vi to properly update the is-running bit when AVIC is
         used
      
      * tag 'iommu-fixes-v5.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
        iommu/vt-d: Populate debugfs if IOMMUs are detected
        iommu/amd: Fix IOMMU AVIC not properly update the is_run bit in IRTE
        iommu/vt-d: Ignore devices with out-of-spec domain number
        iommu/vt-d: Fix the wrong printing in RHSA parsing
        iommu/vt-d: Fix debugfs register reads
        iommu/vt-d: quirk_ioat_snb_local_iommu: replace WARN_TAINT with pr_warn + add_taint
        iommu/vt-d: dmar_parse_one_rmrr: replace WARN_TAINT with pr_warn + add_taint
        iommu/vt-d: dmar: replace WARN_TAINT with pr_warn + add_taint
        iommu/vt-d: Silence RCU-list debugging warnings
        iommu/vt-d: Fix RCU-list bugs in intel_iommu_init()
        iommu/dma: Fix MSI reservation allocation
        iommu/io-pgtable-arm: Fix IOVA validation for 32-bit
        iommu/vt-d: Fix a bug in intel_iommu_iova_to_phys() for huge page
        iommu/vt-d: Fix RCU list debugging warnings
    • Merge tag 'irqchip-fixes-5.6-2' of... · 92c22755
      Thomas Gleixner authored
      Merge tag 'irqchip-fixes-5.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/urgent
      
      Pull irqchip fixes from Marc Zyngier:
      
      - Add workaround for Cavium/Marvell ThunderX unimplemented GIC registers
  4. 14 Mar, 2020 12 commits
    • Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · d3dca690
      Linus Torvalds authored
      Pull i2c fixes from Wolfram Sang:
       "I2C has quite some regression fixes this time.
      
        One is also related to watchdogs, we have proper acks from Guenter for
        them"
      
      * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        i2c: acpi: put device when verifying client fails
        misc: eeprom: at24: fix regulator underflow
        i2c: gpio: suppress error on probe defer
        macintosh: windfarm: fix MODINFO regression
        i2c: designware-pci: Fix BUG_ON during device removal
        i2c: i801: Do not add ICH_RES_IO_SMI for the iTCO_wdt device
        watchdog: iTCO_wdt: Make ICH_RES_IO_SMI optional
        watchdog: iTCO_wdt: Export vendorsupport
    • Merge tag 'arc-5.6-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc · 3086ae07
      Linus Torvalds authored
      Pull ARC fixes from Vineet Gupta:
      
       - Fix __ALIGN_STR and __ALIGN to not use default junk padding
      
       - Misc Kconfig cleanups, header updates
      
      * tag 'arc-5.6-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
        ARC: define __ALIGN_STR and __ALIGN symbols for ARC
        ARC: show_regs: reduce lines of output
        ARC: Replace <linux/clk-provider.h> by <linux/of_clk.h>
        ARC: fpu: fix randconfig build error reported by 0-day test service
        ARC: fix some Kconfig typos
        ARC: Cleanup old Kconfig IO scheduler options
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 6693075e
      Linus Torvalds authored
      Pull kvm fixes from Paolo Bonzini:
       "Bugfixes for x86 and s390"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: nVMX: avoid NULL pointer dereference with incorrect EVMCS GPAs
        KVM: x86: Initializing all kvm_lapic_irq fields in ioapic_write_indirect
        KVM: VMX: Condition ENCLS-exiting enabling on CPU support for SGX1
        KVM: s390: Also reset registers in sync regs for initial cpu reset
        KVM: fix Kconfig menu text for -Werror
        KVM: x86: remove stale comment from struct x86_emulate_ctxt
        KVM: x86: clear stale x86_emulate_ctxt->intercept value
        KVM: SVM: Fix the svm vmexit code for WRMSR
        KVM: X86: Fix dereference null cpufreq policy
    • iommu/vt-d: Populate debugfs if IOMMUs are detected · 1da8347d
      Megha Dey authored
      Currently, the Intel IOMMU debugfs directory (/sys/kernel/debug/iommu/intel)
      gets populated only when DMA remapping is enabled (dmar_disabled = 0),
      irrespective of whether interrupt remapping is enabled or not.
      
      Instead, populate the intel iommu debugfs directory if any IOMMUs are
      detected.
      
      Cc: Dan Carpenter <dan.carpenter@oracle.com>
      Fixes: ee2636b8 ("iommu/vt-d: Enable base Intel IOMMU debugfs support")
      Signed-off-by: Megha Dey <megha.dey@linux.intel.com>
      Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
    • Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · 69a4d0ba
      Linus Torvalds authored
      Pull clk fixes from Stephen Boyd:
       "A small collection of fixes. I'll make another sweep soon to look for
        more fixes for this -rc series.
      
         - Mark device node const in of_clk_get_parent APIs to ease landing
           changes in users later
      
         - Fix flag for Qualcomm SC7180 video clocks where we thought it would
           never turn off but actually hardware takes care of it
      
         - Remove disp_cc_mdss_rscc_ahb_clk on Qualcomm SC7180 SoCs because
           this clk is always on anyway
      
         - Correct some bad dt-binding numbers for i.MX8MN SoCs"
      
      * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
        clk: imx8mn: Fix incorrect clock defines
        clk: qcom: dispcc: Remove support of disp_cc_mdss_rscc_ahb_clk
        clk: qcom: videocc: Update the clock flag for video_cc_vcodec0_core_clk
        of: clk: Make of_clk_get_parent_{count,name}() parameter const
    • Paolo Bonzini's avatar
      018cabb6
    • KVM: nVMX: avoid NULL pointer dereference with incorrect EVMCS GPAs · 95fa1010
      Vitaly Kuznetsov authored
      When an EVMCS enabled L1 guest on KVM tries to do an enlightened VMEnter
      with EVMCS GPA = 0, the host crashes because the
      
      evmcs_gpa != vmx->nested.hv_evmcs_vmptr
      
      condition in nested_vmx_handle_enlightened_vmptrld() will evaluate to
      false (as nested.hv_evmcs_vmptr is zeroed after init). The crash will
      happen on vmx->nested.hv_evmcs pointer dereference.
      
      Another problematic EVMCS ptr value is '-1' but it only causes host crash
      after nested_release_evmcs() invocation. The problem is exactly the same as
      with '0', we mistakenly think that the EVMCS pointer hasn't changed and
      thus nested.hv_evmcs_vmptr is valid.
      
      Resolve the issue by adding an additional !vmx->nested.hv_evmcs
      check to nested_vmx_handle_enlightened_vmptrld(), this way we will
      always be trying kvm_vcpu_map() when nested.hv_evmcs is NULL
      and this is supposed to catch all invalid EVMCS GPAs.
      
      Also, initialize hv_evmcs_vmptr to '0' in nested_release_evmcs()
      to be consistent with initialization where we don't currently
      set hv_evmcs_vmptr to '-1'.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
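
      The effect of the extra check can be shown with a small stand-alone
      model. The struct below is a hypothetical reduction of the nested state
      to the two fields named in the message, and the helper is not a kernel
      function.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_nested {
    uint64_t hv_evmcs_vmptr; /* last GPA that was mapped */
    void *hv_evmcs;          /* NULL while nothing is mapped */
};

/* True when the EVMCS has to be (re)mapped before it may be used. */
static bool sketch_needs_map(const struct sketch_nested *nested,
                             uint64_t evmcs_gpa)
{
    /* Checking the pointer first catches GPA 0 right after init. */
    return !nested->hv_evmcs || evmcs_gpa != nested->hv_evmcs_vmptr;
}
```

      With only the GPA comparison, a freshly initialized state (vmptr zero,
      pointer NULL) and a guest-supplied GPA of 0 would look unchanged, and
      the NULL pointer would be dereferenced.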
    • Merge tag 'kvm-s390-master-5.6-1' of... · 997224fe
      Paolo Bonzini authored
      Merge tag 'kvm-s390-master-5.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-master
      
      KVM: s390: Fully do the CPU resets as intended
      
      With 7de3f142 ("KVM: s390: Add new reset vcpu API") we clarified
      the meaning of the reset ioctl to fully reset the CPU and not only the
      parts that can not be handled by userspace. Turns out that we missed
      some parts.
    • irqchip/gic-v3: Workaround Cavium erratum 38539 when reading GICD_TYPER2 · d01fd161
      Marc Zyngier authored
      Despite the architecture spec requiring that reserved registers in the GIC
      distributor memory map are RES0 (and thus are not allowed to generate
      an exception), the Cavium ThunderX (aka TX1) SoC explodes as such:
      
      [    0.000000] GICv3: GIC: Using split EOI/Deactivate mode
      [    0.000000] GICv3: 128 SPIs implemented
      [    0.000000] GICv3: 0 Extended SPIs implemented
      [    0.000000] Internal error: synchronous external abort: 96000210 [#1] SMP
      [    0.000000] Modules linked in:
      [    0.000000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.0-rc4-00035-g3cf6a3d5725f #7956
      [    0.000000] Hardware name: cavium,thunder-88xx (DT)
      [    0.000000] pstate: 60000085 (nZCv daIf -PAN -UAO)
      [    0.000000] pc : __raw_readl+0x0/0x8
      [    0.000000] lr : gic_init_bases+0x110/0x560
      [    0.000000] sp : ffff800011243d90
      [    0.000000] x29: ffff800011243d90 x28: 0000000000000000
      [    0.000000] x27: 0000000000000018 x26: 0000000000000002
      [    0.000000] x25: ffff8000116f0000 x24: ffff000fbe6a2c80
      [    0.000000] x23: 0000000000000000 x22: ffff010fdc322b68
      [    0.000000] x21: ffff800010a7a208 x20: 00000000009b0404
      [    0.000000] x19: ffff80001124dad0 x18: 0000000000000010
      [    0.000000] x17: 000000004d8d492b x16: 00000000f67eb9af
      [    0.000000] x15: ffffffffffffffff x14: ffff800011249908
      [    0.000000] x13: ffff800091243ae7 x12: ffff800011243af4
      [    0.000000] x11: ffff80001126e000 x10: ffff800011243a70
      [    0.000000] x9 : 00000000ffffffd0 x8 : ffff80001069c828
      [    0.000000] x7 : 0000000000000059 x6 : ffff8000113fb4d1
      [    0.000000] x5 : 0000000000000001 x4 : 0000000000000000
      [    0.000000] x3 : 0000000000000000 x2 : 0000000000000000
      [    0.000000] x1 : 0000000000000000 x0 : ffff8000116f000c
      [    0.000000] Call trace:
      [    0.000000]  __raw_readl+0x0/0x8
      [    0.000000]  gic_of_init+0x188/0x224
      [    0.000000]  of_irq_init+0x200/0x3cc
      [    0.000000]  irqchip_init+0x1c/0x40
      [    0.000000]  init_IRQ+0x160/0x1d0
      [    0.000000]  start_kernel+0x2ec/0x4b8
      [    0.000000] Code: a8c47bfd d65f03c0 d538d080 d65f03c0 (b9400000)
      
      when reading the GICv4.1 GICD_TYPER2 register, which is unexpected...
      
      Work around it by adding a new quirk for the following variants:
      
       ThunderX: CN88xx
       OCTEON TX: CN83xx, CN81xx
       OCTEON TX2: CN93xx, CN96xx, CN98xx, CNF95xx*
      
      and use this flag to avoid accessing GICD_TYPER2. Note that all
      reserved registers (including redistributors and ITS) are impacted
      by this erratum, but that only GICD_TYPER2 has to be worked around
      so far.
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      Tested-by: Robert Richter <rrichter@marvell.com>
      Tested-by: Mark Salter <msalter@redhat.com>
      Tested-by: Tim Harvey <tharvey@gateworks.com>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Acked-by: Robert Richter <rrichter@marvell.com>
      Link: https://lore.kernel.org/r/20191027144234.8395-11-maz@kernel.org
      Link: https://lore.kernel.org/r/20200311115649.26060-1-maz@kernel.org
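
      In outline, a quirk flag of this kind gates the register read. The
      sketch below is illustrative only: the flag name and helper are
      invented, and a plain pointer stands in for the MMIO accessor.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical quirk bit set when an affected SoC is detected. */
#define SKETCH_QUIRK_NO_TYPER2 (UINT32_C(1) << 0)

/* Read the register only on hardware where the access is safe. */
static uint32_t sketch_read_typer2(uint32_t quirks,
                                   const volatile uint32_t *typer2)
{
    /* Affected parts fault on the access, so report "no features". */
    if (quirks & SKETCH_QUIRK_NO_TYPER2)
        return 0;

    return *typer2;
}
```

      Returning zero is what a correctly implemented RES0 register would have
      read as, so callers need no special casing.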
    • KVM: x86: Initializing all kvm_lapic_irq fields in ioapic_write_indirect · 0c22056f
      Nitesh Narayan Lal authored
      Previously, not all fields of struct kvm_lapic_irq were initialized
      before it was passed to kvm_bitmap_or_dest_vcpus(), which causes
      an issue when any of the uninitialized fields is used for processing
      a request. For example, not initializing the msi_redir_hint field
      before passing the structure to kvm_bitmap_or_dest_vcpus() may lead to
      misbehavior of kvm_apic_map_get_dest_lapic(). Specifically, this happens
      when kvm_lowest_prio_delivery() returns true due to a non-zero garbage
      value of msi_redir_hint, which should not happen as the request belongs
      to APIC fixed delivery mode and we do not want to deliver the
      interrupt only to the lowest priority candidate.
      
      This patch initializes all the fields of kvm_lapic_irq based on the
      values of ioapic redirect_entry object before passing it on to
      kvm_bitmap_or_dest_vcpus().
      
      Fixes: 7ee30bc1 ("KVM: x86: deliver KVM IOAPIC scan request to target vCPUs")
      Signed-off-by: Nitesh Narayan Lal <nitesh@redhat.com>
      Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      [Set level to false since the value doesn't really matter. Suggested
       by Vitaly Kuznetsov. - Paolo]
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
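
      The underlying C pitfall is an automatic-storage struct with some
      members left unset. A hedged illustration with a made-up, trimmed-down
      structure (the real kvm_lapic_irq has more fields): giving every member
      an explicit value in the initializer means none of them can carry
      garbage.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the interrupt request structure. */
struct sketch_irq {
    uint32_t vector;
    uint32_t dest_id;
    uint16_t delivery_mode;
    bool level;
    bool msi_redir_hint;
};

/* Build a request with every field set explicitly. */
static struct sketch_irq sketch_irq_from_entry(uint32_t vector,
                                               uint32_t dest_id,
                                               uint16_t delivery_mode)
{
    struct sketch_irq irq = {
        .vector = vector,
        .dest_id = dest_id,
        .delivery_mode = delivery_mode,
        .level = false,
        .msi_redir_hint = false, /* never left as stack garbage */
    };

    return irq;
}
```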
    • KVM: VMX: Condition ENCLS-exiting enabling on CPU support for SGX1 · 7a57c09b
      Sean Christopherson authored
      Enable ENCLS-exiting (and thus set vmcs.ENCLS_EXITING_BITMAP) only if
      the CPU supports SGX1.  Per Intel's SDM, all ENCLS leafs #UD if SGX1
      is not supported[*], i.e. intercepting ENCLS to inject a #UD is
      unnecessary.
      
      Avoiding ENCLS-exiting even when it is reported as supported by the CPU
      works around a reported issue where SGX is "hard" disabled after an S3
      suspend/resume cycle, i.e. CPUID.0x7.SGX=0 and the VMCS field/control
      are enumerated as unsupported.  While the root cause of the S3 issue is
      unknown, it's definitely _not_ a KVM (or kernel) bug, i.e. this is a
      workaround for what is most likely a hardware or firmware issue.  As a
      bonus side effect, KVM saves a VMWRITE when first preparing vmcs01 and
      vmcs02.
      
      Note, SGX must be disabled in BIOS to take advantage of this workaround.
      
      [*] The additional ENCLS CPUID check on SGX1 exists so that SGX can be
          globally "soft" disabled post-reset, e.g. if #MC bits in MCi_CTL are
          cleared.  Soft disabled meaning disabling SGX without clearing the
          primary CPUID bit (in leaf 0x7) and without poking into non-SGX
          CPU paths, e.g. for the VMCS controls.
      
      Fixes: 0b665d30 ("KVM: vmx: Inject #UD for SGX ENCLS instruction in guest")
      Reported-by: Toni Spets <toni.spets@iki.fi>
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • iommu/amd: Fix IOMMU AVIC not properly update the is_run bit in IRTE · 730ad0ed
      Suravee Suthikulpanit authored
      Commit b9c6ff94 ("iommu/amd: Re-factor guest virtual APIC
      (de-)activation code") accidentally left out the ir_data pointer when
      calling modify_irte_ga(), which causes the function amd_iommu_update_ga()
      to return prematurely because struct amd_ir_data.ref is NULL, so
      the "is_run" bit of the IRTE does not get updated properly.
      
      This results in bad I/O performance, since the IOMMU AVIC always generates
      a GA Log entry and notifies the IOMMU driver and KVM when it receives an
      interrupt from the PCI pass-through device instead of directly injecting
      the interrupt into the vCPU.
      
      Fix this by passing ir_data when calling modify_irte_ga(), as done previously.
      
      Fixes: b9c6ff94 ("iommu/amd: Re-factor guest virtual APIC (de-)activation code")
      Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
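
      The failure mode can be reproduced with a toy model. The types and
      helpers below are simplified stand-ins invented for this sketch; they
      only mirror the relationship described in the message: one call records
      a back-reference to the table entry, and a later call silently does
      nothing when that reference is missing.

```c
#include <stdbool.h>
#include <stddef.h>

struct sketch_irte {
    bool is_run;
};

struct sketch_ir_data {
    struct sketch_irte *ref; /* back-reference to the table entry */
};

/* Write the entry and, if given, remember it in the per-IRQ data. */
static void sketch_modify_irte(struct sketch_irte *entry,
                               struct sketch_ir_data *data)
{
    if (data)
        data->ref = entry;
}

/* Update the is_run bit through the remembered entry. */
static int sketch_update_ga(struct sketch_ir_data *data, bool is_run)
{
    if (!data->ref)
        return 0; /* nothing recorded: the update is skipped */

    data->ref->is_run = is_run;
    return 0;
}
```

      Passing NULL to the first helper leaves ref unset, so the second helper
      returns early and is_run never changes.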