Commit e639222a authored by Chen Yu, committed by Thomas Gleixner

x86/paravirt: Fix incorrect virt spinlock setting on bare metal

The kernel can change spinlock behavior when running as a guest. But this
guest-friendly behavior causes performance problems on bare metal.

The kernel uses a static key to switch between the two modes.

In theory, the static key is enabled by default (run in guest mode) and
should be disabled for bare metal (and in some guests that want native
behavior or paravirt spinlock).

A performance drop was reported when running an encode/decode workload and
the BenchSEE cache sub-workload.

Bisection points to commit ce0a1b60 ("x86/paravirt: Silence unused
native_pv_lock_init() function warning"). Since that commit, when
CONFIG_PARAVIRT_SPINLOCKS is disabled, virt_spin_lock_key is never disabled
and therefore stays true on bare metal. The qspinlock then degenerates to a
test-and-set spinlock, which hurts performance on bare metal.
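
For readers unfamiliar with the term, a test-and-set spinlock can be sketched
in userspace C11 as below. This is only an illustration, not the kernel's
virt_spin_lock(): the names struct tas_lock, tas_trylock(), tas_lock(),
tas_unlock() and tas_self_check() are hypothetical. Every waiter spins on the
same word and there is no queue, so there is no fairness between waiters.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace lock: 0 means free, 1 means held. */
struct tas_lock {
	atomic_int val;
};

/* Try once to move the lock word from 0 to 1. */
static bool tas_trylock(struct tas_lock *lock)
{
	int expected = 0;

	return atomic_compare_exchange_strong_explicit(&lock->val, &expected, 1,
						       memory_order_acquire,
						       memory_order_relaxed);
}

/* Spin until the lock word can be moved from 0 to 1. */
static void tas_lock(struct tas_lock *lock)
{
	for (;;) {
		int expected = 0;

		/* Read first so that waiters do not hammer the cache line. */
		if (atomic_load_explicit(&lock->val, memory_order_relaxed) == 0 &&
		    atomic_compare_exchange_weak_explicit(&lock->val, &expected, 1,
							  memory_order_acquire,
							  memory_order_relaxed))
			return;
	}
}

static void tas_unlock(struct tas_lock *lock)
{
	atomic_store_explicit(&lock->val, 0, memory_order_release);
}

/* Single-threaded sanity check of the lock word transitions. */
static void tas_self_check(void)
{
	struct tas_lock lock = { 0 };

	tas_lock(&lock);
	assert(!tas_trylock(&lock));	/* already held */
	tas_unlock(&lock);
	assert(tas_trylock(&lock));	/* free again */
	tas_unlock(&lock);
}
```

Such a lock tolerates a preempted waiter, because no other waiter is queued
behind it, which is presumably why it is the guest default. On bare metal
that benefit does not apply, and all waiters contend on one cache line.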

Set the default value of virt_spin_lock_key to false. If booting in a VM,
enable this key. Later during VM initialization, if a more efficient
spinlock is preferred (e.g. the paravirt spinlock), or the user wants the
native qspinlock (via the nopvspin boot command-line option),
virt_spin_lock_key is disabled accordingly.

This results in the following decision matrix:

X86_FEATURE_HYPERVISOR         Y    Y       Y     N
CONFIG_PARAVIRT_SPINLOCKS      Y    Y       N     Y/N
PV spinlock                    Y    N       N     Y/N

virt_spin_lock_key             N    Y/N     Y     N
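
The matrix can also be read as a small decision function. The sketch below is
a userspace model only, not kernel code: struct boot_env, its fields,
virt_spin_lock_key_model() and matrix_self_check() are hypothetical names, and
the wants_native flag is an assumption that stands in for whatever makes a
guest choose the native qspinlock (for example nopvspin or pinned vCPUs).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the inputs to the decision matrix. */
struct boot_env {
	bool hypervisor;		/* X86_FEATURE_HYPERVISOR is set */
	bool config_pv_spinlocks;	/* CONFIG_PARAVIRT_SPINLOCKS=y */
	bool pv_spinlock_used;		/* guest switched to PV spinlocks */
	bool wants_native;		/* assumption: e.g. nopvspin */
};

/* Returns the modelled final state of virt_spin_lock_key. */
static bool virt_spin_lock_key_model(const struct boot_env *env)
{
	bool key = false;		/* new default: disabled */

	/* Models native_pv_lock_init(): enable only when in a guest. */
	if (env->hypervisor)
		key = true;

	/* Models the later, conditional disable during guest setup. */
	if (key && env->config_pv_spinlocks &&
	    (env->pv_spinlock_used || env->wants_native))
		key = false;

	return key;
}

/* Walks the four columns of the matrix, left to right. */
static void matrix_self_check(void)
{
	struct boot_env pv_guest     = { true,  true,  true,  false };
	struct boot_env native_guest = { true,  true,  false, true  };
	struct boot_env plain_guest  = { true,  true,  false, false };
	struct boot_env no_pv_config = { true,  false, false, false };
	struct boot_env bare_metal   = { false, true,  false, false };

	assert(!virt_spin_lock_key_model(&pv_guest));
	assert(!virt_spin_lock_key_model(&native_guest));
	assert(virt_spin_lock_key_model(&plain_guest));
	assert(virt_spin_lock_key_model(&no_pv_config));
	assert(!virt_spin_lock_key_model(&bare_metal));
}
```

The second column of the matrix (Y/N) corresponds to native_guest and
plain_guest in this model: the key ends up disabled only if the guest asks for
native behavior.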

Fixes: ce0a1b60 ("x86/paravirt: Silence unused native_pv_lock_init() function warning")
Reported-by: Prem Nath Dey <prem.nath.dey@intel.com>
Reported-by: Xiaoping Zhou <xiaoping.zhou@intel.com>
Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
Suggested-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Suggested-by: Nikolay Borisov <nik.borisov@suse.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20240806112207.29792-1-yu.c.chen@intel.com
parent ab84ba64

@@ -66,13 +66,15 @@ static inline bool vcpu_is_preempted(long cpu)

 #ifdef CONFIG_PARAVIRT
 /*
- * virt_spin_lock_key - enables (by default) the virt_spin_lock() hijack.
+ * virt_spin_lock_key - disables by default the virt_spin_lock() hijack.
  *
- * Native (and PV wanting native due to vCPU pinning) should disable this key.
- * It is done in this backwards fashion to only have a single direction change,
- * which removes ordering between native_pv_spin_init() and HV setup.
+ * Native (and PV wanting native due to vCPU pinning) should keep this key
+ * disabled. Native does not touch the key.
+ *
+ * When in a guest then native_pv_lock_init() enables the key first and
+ * KVM/XEN might conditionally disable it later in the boot process again.
  */
-DECLARE_STATIC_KEY_TRUE(virt_spin_lock_key);
+DECLARE_STATIC_KEY_FALSE(virt_spin_lock_key);

 /*
  * Shortcut for the queued_spin_lock_slowpath() function that allows

@@ -51,13 +51,12 @@ DEFINE_ASM_FUNC(pv_native_irq_enable, "sti", .noinstr.text);
 DEFINE_ASM_FUNC(pv_native_read_cr2, "mov %cr2, %rax", .noinstr.text);
 #endif

-DEFINE_STATIC_KEY_TRUE(virt_spin_lock_key);
+DEFINE_STATIC_KEY_FALSE(virt_spin_lock_key);

 void __init native_pv_lock_init(void)
 {
-	if (IS_ENABLED(CONFIG_PARAVIRT_SPINLOCKS) &&
-	    !boot_cpu_has(X86_FEATURE_HYPERVISOR))
-		static_branch_disable(&virt_spin_lock_key);
+	if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
+		static_branch_enable(&virt_spin_lock_key);
 }

 static void native_tlb_remove_table(struct mmu_gather *tlb, void *table)