    x86/perf: Fix virtualization sanity check · bffd5fc2
    Andre Przywara authored
    In check_hw_exists() we try to detect non-emulated MSR accesses
    by writing an arbitrary value into one of the PMU registers
    and checking whether its value is still the same after reading it back.
    This algorithm silently assumes that the register does not contain
    the magic value already, which is wrong in at least one situation.
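
    The blind spot can be reproduced in user space. The sketch below is
    only an illustration, not the kernel code: struct fake_msr,
    fake_wrmsr(), fake_rdmsr() and old_check_passes() are hypothetical
    names that model a register whose writes may be silently dropped.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a PMU counter MSR; not a kernel structure. */
struct fake_msr {
    uint64_t value;      /* current register contents */
    bool drops_writes;   /* true: writes are silently ignored */
};

/* Simulated write: has no effect when the "hypervisor" drops writes. */
static void fake_wrmsr(struct fake_msr *msr, uint64_t val)
{
    if (!msr->drops_writes)
        msr->value = val;
}

/* Simulated read: always returns whatever is stored in the register. */
static uint64_t fake_rdmsr(const struct fake_msr *msr)
{
    return msr->value;
}

/*
 * Fixed-magic check as described above: write 0xabcd and compare the
 * read-back. It cannot tell "my write landed" apart from "0xabcd was
 * already there".
 */
static bool old_check_passes(struct fake_msr *msr)
{
    const uint64_t magic = 0xabcd;

    fake_wrmsr(msr, magic);
    return fake_rdmsr(msr) == magic;
}
```

    With a register that already holds 0xabcd and drops writes,
    old_check_passes() returns true even though nothing was written.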
    
    Fix the algorithm to really do a read-modify-write cycle. This fixes
    a warning under Xen under some circumstances on AMD family 10h CPUs.
    
    The reasons, in more detail, actually sound like a story from
    Believe It or Not!:
    
    First you need an AMD family 10h/12h CPU. These do not reset the
    PERF_CTR registers on a reboot.
    Now you boot bare metal Linux, which goes successfully through this
    check, but leaves the magic value of 0xabcd in the register. You
    don't use the performance counters, but do a reboot (warm reset).
    Then you choose to boot Xen. The check will be triggered with a
    recent Linux kernel as Dom0 again, trying to write 0xabcd into the
    MSR. Xen silently drops the write (expected), but the subsequent read
    will return the value in the register, which just happens to be the
    expected magic value. Thus the test misleadingly succeeds, leaving
    the kernel in the belief that the PMU is available. This will trigger
    the following message:
    
    [    0.020294] ------------[ cut here ]------------
    [    0.020311] WARNING: at arch/x86/xen/enlighten.c:730 xen_apic_write+0x15/0x17()
    [    0.020318] Hardware name: empty
    [    0.020323] Modules linked in:
    [    0.020334] Pid: 1, comm: swapper/0 Not tainted 3.3.8 #7
    [    0.020340] Call Trace:
    [    0.020354]  [<ffffffff81050379>] warn_slowpath_common+0x80/0x98
    [    0.020369]  [<ffffffff810503a6>] warn_slowpath_null+0x15/0x17
    [    0.020378]  [<ffffffff810034df>] xen_apic_write+0x15/0x17
    [    0.020392]  [<ffffffff8101cb2b>] perf_events_lapic_init+0x2e/0x30
    [    0.020410]  [<ffffffff81ee4dd0>] init_hw_perf_events+0x250/0x407
    [    0.020419]  [<ffffffff81ee4b80>] ? check_bugs+0x2d/0x2d
    [    0.020430]  [<ffffffff81002181>] do_one_initcall+0x7a/0x131
    [    0.020444]  [<ffffffff81edbbf9>] kernel_init+0x91/0x15d
    [    0.020456]  [<ffffffff817caaa4>] kernel_thread_helper+0x4/0x10
    [    0.020471]  [<ffffffff817c347c>] ? retint_restore_args+0x5/0x6
    [    0.020481]  [<ffffffff817caaa0>] ? gs_change+0x13/0x13
    [    0.020500] ---[ end trace a7919e7f17c0a725 ]---
    
    The new code flips each of the low 16 bits read from the register,
    writes the modified value to the MSR and reads it back to check
    that it matches.
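
    A minimal user-space sketch of that read-modify-write idea follows.
    It is an illustration under assumptions, not the patch itself:
    struct sim_msr, sim_wrmsr(), sim_rdmsr() and rmw_check_passes() are
    hypothetical names, and the XOR mask 0xffff is simply one way to
    change every one of the low 16 bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of an MSR; not a kernel structure. */
struct sim_msr {
    uint64_t value;      /* current register contents */
    bool drops_writes;   /* true: writes are silently ignored */
};

/* Simulated write: has no effect when writes are dropped. */
static void sim_wrmsr(struct sim_msr *msr, uint64_t val)
{
    if (!msr->drops_writes)
        msr->value = val;
}

/* Simulated read: returns the stored register contents. */
static uint64_t sim_rdmsr(const struct sim_msr *msr)
{
    return msr->value;
}

/*
 * Read-modify-write check: read the current value, flip its low 16
 * bits, write the result and compare the read-back. The value written
 * always differs from the old contents, so a dropped write can never
 * look like a successful one.
 */
static bool rmw_check_passes(struct sim_msr *msr)
{
    uint64_t val = sim_rdmsr(msr) ^ 0xffffULL;

    sim_wrmsr(msr, val);
    return sim_rdmsr(msr) == val;
}
```

    In this model a register that holds a stale 0xabcd and drops writes
    fails the check, while a register that accepts writes passes it.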

    Signed-off-by: Andre Przywara <andre.przywara@amd.com>
    Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
    Cc: Avi Kivity <avi@redhat.com>
    Link: http://lkml.kernel.org/r/1349797115-28346-2-git-send-email-andre.przywara@amd.com
    Signed-off-by: Ingo Molnar <mingo@kernel.org>