    x86, vdso: Use asm volatile in __getcpu · 1ddf0b1b
    Andy Lutomirski authored
    In Linux 3.18 and below, GCC hoists the lsl instructions in the
    pvclock code all the way to the beginning of __vdso_clock_gettime,
    slowing the non-paravirt case significantly.  For unknown reasons,
    presumably related to the removal of a branch, the performance issue
    is gone as of
    
    e76b027e x86,vdso: Use LSL unconditionally for vgetcpu
    
    but I don't trust GCC enough to expect the problem to stay fixed.
    
    There should be no correctness issue, because the __getcpu calls in
    __vdso_clock_gettime were never necessary in the first place.
    
    Note to stable maintainers: In 3.18 and below, depending on
    configuration, gcc 4.9.2 generates code like this:
    
         9c3:       44 0f 03 e8             lsl    %ax,%r13d
         9c7:       45 89 eb                mov    %r13d,%r11d
         9ca:       0f 03 d8                lsl    %ax,%ebx
    
    This patch won't apply as is to any released kernel, but I'll send a
    trivial backported version if needed.
    
    Fixes: 51c19b4f x86: vdso: pvclock gettime support
    Cc: stable@vger.kernel.org # 3.8+
    Cc: Marcelo Tosatti <mtosatti@redhat.com>
    Acked-by: Paolo Bonzini <pbonzini@redhat.com>
    Signed-off-by: Andy Lutomirski <luto@amacapital.net>
vgtod.h 1.84 KB