• Andy Lutomirski's avatar
    x86, vdso: Use asm volatile in __getcpu · ccf82969
    Andy Lutomirski authored
    commit 1ddf0b1b upstream.
    
    In Linux 3.18 and below, GCC hoists the lsl instructions in the
    pvclock code all the way to the beginning of __vdso_clock_gettime,
    slowing the non-paravirt case significantly.  For unknown reasons,
    presumably related to the removal of a branch, the performance issue
    is gone as of
    
    e76b027e x86,vdso: Use LSL unconditionally for vgetcpu
    
    but I don't trust GCC enough to expect the problem to stay fixed.
    
    There should be no correctness issue, because the __getcpu calls in
    __vdso_vlock_gettime were never necessary in the first place.
    
    Note to stable maintainers: In 3.18 and below, depending on
    configuration, gcc 4.9.2 generates code like this:
    
         9c3:       44 0f 03 e8             lsl    %ax,%r13d
         9c7:       45 89 eb                mov    %r13d,%r11d
         9ca:       0f 03 d8                lsl    %ax,%ebx
    
    This patch won't apply as is to any released kernel, but I'll send a
    trivial backported version if needed.
    
    [
     Backported by Andy Lutomirski.  Should apply to all affected
     versions.  This fixes a functionality bug as well as a performance
     bug: buggy kernels can infinite loop in __vdso_clock_gettime on
     affected compilers.  See, for exammple:
    
     https://bugzilla.redhat.com/show_bug.cgi?id=1178975
    ]
    
    Fixes: 51c19b4f x86: vdso: pvclock gettime support
    Cc: Marcelo Tosatti <mtosatti@redhat.com>
    Acked-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
    Signed-off-by: default avatarAndy Lutomirski <luto@amacapital.net>
    Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
    ccf82969
vsyscall.h 883 Bytes