    x86/percpu: Use C for percpu read/write accessors · ca425634
    Uros Bizjak authored
    The percpu code mostly uses inline assembly. Using segment qualifiers
    allows the use of C code instead, which enables the compiler to perform
    various optimizations (e.g. propagation of memory arguments). Convert
    the percpu read and write accessors to C code, so that the memory
    argument can be propagated to the instruction that uses it.
    
    Some examples of propagations:
    
    a) into sign/zero extensions:
    
    the code improves from:
    
        65 8a 05 00 00 00 00    mov    %gs:0x0(%rip),%al
        0f b6 c0                movzbl %al,%eax
    
    to:
    
        65 0f b6 05 00 00 00    movzbl %gs:0x0(%rip),%eax
        00
    
    and in a similar way for:
    
        movzbl %gs:0x0(%rip),%edx
        movzwl %gs:0x0(%rip),%esi
        movzbl %gs:0x78(%rbx),%eax
    
        movslq %gs:0x0(%rip),%rdx
        movslq %gs:(%rdi),%rbx
    
    b) into compares:
    
    the code improves from:
    
        65 8b 05 00 00 00 00    mov    %gs:0x0(%rip),%eax
        a9 00 00 0f 00          test   $0xf0000,%eax
    
    to:
    
        65 f7 05 00 00 00 00    testl  $0xf0000,%gs:0x0(%rip)
        00 00 0f 00
    
    and in a similar way for:
    
        testl  $0xf0000,%gs:0x0(%rip)
        testb  $0x1,%gs:0x0(%rip)
        testl  $0xff00,%gs:0x0(%rip)
    
        cmpb   $0x0,%gs:0x0(%rip)
        cmp    %gs:0x0(%rip),%r14d
        cmpw   $0x8,%gs:0x0(%rip)
        cmpb   $0x0,%gs:(%rax)
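
    A source shape that can lead to such a compare might look like the
    following sketch. The variable and function names are hypothetical,
    and an ordinary global stands in for the per-CPU variable; whether a
    single test-with-memory-operand is emitted depends on the compiler
    and its options.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a per-CPU word; in the kernel it would be
 * reached through %gs rather than as a plain global. */
static uint32_t hypothetical_flags;

void set_flags(uint32_t v)
{
	hypothetical_flags = v;
}

/* The loaded value feeds only the mask-and-compare, so the compiler is
 * free to test the memory operand directly instead of mov + test. */
bool any_flag_set(void)
{
	return (hypothetical_flags & 0xf0000u) != 0;
}
```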
    
    c) into other insns:
    
    the code improves from:
    
       1a355:	83 fa ff             	cmp    $0xffffffff,%edx
       1a358:	75 07                	jne    1a361 <...>
       1a35a:	65 8b 15 00 00 00 00 	mov    %gs:0x0(%rip),%edx
       1a361:
    
    to:
    
       1a35a:	83 fa ff             	cmp    $0xffffffff,%edx
       1a35d:	65 0f 44 15 00 00 00 	cmove  %gs:0x0(%rip),%edx
       1a364:	00
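
    One plausible source shape behind this pattern is a conditional
    assignment from a per-CPU value, sketched below with hypothetical
    names and an ordinary global in place of the per-CPU variable. With
    the load written in C, the compiler may replace the branch with a
    conditional move from memory; this is not guaranteed.

```c
/* Hypothetical stand-in for a per-CPU value. */
static int hypothetical_cpu_number = 7;

/* Replace the "no CPU" marker (-1) with the current value; any other
 * argument is returned unchanged. */
int resolve_cpu(int cpu)
{
	if (cpu == -1)
		cpu = hypothetical_cpu_number;
	return cpu;
}
```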
    
    The above propagations result in the following code size
    improvements (7940 bytes of text) for the current mainline kernel
    (with the default config), compiled with:
    
       # gcc (GCC) 12.3.1 20230508 (Red Hat 12.3.1-1)
    
       text            data     bss    dec             filename
       25508862        4386540  808388 30703790        vmlinux-vanilla.o
       25500922        4386532  808388 30695842        vmlinux-new.o

    Co-developed-by: Nadav Amit <namit@vmware.com>
    Signed-off-by: Nadav Amit <namit@vmware.com>
    Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Cc: Andy Lutomirski <luto@kernel.org>
    Cc: Brian Gerst <brgerst@gmail.com>
    Cc: Denys Vlasenko <dvlasenk@redhat.com>
    Cc: H. Peter Anvin <hpa@zytor.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Josh Poimboeuf <jpoimboe@redhat.com>
    Link: https://lore.kernel.org/r/20231004192404.31733-1-ubizjak@gmail.com