    powerpc: Optimise enable_kernel_altivec · 35000870
    Anton Blanchard authored
    Add two optimisations to enable_kernel_altivec:
    
    - enable_kernel_altivec has already determined if we need to
    save the previous task's state but we call giveup_altivec
    in both cases, requiring an extra branch in giveup_altivec. Create
    giveup_altivec_notask which only turns on the VMX bit in the
    MSR.
    
    - We write the VMX MSR bit each time we call enable_kernel_altivec
    even if it was already set. Check the bit and branch out if we have
    already set it. The classic case for this is vectored IO
    where we have to copy multiple buffers to or from userspace.
    
    The following testcase was used to confirm this patch improves
    performance:
    
    http://ozlabs.org/~anton/junkcode/copy_to_user.c
    
    Since the current breakpoint for using VMX in copy_tofrom_user is
    4096 bytes, I'm using buffers of 4096 + 1 cacheline (4224) bytes.
    A benchmark of 16 entry readvs (-s 16):
    
    time copy_to_user -l 4224 -s 16 -i 1000000
    
    completes 5.2% faster on a POWER7 PS700.
    Signed-off-by: Anton Blanchard <anton@samba.org>
    Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
vector.S 8.71 KB