    sparc64: Improve 64-bit constant loading in eBPF JIT. · 14933dc8
    David S. Miller authored
    Doing a full 64-bit decomposition is really stupid especially for
    simple values like 0 and -1.
    
    But if we are going to optimize this, go all the way and try for all
    2- and 3-instruction sequences not requiring a temporary register as
    well.
    
    First we do the easy cases where it's a zero- or sign-extended 32-bit
    number (sethi+or, sethi+xor, respectively).
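
    A minimal sketch of the two easy cases, modelled in plain C rather
    than emitted as SPARC instructions.  The helper names (sim_sethi,
    load_zext32, load_sext32, is_zext32, is_neg_sext32) are hypothetical
    and are not taken from the JIT source; the sketch only assumes that
    sethi writes bits 31..10 and clears the rest, and that the xor
    immediate is a sign-extended 13-bit value.

```c
#include <stdint.h>

/* Model of sethi: keep bits 31..10 of v, clear bits 9..0 and 63..32. */
static uint64_t sim_sethi(uint32_t v)
{
	return (uint64_t)(v & 0xfffffc00u);
}

/* Zero-extended case: sethi %hi(low), reg; or reg, %lo(low), reg */
static uint64_t load_zext32(uint32_t low)
{
	return sim_sethi(low) | (uint64_t)(low & 0x3ffu);
}

/*
 * Sign-extended negative case: sethi %hi(~low), reg; xor reg, simm13, reg.
 * The immediate lies in -1024..-1, so it sign-extends to all ones above
 * bit 9.  The xor therefore flips bits 63..10 back and fills in bits 9..0.
 */
static uint64_t load_sext32(uint32_t low)
{
	uint64_t reg = sim_sethi(~low);
	int64_t simm13 = (int64_t)(low & 0x3ffu) - 0x400;

	return reg ^ (uint64_t)simm13;
}

/* Non-zero if k is a 32-bit value zero-extended to 64 bits. */
static int is_zext32(uint64_t k)
{
	return (k >> 32) == 0;
}

/* Non-zero if k is a negative 32-bit value sign-extended to 64 bits. */
static int is_neg_sext32(uint64_t k)
{
	return (k >> 31) == 0x1ffffffffull;
}
```

    With these helpers, load_sext32(0xffffffff) models loading -1 in two
    instructions instead of a full 64-bit decomposition.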
    
    Then we try to find a range of set bits we can load simply then shift
    up into place, in various ways.
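
    The "range of set bits" idea can be sketched as below.  This is an
    illustration only: struct bit_span, analyze_bits and fits_mov_sllx
    are hypothetical names, and the sketch covers just one of the
    "various ways", namely a small positive immediate (at most 12
    significant bits, so it fits a positive simm13) followed by a left
    shift.

```c
#include <stdint.h>

struct bit_span {
	int lowest;     /* index of the lowest set bit */
	int highest;    /* index of the highest set bit */
	int contiguous; /* 1 if every bit between them is set */
};

/* Locate the span of set bits in k.  k must be non-zero. */
static struct bit_span analyze_bits(uint64_t k)
{
	struct bit_span s = { 0, 63, 0 };
	uint64_t run;

	while (!((k >> s.lowest) & 1))
		s.lowest++;
	while (!((k >> s.highest) & 1))
		s.highest--;

	/* A run of ones plus one is a power of two (or wraps to zero). */
	run = k >> s.lowest;
	s.contiguous = ((run + 1) & run) == 0;
	return s;
}

/*
 * Can k be built as "mov imm, reg; sllx reg, shift, reg"?
 * Returns 1 and fills *imm and *shift when the set bits span
 * fewer than 12 positions, otherwise returns 0.
 */
static int fits_mov_sllx(uint64_t k, unsigned *imm, unsigned *shift)
{
	struct bit_span s;

	if (k == 0)
		return 0;
	s = analyze_bits(k);
	if (s.highest - s.lowest >= 12)
		return 0;
	*imm = (unsigned)(k >> s.lowest);
	*shift = (unsigned)s.lowest;
	return 1;
}
```

    For 0x0000ffffffff0000 the span is bits 16..47 and is contiguous,
    but it is too wide for this particular two-instruction form, so a
    longer sequence is needed.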
    
    Then we try negating the constant and see if we can do a simple
    sequence using that with a xor at the end (e.g. the range of set
    bits can't be loaded simply, but for the negated value it can).
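
    A hedged sketch of the negation trick follows.  It assumes that
    "negating" means taking the bitwise complement, and that the
    trailing xor uses a sign-extended 13-bit immediate which restores
    the upper bits and supplies the low 10 bits.  The function names
    are hypothetical and the actual JIT may split the work differently.

```c
#include <stdint.h>

/* Value to load first: the complement of k with the low 10 bits cleared. */
static uint64_t complement_target(uint64_t k)
{
	return ~k & ~(uint64_t)0x3ff;
}

/*
 * Immediate for the trailing xor.  It lies in -1024..-1, so it fits a
 * simm13 and sign-extends to all ones above bit 9.
 */
static int64_t final_xor_imm(uint64_t k)
{
	return (int64_t)(k & 0x3ff) - 0x400;
}

/* Model of the whole sequence: load the complement, then xor. */
static uint64_t rebuild(uint64_t k)
{
	return complement_target(k) ^ (uint64_t)final_xor_imm(k);
}
```

    For example, 0xffff00000000ffff has two separate runs of set bits,
    but its complement target is 0x0000ffffffff0000, a single
    contiguous run, and the final xor immediate is -1.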
    
    The final optimized strategy involves 4-instruction sequences not
    needing a temporary register.
    
    Otherwise we sadly fully decompose using a temp.
    
    Example, from ALU64_XOR_K: 0x0000ffffffff0000 ^ 0x0 = 0x0000ffffffff0000:
    
    0000000000000000 <foo>:
       0:   9d e3 bf 50     save  %sp, -176, %sp
       4:   01 00 00 00     nop
       8:   90 10 00 18     mov  %i0, %o0
       c:   13 3f ff ff     sethi  %hi(0xfffffc00), %o1
      10:   92 12 63 ff     or  %o1, 0x3ff, %o1     ! ffffffff <foo+0xffffffff>
      14:   93 2a 70 10     sllx  %o1, 0x10, %o1
      18:   15 3f ff ff     sethi  %hi(0xfffffc00), %o2
      1c:   94 12 a3 ff     or  %o2, 0x3ff, %o2     ! ffffffff <foo+0xffffffff>
      20:   95 2a b0 10     sllx  %o2, 0x10, %o2
      24:   92 1a 60 00     xor  %o1, 0, %o1
      28:   12 e2 40 8a     cxbe  %o1, %o2, 38 <foo+0x38>
      2c:   9a 10 20 02     mov  2, %o5
      30:   10 60 00 03     b,pn   %xcc, 3c <foo+0x3c>
      34:   01 00 00 00     nop
      38:   9a 10 20 01     mov  1, %o5     ! 1 <foo+0x1>
      3c:   81 c7 e0 08     ret
      40:   91 eb 40 00     restore  %o5, %g0, %o0
    Signed-off-by: David S. Miller <davem@davemloft.net>
bpf_jit_comp_64.c 37.5 KB