1. 27 Mar, 2017 1 commit
    • Alexey Dobriyan's avatar
      xfrm: branchless addr4_match() on 64-bit · 6c786bcb
      Alexey Dobriyan authored
      Current addr4_match() code has special test for /0 prefixes because of
      standard required undefined behaviour. However, it is possible to omit
      it on 64-bit because shifting can be done within a 64-bit register and
      then truncated to the expected value (which is 0 mask).
      
      Implicit truncation by htonl() fits nicely into R32-within-R64 model
      on x86-64.
      
      Space savings: none (coincidence)
      Branch savings: 1
      
      Before:
      
      	movzx  eax,BYTE PTR [rdi+0x2a]		# ->prefixlen_d
      	test   al,al
      	jne    xfrm_selector_match + 0x23f
      		...
      	movzx  eax,BYTE PTR [rbx+0x2b]		# ->prefixlen_s
      	test   al,al
      	je     xfrm_selector_match + 0x1c7
      
      After (no branches):
      
      	mov    r8d,0x20
      	mov    rdx,0xffffffffffffffff
      	mov    esi,DWORD PTR [rsi+0x2c]
      	mov    ecx,r8d
      	sub    cl,BYTE PTR [rdi+0x2a]
      	xor    esi,DWORD PTR [rbx]
      	mov    rdi,rdx
      	xor    eax,eax
      	shl    rdi,cl
      	bswap  edi
      Signed-off-by: default avatarAlexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: default avatarSteffen Klassert <steffen.klassert@secunet.com>
      6c786bcb
  2. 24 Mar, 2017 10 commits
  3. 23 Mar, 2017 29 commits