crypto: camellia-aesni-avx2 - tune assembly code for more performance
Add implementation tuned for more performance on real hardware. Changes are mostly around the part mixing 128-bit extract and insert instructions and AES-NI instructions. Also 'vpbroadcastb' instructions have been change to 'vpshufb with zero mask'. Tests on Intel Core i5-4570: tcrypt ECB results, old-AVX2 vs new-AVX2: size 128bit key 256bit key enc dec enc dec 256 1.00x 1.00x 1.00x 1.00x 1k 1.08x 1.09x 1.05x 1.06x 8k 1.06x 1.06x 1.06x 1.06x tcrypt ECB results, AVX vs new-AVX2: size 128bit key 256bit key enc dec enc dec 256 1.00x 1.00x 1.00x 1.00x 1k 1.51x 1.50x 1.52x 1.50x 8k 1.47x 1.48x 1.48x 1.48x Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Showing
Please register or sign in to comment