    crypto: cast5 - add x86_64/avx assembler implementation · 4d6d6a2c
    Johannes Goetzfried authored
    This patch adds an x86_64/AVX assembler implementation of the Cast5 block
    cipher. The implementation processes sixteen blocks in parallel (four
    4-block-chunk AVX operations). The table lookups are done in general-purpose
    registers. For small block sizes, the functions from the generic module are
    called. Block sizes greater than or equal to 128B see a good performance
    increase (roughly 2x or more at 256B and above in the benchmarks below,
    except for cbc-enc).
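
The dispatch described above (a wide path for full sixteen-block chunks, the generic single-block path for whatever is left) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from this patch: every name here (toy_crypt_one, toy_crypt_16way, toy_ecb_crypt, the counters) is hypothetical, and the XOR "cipher" is a placeholder standing in for both the generic module and the AVX routine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE      8   /* CAST5 operates on 64-bit blocks */
#define PARALLEL_BLOCKS 16  /* blocks handled per call of the wide path */

/* Counters showing which path handled the data (illustration only). */
static size_t wide_calls, single_calls;

/* Placeholder single-block transform; NOT CAST5. Stands in for the
 * generic module's one-block function. */
static void toy_crypt_one(uint8_t *dst, const uint8_t *src, uint8_t key)
{
    for (size_t i = 0; i < BLOCK_SIZE; i++)
        dst[i] = src[i] ^ key;
    single_calls++;
}

/* Placeholder for the 16-way routine. A real implementation would be
 * the AVX assembler; here it is a plain loop over 16 blocks. */
static void toy_crypt_16way(uint8_t *dst, const uint8_t *src, uint8_t key)
{
    for (size_t i = 0; i < PARALLEL_BLOCKS * BLOCK_SIZE; i++)
        dst[i] = src[i] ^ key;
    wide_calls++;
}

/* ECB-style driver: full 128-byte chunks go to the wide path, the
 * remaining blocks go one at a time through the single-block path. */
static void toy_ecb_crypt(uint8_t *dst, const uint8_t *src,
                          size_t nbytes, uint8_t key)
{
    const size_t wide = PARALLEL_BLOCKS * BLOCK_SIZE;

    assert(nbytes % BLOCK_SIZE == 0);  /* whole blocks only */

    while (nbytes >= wide) {
        toy_crypt_16way(dst, src, key);
        src += wide;
        dst += wide;
        nbytes -= wide;
    }
    while (nbytes >= BLOCK_SIZE) {
        toy_crypt_one(dst, src, key);
        src += BLOCK_SIZE;
        dst += BLOCK_SIZE;
        nbytes -= BLOCK_SIZE;
    }
}
```

With a 64-byte request the wide path is never entered in this sketch, which is consistent with the 16B and 64B rows below showing no speedup over cast5-generic.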
    
    Patch has been tested with tcrypt and automated filesystem tests.
    
    Tcrypt benchmark results:
    
    Intel Core i5-2500 CPU (fam:6, model:42, step:7)
    
    cast5-avx-x86_64 vs. cast5-generic
    64bit key:
    size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec
    16B     0.99x   0.99x   1.00x   1.00x   1.02x   1.01x
    64B     1.00x   1.00x   0.98x   1.00x   1.01x   1.02x
    256B    2.03x   2.01x   0.95x   2.11x   2.12x   2.13x
    1024B   2.30x   2.24x   0.95x   2.29x   2.35x   2.35x
    8192B   2.31x   2.27x   0.95x   2.31x   2.39x   2.39x
    
    128bit key:
    size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec
    16B     0.99x   0.99x   1.00x   1.00x   1.01x   1.01x
    64B     1.00x   1.00x   0.98x   1.01x   1.02x   1.01x
    256B    2.17x   2.13x   0.96x   2.19x   2.19x   2.19x
    1024B   2.29x   2.32x   0.95x   2.34x   2.37x   2.38x
    8192B   2.35x   2.32x   0.95x   2.35x   2.39x   2.39x
    Signed-off-by: Johannes Goetzfried <Johannes.Goetzfried@informatik.stud.uni-erlangen.de>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
cast5-avx-x86_64-asm_64.S 7.43 KB