    crypto: crc32c - Optimize CRC32C calculation with PCLMULQDQ instruction · 6a8ce1ef
    Tim Chen authored
    This patch adds the crc_pcl function, which calculates the CRC32C checksum
    using the PCLMULQDQ instruction on processors that support it. This provides
    a speedup over using the CRC32 instruction alone.
    Using PCLMULQDQ requires calling kernel_fpu_begin and kernel_fpu_end, which
    incurs some overhead, so the new crc_pcl function is only invoked for buffers
    of 512 bytes or more. Larger buffers can be expected to see a greater
    speedup. This feature is best used together with eager_fpu, which reduces
    the kernel_fpu_begin/end overhead. For a buffer size of 1K the speedup is
    around 1.6x, and for buffer sizes greater than 4K the speedup is around 3x
    compared to the original implementation in the crc32c-intel module. The test
    was performed on a Sandy Bridge based platform with the CPU set to a
    constant frequency.
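
The size-based dispatch described above can be illustrated with a minimal user-space sketch. It is not the code from this patch: the names `CRC32C_PCL_BREAKEVEN`, `choose_path`, `crc32c_bitwise` and the `fpu_usable` flag are hypothetical, and a portable bitwise CRC32C stands in for both the CRC32-instruction path and the PCLMULQDQ path, so only the control flow is shown. The only value taken from the commit message is the 512-byte threshold.

```c
#include <stddef.h>
#include <stdint.h>

/* Threshold from the commit message: below this size the cost of
 * kernel_fpu_begin/kernel_fpu_end is assumed to outweigh the gain. */
#define CRC32C_PCL_BREAKEVEN 512

/* Hypothetical labels for the two code paths. */
enum crc32c_path { PATH_CRC32_ONLY, PATH_PCLMULQDQ };

/* Pick a path: the PCLMULQDQ path only for large buffers, and only when
 * the FPU state may be used (fpu_usable is an assumed caller-supplied flag). */
static enum crc32c_path choose_path(size_t len, int fpu_usable)
{
    if (len >= CRC32C_PCL_BREAKEVEN && fpu_usable)
        return PATH_PCLMULQDQ;
    return PATH_CRC32_ONLY;
}

/* Portable bitwise CRC32C update (reflected Castagnoli polynomial
 * 0x82F63B78). Stand-in for both hardware-accelerated routines. */
static uint32_t crc32c_bitwise(uint32_t crc, const unsigned char *p, size_t len)
{
    while (len--) {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return crc;
}

/* Update a running CRC, dispatching on buffer size. In a kernel
 * implementation the PATH_PCLMULQDQ branch would be bracketed by
 * kernel_fpu_begin() and kernel_fpu_end(); here both branches call the
 * same portable routine, so the result is identical either way. */
static uint32_t crc32c_update(uint32_t crc, const unsigned char *p,
                              size_t len, int fpu_usable)
{
    switch (choose_path(len, fpu_usable)) {
    case PATH_PCLMULQDQ:
        /* kernel_fpu_begin(); crc = crc_pcl(...); kernel_fpu_end(); */
        return crc32c_bitwise(crc, p, len);
    case PATH_CRC32_ONLY:
    default:
        return crc32c_bitwise(crc, p, len);
    }
}

/* One-shot CRC32C with the conventional initial value and final inversion. */
static uint32_t crc32c(const unsigned char *p, size_t len)
{
    return ~crc32c_update(0xFFFFFFFFu, p, len, 1);
}
```

With this sketch, `crc32c((const unsigned char *)"123456789", 9)` yields the standard CRC32C check value `0xE3069283`, and a 511-byte buffer stays on the CRC32-only path while a 512-byte buffer is eligible for the PCLMULQDQ path.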
    
    A white paper detailing the algorithm can be found here:
    http://download.intel.com/design/intarch/papers/323405.pdf

    Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
crc32c-pcl-intel-asm_64.S 13.9 KB