Commit 0974037f authored by Eric Biggers, committed by Herbert Xu

crypto: x86/crct10dif-pcl - cleanup and optimizations

The x86, arm, and arm64 asm implementations of crct10dif are very
difficult to understand partly because many of the comments, labels, and
macros are named incorrectly: the lengths mentioned are usually off by a
factor of two from the actual code.  Many other things are unnecessarily
convoluted as well, e.g. there are many more fold constants than
actually needed and some aren't fully reduced.

This series therefore cleans up all these implementations to be much
more maintainable.  I also made some small optimizations where I saw
opportunities, resulting in slightly better performance.

This patch cleans up the x86 version.

As part of this, I removed support for len < 16 from the x86 assembly;
now the glue code falls back to the generic table-based implementation
in this case.  Due to the overhead of kernel_fpu_begin(), this actually
significantly improves performance on these lengths.  (And even if
kernel_fpu_begin() were free, the generic code is still faster for about
len < 11.)  This removal also eliminates error-prone special cases and
makes the x86, arm32, and arm64 ports of the code match more closely.

Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
parent f8903b3e
@@ -33,18 +33,12 @@
 #include <asm/cpufeatures.h>
 #include <asm/cpu_device_id.h>
 
-asmlinkage __u16 crc_t10dif_pcl(__u16 crc, const unsigned char *buf,
-				size_t len);
+asmlinkage u16 crc_t10dif_pcl(u16 init_crc, const u8 *buf, size_t len);
 
 struct chksum_desc_ctx {
 	__u16 crc;
 };
 
-/*
- * Steps through buffer one byte at at time, calculates reflected
- * crc using table.
- */
-
 static int chksum_init(struct shash_desc *desc)
 {
 	struct chksum_desc_ctx *ctx = shash_desc_ctx(desc);
@@ -59,7 +53,7 @@ static int chksum_update(struct shash_desc *desc, const u8 *data,
 {
 	struct chksum_desc_ctx *ctx = shash_desc_ctx(desc);
 
-	if (irq_fpu_usable()) {
+	if (length >= 16 && irq_fpu_usable()) {
 		kernel_fpu_begin();
 		ctx->crc = crc_t10dif_pcl(ctx->crc, data, length);
 		kernel_fpu_end();
@@ -79,7 +73,7 @@ static int chksum_final(struct shash_desc *desc, u8 *out)
 static int __chksum_finup(__u16 *crcp, const u8 *data, unsigned int len,
 			u8 *out)
 {
-	if (irq_fpu_usable()) {
+	if (len >= 16 && irq_fpu_usable()) {
 		kernel_fpu_begin();
 		*(__u16 *)out = crc_t10dif_pcl(*crcp, data, len);
 		kernel_fpu_end();
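
The length check added in the glue code can be illustrated with a small user-space sketch. This is not the kernel code: `simd_usable()` and `crc_t10dif_accel()` are hypothetical stand-ins for `irq_fpu_usable()` and `crc_t10dif_pcl()`, and the stand-in accelerated routine simply reuses the table-based loop so the example stays portable. The table-based routine assumes the usual CRC-16/T10-DIF parameters (polynomial 0x8bb7, MSB-first, initial value 0). Only the dispatch shape, inputs shorter than 16 bytes going to the table-based path, is taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define T10DIF_POLY   0x8bb7u /* CRC-16/T10-DIF generator, MSB-first */
#define ACCEL_MIN_LEN 16      /* threshold used by the glue-code hunks */

static uint16_t t10dif_table[256];
static int t10dif_table_ready;

/* Build the 256-entry lookup table one bit at a time. */
static void t10dif_build_table(void)
{
	for (unsigned int i = 0; i < 256; i++) {
		uint16_t crc = (uint16_t)(i << 8);

		for (int bit = 0; bit < 8; bit++) {
			if (crc & 0x8000)
				crc = (uint16_t)((crc << 1) ^ T10DIF_POLY);
			else
				crc = (uint16_t)(crc << 1);
		}
		t10dif_table[i] = crc;
	}
	t10dif_table_ready = 1;
}

/* Generic table-based path: one table lookup per input byte. */
static uint16_t crc_t10dif_table(uint16_t crc, const uint8_t *buf, size_t len)
{
	if (!t10dif_table_ready)
		t10dif_build_table();
	for (size_t i = 0; i < len; i++)
		crc = (uint16_t)((crc << 8) ^ t10dif_table[(crc >> 8) ^ buf[i]]);
	return crc;
}

/* Hypothetical stand-in for irq_fpu_usable(); always true in user space. */
static int simd_usable(void)
{
	return 1;
}

/*
 * Hypothetical stand-in for crc_t10dif_pcl(). A real accelerated routine
 * would only be valid for len >= 16; this one reuses the table loop.
 */
static uint16_t crc_t10dif_accel(uint16_t crc, const uint8_t *buf, size_t len)
{
	return crc_t10dif_table(crc, buf, len);
}

/* Dispatch in the same shape as the patched chksum_update(). */
static uint16_t crc_t10dif_update(uint16_t crc, const uint8_t *buf, size_t len)
{
	if (len >= ACCEL_MIN_LEN && simd_usable())
		return crc_t10dif_accel(crc, buf, len);
	return crc_t10dif_table(crc, buf, len);
}
```

Calling `crc_t10dif_update(0, (const uint8_t *)"123456789", 9)` takes the table-based branch and returns 0xD0DB, the standard check value for CRC-16/T10-DIF. Because the threshold is applied per call, a caller that feeds data in small pieces stays on the table-based path for every piece shorter than 16 bytes.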