Commits · 8f461b1e02ed546fbd0f11611138da67fd85a30f · Kirill Smelkov / linux

02 Mar, 2018 24 commits

crypto: x86/cast5-avx - fix ECB encryption when long sg follows short one · 8f461b1e

Eric Biggers authored Feb 19, 2018

With ecb-cast5-avx, if a 128+ byte scatterlist element followed a
shorter one, then the algorithm accidentally encrypted/decrypted only 8
bytes instead of the expected 128 bytes.  Fix it by setting the
encryption/decryption 'fn' correctly.

Fixes: c12ab20b ("crypto: cast5/avx - avoid using temporary stack buffers")
Cc: <stable@vger.kernel.org> # v3.8+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

8f461b1e

crypto: x86/twofish-avx - convert to skcipher interface · 0e6ab46d

Eric Biggers authored Feb 19, 2018

Convert the AVX implementation of Twofish from the (deprecated)
ablkcipher and blkcipher interfaces over to the skcipher interface.
Note that this includes replacing the use of ablk_helper with
crypto_simd.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

0e6ab46d

crypto: x86/twofish-avx - remove LRW algorithm · 876e9f0c

Eric Biggers authored Feb 19, 2018

The LRW template now wraps an ECB mode algorithm rather than the block
cipher directly. Therefore it is now redundant for crypto modules to
wrap their ECB code with generic LRW code themselves via lrw_crypt().

Remove the lrw-twofish-avx algorithm which did this. Users who request
lrw(twofish) and previously would have gotten lrw-twofish-avx will now
get lrw(ecb-twofish-avx) instead, which is just as fast.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

876e9f0c

crypto: x86/twofish-3way - convert to skcipher interface · 37992fa4

Eric Biggers authored Feb 19, 2018

Convert the 3-way implementation of Twofish from the (deprecated)
blkcipher interface over to the skcipher interface.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

37992fa4

crypto: x86/twofish-3way - remove XTS algorithm · ebeea983

Eric Biggers authored Feb 19, 2018

The XTS template now wraps an ECB mode algorithm rather than the block
cipher directly. Therefore it is now redundant for crypto modules to
wrap their ECB code with generic XTS code themselves via xts_crypt().

Remove the xts-twofish-3way algorithm which did this. Users who request
xts(twofish) and previously would have gotten xts-twofish-3way will now
get xts(ecb-twofish-3way) instead, which is just as fast.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

ebeea983

crypto: x86/twofish-3way - remove LRW algorithm · 68bfc492

Eric Biggers authored Feb 19, 2018

Remove the lrw-twofish-3way algorithm which did this. Users who request
lrw(twofish) and previously would have gotten lrw-twofish-3way will now
get lrw(ecb-twofish-3way) instead, which is just as fast.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

68bfc492

crypto: x86/serpent-avx,avx2 - convert to skcipher interface · e16bf974

Eric Biggers authored Feb 19, 2018

Convert the AVX and AVX2 implementations of Serpent from the
(deprecated) ablkcipher and blkcipher interfaces over to the skcipher
interface.  Note that this includes replacing the use of ablk_helper
with crypto_simd.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

e16bf974

crypto: x86/serpent-avx - remove LRW algorithm · 340b8303

Eric Biggers authored Feb 19, 2018

Remove the lrw-serpent-avx algorithm which did this. Users who request
lrw(serpent) and previously would have gotten lrw-serpent-avx will now
get lrw(ecb-serpent-avx) instead, which is just as fast.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

340b8303

crypto: x86/serpent-avx2 - remove LRW algorithm · e5f382e6

Eric Biggers authored Feb 19, 2018

Remove the lrw-serpent-avx2 algorithm which did this. Users who request
lrw(serpent) and previously would have gotten lrw-serpent-avx2 will now
get lrw(ecb-serpent-avx2) instead, which is just as fast.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

e5f382e6

crypto: x86/serpent-sse2 - convert to skcipher interface · e0f409dc

Eric Biggers authored Feb 19, 2018

Convert the SSE2 implementation of Serpent from the (deprecated)
ablkcipher and blkcipher interfaces over to the skcipher interface.
Note that this includes replacing the use of ablk_helper with
crypto_simd.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

e0f409dc

crypto: x86/serpent-sse2 - remove XTS algorithm · 8bab4e3c

Eric Biggers authored Feb 19, 2018

Remove the xts-serpent-sse2 algorithm which did this. Users who request
xts(serpent) and previously would have gotten xts-serpent-sse2 will now
get xts(ecb-serpent-sse2) instead, which is just as fast.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

8bab4e3c

crypto: x86/serpent-sse2 - remove LRW algorithm · 2a05cfc3

Eric Biggers authored Feb 19, 2018

Remove the lrw-serpent-sse2 algorithm which did this. Users who request
lrw(serpent) and previously would have gotten lrw-serpent-sse2 will now
get lrw(ecb-serpent-sse2) instead, which is just as fast.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

2a05cfc3

crypto: x86/glue_helper - add skcipher_walk functions · f15f2a25

Eric Biggers authored Feb 19, 2018

Add ECB, CBC, and CTR functions to glue_helper which use skcipher_walk
rather than blkcipher_walk.  This will allow converting the remaining
x86 algorithms from the blkcipher interface over to the skcipher
interface, after which we'll be able to remove the blkcipher_walk
versions of these functions.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

f15f2a25

crypto: simd - allow registering multiple algorithms at once · d14f0a1f

Eric Biggers authored Feb 19, 2018

Add a function to crypto_simd that registers an array of skcipher
algorithms, then allocates and registers the simd wrapper algorithms for
them.  It assumes the naming scheme where the names of the underlying
algorithms are prefixed with two underscores.

Also add the corresponding 'unregister' function.

Most of the x86 crypto modules will be able to use these.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

d14f0a1f

crypto: ccree - replace memset+kfree with kzfree · d800e343

Gilad Ben-Yossef authored Feb 19, 2018

Replace memset to 0 followed by kfree with kzfree for
simplicity.
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

d800e343

crypto: ccree - add support for older HW revs · 27b3b22d

Gilad Ben-Yossef authored Feb 19, 2018

Add support for the legacy CryptoCell 630 and 710 revs.
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

27b3b22d

dt-bindings: Add DT bindings for ccree 710 and 630p · 9d3a45ea

Gilad Ben-Yossef authored Feb 19, 2018

Add device tree bindings for Arm CryptoCell 710 and 630p hardware
revisions.
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

9d3a45ea

crypto: ccree - remove unused definitions · 61371392

Gilad Ben-Yossef authored Feb 19, 2018

Remove enum definition which are not used by the REE interface
driver.
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

61371392

crypto: marvell/cesa - Clean up redundant #include · c42bd633

Robin Murphy authored Feb 19, 2018

The inclusion of dma-direct.h was only needed temporarily to prevent
breakage from the DMA API rework, since the actual CESA fix making it
redundant was merged in parallel. Now that both have landed, it can go.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

c42bd633

hwrng: stm32 - rework read timeout calculation · 279f4f8f

lionel.debieve@st.com authored Feb 15, 2018

Increase timeout delay to support longer timing linked
to rng initialization. Measurement is based on timer instead
of instructions per iteration which is not powerful on all
targets.
Signed-off-by: Lionel Debieve <lionel.debieve@st.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

279f4f8f

dt-bindings: rng: add clock detection error for stm32 · a888df9b

lionel.debieve@st.com authored Feb 15, 2018

Add optional property to enable the clock detection error
on rng block. It is used to allow slow clock source which
give correct entropy for rng.
Signed-off-by: Lionel Debieve <lionel.debieve@st.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

a888df9b

hwrng: stm32 - allow disable clock error detection · 529571ed

lionel.debieve@st.com authored Feb 15, 2018

Add a new property that allow to disable the clock error
detection which is required when the clock source selected
is out of specification (which is not mandatory).
Signed-off-by: Lionel Debieve <lionel.debieve@st.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

529571ed

dt-bindings: rng: add reset node for stm32 · 695788fd

lionel.debieve@st.com authored Feb 15, 2018

Adding optional resets property for rng.
Signed-off-by: Lionel Debieve <lionel.debieve@st.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

695788fd

hwrng: stm32 - add reset during probe · 326ed382

lionel.debieve@st.com authored Feb 15, 2018

Avoid issue when probing the RNG without
reset if bad status has been detected previously
Signed-off-by: Lionel Debieve <lionel.debieve@st.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

326ed382

22 Feb, 2018 16 commits

crypto: ccree - fix memdup.cocci warnings · 01745706

Fengguang Wu authored Feb 16, 2018

drivers/crypto/ccree/cc_cipher.c:629:15-22: WARNING opportunity for kmemdep

 Use kmemdup rather than duplicating its implementation

Generated by: scripts/coccinelle/api/memdup.cocci

Fixes: 63ee04c8 ("crypto: ccree - add skcipher support")
CC: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

01745706

crypto: atmel - Delete error messages for a failed memory allocation in six functions · 02684839

Markus Elfring authored Feb 15, 2018

Omit extra messages for a memory allocation failure in these functions.

This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Reviewed-by: Tudor Ambarus <tudor.ambarus@microchip.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

02684839

crypto: bcm - Delete an error message for a failed memory allocation in do_shash() · 72e8d3f8

Markus Elfring authored Feb 14, 2018

Omit an extra message for a memory allocation failure in this function.

This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

72e8d3f8

crypto: bfin_crc - Delete an error message for a failed memory allocation in... · 5471f2e2

Markus Elfring authored Feb 14, 2018

crypto: bfin_crc - Delete an error message for a failed memory allocation in bfin_crypto_crc_probe()

Omit an extra message for a memory allocation failure in this function.

This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

5471f2e2

crypto: speck - add test vectors for Speck64-XTS · 41b3316e

Eric Biggers authored Feb 14, 2018

Add test vectors for Speck64-XTS, generated in userspace using C code.
The inputs were borrowed from the AES-XTS test vectors, with key lengths
adjusted.

xts-speck64-neon passes these tests.  However, they aren't currently
applicable for the generic XTS template, as that only supports a 128-bit
block size.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

41b3316e

crypto: speck - add test vectors for Speck128-XTS · c3bb521b

Eric Biggers authored Feb 14, 2018

Add test vectors for Speck128-XTS, generated in userspace using C code.
The inputs were borrowed from the AES-XTS test vectors.

Both xts(speck128-generic) and xts-speck128-neon pass these tests.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

c3bb521b

crypto: arm/speck - add NEON-accelerated implementation of Speck-XTS · ede96221

Eric Biggers authored Feb 14, 2018

Add an ARM NEON-accelerated implementation of Speck-XTS. It operates on
128-byte chunks at a time, i.e. 8 blocks for Speck128 or 16 blocks for
Speck64. Each 128-byte chunk goes through XTS preprocessing, then is
encrypted/decrypted (doing one cipher round for all the blocks, then the
next round, etc.), then goes through XTS postprocessing.

The performance depends on the processor but can be about 3 times faster
than the generic code. For example, on an ARMv7 processor we observe
the following performance with Speck128/256-XTS:

xts-speck128-neon: Encryption 107.9 MB/s, Decryption 108.1 MB/s
xts(speck128-generic): Encryption 32.1 MB/s, Decryption 36.6 MB/s

In comparison to AES-256-XTS without the Cryptography Extensions:

xts-aes-neonbs: Encryption 41.2 MB/s, Decryption 36.7 MB/s
xts(aes-asm): Encryption 31.7 MB/s, Decryption 30.8 MB/s
xts(aes-generic): Encryption 21.2 MB/s, Decryption 20.9 MB/s

Speck64/128-XTS is even faster:

xts-speck64-neon: Encryption 138.6 MB/s, Decryption 139.1 MB/s

Note that as with the generic code, only the Speck128 and Speck64
variants are supported. Also, for now only the XTS mode of operation is
supported, to target the disk and file encryption use cases. The NEON
code also only handles the portion of the data that is evenly divisible
into 128-byte chunks, with any remainder handled by a C fallback. Of
course, other modes of operation could be added later if needed, and/or
the NEON code could be updated to handle other buffer sizes.

The XTS specification is only defined for AES which has a 128-bit block
size, so for the GF(2^64) math needed for Speck64-XTS we use the
reducing polynomial 'x^64 + x^4 + x^3 + x + 1' given by the original XEX
paper. Of course, when possible users should use Speck128-XTS, but even
that may be too slow on some processors; Speck64-XTS can be faster.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

ede96221

crypto: speck - export common helpers · c8c36413

Eric Biggers authored Feb 14, 2018

Export the Speck constants and transform context and the ->setkey(),
->encrypt(), and ->decrypt() functions so that they can be reused by the
ARM NEON implementation of Speck-XTS.  The generic key expansion code
will be reused because it is not performance-critical and is not
vectorizable, while the generic encryption and decryption functions are
needed as fallbacks and for the XTS tweak encryption.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

c8c36413

crypto: speck - add support for the Speck block cipher · da7a0ab5

Eric Biggers authored Feb 14, 2018

Add a generic implementation of Speck, including the Speck128 and
Speck64 variants. Speck is a lightweight block cipher that can be much
faster than AES on processors that don't have AES instructions.

We are planning to offer Speck-XTS (probably Speck128/256-XTS) as an
option for dm-crypt and fscrypt on Android, for low-end mobile devices
with older CPUs such as ARMv7 which don't have the Cryptography
Extensions. Currently, such devices are unencrypted because AES is not
fast enough, even when the NEON bit-sliced implementation of AES is
used. Other AES alternatives such as Twofish, Threefish, Camellia,
CAST6, and Serpent aren't fast enough either; it seems that only a
modern ARX cipher can provide sufficient performance on these devices.

This is a replacement for our original proposal
(https://patchwork.kernel.org/patch/10101451/) which was to offer
ChaCha20 for these devices. However, the use of a stream cipher for
disk/file encryption with no space to store nonces would have been much
more insecure than we thought initially, given that it would be used on
top of flash storage as well as potentially on top of F2FS, neither of
which is guaranteed to overwrite data in-place.

Speck has been somewhat controversial due to its origin. Nevertheless,
it has a straightforward design (it's an ARX cipher), and it appears to
be the leading software-optimized lightweight block cipher currently,
with the most cryptanalysis. It's also easy to implement without side
channels, unlike AES. Moreover, we only intend Speck to be used when
the status quo is no encryption, due to AES not being fast enough.

We've also considered a novel length-preserving encryption mode based on
ChaCha20 and Poly1305. While theoretically attractive, such a mode
would be a brand new crypto construction and would be more complicated
and difficult to implement efficiently in comparison to Speck-XTS.

There is confusion about the byte and word orders of Speck, since the
original paper doesn't specify them. But we have implemented it using
the orders the authors recommended in a correspondence with them. The
test vectors are taken from the original paper but were mapped to byte
arrays using the recommended byte and word orders.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

da7a0ab5

crypto: aesni - Update aesni-intel_glue to use scatter/gather · e8455207

Dave Watson authored Feb 14, 2018

Add gcmaes_crypt_by_sg routine, that will do scatter/gather
by sg. Either src or dst may contain multiple buffers, so
iterate over both at the same time if they are different.
If the input is the same as the output, iterate only over one.

Currently both the AAD and TAG must be linear, so copy them out
with scatterlist_map_and_copy.  If first buffer contains the
entire AAD, we can optimize and not copy.   Since the AAD
can be any size, if copied it must be on the heap.  TAG can
be on the stack since it is always < 16 bytes.

Only the SSE routines are updated so far, so leave the previous
gcmaes_en/decrypt routines, and branch to the sg ones if the
keysize is inappropriate for avx, or we are SSE only.
Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

e8455207

crypto: aesni - Introduce scatter/gather asm function stubs · fb8986e6

Dave Watson authored Feb 14, 2018

The asm macros are all set up now, introduce entry points.

GCM_INIT and GCM_COMPLETE have arguments supplied, so that
the new scatter/gather entry points don't have to take all the
arguments, and only the ones they need.
Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

fb8986e6

crypto: aesni - Add fast path for > 16 byte update · 933d6aef

Dave Watson authored Feb 14, 2018

We can fast-path any < 16 byte read if the full message is > 16 bytes,
and shift over by the appropriate amount.  Usually we are
reading > 16 bytes, so this should be faster than the READ_PARTIAL
macro introduced in b20209c9 for the average case.
Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

933d6aef

crypto: aesni - Introduce partial block macro · ae952c5e

Dave Watson authored Feb 14, 2018

Before this diff, multiple calls to GCM_ENC_DEC will
succeed, but only if all calls are a multiple of 16 bytes.

Handle partial blocks at the start of GCM_ENC_DEC, and update
aadhash as appropriate.

The data offset %r11 is also updated after the partial block.
Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

ae952c5e

crypto: aesni - Move HashKey computation from stack to gcm_context · 1476db2d

Dave Watson authored Feb 14, 2018

HashKey computation only needs to happen once per scatter/gather operation,
save it between calls in gcm_context struct instead of on the stack.
Since the asm no longer stores anything on the stack, we can use
%rsp directly, and clean up the frame save/restore macros a bit.

Hashkeys actually only need to be calculated once per key and could
be moved to when set_key is called, however, the current glue code
falls back to generic aes code if fpu is disabled.
Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

1476db2d

crypto: aesni - Move ghash_mul to GCM_COMPLETE · e2e34b08

Dave Watson authored Feb 14, 2018

Prepare to handle partial blocks between scatter/gather calls.
For the last partial block, we only want to calculate the aadhash
in GCM_COMPLETE, and a new partial block macro will handle both
aadhash update and encrypting partial blocks between calls.
Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

e2e34b08

crypto: aesni - Fill in new context data structures · 9660474b

Dave Watson authored Feb 14, 2018

Fill in aadhash, aadlen, pblocklen, curcount with appropriate values.
pblocklen, aadhash, and pblockenckey are also updated at the end
of each scatter/gather operation, to be carried over to the next
operation.
Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

9660474b