- 12 Jan, 2018 17 commits
-
-
Eric Biggers authored
Convert salsa20-generic from the deprecated "blkcipher" API to the "skcipher" API, in the process fixing it up to be thread-safe (as the crypto API expects) by maintaining each request's state separately from the transform context. Also remove the unnecessary cra_alignmask and tighten validation of the key size by accepting only 16 or 32 bytes, not anything in between. These changes bring the code close to the way chacha20-generic does things, so hopefully it will be easier to maintain in the future. However, the way Salsa20 interprets the IV is still slightly different; that was not changed. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
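To illustrate the per-request pattern, here is a minimal sketch of what a thread-safe skcipher ->encrypt() handler looks like, with the cipher state kept on the stack instead of in the transform context. The helpers salsa20_init_state() and salsa20_docrypt() and the salsa20_ctx layout are placeholders, not the kernel's exact code:

    #include <crypto/internal/skcipher.h>

    #define SALSA20_BLOCK_SIZE 64

    struct salsa20_ctx {
        u32 key[8];                     /* expanded by ->setkey() */
    };

    /* Hypothetical helpers standing in for the real Salsa20 core. */
    static void salsa20_init_state(u32 state[16], const u32 key[8],
                                   const u8 iv[8]);
    static void salsa20_docrypt(u32 state[16], u8 *dst, const u8 *src,
                                unsigned int bytes);

    static int salsa20_crypt(struct skcipher_request *req)
    {
        struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
        const struct salsa20_ctx *ctx = crypto_skcipher_ctx(tfm);
        struct skcipher_walk walk;
        u32 state[16];                  /* per-request: safe under concurrency */
        int err;

        err = skcipher_walk_virt(&walk, req, false);

        salsa20_init_state(state, ctx->key, walk.iv);

        while (walk.nbytes > 0) {
            unsigned int nbytes = walk.nbytes;

            if (nbytes < walk.total)
                nbytes = round_down(nbytes, SALSA20_BLOCK_SIZE);

            salsa20_docrypt(state, walk.dst.virt.addr,
                            walk.src.virt.addr, nbytes);
            err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
        }
        return err;
    }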
-
Arnd Bergmann authored
While testing other changes, I discovered that gcc-7.2.1 produces badly optimized code for aes_encrypt/aes_decrypt. This is especially true when CONFIG_UBSAN_SANITIZE_ALL is enabled, where it leads to extremely large stack usage that in turn might cause kernel stack overflows:

    crypto/aes_generic.c: In function 'aes_encrypt':
    crypto/aes_generic.c:1371:1: warning: the frame size of 4880 bytes is larger than 2048 bytes [-Wframe-larger-than=]
    crypto/aes_generic.c: In function 'aes_decrypt':
    crypto/aes_generic.c:1441:1: warning: the frame size of 4864 bytes is larger than 2048 bytes [-Wframe-larger-than=]

I verified that this problem exists on all architectures that are supported by gcc-7.2, though arm64 in particular is less affected than the others. I also found that gcc-7.1 and gcc-8 do not show the extreme stack usage but still produce worse code than earlier versions for this file, apparently because of optimization passes that generally provide a substantial improvement in object code quality but understandably fail to find any shortcuts in the AES algorithm. Possible workarounds include:

a) Disabling the -ftree-pre and -ftree-sra optimizations. This was an earlier patch I tried; it reliably fixed the stack usage but, as later testing found, caused a serious performance regression in some versions.

b) Disabling UBSAN on this file or on all ciphers, as suggested by Ard Biesheuvel. This would give massively better crypto performance in UBSAN-enabled kernels and avoid the stack usage, but there is a concern over whether we should exclude arbitrary files from UBSAN at all.

c) Forcing the optimization level in a different way. Similar to a), but rather than deselecting specific optimization stages, this uses "gcc -Os" for this file, regardless of the CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE/SIZE option. This is a reliable workaround for the stack consumption on all architectures, and I've retested the performance results now on x86, in cycles/byte (lower is better) for cbc(aes-generic) with 256-bit keys:

                -O2     -Os
    gcc-6.3.1   14.9    15.1
    gcc-7.0.1   14.7    15.3
    gcc-7.1.1   15.3    14.7
    gcc-7.2.1   16.8    15.9
    gcc-8.0.0   15.5    15.6

This patch implements option c) by forcing -Os for all compiler versions starting with gcc-7.1. As a workaround for PR83356 it would only be needed for gcc-7.2+ with UBSAN enabled, but since -Os also gives better performance on gcc-7.1 without UBSAN, it seems appropriate to use the faster version here as well.

Side note: during testing, I also played with the AES code in libressl, which had a similar performance regression from gcc-6 to gcc-7.2, but was three times slower overall. It might be interesting to investigate that further and possibly port the Linux implementation into it.

Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83356
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83651
Cc: Richard Biener <rguenther@suse.de>
Cc: Jakub Jelinek <jakub@gcc.gnu.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Eric Biggers authored
Similar to what was done for the hash API, update the AEAD API to track whether each transform has been keyed, and reject encryption/decryption if a key is needed but one hasn't been set. This isn't quite as important as the equivalent fix for the hash API because AEADs always require a key, so are unlikely to be used without one. Still, tracking the key will prevent accidental unkeyed use. algif_aead also had to track the key anyway, so the new flag replaces that and slightly simplifies the algif_aead implementation. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
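The use-time check boils down to something like this simplified sketch (not the exact patch): operations are refused while CRYPTO_TFM_NEED_KEY is set.

    #include <crypto/aead.h>

    static int crypto_aead_encrypt_checked(struct aead_request *req)
    {
        struct crypto_aead *tfm = crypto_aead_reqtfm(req);

        if (crypto_aead_get_flags(tfm) & CRYPTO_TFM_NEED_KEY)
            return -ENOKEY;     /* keyed transform used without a key */

        return crypto_aead_alg(tfm)->encrypt(req);
    }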
-
Eric Biggers authored
Similar to what was done for the hash API, update the skcipher API to track whether each transform has been keyed, and reject encryption/decryption if a key is needed but one hasn't been set. This isn't as important as the equivalent fix for the hash API because symmetric ciphers almost always require a key (the "null cipher" is the only exception), so are unlikely to be used without one. Still, tracking the key will prevent accidental unkeyed use. algif_skcipher also had to track the key anyway, so the new flag replaces that and simplifies the algif_skcipher implementation. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Eric Biggers authored
Now that the crypto API prevents a keyed hash from being used without setting the key, there's no need for GHASH to do this check itself. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Eric Biggers authored
Currently, almost none of the keyed hash algorithms check whether a key has been set before proceeding. Some algorithms are okay with this and will effectively just use a key of all 0's or some other bogus default. However, others will severely break, as demonstrated using "hmac(sha3-512-generic)", the unkeyed use of which causes a kernel crash via a (potentially exploitable) stack buffer overflow.

A while ago, this problem was solved for AF_ALG by pairing each hash transform with a 'has_key' bool. However, there are still other places in the kernel where userspace can specify an arbitrary hash algorithm by name, and the kernel uses it as an unkeyed hash without checking whether it is really unkeyed. Examples of this include:

- KEYCTL_DH_COMPUTE, via the KDF extension
- dm-verity
- dm-crypt, via the ESSIV support
- dm-integrity, via the "internal hash" mode with no key given
- drbd (Distributed Replicated Block Device)

This bug is especially bad for KEYCTL_DH_COMPUTE, as that requires no privileges to call.

Fix the bug for all users by adding a flag CRYPTO_TFM_NEED_KEY to the ->crt_flags of each hash transform that indicates whether the transform still needs to be keyed or not. Then, make the hash init, import, and digest functions return -ENOKEY if the key is still needed. The new flag also replaces the 'has_key' bool which algif_hash was previously using, thereby simplifying the algif_hash implementation.

Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: stable@vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
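On the setkey side, the flag handling looks roughly like this sketch (simplified from the patch): a successful ->setkey() clears CRYPTO_TFM_NEED_KEY and a failed one re-arms it, so init/import/digest only have to test the flag and return -ENOKEY.

    #include <crypto/hash.h>

    static int shash_setkey_sketch(struct crypto_shash *tfm, const u8 *key,
                                   unsigned int keylen)
    {
        int err = crypto_shash_alg(tfm)->setkey(tfm, key, keylen);

        if (unlikely(err))
            crypto_shash_set_flags(tfm, CRYPTO_TFM_NEED_KEY);
        else
            crypto_shash_clear_flags(tfm, CRYPTO_TFM_NEED_KEY);
        return err;
    }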
-
Eric Biggers authored
We need to consistently enforce that keyed hashes cannot be used without setting the key. To do this we need a reliable way to determine whether a given hash algorithm is keyed or not. AF_ALG currently does this by checking for the presence of a ->setkey() method. However, this is actually slightly broken because the CRC-32 algorithms implement ->setkey() but can also be used without a key. (The CRC-32 "key" is not actually a cryptographic key but rather represents the initial state. If not overridden, then a default initial state is used.) Prepare to fix this by introducing a flag CRYPTO_ALG_OPTIONAL_KEY which indicates that the algorithm has a ->setkey() method, but it is not required to be called. Then set it on all the CRC-32 algorithms. The same also applies to the Adler-32 implementation in Lustre. Also, the cryptd and mcryptd templates have to pass through the flag from their underlying algorithm. Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
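With the new flag, the "does this transform still need a key?" test becomes a two-part check, roughly (a sketch mirroring the description above):

    #include <crypto/internal/hash.h>

    /* Needs a key iff it has ->setkey() and the key is not optional. */
    static bool shash_needs_key(struct shash_alg *alg)
    {
        return crypto_shash_alg_has_setkey(alg) &&
               !(alg->base.cra_flags & CRYPTO_ALG_OPTIONAL_KEY);
    }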
-
Eric Biggers authored
Since Poly1305 requires a nonce per invocation, the Linux kernel implementations of Poly1305 don't use the crypto API's keying mechanism and instead expect the key and nonce as the first 32 bytes of the data. But ->setkey() is still defined as a stub returning an error code. This prevents Poly1305 from being used through AF_ALG and will also break it completely once we start enforcing that all crypto API users (not just AF_ALG) call ->setkey() if present. Fix it by removing crypto_poly1305_setkey(), leaving ->setkey as NULL. Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
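For reference, a caller therefore feeds the 32-byte one-time key (r || s) as the first bytes of the message stream. A minimal sketch using the shash API (descriptor setup per the 2018-era API):

    #include <crypto/hash.h>

    static int poly1305_mac(const u8 key[32], const u8 *msg,
                            unsigned int len, u8 digest[16])
    {
        struct crypto_shash *tfm = crypto_alloc_shash("poly1305", 0, 0);
        int err;

        if (IS_ERR(tfm))
            return PTR_ERR(tfm);

        {
            SHASH_DESC_ON_STACK(desc, tfm);

            desc->tfm = tfm;
            desc->flags = 0;
            err = crypto_shash_init(desc) ?:
                  crypto_shash_update(desc, key, 32) ?:  /* key + nonce first */
                  crypto_shash_update(desc, msg, len) ?:
                  crypto_shash_final(desc, digest);
        }
        crypto_free_shash(tfm);
        return err;
    }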
-
Eric Biggers authored
When the mcryptd template is used to wrap an unkeyed hash algorithm, don't install a ->setkey() method to the mcryptd instance. This change is necessary for mcryptd to keep working with unkeyed hash algorithms once we start enforcing that ->setkey() is called when present. Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Eric Biggers authored
When the cryptd template is used to wrap an unkeyed hash algorithm, don't install a ->setkey() method to the cryptd instance. This change is necessary for cryptd to keep working with unkeyed hash algorithms once we start enforcing that ->setkey() is called when present. Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Eric Biggers authored
Templates that use an shash spawn can use crypto_shash_alg_has_setkey() to determine whether the underlying algorithm requires a key or not. But there was no corresponding function for ahash spawns. Add it. Note that the new function actually has to support both shash and ahash algorithms, since the ahash API can be used with either. Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
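A sketch of the new helper's shape, handling both backing types as noted above (simplified):

    #include <crypto/internal/hash.h>

    bool hash_alg_has_setkey_sketch(struct hash_alg_common *halg)
    {
        struct crypto_alg *alg = &halg->base;

        /* An ahash transform may be backed by an shash algorithm. */
        if (alg->cra_type != &crypto_ahash_type)
            return crypto_shash_alg_has_setkey(__crypto_shash_alg(alg));

        return __crypto_ahash_alg(alg)->setkey != NULL;
    }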
-
Colin Ian King authored
There seems to be a cut-n-paste bug with the name of the buffer being free'd, xoutbuf should be used instead of axbuf. Detected by CoverityScan, CID#1463420 ("Copy-paste error") Fixes: 427988d9 ("crypto: tcrypt - add multibuf aead speed test") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Colin Ian King authored
Trivial fix to spelling mistakes in pr_err error message text. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Stephan Mueller authored
The user space interface allows specifying the type and mask field used to allocate the cipher. Only a subset of the possible flags are intended for user space, so white-list the allowed flags. If the user space caller passes any non-allowed flag, EINVAL is returned. Reported-by: syzbot <syzkaller@googlegroups.com> Cc: <stable@vger.kernel.org> Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
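The shape of such a white-list check is sketched below; the set of allowed flags shown here is illustrative, not the patch's authoritative list:

    #include <linux/crypto.h>

    /* Example white-list; the real patch defines the authoritative set. */
    #define AF_ALG_ALLOWED_FLAGS \
            (CRYPTO_ALG_KERN_DRIVER_ONLY | CRYPTO_ALG_ASYNC)

    static int af_alg_check_flags(u32 type, u32 mask)
    {
        if ((type | mask) & ~AF_ALG_ALLOWED_FLAGS)
            return -EINVAL;
        return 0;
    }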
-
Joey Pabalinas authored
When char is signed, storing the values 0xba (186) and 0xad (173) in the `guard` array produces signed overflow. Change the type of `guard` to static unsigned char to correct undefined behavior and reduce function stack usage. Signed-off-by: Joey Pabalinas <joeypabalinas@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
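A standalone illustration of the problem (ordinary user-space C, not the testmgr code itself):

    #include <stdio.h>

    int main(void)
    {
        char s = 0xba;          /* where char is signed, this becomes -70 */
        unsigned char u = 0xba; /* always 186, as intended */

        printf("%d %u\n", s, u);
        return 0;
    }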
-
Eric Biggers authored
For chacha20_block(), use the existing 32-bit left-rotate function instead of defining one ourselves. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
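The quarter-round then reads as below, using the kernel's rol32() from <linux/bitops.h> (a sketch of the operation; the in-tree function works on the full 16-word state):

    #include <linux/bitops.h>

    static void chacha20_quarterround(u32 *a, u32 *b, u32 *c, u32 *d)
    {
        *a += *b; *d = rol32(*d ^ *a, 16);
        *c += *d; *b = rol32(*b ^ *c, 12);
        *a += *b; *d = rol32(*d ^ *a, 8);
        *c += *d; *b = rol32(*b ^ *c, 7);
    }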
-
Himanshu Jha authored
Use dma_zalloc_coherent for allocating zeroed memory and remove the now-unnecessary memset call. Done using Coccinelle. Generated-by: scripts/coccinelle/api/alloc/kzalloc-simple.cocci 0-day tested with no failures. Signed-off-by: Himanshu Jha <himanshujha199640@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
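The conversion has this before/after shape (device and buffer are illustrative):

    #include <linux/dma-mapping.h>

    static void *alloc_ring(struct device *dev, size_t size, dma_addr_t *dma)
    {
        /* Before: ptr = dma_alloc_coherent(...); if (ptr) memset(ptr, 0, size); */
        /* After: a single call that returns already-zeroed memory. */
        return dma_zalloc_coherent(dev, size, dma, GFP_KERNEL);
    }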
-
- 05 Jan, 2018 14 commits
-
-
Eric Biggers authored
crypto_poly1305_final() no longer requires a cra_alignmask, and nothing else in the x86 poly1305-simd implementation does either. So remove the cra_alignmask so that the crypto API does not have to unnecessarily align the buffers. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Eric Biggers authored
Now that nothing in poly1305-generic assumes any special alignment, remove the cra_alignmask so that the crypto API does not have to unnecessarily align the buffers. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Eric Biggers authored
Currently the only part of poly1305-generic which is assuming special alignment is the part where the final digest is written. Switch this over to the unaligned access macros so that we'll be able to remove the cra_alignmask. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
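Reduced to its essentials, the digest store becomes the following (a sketch: the real code first folds Poly1305's 26-bit limbs into four 32-bit words):

    #include <asm/unaligned.h>

    /* Write the 16-byte tag without assuming any alignment of dst. */
    static void poly1305_write_tag(u8 *dst, const u32 h[4])
    {
        int i;

        for (i = 0; i < 4; i++)
            put_unaligned_le32(h[i], dst + 4 * i);
    }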
-
Eric Biggers authored
There is a message posted to the crypto notifier chain when an algorithm is unregistered, and when a template is registered or unregistered. But nothing is listening for those messages; currently there are only listeners for the algorithm request and registration messages. Get rid of these unused notifications for now. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Eric Biggers authored
Reference counters should use refcount_t rather than atomic_t, since the refcount_t implementation can prevent overflows, reducing the exploitability of reference leak bugs. crypto_alg.cra_refcnt is a reference counter with the usual semantics, so switch it over to refcount_t. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
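The conversion is mechanical; its shape, on an illustrative struct rather than the real crypto_alg:

    #include <linux/refcount.h>

    struct alg_example {
        refcount_t refcnt;              /* was: atomic_t */
    };

    static void alg_get(struct alg_example *alg)
    {
        /* Saturates instead of wrapping on overflow, unlike atomic_inc(). */
        refcount_inc(&alg->refcnt);
    }

    static bool alg_put(struct alg_example *alg)
    {
        /* True when the last reference is dropped. */
        return refcount_dec_and_test(&alg->refcnt);
    }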
-
Antoine Ténart authored
This patch fixes the hash support in the SafeXcel driver when the update size is a multiple of a block size, and when a final call is made just after with a size of 0. In such cases the driver should cache the last block from the update to avoid handling 0 length data on the final call (that's a hardware limitation). Cc: stable@vger.kernel.org Fixes: 1b44c5a6 ("crypto: inside-secure - add SafeXcel EIP197 crypto engine driver") Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
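The caching rule can be summarized by a hypothetical helper: never hand the engine everything, so that ->final() always has between one byte and one full block left to process.

    /* How many of (cached + len) bytes may be queued to the engine now.
     * When the total is block-aligned, hold a whole block back for the
     * final() call. (Illustrative helper, not the driver's code.) */
    static unsigned int queued_len(unsigned int cached, unsigned int len,
                                   unsigned int blocksize)
    {
        unsigned int total = cached + len;
        unsigned int keep = total % blocksize;

        if (keep == 0)
            keep = blocksize;

        return total - keep;
    }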
-
Antoine Ténart authored
This patch adds a field to the SafeXcel ahash request structure to keep track of the number of SG entries mapped. This makes it possible to skip dma_unmap_sg() when dma_map_sg() was never called in the first place. It also removes a warning seen when DMA-API debugging is enabled in the kernel configuration: "DMA-API: device driver tries to free DMA memory it has not allocated". Cc: stable@vger.kernel.org Fixes: 1b44c5a6 ("crypto: inside-secure - add SafeXcel EIP197 crypto engine driver") Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
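Sketched with illustrative names, the pattern records what was mapped and makes the unmap path a no-op otherwise:

    #include <linux/dma-mapping.h>
    #include <linux/scatterlist.h>

    struct req_sketch {
        int nents;      /* entries passed to dma_map_sg(); 0 = not mapped */
    };

    static int map_src(struct device *dev, struct scatterlist *sg,
                       int nents, struct req_sketch *req)
    {
        if (dma_map_sg(dev, sg, nents, DMA_TO_DEVICE) <= 0)
            return -EINVAL;
        req->nents = nents;
        return 0;
    }

    static void unmap_src(struct device *dev, struct scatterlist *sg,
                          struct req_sketch *req)
    {
        if (!req->nents)
            return;         /* never mapped: avoid the DMA-API warning */
        dma_unmap_sg(dev, sg, req->nents, DMA_TO_DEVICE);
        req->nents = 0;
    }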
-
Christian Lamparter authored
The ccm-aes-ppc4xx now fails one of testmgr's expected-failure test cases as such:

    |decryption failed on test 10 for ccm-aes-ppc4xx:
    |ret was 0, expected -EBADMSG

It doesn't look like the hardware sets the authentication failure flag. The original vendor source from which this was ported does not have any special code or notes about why this would happen or whether there are any workarounds. Hence, this patch converts the aead_done callback handler to perform the ICV check in the driver. This fixes the false negative, and ccm-aes-ppc4xx passes the selftests once again:

    |name         : ccm(aes)
    |driver       : ccm-aes-ppc4xx
    |module       : crypto4xx
    |priority     : 300
    |refcnt       : 1
    |selftest     : passed
    |internal     : no
    |type         : aead
    |async        : yes
    |blocksize    : 1
    |ivsize       : 16
    |maxauthsize  : 16
    |geniv        : <none>

Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
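A sketch of such a software ICV check in the completion callback (names illustrative): copy the tag that trails the ciphertext and compare it in constant time.

    #include <crypto/aead.h>
    #include <crypto/algapi.h>
    #include <crypto/scatterwalk.h>

    static int check_icv(struct aead_request *req, const u8 *computed_icv)
    {
        struct crypto_aead *tfm = crypto_aead_reqtfm(req);
        unsigned int authsize = crypto_aead_authsize(tfm);
        u8 icv[16];

        /* On decryption the tag trails the ciphertext in req->src. */
        scatterwalk_map_and_copy(icv, req->src,
                                 req->assoclen + req->cryptlen - authsize,
                                 authsize, 0);

        return crypto_memneq(icv, computed_icv, authsize) ? -EBADMSG : 0;
    }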
-
Christian Lamparter authored
KBUILD_MODNAME provides the same value. Signed-off-by: Christian Lamparter <chunkeey@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Christian Lamparter authored
crypto4xx_device's name variable is not set to anything. The common devname for request_irq seems to be the module name. This will fix the seemingly anonymous interrupt entry in /proc/interrupts for crypto4xx. Signed-off-by: Christian Lamparter <chunkeey@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Christian Lamparter authored
This patch adds support for the crypto4xx RevB cores found in the 460EX, 460SX and later chips (like the APM821xx). Without this patch, the crypto4xx driver will not be able to process any offloaded requests on these cores and will simply hang indefinitely. Signed-off-by: Christian Lamparter <chunkeey@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Christian Lamparter authored
It is possible to avoid the ce_base null pointer check in the driver's interrupt handler routine crypto4xx_ce_interrupt_handler() by simply doing the iomap before the IRQ registration. This way, ce_base will always be valid in the handler, and a branch in a critical path can be avoided. Signed-off-by: Christian Lamparter <chunkeey@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
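The resulting probe ordering, sketched (error handling simplified; note KBUILD_MODNAME as the devname, per the earlier patch):

    #include <linux/interrupt.h>
    #include <linux/of_address.h>
    #include <linux/platform_device.h>

    irqreturn_t crypto4xx_ce_interrupt_handler(int irq, void *data);

    static int setup_irq_sketch(struct platform_device *ofdev, int irq,
                                void __iomem **ce_base, void *dev_id)
    {
        /* Map first: the handler may then use *ce_base unconditionally. */
        *ce_base = of_iomap(ofdev->dev.of_node, 0);
        if (!*ce_base)
            return -ENOMEM;

        return request_irq(irq, crypto4xx_ce_interrupt_handler, 0,
                           KBUILD_MODNAME, dev_id);
    }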
-
Łukasz Stelmach authored
Add support for the True Random Number Generator found in Samsung Exynos 5250+ SoCs. Signed-off-by: Łukasz Stelmach <l.stelmach@samsung.com> Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org> Acked-by: Philippe Ombredanne <pombredanne@nexb.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Cheah Kok Cheong authored
Add SPDX license identifier according to the type of license text found in the file. Cc: Philippe Ombredanne <pombredanne@nexb.com> Signed-off-by: Cheah Kok Cheong <thrust73@gmail.com> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
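Such an identifier is a single comment at the top of each file; for a GPL-2.0 file (the actual identifier depends on the license text found) it looks like:

    // SPDX-License-Identifier: GPL-2.0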
-
- 28 Dec, 2017 9 commits
-
-
Junaid Shahid authored
The aesni_gcm_enc/dec functions can access memory after the end of the AAD buffer if the AAD length is not a multiple of 4 bytes. It didn't matter with rfc4106-gcm-aesni as in that case the AAD was always followed by the 8 byte IV, but that is no longer the case with generic-gcm-aesni. This can potentially result in accessing a page that is not mapped and thus causing the machine to crash. This patch fixes that by reading the last <16 byte block of the AAD byte-by-byte and optionally via an 8-byte load if the block was at least 8 bytes. Fixes: 0487ccac ("crypto: aesni - make non-AVX AES-GCM work with any aadlen") Cc: <stable@vger.kernel.org> Signed-off-by: Junaid Shahid <junaids@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
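In C terms, the access pattern described here looks like the following (the fix itself is in the x86 assembly; this only illustrates the strategy, which the entry below applies to the data buffer as well):

    #include <linux/string.h>

    /* Load a partial (< 16 byte) block without touching bytes outside
     * [src, src + len): one 8-byte copy when possible, then byte loads. */
    static void read_partial_block(const u8 *src, unsigned int len,
                                   u8 block[16])
    {
        unsigned int i = 0;

        memset(block, 0, 16);

        if (len >= 8) {
            memcpy(block, src, 8);
            i = 8;
        }
        for (; i < len; i++)
            block[i] = src[i];
    }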
-
Junaid Shahid authored
The aesni_gcm_enc/dec functions can access memory before the start of the data buffer if the length of the data buffer is less than 16 bytes. This is because they perform the read via a single 16-byte load. This can potentially result in accessing a page that is not mapped and thus causing the machine to crash. This patch fixes that by reading the partial block byte-by-byte and optionally via an 8-byte load if the block was at least 8 bytes. Fixes: 0487ccac ("crypto: aesni - make non-AVX AES-GCM work with any aadlen") Cc: <stable@vger.kernel.org> Signed-off-by: Junaid Shahid <junaids@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Corentin Labbe authored
All hardware crypto devices have their CONFIG names using the following convention: CRYPTO_DEV_name_algo. This patch applies this convention to the STM32 CONFIG names. Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Reviewed-by: Fabien Dessenne <fabien.dessenne@st.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Zhou Wang authored
There are no init and exit callbacks, so delete the comments referring to them. Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Horia Geantă authored
Offload split key generation in the CAAM engine, using DKP. DKP is supported starting with Era 6.

Note that the way assoclen is transmitted from the job descriptor to the shared descriptor changes: the DPOVRD register is used instead of MATH3 (where available), since the DKP protocol thrashes the MATH registers.

The replacement of MDHA split key generation with DKP has the side effect of the crypto engine writing the authentication key, and thus the DMA mapping direction for the buffer holding the key has to change from DMA_TO_DEVICE to DMA_BIDIRECTIONAL. There are two cases:

- key is inlined in descriptor: the descriptor buffer mapping changes
- key is referenced: the key buffer mapping changes

Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
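The mapping-direction change has this shape (illustrative helper, not the driver's code):

    #include <linux/dma-mapping.h>

    static dma_addr_t map_split_key(struct device *jrdev, void *key,
                                    size_t keylen)
    {
        /* DKP writes the generated split key back into this buffer, so
         * DMA_TO_DEVICE is no longer sufficient. */
        return dma_map_single(jrdev, key, keylen, DMA_BIDIRECTIONAL);
    }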
-
Horia Geantă authored
Save Era in driver's private data for further usage, like deciding whether an erratum applies or a feature is available based on its value. Signed-off-by: Horia Geantă <horia.geanta@nxp.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Horia Geantă authored
ablkcipher shared descriptors are relatively small, so there is enough space for the key to be inlined. Accordingly, there is no need to copy the key into ctx->key. Signed-off-by: Horia Geantă <horia.geanta@nxp.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Horia Geantă authored
Key data is not modified; it is only copied into the shared descriptor. Signed-off-by: Horia Geantă <horia.geanta@nxp.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Eric Biggers authored
Using %rbp as a temporary register breaks frame pointer convention and breaks stack traces when unwinding from an interrupt in the crypto code. In twofish-3way, we can't simply replace %rbp with another register because there are none available. Instead, we use the stack to hold the values that %rbp, %r11, and %r12 were holding previously. Each of these values represents the half of the output from the previous Feistel round that is being passed on unchanged to the following round. They are only used once per round, when they are exchanged with %rax, %rbx, and %rcx. As a result, we free up 3 registers (one per block) and can reassign them so that %rbp is not used, and additionally %r14 and %r15 are not used so they do not need to be saved/restored. There may be a small overhead caused by replacing 'xchg REG, REG' with the needed sequence 'mov MEM, REG; mov REG, MEM; mov REG, REG' once per round. But, counterintuitively, when I tested "ctr-twofish-3way" on a Haswell processor, the new version was actually about 2% faster. (Perhaps 'xchg' is not as well optimized as plain moves.) Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-