Commits · c2b24c36e0a30ebd8fc7d068da7f0451f2c05c76 · Kirill Smelkov / linux

25 Aug, 2018 7 commits

crypto: arm64/aes-gcm-ce - fix scatterwalk API violation · c2b24c36

Ard Biesheuvel authored Aug 20, 2018

Commit 71e52c27 ("crypto: arm64/aes-ce-gcm - operate on
two input blocks at a time") modified the granularity at which
the AES/GCM code processes its input to allow subsequent changes
to be applied that improve performance by using aggregation to
process multiple input blocks at once.

For this reason, it doubled the algorithm's 'chunksize' property
to 2 x AES_BLOCK_SIZE, but retained the non-SIMD fallback path that
processes a single block at a time. In some cases, this violates the
skcipher scatterwalk API, by calling skcipher_walk_done() with a
non-zero residue value for a chunk that is expected to be handled
in its entirety. This results in a WARN_ON() to be hit by the TLS
self test code, but is likely to break other user cases as well.
Unfortunately, none of the current test cases exercises this exact
code path at the moment.

Fixes: 71e52c27 ("crypto: arm64/aes-ce-gcm - operate on two ...")
Reported-by: Vakul Garg <vakul.garg@nxp.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Vakul Garg <vakul.garg@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

c2b24c36

crypto: aesni - Use unaligned loads from gcm_context_data · e5b954e8

Dave Watson authored Aug 15, 2018

A regression was reported bisecting to 1476db2d
"Move HashKey computation from stack to gcm_context".  That diff
moved HashKey computation from the stack, which was explicitly aligned
in the asm, to a struct provided from the C code, depending on
AESNI_ALIGN_ATTR for alignment.   It appears some compilers may not
align this struct correctly, resulting in a crash on the movdqa
instruction when attempting to encrypt or decrypt data.

Fix by using unaligned loads for the HashKeys.  On modern
hardware there is no perf difference between the unaligned and
aligned loads.  All other accesses to gcm_context_data already use
unaligned loads.
Reported-by: Mauro Rossi <issor.oruam@gmail.com>
Fixes: 1476db2d ("Move HashKey computation from stack to gcm_context")
Cc: <stable@vger.kernel.org>
Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

e5b954e8

crypto: chtls - fix null dereference chtls_free_uld() · 65b2c12d

Ganesh Goudar authored Aug 10, 2018

call chtls_free_uld() only for the initialized cdev,
this fixes NULL dereference in chtls_free_uld()
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: Atul Gupta <atul.gupta@chelsio.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

65b2c12d

crypto: arm64/sm4-ce - check for the right CPU feature bit · 7fa885e2

Ard Biesheuvel authored Aug 07, 2018

ARMv8.2 specifies special instructions for the SM3 cryptographic hash
and the SM4 symmetric cipher. While it is unlikely that a core would
implement one and not the other, we should only use SM4 instructions
if the SM4 CPU feature bit is set, and we currently check the SM3
feature bit instead. So fix that.

Fixes: e99ce921 ("crypto: arm64 - add support for SM4...")
Cc: <stable@vger.kernel.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

7fa885e2

crypto: caam - fix DMA mapping direction for RSA forms 2 & 3 · f1bf9e60

Horia Geantă authored Aug 06, 2018

Crypto engine needs some temporary locations in external memory for
running RSA decrypt forms 2 and 3 (CRT).
These are named "tmp1" and "tmp2" in the PDB.

Update DMA mapping direction of tmp1 and tmp2 from TO_DEVICE to
BIDIRECTIONAL, since engine needs r/w access.

Cc: <stable@vger.kernel.org> # 4.13+
Fixes: 52e26d77 ("crypto: caam - add support for RSA key form 2")
Fixes: 4a651b12 ("crypto: caam - add support for RSA key form 3")
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

f1bf9e60

crypto: caam/qi - fix error path in xts setkey · ad876a18

Horia Geantă authored Aug 06, 2018

xts setkey callback returns 0 on some error paths.
Fix this by returning -EINVAL.

Cc: <stable@vger.kernel.org> # 4.12+
Fixes: b189817c ("crypto: caam/qi - add ablkcipher and authenc algorithms")
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

ad876a18

crypto: caam/jr - fix descriptor DMA unmapping · cc98963d

Horia Geantă authored Aug 06, 2018

Descriptor address needs to be swapped to CPU endianness before being
DMA unmapped.

Cc: <stable@vger.kernel.org> # 4.8+
Fixes: 261ea058 ("crypto: caam - handle core endianness != caam endianness")
Reported-by: Laurentiu Tudor <laurentiu.tudor@nxp.com>
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

cc98963d

07 Aug, 2018 9 commits

crypto: arm64/ghash-ce - implement 4-way aggregation · 22240df7

Ard Biesheuvel authored Aug 04, 2018

Enhance the GHASH implementation that uses 64-bit polynomial
multiplication by adding support for 4-way aggregation. This
more than doubles the performance, from 2.4 cycles per byte
to 1.1 cpb on Cortex-A53.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

22240df7

crypto: arm64/ghash-ce - replace NEON yield check with block limit · 8e492eff

Ard Biesheuvel authored Aug 04, 2018

Checking the TIF_NEED_RESCHED flag is disproportionately costly on cores
with fast crypto instructions and comparatively slow memory accesses.

On algorithms such as GHASH, which executes at ~1 cycle per byte on
cores that implement support for 64 bit polynomial multiplication,
there is really no need to check the TIF_NEED_RESCHED particularly
often, and so we can remove the NEON yield check from the assembler
routines.

However, unlike the AEAD or skcipher APIs, the shash/ahash APIs take
arbitrary input lengths, and so there needs to be some sanity check
to ensure that we don't hog the CPU for excessive amounts of time.

So let's simply cap the maximum input size that is processed in one go
to 64 KB.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

8e492eff

crypto: hisilicon - sec_send_request() can be static · 8418cf54

kbuild test robot authored Aug 04, 2018

Fixes: 915e4e84 ("crypto: hisilicon - SEC security accelerator driver")
Signed-off-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

8418cf54

lib/mpi: remove redundant variable esign · 6122bbbd

Colin Ian King authored Jul 30, 2018

Variable esign is being assigned but is never used hence it is
redundant and can be removed.

Cleans up clang warning:
warning: variable 'esign' set but not used [-Wunused-but-set-variable]
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

6122bbbd

crypto: arm64/aes-ce-gcm - don't reload key schedule if avoidable · 30f1a9f5

Ard Biesheuvel authored Jul 30, 2018

Squeeze out another 5% of performance by minimizing the number
of invocations of kernel_neon_begin()/kernel_neon_end() on the
common path, which also allows some reloads of the key schedule
to be optimized away.

The resulting code runs at 2.3 cycles per byte on a Cortex-A53.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

30f1a9f5

crypto: arm64/aes-ce-gcm - implement 2-way aggregation · e0bd888d

Ard Biesheuvel authored Jul 30, 2018

Implement a faster version of the GHASH transform which amortizes
the reduction modulo the characteristic polynomial across two
input blocks at a time.

On a Cortex-A53, the gcm(aes) performance increases 24%, from
3.0 cycles per byte to 2.4 cpb for large input sizes.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

e0bd888d

crypto: arm64/aes-ce-gcm - operate on two input blocks at a time · 71e52c27

Ard Biesheuvel authored Jul 30, 2018

Update the core AES/GCM transform and the associated plumbing to operate
on 2 AES/GHASH blocks at a time. By itself, this is not expected to
result in a noticeable speedup, but it paves the way for reimplementing
the GHASH component using 2-way aggregation.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

71e52c27

Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · 3465893d
Herbert Xu authored Aug 07, 2018
```
Merge crypto-2.6 to pick up NEON yield revert.
```
3465893d

crypto: arm64 - revert NEON yield for fast AEAD implementations · f10dc56c

Ard Biesheuvel authored Jul 29, 2018

As it turns out, checking the TIF_NEED_RESCHED flag after each
iteration results in a significant performance regression (~10%)
when running fast algorithms (i.e., ones that use special instructions
and operate in the < 4 cycles per byte range) on in-order cores with
comparatively slow memory accesses such as the Cortex-A53.

Given the speed of these ciphers, and the fact that the page based
nature of the AEAD scatterwalk API guarantees that the core NEON
transform is never invoked with more than a single page's worth of
input, we can estimate the worst case duration of any resulting
scheduling blackout: on a 1 GHz Cortex-A53 running with 64k pages,
processing a page's worth of input at 4 cycles per byte results in
a delay of ~250 us, which is a reasonable upper bound.

So let's remove the yield checks from the fused AES-CCM and AES-GCM
routines entirely.

This reverts commit 7b67ae4d and
partially reverts commit 7c50136a.

Fixes: 7c50136a ("crypto: arm64/aes-ghash - yield NEON after every ...")
Fixes: 7b67ae4d ("crypto: arm64/aes-ccm - yield NEON after every ...")
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

f10dc56c

03 Aug, 2018 24 commits

crypto: dh - make crypto_dh_encode_key() make robust · d6e43798

Eric Biggers authored Jul 27, 2018

Make it return -EINVAL if crypto_dh_key_len() is incorrect rather than
overflowing the buffer.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

d6e43798

crypto: dh - fix calculating encoded key size · 35f7d522

Eric Biggers authored Jul 27, 2018

It was forgotten to increase DH_KPP_SECRET_MIN_SIZE to include 'q_size',
causing an out-of-bounds write of 4 bytes in crypto_dh_encode_key(), and
an out-of-bounds read of 4 bytes in crypto_dh_decode_key(). Fix it, and
fix the lengths of the test vectors to match this.

Reported-by: syzbot+6d38d558c25b53b8f4ed@syzkaller.appspotmail.com
Fixes: e3fe0ae1 ("crypto: dh - add public key verification test")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

35f7d522

crypto: ccp - Check for NULL PSP pointer at module unload · afb31cd2

Tom Lendacky authored Jul 26, 2018

Should the PSP initialization fail, the PSP data structure will be
freed and the value contained in the sp_device struct set to NULL.
At module unload, psp_dev_destroy() does not check if the pointer
value is NULL and will end up dereferencing a NULL pointer.

Add a pointer check of the psp_data field in the sp_device struct
in psp_dev_destroy() and return immediately if it is NULL.

Cc: <stable@vger.kernel.org> # 4.16.x-
Fixes: 2a6170df ("crypto: ccp: Add Platform Security Processor (PSP) device support")
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Acked-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

afb31cd2

crypto: arm/chacha20 - always use vrev for 16-bit rotates · 4e34e51f

Eric Biggers authored Jul 24, 2018

The 4-way ChaCha20 NEON code implements 16-bit rotates with vrev32.16,
but the one-way code (used on remainder blocks) implements it with
vshl + vsri, which is slower.  Switch the one-way code to vrev32.16 too.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

4e34e51f

crypto: ccree - allow bigger than sector XTS op · f53ad3e1

Gilad Ben-Yossef authored Jul 24, 2018

The ccree driver had a sanity check that we are not asked
to encrypt an XTS buffer bigger than a sane sector size
since XTS IV needs to include the sector number in the IV
so this is not expected in any real use case.

Unfortunately, this breaks cryptsetup benchmark test which
has a synthetic performance test using 64k buffer of data
with the same IV.

Remove the sanity check and allow the user to hang themselves
and/or run benchmarks if they so wish.
Reported-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

f53ad3e1

crypto: ccree - zero all of request ctx before use · e30368f3

Gilad Ben-Yossef authored Jul 24, 2018

In certain error path req_ctx->iv was being freed despite
not being allocated because it was not initialized to NULL.
Rather than play whack a mole with the structure various
field, zero it before use.

This fixes a kernel panic that may occur if an invalid
buffer size was requested triggering the bug above.

Fixes: 63ee04c8 ("crypto: ccree - add skcipher support")
Reported-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

e30368f3

crypto: ccree - remove cipher ivgen left overs · f5c19df9

Gilad Ben-Yossef authored Jul 24, 2018

IV generation is not available via the skcipher interface.
Remove the left over support of it from the ablkcipher days.
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

f5c19df9

crypto: ccree - drop useless type flag during reg · 76c9e53e

Gilad Ben-Yossef authored Jul 24, 2018

Drop the explicit setting of CRYPTO_ALG_TYPE_AEAD or
CRYPTO_ALG_TYPE_SKCIPHER flags during alg registration as they are
set anyway by the framework.
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

76c9e53e

crypto: ablkcipher - fix crash flushing dcache in error path · 318abdfb

Eric Biggers authored Jul 23, 2018

Like the skcipher_walk and blkcipher_walk cases:

scatterwalk_done() is only meant to be called after a nonzero number of
bytes have been processed, since scatterwalk_pagedone() will flush the
dcache of the *previous* page.  But in the error case of
ablkcipher_walk_done(), e.g. if the input wasn't an integer number of
blocks, scatterwalk_done() was actually called after advancing 0 bytes.
This caused a crash ("BUG: unable to handle kernel paging request")
during '!PageSlab(page)' on architectures like arm and arm64 that define
ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE, provided that the input was
page-aligned as in that case walk->offset == 0.

Fix it by reorganizing ablkcipher_walk_done() to skip the
scatterwalk_advance() and scatterwalk_done() if an error has occurred.
Reported-by: Liu Chao <liuchao741@huawei.com>
Fixes: bf06099d ("crypto: skcipher - Add ablkcipher_walk interfaces")
Cc: <stable@vger.kernel.org> # v2.6.35+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

318abdfb

crypto: blkcipher - fix crash flushing dcache in error path · 0868def3

Eric Biggers authored Jul 23, 2018

Like the skcipher_walk case:

scatterwalk_done() is only meant to be called after a nonzero number of
bytes have been processed, since scatterwalk_pagedone() will flush the
dcache of the *previous* page.  But in the error case of
blkcipher_walk_done(), e.g. if the input wasn't an integer number of
blocks, scatterwalk_done() was actually called after advancing 0 bytes.
This caused a crash ("BUG: unable to handle kernel paging request")
during '!PageSlab(page)' on architectures like arm and arm64 that define
ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE, provided that the input was
page-aligned as in that case walk->offset == 0.

Fix it by reorganizing blkcipher_walk_done() to skip the
scatterwalk_advance() and scatterwalk_done() if an error has occurred.

This bug was found by syzkaller fuzzing.

Reproducer, assuming ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE:

	#include <linux/if_alg.h>
	#include <sys/socket.h>
	#include <unistd.h>

	int main()
	{
		struct sockaddr_alg addr = {
			.salg_type = "skcipher",
			.salg_name = "ecb(aes-generic)",
		};
		char buffer[4096] __attribute__((aligned(4096))) = { 0 };
		int fd;

		fd = socket(AF_ALG, SOCK_SEQPACKET, 0);
		bind(fd, (void *)&addr, sizeof(addr));
		setsockopt(fd, SOL_ALG, ALG_SET_KEY, buffer, 16);
		fd = accept(fd, NULL, NULL);
		write(fd, buffer, 15);
		read(fd, buffer, 15);
	}
Reported-by: Liu Chao <liuchao741@huawei.com>
Fixes: 5cde0af2 ("[CRYPTO] cipher: Added block cipher type")
Cc: <stable@vger.kernel.org> # v2.6.19+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

0868def3

crypto: skcipher - fix crash flushing dcache in error path · 8088d3dd

Eric Biggers authored Jul 23, 2018

scatterwalk_done() is only meant to be called after a nonzero number of
bytes have been processed, since scatterwalk_pagedone() will flush the
dcache of the *previous* page.  But in the error case of
skcipher_walk_done(), e.g. if the input wasn't an integer number of
blocks, scatterwalk_done() was actually called after advancing 0 bytes.
This caused a crash ("BUG: unable to handle kernel paging request")
during '!PageSlab(page)' on architectures like arm and arm64 that define
ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE, provided that the input was
page-aligned as in that case walk->offset == 0.

Fix it by reorganizing skcipher_walk_done() to skip the
scatterwalk_advance() and scatterwalk_done() if an error has occurred.

This bug was found by syzkaller fuzzing.

Reproducer, assuming ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE:

	#include <linux/if_alg.h>
	#include <sys/socket.h>
	#include <unistd.h>

	int main()
	{
		struct sockaddr_alg addr = {
			.salg_type = "skcipher",
			.salg_name = "cbc(aes-generic)",
		};
		char buffer[4096] __attribute__((aligned(4096))) = { 0 };
		int fd;

		fd = socket(AF_ALG, SOCK_SEQPACKET, 0);
		bind(fd, (void *)&addr, sizeof(addr));
		setsockopt(fd, SOL_ALG, ALG_SET_KEY, buffer, 16);
		fd = accept(fd, NULL, NULL);
		write(fd, buffer, 15);
		read(fd, buffer, 15);
	}
Reported-by: Liu Chao <liuchao741@huawei.com>
Fixes: b286d8b1 ("crypto: skcipher - Add skcipher walk interface")
Cc: <stable@vger.kernel.org> # v4.10+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

8088d3dd

crypto: skcipher - remove unnecessary setting of walk->nbytes · 2a57c0be

Eric Biggers authored Jul 23, 2018

Setting 'walk->nbytes = walk->total' in skcipher_walk_first() doesn't
make sense because actually walk->nbytes needs to be set to the length
of the first step in the walk, which may be less than walk->total. This
is done by skcipher_walk_next() which is called immediately afterwards.
Also walk->nbytes was already set to 0 in skcipher_walk_skcipher(),
which is a better default value in case it's forgotten to be set later.

Therefore, remove the unnecessary assignment to walk->nbytes.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

2a57c0be

crypto: scatterwalk - remove scatterwalk_samebuf() · 3dd8cc00

Eric Biggers authored Jul 23, 2018

scatterwalk_samebuf() is never used.  Remove it.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

3dd8cc00

crypto: scatterwalk - remove 'chain' argument from scatterwalk_crypto_chain() · 8c30fbe6

Eric Biggers authored Jul 23, 2018

All callers pass chain=0 to scatterwalk_crypto_chain().

Remove this unneeded parameter.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

8c30fbe6

crypto: skcipher - fix aligning block size in skcipher_copy_iv() · 0567fc9e

Eric Biggers authored Jul 23, 2018

The ALIGN() macro needs to be passed the alignment, not the alignmask
(which is the alignment minus 1).

Fixes: b286d8b1 ("crypto: skcipher - Add skcipher walk interface")
Cc: <stable@vger.kernel.org> # v4.10+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

0567fc9e

arm64: dts: hisi: add SEC crypto accelerator nodes for hip07 SoC · e4a1f785

Jonathan Cameron authored Jul 23, 2018

Enable all 4 SEC units available on d05 boards.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

e4a1f785

crypto: hisilicon - SEC security accelerator driver · 915e4e84

Jonathan Cameron authored Jul 23, 2018

This accelerator is found inside hisilicon hip06 and hip07 SoCs.
Each instance provides a number of queues which feed a different number of
backend acceleration units.

The queues are operating in an out of order mode in the interests of
throughput. The silicon does not do tracking of dependencies between
multiple 'messages' or update of the IVs as appropriate for training.
Hence where relevant we need to do this in software.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

915e4e84

dt-bindings: Add bindings for Hisilicon SEC crypto accelerators. · 778bd992

Jonathan Cameron authored Jul 23, 2018

The hip06 and hip07 SoCs contain a number of these crypto units which
accelerate AES and DES operations.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

778bd992

crypto: tcrypt - reschedule during speed tests · 2af63299

Horia Geantă authored Jul 23, 2018

Avoid RCU stalls in the case of non-preemptible kernel and lengthy
speed tests by rescheduling when advancing from one block size
to another.
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

2af63299

crypto: virtio - Replace GFP_ATOMIC with GFP_KERNEL in __virtio_crypto_ablkcipher_do_req() · f6adeef7

Jia-Ju Bai authored Jul 23, 2018

__virtio_crypto_ablkcipher_do_req() is never called in atomic context.

__virtio_crypto_ablkcipher_do_req() is only called by
virtio_crypto_ablkcipher_crypt_req(), which is only called by
virtcrypto_find_vqs() that is never called in atomic context.

__virtio_crypto_ablkcipher_do_req() calls kzalloc_node() with GFP_ATOMIC,
which is not necessary.
GFP_ATOMIC can be replaced with GFP_KERNEL.

This is found by a static analysis tool named DCNS written by myself.
I also manually check the kernel code before reporting it.
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

f6adeef7

crypto: qat/adf_aer - Replace GFP_ATOMIC with GFP_KERNEL in adf_dev_aer_schedule_reset() · 8e8c0386

Jia-Ju Bai authored Jul 23, 2018

adf_dev_aer_schedule_reset() is never called in atomic context, as it
calls wait_for_completion_timeout().

adf_dev_aer_schedule_reset() calls kzalloc() with GFP_ATOMIC,
which is not necessary.
GFP_ATOMIC can be replaced with GFP_KERNEL.

This is found by a static analysis tool named DCNS written by myself.
I also manually check the kernel code before reporting it.
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

8e8c0386

crypto: cavium/nitrox - Replace GFP_ATOMIC with GFP_KERNEL in crypto_alloc_context() · 1c96dde1

Jia-Ju Bai authored Jul 23, 2018

crypto_alloc_context() is only called by nitrox_skcipher_init(), which is
never called in atomic context.

crypto_alloc_context() calls dma_pool_alloc() with GFP_ATOMIC,
which is not necessary.
GFP_ATOMIC can be replaced with GFP_KERNEL.

This is found by a static analysis tool named DCNS written by myself.
I also manually check the kernel code before reporting it.
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

1c96dde1

crypto: drbg - in-place cipher operation for CTR · 43490e80

Stephan Müller authored Jul 20, 2018

The cipher implementations of the kernel crypto API favor in-place
cipher operations. Thus, switch the CTR cipher operation in the DRBG to
perform in-place operations. This is implemented by using the output
buffer as input buffer and zeroizing it before the cipher operation to
implement a CTR encryption of a NULL buffer.

The speed improvement is quite visibile with the following comparison
using the LRNG implementation.

Without the patch set:

With the patch set:

43490e80

Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux · c5f5aeef
Herbert Xu authored Aug 03, 2018
```
Merge mainline to pick up c7513c2a ("crypto/arm64: aes-ce-gcm -
add missing kernel_neon_begin/end pair").
```
c5f5aeef