- 27 Jul, 2019 7 commits
-
-
Herbert Xu authored
The function padata_reorder will use a timer when it cannot progress while completed jobs are outstanding (pd->reorder_objects > 0). This is suboptimal as if we do end up using the timer then it would have introduced a gratuitous delay of one second. In fact we can easily distinguish between whether completed jobs are outstanding and whether we can make progress. All we have to do is look at the next pqueue list. This patch does that by replacing pd->processed with pd->cpu so that the next pqueue is more accessible. A work queue is used instead of the original try_again to avoid hogging the CPU. Note that we don't bother removing the work queue in padata_flush_queues because the whole premise is broken. You cannot flush async crypto requests so it makes no sense to even try. A subsequent patch will fix it by replacing it with a ref counting scheme. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Arnd Bergmann authored
Clang sometimes makes very different inlining decisions from gcc. In case of the aegis crypto algorithms, it decides to turn the innermost primitives (and, xor, ...) into separate functions but inline most of the rest. This results in a huge amount of variables spilled on the stack, leading to rather slow execution as well as kernel stack usage beyond the 32-bit warning limit when CONFIG_KASAN is enabled: crypto/aegis256.c:123:13: warning: stack frame size of 648 bytes in function 'crypto_aegis256_encrypt_chunk' [-Wframe-larger-than=] crypto/aegis256.c:366:13: warning: stack frame size of 1264 bytes in function 'crypto_aegis256_crypt' [-Wframe-larger-than=] crypto/aegis256.c:187:13: warning: stack frame size of 656 bytes in function 'crypto_aegis256_decrypt_chunk' [-Wframe-larger-than=] crypto/aegis128l.c:135:13: warning: stack frame size of 832 bytes in function 'crypto_aegis128l_encrypt_chunk' [-Wframe-larger-than=] crypto/aegis128l.c:415:13: warning: stack frame size of 1480 bytes in function 'crypto_aegis128l_crypt' [-Wframe-larger-than=] crypto/aegis128l.c:218:13: warning: stack frame size of 848 bytes in function 'crypto_aegis128l_decrypt_chunk' [-Wframe-larger-than=] crypto/aegis128.c:116:13: warning: stack frame size of 584 bytes in function 'crypto_aegis128_encrypt_chunk' [-Wframe-larger-than=] crypto/aegis128.c:351:13: warning: stack frame size of 1064 bytes in function 'crypto_aegis128_crypt' [-Wframe-larger-than=] crypto/aegis128.c:177:13: warning: stack frame size of 592 bytes in function 'crypto_aegis128_decrypt_chunk' [-Wframe-larger-than=] Forcing the primitives to all get inlined avoids the issue and the resulting code is similar to what gcc produces. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Chuhong Yuan authored
Use dma_pool_zalloc instead of using dma_pool_alloc to allocate memory and then zeroing it with memset 0. This simplifies the code. Signed-off-by: Chuhong Yuan <hslester96@gmail.com> Acked-by: Gary R Hook <gary.hook@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Vakul Garg authored
While running ipsec processing for traffic through multiple network interfaces, it is observed that caam driver gets less time to poll responses from caam block compared to ethernet driver. This is because ethernet driver has as many napi instances per cpu as the number of ethernet interfaces in system. Therefore, caam driver's napi executes lesser than the ethernet driver's napi instances. This results in situation that we end up submitting more requests to caam (which it is able to finish off quite fast), but don't dequeue the responses at same rate. This makes caam response FQs bloat with large number of frames. In some situations, it makes kernel crash due to out-of-memory. To prevent it We increase the napi budget of dpseci driver to a big value so that caam driver is able to drain its response queues at enough rate. Signed-off-by: Vakul Garg <vakul.garg@nxp.com> Reviewed-by: Horia Geantă <horia.geanta@nxp.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Anson Huang authored
Use the new helper devm_platform_ioremap_resource() which wraps the platform_get_resource() and devm_ioremap_resource() together, to simplify the code. Signed-off-by: Anson Huang <Anson.Huang@nxp.com> Reviewed-by: Dong Aisheng <aisheng.dong@nxp.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Anson Huang authored
Use the new helper devm_platform_ioremap_resource() which wraps the platform_get_resource() and devm_ioremap_resource() together, to simplify the code. Signed-off-by: Anson Huang <Anson.Huang@nxp.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Dong Aisheng <aisheng.dong@nxp.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Arnd Bergmann authored
Each of the operations in ccp_run_cmd() needs several hundred bytes of kernel stack. Depending on the inlining, these may need separate stack slots that add up to more than the warning limit, as shown in this clang based build: drivers/crypto/ccp/ccp-ops.c:871:12: error: stack frame size of 1164 bytes in function 'ccp_run_aes_cmd' [-Werror,-Wframe-larger-than=] static int ccp_run_aes_cmd(struct ccp_cmd_queue *cmd_q, struct ccp_cmd *cmd) The problem may also happen when there is no warning, e.g. in the ccp_run_cmd()->ccp_run_aes_cmd()->ccp_run_aes_gcm_cmd() call chain with over 2000 bytes. Mark each individual function as 'noinline_for_stack' to prevent this from happening, and move the calls to the two special cases for aes into the top-level function. This will keep the actual combined stack usage to the mimimum: 828 bytes for ccp_run_aes_gcm_cmd() and at most 524 bytes for each of the other cases. Fixes: 63b94509 ("crypto: ccp - CCP device driver and interface support") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
- 26 Jul, 2019 33 commits
-
-
Hook, Gary authored
Redefine pr_fmt so that the module name is prefixed to every log message produced by the ccp-crypto module Signed-off-by: Gary R Hook <gary.hook@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Herbert Xu authored
The directory tools/crypto and the only file under it never gets built anywhere. This program should instead be incorporated into one of the existing user-space projects, crconf or libkcapi. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Phani Kiran Hemadri authored
This patch adds support to load Asymmetric crypto firmware on AE cores of CNN55XX device. Firmware is stored on UCD block 2 and all available AE cores are tagged to group 0. Signed-off-by: Phani Kiran Hemadri <phemadri@marvell.com> Reviewed-by: Srikanth Jampala <jsrikanth@marvell.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Hook, Gary authored
The CCP driver is able to act as a DMA engine. Add a module parameter that allows this feature to be enabled/disabled. Signed-off-by: Gary R Hook <gary.hook@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Hook, Gary authored
Provide the ability to constrain the total number of enabled devices in the system. Once max_devs devices have been configured, subsequently probed devices are ignored. The max_devs parameter may be zero, in which case all CCPs are disabled. PSPs are always enabled and active. Disabling the CCPs also disables DMA and RNG registration. Signed-off-by: Gary R Hook <gary.hook@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Hook, Gary authored
Add a module parameter to limit the number of queues per CCP. The default value (nqueues=0) is to set up every available queue on each device. The count of queues starts from the first one found on the device (which varies based on the device ID). Signed-off-by: Gary R Hook <gary.hook@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Hook, Gary authored
Add a config option to exclude DebugFS support in the CCP driver. Signed-off-by: Gary R Hook <gary.hook@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ondrej Mosnacek authored
Currently, NETLINK_CRYPTO works only in the init network namespace. It doesn't make much sense to cut it out of the other network namespaces, so do the minor plumbing work necessary to make it work in any network namespace. Code inspired by net/core/sock_diag.c. Tested using kcapi-dgst from libkcapi [1]: Before: # unshare -n kcapi-dgst -c sha256 </dev/null | wc -c libkcapi - Error: Netlink error: sendmsg failed libkcapi - Error: Netlink error: sendmsg failed libkcapi - Error: NETLINK_CRYPTO: cannot obtain cipher information for hmac(sha512) (is required crypto_user.c patch missing? see documentation) 0 After: # unshare -n kcapi-dgst -c sha256 </dev/null | wc -c 32 [1] https://github.com/smuellerDD/libkcapiSigned-off-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Pascal van Leeuwen authored
This patch recognises the fact that the hardware cannot ever process more than 2,199,023,386,111 bytes of hash or HMAC payload, so there is no point in maintaining 128 bit wide byte counters, 64 bits is more than sufficient Signed-off-by: Pascal van Leeuwen <pvanleeuwen@verimatrix.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Pascal van Leeuwen authored
This patch adds support for the following AEAD ciphersuites: - authenc(hmac(sha1),rfc3686(ctr(aes))) - authenc(hmac(sha224),rfc3686(ctr(aes))) - authenc(hmac(sha256),rfc3686(ctr(aes))) - authenc(hmac(sha384),rfc3686(ctr(aes))) - authenc(hmac(sha512),rfc3686(ctr(aes))) Signed-off-by: Pascal van Leeuwen <pvanleeuwen@verimatrix.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Pascal van Leeuwen authored
Signed-off-by: Pascal van Leeuwen <pvanleeuwen@verimatrix.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Pascal van Leeuwen authored
Signed-off-by: Pascal van Leeuwen <pvanleeuwen@verimatrix.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Sebastian Andrzej Siewior authored
For spinlocks the type spinlock_t should be used instead of "struct spinlock". Use spinlock_t for spinlock's definition. Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "David S. Miller" <davem@davemloft.net> Cc: linux-crypto@vger.kernel.org Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Fuqian Huang authored
kmemdup is introduced to duplicate a region of memory in a neat way. Rather than kmalloc/kzalloc + memcpy, which the programmer needs to write the size twice (sometimes lead to mistakes), kmemdup improves readability, leads to smaller code and also reduce the chances of mistakes. Suggestion to use kmemdup rather than using kmalloc/kzalloc + memcpy. Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com> Reviewed-by: Horia Geantă <horia.geanta@nxp.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
Provide an accelerated implementation of aegis128 by wiring up the SIMD hooks in the generic driver to an implementation based on NEON intrinsics, which can be compiled to both ARM and arm64 code. This results in a performance of 2.2 cycles per byte on Cortex-A53, which is a performance increase of ~11x compared to the generic code. Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
Add some plumbing to allow the AEGIS128 code to be built with SIMD routines for acceleration. Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
The generic AES code provides four sets of lookup tables, where each set consists of four tables containing the same 32-bit values, but rotated by 0, 8, 16 and 24 bits, respectively. This makes sense for CISC architectures such as x86 which support memory operands, but for other architectures, the rotates are quite cheap, and using all four tables needlessly thrashes the D-cache, and actually hurts rather than helps performance. Since x86 already has its own implementation of AEGIS based on AES-NI instructions, let's tweak the generic implementation towards other architectures, and avoid the prerotated tables, and perform the rotations inline. On ARM Cortex-A53, this results in a ~8% speedup. Acked-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
TFM init/exit routines are optional, so no need to provide empty ones. Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
Three variants of AEGIS were proposed for the CAESAR competition, and only one was selected for the final portfolio: AEGIS128. The other variants, AEGIS128L and AEGIS256, are not likely to ever turn up in networking protocols or other places where interoperability between Linux and other systems is a concern, nor are they likely to be subjected to further cryptanalysis. However, uninformed users may think that AEGIS128L (which is faster) is equally fit for use. So let's remove them now, before anyone starts using them and we are forced to support them forever. Note that there are no known flaws in the algorithms or in any of these implementations, but they have simply outlived their usefulness. Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
MORUS was not selected as a winner in the CAESAR competition, which is not surprising since it is considered to be cryptographically broken [0]. (Note that this is not an implementation defect, but a flaw in the underlying algorithm). Since it is unlikely to be in use currently, let's remove it before we're stuck with it. [0] https://eprint.iacr.org/2019/172.pdfReviewed-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Hannah Pan authored
Add self-tests for the lzo-rle algorithm. Signed-off-by: Hannah Pan <hannahpan@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
The scalar table based AES routines are not used by other drivers, so let's keep it that way and unexport the symbols. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
There are a few copies of the AES S-boxes floating around, so export the ones from the AES library so that we can reuse them in other modules. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
The versions of the AES lookup tables that are only used during the last round are never used outside of the driver, so there is no need to export their symbols. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
Replace a couple of occurrences where the "aes-generic" cipher is instantiated explicitly and only used for encryption of a single block. Use AES library calls instead. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
Use the AES library instead of the cipher interface to perform the single block of AES processing involved in updating the key of the cmac(aes) hash. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
The AMCC code for GCM key derivation allocates a AES cipher to perform a single block encryption. So let's switch to the new and more lightweight AES library instead. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
The bluetooth code uses a bare AES cipher for the encryption operations. Given that it carries out a set_key() operation right before every encryption operation, this is clearly not a hot path, and so the use of the cipher interface (which provides the best implementation available on the system) is not really required. In fact, when using a cipher like AES-NI or AES-CE, both the set_key() and the encrypt() operations involve en/disabling preemption as well as stacking and unstacking the SIMD context, and this is most certainly not worth it for encrypting 16 bytes of data. So let's switch to the new lightweight library interface instead. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
GHASH is used by the GCM mode, which is often used in contexts where only synchronous ciphers are permitted. So provide a synchronous version of GHASH based on the existing code. This requires a non-SIMD fallback to deal with invocations occurring from a context where SIMD instructions may not be used. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-