1. 27 Jan, 2023 5 commits
  2. 20 Jan, 2023 14 commits
  3. 18 Jan, 2023 1 commit
    • crypto: p10-aes-gcm - Revert implementation · 596f674d
      Herbert Xu authored
      Revert the changes that added p10-aes-gcm:
      
      	0781bbd7 ("crypto: p10-aes-gcm - A perl script to process PowerPC assembler source")
      	41a6437a ("crypto: p10-aes-gcm - Supporting functions for ghash")
      	3b47ecca ("crypto: p10-aes-gcm - Supporting functions for AES")
      	ca68a96c ("crypto: p10-aes-gcm - An accelerated AES/GCM stitched implementation")
      	cc40379b ("crypto: p10-aes-gcm - Glue code for AES/GCM stitched implementation")
      	3c657e86 ("crypto: p10-aes-gcm - Update Kconfig and Makefile")
      
      These changes fail to build in many configurations and are not ready
      for prime time.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
  4. 13 Jan, 2023 13 commits
  5. 06 Jan, 2023 7 commits
    • crypto: x86/aria - implement aria-avx512 · c970d420
      Taehee Yoo authored
      The aria-avx512 implementation uses AVX512 and GFNI.
      It supports 64-way parallel processing, so the byteslicing code is
      changed to support 64-way parallelism, and some aria-avx2 functions
      such as encrypt() and decrypt() are exported so they can be reused.
      
      AVX and AVX2 provide only 16 vector registers, so state has to be
      stored to and loaded from memory for lack of registers.  AVX512
      provides 32 registers, so no store/load is needed in the s-box
      layer.  This reduces the store/load overhead in the s-box layer and
      makes the code much simpler.
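      To make the byteslicing layout concrete, here is a rough C sketch
      (an illustration only, not the kernel code): 64 independent
      16-byte ARIA blocks are transposed so that byte i of every block
      lands in one 64-byte slice, and on AVX512 each such slice fits in
      exactly one zmm register.
      
      	#include <stdint.h>
      
      	#define WAYS		64	/* 64-way parallel processing */
      	#define BLOCK_BYTES	16	/* ARIA block size */
      
      	/* Transpose WAYS blocks so slice[i] holds byte i of every block. */
      	static void byteslice(const uint8_t in[WAYS][BLOCK_BYTES],
      			      uint8_t slice[BLOCK_BYTES][WAYS])
      	{
      		int b, i;
      
      		for (b = 0; b < WAYS; b++)
      			for (i = 0; i < BLOCK_BYTES; i++)
      				slice[i][b] = in[b][i];
      	}
      
      With this layout, one 64-byte vector instruction applies the same
      s-box step to all 64 blocks at once.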
      
      Benchmark with modprobe tcrypt mode=610 num_mb=8192, i3-12100:
      
      ARIA-AVX512(128bit and 256bit)
          testing speed of multibuffer ecb(aria) (ecb-aria-avx512) encryption
      tcrypt: 1 operation in 1504 cycles (1024 bytes)
      tcrypt: 1 operation in 4595 cycles (4096 bytes)
      tcrypt: 1 operation in 1763 cycles (1024 bytes)
      tcrypt: 1 operation in 5540 cycles (4096 bytes)
          testing speed of multibuffer ecb(aria) (ecb-aria-avx512) decryption
      tcrypt: 1 operation in 1502 cycles (1024 bytes)
      tcrypt: 1 operation in 4615 cycles (4096 bytes)
      tcrypt: 1 operation in 1759 cycles (1024 bytes)
      tcrypt: 1 operation in 5554 cycles (4096 bytes)
      
      ARIA-AVX2 with GFNI(128bit and 256bit)
          testing speed of multibuffer ecb(aria) (ecb-aria-avx2) encryption
      tcrypt: 1 operation in 2003 cycles (1024 bytes)
      tcrypt: 1 operation in 5867 cycles (4096 bytes)
      tcrypt: 1 operation in 2358 cycles (1024 bytes)
      tcrypt: 1 operation in 7295 cycles (4096 bytes)
          testing speed of multibuffer ecb(aria) (ecb-aria-avx2) decryption
      tcrypt: 1 operation in 2004 cycles (1024 bytes)
      tcrypt: 1 operation in 5956 cycles (4096 bytes)
      tcrypt: 1 operation in 2409 cycles (1024 bytes)
      tcrypt: 1 operation in 7564 cycles (4096 bytes)
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    • crypto: x86/aria - implement aria-avx2 · 37d8d3ae
      Taehee Yoo authored
      The aria-avx2 implementation uses AVX2, AES-NI, and GFNI.
      It supports 32-way parallel processing, so the byteslicing code is
      changed to support 32-way parallelism, and some aria-avx functions
      such as encrypt() and decrypt() are exported so they can be reused.
      
      There are two main pieces of logic, the s-box layer and the
      diffusion layer.  The code is the same as in the aria-avx
      implementation, but some instructions are exchanged because they
      have no 256-bit form.  AES-NI in particular does not support
      256-bit registers, so aesenclast and aesdeclast are used twice,
      like below:
      	vextracti128 $1, ymm0, xmm6;		/* split off the high 128 bits */
      	vaesenclast xmm7, xmm0, xmm0;		/* last AES round on the low half */
      	vaesenclast xmm7, xmm6, xmm6;		/* last AES round on the high half */
      	vinserti128 $1, xmm6, ymm0, ymm0;	/* merge the halves back into ymm0 */
      
      Benchmark with modprobe tcrypt mode=610 num_mb=8192, i3-12100:
      
      ARIA-AVX2 with GFNI(128bit and 256bit)
          testing speed of multibuffer ecb(aria) (ecb-aria-avx2) encryption
      tcrypt: 1 operation in 2003 cycles (1024 bytes)
      tcrypt: 1 operation in 5867 cycles (4096 bytes)
      tcrypt: 1 operation in 2358 cycles (1024 bytes)
      tcrypt: 1 operation in 7295 cycles (4096 bytes)
          testing speed of multibuffer ecb(aria) (ecb-aria-avx2) decryption
      tcrypt: 1 operation in 2004 cycles (1024 bytes)
      tcrypt: 1 operation in 5956 cycles (4096 bytes)
      tcrypt: 1 operation in 2409 cycles (1024 bytes)
      tcrypt: 1 operation in 7564 cycles (4096 bytes)
      
      ARIA-AVX with GFNI(128bit and 256bit)
          testing speed of multibuffer ecb(aria) (ecb-aria-avx) encryption
      tcrypt: 1 operation in 2761 cycles (1024 bytes)
      tcrypt: 1 operation in 9390 cycles (4096 bytes)
      tcrypt: 1 operation in 3401 cycles (1024 bytes)
      tcrypt: 1 operation in 11876 cycles (4096 bytes)
          testing speed of multibuffer ecb(aria) (ecb-aria-avx) decryption
      tcrypt: 1 operation in 2735 cycles (1024 bytes)
      tcrypt: 1 operation in 9424 cycles (4096 bytes)
      tcrypt: 1 operation in 3369 cycles (1024 bytes)
      tcrypt: 1 operation in 11954 cycles (4096 bytes)
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    • crypto: x86/aria - do not use magic number offsets of aria_ctx · 35344cf3
      Taehee Yoo authored
      The aria-avx assembly code accesses members of struct aria_ctx
      with magic-number offsets.  If the layout of struct aria_ctx is
      changed carelessly, aria-avx will stop working.  So we need to
      access the members of aria_ctx with correct offset values, not
      with magic numbers.
      
      This patch adds ARIA_CTX_enc_key, ARIA_CTX_dec_key, and
      ARIA_CTX_rounds to asm-offsets.c so that correct offset
      definitions are generated, and the aria-avx assembly code can
      access the members of aria_ctx safely through these definitions.
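      The generated entries presumably look like the following sketch
      (OFFSET() is the kbuild helper that emits offsetof(struct aria_ctx,
      member) as an assembler-visible constant; the surrounding Kconfig
      guards are omitted here):
      
      	/* arch/x86/kernel/asm-offsets.c (sketch) */
      	OFFSET(ARIA_CTX_enc_key, aria_ctx, enc_key);
      	OFFSET(ARIA_CTX_dec_key, aria_ctx, dec_key);
      	OFFSET(ARIA_CTX_rounds, aria_ctx, rounds);
      
      The assembly can then refer to ARIA_CTX_rounds instead of a
      hard-coded byte offset.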
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    • crypto: x86/aria - add keystream array into request ctx · 8e7d7ce2
      Taehee Yoo authored
      The AVX-accelerated ARIA module used a local keystream array, but
      the keystream array is too big for a local variable.  So this
      patch puts the keystream array into the request ctx instead.
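      A minimal sketch of the change, with illustrative names and sizes
      (not the actual kernel diff):
      
      	#include <stdint.h>
      
      	/*
      	 * Before: the keystream was a large local array.
      	 * After: it lives in the per-request context, which the
      	 * crypto API allocates along with each request.
      	 */
      	struct aria_req_ctx {			/* hypothetical name */
      		uint8_t keystream[16 * 32];	/* e.g. 32 parallel 16-byte blocks */
      	};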
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    • crypto: ccp - Avoid page allocation failure warning for SEV_GET_ID2 · 91dfd982
      David Rientjes authored
      For SEV_GET_ID2, the user-provided length does not have a specified
      limit, because the length of the ID may change in the future.  The
      kernel memory allocation, however, is implicitly limited to 4MB on
      x86 by the page allocator; beyond that, the kzalloc() will fail.
      
      When this happens, it is best not to spam the kernel log with the
      allocation-failure warning.  Simply fail the allocation and return
      -ENOMEM to the user.
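      A sketch of the resulting allocation (the variable names follow
      the ccp driver but are shown here as an illustration; __GFP_NOWARN
      is the standard flag that suppresses the page-allocation failure
      warning):
      
      	/* Let an oversized request fail quietly instead of warning. */
      	id_blob = kzalloc(input.length, GFP_KERNEL | __GFP_NOWARN);
      	if (!id_blob)
      		return -ENOMEM;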
      
      Fixes: d6112ea0 ("crypto: ccp - introduce SEV_GET_ID2 command")
      Reported-by: Andy Nguyen <theflow@google.com>
      Reported-by: Peter Gonda <pgonda@google.com>
      Suggested-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David Rientjes <rientjes@google.com>
      Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    • crypto: talitos - Remove GFP_DMA and add DMA alignment padding · 8e613cec
      Herbert Xu authored
      GFP_DMA does not guarantee that the returned memory is aligned
      for DMA.  It should be removed where it is superfluous.
      
      However, kmalloc may start returning DMA-unaligned memory in the
      future, so fix this by adding the alignment by hand.
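      A rough sketch of the hand-alignment pattern (a generic
      illustration, not the talitos diff; dma_get_cache_alignment() and
      PTR_ALIGN() are standard kernel helpers):
      
      	/*
      	 * Over-allocate, then align the pointer by hand.  The
      	 * original pointer must be kept around for kfree().
      	 */
      	align = dma_get_cache_alignment();
      	buf = kmalloc(size + align - 1, GFP_KERNEL);
      	if (!buf)
      		return -ENOMEM;
      	aligned = PTR_ALIGN(buf, align);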
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    • crypto: caam - Remove GFP_DMA and add DMA alignment padding · 199354d7
      Herbert Xu authored
      GFP_DMA does not guarantee that the returned memory is aligned
      for DMA.  It should be removed where it is superfluous.
      
      However, kmalloc may start returning DMA-unaligned memory in the
      future, so fix this by adding the alignment by hand.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>