  1. 25 Nov, 2022 13 commits
  2. 22 Nov, 2022 1 commit
  3. 18 Nov, 2022 13 commits
  4. 14 Nov, 2022 1 commit
  5. 11 Nov, 2022 5 commits
    • crypto: qat - remove ADF_STATUS_PF_RUNNING flag from probe · 557ffd5a
      Shashank Gupta authored
      The ADF_STATUS_PF_RUNNING bit is set in adf_vf2pf_notify_init() after
      the communication between the VF and the PF has been initialized
      successfully. It therefore does not need to be set again after
      adf_dev_init() has run.
      Signed-off-by: Shashank Gupta <shashank.gupta@intel.com>
      Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
      Reviewed-by: Wojciech Ziemba <wojciech.ziemba@intel.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    • crypto: rockchip - Remove surplus dev_err() when using platform_get_irq() · fb11cddf
      Yang Li authored
      There is no need to call the dev_err() function directly to print a
      custom message when handling an error from either the platform_get_irq()
      or platform_get_irq_byname() functions as both are going to display an
      appropriate error message in case of a failure.
      
      ./drivers/crypto/rockchip/rk3288_crypto.c:351:2-9: line 351 is
      redundant because platform_get_irq() already prints an error
      
      Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=2677
      Reported-by: Abaci Robot <abaci@linux.alibaba.com>
      Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
      Acked-by: Corentin Labbe <clabbe@baylibre.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    • crypto: lib/aesgcm - Provide minimal library implementation · 520af5da
      Ard Biesheuvel authored
      Implement a minimal library version of AES-GCM based on the existing
      library implementations of AES and multiplication in GF(2^128). Using
      these primitives, GCM can be implemented in a straightforward manner.
      
      GCM has a couple of sharp edges, i.e., the amount of input data
      processed with the same initialization vector (IV) should be capped to
      protect the counter from 32-bit rollover (or carry), and the size of the
      authentication tag should be fixed for a given key. [0]
      
      The former concern is addressed trivially, given that the function call
      API uses 32-bit signed types for the input lengths. It is still up to
      the caller to avoid IV reuse in general, but this is not something we
      can police at the implementation level.
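
As a rough check of this point, the arithmetic below compares the per-IV data limit from NIST SP 800-38D with the largest length a signed 32-bit argument can express. It is only a sketch: the constant and function names are illustrative and are not taken from the kernel source.

```python
# Sketch: a signed 32-bit length cannot exhaust GCM's 32-bit block counter.
# All names below are illustrative assumptions, not kernel identifiers.

AES_BLOCK_SIZE = 16  # bytes per AES block

# SP 800-38D caps the plaintext per IV at 2^39 - 256 bits, which is
# 2^32 - 2 blocks of 16 bytes each.
MAX_GCM_BLOCKS = 2**32 - 2
MAX_GCM_BYTES = MAX_GCM_BLOCKS * AES_BLOCK_SIZE  # 2^36 - 32 bytes

# Largest byte count representable in a signed 32-bit integer.
INT32_MAX = 2**31 - 1


def blocks_needed(nbytes: int) -> int:
    """Number of 16-byte counter blocks needed to cover nbytes of data."""
    return (nbytes + AES_BLOCK_SIZE - 1) // AES_BLOCK_SIZE
```

Even the longest input that fits in a signed 32-bit length needs 2^27 counter blocks, far below the 2^32 - 2 block limit, which is why the cap follows from the argument types alone.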
      
      As for the latter concern, let's make the authentication tag size part
      of the key schedule, and only permit it to be configured as part of the
      key expansion routine.
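
The design choice described here can be sketched as follows. This is a hypothetical model, not the kernel API: `GcmKeySchedule`, `expand_key` and `truncate_tag` are names invented for illustration, and no encryption is performed. The set of accepted tag sizes is the one SP 800-38D permits.

```python
from dataclasses import dataclass

# Tag lengths in bytes that SP 800-38D permits for GCM.
VALID_TAG_SIZES = frozenset({4, 8, 12, 13, 14, 15, 16})


@dataclass(frozen=True)
class GcmKeySchedule:
    """Hypothetical stand-in for an expanded GCM key.

    The tag size lives next to the key material and the object is frozen,
    so a caller cannot choose a different tag size per message.
    """
    key: bytes
    authsize: int


def expand_key(key: bytes, authsize: int) -> GcmKeySchedule:
    """Validate the inputs once and bind authsize to the key."""
    if len(key) not in (16, 24, 32):
        raise ValueError("AES keys are 16, 24 or 32 bytes long")
    if authsize not in VALID_TAG_SIZES:
        raise ValueError(f"unsupported tag size: {authsize}")
    return GcmKeySchedule(key=bytes(key), authsize=authsize)


def truncate_tag(ctx: GcmKeySchedule, full_tag: bytes) -> bytes:
    """Cut a full 16-byte tag to the size fixed at key expansion time."""
    if len(full_tag) != 16:
        raise ValueError("expected a full 16-byte tag")
    return full_tag[:ctx.authsize]
```

Because the encrypt and decrypt paths would only ever see the schedule object, the tag size cannot vary between calls that share a key.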
      
      Note that table based AES implementations are susceptible to known
      plaintext timing attacks on the encryption key. The AES library already
      attempts to mitigate this to some extent, but given that the counter
      mode encryption used by GCM operates exclusively on known plaintext by
      construction (the IV and therefore the initial counter value are known
      to an attacker), let's take some extra care to mitigate this, by calling
      the AES library with interrupts disabled.
      
      [0] https://nvlpubs.nist.gov/nistpubs/legacy/sp/nistspecialpublication800-38d.pdf
      
      Link: https://lore.kernel.org/all/c6fb9b25-a4b6-2e4a-2dd1-63adda055a49@amd.com/
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Tested-by: Nikunj A Dadhania <nikunj@amd.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    • crypto: lib/gf128mul - make gf128mul_lle time invariant · b67ce439
      Ard Biesheuvel authored
      The gf128mul library has several variants with different
      memory/performance tradeoffs. The faster ones use 4k or 64k lookup
      tables that are precomputed at runtime from one of the multiplication
      factors, which is commonly the key for keyed hash algorithms such as
      GHASH.
      
      The slowest variant is gf128mul_lle() [and its bbe/ble counterparts],
      which does not use precomputed lookup tables, but it still relies on a
      single u16[256] lookup table which is input independent. The use of such
      a table may cause the execution time of gf128mul_lle() to correlate
      with the value of the inputs, which is generally something that must be
      avoided for cryptographic algorithms. On top of that, the function uses
      a sequence of if () statements that conditionally invoke be128_xor()
      based on which bits are set in the second argument of the function,
      which is usually a pointer to the multiplication factor that represents
      the key.
      
      In order to remove the correlation between the execution time of
      gf128mul_lle() and the value of its inputs, let's address the
      identified shortcomings:
      - add a time invariant version of gf128mul_x8_lle() that replaces the
        table lookup with the expression that is used at compile time to
        populate the lookup table;
      - make the invocations of be128_xor() unconditional, but pass a zero
        vector as the third argument if the associated bit in the key is
        cleared.
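
The two changes listed above can be illustrated with a bit-serial multiplication in GF(2^128) using GCM's bit order. This is a sketch of the idea only, not the kernel's code: the function names are made up, the loop works one bit at a time rather than eight, and Python's arbitrary-precision integers give no actual timing guarantees. It shows the data flow in which every XOR always happens and a mask decides whether it has any effect.

```python
MASK128 = (1 << 128) - 1

# Reduction constant for x^128 + x^7 + x^2 + x + 1 in GCM's bit order,
# where the most significant bit of a block is the coefficient of x^0.
R = 0xE1 << 120


def bit_mask(bit: int) -> int:
    """Map 0 to 0 and 1 to 128 one-bits without branching."""
    return -bit & MASK128


def gf128_mul(x: int, y: int) -> int:
    """Multiply two GF(2^128) elements held as 128-bit integers."""
    z = 0
    v = y
    for i in range(128):
        # Always XOR; the mask is all zeroes when bit i of x is clear.
        x_bit = (x >> (127 - i)) & 1
        z ^= v & bit_mask(x_bit)
        # Multiply v by x. The reduction term is computed from the
        # polynomial rather than looked up, and it is selected with a
        # mask instead of an if () statement.
        carry = v & 1
        v = (v >> 1) ^ (R & bit_mask(carry))
    return z
```

In this representation the multiplicative identity is `1 << 127`, so `gf128_mul(a, 1 << 127)` returns `a` unchanged.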
      
      The resulting code is likely to be significantly slower. However, given
      that this is the slowest version already, making it even slower in order
      to make it more secure is assumed to be justified.
      
      The bbe and ble counterparts could receive the same treatment, but the
      former is never used anywhere in the kernel, and the latter is only
      used in the driver for an asynchronous crypto h/w accelerator (Chelsio),
      where timing variances are unlikely to matter.
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    • crypto: move gf128mul library into lib/crypto · 61c581a4
      Ard Biesheuvel authored
      The gf128mul library does not depend on the crypto API at all, so it can
      be moved into lib/crypto. This will allow us to use it in other library
      code in a subsequent patch without having to depend on CONFIG_CRYPTO.
      
      While at it, change the Kconfig symbol name to align with other crypto
      library implementations. However, the source file name is retained, as
      it is reflected in the module .ko filename, and changing this might
      break things for users.
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
  6. 04 Nov, 2022 7 commits
    • crypto: doc - use correct function name · 329cfa42
      Ralph Siemsen authored
      The hashing API does not have a function called .finish(); the correct name is .final().
      Signed-off-by: Ralph Siemsen <ralph.siemsen@linaro.org>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    • crypto: arm64/sm4 - add CE implementation for GCM mode · ae1b83c7
      Tianjia Zhang authored
      This patch is a CE-optimized assembly implementation for GCM mode.
      
      Benchmarked on a T-Head Yitian-710 at 2.75 GHz. The data comes from the
      224 and 224 modes of tcrypt and compares the performance before and after
      this patch (the driver used before this patch is
      gcm_base(ctr-sm4-ce,ghash-generic)). The columns are blocks of different
      lengths and the tabulated values are in Mb/s:
      
      Before (gcm_base(ctr-sm4-ce,ghash-generic)):
      
      gcm(sm4)     |     16      64      256      512     1024     1420     4096     8192
      -------------+---------------------------------------------------------------------
        GCM enc    |  25.24   64.65   104.66   116.69   123.81   125.12   129.67   130.62
        GCM dec    |  25.40   64.80   104.74   116.70   123.81   125.21   129.68   130.59
        GCM mb enc |  24.95   64.06   104.20   116.38   123.55   124.97   129.63   130.61
        GCM mb dec |  24.92   64.00   104.13   116.34   123.55   124.98   129.56   130.48
      
      After:
      
      gcm-sm4-ce   |     16      64      256      512     1024     1420     4096     8192
      -------------+---------------------------------------------------------------------
        GCM enc    | 108.62  397.18   971.60  1283.92  1522.77  1513.39  1777.00  1806.96
        GCM dec    | 116.36  398.14  1004.27  1319.11  1624.21  1635.43  1932.54  1974.20
        GCM mb enc | 107.13  391.79   962.05  1274.94  1514.76  1508.57  1769.07  1801.58
        GCM mb dec | 113.40  389.36   988.51  1307.68  1619.10  1631.55  1931.70  1970.86
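
To make the two tables easier to compare, the sketch below computes the speedup per block length for the single-buffer rows. The numbers are copied from the tables above; the helper name `speedups` is only illustrative.

```python
# Block lengths from the table header.
SIZES = [16, 64, 256, 512, 1024, 1420, 4096, 8192]

# Throughput in Mb/s with gcm_base(ctr-sm4-ce,ghash-generic).
BEFORE = {
    "GCM enc": [25.24, 64.65, 104.66, 116.69, 123.81, 125.12, 129.67, 130.62],
    "GCM dec": [25.40, 64.80, 104.74, 116.70, 123.81, 125.21, 129.68, 130.59],
}

# Throughput in Mb/s with gcm-sm4-ce.
AFTER = {
    "GCM enc": [108.62, 397.18, 971.60, 1283.92, 1522.77, 1513.39, 1777.00, 1806.96],
    "GCM dec": [116.36, 398.14, 1004.27, 1319.11, 1624.21, 1635.43, 1932.54, 1974.20],
}


def speedups(before, after):
    """Return the after/before ratio per column, rounded to 2 decimals."""
    return [round(a / b, 2) for a, b in zip(after, before)]


for row in BEFORE:
    ratios = speedups(BEFORE[row], AFTER[row])
    print(row, dict(zip(SIZES, ratios)))
```

For encryption the ratio grows from roughly 4.3x at a block length of 16 to roughly 13.8x at 8192, so the gain is largest on long inputs.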
      Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    • crypto: arm64/sm4 - add CE implementation for CCM mode · 67fa3a7f
      Tianjia Zhang authored
      This patch is a CE-optimized assembly implementation for CCM mode.
      
      Benchmarked on a T-Head Yitian-710 at 2.75 GHz. The data comes from the
      223 and 225 modes of tcrypt and compares the performance before and after
      this patch (the driver used before this patch is
      ccm_base(ctr-sm4-ce,cbcmac-sm4-ce)). The columns are blocks of different
      lengths and the tabulated values are in Mb/s:
      
      Before (rfc4309(ccm_base(ctr-sm4-ce,cbcmac-sm4-ce))):
      
      ccm(sm4)     |     16      64     256     512    1024    1420    4096    8192
      -------------+---------------------------------------------------------------
        CCM enc    |  35.07  125.40  336.47  468.17  581.97  619.18  712.56  736.01
        CCM dec    |  34.87  124.40  335.08  466.75  581.04  618.81  712.25  735.89
        CCM mb enc |  34.71  123.96  333.92  465.39  579.91  617.49  711.45  734.92
        CCM mb dec |  34.42  122.80  331.02  462.81  578.28  616.42  709.88  734.19
      
      After (rfc4309(ccm-sm4-ce)):
      
      ccm-sm4-ce   |     16      64     256     512    1024    1420    4096    8192
      -------------+---------------------------------------------------------------
        CCM enc    |  77.12  249.82  569.94  725.17  839.27  867.71  952.87  969.89
        CCM dec    |  75.90  247.26  566.29  722.12  836.90  865.95  951.74  968.57
        CCM mb enc |  75.98  245.25  562.91  718.99  834.76  864.70  950.17  967.90
        CCM mb dec |  75.06  243.78  560.58  717.13  833.68  862.70  949.35  967.11
      Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    • crypto: arm64/sm4 - add CE implementation for cmac/xcbc/cbcmac · 6b5360a5
      Tianjia Zhang authored
      This patch is a CE-optimized assembly implementation for cmac/xcbc/cbcmac.
      
      Benchmarked on a T-Head Yitian-710 at 2.75 GHz. The data comes from the
      300 mode of tcrypt and compares the performance before and after this
      patch (the driver used before this patch is XXXmac(sm4-ce)). The columns
      are blocks of different lengths and the tabulated values are in Mb/s:
      
      Before:
      
      update-size    |      16      64     256    1024    2048    4096    8192
      ---------------+--------------------------------------------------------
      cmac(sm4-ce)   |  293.33  403.69  503.76  527.78  531.10  535.46  535.81
      xcbc(sm4-ce)   |  292.83  402.50  504.02  529.08  529.87  536.55  538.24
      cbcmac(sm4-ce) |  318.42  415.79  497.12  515.05  523.15  521.19  523.01
      
      After:
      
      update-size    |      16      64     256    1024    2048    4096    8192
      ---------------+--------------------------------------------------------
      cmac-sm4-ce    |  371.99  675.28  903.56  971.65  980.57  990.40  991.04
      xcbc-sm4-ce    |  372.11  674.55  903.47  971.61  980.96  990.42  991.10
      cbcmac-sm4-ce  |  371.63  675.33  903.23  972.07  981.42  990.93  991.45
      Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    • crypto: arm64/sm4 - add CE implementation for XTS mode · 01f63311
      Tianjia Zhang authored
      This patch is a CE-optimized assembly implementation for XTS mode.
      
      Benchmarked on a T-Head Yitian-710 at 2.75 GHz. The data comes from the
      218 mode of tcrypt and compares the performance before and after this
      patch (the driver used before this patch is xts(ecb-sm4-ce)). The columns
      are blocks of different lengths and the tabulated values are in Mb/s:
      
      Before:
      
      xts(ecb-sm4-ce) |      16       64      128      256     1024     1420     4096
      ----------------+--------------------------------------------------------------
              XTS enc |  117.17   430.56   732.92  1134.98  2007.03  2136.23  2347.20
              XTS dec |  116.89   429.02   733.40  1132.96  2006.13  2130.50  2347.92
      
      After:
      
      xts-sm4-ce      |      16       64      128      256     1024     1420     4096
      ----------------+--------------------------------------------------------------
              XTS enc |  224.68   798.91  1248.08  1714.60  2413.73  2467.84  2612.62
              XTS dec |  229.85   791.34  1237.79  1720.00  2413.30  2473.84  2611.95
      Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    • crypto: arm64/sm4 - add CE implementation for CTS-CBC mode · b1863fd0
      Tianjia Zhang authored
      This patch is a CE-optimized assembly implementation for CTS-CBC mode.
      
      Benchmarked on a T-Head Yitian-710 at 2.75 GHz. The data comes from the
      218 mode of tcrypt and compares the performance before and after this
      patch (the driver used before this patch is cts(cbc-sm4-ce)). The columns
      are blocks of different lengths and the tabulated values are in Mb/s:
      
      Before:
      
      cts(cbc-sm4-ce) |      16       64      128      256     1024     1420     4096
      ----------------+--------------------------------------------------------------
          CTS-CBC enc |  286.09   297.17   457.97   627.75   868.58   900.80   957.69
          CTS-CBC dec |  286.67   285.63   538.35   947.08  2241.03  2577.32  3391.14
      
      After:
      
      cts-cbc-sm4-ce  |      16       64      128      256     1024     1420     4096
      ----------------+--------------------------------------------------------------
          CTS-CBC enc |  288.19   428.80   593.57   741.04   911.73   931.80   950.00
          CTS-CBC dec |  292.22   468.99   838.23  1380.76  2741.17  3036.42  3409.62
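
The gains in these two tables are not uniform, so a quick scan for columns where the new driver is slower can be useful. The sketch below does that with the figures copied from the tables above; `regressions` is a hypothetical helper name.

```python
# Block lengths from the table header.
SIZES = [16, 64, 128, 256, 1024, 1420, 4096]

# Throughput in Mb/s with cts(cbc-sm4-ce).
BEFORE = {
    "CTS-CBC enc": [286.09, 297.17, 457.97, 627.75, 868.58, 900.80, 957.69],
    "CTS-CBC dec": [286.67, 285.63, 538.35, 947.08, 2241.03, 2577.32, 3391.14],
}

# Throughput in Mb/s with cts-cbc-sm4-ce.
AFTER = {
    "CTS-CBC enc": [288.19, 428.80, 593.57, 741.04, 911.73, 931.80, 950.00],
    "CTS-CBC dec": [292.22, 468.99, 838.23, 1380.76, 2741.17, 3036.42, 3409.62],
}


def regressions(before, after, sizes):
    """List (row, size) pairs where the new figure is below the old one."""
    found = []
    for row, old_values in before.items():
        for size, old, new in zip(sizes, old_values, after[row]):
            if new < old:
                found.append((row, size))
    return found


print(regressions(BEFORE, AFTER, SIZES))
```

With these figures the only such column is encryption at a block length of 4096, where the new value is about 0.8% lower. Whether that difference exceeds measurement noise cannot be told from the tables alone.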
      Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    • crypto: arm64/sm4 - export reusable CE acceleration functions · 45089dbe
      Tianjia Zhang authored
      The accelerated implementation of the SM4 algorithm that uses the Crypto
      Extension instructions contains some functions that can be reused in the
      upcoming accelerated implementations of the GCM and CCM modes, and its
      CBC/CFB encryption is reused in the optimized implementation of SVESM4.
      Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>