  1. 12 Jul, 2024 2 commits
  2. 06 Jul, 2024 8 commits
  3. 28 Jun, 2024 10 commits
  4. 21 Jun, 2024 9 commits
  5. 16 Jun, 2024 5 commits
    • crypto: ecc - Fix off-by-one missing to clear most significant digit · 1dcf865d
      Stefan Berger authored
      Fix an off-by-one error where the most significant digit was not
      initialized, leading to signature verification failures by the testmgr.
      
      Example: If a curve requires ndigits (=9) and diff (=2) indicates that
      2 digits need to be set to zero then start with digit 'ndigits - diff' (=7)
      and clear 'diff' digits starting from there, so 7 and 8.
      Reported-by: Venkat Rao Bagalkote <venkat88@linux.vnet.ibm.com>
      Closes: https://lore.kernel.org/linux-crypto/619bc2de-b18a-4939-a652-9ca886bf6349@linux.ibm.com/T/#m045d8812409ce233c17fcdb8b88b6629c671f9f4
      Fixes: 2fd2a82c ("crypto: ecdsa - Use ecc_digits_from_bytes to create hash digits array")
      Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
      Tested-by: Venkat Rao Bagalkote <venkat88@linux.vnet.ibm.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
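
The indexing in the example of commit 1dcf865d can be sketched in userspace C. This is only an illustration of the arithmetic, not the kernel function: the helper name clear_top_digits is hypothetical, and digit 0 is assumed to be the least significant one.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not the kernel code): zero the 'diff' most
 * significant digits of an array of 'ndigits' 64-bit digits, where
 * out[0] is assumed to be the least significant digit.
 */
static void clear_top_digits(uint64_t *out, unsigned int ndigits,
			     unsigned int diff)
{
	assert(diff <= ndigits);

	if (diff == 0)
		return;

	/*
	 * The first digit to clear is 'ndigits - diff'. Starting one digit
	 * lower would leave the topmost digit, out[ndigits - 1], untouched.
	 */
	memset(&out[ndigits - diff], 0, diff * sizeof(uint64_t));
}
```

With ndigits = 9 and diff = 2, this clears digits 7 and 8 and leaves digits 0 through 6 unchanged, matching the example in the commit message.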
    • crypto: ecc - Add comment to ecc_digits_from_bytes about input byte array · 0eb3bed5
      Stefan Berger authored
      Add comment to ecc_digits_from_bytes kdoc that the first byte is expected
      to hold the most significant bits of the large integer that is converted
      into an array of digits.
      Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
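
The byte order described in commit 0eb3bed5 can be illustrated with a small userspace sketch. The function name digits_from_be_bytes and its loop-based approach are assumptions made for clarity; the kernel's ecc_digits_from_bytes may be implemented differently. The sketch also assumes that out[0] receives the least significant digit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical sketch: in[0] holds the most significant byte of the
 * integer, out[0] receives the least significant 64-bit digit. Digits
 * not covered by the input bytes are set to zero.
 */
static void digits_from_be_bytes(const uint8_t *in, size_t nbytes,
				 uint64_t *out, size_t ndigits)
{
	assert(nbytes <= ndigits * sizeof(uint64_t));

	for (size_t i = 0; i < ndigits; i++)
		out[i] = 0;

	for (size_t i = 0; i < nbytes; i++) {
		/* Significance of in[i], counted in bytes (0 = least). */
		size_t pos = nbytes - 1 - i;

		out[pos / 8] |= (uint64_t)in[i] << (8 * (pos % 8));
	}
}
```

For the 9-byte input 01 02 03 04 05 06 07 08 09 and ndigits = 3, the sketch yields out[0] = 0x0203040506070809, out[1] = 0x01 and out[2] = 0.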
    • hwrng: core - Remove list.h from the hw_random.h · 4604b388
      Andy Shevchenko authored
      The 'struct list' type is defined in types.h, no need to include list.h
      for that.
      Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    • dt-bindings: rng: meson: add optional power-domains · 293695f1
      Neil Armstrong authored
      On newer SoCs, the random number generator can require a power-domain to
      operate, add it as optional.
      Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
      Acked-by: Rob Herring (Arm) <robh@kernel.org>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    • crypto: ccp - Fix null pointer dereference in __sev_snp_shutdown_locked · 468e3295
      Kim Phillips authored
      Fix a null pointer dereference induced by DEBUG_TEST_DRIVER_REMOVE.
      Return from __sev_snp_shutdown_locked() if the psp_device or the
      sev_device structs are not initialized. Without the fix, the driver will
      produce the following splat:
      
         ccp 0000:55:00.5: enabling device (0000 -> 0002)
         ccp 0000:55:00.5: sev enabled
         ccp 0000:55:00.5: psp enabled
         BUG: kernel NULL pointer dereference, address: 00000000000000f0
         #PF: supervisor read access in kernel mode
         #PF: error_code(0x0000) - not-present page
         PGD 0 P4D 0
         Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC NOPTI
         CPU: 262 PID: 1 Comm: swapper/0 Not tainted 6.9.0-rc1+ #29
         RIP: 0010:__sev_snp_shutdown_locked+0x2e/0x150
         Code: 00 55 48 89 e5 41 57 41 56 41 54 53 48 83 ec 10 41 89 f7 49 89 fe 65 48 8b 04 25 28 00 00 00 48 89 45 d8 48 8b 05 6a 5a 7f 06 <4c> 8b a0 f0 00 00 00 41 0f b6 9c 24 a2 00 00 00 48 83 fb 02 0f 83
         RSP: 0018:ffffb2ea4014b7b8 EFLAGS: 00010286
         RAX: 0000000000000000 RBX: ffff9e4acd2e0a28 RCX: 0000000000000000
         RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffb2ea4014b808
         RBP: ffffb2ea4014b7e8 R08: 0000000000000106 R09: 000000000003d9c0
         R10: 0000000000000001 R11: ffffffffa39ff070 R12: ffff9e49d40590c8
         R13: 0000000000000000 R14: ffffb2ea4014b808 R15: 0000000000000000
         FS:  0000000000000000(0000) GS:ffff9e58b1e00000(0000) knlGS:0000000000000000
         CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
         CR2: 00000000000000f0 CR3: 0000000418a3e001 CR4: 0000000000770ef0
         PKRU: 55555554
         Call Trace:
          <TASK>
          ? __die_body+0x6f/0xb0
          ? __die+0xcc/0xf0
          ? page_fault_oops+0x330/0x3a0
          ? save_trace+0x2a5/0x360
          ? do_user_addr_fault+0x583/0x630
          ? exc_page_fault+0x81/0x120
          ? asm_exc_page_fault+0x2b/0x30
          ? __sev_snp_shutdown_locked+0x2e/0x150
          __sev_firmware_shutdown+0x349/0x5b0
          ? pm_runtime_barrier+0x66/0xe0
          sev_dev_destroy+0x34/0xb0
          psp_dev_destroy+0x27/0x60
          sp_destroy+0x39/0x90
          sp_pci_remove+0x22/0x60
          pci_device_remove+0x4e/0x110
          really_probe+0x271/0x4e0
          __driver_probe_device+0x8f/0x160
          driver_probe_device+0x24/0x120
          __driver_attach+0xc7/0x280
          ? driver_attach+0x30/0x30
          bus_for_each_dev+0x10d/0x130
          driver_attach+0x22/0x30
          bus_add_driver+0x171/0x2b0
          ? unaccepted_memory_init_kdump+0x20/0x20
          driver_register+0x67/0x100
          __pci_register_driver+0x83/0x90
          sp_pci_init+0x22/0x30
          sp_mod_init+0x13/0x30
          do_one_initcall+0xb8/0x290
          ? sched_clock_noinstr+0xd/0x10
          ? local_clock_noinstr+0x3e/0x100
          ? stack_depot_save_flags+0x21e/0x6a0
          ? local_clock+0x1c/0x60
          ? stack_depot_save_flags+0x21e/0x6a0
          ? sched_clock_noinstr+0xd/0x10
          ? local_clock_noinstr+0x3e/0x100
          ? __lock_acquire+0xd90/0xe30
          ? sched_clock_noinstr+0xd/0x10
          ? local_clock_noinstr+0x3e/0x100
          ? __create_object+0x66/0x100
          ? local_clock+0x1c/0x60
          ? __create_object+0x66/0x100
          ? parameq+0x1b/0x90
          ? parse_one+0x6d/0x1d0
          ? parse_args+0xd7/0x1f0
          ? do_initcall_level+0x180/0x180
          do_initcall_level+0xb0/0x180
          do_initcalls+0x60/0xa0
          ? kernel_init+0x1f/0x1d0
          do_basic_setup+0x41/0x50
          kernel_init_freeable+0x1ac/0x230
          ? rest_init+0x1f0/0x1f0
          kernel_init+0x1f/0x1d0
          ? rest_init+0x1f0/0x1f0
          ret_from_fork+0x3d/0x50
          ? rest_init+0x1f0/0x1f0
          ret_from_fork_asm+0x11/0x20
          </TASK>
         Modules linked in:
         CR2: 00000000000000f0
         ---[ end trace 0000000000000000 ]---
         RIP: 0010:__sev_snp_shutdown_locked+0x2e/0x150
         Code: 00 55 48 89 e5 41 57 41 56 41 54 53 48 83 ec 10 41 89 f7 49 89 fe 65 48 8b 04 25 28 00 00 00 48 89 45 d8 48 8b 05 6a 5a 7f 06 <4c> 8b a0 f0 00 00 00 41 0f b6 9c 24 a2 00 00 00 48 83 fb 02 0f 83
         RSP: 0018:ffffb2ea4014b7b8 EFLAGS: 00010286
         RAX: 0000000000000000 RBX: ffff9e4acd2e0a28 RCX: 0000000000000000
         RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffb2ea4014b808
         RBP: ffffb2ea4014b7e8 R08: 0000000000000106 R09: 000000000003d9c0
         R10: 0000000000000001 R11: ffffffffa39ff070 R12: ffff9e49d40590c8
         R13: 0000000000000000 R14: ffffb2ea4014b808 R15: 0000000000000000
         FS:  0000000000000000(0000) GS:ffff9e58b1e00000(0000) knlGS:0000000000000000
         CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
         CR2: 00000000000000f0 CR3: 0000000418a3e001 CR4: 0000000000770ef0
         PKRU: 55555554
         Kernel panic - not syncing: Fatal exception
         Kernel Offset: 0x1fc00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
      
      Fixes: 1ca5614b ("crypto: ccp: Add support to initialize the AMD-SP for SEV-SNP")
      Cc: stable@vger.kernel.org
      Signed-off-by: Kim Phillips <kim.phillips@amd.com>
      Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
      Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
      Reviewed-by: John Allen <john.allen@amd.com>
      Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
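
The early-return guard described in commit 468e3295 follows a common pattern, sketched below in userspace C. All names here (psp_device_sketch, sev_device_sketch, psp_master_sketch, snp_shutdown_sketch) are hypothetical stand-ins, not the driver's real definitions. In the splat, RAX is 0 and CR2 is 0xf0, which is consistent with a load from a field at offset 0xf0 of a NULL struct pointer.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the driver's structures. */
struct sev_device_sketch {
	int snp_initialized;
};

struct psp_device_sketch {
	struct sev_device_sketch *sev_data;
};

/* Stand-in for a global device pointer that is NULL until probe completes. */
static struct psp_device_sketch *psp_master_sketch;

/* Returns 1 if shutdown work was performed, 0 if there was nothing to do. */
static int snp_shutdown_sketch(void)
{
	struct psp_device_sketch *psp = psp_master_sketch;

	/* Return before dereferencing either struct if it was never set up. */
	if (!psp || !psp->sev_data)
		return 0;

	if (!psp->sev_data->snp_initialized)
		return 0;

	psp->sev_data->snp_initialized = 0;
	return 1;
}
```

The point of the guard is ordering: both pointers are checked before any field is read, so a remove path that runs before initialization has finished returns without touching uninitialized state.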
  6. 07 Jun, 2024 6 commits
    • hwrng: omap - add missing MODULE_DESCRIPTION() macro · 6d4e1993
      Jeff Johnson authored
      make allmodconfig && make W=1 C=1 reports:
      WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/char/hw_random/omap-rng.o
      WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/char/hw_random/omap3-rom-rng.o
      
      Add the missing invocation of the MODULE_DESCRIPTION() macro.
      Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    • crypto: xilinx - add missing MODULE_DESCRIPTION() macro · ed6261d5
      Jeff Johnson authored
      make allmodconfig && make W=1 C=1 reports:
      WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/crypto/xilinx/zynqmp-aes-gcm.o
      
      Add the missing invocation of the MODULE_DESCRIPTION() macro.
      Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
      Reviewed-by: Michal Simek <michal.simek@amd.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    • crypto: sa2ul - add missing MODULE_DESCRIPTION() macro · c8edb3cc
      Jeff Johnson authored
      make allmodconfig && make W=1 C=1 reports:
      WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/crypto/sa2ul.o
      
      Add the missing invocation of the MODULE_DESCRIPTION() macro.
      Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    • crypto: keembay - add missing MODULE_DESCRIPTION() macro · f2cbb746
      Jeff Johnson authored
      make allmodconfig && make W=1 C=1 reports:
      WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/crypto/intel/keembay/keembay-ocs-hcu.o
      
      Add the missing invocation of the MODULE_DESCRIPTION() macro.
      Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
      Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    • crypto: atmel-sha204a - add missing MODULE_DESCRIPTION() macro · 3aa461e3
      Jeff Johnson authored
      make allmodconfig && make W=1 C=1 reports:
      WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/crypto/atmel-sha204a.o
      
      Add the missing invocation of the MODULE_DESCRIPTION() macro.
      Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    • crypto: x86/aes-gcm - rewrite the AES-NI optimized AES-GCM · e6e758fa
      Eric Biggers authored
      Rewrite the AES-NI implementations of AES-GCM, taking advantage of
      things I learned while writing the VAES-AVX10 implementations.  This is
      a complete rewrite that reduces the AES-NI GCM source code size by about
      70% and the binary code size by about 95%, while not regressing
      performance and in fact improving it significantly in many cases.
      
      The following summarizes the state before this patch:
      
      - The aesni-intel module registered algorithms "generic-gcm-aesni" and
        "rfc4106-gcm-aesni" with the crypto API that actually delegated to one
        of three underlying implementations according to the CPU capabilities
        detected at runtime: AES-NI, AES-NI + AVX, or AES-NI + AVX2.
      
      - The AES-NI + AVX and AES-NI + AVX2 assembly code was in
        aesni-intel_avx-x86_64.S and consisted of 2804 lines of source and
        257 KB of binary.  This massive binary size was not really
        appropriate, and depending on the kconfig it could take up over 1% of the
        size of the entire vmlinux.  The main loops did 8 blocks per
        iteration.  The AVX code minimized the use of carryless multiplication
        whereas the AVX2 code did not.  The "AVX2" code did not actually use
        AVX2; the check for AVX2 was really a check for Intel Haswell or later
        to detect support for fast carryless multiplication.  The long source
        length was caused by factors such as significant code duplication.
      
      - The AES-NI only assembly code was in aesni-intel_asm.S and consisted
        of 1501 lines of source and 15 KB of binary.  The main loops did 4
        blocks per iteration and minimized the use of carryless multiplication
        by using Karatsuba multiplication and a multiplication-less reduction.
      
      - The assembly code was contributed in 2010-2013.  Maintenance has been
        sporadic and most design choices haven't been revisited.
      
      - The assembly function prototypes and the corresponding glue code were
        separate from and were not consistent with the new VAES-AVX10 code I
        recently added.  The older code had several issues such as not
        precomputing the GHASH key powers, which hurt performance.
      
      This rewrite achieves the following goals:
      
      - Much shorter source and binary sizes.  The assembly source shrinks
        from 4300 lines to 1130 lines, and it produces about 9 KB of binary
        instead of 272 KB.  This is achieved via a better designed AES-GCM
        implementation that doesn't excessively unroll the code and instead
        prioritizes the parts that really matter.  Sharing the C glue code
        with the VAES-AVX10 implementations also saves 250 lines of C source.
      
      - Improve performance on most (possibly all) CPUs on which this code
        runs, for most (possibly all) message lengths.  Benchmark results are
        given in Tables 1 and 2 below.
      
      - Use the same function prototypes and glue code as the new VAES-AVX10
        algorithms.  This fixes some issues with the integration of the
        assembly and results in some significant performance improvements,
        primarily on short messages.  Also, the AVX and non-AVX
        implementations are now registered as separate algorithms with the
        crypto API, which makes them both testable by the self-tests.
      
      - Keep support for AES-NI without AVX (for Westmere, Silvermont,
        Goldmont, and Tremont), but unify the source code with AES-NI + AVX.
        Since 256-bit vectors cannot be used without VAES anyway, this is made
        feasible by just using the non-VEX coded form of most instructions.
      
      - Use a unified approach where the main loop does 8 blocks per iteration
        and uses Karatsuba multiplication to save one pclmulqdq per block but
        does not use the multiplication-less reduction.  This strikes a good
        balance across the range of CPUs on which this code runs.
      
      - Don't spam the kernel log with an informational message on every boot.
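
The Karatsuba trick mentioned in the list above (three carryless multiplications per 128-bit product instead of four) can be sketched in portable C. This is not the assembly from the patch: clmul64 is a bitwise stand-in for the pclmulqdq instruction, every name is hypothetical, and the GHASH reduction step is left out entirely.

```c
#include <stdint.h>

struct u128 {
	uint64_t lo, hi;
};

/* Counts 64x64 carryless multiplications, to compare the two methods. */
static unsigned int clmul_calls;

/* Bitwise 64x64 -> 128-bit carryless multiply (stand-in for pclmulqdq). */
static struct u128 clmul64(uint64_t a, uint64_t b)
{
	struct u128 r = { 0, 0 };

	clmul_calls++;
	for (unsigned int i = 0; i < 64; i++) {
		if ((b >> i) & 1) {
			r.lo ^= a << i;
			if (i)
				r.hi ^= a >> (64 - i);
		}
	}
	return r;
}

/* Assemble the 256-bit product from low, middle and high partial products. */
static void combine(uint64_t r[4], struct u128 lo, struct u128 mid,
		    struct u128 hi)
{
	r[0] = lo.lo;
	r[1] = lo.hi ^ mid.lo;
	r[2] = hi.lo ^ mid.hi;
	r[3] = hi.hi;
}

/* Schoolbook: four multiplications (a0*b0, a1*b1, a0*b1, a1*b0). */
static void clmul128_schoolbook(const uint64_t a[2], const uint64_t b[2],
				uint64_t r[4])
{
	struct u128 lo = clmul64(a[0], b[0]);
	struct u128 hi = clmul64(a[1], b[1]);
	struct u128 m1 = clmul64(a[0], b[1]);
	struct u128 m2 = clmul64(a[1], b[0]);
	struct u128 mid = { m1.lo ^ m2.lo, m1.hi ^ m2.hi };

	combine(r, lo, mid, hi);
}

/* Karatsuba: three multiplications; the middle term is recovered by XOR. */
static void clmul128_karatsuba(const uint64_t a[2], const uint64_t b[2],
			       uint64_t r[4])
{
	struct u128 lo = clmul64(a[0], b[0]);
	struct u128 hi = clmul64(a[1], b[1]);
	struct u128 mid = clmul64(a[0] ^ a[1], b[0] ^ b[1]);

	/* (a0^a1)*(b0^b1) = a0*b0 ^ a0*b1 ^ a1*b0 ^ a1*b1 over GF(2). */
	mid.lo ^= lo.lo ^ hi.lo;
	mid.hi ^= lo.hi ^ hi.hi;

	combine(r, lo, mid, hi);
}
```

Both functions produce the same 256-bit product. The Karatsuba variant trades one multiplication for a few extra XORs, which is the kind of trade-off the commit message describes.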
      
      The following tables summarize the improvement in AES-GCM throughput on
      various CPU microarchitectures as a result of this patch:
      
      Table 1: AES-256-GCM encryption throughput improvement,
               CPU microarchitecture vs. message length in bytes:
      
                         | 16384 |  4096 |  4095 |  1420 |   512 |   500 |
      -------------------+-------+-------+-------+-------+-------+-------+
      Intel Broadwell    |    2% |    8% |   11% |   18% |   31% |   26% |
      Intel Skylake      |    1% |    4% |    7% |   12% |   26% |   19% |
      Intel Cascade Lake |    3% |    8% |   10% |   18% |   33% |   24% |
      AMD Zen 1          |    6% |   12% |    6% |   15% |   27% |   24% |
      AMD Zen 2          |    8% |   13% |   13% |   19% |   26% |   28% |
      AMD Zen 3          |    8% |   14% |   13% |   19% |   26% |   25% |
      
                         |   300 |   200 |    64 |    63 |    16 |
      -------------------+-------+-------+-------+-------+-------+
      Intel Broadwell    |   35% |   29% |   45% |   55% |   54% |
      Intel Skylake      |   25% |   19% |   28% |   33% |   27% |
      Intel Cascade Lake |   36% |   28% |   39% |   49% |   54% |
      AMD Zen 1          |   27% |   22% |   23% |   29% |   26% |
      AMD Zen 2          |   32% |   24% |   22% |   25% |   31% |
      AMD Zen 3          |   30% |   24% |   22% |   23% |   26% |
      
      Table 2: AES-256-GCM decryption throughput improvement,
               CPU microarchitecture vs. message length in bytes:
      
                         | 16384 |  4096 |  4095 |  1420 |   512 |   500 |
      -------------------+-------+-------+-------+-------+-------+-------+
      Intel Broadwell    |    3% |    8% |   11% |   19% |   32% |   28% |
      Intel Skylake      |    3% |    4% |    7% |   13% |   28% |   27% |
      Intel Cascade Lake |    3% |    9% |   11% |   19% |   33% |   28% |
      AMD Zen 1          |   15% |   18% |   14% |   20% |   36% |   33% |
      AMD Zen 2          |    9% |   16% |   13% |   21% |   26% |   27% |
      AMD Zen 3          |    8% |   15% |   12% |   18% |   23% |   23% |
      
                         |   300 |   200 |    64 |    63 |    16 |
      -------------------+-------+-------+-------+-------+-------+
      Intel Broadwell    |   36% |   31% |   40% |   51% |   53% |
      Intel Skylake      |   28% |   21% |   23% |   30% |   30% |
      Intel Cascade Lake |   36% |   29% |   36% |   47% |   53% |
      AMD Zen 1          |   35% |   31% |   32% |   35% |   36% |
      AMD Zen 2          |   31% |   30% |   27% |   38% |   30% |
      AMD Zen 3          |   27% |   23% |   24% |   32% |   26% |
      
      The above numbers are percentage improvements in single-thread
      throughput, so e.g. an increase from 3000 MB/s to 3300 MB/s would be
      listed as 10%.  They were collected by directly measuring the Linux
      crypto API performance using a custom kernel module.  Note that indirect
      benchmarks (e.g. 'cryptsetup benchmark' or benchmarking dm-crypt I/O)
      include more overhead and won't see quite as much of a difference.  All
      these benchmarks used an associated data length of 16 bytes.  Note that
      AES-GCM is almost always used with short associated data lengths.
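
The percentages in the tables follow the usual relative-change formula. A minimal sketch, with the hypothetical function name improvement_pct:

```c
/* Relative throughput improvement in percent, from old to new MB/s. */
static double improvement_pct(double old_mbps, double new_mbps)
{
	return (new_mbps - old_mbps) / old_mbps * 100.0;
}
```

With the values from the text, improvement_pct(3000.0, 3300.0) evaluates to approximately 10.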
      
      I didn't test Intel CPUs before Broadwell, AMD CPUs before Zen 1, or
      Intel low-power CPUs, as these weren't readily available to me.
      However, based on the design of the new code and the available
      information about these other CPU microarchitectures, I wouldn't expect
      any significant regressions, and there's a good chance performance is
      improved just as it is above.
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>