1. 13 Feb, 2023 15 commits
    • powerpc/pseries: add RTAS work area allocator · 43033bc6
      Nathan Lynch authored
      Various pseries-specific RTAS functions take a temporary "work area"
      parameter - a buffer in memory accessible to RTAS. Typically such
      functions are passed the statically allocated rtas_data_buf buffer as
      the argument. This buffer is protected by a global spinlock. So users
      of rtas_data_buf cannot perform sleeping operations while accessing
      the buffer.
      
      Most RTAS functions that have a work area parameter can return a
      status (-2/990x) that indicates that the caller should retry. Before
      retrying, the caller may need to reschedule or sleep (see
      rtas_busy_delay() for details). This combination of factors
      leads to uncomfortable constructions like this:
      
      	do {
      		spin_lock(&rtas_data_buf_lock);
      		rc = rtas_call(token, __pa(rtas_data_buf), ...);
      		if (rc == 0) {
      			/* parse or copy out rtas_data_buf contents */
      		}
      		spin_unlock(&rtas_data_buf_lock);
      	} while (rtas_busy_delay(rc));
      
      Another unfortunately common way of handling this is for callers to
      blithely ignore the possibility of a -2/990x status and hope for the
      best.
      
      If users were allowed to perform blocking operations while owning a
      work area, the programming model would become less tedious and
      error-prone. Users could schedule away, sleep, or perform other
      blocking operations without having to release and re-acquire
      resources.
      
      We could continue to use a single work area buffer, and convert
      rtas_data_buf_lock to a mutex. But that would impose an unnecessarily
      coarse serialization on all users. As awkward as the current design
      is, it prevents longer running operations that need to repeatedly use
      rtas_data_buf from blocking the progress of others.
      
      There are more considerations. One is that while 4KB is fine for all
      current in-kernel uses, some RTAS calls can take much smaller buffers,
      and some (VPD, platform dumps) would likely benefit from larger
      ones. Another is that at least one RTAS function (ibm,get-vpd)
      has *two* work area parameters. And finally, we should expect the
      number of work area users in the kernel to increase over time as we
      introduce lockdown-compatible ABIs to replace less safe use cases
      based on sys_rtas/librtas.
      
      So a special-purpose allocator for RTAS work area buffers seems worth
      trying.
      
      Properties:
      
      * The backing memory for the allocator is reserved early in boot in
        order to satisfy RTAS addressing requirements, and then managed with
        genalloc.
      * Allocations can block, but they never fail (mempool-like).
      * Prioritizes first-come, first-served fairness over throughput.
      * Early boot allocations before the allocator has been initialized are
        served via an internal static buffer.
      
      Intended to replace rtas_data_buf. New code that needs RTAS work area
      buffers should prefer this API.
      Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-12-26929c8cce78@linux.ibm.com
    • powerpc/rtas: add tracepoints around RTAS entry · 24098f58
      Nathan Lynch authored
      Decompose the RTAS entry C code into tracing and non-tracing variants,
      calling the just-added tracepoints in the tracing-enabled path. Skip
      tracing in contexts known to be unsafe (real mode, CPU offline).
      Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-11-26929c8cce78@linux.ibm.com
    • powerpc/tracing: tracepoints for RTAS entry and exit · 2c81ca7f
      Nathan Lynch authored
      Add two sets of tracepoints to be used around RTAS entry:
      
      * rtas_input/rtas_output, which emit the function name, its inputs,
        the returned status, and any other outputs. These produce an API-level
        record of OS<->RTAS activity.
      
      * rtas_ll_entry/rtas_ll_exit, which are lower-level and emit the
        entire contents of the parameter block (aka rtas_args) on entry and
        exit. Likely useful only for debugging.
      
      With uses of these tracepoints in do_enter_rtas() to be added in the
      following patch, examples of get-time-of-day and event-scan functions
      as rendered by trace-cmd (with some multi-line formatting manually
      imposed on the rtas_ll_* entries to avoid extremely long lines in the
      commit message):
      
      cat-36800 [059]  4978.518303: rtas_input:           get-time-of-day arguments:
      cat-36800 [059]  4978.518306: rtas_ll_entry:        token=3 nargs=0 nret=8
                                                          params: [0]=0x00000000 [1]=0x00000000 [2]=0x00000000 [3]=0x00000000
                                                                  [4]=0x00000000 [5]=0x00000000 [6]=0x00000000 [7]=0x00000000
                                                                  [8]=0x00000000 [9]=0x00000000 [10]=0x00000000 [11]=0x00000000
                                                                  [12]=0x00000000 [13]=0x00000000 [14]=0x00000000 [15]=0x00000000
      cat-36800 [059]  4978.518366: rtas_ll_exit:         token=3 nargs=0 nret=8
                                                          params: [0]=0x00000000 [1]=0x000007e6 [2]=0x0000000b [3]=0x00000001
                                                                  [4]=0x00000000 [5]=0x0000000e [6]=0x00000008 [7]=0x2e0dac40
                                                                  [8]=0x00000000 [9]=0x00000000 [10]=0x00000000 [11]=0x00000000
                                                                  [12]=0x00000000 [13]=0x00000000 [14]=0x00000000 [15]=0x00000000
      cat-36800 [059]  4978.518366: rtas_output:          get-time-of-day status: 0, other outputs: 2022 11 1 0 14 8 772648000
      
      kworker/39:1-336   [039]  4982.731623: rtas_input:           event-scan arguments: 4294967295 0 80484920 2048
      kworker/39:1-336   [039]  4982.731626: rtas_ll_entry:        token=6 nargs=4 nret=1
                                                                   params: [0]=0xffffffff [1]=0x00000000 [2]=0x04cc1a38 [3]=0x00000800
                                                                           [4]=0x00000000 [5]=0x0000000e [6]=0x00000008 [7]=0x2e0dac40
                                                                           [8]=0x00000000 [9]=0x00000000 [10]=0x00000000 [11]=0x00000000
                                                                           [12]=0x00000000 [13]=0x00000000 [14]=0x00000000 [15]=0x00000000
      kworker/39:1-336   [039]  4982.731676: rtas_ll_exit:         token=6 nargs=4 nret=1
                                                                   params: [0]=0xffffffff [1]=0x00000000 [2]=0x04cc1a38 [3]=0x00000800
                                                                           [4]=0x00000001 [5]=0x0000000e [6]=0x00000008 [7]=0x2e0dac40
                                                                           [8]=0x00000000 [9]=0x00000000 [10]=0x00000000 [11]=0x00000000
                                                                           [12]=0x00000000 [13]=0x00000000 [14]=0x00000000 [15]=0x00000000
      kworker/39:1-336   [039]  4982.731677: rtas_output:          event-scan status: 1, other outputs:
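
The relationship between the rtas_ll_exit dump and the rtas_output line can be checked by hand: the outputs follow the inputs in the parameter array, and the first output is the status. A minimal userspace sketch of that decoding, assuming a simplified model of the parameter block (struct rtas_args_model and the helper names are hypothetical, and the real rtas_args layout and byte order are not modeled here):

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Simplified, hypothetical model of the parameter block shown by rtas_ll_*. */
struct rtas_args_model {
	uint32_t token;
	uint32_t nargs;       /* number of inputs */
	uint32_t nret;        /* number of outputs, including the status */
	uint32_t params[16];  /* inputs first, then outputs */
};

/* Values copied from the rtas_ll_exit entries in the trace above. */
static const struct rtas_args_model tod_exit = {
	.token = 3, .nargs = 0, .nret = 8,
	.params = { 0x00000000, 0x000007e6, 0x0000000b, 0x00000001,
		    0x00000000, 0x0000000e, 0x00000008, 0x2e0dac40 },
};

static const struct rtas_args_model scan_exit = {
	.token = 6, .nargs = 4, .nret = 1,
	.params = { 0xffffffff, 0x00000000, 0x04cc1a38, 0x00000800,
		    0x00000001 },
};

/* The i-th input argument. */
uint32_t input_of(const struct rtas_args_model *a, uint32_t i)
{
	assert(i < a->nargs);
	return a->params[i];
}

/* The status is the first output, stored right after the inputs. */
int32_t status_of(const struct rtas_args_model *a)
{
	assert(a->nret >= 1 && a->nargs + a->nret <= 16);
	return (int32_t)a->params[a->nargs];
}

/* The i-th output after the status. */
uint32_t output_of(const struct rtas_args_model *a, uint32_t i)
{
	assert(i + 1 < a->nret);
	return a->params[a->nargs + 1 + i];
}

/* Print the status and remaining outputs in decimal. */
void print_outputs(const struct rtas_args_model *a)
{
	printf("status: %d, other outputs:", (int)status_of(a));
	for (uint32_t i = 0; i + 1 < a->nret; i++)
		printf(" %u", (unsigned int)output_of(a, i));
	printf("\n");
}
```

Calling print_outputs(&tod_exit) walks the same eight return slots that rtas_output summarizes for get-time-of-day.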
      Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-10-26929c8cce78@linux.ibm.com
    • powerpc/rtas: strengthen do_enter_rtas() type safety, drop inline · 77f85f69
      Nathan Lynch authored
      Make do_enter_rtas() take a pointer to struct rtas_args and do the
      __pa() conversion in one place instead of leaving it to callers. This
      also makes it possible to introduce enter/exit tracepoints that access
      the rtas_args struct fields.
      
      There's no apparent reason to force inlining of do_enter_rtas()
      either, and it seems to bloat the code a bit. Let the compiler decide.
      Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
      Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
      Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-9-26929c8cce78@linux.ibm.com
    • powerpc/rtas: improve function information lookups · 8252b882
      Nathan Lynch authored
      The core RTAS support code and its clients perform two types of lookup
      for RTAS firmware function information.
      
      First, mapping a known function name to a token. The typical use case
      invokes rtas_token() to retrieve the token value to pass to
      rtas_call(). rtas_token() relies on of_get_property(), which performs
      a linear search of the /rtas node's property list under a lock with
      IRQs disabled.
      
      Second, and less common: given a token value, looking up some
      information about the function. The primary example is the sys_rtas
      filter path, which linearly scans a small table to match the token to
      a rtas_filter struct. Another use case to come is RTAS entry/exit
      tracepoints, which will require efficient lookup of function names
      from token values. Currently there is no general API for this.
      
      We need something much like the existing rtas_filters table, but more
      general and organized to facilitate efficient lookups.
      
      Introduce:
      
      * A new rtas_function type, aggregating function name, token,
        and filter. Other function characteristics could be added in the
        future.
      
      * An array of rtas_function, where each element corresponds to a known
        RTAS function. All information in the table is static save the token
        values, which are derived from the device tree at boot. The array is
        sorted by function name to allow binary search.
      
      * A named constant for each known RTAS function, used to index the
        function array. These also will be used in a client-facing API to be
        added later.
      
      * An xarray that maps valid tokens to rtas_function objects.
      
      Fold the existing rtas_filter table into the new rtas_function array,
      with the appropriate adjustments to block_rtas_call(). Remove
      now-redundant fields from struct rtas_filter. Preserve the function of
      the CONFIG_CPU_BIG_ENDIAN guard in the current filter table by
      introducing a per-function flag that is set for the function entries
      related to pseries LPAR migration. These have never had working users
      via sys_rtas on ppc64le; see commit de0f7349 ("powerpc/rtas:
      prevent suspend-related sys_rtas use on LE").
      
      Convert rtas_token() to use a lockless binary search on the function
      table. Fall back to the old behavior for lookups against names that
      are not known to be RTAS functions, but issue a warning. rtas_token()
      is for function names; it is not a general facility for accessing
      arbitrary properties of the /rtas node. All known misuses of
      rtas_token() have been converted to more appropriate of_ APIs in
      preceding changes.
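
The name-to-token lookup described above can be illustrated in userspace. This is only a sketch of the idea (a table sorted by name, searched with bsearch()), not the kernel implementation: the struct layout, lookup_token(), TOKEN_UNKNOWN, and all token values other than 3 and 6 are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define TOKEN_UNKNOWN (-1)

/* Hypothetical stand-in for the rtas_function type described above. */
struct rtas_function {
	const char *name;
	int token;  /* in the kernel this comes from the device tree at boot */
};

/*
 * Must stay sorted by name in strcmp() order so bsearch() works.
 * Tokens 40 and 41 are made up; ibm,get-vpd models a function the
 * firmware does not provide.
 */
static const struct rtas_function function_table[] = {
	{ "event-scan",               6 },
	{ "get-time-of-day",          3 },
	{ "ibm,configure-connector",  40 },
	{ "ibm,get-system-parameter", 41 },
	{ "ibm,get-vpd",              TOKEN_UNKNOWN },
};

/* bsearch() comparator: key is the name, elem is a table entry. */
static int cmp_name(const void *key, const void *elem)
{
	const struct rtas_function *f = (const struct rtas_function *)elem;

	return strcmp((const char *)key, f->name);
}

/* Binary search by name; no lock is needed because the table is read-only. */
int lookup_token(const char *name)
{
	const struct rtas_function *f;

	assert(name != NULL);
	f = (const struct rtas_function *)bsearch(name, function_table,
			sizeof(function_table) / sizeof(function_table[0]),
			sizeof(function_table[0]), cmp_name);
	return f ? f->token : TOKEN_UNKNOWN;
}
```

In this sketch an unknown name simply yields TOKEN_UNKNOWN; the commit instead falls back to the old property lookup and issues a warning.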
      Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-8-26929c8cce78@linux.ibm.com
    • powerpc/pseries: drop RTAS-based timebase synchronization · d6f7fe3b
      Nathan Lynch authored
      The pseries platform has been LPAR-only for several generations, and
      the PAPR spec:
      
      * Guarantees that timebase synchronization is performed by
        the platform ("The timebase registers are synchronized by the
        platform before CPUs are given to the OS" - 7.3.8 SMP Support).
      
      * Completely omits the freeze-time-base and thaw-time-base RTAS
        functions, which are CHRP artifacts.
      
      This code is effectively unused on currently supported models, so drop
      it.
      Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-7-26929c8cce78@linux.ibm.com
    • powerpc/rtas: ensure 4KB alignment for rtas_data_buf · 836b5b9f
      Nathan Lynch authored
      Some RTAS functions that have work area parameters impose alignment
      requirements on the work area passed to them by the OS. Examples
      include:
      
      - ibm,configure-connector
      - ibm,update-nodes
      - ibm,update-properties
      
      4KB is the greatest alignment required by PAPR for such
      buffers. rtas_data_buf used to have a __page_aligned attribute in the
      arch/ppc64 days, but that was changed to __cacheline_aligned for
      unknown reasons by commit 033ef338 ("powerpc: Merge rtas.c into
      arch/powerpc/kernel"). That works out to 128-byte alignment
      on ppc64, which isn't right.
      
      This was found by inspection and I'm not aware of any real problems
      caused by this. Either current RTAS implementations don't enforce the
      alignment constraints, or rtas_data_buf is always being placed at a
      4KB boundary by accident (or both, perhaps).
      
      Use __aligned(SZ_4K) to ensure the rtas_data_buf has alignment
      appropriate for all users.
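
The difference between cacheline and 4KB alignment is easy to demonstrate outside the kernel. A minimal userspace sketch, using C11 _Alignas as a stand-in for the kernel's __aligned(SZ_4K); work_buf, get_work_buf() and is_aligned() are hypothetical names used only for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A 4KB buffer forced onto a 4KB boundary, analogous to rtas_data_buf. */
static _Alignas(4096) unsigned char work_buf[4096];

unsigned char *get_work_buf(void)
{
	return work_buf;
}

/* True if p sits on an 'align'-byte boundary; align must be a power of two. */
bool is_aligned(const void *p, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return ((uintptr_t)p & (align - 1)) == 0;
}
```

An address that is only 128-byte aligned, such as get_work_buf() + 128, passes is_aligned(p, 128) but fails is_aligned(p, 4096), which is the gap the change closes.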
      Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
      Fixes: 033ef338 ("powerpc: Merge rtas.c into arch/powerpc/kernel")
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-6-26929c8cce78@linux.ibm.com
    • powerpc/pseries/setup: add missing RTAS retry status handling · b7d5333c
      Nathan Lynch authored
      The ibm,get-system-parameter RTAS function may return -2 or 990x,
      which indicate that the caller should try again.
      
      pSeries_cmo_feature_init() ignores this, making it possible to fail to
      detect cooperative memory overcommit capabilities during boot.
      
      Move the RTAS call into a conventional rtas_busy_delay()-based
      loop, dropping unnecessary clearing of rtas_data_buf.
      Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-5-26929c8cce78@linux.ibm.com
    • powerpc/pseries/lparcfg: add missing RTAS retry status handling · 5d08633e
      Nathan Lynch authored
      The ibm,get-system-parameter RTAS function may return -2 or 990x,
      which indicate that the caller should try again.
      
      lparcfg's parse_system_parameter_string() ignores this, making it
      possible to intermittently report incorrect SPLPAR characteristics.
      
      Move the RTAS call into a conventional rtas_busy_delay()-based loop.
      Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-4-26929c8cce78@linux.ibm.com
    • powerpc/pseries/lpar: add missing RTAS retry status handling · daa8ab59
      Nathan Lynch authored
      The ibm,get-system-parameter RTAS function may return -2 or 990x,
      which indicate that the caller should try again.
      
      pseries_lpar_read_hblkrm_characteristics() ignores this, making it
      possible to incorrectly detect TLB block invalidation characteristics
      at boot.
      
      Move the RTAS call into a conventional rtas_busy_delay()-based loop.
      Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
      Fixes: 1211ee61 ("powerpc/pseries: Read TLB Block Invalidate Characteristics")
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-3-26929c8cce78@linux.ibm.com
    • powerpc/perf/hv-24x7: add missing RTAS retry status handling · cc4b26ea
      Nathan Lynch authored
      The ibm,get-system-parameter RTAS function may return -2 or 990x,
      which indicate that the caller should try again. read_24x7_sys_info()
      ignores this, allowing transient failures in reporting processor
      module information.
      
      Move the RTAS call into a conventional rtas_busy_delay()-based loop,
      along with the parsing of results on success.
      Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
      Fixes: 8ba21426 ("powerpc/hv-24x7: Add rtas call in hv-24x7 driver to get processor details")
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-2-26929c8cce78@linux.ibm.com
    • powerpc/rtas: handle extended delays safely in early boot · 09d1ea72
      Nathan Lynch authored
      Some code that runs early in boot calls RTAS functions that can return
      -2 or 990x statuses, which mean the caller should retry. An example is
      pSeries_cmo_feature_init(), which invokes ibm,get-system-parameter but
      treats these benign statuses as errors instead of retrying.
      
      pSeries_cmo_feature_init() and similar code should be made to retry
      until they succeed or receive a real error, using the usual pattern:
      
      	do {
      		rc = rtas_call(token, etc...);
      	} while (rtas_busy_delay(rc));
      
      But rtas_busy_delay() will perform a timed sleep on any 990x
      status. This isn't safe so early in boot, before the CPU scheduler and
      timer subsystem have initialized.
      
      The -2 RTAS status is much more likely to occur during single-threaded
      boot than 990x in practice, at least on PowerVM. This is because -2
      usually means that RTAS made progress but exhausted its self-imposed
      timeslice, while 990x is associated with concurrent requests from the
      OS causing internal contention. Regardless, according to the language
      in PAPR, the OS should be prepared to handle either type of status at
      any time.
      
      Add a fallback path to rtas_busy_delay() to handle this as safely as
      possible, performing a small delay on 990x. Include a counter to
      detect retry loops that aren't making progress and bail out. Add __ref
      to rtas_busy_delay() since it now conditionally calls an __init
      function.
      
      This was found by inspection and I'm not aware of any real
      failures. However, the implementation of rtas_busy_delay() before
      commit 38f7b706 ("powerpc/rtas: rtas_busy_delay() improvements")
      was not susceptible to this problem, so let's treat this as a
      regression.
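
The fallback behavior described above (retry immediately on -2, delay briefly on 990x, give up when a retry loop stops making progress) can be modeled in userspace. This is a sketch under stated assumptions, not the kernel code: the limit of 1000 successive extended delays, the state struct, the no-op delay, and the scripted fake firmware are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define STATUS_BUSY     (-2)
#define EXT_DELAY_MIN   9900
#define EXT_DELAY_MAX   9905
#define MAX_EXT_DELAYS  1000u  /* assumed bail-out threshold */

struct early_retry_state {
	unsigned int successive_ext_delays;
};

/* Placeholder for a short busy-wait; sleeping is not possible this early. */
static void short_delay(void)
{
}

/* Returns true if the caller should retry the call that produced 'status'. */
bool busy_delay_early(struct early_retry_state *st, int status)
{
	assert(st != NULL);
	if (status >= EXT_DELAY_MIN && status <= EXT_DELAY_MAX) {
		short_delay();
		if (++st->successive_ext_delays > MAX_EXT_DELAYS) {
			st->successive_ext_delays = 0;
			return false;  /* no progress: stop retrying */
		}
		return true;
	}
	st->successive_ext_delays = 0;
	return status == STATUS_BUSY;
}

/* Fake firmware: replays a script, repeating the last entry forever. */
struct fake_fw {
	const int *script;
	int len;
	int pos;
};

static int fake_call(struct fake_fw *fw)
{
	int status = fw->script[fw->pos];

	if (fw->pos < fw->len - 1)
		fw->pos++;
	return status;
}

/* The usual retry loop; counts calls made and returns the final status. */
int call_with_retry(struct fake_fw *fw, unsigned int *calls)
{
	struct early_retry_state st = { 0 };
	int rc;

	*calls = 0;
	do {
		rc = fake_call(fw);
		(*calls)++;
	} while (busy_delay_early(&st, rc));
	return rc;
}

/* Transient -2 and 9901 statuses, then success. */
int demo_transient(void)
{
	static const int script[] = { STATUS_BUSY, 9901, 0 };
	struct fake_fw fw = { script, 3, 0 };
	unsigned int calls;

	return call_with_retry(&fw, &calls);
}

/* Firmware stuck on 9902: reports how many calls happen before bailing out. */
unsigned int demo_stuck_calls(void)
{
	static const int script[] = { 9902 };
	struct fake_fw fw = { script, 1, 0 };
	unsigned int calls;

	(void)call_with_retry(&fw, &calls);
	return calls;
}
```

With these assumptions the transient case ends in status 0 after three calls, while the stuck case stops after MAX_EXT_DELAYS + 1 calls instead of looping forever.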
      Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
      Fixes: 38f7b706 ("powerpc/rtas: rtas_busy_delay() improvements")
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-1-26929c8cce78@linux.ibm.com
    • integrity/powerpc: Support loading keys from PLPKS · 4b3e71e9
      Russell Currey authored
      Add support for loading keys from the PLPKS on pseries machines, with the
      "ibm,plpks-sb-v1" format.
      
      The object format is expected to be the same, so there shouldn't be any
      functional differences between objects retrieved on powernv or pseries.
      
      Unlike on powernv, on pseries the format string isn't contained in the
      device tree. Use secvar_ops->format() to fetch the format string in a
      generic manner, rather than searching the device tree ourselves.
      
      (The current code searches the device tree for a node compatible with
      "ibm,edk2-compat-v1". This patch switches to calling secvar_ops->format(),
      which in the case of OPAL/powernv means opal_secvar_format(), which
      searches the device tree for a node compatible with "ibm,secvar-backend"
      and checks its "format" property. These are equivalent, as skiboot creates
      a node with both "ibm,edk2-compat-v1" and "ibm,secvar-backend" as
      compatible strings.)
      Signed-off-by: Russell Currey <ruscur@russell.cc>
      Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
      Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20230210080401.345462-27-ajd@linux.ibm.com
    • integrity/powerpc: Improve error handling & reporting when loading certs · 3c8069b0
      Russell Currey authored
      A few improvements to load_powerpc.c:
      
       - include integrity.h for the pr_fmt()
       - move all error reporting out of get_cert_list()
       - use ERR_PTR() to better preserve error detail
       - don't use pr_err() for missing keys
      Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
      Signed-off-by: Russell Currey <ruscur@russell.cc>
      Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20230210080401.345462-26-ajd@linux.ibm.com
    • powerpc/pseries: Implement secvars for dynamic secure boot · ccadf154
      Russell Currey authored
      The pseries platform can support dynamic secure boot (i.e. secure boot
      using user-defined keys) using variables contained within the PowerVM LPAR
      Platform KeyStore (PLPKS).  Using the powerpc secvar API, expose the
      relevant variables for pseries dynamic secure boot through the existing
      secvar filesystem layout.
      
      The relevant variables for dynamic secure boot are signed in the
      keystore, and can only be modified using the H_PKS_SIGNED_UPDATE hcall.
      Object labels in the keystore are encoded using ucs2 format.  With our
      fixed variable names we don't have to care about encoding outside of the
      necessary byte padding.
      
      When a user writes to a variable, the first 8 bytes of data must contain
      the signed update flags as defined by the hypervisor.
      
      When a user reads a variable, the first 4 bytes of data contain the
      policies defined for the object.
      
      Limitations exist due to the underlying implementation of sysfs binary
      attributes, as is the case for the OPAL secvar implementation -
      partial writes are unsupported and writes cannot be larger than PAGE_SIZE.
      (Even when using bin_attributes, which can be larger than a single page,
      sysfs only gives us one page's worth of write buffer at a time, and the
      hypervisor does not expose an interface for partial writes.)
      Co-developed-by: Nayna Jain <nayna@linux.ibm.com>
      Signed-off-by: Nayna Jain <nayna@linux.ibm.com>
      Co-developed-by: Andrew Donnellan <ajd@linux.ibm.com>
      Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
      Signed-off-by: Russell Currey <ruscur@russell.cc>
      [mpe: Add NLS dependency to fix build errors, squash fix from ajd]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20230210080401.345462-25-ajd@linux.ibm.com
  2. 12 Feb, 2023 25 commits