    arm64: Avoid cpus_have_const_cap() for ARM64_MTE · 94324bcb
    Mark Rutland authored
    In system_supports_mte() we use cpus_have_const_cap() to check for
    ARM64_MTE, but this is not necessary and alternative_has_cap_*()
    would be preferable.
    
    For historical reasons, cpus_have_const_cap() is more complicated than
    it needs to be. Before cpucaps are finalized, it will perform a bitmap
    test of the system_cpucaps bitmap, and once cpucaps are finalized it
    will use an alternative branch. This used to be necessary to handle some
    race conditions in the window between cpucap detection and the
    subsequent patching of alternatives and static branches, where different
    branches could be out-of-sync with one another (or w.r.t. alternative
    sequences). Now that we use alternative branches instead of static
    branches, these are all patched atomically w.r.t. one another, and there
    are only a handful of cases that need special care in the window between
    cpucap detection and alternative patching.
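
    The two-phase behaviour described above can be sketched in user space
    as follows. This is only an illustrative model, not the kernel
    implementation: every model_* name is hypothetical, and a plain array
    of booleans stands in for the patched alternative branches.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
enum { MODEL_CAP_MTE = 0, MODEL_NCAPS = 64 };

static uint64_t model_system_cpucaps;          /* stands in for the system_cpucaps bitmap */
static bool model_caps_finalized;              /* true once "alternatives" are patched */
static bool model_patched_branch[MODEL_NCAPS]; /* stands in for patched branches */

/* Detection only records the capability in the bitmap. */
static void model_detect_cap(int cap)
{
    model_system_cpucaps |= UINT64_C(1) << cap;
}

/* Patching copies every detected capability into its branch in one pass. */
static void model_apply_alternatives(void)
{
    for (int cap = 0; cap < MODEL_NCAPS; cap++)
        model_patched_branch[cap] = (model_system_cpucaps >> cap) & 1;
    model_caps_finalized = true;
}

/* Models the shape of cpus_have_const_cap(): bitmap test first, branch later. */
static bool model_cpus_have_const_cap(int cap)
{
    if (!model_caps_finalized)
        return (model_system_cpucaps >> cap) & 1; /* before finalization */
    return model_patched_branch[cap];             /* after finalization */
}
```

    In this model every caller carries both the bitmap test and the
    branch, which corresponds to the redundant code that removing
    cpus_have_const_cap() is meant to avoid.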
    
    Due to the above, it would be nice to remove cpus_have_const_cap(), and
    migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(),
    or cpus_have_cap() depending on their requirements. This will
    remove redundant instructions and improve code generation, and will make
    it easier to determine how each callsite will behave before, during, and
    after alternative patching.
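
    As a rough guide to how the three replacement helpers differ across
    the detection/patching window, here is a small user-space sketch. The
    sketch_* names and the SKETCH_BUG result are hypothetical stand-ins
    (the real cpus_have_final_cap() calls BUG() rather than returning a
    value), and the branch array is an assumption used in place of real
    alternative patching.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type: SKETCH_BUG stands in for a kernel BUG(). */
enum sketch_result { SKETCH_BUG = -1, SKETCH_FALSE = 0, SKETCH_TRUE = 1 };

static uint64_t sketch_cpucaps;  /* detected capabilities (bitmap) */
static bool sketch_patched[64];  /* stands in for patched alternative branches */
static bool sketch_finalized;    /* true once patching has happened */

static void sketch_detect(int cap)
{
    sketch_cpucaps |= UINT64_C(1) << cap;
}

static void sketch_patch_all(void)
{
    for (int cap = 0; cap < 64; cap++)
        sketch_patched[cap] = (sketch_cpucaps >> cap) & 1;
    sketch_finalized = true;
}

/* Like cpus_have_cap(): always a bitmap test, valid at any time. */
static enum sketch_result sketch_cpus_have_cap(int cap)
{
    return ((sketch_cpucaps >> cap) & 1) ? SKETCH_TRUE : SKETCH_FALSE;
}

/* Like alternative_has_cap_*(): only the branch, so false until patched. */
static enum sketch_result sketch_alternative_has_cap(int cap)
{
    return sketch_patched[cap] ? SKETCH_TRUE : SKETCH_FALSE;
}

/* Like cpus_have_final_cap(): must not be called before finalization. */
static enum sketch_result sketch_cpus_have_final_cap(int cap)
{
    if (!sketch_finalized)
        return SKETCH_BUG;
    return sketch_alternative_has_cap(cap);
}
```

    In the window between sketch_detect() and sketch_patch_all(), the
    three helpers give three different answers for a detected cap (true,
    false, and BUG respectively), which is why each callsite needs to be
    considered individually.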
    
    The ARM64_MTE cpucap is a boot cpu feature which is detected and patched
    early on the boot CPU under smp_prepare_boot_cpu(). In the window
    between detecting the ARM64_MTE cpucap and patching alternatives,
    nothing depends on the ARM64_MTE cpucap:
    
    * The kasan_hw_tags_enabled() helper depends upon the kasan_flag_enabled
      static key, which is initialized later in kasan_init_hw_tags() after
      alternatives have been applied.
    
    * No KVM code is called during this window, and KVM is not initialized
      until after system cpucaps have been detected and patched. KVM code
      can safely use cpus_have_final_cap() or alternative_has_cap_*().
    
    * We don't context-switch prior to patching boot alternatives, and thus
      mte_thread_switch() is not reachable during this window. Thus, we can
      safely use cpus_have_final_boot_cap() or alternative_has_cap_*() in
      the context-switch code.
    
    * IRQ and FIQ are masked during this window, and we can only take SError
      and Debug exceptions. SError exceptions are fatal at this point in
      time, and we do not expect to take Debug exceptions, thus:
    
      - It's fine to leave TCO set for exceptions taken during this window,
        and mte_disable_tco_entry() doesn't need to do anything.
    
      - We don't need to detect and report asynchronous tag check faults
        during this window, and neither mte_check_tfsr_entry() nor
        mte_check_tfsr_exit() need to do anything.
    
      Since we want to report any SErrors taken during this window, these
      cannot safely use cpus_have_final_boot_cap() or cpus_have_final_cap(),
      but these can safely use alternative_has_cap_*().
    
    * The __set_pte_at() function is not used during this window. It is
      possible for this to be used on kernel mappings prior to boot cpucaps
      being finalized, so this cannot safely use cpus_have_final_boot_cap()
      or cpus_have_final_cap(), but this can safely use
      alternative_has_cap_*().
    
    * No userspace translation tables have been created yet, and swap has
      not been initialized yet. Thus swapping is not possible and none of
      the following are called:
    
      - arch_thp_swp_supported()
      - arch_prepare_to_swap()
      - arch_swap_invalidate_page()
      - arch_swap_invalidate_area()
      - arch_swap_restore()
    
      These can safely use cpus_have_final_cap() or
      alternative_has_cap_*().
    
    * The elfcore functions are only reachable after userspace is brought
      up, which happens after system cpucaps have been detected and patched.
      Thus the elfcore code can safely use cpus_have_final_cap() or
      alternative_has_cap_*().
    
    * Hibernation is only possible after userspace is brought up, which
      happens after system cpucaps have been detected and patched. Thus the
      hibernate code can safely use cpus_have_final_cap() or
      alternative_has_cap_*().
    
    * The set_tagged_addr_ctrl() function is only reachable after userspace
      is brought up, which happens after system cpucaps have been detected
      and patched. Thus this can safely use cpus_have_final_cap() or
      alternative_has_cap_*().
    
    * The copy_user_highpage() and copy_highpage() functions are not used
      during this window, and can safely use alternative_has_cap_*().
    
    This patch replaces the use of cpus_have_const_cap() with
    alternative_has_cap_unlikely(), which avoids generating code to test the
    system_cpucaps bitmap and should be better for all subsequent calls at
    runtime.
    
    Signed-off-by: Mark Rutland <mark.rutland@arm.com>
    Cc: Peter Collingbourne <pcc@google.com>
    Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
    Cc: Will Deacon <will@kernel.org>
    Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
cpufeature.h 29.7 KB