1. 17 Jun, 2024 2 commits
    • perf/x86/uncore: Support per PMU cpumask · c74443d9
      Kan Liang authored
      The cpumask of some uncore units, e.g., CXL uncore units, may be wrong
      under some configurations. Perf may access an uncore counter of a
      non-existent uncore unit.
      
      The uncore driver assumes that all uncore units are symmetric among
      dies. A global cpumask is shared among all uncore PMUs. However, some
      CXL uncore units may only be available on some dies.
      
      A per PMU cpumask is introduced to track the CPU mask of this PMU.
      The driver searches the unit control RB tree to check whether the PMU is
      available on a given die, and updates the per PMU cpumask accordingly.
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Tested-by: Yunying Sun <yunying.sun@intel.com>
      Link: https://lore.kernel.org/r/20240614134631.1092359-3-kan.liang@linux.intel.com
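The per-PMU cpumask idea in the commit above can be sketched roughly as follows. This is a simplified user-space illustration, not the driver's code: a plain array stands in for the unit control RB tree, and `struct unit_node`, `pmu_has_unit_on_die()`, `build_pmu_cpumask()` and the `die_cpu` array are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one discovered unit: which PMU it backs, on which die. */
struct unit_node {
	int pmu_id;
	int die_id;
};

/* True if any discovered unit provides PMU `pmu_id` on die `die_id`. */
static bool pmu_has_unit_on_die(const struct unit_node *units, size_t n,
				int pmu_id, int die_id)
{
	for (size_t i = 0; i < n; i++)
		if (units[i].pmu_id == pmu_id && units[i].die_id == die_id)
			return true;
	return false;
}

/*
 * Build a per-PMU CPU mask (bit N = CPU N). die_cpu[d] is the CPU assumed
 * to collect uncore events for die d. A die contributes its CPU only if the
 * PMU actually has a unit there.
 */
static uint64_t build_pmu_cpumask(const struct unit_node *units, size_t n,
				  int pmu_id, const int *die_cpu, int nr_dies)
{
	uint64_t mask = 0;

	for (int die = 0; die < nr_dies; die++) {
		assert(die_cpu[die] >= 0 && die_cpu[die] < 64);
		if (pmu_has_unit_on_die(units, n, pmu_id, die))
			mask |= UINT64_C(1) << die_cpu[die];
	}
	return mask;
}
```

With a global mask, a PMU that exists only on die 0 would still advertise the CPU of die 1; here its mask simply omits that CPU.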
    • perf/x86/uncore: Save the unit control address of all units · 0007f393
      Kan Liang authored
      The unit control address of some CXL units may be wrongly calculated
      under some configurations on an EMR machine.
      
      The current implementation saves the unit control address only for the
      units on the first die and for the first unit on each of the remaining
      dies. Perf assumes that units on the other dies have the same offsets
      as on the first die, so that the unit control addresses of the
      remaining units can be calculated. However, the assumption is wrong,
      especially for the CXL units.
      
      Introduce an RB tree for each uncore type to save the unit control
      address and three kinds of ID information (unit ID, PMU ID, and die ID)
      for all units.
      The unit ID is the physical ID of a unit.
      The PMU ID is a logical ID assigned to a unit. The logical IDs start
      from 0 and must be contiguous. Physical IDs and logical IDs map 1:1.
      Units with the same physical ID on different dies share the same PMU.
      The die ID indicates which die a unit belongs to.
      
      The RB tree can be searched by two different keys (unit ID, or PMU ID +
      die ID). During the RB tree setup, the unit ID is used as the key to
      look up the RB tree so that perf can create or assign a proper PMU ID
      for the unit. Later, after the RB tree is set up, PMU ID + die ID is
      used as the key to look up the RB tree and fill the cpumask of a PMU.
      That lookup is used more frequently, so unit_less() compares PMU ID +
      die ID. As a result, uncore_find_unit(), which searches by unit ID,
      has to be O(N). But the RB tree setup only occurs once, at driver load
      time, so this should be acceptable.
      
      Compared with the current implementation, more space is required to save
      the information of all units. The extra size should be acceptable.
      For example, on EMR, there are 221 units at most. For a 2-socket machine,
      the extra space is ~6KB at most.
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240614134631.1092359-2-kan.liang@linux.intel.com
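The two-key lookup trade-off described in the commit above can be illustrated with a small sketch. It deliberately swaps the kernel RB tree for a sorted array with `qsort()`/`bsearch()`, which shows the same asymmetry: lookups by the ordering key (PMU ID + die ID) are logarithmic, while lookups by unit ID must scan. `struct unit_info` and the function names are hypothetical and are not taken from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-unit record; field names are illustrative only. */
struct unit_info {
	int unit_id;			/* physical ID */
	int pmu_id;			/* logical ID, contiguous from 0 */
	int die_id;			/* die the unit belongs to */
	unsigned long long ctl_addr;	/* unit control address */
};

/* Ordering key: PMU ID first, then die ID (the frequently used lookup). */
static int unit_cmp(const void *a, const void *b)
{
	const struct unit_info *x = a, *y = b;

	if (x->pmu_id != y->pmu_id)
		return x->pmu_id < y->pmu_id ? -1 : 1;
	if (x->die_id != y->die_id)
		return x->die_id < y->die_id ? -1 : 1;
	return 0;
}

/* Sort once at setup time so that later key lookups can use bsearch(). */
static void sort_units(struct unit_info *units, size_t n)
{
	assert(units != NULL);
	qsort(units, n, sizeof(*units), unit_cmp);
}

/* O(log N): lookup by the ordering key (PMU ID + die ID). */
static const struct unit_info *find_by_pmu_die(const struct unit_info *units,
					       size_t n, int pmu_id, int die_id)
{
	struct unit_info key = { .pmu_id = pmu_id, .die_id = die_id };

	return bsearch(&key, units, n, sizeof(*units), unit_cmp);
}

/* O(N): unit ID is not the ordering key, so every entry may be visited. */
static const struct unit_info *find_by_unit_id(const struct unit_info *units,
					       size_t n, int unit_id)
{
	for (size_t i = 0; i < n; i++)
		if (units[i].unit_id == unit_id)
			return &units[i];
	return NULL;
}
```

Ordering by the key that is queried most often keeps the common path cheap; the linear scan is confined to the one-time setup phase.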
  2. 18 May, 2024 1 commit
  3. 08 May, 2024 1 commit
  4. 02 May, 2024 11 commits
    • perf/x86/rapl: Rename 'maxdie' to nr_rapl_pmu and 'dieid' to rapl_pmu_idx · 626c5acf
      Dhananjay Ugwekar authored
      AMD CPUs have the scope of RAPL energy-pkg event as package, whereas
      Intel Cascade Lake CPUs have the scope as die.
      
      To account for the difference in the energy-pkg event scope between AMD
      and Intel CPUs, give more generic and semantically correct names to the
      maxdie and dieid variables.
      
      No functional change.
      Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Tested-by: K Prateek Nayak <kprateek.nayak@amd.com>
      Link: https://lore.kernel.org/r/20240502095115.177713-2-Dhananjay.Ugwekar@amd.com
    • Merge branch 'x86/cpu' into perf/core, to pick up dependent commits · 10ed2b11
      Ingo Molnar authored
      We are going to fix perf-events fallout of changes in tip:x86/cpu,
      so merge in that branch first.
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/insn: Add support for APX EVEX instructions to the opcode map · 690ca3a3
      Adrian Hunter authored
      To support APX functionality, the EVEX prefix is used to:
      
       - promote legacy instructions
       - promote VEX instructions
       - add new instructions
      
      Promoted VEX instructions require no extra annotation because the opcodes
      do not change and the permissive nature of the instruction decoder already
      allows them to have an EVEX prefix.
      
      Promoted legacy instructions and new instructions are placed in map 4 which
      has not been used before.
      
      Create a new table for map 4 and add APX instructions.
      
      Annotate SCALABLE instructions with "(es)" - refer to patch "x86/insn: Add
      support for APX EVEX to the instruction decoder logic". SCALABLE
      instructions must be represented in both no-prefix (NP) and 66 prefix
      forms.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Link: https://lore.kernel.org/r/20240502105853.5338-9-adrian.hunter@intel.com
    • x86/insn: Add support for APX EVEX to the instruction decoder logic · 87bbaf1a
      Adrian Hunter authored
      Intel Advanced Performance Extensions (APX) extends the EVEX prefix to
      support:
      
       - extended general purpose registers (EGPRs) i.e. r16 to r31
       - Push-Pop Acceleration (PPX) hints
       - new data destination (NDD) register
       - suppress status flags writes (NF) of common instructions
       - new instructions
      
      Refer to the Intel Advanced Performance Extensions (Intel APX) Architecture
      Specification for details.
      
      The extended EVEX prefix does not need amended instruction decoder logic,
      except in one area. Some instructions are defined as SCALABLE which means
      the EVEX.W bit and EVEX.pp bits are used to determine operand size.
      Specifically, if an instruction is SCALABLE and EVEX.W is zero, then
      EVEX.pp value 0 (representing no prefix NP) means default operand size,
      whereas EVEX.pp value 1 (representing 66 prefix) means operand size
      override i.e. 16 bits.
      
      Add an attribute (INAT_EVEX_SCALABLE) to identify such instructions, and
      amend the logic appropriately.
      
      Amend the awk script that generates the attribute tables from the opcode
      map, to recognise "(es)" as attribute INAT_EVEX_SCALABLE.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Link: https://lore.kernel.org/r/20240502105853.5338-8-adrian.hunter@intel.com
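The SCALABLE operand-size rule from the commit above can be written down as a small decision function. This is a sketch of the rule as the commit message states it, not the decoder's actual code. The function name and parameters are hypothetical, and two points are assumptions not spelled out in the message: that EVEX.W = 1 selects a 64-bit operand (by analogy with REX.W), and that the caller supplies the mode's default operand size (typically 32 bits in 64-bit mode).

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Operand size in bits for an instruction, given its EVEX fields.
 *   scalable:     instruction carries the SCALABLE attribute
 *   evex_w:       EVEX.W bit (0 or 1)
 *   evex_pp:      EVEX.pp field (0 = no prefix NP, 1 = 66 prefix)
 *   default_bits: default operand size of the current mode (assumed input)
 */
static int scalable_opnd_bits(bool scalable, int evex_w, int evex_pp,
			      int default_bits)
{
	assert(evex_pp >= 0 && evex_pp <= 3);

	if (!scalable)
		return default_bits;	/* pp does not select operand size here */
	if (evex_w)
		return 64;		/* assumption: W=1 means 64-bit, as with REX.W */
	if (evex_pp == 1)
		return 16;		/* 66 prefix: operand size override */
	return default_bits;		/* NP: default operand size */
}
```

For example, `scalable_opnd_bits(true, 0, 1, 32)` yields 16, while the same call with `evex_pp` set to 0 yields the default of 32.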
    • x86/insn: x86/insn: Add support for REX2 prefix to the instruction decoder opcode map · 159039af
      Adrian Hunter authored
      Support for REX2 has been added to the instruction decoder logic and the
      awk script that generates the attribute tables from the opcode map.
      
      Add REX2 prefix byte (0xD5) to the opcode map.
      
      Add annotation (!REX2) for map 0/1 opcodes that are reserved under REX2.
      
      Add JMPABS to the opcode map and add annotation (REX2) to identify that it
      has a mandatory REX2 prefix. A separate opcode attribute table is not
      needed at this time because JMPABS has the same attribute encoding as the
      MOV instruction that it shares an opcode with i.e. INAT_MOFFSET.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Link: https://lore.kernel.org/r/20240502105853.5338-7-adrian.hunter@intel.com
    • x86/insn: Add support for REX2 prefix to the instruction decoder logic · eada38d5
      Adrian Hunter authored
      Intel Advanced Performance Extensions (APX) uses a new 2-byte prefix named
      REX2 to select extended general purpose registers (EGPRs) i.e. r16 to r31.
      
      The REX2 prefix is effectively an extended version of the REX prefix.
      
      REX2 and EVEX are also used with PUSH/POP instructions to provide a
      Push-Pop Acceleration (PPX) hint. With PPX hints, a CPU will attempt to
      fast-forward register data between matching PUSH and POP instructions.
      
      REX2 is valid only with opcodes in maps 0 and 1. Similar extension for
      other maps is provided by the EVEX prefix, covered in a separate patch.
      
      Some opcodes in maps 0 and 1 are reserved under REX2. One of these is used
      for a new 64-bit absolute direct jump instruction JMPABS.
      
      Refer to the Intel Advanced Performance Extensions (Intel APX) Architecture
      Specification for details.
      
      Define a code value for the REX2 prefix (INAT_PFX_REX2), and add attribute
      flags for opcodes reserved under REX2 (INAT_NO_REX2) and to identify
      opcodes (only JMPABS) that require a mandatory REX2 prefix
      (INAT_REX2_VARIANT).
      
      Amend logic to read the REX2 prefix and get the opcode attribute for the
      map number (0 or 1) encoded in the REX2 prefix.
      
      Amend the awk script that generates the attribute tables from the opcode
      map, to recognise "REX2" as attribute INAT_PFX_REX2, and "(!REX2)"
      as attribute INAT_NO_REX2, and "(REX2)" as attribute INAT_REX2_VARIANT.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Link: https://lore.kernel.org/r/20240502105853.5338-6-adrian.hunter@intel.com
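Reading the REX2 prefix and extracting the map number might look roughly like the sketch below. The prefix byte 0xD5 comes from the opcode-map commit in this series; the payload bit positions (bit 7 as the map selector, bit 3 as W) are my reading of the Intel APX specification and should be treated as assumptions. `struct rex2_info` and `parse_rex2()` are hypothetical names, and the sketch assumes 64-bit mode with no other prefixes in front.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define REX2_PREFIX_BYTE 0xD5	/* first byte of the 2-byte REX2 prefix */
#define REX2_PAYLOAD_M0  0x80	/* assumed: payload bit 7 selects opcode map 0 or 1 */
#define REX2_PAYLOAD_W   0x08	/* assumed: payload bit 3 is the W bit, as in REX */

struct rex2_info {
	bool present;	/* buffer starts with a REX2 prefix */
	int map;	/* opcode map number: 0 or 1 */
	bool w;		/* 64-bit operand size requested */
};

/* Inspect the first two bytes of an instruction for a REX2 prefix. */
static struct rex2_info parse_rex2(const uint8_t *buf, size_t len)
{
	struct rex2_info info = { false, 0, false };

	assert(buf != NULL);
	if (len < 2 || buf[0] != REX2_PREFIX_BYTE)
		return info;

	info.present = true;
	info.map = (buf[1] & REX2_PAYLOAD_M0) ? 1 : 0;
	info.w = (buf[1] & REX2_PAYLOAD_W) != 0;
	return info;
}
```

A decoder would then pick the attribute table for map 0 or map 1 based on `info.map` and reject opcodes flagged as reserved under REX2.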
    • x86/insn: Add misc new Intel instructions · 9dd36128
      Adrian Hunter authored
      The x86 instruction decoder is used not only for decoding kernel
      instructions. It is also used by perf uprobes (user space probes) and by
      perf tools Intel Processor Trace decoding. Consequently, it needs to
      support instructions executed by user space also.
      
      Add instructions documented in Intel Architecture Instruction Set
      Extensions and Future Features Programming Reference March 2024
      319433-052, that have not been added yet:
      
      	AADD
      	AAND
      	AOR
      	AXOR
      	CMPccXADD
      	PBNDKB
      	RDMSRLIST
      	URDMSR
      	UWRMSR
      	VBCSTNEBF162PS
      	VBCSTNESH2PS
      	VCVTNEEBF162PS
      	VCVTNEEPH2PS
      	VCVTNEOBF162PS
      	VCVTNEOPH2PS
      	VCVTNEPS2BF16
      	VPDPB[SU,UU,SS]D[,S]
      	VPDPW[SU,US,UU]D[,S]
      	VPMADD52HUQ
      	VPMADD52LUQ
      	VSHA512MSG1
      	VSHA512MSG2
      	VSHA512RNDS2
      	VSM3MSG1
      	VSM3MSG2
      	VSM3RNDS2
      	VSM4KEY4
      	VSM4RNDS4
      	WRMSRLIST
      	TCMMIMFP16PS
      	TCMMRLFP16PS
      	TDPFP16PS
      	PREFETCHIT1
      	PREFETCHIT0
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Link: https://lore.kernel.org/r/20240502105853.5338-5-adrian.hunter@intel.com
    • x86/insn: Add VEX versions of VPDPBUSD, VPDPBUSDS, VPDPWSSD and VPDPWSSDS · b8000264
      Adrian Hunter authored
      The x86 instruction decoder is used not only for decoding kernel
      instructions. It is also used by perf uprobes (user space probes) and by
      perf tools Intel Processor Trace decoding. Consequently, it needs to
      support instructions executed by user space also.
      
      Intel Architecture Instruction Set Extensions and Future Features manual
      number 319433-044 of May 2021, documented VEX versions of instructions
      VPDPBUSD, VPDPBUSDS, VPDPWSSD and VPDPWSSDS, but the opcode map has them
      listed as EVEX only.
      
      Remove EVEX-only (ev) annotation from instructions VPDPBUSD, VPDPBUSDS,
      VPDPWSSD and VPDPWSSDS, which allows them to be decoded with either a VEX
      or EVEX prefix.
      
      Fixes: 0153d98f ("x86/insn: Add misc instructions to x86 instruction decoder")
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Link: https://lore.kernel.org/r/20240502105853.5338-4-adrian.hunter@intel.com
    • x86/insn: Fix PUSH instruction in x86 instruction decoder opcode map · 59162e0c
      Adrian Hunter authored
      The x86 instruction decoder is used not only for decoding kernel
      instructions. It is also used by perf uprobes (user space probes) and by
      perf tools Intel Processor Trace decoding. Consequently, it needs to
      support instructions executed by user space also.
      
      Opcode 0x68 PUSH instruction is currently defined as 64-bit operand size
      only i.e. (d64). That was based on Intel SDM Opcode Map. However that is
      contradicted by the Instruction Set Reference section for PUSH in the
      same manual.
      
      Remove 64-bit operand size only annotation from opcode 0x68 PUSH
      instruction.
      
      Example:
      
        $ cat pushw.s
        .global  _start
        .text
        _start:
                pushw   $0x1234
                mov     $0x1,%eax   # system call number (sys_exit)
                int     $0x80
        $ as -o pushw.o pushw.s
        $ ld -s -o pushw pushw.o
        $ objdump -d pushw | tail -4
        0000000000401000 <.text>:
          401000:       66 68 34 12             pushw  $0x1234
          401004:       b8 01 00 00 00          mov    $0x1,%eax
          401009:       cd 80                   int    $0x80
        $ perf record -e intel_pt//u ./pushw
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.014 MB perf.data ]
      
       Before:
      
        $ perf script --insn-trace=disasm
        Warning:
        1 instruction trace errors
                 pushw   10349 [000] 10586.869237014:            401000 [unknown] (/home/ahunter/git/misc/rtit-tests/pushw)           pushw $0x1234
                 pushw   10349 [000] 10586.869237014:            401006 [unknown] (/home/ahunter/git/misc/rtit-tests/pushw)           addb %al, (%rax)
                 pushw   10349 [000] 10586.869237014:            401008 [unknown] (/home/ahunter/git/misc/rtit-tests/pushw)           addb %cl, %ch
                 pushw   10349 [000] 10586.869237014:            40100a [unknown] (/home/ahunter/git/misc/rtit-tests/pushw)           addb $0x2e, (%rax)
         instruction trace error type 1 time 10586.869237224 cpu 0 pid 10349 tid 10349 ip 0x40100d code 6: Trace doesn't match instruction
      
       After:
      
        $ perf script --insn-trace=disasm
                   pushw   10349 [000] 10586.869237014:            401000 [unknown] (./pushw)           pushw $0x1234
                   pushw   10349 [000] 10586.869237014:            401004 [unknown] (./pushw)           movl $1, %eax
      
      Fixes: eb13296c ("x86: Instruction decoder API")
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Link: https://lore.kernel.org/r/20240502105853.5338-3-adrian.hunter@intel.com
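The effect of the operand-size prefix on opcode 0x68 can be sketched as a tiny length calculator. It is an illustration only, not the kernel decoder: `push_imm_length()` is a hypothetical helper that handles just the optional 0x66 prefix followed by opcode 0x68. Treating the immediate as always 4 bytes gives a 6-byte length for `66 68 34 12`, which is consistent with the "Before" trace resuming at 0x401006 instead of 0x401004.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Length in bytes of a "PUSH imm" (opcode 0x68) instruction at buf, or 0 if
 * buf does not hold a complete one. Only the 0x66 prefix is recognised.
 */
static size_t push_imm_length(const uint8_t *buf, size_t len)
{
	size_t pos = 0;
	size_t imm_bytes = 4;	/* default: 32-bit immediate */

	assert(buf != NULL);
	if (pos < len && buf[pos] == 0x66) {
		imm_bytes = 2;	/* operand-size override: 16-bit immediate */
		pos++;
	}
	if (pos >= len || buf[pos] != 0x68)
		return 0;	/* not PUSH imm */
	pos++;
	if (len - pos < imm_bytes)
		return 0;	/* truncated immediate */
	return pos + imm_bytes;
}
```

Applied to the bytes from the objdump listing above (`66 68 34 12`), the function returns 4, so the next instruction starts at 0x401004 as in the "After" trace.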
    • x86/insn: Add Key Locker instructions to the opcode map · a5dd673a
      Chang S. Bae authored
      The x86 instruction decoder needs to know these new instructions that
      are going to be used in the crypto library as well as the x86 core
      code. Add the following:
      
      LOADIWKEY:
      	Load a CPU-internal wrapping key.
      
      ENCODEKEY128:
      	Wrap a 128-bit AES key to a key handle.
      
      ENCODEKEY256:
      	Wrap a 256-bit AES key to a key handle.
      
      AESENC128KL:
      	Encrypt a 128-bit block of data using a 128-bit AES key
      	indicated by a key handle.
      
      AESENC256KL:
      	Encrypt a 128-bit block of data using a 256-bit AES key
      	indicated by a key handle.
      
      AESDEC128KL:
      	Decrypt a 128-bit block of data using a 128-bit AES key
      	indicated by a key handle.
      
      AESDEC256KL:
      	Decrypt a 128-bit block of data using a 256-bit AES key
      	indicated by a key handle.
      
      AESENCWIDE128KL:
      	Encrypt 8 128-bit blocks of data using a 128-bit AES key
      	indicated by a key handle.
      
      AESENCWIDE256KL:
      	Encrypt 8 128-bit blocks of data using a 256-bit AES key
      	indicated by a key handle.
      
      AESDECWIDE128KL:
      	Decrypt 8 128-bit blocks of data using a 128-bit AES key
      	indicated by a key handle.
      
      AESDECWIDE256KL:
      	Decrypt 8 128-bit blocks of data using a 256-bit AES key
      	indicated by a key handle.
      
      Details can be found in the Intel Software Developer's Manual.
      Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Reviewed-by: Dan Williams <dan.j.williams@intel.com>
      Link: https://lore.kernel.org/r/20240502105853.5338-2-adrian.hunter@intel.com
    • Ingo Molnar's avatar
      ad112b3a
  5. 29 Apr, 2024 13 commits
  6. 28 Apr, 2024 6 commits
  7. 27 Apr, 2024 6 commits
    • Merge tag 'rust-fixes-6.9' of https://github.com/Rust-for-Linux/linux · 2c815938
      Linus Torvalds authored
      Pull Rust fixes from Miguel Ojeda:
      
       - Soundness: make internal functions generated by the 'module!' macro
         inaccessible, do not implement 'Zeroable' for 'Infallible' and
         require 'Send' for the 'Module' trait.
      
       - Build: avoid errors with "empty" files and workaround 'rustdoc' ICE.
      
       - Kconfig: depend on '!CFI_CLANG' and avoid selecting 'CONSTRUCTORS'.
      
       - Code docs: remove non-existing key from 'module!' macro example.
      
       - Docs: trivial rendering fix in arch table.
      
      * tag 'rust-fixes-6.9' of https://github.com/Rust-for-Linux/linux:
        rust: remove `params` from `module` macro example
        kbuild: rust: force `alloc` extern to allow "empty" Rust files
        kbuild: rust: remove unneeded `@rustc_cfg` to avoid ICE
        rust: kernel: require `Send` for `Module` implementations
        rust: phy: implement `Send` for `Registration`
        rust: make mutually exclusive with CFI_CLANG
        rust: macros: fix soundness issue in `module!` macro
        rust: init: remove impl Zeroable for Infallible
        docs: rust: fix improper rendering in Arch Support page
        rust: don't select CONSTRUCTORS
    • Merge tag 'riscv-for-linus-6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · 57865f39
      Linus Torvalds authored
      Pull RISC-V fixes from Palmer Dabbelt:
      
       - A fix for TASK_SIZE on rv64/NOMMU, to reflect the lack of user/kernel
         separation
      
       - A fix to avoid loading rv64/NOMMU kernel past the start of RAM
      
       - A fix for RISCV_HWPROBE_EXT_ZVFHMIN on ilp32 to avoid signed integer
         overflow in the bitmask
      
       - The sud_test kselftest has been fixed to properly swizzle the syscall
         number into the return register, which are not the same on RISC-V
      
       - A fix for a build warning in the perf tools on rv32
      
       - A fix for the CBO selftests, to avoid non-constants leaking into the
         inline asm
      
       - A pair of fixes for T-Head PBMT errata probing, which has been
         renamed MAE by the vendor
      
      * tag 'riscv-for-linus-6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
        RISC-V: selftests: cbo: Ensure asm operands match constraints, take 2
        perf riscv: Fix the warning due to the incompatible type
        riscv: T-Head: Test availability bit before enabling MAE errata
        riscv: thead: Rename T-Head PBMT to MAE
        selftests: sud_test: return correct emulated syscall value on RISC-V
        riscv: hwprobe: fix invalid sign extension for RISCV_HWPROBE_EXT_ZVFHMIN
        riscv: Fix loading 64-bit NOMMU kernels past the start of RAM
        riscv: Fix TASK_SIZE on 64-bit NOMMU
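The sign-extension problem mentioned for RISCV_HWPROBE_EXT_ZVFHMIN can be illustrated in isolation. The sketch below assumes the flag occupies bit 31, which the summary above does not state; `widen_signed()` and `flag_bit()` are hypothetical helpers, not kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Widening a 32-bit signed value copies its sign bit into bits 32..63. */
static uint64_t widen_signed(int32_t flag)
{
	return (uint64_t)(int64_t)flag;
}

/* Building the flag directly in a 64-bit unsigned type avoids that. */
static uint64_t flag_bit(unsigned int bit)
{
	assert(bit < 64);
	return UINT64_C(1) << bit;
}
```

Here `widen_signed(INT32_MIN)` produces 0xFFFFFFFF80000000, which sets every bit from 31 upwards, whereas `flag_bit(31)` produces 0x80000000 with only the intended bit set.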
    • Merge tag '6.9-rc5-cifs-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6 · d43df69f
      Linus Torvalds authored
      Pull smb client fixes from Steve French:
       "Three smb3 client fixes, all also for stable:
      
         - two small locking fixes spotted by Coverity
      
         - FILE_ALL_INFO and network_open_info packing fix"
      
      * tag '6.9-rc5-cifs-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6:
        smb3: fix lock ordering potential deadlock in cifs_sync_mid_result
        smb3: missing lock when picking channel
        smb: client: Fix struct_group() usage in __packed structs
    • Merge tag 'i2c-for-6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 5d12ed4b
      Linus Torvalds authored
      Pull i2c fixes from Wolfram Sang:
       "Fix a race condition in the at24 eeprom handler, a NULL pointer
        exception in the I2C core for controllers only using target modes,
        drop a MAINTAINERS entry, and fix an incorrect DT binding for at24"
      
      * tag 'i2c-for-6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        i2c: smbus: fix NULL function pointer dereference
        MAINTAINERS: Drop entry for PCA9541 bus master selector
        eeprom: at24: fix memory corruption race condition
        dt-bindings: eeprom: at24: Fix ST M24C64-D compatible schema
    • profiling: Remove create_prof_cpu_mask(). · 2e5449f4
      Tetsuo Handa authored
      create_prof_cpu_mask() is no longer used after commit 1f44a225 ("s390:
      convert interrupt handling to use generic hardirq").
      Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge tag 'soundwire-6.9-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire · 8a5c3ef7
      Linus Torvalds authored
      Pull soundwire fix from Vinod Koul:
      
       - Single AMD driver fix for wake interrupt handling in clockstop mode
      
      * tag 'soundwire-6.9-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire:
        soundwire: amd: fix for wake interrupt handling for clockstop mode