1. 13 Mar, 2023 17 commits
    • Thomas Richter's avatar
      perf list: Add PMU pai_ext event description for IBM z16 · d30baf2c
      Thomas Richter authored
      Add the event description for the IBM z16 pai_ext PMU released with
      commit c432fefe ("s390/pai: Add support for PAI Extension 1 NNPA
      counters")
      
      The document SA22-7832-13 "z/Architecture Principles of Operation",
      published May, 2022, contains the description of the
      Processor Activity Instrumentation Facility and the NNPA counter
      set., See Pages 5-113 to 5-116 and chapter 26 for details.
      Signed-off-by: default avatarThomas Richter <tmricht@linux.ibm.com>
      Acked-by: default avatarIan Rogers <irogers@google.com>
      Acked-by: default avatarSumanth Korikkar <sumanthk@linux.ibm.com>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Sven Schnelle <svens@linux.ibm.com>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Link: https://lore.kernel.org/r/20230308125326.2195613-1-tmricht@linux.ibm.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      d30baf2c
    • Thomas Richter's avatar
      perf vendor events s390: Add cache metrics for z16 · f8a6cea4
      Thomas Richter authored
      Add metrics for s390 z16
      
      - Percentage sourced from Level 2 cache
      - Percentage sourced from Level 3 on same chip cache
      - Percentage sourced from Level 4 Local cache on same book
      - Percentage sourced from Level 4 Remote cache on different book
      - Percentage sourced from memory
      
      For details about the formulas see this documentation:
      https://www.ibm.com/support/pages/system/files/inline-files/CPU%20MF%20Formulas%20including%20z16%20-%20May%202022_1.pdf
      
      Output after:
      
        # ./perf stat -M l4rp -- dd if=/dev/zero of=/dev/null bs=10M count=10K
        .... dd output deleted
      
        Performance counter stats for 'dd if=/dev/zero of=/dev/null bs=10M count=10K':
      
                        0      IDCW_OFF_DRAWER_CHIP_HIT         #     0.00 l4rp
                  431,866      L1I_DIR_WRITES
                    2,395      IDCW_OFF_DRAWER_IV
                        0      ICW_OFF_DRAWER
                        0      IDCW_OFF_DRAWER_DRAWER_HIT
                    1,437      DCW_OFF_DRAWER
              425,960,793      L1D_DIR_WRITES
      
             12.165030699 seconds time elapsed
      
              0.001037000 seconds user
             12.162140000 seconds sys
        #
      Signed-off-by: default avatarThomas Richter <tmricht@linux.ibm.com>
      Acked-by: default avatarIan Rogers <irogers@google.com>
      Acked-By: default avatarSumanth Korikkar <sumanthk@linux.ibm.com>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Sven Schnelle <svens@linux.ibm.com>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Link: https://lore.kernel.org/r/20230313080201.2440201-1-tmricht@linux.ibm.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      f8a6cea4
    • Thomas Richter's avatar
      perf vendor events s390: Add common metrics · 74395567
      Thomas Richter authored
      Add 3 metrics for s390 machines:
      
      - Cycles per instruction: Amount of CPU cycles used per instructions,
        named cpi.
      - Problem state ratio: Ratio of instructions executed in problem state
        compared to total number of instructions, named prbstate.
      - Level one instruction and data cache misses per 100 instructions,
        named l1mp.
      
      For details about the formulas see this documentation:
      https://www.ibm.com/support/pages/system/files/inline-files/CPU%20MF%20Formulas%20including%20z16%20-%20May%202022_1.pdf
      
      Output after:
      
        # ./perf stat -M cpi -- dd if=/dev/zero of=/dev/null bs=1M count=10K
        10240+0 records in
        10240+0 records out
        10737418240 bytes (11 GB, 10 GiB) copied, 1.30151 s, 8.2 GB/s
      
        Performance counter stats for 'dd if=/dev/zero of=/dev/null .....':
      
          6,779,778,802      CPU_CYCLES              #     1.96 cpi
          3,461,975,090      INSTRUCTIONS
      
          1.306873021 seconds time elapsed
      
          0.001034000 seconds user
          1.305677000 seconds sys
        #
      Signed-off-by: default avatarThomas Richter <tmricht@linux.ibm.com>
      Acked-by: default avatarIan Rogers <irogers@google.com>
      Acked-By: default avatarSumanth Korikkar <sumanthk@linux.ibm.com>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Sven Schnelle <svens@linux.ibm.com>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Link: https://lore.kernel.org/r/20230313080201.2440201-1-tmricht@linux.ibm.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      74395567
    • Ian Rogers's avatar
      perf parse-events: Warn when events are regrouped · a4c7d7c5
      Ian Rogers authored
      Use if an event is reordered or the number of groups increases to
      signal that regrouping has happened and warn about it. Disable the
      warning in the case wild card PMU names are used and for metrics.
      Signed-off-by: default avatarIan Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Florian Fischer <florian.fischer@muhq.space>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Kim Phillips <kim.phillips@amd.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Steinar H. Gunderson <sesse@google.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/20230312021543.3060328-2-irogers@google.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      a4c7d7c5
    • Ian Rogers's avatar
      perf evlist: Remove nr_groups · 9d2dc632
      Ian Rogers authored
      Maintaining the number of groups during event parsing is problematic
      and since changing to sort/regroup events can only be computed by a
      linear pass over the evlist. As the value is generally only used in
      tests, rather than hold it in a variable compute it by passing over
      the evlist when necessary.
      
      This change highlights that libpfm's counting of groups with a single
      entry disagreed with regular event parsing. The libpfm tests are
      updated accordingly.
      Signed-off-by: default avatarIan Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Florian Fischer <florian.fischer@muhq.space>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Kim Phillips <kim.phillips@amd.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Steinar H. Gunderson <sesse@google.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/20230312021543.3060328-2-irogers@google.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      9d2dc632
    • Ian Rogers's avatar
      perf evsel: Remove use_uncore_alias · e733f87e
      Ian Rogers authored
      This flag used to be used when regrouping uncore events in particular
      due to wildcard matches. This is now handled by sorting evlist and so
      the flag is redundant.
      Signed-off-by: default avatarIan Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Florian Fischer <florian.fischer@muhq.space>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Kim Phillips <kim.phillips@amd.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Steinar H. Gunderson <sesse@google.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/20230312021543.3060328-2-irogers@google.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      e733f87e
    • Ian Rogers's avatar
      perf parse-events: Sort and group parsed events · 347c2f0a
      Ian Rogers authored
      This change is intended to be a no-op for most current cases, the
      default sort order is the order the events were parsed. Where it
      varies is in how groups are handled. Previously an uncore and core
      event that are grouped would most often cause the group to be removed:
      
      ```
      $ perf stat -e '{instructions,uncore_imc_free_running_0/data_total/}' -a sleep 1
      WARNING: grouped events cpus do not match, disabling group:
        anon group { instructions, uncore_imc_free_running_0/data_total/ }
      ...
      ```
      
      However, when wildcards are used the events should be re-sorted and
      re-grouped in parse_events__set_leader, but this currently fails for
      simple examples:
      
      ```
      $ perf stat -e '{uncore_imc_free_running/data_read/,uncore_imc_free_running/data_write/}' -a sleep 1
      
       Performance counter stats for 'system wide':
      
           <not counted> MiB  uncore_imc_free_running/data_read/
           <not counted> MiB  uncore_imc_free_running/data_write/
      
             1.000996992 seconds time elapsed
      ```
      
      A futher failure mode, fixed in this patch, is to force topdown events
      into a group.
      
      This change moves sorting the evsels in the evlist after parsing. It
      requires parsing to set up groups. First the evsels are sorted
      respecting the existing groupings and parse order, but also reordering
      to ensure evsels of the same PMU and group appear together. So that
      software and aux events respect groups, their pmu_name is taken from
      the group leader. The sorting is done with list_sort removing a memory
      allocation.
      
      After sorting a pass is done to correct the group leaders and for
      topdown events ensuring they have a group leader.
      
      This fixes the problems seen before:
      
      ```
      $ perf stat -e '{uncore_imc_free_running/data_read/,uncore_imc_free_running/data_write/}' -a sleep 1
      
       Performance counter stats for 'system wide':
      
                  727.42 MiB  uncore_imc_free_running/data_read/
                   81.84 MiB  uncore_imc_free_running/data_write/
      
             1.000948615 seconds time elapsed
      ```
      
      As well as making groups not fail for cases like:
      
      ```
      $ perf stat -e '{imc_free_running_0/data_total/,imc_free_running_1/data_total/}' -a sleep 1
      
       Performance counter stats for 'system wide':
      
                  256.47 MiB  imc_free_running_0/data_total/
                  256.48 MiB  imc_free_running_1/data_total/
      
             1.001165442 seconds time elapsed
      ```
      Signed-off-by: default avatarIan Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Florian Fischer <florian.fischer@muhq.space>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Kim Phillips <kim.phillips@amd.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Steinar H. Gunderson <sesse@google.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/20230312021543.3060328-2-irogers@google.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      347c2f0a
    • Ian Rogers's avatar
      perf parse-events: Pass ownership of the group name · 4bb311b2
      Ian Rogers authored
      Pass ownership of the group name rather than copying and freeing the
      original. This saves a memory allocation and copy.
      Signed-off-by: default avatarIan Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Florian Fischer <florian.fischer@muhq.space>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Kim Phillips <kim.phillips@amd.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Steinar H. Gunderson <sesse@google.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/20230312021543.3060328-2-irogers@google.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      4bb311b2
    • Ian Rogers's avatar
      perf evsel: Add function to compute group PMU name · 7abf0bcc
      Ian Rogers authored
      The computed name respects software events and aux event groups, such
      that the pmu_name is changed to be that of the aux event leader or
      group leader for software events. This is done as a later change will
      split events that are in different PMUs into different groups.
      
      Committer notes:
      
      Added a stub for this new function so that 'perf test python' passes.
      Signed-off-by: default avatarIan Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Florian Fischer <florian.fischer@muhq.space>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Kim Phillips <kim.phillips@amd.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Steinar H. Gunderson <sesse@google.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/20230312021543.3060328-2-irogers@google.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      7abf0bcc
    • Ian Rogers's avatar
      perf evsel: Allow const evsel for certain accesses · c6d616fe
      Ian Rogers authored
      List sorting, added later to evlist, passes const elements requiring
      helper functions to also be const. Make the argument to
      evsel__find_pmu, evsel__is_aux_event and evsel__leader const.
      Signed-off-by: default avatarIan Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Florian Fischer <florian.fischer@muhq.space>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Kim Phillips <kim.phillips@amd.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Steinar H. Gunderson <sesse@google.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/20230312021543.3060328-2-irogers@google.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      c6d616fe
    • Ian Rogers's avatar
      perf stat: Modify the group test · ce5b8590
      Ian Rogers authored
      Currently nr_members is 0 for an event with no group, however, they
      are always a leader of their own group. A later change will make that
      count 1 because the event is its own leader. Make the find_stat logic
      consistent with this, an improvement suggested by Namhyung Kim.
      Suggested-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Signed-off-by: default avatarIan Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Florian Fischer <florian.fischer@muhq.space>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Kim Phillips <kim.phillips@amd.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Steinar H. Gunderson <sesse@google.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/20230312021543.3060328-2-irogers@google.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      ce5b8590
    • Ian Rogers's avatar
      perf pmu: Earlier PMU auxtrace initialization · 3c7b84d4
      Ian Rogers authored
      This allows event parsing to use the evsel__is_aux_event function,
      which is important when determining event grouping.
      Suggested-by: default avatarAdrian Hunter <adrian.hunter@intel.com>
      Signed-off-by: default avatarIan Rogers <irogers@google.com>
      Acked-by: default avatarAdrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Florian Fischer <florian.fischer@muhq.space>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Kim Phillips <kim.phillips@amd.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Steinar H. Gunderson <sesse@google.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/20230312021543.3060328-2-irogers@google.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      3c7b84d4
    • Ian Rogers's avatar
      perf stat: Don't remove all grouped events when CPU maps disagree · bc6c6cdc
      Ian Rogers authored
      If the events in an evlist's CPU map differ then the entire group is
      removed. For example:
      
      ```
      $ perf stat -e '{imc_free_running/data_read/,imc_free_running/data_write/,cs}' -a sleep 1
      WARNING: grouped events cpus do not match, disabling group:
        anon group { imc_free_running/data_read/, imc_free_running/data_write/, cs }
      ```
      
      Change the behavior so that just the events not matching the leader
      are removed. So in the example above, just 'cs' will be removed.
      
      Modify the warning so that it is produced once for each group, rather
      than once for the entire evlist. Shrink the scope and size of the
      warning text buffer.
      Signed-off-by: default avatarIan Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Florian Fischer <florian.fischer@muhq.space>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Kim Phillips <kim.phillips@amd.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Steinar H. Gunderson <sesse@google.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/20230312021543.3060328-2-irogers@google.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      bc6c6cdc
    • Ian Rogers's avatar
      libperf evlist: Avoid a use of evsel idx · 5dd827e0
      Ian Rogers authored
      Setting the leader iterates the list, so rather than use idx (which
      may be changed through list reordering) just count the elements and
      set afterwards.
      Signed-off-by: default avatarIan Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Florian Fischer <florian.fischer@muhq.space>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Kim Phillips <kim.phillips@amd.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Steinar H. Gunderson <sesse@google.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/20230312021543.3060328-2-irogers@google.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      5dd827e0
    • Changbin Du's avatar
      perf ftrace: Reuse target::initial_delay · f9f60efb
      Changbin Du authored
      Replace perf_ftrace::initial_delay with target::initial_delay.
      Specifying a negative initial_delay is meaningless for ftrace in
      practice but allowed here.
      Signed-off-by: default avatarChangbin Du <changbin.du@huawei.com>
      Acked-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Hui Wang <hw.huiwang@huawei.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20230302031146.2801588-4-changbin.du@huawei.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      f9f60efb
    • Changbin Du's avatar
      perf record: Reuse target::initial_delay · cb4b9e68
      Changbin Du authored
      This just simply replace record_opts::initial_delay with
      target::initial_delay. Nothing else is changed.
      Signed-off-by: default avatarChangbin Du <changbin.du@huawei.com>
      Acked-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Hui Wang <hw.huiwang@huawei.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20230302031146.2801588-3-changbin.du@huawei.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      cb4b9e68
    • Kan Liang's avatar
      perf record: Fix "read LOST count failed" msg with sample read · 07d85ba9
      Kan Liang authored
      Hundreds of "read LOST count failed" error messages may be displayed,
      when the below command is launched.
      
      perf record -e '{cpu/mem-loads-aux/,cpu/event=0xcd,umask=0x1/}:S' -a
      
      According to the commit 89e3106f ("libperf: Handle read format
      in perf_evsel__read()"), the PERF_FORMAT_GROUP is only available for
      the leader. However, the record__read_lost_samples() goes through every
      entry of an evlist, which includes both leader and member. The member
      event errors out and triggers the error message. Since there may be
      hundreds of CPUs on a server, the message will be printed hundreds of
      times, which is very annoying.
      
      The message itself is correct, but the pr_err is a overkill. Other error
      messages in the record__read_lost_samples() are all pr_debug. To make
      the output format consistent, change the pr_err("read LOST count
      failed\n"); to pr_debug("read LOST count failed\n");.
      User can still get the message via -v option.
      
      Fixes: e3a23261 ("perf record: Read and inject LOST_SAMPLES events")
      Signed-off-by: default avatarKan Liang <kan.liang@linux.intel.com>
      Acked-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20230301150413.27011-1-kan.liang@linux.intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      07d85ba9
  2. 10 Mar, 2023 17 commits
    • Arnaldo Carvalho de Melo's avatar
      Merge remote-tracking branch 'acme/perf-tools' into perf-tools-next · b8fa3e38
      Arnaldo Carvalho de Melo authored
      To pick up perf-tools fixes just merged upstream.
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      b8fa3e38
    • Linus Torvalds's avatar
      Merge tag 'riscv-for-linus-6.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · 55a21105
      Linus Torvalds authored
      Pull RISC-V fixes from Palmer Dabbelt:
      
       - RISC-V architecture-specific ELF attributes have been disabled in the
         kernel builds
      
       - A fix for a locking failure while during errata patching that
         manifests on SiFive-based systems
      
       - A fix for a KASAN failure during stack unwinding
      
       - A fix for some lockdep failures during text patching
      
      * tag 'riscv-for-linus-6.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
        RISC-V: Don't check text_mutex during stop_machine
        riscv: Use READ_ONCE_NOCHECK in imprecise unwinding stack mode
        RISC-V: fix taking the text_mutex twice during sifive errata patching
        RISC-V: Stop emitting attributes
      55a21105
    • Linus Torvalds's avatar
      Merge tag 'drm-fixes-2023-03-10' of git://anongit.freedesktop.org/drm/drm · b0d14d2a
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Weekly fixes.
      
        msm and amdgpu are the vast majority of these, otherwise some
        straggler misc from last week for nouveau and cirrus and a mailmap
        update for a drm developer.
      
        mailmap:
         - add an entry
      
        nouveau:
         - fix system shutdown regression
         - build warning fix
      
        cirrus:
         - NULL ptr deref fix
      
        msm:
         - fix invalid ptr free in syncobj cleanup
         - sync GMU removal in teardown
         - a5xx preemption fixes
         - fix runpm imbalance
         - DPU hw fixes
         - stack corruption fix
         - clear DSPP reservation
      
        amdgpu:
         - Misc display fixes
         - UMC 8.10 fixes
         - Driver unload fixes
         - NBIO 7.3.0 fix
         - Error checking fixes for soc15, nv, soc21 read register interface
         - Fix video cap query for VCN 4.0.4
      
        amdkfd:
         - Fix return check in doorbell handling"
      
      * tag 'drm-fixes-2023-03-10' of git://anongit.freedesktop.org/drm/drm: (42 commits)
        drm/amdgpu/soc21: Add video cap query support for VCN_4_0_4
        drm/amdgpu: fix error checking in amdgpu_read_mm_registers for nv
        drm/amdgpu: fix error checking in amdgpu_read_mm_registers for soc21
        drm/amdgpu: fix error checking in amdgpu_read_mm_registers for soc15
        drm/amdgpu: Fix the warning info when removing amdgpu device
        drm/amdgpu: fix return value check in kfd
        drm/amd: Fix initialization mistake for NBIO 7.3.0
        drm/amdgpu: Fix call trace warning and hang when removing amdgpu device
        mailmap: add mailmap entries for Faith.
        drm/msm: DEVFREQ_GOV_SIMPLE_ONDEMAND is no longer needed
        drm/amd/display: Update clock table to include highest clock setting
        drm/amd/pm: Enable ecc_info table support for smu v13_0_10
        drm/amdgpu: Support umc node harvest config on umc v8_10
        drm/connector: print max_requested_bpc in state debugfs
        drm/display: Don't block HDR_OUTPUT_METADATA on unknown EOTF
        drm/msm/dpu: clear DSPP reservations in rm release
        drm/msm/disp/dpu: fix sc7280_pp base offset
        drm/msm/dpu: fix stack smashing in dpu_hw_ctl_setup_blendstage
        drm/msm/dpu: don't use DPU_CLK_CTRL_CURSORn for DMA SSPP clocks
        drm/msm/dpu: fix clocks settings for msm8998 SSPP blocks
        ...
      b0d14d2a
    • Linus Torvalds's avatar
      Merge tag 'erofs-for-6.3-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs · 388a8101
      Linus Torvalds authored
      Pull erofs fixes from Gao Xiang:
       "The most important one reverts an improper fix which can cause an
        unexpected warning more often on specific images, and another one
        fixes LZMA decompression on 32-bit platforms. The others are minor
        fixes and cleanups.
      
         - Fix LZMA decompression failure on HIGHMEM platforms
      
         - Revert an inproper fix since it is actually an implementation issue
           of vmalloc()
      
         - Avoid a wrong DBG_BUGON since it could be triggered with -EINTR
      
         - Minor cleanups"
      
      * tag 'erofs-for-6.3-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
        erofs: use wrapper i_blocksize() in erofs_file_read_iter()
        erofs: get rid of a useless DBG_BUGON
        erofs: Revert "erofs: fix kvcalloc() misuse with __GFP_NOFAIL"
        erofs: fix wrong kunmap when using LZMA on HIGHMEM platforms
        erofs: mark z_erofs_lzma_init/erofs_pcpubuf_init w/ __init
      388a8101
    • Linus Torvalds's avatar
      Merge tag 'nfsd-6.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux · 92cadfcf
      Linus Torvalds authored
      Pull nfsd fixes from Chuck Lever:
      
       - Protect NFSD writes against filesystem freezing
      
       - Fix a potential memory leak during server shutdown
      
      * tag 'nfsd-6.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
        SUNRPC: Fix a server shutdown leak
        NFSD: Protect against filesystem freezing
      92cadfcf
    • Linus Torvalds's avatar
      Merge tag 'for-6.3-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · ae195ca1
      Linus Torvalds authored
      Pull btrfs fixes from David Sterba:
       "First batch of fixes. Among them there are two updates to sysfs and
        ioctl which are not strictly fixes but are used for testing so there's
        no reason to delay them.
      
         - fix block group item corruption after inserting new block group
      
         - fix extent map logging bit not cleared for split maps after
           dropping range
      
         - fix calculation of unusable block group space reporting bogus
           values due to 32/64b division
      
         - fix unnecessary increment of read error stat on write error
      
         - improve error handling in inode update
      
         - export per-device fsid in DEV_INFO ioctl to distinguish seeding
           devices, needed for testing
      
         - allocator size classes:
            - fix potential dead lock in size class loading logic
            - print sysfs stats for the allocation classes"
      
      * tag 'for-6.3-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        btrfs: fix block group item corruption after inserting new block group
        btrfs: fix extent map logging bit not cleared for split maps after dropping range
        btrfs: fix percent calculation for bg reclaim message
        btrfs: fix unnecessary increment of read error stat on write error
        btrfs: handle btrfs_del_item errors in __btrfs_update_delayed_inode
        btrfs: ioctl: return device fsid from DEV_INFO ioctl
        btrfs: fix potential dead lock in size class loading logic
        btrfs: sysfs: add size class stats
      ae195ca1
    • Linus Torvalds's avatar
      Merge tag 'io_uring-6.3-2023-03-09' of git://git.kernel.dk/linux · f331c5de
      Linus Torvalds authored
      Pull io_uring fixes from Jens Axboe:
      
       - Stop setting PF_NO_SETAFFINITY on io-wq workers.
      
         This has been reported in the past as it confuses some applications,
         as some of their threads will fail with -1/EINVAL if attempted
         affinitized. Most recent report was on cpusets, where enabling that
         with io-wq workers active will fail.
      
         Just deal with the mask changing by checking when a worker times out,
         and then exit if we have no work pending.
      
       - Fix an issue with passthrough support where we don't properly check
         if the file type has pollable uring_cmd support.
      
       - Fix a reported W=1 warning on a variable being set and unused. Add a
         special helper for iterating these lists that doesn't save the
         previous list element, if that iterator never ends up using it.
      
      * tag 'io_uring-6.3-2023-03-09' of git://git.kernel.dk/linux:
        io_uring: silence variable ‘prev’ set but not used warning
        io_uring/uring_cmd: ensure that device supports IOPOLL
        io_uring/io-wq: stop setting PF_NO_SETAFFINITY on io-wq workers
      f331c5de
    • Linus Torvalds's avatar
      Merge tag 'perf-tools-fixes-for-v6.3-1-2023-03-09' of... · 49be4fb2
      Linus Torvalds authored
      Merge tag 'perf-tools-fixes-for-v6.3-1-2023-03-09' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
      
      Pull perf tools fixes from Arnaldo Carvalho de Melo:
      
       - Add Adrian Hunter to MAINTAINERS as a perf tools reviewer
      
       - Sync various tools/ copies of kernel headers with the kernel sources,
         this time trying to avoid first merging with upstream to then update
         but instead copy from upstream so that a merge is avoided and the end
         result after merging this pull request is the one expected,
         tools/perf/check-headers.sh (mostly) happy, less warnings while
         building tools/perf/
      
       - Fix counting when initial delay configured by setting
         perf_attr.enable_on_exec when starting workloads from the perf
         command line
      
       - Don't avoid emitting a PERF_RECORD_MMAP2 in 'perf inject
         --buildid-all' when that record comes with a build-id, otherwise we
         end up not being able to resolve symbols
      
       - Don't use comma as the CSV output separator the "stat+csv_output"
         test, as comma can appear on some tests as a modifier for an event,
         use @ instead, ditto for the JSON linter test
      
       - The offcpu test was looking for some bits being set on
         task_struct->prev_state without masking other bits not important for
         this specific 'perf test', fix it
      
      * tag 'perf-tools-fixes-for-v6.3-1-2023-03-09' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
        perf tools: Add Adrian Hunter to MAINTAINERS as a reviewer
        tools headers UAPI: Sync linux/perf_event.h with the kernel sources
        tools headers x86 cpufeatures: Sync with the kernel sources
        tools include UAPI: Sync linux/vhost.h with the kernel sources
        tools arch x86: Sync the msr-index.h copy with the kernel sources
        tools headers kvm: Sync uapi/{asm/linux} kvm.h headers with the kernel sources
        tools include UAPI: Synchronize linux/fcntl.h with the kernel sources
        tools headers: Synchronize {linux,vdso}/bits.h with the kernel sources
        tools headers UAPI: Sync linux/prctl.h with the kernel sources
        tools headers: Update the copy of x86's mem{cpy,set}_64.S used in 'perf bench'
        perf stat: Fix counting when initial delay configured
        tools headers svm: Sync svm headers with the kernel sources
        perf test: Avoid counting commas in json linter
        perf tests stat+csv_output: Switch CSV separator to @
        perf inject: Fix --buildid-all not to eat up MMAP2
        tools arch x86: Sync the msr-index.h copy with the kernel sources
        perf test: Fix offcpu test prev_state check
      49be4fb2
    • Dave Airlie's avatar
      Merge tag 'amd-drm-fixes-6.3-2023-03-09' of... · 519b2331
      Dave Airlie authored
      Merge tag 'amd-drm-fixes-6.3-2023-03-09' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
      
      amd-drm-fixes-6.3-2023-03-09:
      
      amdgpu:
      - Misc display fixes
      - UMC 8.10 fixes
      - Driver unload fixes
      - NBIO 7.3.0 fix
      - Error checking fixes for soc15, nv, soc21 read register interface
      - Fix video cap query for VCN 4.0.4
      
      amdkfd:
      - Fix return check in doorbell handling
      Signed-off-by: default avatarDave Airlie <airlied@redhat.com>
      From: Alex Deucher <alexander.deucher@amd.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20230310031314.1296929-1-alexander.deucher@amd.com
      519b2331
    • Veerabadhran Gopalakrishnan's avatar
      drm/amdgpu/soc21: Add video cap query support for VCN_4_0_4 · 6ce2ea07
      Veerabadhran Gopalakrishnan authored
      Added the video capability query support for VCN version 4_0_4
      Signed-off-by: default avatarVeerabadhran Gopalakrishnan <veerabadhran.gopalakrishnan@amd.com>
      Reviewed-by: default avatarLeo Liu <leo.liu@amd.com>
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Cc: stable@vger.kernel.org # 6.1.x
      6ce2ea07
    • Alex Deucher's avatar
      drm/amdgpu: fix error checking in amdgpu_read_mm_registers for nv · b42fee5e
      Alex Deucher authored
      Properly skip non-existent registers as well.
      
      Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2442Reviewed-by: default avatarHawking Zhang <Hawking.Zhang@amd.com>
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Cc: stable@vger.kernel.org
      b42fee5e
    • Alex Deucher's avatar
      drm/amdgpu: fix error checking in amdgpu_read_mm_registers for soc21 · 2915e43a
      Alex Deucher authored
      Properly skip non-existent registers as well.
      
      Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2442Reviewed-by: default avatarHawking Zhang <Hawking.Zhang@amd.com>
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Cc: stable@vger.kernel.org
      2915e43a
    • Alex Deucher's avatar
      drm/amdgpu: fix error checking in amdgpu_read_mm_registers for soc15 · 0dcdf849
      Alex Deucher authored
      Properly skip non-existent registers as well.
      
      Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2442Reviewed-by: default avatarHawking Zhang <Hawking.Zhang@amd.com>
      Reviewed-by: default avatarEvan Quan <evan.quan@amd.com>
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Cc: stable@vger.kernel.org
      0dcdf849
    • lyndonli's avatar
      drm/amdgpu: Fix the warning info when removing amdgpu device · 8879ec6d
      lyndonli authored
      Actually, the drm_dev_enter in psp_cmd_submit_buf does not
      protect anything. If DRM device is unplugged, it will always
      check the condition in WARN_ON. So drop drm_dev_enter and
      drm_dev_exit in psp_cmd_submit_buf.
      
      When removing amdgpu, the calling order is as follows:
      amdgpu_pci_remove
          drm_dev_unplug
          amdgpu_driver_unload_kms
              amdgpu_device_fini_hw
                  amdgpu_device_ip_fini_early
                      psp_hw_fini
                          psp_ras_terminate
                              psp_ta_unloadye
                                  psp_cmd_submit_buf
      
      [ 4507.740388] Call Trace:
      [ 4507.740389]  <TASK>
      [ 4507.740391]  psp_ta_unload+0x44/0x70 [amdgpu]
      [ 4507.740485]  psp_ras_terminate+0x4d/0x70 [amdgpu]
      [ 4507.740575]  psp_hw_fini+0x28/0xa0 [amdgpu]
      [ 4507.740662]  amdgpu_device_fini_hw+0x328/0x442 [amdgpu]
      [ 4507.740791]  amdgpu_driver_unload_kms+0x51/0x60 [amdgpu]
      [ 4507.740875]  amdgpu_pci_remove+0x5a/0x140 [amdgpu]
      [ 4507.740962]  ? _raw_spin_unlock_irqrestore+0x27/0x43
      [ 4507.740965]  ? __pm_runtime_resume+0x60/0x90
      [ 4507.740968]  pci_device_remove+0x39/0xb0
      [ 4507.740971]  device_remove+0x46/0x70
      [ 4507.740972]  device_release_driver_internal+0xd1/0x160
      [ 4507.740974]  driver_detach+0x4a/0x90
      [ 4507.740975]  bus_remove_driver+0x6c/0xf0
      [ 4507.740976]  driver_unregister+0x31/0x50
      [ 4507.740977]  pci_unregister_driver+0x40/0x90
      [ 4507.740978]  amdgpu_exit+0x15/0x120 [amdgpu]
      
      v2: fix commit message style issue
      Signed-off-by: default avatarlyndonli <Lyndon.Li@amd.com>
      Reviewed-by: default avatarGuchun Chen <guchun.chen@amd.com>
      Acked-by: default avatarChristian König <christian.koenig@amd.com>
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      8879ec6d
    • Shashank Sharma's avatar
      drm/amdgpu: fix return value check in kfd · 20534dbc
      Shashank Sharma authored
      This patch fixes a return value check in kfd doorbell handling.
      This function should return 0(error) only when the ida_simple_get
      returns < 0(error), return > 0 is a success case.
      
      Cc: Felix Kuehling <Felix.Kuehling@amd.com>
      Cc: Alex Deucher <alexander.deucher@amd.com>
      Fixes: 16f00131 ("drm/amdkfd: Allocate doorbells only when needed")
      Acked-by: default avatarChristian Koenig <chriatian.koenig@amd.com>
      Reviewed-by: default avatarFelix Kuehling <Felix.Kuehling@amd.com>
      Signed-off-by: default avatarShashank Sharma <shashank.sharma@amd.com>
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      20534dbc
    • Mario Limonciello's avatar
      drm/amd: Fix initialization mistake for NBIO 7.3.0 · 1717cc5f
      Mario Limonciello authored
      The same strapping initialization issue that happened on NBIO 7.5.1
      appears to be happening on NBIO 7.3.0.
      Apply the same fix to 7.3.0 as well.
      
      Note: This workaround relies upon the integrated GPU being enabled
      in BIOS. If the integrated GPU is disabled in BIOS a different
      workaround will be required.
      Reported-by: default avatarThomas Glanzmann <thomas@glanzmann.de>
      Cc: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
      Link: https://lore.kernel.org/linux-usb/Y%2Fz9GdHjPyF2rNG3@glanzmann.de/T/#uSigned-off-by: default avatarMario Limonciello <mario.limonciello@amd.com>
      Reviewed-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      1717cc5f
    • lyndonli's avatar
      drm/amdgpu: Fix call trace warning and hang when removing amdgpu device · 93bb18d2
      lyndonli authored
      On GPUs with RAS enabled, below call trace and hang are observed when
      shutting down device.
      
      v2: use DRM device unplugged flag instead of shutdown flag as the check to
      prevent memory wipe in shutdown stage.
      
      [ +0.000000] RIP: 0010:amdgpu_vram_mgr_fini+0x18d/0x1c0 [amdgpu]
      [ +0.000001] PKRU: 55555554
      [ +0.000001] Call Trace:
      [ +0.000001] <TASK>
      [ +0.000002] amdgpu_ttm_fini+0x140/0x1c0 [amdgpu]
      [ +0.000183] amdgpu_bo_fini+0x27/0xa0 [amdgpu]
      [ +0.000184] gmc_v11_0_sw_fini+0x2b/0x40 [amdgpu]
      [ +0.000163] amdgpu_device_fini_sw+0xb6/0x510 [amdgpu]
      [ +0.000152] amdgpu_driver_release_kms+0x16/0x30 [amdgpu]
      [ +0.000090] drm_dev_release+0x28/0x50 [drm]
      [ +0.000016] devm_drm_dev_init_release+0x38/0x60 [drm]
      [ +0.000011] devm_action_release+0x15/0x20
      [ +0.000003] release_nodes+0x40/0xc0
      [ +0.000001] devres_release_all+0x9e/0xe0
      [ +0.000001] device_unbind_cleanup+0x12/0x80
      [ +0.000003] device_release_driver_internal+0xff/0x160
      [ +0.000001] driver_detach+0x4a/0x90
      [ +0.000001] bus_remove_driver+0x6c/0xf0
      [ +0.000001] driver_unregister+0x31/0x50
      [ +0.000001] pci_unregister_driver+0x40/0x90
      [ +0.000003] amdgpu_exit+0x15/0x120 [amdgpu]
      Signed-off-by: default avatarlyndonli <Lyndon.Li@amd.com>
      Reviewed-by: default avatarGuchun Chen <guchun.chen@amd.com>
      Reviewed-by: default avatarChristian König <christian.koenig@amd.com>
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      93bb18d2
  3. 09 Mar, 2023 6 commits
    • Conor Dooley's avatar
      RISC-V: Don't check text_mutex during stop_machine · 2a8db5ec
      Conor Dooley authored
      We're currently using stop_machine() to update ftrace & kprobes, which
      means that the thread that takes text_mutex during may not be the same
      as the thread that eventually patches the code.  This isn't actually a
      race because the lock is still held (preventing any other concurrent
      accesses) and there is only one thread running during stop_machine(),
      but it does trigger a lockdep failure.
      
      This patch just elides the lockdep check during stop_machine.
      
      Fixes: c15ac4fd ("riscv/ftrace: Add dynamic function tracer support")
      Suggested-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Reported-by: default avatarChangbin Du <changbin.du@gmail.com>
      Signed-off-by: default avatarPalmer Dabbelt <palmerdabbelt@google.com>
      Signed-off-by: default avatarConor Dooley <conor.dooley@microchip.com>
      Link: https://lore.kernel.org/r/20230303143754.4005217-1-conor.dooley@microchip.comSigned-off-by: default avatarPalmer Dabbelt <palmer@rivosinc.com>
      2a8db5ec
    • Alexandre Ghiti's avatar
      riscv: Use READ_ONCE_NOCHECK in imprecise unwinding stack mode · 76950340
      Alexandre Ghiti authored
      When CONFIG_FRAME_POINTER is unset, the stack unwinding function
      walk_stackframe randomly reads the stack and then, when KASAN is enabled,
      it can lead to the following backtrace:
      
      [    0.000000] ==================================================================
      [    0.000000] BUG: KASAN: stack-out-of-bounds in walk_stackframe+0xa6/0x11a
      [    0.000000] Read of size 8 at addr ffffffff81807c40 by task swapper/0
      [    0.000000]
      [    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 6.2.0-12919-g24203e6db61f #43
      [    0.000000] Hardware name: riscv-virtio,qemu (DT)
      [    0.000000] Call Trace:
      [    0.000000] [<ffffffff80007ba8>] walk_stackframe+0x0/0x11a
      [    0.000000] [<ffffffff80099ecc>] init_param_lock+0x26/0x2a
      [    0.000000] [<ffffffff80007c4a>] walk_stackframe+0xa2/0x11a
      [    0.000000] [<ffffffff80c49c80>] dump_stack_lvl+0x22/0x36
      [    0.000000] [<ffffffff80c3783e>] print_report+0x198/0x4a8
      [    0.000000] [<ffffffff80099ecc>] init_param_lock+0x26/0x2a
      [    0.000000] [<ffffffff80007c4a>] walk_stackframe+0xa2/0x11a
      [    0.000000] [<ffffffff8015f68a>] kasan_report+0x9a/0xc8
      [    0.000000] [<ffffffff80007c4a>] walk_stackframe+0xa2/0x11a
      [    0.000000] [<ffffffff80007c4a>] walk_stackframe+0xa2/0x11a
      [    0.000000] [<ffffffff8006e99c>] desc_make_final+0x80/0x84
      [    0.000000] [<ffffffff8009a04e>] stack_trace_save+0x88/0xa6
      [    0.000000] [<ffffffff80099fc2>] filter_irq_stacks+0x72/0x76
      [    0.000000] [<ffffffff8006b95e>] devkmsg_read+0x32a/0x32e
      [    0.000000] [<ffffffff8015ec16>] kasan_save_stack+0x28/0x52
      [    0.000000] [<ffffffff8006e998>] desc_make_final+0x7c/0x84
      [    0.000000] [<ffffffff8009a04a>] stack_trace_save+0x84/0xa6
      [    0.000000] [<ffffffff8015ec52>] kasan_set_track+0x12/0x20
      [    0.000000] [<ffffffff8015f22e>] __kasan_slab_alloc+0x58/0x5e
      [    0.000000] [<ffffffff8015e7ea>] __kmem_cache_create+0x21e/0x39a
      [    0.000000] [<ffffffff80e133ac>] create_boot_cache+0x70/0x9c
      [    0.000000] [<ffffffff80e17ab2>] kmem_cache_init+0x6c/0x11e
      [    0.000000] [<ffffffff80e00fd6>] mm_init+0xd8/0xfe
      [    0.000000] [<ffffffff80e011d8>] start_kernel+0x190/0x3ca
      [    0.000000]
      [    0.000000] The buggy address belongs to stack of task swapper/0
      [    0.000000]  and is located at offset 0 in frame:
      [    0.000000]  stack_trace_save+0x0/0xa6
      [    0.000000]
      [    0.000000] This frame has 1 object:
      [    0.000000]  [32, 56) 'c'
      [    0.000000]
      [    0.000000] The buggy address belongs to the physical page:
      [    0.000000] page:(____ptrval____) refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x81a07
      [    0.000000] flags: 0x1000(reserved|zone=0)
      [    0.000000] raw: 0000000000001000 ff600003f1e3d150 ff600003f1e3d150 0000000000000000
      [    0.000000] raw: 0000000000000000 0000000000000000 00000001ffffffff
      [    0.000000] page dumped because: kasan: bad access detected
      [    0.000000]
      [    0.000000] Memory state around the buggy address:
      [    0.000000]  ffffffff81807b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [    0.000000]  ffffffff81807b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [    0.000000] >ffffffff81807c00: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 f3
      [    0.000000]                                            ^
      [    0.000000]  ffffffff81807c80: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
      [    0.000000]  ffffffff81807d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [    0.000000] ==================================================================
      
      Fix that by using READ_ONCE_NOCHECK when reading the stack in imprecise
      mode.
      
      Fixes: 5d8544e2 ("RISC-V: Generic library routines and assembly")
      Reported-by: default avatarChathura Rajapaksha <chathura.abeyrathne.lk@gmail.com>
      Link: https://lore.kernel.org/all/CAD7mqryDQCYyJ1gAmtMm8SASMWAQ4i103ptTb0f6Oda=tPY2=A@mail.gmail.com/Suggested-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarAlexandre Ghiti <alexghiti@rivosinc.com>
      Link: https://lore.kernel.org/r/20230308091639.602024-1-alexghiti@rivosinc.comSigned-off-by: default avatarPalmer Dabbelt <palmer@rivosinc.com>
      76950340
    • Dave Airlie's avatar
      Merge tag 'drm-msm-fixes-2023-03-09' of https://gitlab.freedesktop.org/drm/msm into drm-fixes · 3a43e30b
      Dave Airlie authored
      msm-fixes for v6.3-rc2
      
      - Fix for possible invalid ptr free in submit ioctl syncobj cleanup path.
      - Synchronize GMU removal in driver teardown path
      - a5xx preemption fixes
      - Fix runpm imbalance at unbind
      - DPU hw catalog fixes:
       - set DPU_MDP_PERIPH_0_REMOVED for sc8280xp as this is another chipset
         where the PERIPH_0 block of registers is not there
       - fix the DPU features supported in QCM2290 by comparing it with the
         downstream device tree
       - fix the length of registers in the sc7180_ctl from 0xe4 to 0x1dc
       - fix the max mixer line width for sm6115 and qcm2290 chipsets in the
         DPU catalog
       - fix the scaler version on sm8550, sc8280xp, sm8450, sm8250, sm8350
         and sm6115. This was incorrectly populated on the SW version of the
         scaler library and  not the scaler HW version
       - Drop dim layer support for msm8998 as its not indicated to be
         supported in the downstream DTSI
       - fix the DPU_CLK_CTRL bits for msm 8998 sspp blocks
       - Use DPU_CLK_CTRL_DMA* prefix instead of DPU_CLK_CTRL_CURSOR*
         for all chipsets for the DMA sspp blocks
       - fix the ping-pong block base address for sc7280 in the DPU HW catalog
      - Fix stack corruption issue in the dpu_hw_ctl_setup_blendstage() function
        as it was causing a negative left shift by protecting against an invalid
        index
      - Clear the DSPP reservations in dpu_rm_release(). This was missed out and
        as as result the DSPP was not released from the resource manager global
        state.
      Signed-off-by: default avatarDave Airlie <airlied@redhat.com>
      From: Rob Clark <robdclark@gmail.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGvH+VH_Wx3mFMG51CMnoiU06CM-+-WMhM73M42Qx7Bp4A@mail.gmail.com
      3a43e30b
    • Dave Airlie's avatar
      b2bda460
    • Linus Torvalds's avatar
      Merge tag 'net-6.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 44889ba5
      Linus Torvalds authored
      Pull networking fixes from Paolo Abeni:
       "Including fixes from netfilter and bpf.
      
        Current release - regressions:
      
         - core: avoid skb end_offset change in __skb_unclone_keeptruesize()
      
         - sched:
            - act_connmark: handle errno on tcf_idr_check_alloc
            - flower: fix fl_change() error recovery path
      
         - ieee802154: prevent user from crashing the host
      
        Current release - new code bugs:
      
         - eth: bnxt_en: fix the double free during device removal
      
         - tools: ynl:
            - fix enum-as-flags in the generic CLI
            - fully inherit attrs in subsets
            - re-license uniformly under GPL-2.0 or BSD-3-clause
      
        Previous releases - regressions:
      
         - core: use indirect calls helpers for sk_exit_memory_pressure()
      
         - tls:
            - fix return value for async crypto
            - avoid hanging tasks on the tx_lock
      
         - eth: ice: copy last block omitted in ice_get_module_eeprom()
      
        Previous releases - always broken:
      
         - core: avoid double iput when sock_alloc_file fails
      
         - af_unix: fix struct pid leaks in OOB support
      
         - tls:
            - fix possible race condition
            - fix device-offloaded sendpage straddling records
      
         - bpf:
            - sockmap: fix an infinite loop error
            - test_run: fix &xdp_frame misplacement for LIVE_FRAMES
            - fix resolving BTF_KIND_VAR after ARRAY, STRUCT, UNION, PTR
      
         - netfilter: tproxy: fix deadlock due to missing BH disable
      
         - phylib: get rid of unnecessary locking
      
         - eth: bgmac: fix *initial* chip reset to support BCM5358
      
         - eth: nfp: fix csum for ipsec offload
      
         - eth: mtk_eth_soc: fix RX data corruption issue
      
        Misc:
      
         - usb: qmi_wwan: add telit 0x1080 composition"
      
      * tag 'net-6.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (64 commits)
        tools: ynl: fix enum-as-flags in the generic CLI
        tools: ynl: move the enum classes to shared code
        net: avoid double iput when sock_alloc_file fails
        af_unix: fix struct pid leaks in OOB support
        eth: fealnx: bring back this old driver
        net: dsa: mt7530: permit port 5 to work without port 6 on MT7621 SoC
        net: microchip: sparx5: fix deletion of existing DSCP mappings
        octeontx2-af: Unlock contexts in the queue context cache in case of fault detection
        net/smc: fix fallback failed while sendmsg with fastopen
        ynl: re-license uniformly under GPL-2.0 OR BSD-3-Clause
        mailmap: update entries for Stephen Hemminger
        mailmap: add entry for Maxim Mikityanskiy
        nfc: change order inside nfc_se_io error path
        ethernet: ice: avoid gcc-9 integer overflow warning
        ice: don't ignore return codes in VSI related code
        ice: Fix DSCP PFC TLV creation
        net: usb: qmi_wwan: add Telit 0x1080 composition
        net: usb: cdc_mbim: avoid altsetting toggling for Telit FE990
        netfilter: conntrack: adopt safer max chain length
        net: tls: fix device-offloaded sendpage straddling records
        ...
      44889ba5
    • Linus Torvalds's avatar
      Merge tag 'for-linus-2023030901' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid · 2653e3fe
      Linus Torvalds authored
      Pull HID fixes from Benjamin Tissoires:
      
       - fix potential out of bound write of zeroes in HID core with a
         specially crafted uhid device (Lee Jones)
      
       - fix potential use-after-free in work function in intel-ish-hid (Reka
         Norman)
      
       - selftests config fixes (Benjamin Tissoires)
      
       - few device small fixes and support
      
      * tag 'for-linus-2023030901' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
        HID: intel-ish-hid: ipc: Fix potential use-after-free in work function
        HID: logitech-hidpp: Add support for Logitech MX Master 3S mouse
        HID: cp2112: Fix driver not registering GPIO IRQ chip as threaded
        selftest: hid: fix hid_bpf not set in config
        HID: uhid: Over-ride the default maximum data buffer value with our own
        HID: core: Provide new max_buffer_size attribute to over-ride the default
      2653e3fe