2
0
mirror of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-09-04 20:19:47 +08:00
Commit Graph

17532 Commits

Author SHA1 Message Date
Sebastian Andrzej Siewior
4140e2b31b tools headers: Synchronize prctl.h ABI header
The prctl.h ABI header was slightly updated during the development of
the interface. In particular the "immutable" parameter became a bit in
the option argument.

Synchronize prctl.h ABI header again and make use of the definition in
the testsuite and "perf bench futex".

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: André Almeida <andrealmeid@igalia.com>
Link: https://lore.kernel.org/r/20250517151455.1065363-5-bigeasy@linutronix.de
2025-05-21 13:57:41 +02:00
Thomas Richter
ba5f102eec perf ftrace: Use process/session specific trace settings
Executing 'perf ftrace' commands 'ftrace', 'profile' and 'latency' leave
tracing disabled as can seen in this output:

 # echo 1 > /sys/kernel/debug/tracing/tracing_on
 # cat /sys/kernel/debug/tracing/tracing_on
 1
 # perf ftrace trace --graph-opts depth=5 sleep 0.1 > /dev/null
 # cat /sys/kernel/debug/tracing/tracing_on
 0
 #

The 'tracing_on' file is not restored to its value before the command.

To fix that this patch uses the .../tracing/instances/XXX subdirectory
feature.

Each 'perf ftrace' invocation creates its own session/process
specific subdirectory and does not change the global state
in the .../tracing directory itself.

Use rmdir(../tracing/instances/dir) to stop process/session specific
tracing and delete all process/session specific setings.

Reported-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20250520093726.2009696-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-20 13:11:09 -03:00
Arnaldo Carvalho de Melo
b705ca3d24 tools include UAPI: Sync linux/vhost.h with the kernel sources
To get the changes in:

  a940e0a685 ("vhost: fix VHOST_*_OWNER documentation")

That just changed lines in comments

This addresses this perf build warning:

  Warning: Kernel ABI header differences:
    diff -u tools/perf/trace/beauty/include/uapi/linux/vhost.h include/uapi/linux/vhost.h

Please see tools/include/uapi/README for further details.

Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20250519214126.1652491-2-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-20 12:57:12 -03:00
Leo Yan
735a3ac370 perf test probe_vfs_getname: Add regex for searching probe line
Since commit 611851010c ("fs: dedup handling of struct filename
init and refcounts bumps"), the kernel has been refactored to use a new
inline function initname(), moving name initialization into it.

As a result, the perf probe test can no longer find the source line that
matches the defined regular expressions. This causes the script to fail
when attempting to add probes.

Add a regular expression to search for the call site of initname(). This
provides a valid source line number for adding the probe. Keeps the
older regular expressions for passing test on older kernels.

Fixes: 611851010c ("fs: dedup handling of struct filename init and refcounts bumps")
Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Jakub Brnak <jbrnak@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20250519082755.1669187-1-leo.yan@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-20 12:50:10 -03:00
Chun-Tse Shao
8cdf00b843 perf record: Fix a asan runtime error in util/maps.c
If I build perf with asan and run Zstd test:

  $ make -C tools/perf O=/tmp/perf DEBUG=1 EXTRA_CFLAGS="-O0 -g -fno-omit-frame-pointer -fsanitize=undefined"
  $ /tmp/perf/perf test "Zstd perf.data compression/decompression" -vv
   83: Zstd perf.data compression/decompression:
  ...
  util/maps.c:1046:5: runtime error: null pointer passed as argument 2, which is declared to never be null
  ...

The issue was caused by `bsearch`. The patch adds a check to ensure
argument 2 and 3 are not NULL and 0.

Testing with the commands above confirms that the runtime error is
resolved.

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Chun-Tse Shao <ctshao@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ben Gainey <ben.gainey@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250303183646.327510-2-ctshao@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-16 17:31:40 -03:00
Chun-Tse Shao
208c0e1683 perf record: Add 8-byte aligned event type PERF_RECORD_COMPRESSED2
The original PERF_RECORD_COMPRESS is not 8-byte aligned, which can cause
asan runtime error:

  # Build with asan
  $ make -C tools/perf O=/tmp/perf DEBUG=1 EXTRA_CFLAGS="-O0 -g -fno-omit-frame-pointer -fsanitize=undefined"
  # Test success with many asan runtime errors:
  $ /tmp/perf/perf test "Zstd perf.data compression/decompression" -vv
   83: Zstd perf.data compression/decompression:
  ...
  util/session.c:1959:13: runtime error: member access within misaligned address 0x7f69e3f99653 for type 'union perf_event', which requires 13 byte alignment
  0x7f69e3f99653: note: pointer points here
   d0  3a 50 69 44 00 00 00 00  00 08 00 bb 07 00 00 00  00 00 00 44 00 00 00 00  00 00 00 ff 07 00 00
                ^
  util/session.c:2163:22: runtime error: member access within misaligned address 0x7f69e3f99653 for type 'union perf_event', which requires 8 byte alignment
  0x7f69e3f99653: note: pointer points here
   d0  3a 50 69 44 00 00 00 00  00 08 00 bb 07 00 00 00  00 00 00 44 00 00 00 00  00 00 00 ff 07 00 00
                ^
  ...

Since there is no way to align compressed data in zstd compression, this
patch add a new event type `PERF_RECORD_COMPRESSED2`, which adds a field
`data_size` to specify the actual compressed data size.

The `header.size` contains the total record size, including the padding
at the end to make it 8-byte aligned.

Tested with `Zstd perf.data compression/decompression`

Signed-off-by: Chun-Tse Shao <ctshao@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ben Gainey <ben.gainey@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250303183646.327510-1-ctshao@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-16 17:31:40 -03:00
Ian Rogers
bcfab08db7 perf intel-tpebs: Filter non-workload samples
If perf is running with a benchmark then we want the retirement
latency samples associated with the benchmark rather than from the
system as a whole.

Use the workload's PID to filter out samples that aren't from the
workload or its children.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Weilin Wang <weilin.wang@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250430200108.243234-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-16 11:31:10 -03:00
Chun-Tse Shao
1c5721ca89 perf test: Allow tolerance for leader sampling test
There is a known issue that the leader sampling is inconsistent, since
throttle only affect leader, not the slave. The detail is in [1].

To maintain test coverage, this patch sets a tolerance rate of 80% to
accommodate the throttled samples and prevent test failures due to
throttling.

[1] lore.kernel.org/20250328182752.769662-1-ctshao@google.com

Suggested-by: Ian Rogers <irogers@google.com>
Suggested-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Chun-Tse Shao <ctshao@google.com>
Co-developed-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20250430140611.599078-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-15 12:55:51 -03:00
Chun-Tse Shao
cb422594d6 perf test: Add stat uniquifying test
The `stat+uniquify.sh` test retrieves all uniquified `clockticks` events
from `perf list -v clockticks` and check if `perf stat -e clockticks -A`
contains all of them.

Committer testing:

  root@x1:~# grep -m1 "model name" /proc/cpuinfo
  model name	: 13th Gen Intel(R) Core(TM) i7-1365U
  root@x1:~# perf list clockticks

  List of pre-defined events (to be used in -e or -M):

    uncore_clock/clockticks/                           [Kernel PMU event]

  uncore memory:
    unc_m_clockticks
         [Number of clocks. Unit: uncore_imc]
  root@x1:~#
  root@x1:~# perf test uniquifying
   92: perf stat events uniquifying                    : Ok
  root@x1:~# perf test -vv uniquifying
   92: perf stat events uniquifying:
  --- start ---
  test child forked, pid 1552628
  stat event uniquifying test
  ---- end(0) ----
   92: perf stat events uniquifying                    : Ok
  root@x1:~#

Signed-off-by: Chun-Tse Shao <ctshao@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Levi Yun <yeoreum.yun@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250513215401.2315949-4-ctshao@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-14 09:51:04 -03:00
Ian Rogers
137359b789 perf parse-events: Use wildcard processing to set an event to merge into
The merge stat code fails for uncore events if they are repeated twice,
for example `perf stat -e clockticks,clockticks -I 1000` as the counts
of the second set of uncore events will be merged into the first
counter.

Reimplement the logic to have a first_wildcard_match so that merged
later events correctly merge into the first wildcard event that they
will be aggregated into.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Chun-Tse Shao <ctshao@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Levi Yun <yeoreum.yun@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250513215401.2315949-3-ctshao@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-14 09:36:24 -03:00
Ian Rogers
7d45f402d3 perf evlist: Make uniquifying counter names consistent
'perf stat' has different uniquification logic to 'perf record' and perf
top. In the case of perf record and 'perf top' all hybrid event names
are uniquified.

'perf stat' is more disciplined respecting name config terms, libpfm4
events, etc.

'perf stat' will uniquify hybrid events and the non-core PMU cases
shouldn't apply to perf record or 'perf top'.

For consistency, remove the uniquification for 'perf record' and 'perf
top' and reuse the 'perf stat' uniquification, making the code more
globally visible for this.

Fix the detection of cross-PMU for disabling uniquify by correctly
setting last_pmu.

When setting uniquify on an evsel, make sure the PMUs between the 2
considered events differ otherwise the uniquify isn't adding value.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Chun-Tse Shao <ctshao@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Levi Yun <yeoreum.yun@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250513215401.2315949-2-ctshao@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-14 09:36:21 -03:00
Namhyung Kim
ef60b8f572 perf trace: Support --summary-mode=cgroup
Add a new summary mode to collect stats for each cgroup.

  $ sudo ./perf trace -as --bpf-summary --summary-mode=cgroup -- sleep 1

   Summary of events:

   cgroup /user.slice/user-657345.slice/user@657345.service/session.slice/org.gnome.Shell@x11.service, 535 events

     syscall            calls  errors  total       min       avg       max       stddev
                                       (msec)    (msec)    (msec)    (msec)        (%)
     --------------- --------  ------ -------- --------- --------- ---------     ------
     ppoll                 15      0   373.600     0.004    24.907   197.491     55.26%
     poll                  15      0     1.325     0.001     0.088     0.369     38.76%
     close                 66      0     0.567     0.007     0.009     0.026      3.55%
     write                150      0     0.471     0.001     0.003     0.010      3.29%
     recvmsg               94     83     0.290     0.000     0.003     0.037     16.39%
     ioctl                 26      0     0.237     0.001     0.009     0.096     50.13%
     timerfd_create        66      0     0.236     0.003     0.004     0.024      8.92%
     timerfd_settime       70      0     0.160     0.001     0.002     0.012      7.66%
     writev                10      0     0.118     0.001     0.012     0.019     18.17%
     read                   9      0     0.021     0.001     0.002     0.004     14.07%
     getpid                14      0     0.019     0.000     0.001     0.004     20.28%

   cgroup /system.slice/polkit.service, 94 events

     syscall            calls  errors  total       min       avg       max       stddev
                                       (msec)    (msec)    (msec)    (msec)        (%)
     --------------- --------  ------ -------- --------- --------- ---------     ------
     ppoll                 22      0    19.811     0.000     0.900     9.273     63.88%
     write                 30      0     0.040     0.001     0.001     0.003     12.09%
     recvmsg               12      0     0.018     0.001     0.002     0.006     28.15%
     read                  18      0     0.013     0.000     0.001     0.003     21.99%
     poll                  12      0     0.006     0.000     0.001     0.001      4.48%

   cgroup /user.slice/user-657345.slice/user@657345.service/app.slice/app-org.gnome.Terminal.slice/gnome-terminal-server.service, 21 events

     syscall            calls  errors  total       min       avg       max       stddev
                                       (msec)    (msec)    (msec)    (msec)        (%)
     --------------- --------  ------ -------- --------- --------- ---------     ------
     ppoll                  4      0    17.476     0.003     4.369    13.298     69.65%
     recvmsg               15     12     0.068     0.002     0.005     0.014     26.53%
     writev                 1      0     0.033     0.033     0.033     0.033      0.00%
     poll                   1      0     0.005     0.005     0.005     0.005      0.00%

   ...

It works only for --bpf-summary for now.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20250501225337.928470-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-13 18:20:46 -03:00
Namhyung Kim
39922dc53c perf report: Add 'tgid' sort key
Sometimes we need to analyze the data in process level but current sort
keys only work on thread level.  Let's add 'tgid' sort key for that as
'pid' is already taken for thread.

This will look mostly the same, but it only uses tgid instead of tid.
Here's an example of a process with two threads (thloop).

  $ perf record -- perf test -w thloop

  $ perf report --stdio -s tgid,pid -H
  ...
  #
  #    Overhead  Tgid:Command / Pid:Command
  # ...........  ..........................
  #
     100.00%     2018407:perf
         50.34%     2018407:perf
         49.66%     2018409:perf

Suggested-by: Stephane Eranian <eranian@google.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250509210421.197245-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-13 17:51:32 -03:00
Namhyung Kim
b922881712 perf test: Update sysfs path for core PMU caps
While CPU is a system device, it'd be better to use a path for
event_source devices when it checks PMU capability.

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250509213017.204343-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-13 17:45:35 -03:00
Namhyung Kim
1c3741611f perf test: Fix LBR test by ignoring idle task
I found 'perf record LBR tests' failing due to empty branch stacks.

  $ perf test -v LBR
  ...
  LBR system wide any branch test
  Lowering default frequency rate from 4000 to 1000.
  Please consider tweaking /proc/sys/kernel/perf_event_max_sample_rate.
  [ perf record: Woken up 8 times to write data ]
  [ perf record: Captured and wrote 3.142 MB /tmp/__perf_test.perf.data.dgSBl (3572 samples) ]
  LBR system wide any branch test: 3572 samples
  LBR system wide any branch test [Failed empty br stack ratio exceed 2%: 3%]
  LBR system wide any call test
  Lowering default frequency rate from 4000 to 1000.
  Please consider tweaking /proc/sys/kernel/perf_event_max_sample_rate.
  [ perf record: Woken up 8 times to write data ]
  [ perf record: Captured and wrote 3.337 MB /tmp/__perf_test.perf.data.dgSBl (3967 samples) ]
  LBR system wide any call test: 3967 samples
  LBR system wide any call test [Failed empty br stack ratio exceed 2%: 9%]
  ...

The failing cases were in system-wide mode and I realized that the
samples were from the idle tasks (swapper).  I suspect going to/from
idle state may affect the LBR contents.

If we can skip empty branch stacks from the idle tasks, the failure
should go away.  I can see the following output in perf report -D.

  $ perf report -D | grep -m5 -A3 'branch stack: nr:0'
  ...
  --
  ... branch stack: nr:0
   ... thread: swapper:0
   ...... dso: /proc/kcore

  --
  ... branch stack: nr:0
   ... thread: swapper:0
   ...... dso: /proc/kcore

  --
  ... branch stack: nr:0
   ... thread: DefaultEventMan:10282
   ...... dso: /proc/kcore

  --
  ... branch stack: nr:0
   ... thread: swapper:0
   ...... dso: /proc/kcore

  --
  ... branch stack: nr:0
   ... thread: swapper:0
   ...... dso: /proc/kcore

  $ perf report -D | grep -c 'branch stack: nr:0'
  145

  $ perf report -D | grep -A3 'branch stack: nr:0' | grep thread | grep -c swapper
  i36

  $ perf report -D | grep -A3 'branch stack: nr:0' | grep thread | grep -cv swapper
  9

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250509213017.204343-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-13 17:45:11 -03:00
James Clark
aea3496bbc perf tools: Fix arm64 source package build
Syscall tables are generated from rules in the kernel tree. Add the
related files to the MANIFEST to fix the Perf source package build.

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fixes: bfb713ea53 ("perf tools: Fix arm64 build by generating unistd_64.h")
Signed-off-by: James Clark <james.clark@linaro.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250513-james-perf-src-pkg-fix-v1-1-bcfd0486dbd6@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-13 17:26:35 -03:00
Ian Rogers
3eb5c49f71 perf test: Hybrid improvements for metric value validation test
On my alderlake I currently see for the "perf metrics value validation" test:

```
Total Test Count:  142
Passed Test Count:  139
[
Metric Relationship Error:      The collected value of metric ['tma_fetch_latency', 'tma_fetch_bandwidth', 'tma_frontend_bound']
                        is [31.137028] in workload(s): ['perf bench futex hash -r 2 -s']
                        but expected value range is [tma_frontend_bound, tma_frontend_bound]
                        Relationship rule description: 'Sum of the level 2 children should equal level 1 parent',
Metric Relationship Error:      The collected value of metric ['tma_memory_bound', 'tma_core_bound', 'tma_backend_bound']
                        is [6.564442] in workload(s): ['perf bench futex hash -r 2 -s']
                        but expected value range is [tma_backend_bound, tma_backend_bound]
                        Relationship rule description: 'Sum of the level 2 children should equal level 1 parent',
Metric Relationship Error:      The collected value of metric ['tma_light_operations', 'tma_heavy_operations', 'tma_retiring']
                        is [57.806179] in workload(s): ['perf bench futex hash -r 2 -s']
                        but expected value range is [tma_retiring, tma_retiring]
                        Relationship rule description: 'Sum of the level 2 children should equal level 1 parent']
Metric validation return with erros. Please check metrics reported with errors.
```

I suspect it is due to two metrics for different CPU types being
enabled. Add a -cputype option to avoid this. The test still fails with:

```
Total Test Count:  115
Passed Test Count:  114
[
Wrong Metric Value Error:       The collected value of metric ['tma_l2_hit_latency']
                        is [117.947088] in workload(s): ['perf bench futex hash -r 2 -s']
                        but expected value range is [0, 100]]
Metric validation return with errors. Please check metrics reported with errors.
```

which is a reproducible genuine error and likely requires a metric fix.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250512184700.11691-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-13 17:24:42 -03:00
Ian Rogers
7f84f67418 perf list: Display the PMU name associated with a perf metric in JSON
The 'perf stat --cputype' option can be used to filter which metrics
will be applied, for this reason the JSON metrics have an associated
PMU.

List this PMU name in the 'perf list' output in JSON mode so that
tooling may access it.

An example of the new field is:
```
{
        "MetricGroup": "Backend",
        "MetricName": "tma_core_bound",
        "MetricExpr": "max(0, tma_backend_bound - tma_memory_bound)",
        "MetricThreshold": "tma_core_bound > 0.1 & tma_backend_bound > 0.2",
        "ScaleUnit": "100%",
        "BriefDescription": "This metric represents fraction of slots where ...
        "PublicDescription": "This metric represents fraction of slots where ...
        "Unit": "cpu_core"
},
```

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250512184700.11691-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-13 17:06:14 -03:00
Ian Rogers
4102ff8b1f perf metricgroup: Binary search when resolving referred to metrics
Unlike with events, metrics can be matched by name or a list of metric
groups.

However, when a metric refers to another metric it isn't referring to a
group but the singular metric in question.

Prior to this change every "id" in a metric expression is checked to see
if it is a metric by scanning all the metrics in the metrics table.

As the table is sorted my metric name we can speed the search in the
resolution case by binary searching for the metric.

Rename some of the metricgroup functions to make it clearer whether
they match a metric by name or by both name and group.

Before:
```
$ time perf test -v 10
 10: PMU JSON event tests                                            :
 10.1: PMU event table sanity                                        : Ok
 10.2: PMU event map aliases                                         : Ok
 10.3: Parsing of PMU event table metrics                            : Ok
 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
 10.5: Parsing of metric thresholds with fake PMUs                   : Ok

real    0m15.972s
user    0m13.176s
sys     0m3.001s
```

After:
```
$ time perf test -v 10
 10: PMU JSON event tests                                            :
 10.1: PMU event table sanity                                        : Ok
 10.2: PMU event map aliases                                         : Ok
 10.3: Parsing of PMU event table metrics                            : Ok
 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
 10.5: Parsing of metric thresholds with fake PMUs                   : Ok

real    0m5.343s
user    0m1.871s
sys     0m2.128s
```

Committer testing:

  root@number:~# grep -m1 'model name' /proc/cpuinfo
  model name	: AMD Ryzen 9 9950X3D 16-Core Processor
  root@number:~#

Before:

  root@number:~# time perf test "Parsing of PMU event table metrics"
   10.3: Parsing of PMU event table metrics                            : Ok
   10.4: Parsing of PMU event table metrics with fake PMUs             : Ok

  real	0m9.286s
  user	0m9.354s
  sys	0m0.062s
  root@number:~#

After:

  root@number:~# time perf test "Parsing of PMU event table metrics"
   10.3: Parsing of PMU event table metrics                            : Ok
   10.4: Parsing of PMU event table metrics with fake PMUs             : Ok

  real	0m0.689s
  user	0m0.766s
  sys	0m0.042s
  root@number:~# time perf test 10
   10: PMU JSON event tests                                            :
   10.1: PMU event table sanity                                        : Ok
   10.2: PMU event map aliases                                         : Ok
   10.3: Parsing of PMU event table metrics                            : Ok
   10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
   10.5: Parsing of metric thresholds with fake PMUs                   : Ok

  real	0m0.696s
  user	0m0.807s
  sys	0m0.064s
  root@number:~#

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Link: https://lore.kernel.org/r/20250512194622.33258-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-13 16:36:51 -03:00
Ian Rogers
754baf426e perf pmu: Change aliases from list to hashmap
Finding an alias for things like perf_pmu__have_event() would need to
search the aliases list, whilst this happens relatively infrequently it
can be a significant overhead in testing.

Switch to using a hashmap. Move common initialization code to
perf_pmu__init(). Refactor the test 'struct perf_pmu_test_pmu' to not
have perf pmu within it to better support the perf_pmu__init() function.

Before:
```
$ time perf test "Parsing of PMU event table metrics"
 10.3: Parsing of PMU event table metrics                            : Ok
 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok

real    0m13.287s
user    0m13.026s
sys     0m0.532s
```

After:
```
$ time perf test "Parsing of PMU event table metrics"
 10.3: Parsing of PMU event table metrics                            : Ok
 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok

real    0m13.011s
user    0m12.885s
sys     0m0.485s
```

Committer testing:

  root@number:~# grep -m1 'model name' /proc/cpuinfo
  model name	: AMD Ryzen 9 9950X3D 16-Core Processor
  root@number:~#

Before:

  root@number:~# time perf test "Parsing of PMU event table metrics"
   10.3: Parsing of PMU event table metrics                            : Ok
   10.4: Parsing of PMU event table metrics with fake PMUs             : Ok

  real	0m9.296s
  user	0m9.361s
  sys	0m0.063s
  root@number:~#

After:

  root@number:~# time perf test "Parsing of PMU event table metrics"
   10.3: Parsing of PMU event table metrics                            : Ok
   10.4: Parsing of PMU event table metrics with fake PMUs             : Ok

  real	0m9.286s
  user	0m9.354s
  sys	0m0.062s
  root@number:~#

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Link: https://lore.kernel.org/r/20250512194622.33258-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-13 16:36:43 -03:00
Ian Rogers
375368a961 perf fncache: Switch to using hashmap
The existing fncache can get large in testing situations. As the
bucket array is a fixed size this leads to it degrading to O(n)
performance. Use a regular hashmap that can dynamically reallocate its
array.

Before:
```
$ time perf test "Parsing of PMU event table metrics"
 10.3: Parsing of PMU event table metrics                            : Ok
 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok

real    0m14.132s
user    0m17.806s
sys     0m0.557s
```

After:
```
$ time perf test "Parsing of PMU event table metrics"
 10.3: Parsing of PMU event table metrics                            : Ok
 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok

real    0m13.287s
user    0m13.026s
sys     0m0.532s
```

Committer notes:

  root@number:~# grep -m1 'model name' /proc/cpuinfo
  model name	: AMD Ryzen 9 9950X3D 16-Core Processor
  root@number:~#

Before:

  root@number:~# time perf test "Parsing of PMU event table metrics"
   10.3: Parsing of PMU event table metrics                            : Ok
   10.4: Parsing of PMU event table metrics with fake PMUs             : Ok

  real	0m9.277s
  user	0m9.979s
  sys	0m0.055s
  root@number:~#

After:

  root@number:~# time perf test "Parsing of PMU event table metrics"
   10.3: Parsing of PMU event table metrics                            : Ok
   10.4: Parsing of PMU event table metrics with fake PMUs             : Ok

  real	0m9.296s
  user	0m9.361s
  sys	0m0.063s
  root@number:~#

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Link: https://lore.kernel.org/r/20250512194622.33258-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-13 16:36:22 -03:00
Ian Rogers
f3061d5267 perf tests: Harden branch stack sampling test
On continuous testing the perf script output can be empty, or nearly
empty, causing tr/grep to exit and due to "set -e" the test traps and
fails.

Add some empty file handling that sets the test to skip and make grep
and other text rewriting failures non-fatal by adding "|| true".

Committer testing:

  root@number:~# grep -m1 "model name" /proc/cpuinfo
  model name	: AMD Ryzen 9 9950X3D 16-Core Processor
  root@number:~# perf test "Check branch stack sampling"
  104: Check branch stack sampling                                     : Ok
  root@number:~#
  root@number:~# perf test -vvvvvvv "Check branch stack sampling"
  104: Check branch stack sampling:
  --- start ---
  test child forked, pid 396047
   142d22-142da0 l brstack_bench
  perf does have symbol 'brstack_bench'
  Testing user branch stack sampling
  Testing branch stack filtering permutation (any_call,CALL|IND_CALL|COND_CALL|SYSCALL|IRQ)
  Testing branch stack filtering permutation (call,CALL|SYSCALL)
  Testing branch stack filtering permutation (cond,COND)
  Testing branch stack filtering permutation (any_ret,RET|COND_RET|SYSRET|ERET)
  Testing branch stack filtering permutation (call,cond,CALL|SYSCALL|COND)
  Testing branch stack filtering permutation (any_call,cond,CALL|IND_CALL|COND_CALL|IRQ|SYSCALL|COND)
  Testing branch stack filtering permutation (cond,any_call,any_ret,COND|CALL|IND_CALL|COND_CALL|SYSCALL|IRQ|RET|COND_RET|SYSRET|ERET)
  ---- end(0) ----
  104: Check branch stack sampling                                     : Ok
  root@number:~#

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250318161639.34446-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-12 14:55:15 -03:00
Ian Rogers
255f5b6d06 perf parse-events: Add "cpu" term to set the CPU an event is recorded on
The -C option allows the CPUs for a list of events to be specified but
its not possible to set the CPU for a single event. Add a term to
allow this. The term isn't a general CPU list due to ',' already being
a special character in event parsing instead multiple cpu= terms may
be provided and they will be merged/unioned together.

An example of mixing different types of events counted on different CPUs:
```
$ perf stat -A -C 0,4-5,8 -e "instructions/cpu=0/,l1d-misses/cpu=4,cpu=5/,inst_retired.any/cpu=8/,cycles" -a sleep 0.1

 Performance counter stats for 'system wide':

CPU0            6,979,225      instructions/cpu=0/              #    0.89  insn per cycle
CPU4               75,138      cpu/l1d-misses/
CPU5            1,418,939      cpu/l1d-misses/
CPU8              797,553      cpu/inst_retired.any,cpu=8/
CPU0            7,845,302      cycles
CPU4            6,546,859      cycles
CPU5          185,915,438      cycles
CPU8            2,065,668      cycles

       0.112449242 seconds time elapsed
```

Committer testing:

  root@number:~# grep -m1 "model name" /proc/cpuinfo
  model name	: AMD Ryzen 9 9950X3D 16-Core Processor
  root@number:~# perf stat -A -e "instructions/cpu=0/,instructions,l1d-misses/cpu=4,cpu=5/,cycles" -a sleep 0.1

   Performance counter stats for 'system wide':

  CPU0    2,398,351   instructions/cpu=0/    #  0.44  insn per cycle
  CPU0    2,398,152   instructions           #  0.44  insn per cycle
  CPU1    1,265,634   instructions           #  0.49  insn per cycle
  CPU2      606,087   instructions           #  0.50  insn per cycle
  CPU3    4,025,752   instructions           #  0.52  insn per cycle
  CPU4    4,236,810   instructions           #  0.53  insn per cycle
  CPU5    3,984,832   instructions           #  0.66  insn per cycle
  CPU6      434,132   instructions           #  0.44  insn per cycle
  CPU7       65,752   instructions           #  0.41  insn per cycle
  CPU8      459,083   instructions           #  0.48  insn per cycle
  CPU9    6,464,161   instructions           #  1.31  insn per cycle
  <SNIP>
  root@number:~# perf stat -e "instructions/cpu=0/,instructions,l1d-misses/cpu=4,cpu=5/,cycles" -a sleep 0.

   Performance counter stats for 'system wide':

             144,822      instructions/cpu=0/              #    0.03  insn per cycle
           4,666,114      instructions                     #    0.93  insn per cycle
               2,583      l1d-misses
           4,993,633      cycles

         0.000868512 seconds time elapsed

  root@number:~#

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20250403194337.40202-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-12 14:23:19 -03:00
Ian Rogers
168c7b5091 perf parse-events: Set is_pmu_core for legacy hardware events
Also set the CPU map to all online CPU maps.

This is done so the behavior of legacy hardware and hardware cache
events better matches that of sysfs and JSON events during
__perf_evlist__propagate_maps().

Fix missing cpumap put in "Synthesize attr update" test.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20250403194337.40202-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-12 14:18:16 -03:00
Ian Rogers
f60c3f4468 perf stat: Use counter cpumask to skip zero values
When a counter is 0 it may or may not be skipped.

For uncore counters it is common they are only valid on 1 logical CPU
and all other CPUs should be skipped.

The PMU's cpumask was used for the skip calculation, but that cpumask
may not reflect user overrides.

Similarly a counter on a core PMU may explicitly not request a CPU be
gathered.

If the counter on this CPU's value is 0 then the counter should be
skipped as it wasn't requested.

Switch from using the PMU cpumask to that associated with the evsel to
support these cases.

Avoid potential crash with --per-thread mode where config->aggr_get_id
is NULL. Add some examples for the tool event 0 counter skipping.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20250403194337.40202-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-12 14:18:16 -03:00
Ian Rogers
365e02ddb6 perf tests metrics: Permission related fixes
When permissions are limited running sleep without system wide isn't a
good benchmark to run to achieve samples, switch to running noploop.

Remove indent for non-success cases.

Allow skip for the not counted case.

Minor debug changes.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250412004704.2297939-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-12 14:18:16 -03:00
Ian Rogers
f0869f3156 perf evsel: Add per-thread warning for EOPNOTSUPP open failues
The mrvl_ddr_pmu will return EOPNOTSUPP if opened in per-thread
mode. Give a warning for this similar to EINVAL.

Doing this better supports metric testing with limited permissions when
the mrvl_ddr_pmu is present, as the failure to open causes the test to
skip and not fail.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250412004704.2297939-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-12 14:18:16 -03:00
Adrian Hunter
17e548405a perf scripts python: exported-sql-viewer.py: Fix pattern matching with Python 3
The script allows the user to enter patterns to find symbols.

The pattern matching characters are converted for use in SQL.

For PostgreSQL the conversion involves using the Python maketrans()
method which is slightly different in Python 3 compared with Python 2.

Fix to work in Python 3.

Fixes: beda0e725e ("perf script python: Add Python3 support to exported-sql-viewer.py")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Tony Jones <tonyj@suse.de>
Link: https://lore.kernel.org/r/20250512093932.79854-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-12 14:18:16 -03:00
Adrian Hunter
352b088164 perf intel-pt: Do not default to recording all switch events
On systems with many CPUs, recording extra context switch events can be
excessive and unnecessary. Add perf config intel-pt.all-switch-events=false
to control the behaviour.

Example:

 # perf config intel-pt.all-switch-events=false
 # perf record -eintel_pt//u uname
 Linux
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.082 MB perf.data ]
 # perf script -D | grep PERF_RECORD_SWITCH | awk '{print $5}' | uniq -c
       5 PERF_RECORD_SWITCH
 # perf config intel-pt.all-switch-events=true
 # perf record -eintel_pt//u uname
 Linux
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.102 MB perf.data ]
 # perf script -D | grep PERF_RECORD_SWITCH | awk '{print $5}' | uniq -c
     180 PERF_RECORD_SWITCH_CPU_WIDE

Committer testing:

While doing a make -j28 allmodconfig:

  root@five:~# grep "model name" -m1 /proc/cpuinfo
  model name	: Intel(R) Core(TM) i7-14700K
  root@five:~#
  root@five:~# perf config intel-pt.all-switch-events=false
  root@five:~# perf record -e intel_pt//u uname
  Linux
  [ perf record: Woken up 2 times to write data ]
  [ perf record: Captured and wrote 0.019 MB perf.data ]
  root@five:~# perf report --stats | grep SWITCH_CPU_WIDE
  root@five:~#
  root@five:~# perf config intel-pt.all-switch-events=true
  root@five:~# perf record -e intel_pt//u uname
  Linux
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.047 MB perf.data ]
  root@five:~# perf report --stats | grep SWITCH_CPU_WIDE
       SWITCH_CPU_WIDE events:        542  (96.4%)
  root@five:~#

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20250512093932.79854-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-12 14:18:16 -03:00
Adrian Hunter
e00eac6b5b perf intel-pt: Fix PEBS-via-PT data_src
The Fixes commit did not add support for decoding PEBS-via-PT data_src.
Fix by adding support.

PEBS-via-PT is a feature of some E-core processors, starting with
processors based on Tremont microarchitecture. Because the kernel only
supports Intel PT features that are on all processors, there is no support
for PEBS-via-PT on hybrids.

Currently that leaves processors based on Tremont, Gracemont and Crestmont,
however there are no events on Tremont that produce data_src information,
and for Gracemont and Crestmont there are only:

	mem-loads	event=0xd0,umask=0x5,ldlat=3
	mem-stores	event=0xd0,umask=0x6

Affected processors include Alder Lake N (Gracemont), Sierra Forest
(Crestmont) and Grand Ridge (Crestmont).

Example:

 # perf record -d -e intel_pt/branch=0/ -e mem-loads/aux-output/pp uname

 Before:

  # perf.before script --itrace=o -Fdata_src
            0 |OP No|LVL N/A|SNP N/A|TLB N/A|LCK No|BLK  N/A
            0 |OP No|LVL N/A|SNP N/A|TLB N/A|LCK No|BLK  N/A

 After:

  # perf script --itrace=o -Fdata_src
  10268100142 |OP LOAD|LVL L1 hit|SNP None|TLB L1 or L2 hit|LCK No|BLK  N/A
  10450100442 |OP LOAD|LVL L2 hit|SNP None|TLB L2 miss|LCK No|BLK  N/A

Fixes: 975846eddf ("perf intel-pt: Add memory information to synthesized PEBS sample")
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20250512093932.79854-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-12 14:18:09 -03:00
Ian Rogers
cd17a9b1a7 perf test demangle-ocaml: Switch to using dso__demangle_sym()
The use of the demangle-ocaml APIs means we don't detect if a different
demangler is used before the OCaml one for the case that matters to
perf.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andreas Hindborg <a.hindborg@kernel.org>
Cc: Ariel Ben-Yehuda <ariel.byd@gmail.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: Bill Wendling <morbo@google.com>
Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Daniel Xu <dxu@dxuuu.xyz>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Gary Guo <gary@garyguo.net>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Trevor Gross <tmgross@umich.edu>
Link: https://lore.kernel.org/r/20250430004128.474388-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-09 17:03:09 -03:00
Ian Rogers
41fcc9a343 perf test demangle-java: Switch to using dso__demangle_sym()
The use of the demangle-java APIs means we don't detect if a different
demangler is used before the Java one for the case that matters to
perf.

Remove the return types from the demangled names as dso__demangle_sym()
removes those.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andreas Hindborg <a.hindborg@kernel.org>
Cc: Ariel Ben-Yehuda <ariel.byd@gmail.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: Bill Wendling <morbo@google.com>
Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Daniel Xu <dxu@dxuuu.xyz>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Gary Guo <gary@garyguo.net>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Trevor Gross <tmgross@umich.edu>
Link: https://lore.kernel.org/r/20250430004128.474388-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-09 17:02:23 -03:00
Ian Rogers
bdf05ccd18 perf test demangle-rust: Add Rust demangling test
The test cases are listed examples in:

https://doc.rust-lang.org/rustc/symbol-mangling/v0.html

This test was previously part of a different Rust v0 demangler:

https://lore.kernel.org/lkml/20250129193037.573431-1-irogers@google.com/

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andreas Hindborg <a.hindborg@kernel.org>
Cc: Ariel Ben-Yehuda <ariel.byd@gmail.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: Bill Wendling <morbo@google.com>
Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Daniel Xu <dxu@dxuuu.xyz>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Gary Guo <gary@garyguo.net>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Trevor Gross <tmgross@umich.edu>
Link: https://lore.kernel.org/r/20250430004128.474388-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-09 17:01:57 -03:00
Ian Rogers
ac292ea7c3 perf demangle-rust: Remove previous legacy rust decoder
Code is unused since the introduction of rustc-demangle demangler.

Reviewed-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andreas Hindborg <a.hindborg@kernel.org>
Cc: Ariel Ben-Yehuda <ariel.byd@gmail.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: Bill Wendling <morbo@google.com>
Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Daniel Xu <dxu@dxuuu.xyz>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Gary Guo <gary@garyguo.net>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Trevor Gross <tmgross@umich.edu>
Link: https://lore.kernel.org/r/20250430004128.474388-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-09 17:01:39 -03:00
Ian Rogers
e20848c317 perf symbol-elf: Integrate rust-v0 demangling
Use the demangle-rust-v0 APIs to see if symbol is Rust mangled and
demangle if so.

The API requires a pre-allocated output buffer, some estimation and
retrying are added for this.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andreas Hindborg <a.hindborg@kernel.org>
Cc: Ariel Ben-Yehuda <ariel.byd@gmail.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: Bill Wendling <morbo@google.com>
Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Daniel Xu <dxu@dxuuu.xyz>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Gary Guo <gary@garyguo.net>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Trevor Gross <tmgross@umich.edu>
Link: https://lore.kernel.org/r/20250430004128.474388-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-09 17:00:59 -03:00
Ian Rogers
60869b22af perf demangle-rust: Add rustc-demangle C demangler
Imported at commit 80e40f57d99f ("add comment about finding latest
version of code") from:

https://github.com/rust-lang/rustc-demangle/blob/main/crates/native-c/src/demangle.c
https://github.com/rust-lang/rustc-demangle/blob/main/crates/native-c/include/demangle.h

There is discussion of this issue motivating the import in:

https://github.com/rust-lang/rust/issues/60705
https://lore.kernel.org/lkml/20250129193037.573431-1-irogers@google.com/

The SPDX lines reflect the dual license Apache-2 or MIT in:

https://github.com/rust-lang/rustc-demangle/blob/main/README.md

Following Migual Ojeda's suggestion comments were added on copyright and
keeping the code in sync with upstream.

The files are renamed as perf supports multiple demanglers and so
demangle as a name would be overloaded.

The work here was done by Ariel Ben-Yehuda <ariel.byd@gmail.com> and I
am merely importing it as discussed in the rust-lang issue.

Reviewed-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andreas Hindborg <a.hindborg@kernel.org>
Cc: Ariel Ben-Yehuda <ariel.byd@gmail.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: Bill Wendling <morbo@google.com>
Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Daniel Xu <dxu@dxuuu.xyz>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Gary Guo <gary@garyguo.net>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Trevor Gross <tmgross@umich.edu>
Link: https://lore.kernel.org/r/20250430004128.474388-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-09 17:00:05 -03:00
Colin Ian King
4f1a19b8bc perf test amd ibs: Fix spelling mistake "Asssuming" -> "Assuming"
There is a spelling mistake ina pr_debug message. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250507082421.188848-1-colin.i.king@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-09 14:47:19 -03:00
Namhyung Kim
c60b7d6f50 perf pmu: Use available core PMU for raw events
When it finds a matching PMU for a legacy event, it should look for
core PMUs.  The raw events also refers to core events so it should be
handled similarly.

On x86, PERF_TYPE_RAW should match with the existing cpu PMU.  But on
ARM, there's no PMU with the matching type so it'll pick the first core
PMU for it.

Suggested-by: Ian Rogers <irogers@google.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250507215939.54399-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-09 14:44:44 -03:00
Namhyung Kim
c42e219942 perf lock contention: Add -J/--inject-delay option
This is to slow down lock acquistion (on contention locks) deliberately.

A possible use case is to estimate impact on application performance by
optimization of kernel locking behavior.  By delaying the lock it can
simulate the worse condition as a control group, and then compare with
the current behavior as a optimized condition.

The syntax is 'time@function' and the time can have unit suffix like
"us" and "ms".  For example, I ran a simple test like below.

  $ sudo perf lock con -abl -L tasklist_lock -- \
    sh -c 'for i in $(seq 1000); do sleep 1 & done; wait'
   contended   total wait     max wait     avg wait            address   symbol

          92      1.18 ms    199.54 us     12.79 us   ffffffff8a806080   tasklist_lock (rwlock)

The contention count was 92 and the average wait time was around 10 us.
But if I add 100 usec of delay to the tasklist_lock,

  $ sudo perf lock con -abl -L tasklist_lock -J 100us@tasklist_lock -- \
    sh -c 'for i in $(seq 1000); do sleep 1 & done; wait'
   contended   total wait     max wait     avg wait            address   symbol

         190     15.67 ms    230.10 us     82.46 us   ffffffff8a806080   tasklist_lock (rwlock)

The contention count increased and the average wait time was up closed
to 100 usec.  If I increase the delay even more,

  $ sudo perf lock con -abl -L tasklist_lock -J 1ms@tasklist_lock -- \
    sh -c 'for i in $(seq 1000); do sleep 1 & done; wait'
   contended   total wait     max wait     avg wait            address   symbol

        1002      2.80 s       3.01 ms      2.80 ms   ffffffff8a806080   tasklist_lock (rwlock)

Now every sleep process had contention and the wait time was more than 1
msec.  This is on my 4 CPU laptop so I guess one CPU has the lock while
other 3 are waiting for it mostly.

For simplicity, it only supports global locks for now.

Committer testing:

  root@number:~# grep -m1 'model name' /proc/cpuinfo
  model name : AMD Ryzen 9 9950X3D 16-Core Processor
  root@number:~# perf lock con -abl -L tasklist_lock -- sh -c 'for i in $(seq 1000); do sleep 1 & done; wait'
   contended  total wait   max wait  avg wait           address  symbol

         142   453.85 us   25.39 us   3.20 us  ffffffffae808080  tasklist_lock (rwlock)
  root@number:~# perf lock con -abl -L tasklist_lock -J 100us@tasklist_lock -- sh -c 'for i in $(seq 1000); do sleep 1 & done; wait'
   contended  total wait   max wait  avg wait           address  symbol

        1040     2.39 s     3.11 ms   2.30 ms  ffffffffae808080  tasklist_lock (rwlock)
  root@number:~# perf lock con -abl -L tasklist_lock -J 1ms@tasklist_lock -- sh -c 'for i in $(seq 1000); do sleep 1 & done; wait'
   contended  total wait   max wait  avg wait           address  symbol

        1025    24.72 s    31.01 ms  24.12 ms  ffffffffae808080  tasklist_lock (rwlock)
  root@number:~#

Suggested-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20250509171950.183591-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-09 14:32:15 -03:00
Michael Petlan
4bfe27140e perf tests: Fix 'perf report' tests installation
There was a copy-paste mistake in the installation commands.

Also, we need to install stderr-whitelist.txt file, which contains
allowed messages that are printed on stderr and should not cause test
fail.

Fixes: 097fe67df1 ("perf testsuite: Install perf-report tests in the 'make install-tests -C tools/perf' target")
Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20250113182605.130719-6-vmolnaro@redhat.com
Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-09 14:10:08 -03:00
Namhyung Kim
30d20fb1f8 perf trace: Fix leaks of 'struct thread' in set_filter_loop_pids()
I've found some leaks from 'perf trace -a'.

It seems there are more leaks but this is what I can find for now.

Fixes: 082ab9a18e ("perf trace: Filter out 'sshd' in the tracer ancestry in syswide tracing")
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250403054213.7021-1-namhyung@kernel.org
[ split from a larget patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-08 17:11:50 -03:00
Namhyung Kim
bb3de7fa98 perf trace: Fix leaks of 'struct thread' in fprintf_sys_enter()
I've found some leaks from 'perf trace -a'.

It seems there are more leaks but this is what I can find for now.

Fixes: 70351029b5 ("perf thread: Add support for reading the e_machine type for a thread")
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250403054213.7021-1-namhyung@kernel.org
[ split from a larget patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-08 17:10:09 -03:00
Ian Rogers
70e21ac8b0 perf parse-events: Add debug dump of evlist if reordered
Add debug verbose output to show how evsels were reordered by
parse_events__sort_events_and_fix_groups(). For example:

```
$ perf record -v -e '{instructions,cycles}' true
Using CPUID GenuineIntel-6-B7-1
WARNING: events were regrouped to match PMUs
evlist after sorting/fixing: '{cpu_atom/instructions/,cpu_atom/cycles/},{cpu_core/instructions/,cpu_core/cycles/}'
```

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Levi Yun <yeoreum.yun@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250402201549.4090305-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-08 12:53:51 -03:00
Ian Rogers
583dc500d1 perf evlist: Make groups visible in evlist__format_evsels() output
Make groups visible in output:

Before:

{cycles,instructions} ->
cpu_atom/cycles/,cpu_atom/instructions/,cpu_core/cycles/,cpu_core/instructions/

After:

{cycles,instructions} ->
{cpu_atom/cycles/,cpu_atom/instructions/},{cpu_core/cycles/,cpu_core/instructions/}

Committer testing:

Before:

  root@number:~# perf record -e '{cycles,instructions,cache-misses}' /tmp/bla
  Failed to collect 'cycles,instructions,cache-misses' for the '/tmp/bla' workload: Permission denied
  root@number:~#

After:

  root@number:~# perf record -e '{cycles,instructions,cache-misses}' /tmp/bla
  Failed to collect '{cycles,instructions,cache-misses}' for the '/tmp/bla' workload: Permission denied
  root@number:~#

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Levi Yun <yeoreum.yun@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250402201549.4090305-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-08 12:51:40 -03:00
Ian Rogers
f0f245eaa2 perf evlist: Refactor evlist__scnprintf_evsels()
Switch output to using a strbuf so the storage can be resized.

Add a maximum size argument to avoid too much output that may happen for
uncore events.

Rename as scnprintf is no longer used.

Committer testing:

  With the patch applied:

  root@number:~# perf probe -x ~/bin/perf evlist__format_evsels
  Added new event:
    probe_perf:evlist_format_evsels (on evlist__format_evsels in /home/acme/bin/perf)

  You can now use it in all perf tools, such as:

  	perf record -e probe_perf:evlist_format_evsels -aR sleep 1

  root@number:~# perf probe -l
    probe_perf:evlist_format_evsels (on evlist__format_evsels@util/evlist.c in /home/acme/bin/perf)
  root@number:~# perf trace -e probe_perf:*/max-stack=10/ perf record -e cycles,instructions,cache-misses /tmp/bla
  Failed to collect 'cycles,instructions,cache-misses' for the '/tmp/bla' workload: Permission denied
       0.000 perf/3893011 probe_perf:evlist_format_evsels(__probe_ip: 6183397)
                                         evlist__format_evsels (/home/acme/bin/perf)
                                         __cmd_record (/home/acme/bin/perf)
                                         cmd_record (/home/acme/bin/perf)
                                         run_builtin (/home/acme/bin/perf)
                                         handle_internal_command (/home/acme/bin/perf)
                                         run_argv (/home/acme/bin/perf)
                                         main (/home/acme/bin/perf)
                                         __libc_start_call_main (/usr/lib64/libc.so.6)
                                         __libc_start_main@@GLIBC_2.34 (/usr/lib64/libc.so.6)
                                         _start (/home/acme/bin/perf)
  root@number:~#

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Levi Yun <yeoreum.yun@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250402201549.4090305-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-08 12:47:46 -03:00
Ian Rogers
a5efaf9008 perf stat: Remove print_mixed_hw_group_error
print_mixed_hw_group_error will print a warning when a group of events
uses different PMUs.

This isn't possible to happen as parse_events__sort_events_and_fix_groups()
will break groups when this happens, adding the warning at the start
of perf of:

  WARNING: events were regrouped to match PMUs

As the previous mixed group warning can never happen, remove the
associated code.

Committer testing:

Before/after:

  acme@five:~$ perf stat -e '{cpu_core/cycles/,cpu_atom/cycles/}' sleep 1
  WARNING: events were regrouped to match PMUs

   Performance counter stats for 'sleep 1':

             424,895      cpu_atom/cycles/u
       <not counted>      cpu_core/cycles/u         (0.00%)

         1.011862314 seconds time elapsed

         0.000000000 seconds user
         0.003166000 seconds sys

  acme@five:~$

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Levi Yun <yeoreum.yun@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250402201549.4090305-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-08 12:46:23 -03:00
Ian Rogers
4b53137721 perf stat: Better hybrid support for the NMI watchdog warning
Prior to this patch evlist__has_hybrid would return false if the
processor wasn't hybrid or the evlist didn't contain any core
events. If the only PMU used by events was cpu_core then it would
true even though there are no cpu_atom events. For example:

```
$ perf stat --cputype=cpu_core -e '{cycles,cycles,cycles,cycles,cycles,cycles,cycles,cycles,cycles}' true

 Performance counter stats for 'true':

     <not counted>      cpu_core/cycles/                                                        (0.00%)
     <not counted>      cpu_core/cycles/                                                        (0.00%)
     <not counted>      cpu_core/cycles/                                                        (0.00%)
     <not counted>      cpu_core/cycles/                                                        (0.00%)
     <not counted>      cpu_core/cycles/                                                        (0.00%)
     <not counted>      cpu_core/cycles/                                                        (0.00%)
     <not counted>      cpu_core/cycles/                                                        (0.00%)
     <not counted>      cpu_core/cycles/                                                        (0.00%)
     <not counted>      cpu_core/cycles/                                                        (0.00%)

       0.001981900 seconds time elapsed

       0.002311000 seconds user
       0.000000000 seconds sys
```

This patch changes evlist__has_hybrid to return true only if the
evlist contains events from >1 core PMU. This means the NMI watchdog
warning is shown for the case above.

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Levi Yun <yeoreum.yun@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250402201549.4090305-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-08 12:21:40 -03:00
Ian Rogers
7900938850 perf trace: Add missing thread__put() in thread__e_machine()
Add missing thread__put() of the found parent thread in
thread__e_machine().

Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250401202715.3493567-1-irogers@google.com
[ split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-08 11:50:44 -03:00
Ian Rogers
8830091383 perf trace: Free the files.max entry in files->table
The files.max is the maximum valid fd in the files array and so
freeing the values needs to be inclusive of the max value.

Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250401202715.3493567-1-irogers@google.com
[ split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-08 11:49:40 -03:00
Arnaldo Carvalho de Melo
8330d092f7 Merge remote-tracking branch 'torvalds/master' into perf-tools-next
To pick up fixes from the latest perf-tools pull request from Namhyung
and get perf-tools-next in line with thinngs in other areas it uses,
like tools/lib/bpf, etc.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-07 12:51:38 -03:00
Ingo Molnar
24035886d7 Linux 6.15-rc5
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCgA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmgX1CgeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGxiIH/A7LHlVatGEQgRFi
 0JALDgcuGTMtMU1qD43rv8Z1GXqTpCAlaBt9D1C9cUH/86MGyBTVRWgVy0wkaU2U
 8QSfFWQIbrdaIzelHtzmAv5IDtb+KrcX1iYGLcMb6ZYaWkv8/CMzMX1nkgxEr1QT
 37Xo3/F17yJumAdNQxdRhVLGy2d3X5rScecpufwh97sMwoddllMCDs2LIoeSAYpG
 376/wzni09G2fADa8MEKqcaMue4qcf0FOo/gOkT8YwFGSZLKa6uumlBLg04QoCt0
 foK2vfcci1q4H4ZbCu3uQESYGLQHY0f2ICDCwC3m25VF9a81TmlbC3MLum3vhmKe
 RtLDcXg=
 =xyaI
 -----END PGP SIGNATURE-----

Merge tag 'v6.15-rc5' into x86/cpu, to resolve conflicts

 Conflicts:
	tools/arch/x86/include/asm/cpufeatures.h

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-05-06 10:00:58 +02:00
Howard Chu
8feafba59c perf test: Add direct off-cpu tests
Since we added --off-cpu-thresh, add tests for when a sample's off-cpu
time is above the threshold, and when it's below the threshold.

Note that the basic test performed in test_offcpu_basic() collects a
direct sample now, since sleep 1 has duration of 1000ms, higher than the
default value of --off-cpu-thresh of 500ms, resulting in a direct
sample.

An example:

  $ sudo perf test offcpu
  124: perf record offcpu profiling tests                              : Ok
  $

Committer testing:

  root@number:~# perf test offcpu
  126: perf record offcpu profiling tests                              : Ok
  root@number:~# perf test -v offcpu
  126: perf record offcpu profiling tests                              : Ok
  root@number:~# perf test -vv offcpu
  126: perf record offcpu profiling tests:
  --- start ---
  test child forked, pid 1410791
  Checking off-cpu privilege
  Basic off-cpu test
  Basic off-cpu test [Success]
  Child task off-cpu test
  Child task off-cpu test [Success]
  Threshold test (above threshold)
  Threshold test (above threshold) [Success]
  Threshold test (below threshold)
  Threshold test (below threshold) [Success]
  ---- end(0) ----
  126: perf record offcpu profiling tests                              : Ok
  root@number:~#

Suggested-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250501022809.449767-11-howardchu95@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-05 21:52:08 -03:00
Howard Chu
9557c00076 perf record --off-cpu: Add --off-cpu-thresh option
Specify the threshold for dumping offcpu samples with --off-cpu-thresh,
the unit is milliseconds. Default value is 500ms.

Example:

  perf record --off-cpu --off-cpu-thresh 824

The example above collects direct off-cpu samples where the off-cpu time
is longer than 824ms.

Committer testing:

After commenting out the end off-cpu dump to have just the ones that are
added right after the task is scheduled back, and using a threshould of
1000ms, we see some periods (the 5th column, just before "offcpu-time"
in the 'perf script' output) that are over 1000.000.000 nanoseconds:

  root@number:~# perf record --off-cpu --off-cpu-thresh 10000
  ^C[ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 3.902 MB perf.data (34335 samples) ]
  root@number:~# perf script
<SNIP>
  Isolated Web Co   59932 [028] 63839.594437: 1000049427 offcpu-time:
             7fe63c7976c2 __syscall_cancel_arch_end+0x0 (/usr/lib64/libc.so.6)
             7fe63c78c04c __futex_abstimed_wait_common+0x7c (/usr/lib64/libc.so.6)
             7fe63c78e928 pthread_cond_timedwait@@GLIBC_2.3.2+0x178 (/usr/lib64/libc.so.6)
             5599974a9fe7 mozilla::detail::ConditionVariableImpl::wait_for(mozilla::detail::MutexImpl&, mozilla::BaseTimeDuration<mozilla::TimeDurationValueCalculator> const&)+0xe7 (/usr/lib64/fir>
                100000000 [unknown] ([unknown])

          swapper       0 [025] 63839.594459:     195724    cycles:P:  ffffffffac328270 read_tsc+0x0 ([kernel.kallsyms])
  Isolated Web Co   59932 [010] 63839.594466: 1000055278 offcpu-time:
             7fe63c7976c2 __syscall_cancel_arch_end+0x0 (/usr/lib64/libc.so.6)
             7fe63c78ba24 __syscall_cancel+0x14 (/usr/lib64/libc.so.6)
             7fe63c804c4e __poll+0x1e (/usr/lib64/libc.so.6)
             7fe633b0d1b8 PollWrapper(_GPollFD*, unsigned int, int) [clone .lto_priv.0]+0xf8 (/usr/lib64/firefox/libxul.so)
                10000002c [unknown] ([unknown])

          swapper       0 [027] 63839.594475:     134433    cycles:P:  ffffffffad4c45d9 irqentry_enter+0x19 ([kernel.kallsyms])
          swapper       0 [028] 63839.594499:     215838    cycles:P:  ffffffffac39199a switch_mm_irqs_off+0x10a ([kernel.kallsyms])
  MediaPD~oder #1 1407676 [027] 63839.594514:     134433    cycles:P:      7f982ef5e69f dct_IV(int*, int, int*)+0x24f (/usr/lib64/libfdk-aac.so.2.0.0)
          swapper       0 [024] 63839.594524:     267411    cycles:P:  ffffffffad4c6ee6 poll_idle+0x56 ([kernel.kallsyms])
  MediaSu~sor #75 1093827 [026] 63839.594555:     332652    cycles:P:      55be753ad030 moz_xmalloc+0x200 (/usr/lib64/firefox/firefox)
          swapper       0 [027] 63839.594616:     160548    cycles:P:  ffffffffad144840 menu_select+0x570 ([kernel.kallsyms])
  Isolated Web Co   14019 [027] 63839.595120: 1000050178 offcpu-time:
             7fc9537cc6c2 __syscall_cancel_arch_end+0x0 (/usr/lib64/libc.so.6)
             7fc9537c104c __futex_abstimed_wait_common+0x7c (/usr/lib64/libc.so.6)
             7fc9537c3928 pthread_cond_timedwait@@GLIBC_2.3.2+0x178 (/usr/lib64/libc.so.6)
             7fc95372a3c8 pt_TimedWait+0xb8 (/usr/lib64/libnspr4.so)
             7fc95372a8d8 PR_WaitCondVar+0x68 (/usr/lib64/libnspr4.so)
             7fc94afb1f7c WatchdogMain(void*)+0xac (/usr/lib64/firefox/libxul.so)
             7fc947498660 [unknown] ([unknown])
             7fc9535fce88 [unknown] ([unknown])
             7fc94b620e60 WatchdogManager::~WatchdogManager()+0x0 (/usr/lib64/firefox/libxul.so)
          fff8548387f8b48 [unknown] ([unknown])

          swapper       0 [003] 63839.595712:     212948    cycles:P:  ffffffffacd5b865 acpi_os_read_port+0x55 ([kernel.kallsyms])
<SNIP>

Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Suggested-by: Ian Rogers <irogers@google.com>
Suggested-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241108204137.2444151-2-howardchu95@gmail.com
Link: https://lore.kernel.org/r/20250501022809.449767-10-howardchu95@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-05 21:51:54 -03:00
Howard Chu
74069a0160 perf record --off-cpu: Dump the remaining PERF_SAMPLE_ in sample_type from BPF's stack trace map
Dump the remaining PERF_SAMPLE_ data, as if it is dumping a direct
sample.

Put the stack trace, tid, off-cpu time and cgroup id into the raw_data
section, just like a direct off-cpu sample coming from BPF's
bpf_perf_event_output().

This ensures that evsel__parse_sample() correctly parses both direct
samples and accumulated samples.

Suggested-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241108204137.2444151-10-howardchu95@gmail.com
Link: https://lore.kernel.org/r/20250501022809.449767-9-howardchu95@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-05 21:51:43 -03:00
Howard Chu
8ae7a5769b perf script: Display off-cpu samples correctly
No PERF_SAMPLE_CALLCHAIN in sample_type, but 'perf script' needs to
display a callchain, have to specify manually.

Also, prefer displaying a callchain:

 gvfs-afc-volume    2267 [001] 3829232.955656: 1001115340 offcpu-time:
            77f05292603f __pselect+0xbf (/usr/lib/x86_64-linux-gnu/libc.so.6)
            77f052a1801c [unknown] (/usr/lib/x86_64-linux-gnu/libusbmuxd-2.0.so.6.0.0)
            77f052a18d45 [unknown] (/usr/lib/x86_64-linux-gnu/libusbmuxd-2.0.so.6.0.0)
            77f05289ca94 start_thread+0x384 (/usr/lib/x86_64-linux-gnu/libc.so.6)
            77f052929c3c clone3+0x2c (/usr/lib/x86_64-linux-gnu/libc.so.6)

to a raw binary BPF output:

  BPF output: 0000: dd 08 00 00 db 08 00 00  <DD>...<DB>...
	  0008: cc ce ab 3b 00 00 00 00  <CC>Ϋ;....
	  0010: 06 00 00 00 00 00 00 00  ........
	  0018: 00 fe ff ff ff ff ff ff  .<FE><FF><FF><FF><FF><FF><FF>
	  0020: 3f 60 92 52 f0 77 00 00  ?`.R<F0>w..
	  0028: 1c 80 a1 52 f0 77 00 00  ..<A1>R<F0>w..
	  0030: 45 8d a1 52 f0 77 00 00  E.<A1>R<F0>w..
	  0038: 94 ca 89 52 f0 77 00 00  .<CA>.R<F0>w..
	  0040: 3c 9c 92 52 f0 77 00 00  <..R<F0>w..
	  0048: 00 00 00 00 00 00 00 00  ........
	  0050: 00 00 00 00              ....

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241108204137.2444151-9-howardchu95@gmail.com
Link: https://lore.kernel.org/r/20250501022809.449767-8-howardchu95@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-05 21:51:31 -03:00
Howard Chu
7de1a87f1e perf record --off-cpu: Disable perf_event's callchain collection
There is a check in evsel.c that does this:

if (evsel__is_offcpu_event(evsel))
	evsel->core.attr.sample_type &= OFFCPU_SAMPLE_TYPES;

This along with:

 #define OFFCPU_SAMPLE_TYPES  (PERF_SAMPLE_IDENTIFIER | PERF_SAMPLE_IP | \
			      PERF_SAMPLE_TID | PERF_SAMPLE_TIME | \
			      PERF_SAMPLE_ID | PERF_SAMPLE_CPU | \
			      PERF_SAMPLE_PERIOD | PERF_SAMPLE_CALLCHAIN | \
			      PERF_SAMPLE_CGROUP)

will tell perf_event to collect callchain.

We don't need the callchain from perf_event when collecting off-cpu
samples, because it's prev's callchain, not next's callchain.

   (perf_event)     (task_storage) (needed)
   prev             next
   |                  |
   ---sched_switch---->

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241108204137.2444151-8-howardchu95@gmail.com
Link: https://lore.kernel.org/r/20250501022809.449767-7-howardchu95@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-05 21:51:12 -03:00
Howard Chu
7f8f56475d perf evsel: Assemble off-cpu samples
Use the data in bpf-output samples, to assemble off-cpu samples.

In evsel__is_offcpu_event(), check if sample_type is PERF_SAMPLE_RAW to
support off-cpu sample data created by an older version of perf.

Testing compatibility on off-cpu samples collected by perf before this patch series:

See below, the sample_type still uses PERF_SAMPLE_CALLCHAIN

$ perf script --header -i ./perf.data.ptn | grep "event : name = offcpu-time"
 # event : name = offcpu-time, , id = { 237917, 237918, 237919, 237920 }, type = 1 (software), size = 136, config = 0xa (PERF_COUNT_SW_BPF_OUTPUT), { sample_period, sample_freq } = 1, sample_type = IP|TID|TIME|CALLCHAIN|CPU|PERIOD|IDENTIFIER, read_format = ID|LOST, disabled = 1, freq = 1, sample_id_all = 1

The output is correct.

  $ perf script -i ./perf.data.ptn | grep offcpu-time
  gmain    2173 [000] 18446744069.414584:  100102015 offcpu-time:
  NetworkManager     901 [000] 18446744069.414584:    5603579 offcpu-time:
  Web Content 1183550 [000] 18446744069.414584:      46278 offcpu-time:
  gnome-control-c 2200559 [000] 18446744069.414584: 11998247014 offcpu-time:
  <SNIP>
  $

And after this patch series:

  $ perf script --header -i ./perf.data.off-cpu-v9 | grep "event : name = offcpu-time"
   # event : name = offcpu-time, , id = { 237959, 237960, 237961, 237962 }, type = 1 (software), size = 136, config = 0xa (PERF_COUNT_SW_BPF_OUTPUT), { sample_period, sample_freq } = 1, sample_type = IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format = ID|LOST, disabled = 1, freq = 1, sample_id_all = 1

  $ ./perf script -i ./perf.data.off-cpu-v9 | grep offcpu-time
  gnome-shell    1875 [001] 4789616.361225:  100097057 offcpu-time:
  gnome-shell    1875 [001] 4789616.461419:  100107463 offcpu-time:
      firefox 2206821 [002] 4789616.475690:  255257245 offcpu-time:
  $

Committer testing:

The command to record those samples:

  root@number:~# perf record --off-cpu -a sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 2.092 MB perf.data (1552 samples) ]
  root@number:~#

Then, before this patch series, the sample_type for the "offcpu-time" event is:

  root@number:~# perf evlist -v | grep offcpu-time
  offcpu-time: type: 1 (PERF_TYPE_SOFTWARE), size: 136, config: 0xa (PERF_COUNT_SW_BPF_OUTPUT), { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CALLCHAIN|CPU|PERIOD|IDENTIFIER, read_format: ID|LOST, disabled: 1, freq: 1, sample_id_all: 1
  root@number:~#

And after it, after recording it again:

  root@number:~# perf record --off-cpu -a sleep 1 ; perf evlist -v | grep offcpu-time
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 2.151 MB perf.data (2843 samples) ]
  offcpu-time: type: 1 (PERF_TYPE_SOFTWARE), size: 136, config: 0xa (PERF_COUNT_SW_BPF_OUTPUT), { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|IDENTIFIER, read_format: ID|LOST, disabled: 1, sample_id_all: 1
  root@number:~#

Suggested-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241108204137.2444151-7-howardchu95@gmail.com
Link: https://lore.kernel.org/r/20250501022809.449767-6-howardchu95@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-05 21:49:08 -03:00
Howard Chu
d6948f2af2 perf record --off-cpu: Dump off-cpu samples in BPF
Collect tid, period, callchain, and cgroup id and dump them when off-cpu
time threshold is reached.

We don't collect the off-cpu time twice (the delta), it's either in
direct samples, or accumulated samples that are dumped at the end of
perf.data.

Suggested-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241108204137.2444151-6-howardchu95@gmail.com
Link: https://lore.kernel.org/r/20250501022809.449767-5-howardchu95@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-05 21:48:44 -03:00
Howard Chu
282c195906 perf record --off-cpu: Preparation of off-cpu BPF program
Set the perf_event map in BPF for dumping off-cpu samples, and set the
offcpu_thresh to specify the threshold.

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241108204137.2444151-5-howardchu95@gmail.com
Link: https://lore.kernel.org/r/20250501022809.449767-4-howardchu95@gmail.com
[ Added some missing iteration variables to off_cpu_config() and fixed up
  a manually edited patch hunk line boundary line ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-05 21:48:20 -03:00
Howard Chu
0f72027bb9 perf record --off-cpu: Parse off-cpu event
Parse the off-cpu event using parse_event(), as bpf-output.

Call evlist__enable_evsel() on off-cpu event. This fixes the inability
to collect direct off-cpu samples on a workload, as reported by Arnaldo
Carvalho de Melo <acme@redhat.com>.

The reason being, workload sets enable_on_exec instead of calling
evlist__enable(), but off-cpu event does not attach to an executable and
execve won't be called, so the fds from perf_event_open() are not
enabled.

no-inherit should be set to 1, here's the reason:

We update the BPF perf_event map for direct off-cpu sample dumping (in
following patches), it executes as follows:

bpf_map_update_value()
 bpf_fd_array_map_update_elem()
  perf_event_fd_array_get_ptr()
   perf_event_read_local()

In perf_event_read_local(), there is:

int perf_event_read_local(struct perf_event *event, u64 *value,
			  u64 *enabled, u64 *running)
{
...
	/*
	 * It must not be an event with inherit set, we cannot read
	 * all child counters from atomic context.
	 */
	if (event->attr.inherit) {
		ret = -EOPNOTSUPP;
		goto out;
	}

Which means no-inherit has to be true for updating the BPF perf_event
map.

Moreover, for bpf-output events, we primarily want a system-wide event
instead of a per-task event.

The reason is that in BPF's bpf_perf_event_output(), BPF uses the CPU
index to retrieve the perf_event file descriptor it outputs to.

Making a bpf-output event system-wide naturally satisfies this
requirement by mapping CPU appropriately.

Suggested-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241108204137.2444151-4-howardchu95@gmail.com
Link: https://lore.kernel.org/r/20250501022809.449767-3-howardchu95@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-05 21:48:02 -03:00
Howard Chu
671e943452 perf evsel: Expose evsel__is_offcpu_event() for future use
Expose evsel__is_offcpu_event() so it can be used in off_cpu_config(),
evsel__parse_sample() and 'perf script'.

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241108204137.2444151-3-howardchu95@gmail.com
Link: https://lore.kernel.org/r/20250501022809.449767-2-howardchu95@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-05 21:47:39 -03:00
Sebastian Andrzej Siewior
60035a3981 tools/perf: Allow to select the number of hash buckets
Add the -b/ --buckets argument to specify the number of hash buckets for
the private futex hash. This is directly passed to
    prctl(PR_FUTEX_HASH, PR_FUTEX_HASH_SET_SLOTS, buckets, immutable)

and must return without an error if specified. The `immutable' is 0 by
default and can be set to 1 via the -I/ --immutable argument.
The size of the private hash is verified with PR_FUTEX_HASH_GET_SLOTS.
If PR_FUTEX_HASH_GET_SLOTS failed then it is assumed that an older
kernel was used without the support and that the global hash is used.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250416162921.513656-20-bigeasy@linutronix.de
2025-05-03 12:02:10 +02:00
Ian Rogers
fa9c4977fb perf symbol-minimal: Fix double free in filename__read_build_id
Running the "perf script task-analyzer tests" with address sanitizer
showed a double free:
```
FAIL: "test_csv_extended_times" Error message: "Failed to find required string:'Out-Out;'."
=================================================================
==19190==ERROR: AddressSanitizer: attempting double-free on 0x50b000017b10 in thread T0:
    #0 0x55da9601c78a in free (perf+0x26078a) (BuildId: e7ef50e08970f017a96fde6101c5e2491acc674a)
    #1 0x55da96640c63 in filename__read_build_id tools/perf/util/symbol-minimal.c:221:2

0x50b000017b10 is located 0 bytes inside of 112-byte region [0x50b000017b10,0x50b000017b80)
freed by thread T0 here:
    #0 0x55da9601ce40 in realloc (perf+0x260e40) (BuildId: e7ef50e08970f017a96fde6101c5e2491acc674a)
    #1 0x55da96640ad6 in filename__read_build_id tools/perf/util/symbol-minimal.c:204:10

previously allocated by thread T0 here:
    #0 0x55da9601ca23 in malloc (perf+0x260a23) (BuildId: e7ef50e08970f017a96fde6101c5e2491acc674a)
    #1 0x55da966407e7 in filename__read_build_id tools/perf/util/symbol-minimal.c:181:9

SUMMARY: AddressSanitizer: double-free (perf+0x26078a) (BuildId: e7ef50e08970f017a96fde6101c5e2491acc674a) in free
==19190==ABORTING
FAIL: "invocation of perf script report task-analyzer --csv-summary csvsummary --summary-extended command failed" Error message: ""
FAIL: "test_csvsummary_extended" Error message: "Failed to find required string:'Out-Out;'."
---- end(-1) ----
132: perf script task-analyzer tests                                 : FAILED!
```

The buf_size if always set to phdr->p_filesz, but that may be 0
causing a free and realloc to return NULL. This is treated in
filename__read_build_id like a failure and the buffer is freed again.

To avoid this problem only grow buf, meaning the buf_size will never
be 0. This also reduces the number of memory (re)allocations.

Fixes: b691f64360 ("perf symbols: Implement poor man's ELF parser")
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250501070003.22251-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-02 15:36:14 -03:00
Namhyung Kim
f7458176a7 perf mem: Add 'dtlb' output field
This is a breakdown of perf_mem_data_src.mem_dtlb values.  It assumes
PMU drivers would set PERF_MEM_TLB_HIT bit with an appropriate level.

And having PERF_MEM_TLB_MISS means that it failed to find one in any
levels of TLB.  For now, it doesn't use PERF_MEM_TLB_{WK,OS} bits.

Also it seems Intel machines don't distinguish L1 or L2 precisely.  So I
added ANY_HIT (printed as "L?-Hit") to handle the case.

  $ perf mem report -F overhead,dtlb,dso --stdio
  ...
  #           --- D-TLB ----
  # Overhead   L?-Hit   Miss  Shared Object
  # ........  ..............  .................
  #
      67.03%    99.5%   0.5%  [unknown]
      31.23%    99.2%   0.8%  [kernel.kallsyms]
       1.08%    97.8%   2.2%  [i915]
       0.36%   100.0%   0.0%  [JIT] tid 6853
       0.12%   100.0%   0.0%  [drm]
       0.05%   100.0%   0.0%  [drm_kms_helper]
       0.05%   100.0%   0.0%  [ext4]
       0.02%   100.0%   0.0%  [aesni_intel]
       0.02%   100.0%   0.0%  [crc32c_intel]
       0.02%   100.0%   0.0%  [dm_crypt]
       ...

Committer testing:

  # perf report --header | grep cpudesc
  # cpudesc : AMD Ryzen 9 9950X3D 16-Core Processor
  # perf mem report -F overhead,dtlb,dso --stdio | head -20
  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 2K of event 'cycles:P'
  # Total weight : 2637
  # Sort order   : local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked,blocked,local_ins_lat,local_p_stage_cyc
  #
  #           ---------- D-TLB -----------
  # Overhead   L1-Hit L2-Hit   Miss  Other  Shared Object
  # ........  ............................  .................................
  #
      77.47%    18.4%   0.1%   0.6%  80.9%  [kernel.kallsyms]
       5.61%    36.5%   0.7%   1.4%  61.5%  libxul.so
       2.77%    39.7%   0.0%  12.3%  47.9%  libc.so.6
       2.01%    34.0%   1.9%   1.9%  62.3%  libglib-2.0.so.0.8400.1
       1.93%    31.4%   2.0%   2.0%  64.7%  [amdgpu]
       1.63%    48.8%   0.0%   0.0%  51.2%  [JIT] tid 60168
       1.14%     3.3%   0.0%   0.0%  96.7%  [vdso]
  #

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250430205548.789750-12-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-02 15:36:14 -03:00
Namhyung Kim
5e424a0178 perf mem: Add 'snoop' output field
This is a breakdown of perf_mem_data_src.mem_snoop values.  For now, it
doesn't use mem_snoopx values like FWD and PEER.

  $ perf mem report -F overhead,snoop,comm --stdio
  ...
  #           ---------- Snoop -----------
  # Overhead      Hit   HitM   Miss  Other  Command
  # ........  ............................  ...............
  #
      34.24%     0.6%   0.0%   0.0%  99.4%  gnome-shell
      12.02%     1.0%   0.0%   0.0%  99.0%  chrome
       9.32%     1.0%   0.0%   0.3%  98.7%  Isolated Web Co
       6.85%     1.0%   0.3%   0.0%  98.6%  swapper
       6.30%     0.8%   0.8%   0.0%  98.5%  Xorg
       3.02%     2.4%   0.0%   0.0%  97.6%  VizCompositorTh
       2.35%     0.0%   0.0%   0.0% 100.0%  firefox-esr
       2.04%     0.0%   0.0%   0.0% 100.0%  JS Helper
       1.51%     3.2%   0.0%   0.0%  96.8%  threaded-ml
       1.44%     0.0%   0.0%   0.0% 100.0%  AudioIP~allback
       ...

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250430205548.789750-11-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-02 15:36:14 -03:00
Namhyung Kim
abe4dc24a8 perf mem: Add 'cache' and 'memory' output fields
This is a breakdown of perf_mem_data_src.mem_lvl_num.  But it's also
divided into two parts because the combination is bigger than 8.

Since there are many entries for different cache levels, 'cache' field
focuses on them.  I generalized buffers like LFB, MAB and MHB to L1-buf
and L2-buf.

The rest goes to 'memory' field which can be RAM, CXL, PMEM, IO, etc.

  $ perf mem report -F cache,mem,dso --stdio
  ...
  #
  # -------------- Cache --------------  --- Memory ---
  #      L1     L2     L3 L1-buf  Other      RAM  Other  Shared Object
  # ...................................  ..............  ....................................
  #
      53.9%   3.6%  16.2%  21.6%   4.8%     4.8%  95.2%  [kernel.kallsyms]
      64.7%   1.7%   3.5%  17.4%  12.8%    12.8%  87.2%  chrome (deleted)
      78.3%   2.8%   0.0%   1.0%  17.9%    17.9%  82.1%  libc.so.6
      39.6%   1.5%   0.0%   5.7%  53.2%    53.2%  46.8%  libxul.so
      26.2%   0.0%   0.0%   0.0%  73.8%    73.8%  26.2%  [unknown]
      85.5%   0.0%   0.0%  14.5%   0.0%     0.0% 100.0%  libspa-audioconvert.so
      66.3%   4.4%   0.0%  29.4%   0.0%     0.0% 100.0%  libglib-2.0.so.0.8200.1 (deleted)
       1.9%   0.0%   0.0%   0.0%  98.1%    98.1%   1.9%  libmutter-cogl-15.so.0.0.0 (deleted)
      10.6%   0.0%   0.0%  89.4%   0.0%     0.0% 100.0%  libpulsecommon-16.1.so
       0.0%   0.0%   0.0% 100.0%   0.0%     0.0% 100.0%  libfreeblpriv3.so (deleted)
       ...

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250430205548.789750-10-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-02 15:36:14 -03:00
Namhyung Kim
225772c17c perf hist: Hide unused mem stat columns
Some mem_stat types don't use all 8 columns.  And there are cases only
samples in certain kinds of mem_stat types are available only.  For that
case hide columns which has no samples.

The new output for the previous data would be:

  $ perf mem report -F overhead,op,comm --stdio
  ...
  #           ------ Mem Op -------
  # Overhead     Load  Store  Other  Command
  # ........  .....................  ...............
  #
      44.85%    21.1%  30.7%  48.3%  swapper
      26.82%    98.8%   0.3%   0.9%  netsli-prober
       7.19%    51.7%  13.7%  34.6%  perf
       5.81%    89.7%   2.2%   8.1%  qemu-system-ppc
       4.77%   100.0%   0.0%   0.0%  notifications_c
       1.77%    95.9%   1.2%   3.0%  MemoryReleaser
       0.77%    71.6%   4.1%  24.3%  DefaultEventMan
       0.19%    66.7%  22.2%  11.1%  gnome-shell
       ...

On Intel machines, the event is only for loads or stores so it'll have
only one column:

  #            Mem Op
  # Overhead     Load  Command
  # ........  .......  ...............
  #
      20.55%   100.0%  swapper
      17.13%   100.0%  chrome
       9.02%   100.0%  data-loop.0
       6.26%   100.0%  pipewire-pulse
       5.63%   100.0%  threaded-ml
       5.47%   100.0%  GraphRunner
       5.37%   100.0%  AudioIP~allback
       5.30%   100.0%  Chrome_ChildIOT
       3.17%   100.0%  Isolated Web Co
       ...

Committer testing:

  # grep "model name" -m1 /proc/cpuinfo
  model name	: AMD Ryzen 9 9950X3D 16-Core Processo
  # perf mem report -F overhead,op,comm --stdio
  # Total Lost Samples: 0
  #
  # Samples: 2K of event 'cycles:P'
  # Total weight : 2637
  # Sort order   : local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked,blocked,local_ins_lat,local_p_stage_cyc
  #
  #           ------ Mem Op -------
  # Overhead     Load  Store  Other  Command
  # ........  .....................  ...............
  #
      61.02%    14.4%  25.5%  60.1%  swapper
       5.61%    26.4%  13.5%  60.1%  Isolated Web Co
       5.50%    21.4%  29.7%  49.0%  perf
       4.74%    27.2%  15.2%  57.6%  gnome-shell
       4.63%    33.6%  11.5%  54.9%  mdns_service
       4.29%    28.3%  12.4%  59.3%  ptyxis
       2.16%    24.6%  19.3%  56.1%  DOM Worker
       0.99%    23.1%  34.6%  42.3%  firefox
       0.72%    26.3%  15.8%  57.9%  IPC I/O Parent
       0.61%    12.5%  12.5%  75.0%  kworker/u130:20
       0.61%    37.5%  18.8%  43.8%  podman
       0.57%    33.3%   6.7%  60.0%  Timer
       0.53%    14.3%   7.1%  78.6%  KMS thread
       0.49%    30.8%   7.7%  61.5%  kworker/u130:3-
       0.46%    41.7%  33.3%  25.0%  IPDL Background

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250430205548.789750-9-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-02 15:36:14 -03:00
Namhyung Kim
1e6569dca5 perf mem: Add 'op' output field
This is an actual example of the he_mem_stat based sample breakdown.  It
uses 'mem_op' field of union perf_mem_data_src which means memory
operations.

It'd have basically 'load' or 'store' which can be useful if PMU doesn't
have separate events for them like IBS or SPE.  In addition, there's an
entry in case load and store happen at the same time.  Also adds entries
for prefetching and execution.

  $ perf mem report -F +op -s comm --stdio
  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 4K of event 'ibs_op//'
  # Total weight : 9559
  # Sort order   : comm
  #
  #                   --------------------- Mem Op ----------------------
  # Overhead  Samples   Load  Store Ld+St Pfetch  Exec  Other   N/A   N/A  Command
  # ........  ....... ...................................................  ...............
  #
      44.85%     4077  21.1%  30.7%  0.0%   0.0%  0.0%  48.3%  0.0%  0.0%  swapper
      26.82%       45  98.8%   0.3%  0.0%   0.0%  0.0%   0.9%  0.0%  0.0%  netsli-prober
       7.19%      442  51.7%  13.7%  0.0%   0.0%  0.0%  34.6%  0.0%  0.0%  perf
       5.81%       75  89.7%   2.2%  0.0%   0.0%  0.0%   8.1%  0.0%  0.0%  qemu-system-ppc
       4.77%        1 100.0%   0.0%  0.0%   0.0%  0.0%   0.0%  0.0%  0.0%  notifications_c
       1.77%       10  95.9%   1.2%  0.0%   0.0%  0.0%   3.0%  0.0%  0.0%  MemoryReleaser
       0.77%       32  71.6%   4.1%  0.0%   0.0%  0.0%  24.3%  0.0%  0.0%  DefaultEventMan
       0.19%       10  66.7%  22.2%  0.0%   0.0%  0.0%  11.1%  0.0%  0.0%  gnome-shell

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250430205548.789750-8-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-02 15:36:14 -03:00
Namhyung Kim
b1fc83ca43 perf hist: Implement output fields for mem stats
This is a preparation for later changes to support mem_stat output.  The
new fields will need two lines for the header - the first line will show
type of mem stat and the second line will show the name of each item
which is returned by mem_stat_name().

Each element in the mem_stat array will be printed in percentage for the
hist_entry and their sum would be 100%.

Add new output field dimension only for SORT_MODE__MEM using mem_stat.

To handle possible name conflict with existing sort keys, move the order
of checking output field dimensions after the sort dimensions when it
looks for sort keys.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250430205548.789750-7-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-02 15:36:14 -03:00
Namhyung Kim
9fcb43e27c perf hist: Basic support for mem_stat accounting
Add a logic to account he->mem_stat based on mem_stat_type in hists.

Each mem_stat entry will have different meaning based on the type so the
index in the array is calculated at runtime using the corresponding
value in the sample.data_src.

Still hists has no mem_stat_types yet so this code won't work for now.

Later hists->mem_stat_types will be allocated based on what users want
in the output actually.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250430205548.789750-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-02 15:36:14 -03:00
Namhyung Kim
930d4c45c6 perf hist: Add struct he_mem_stat
The 'struct he_mem_stat' is to save detailed information about memory
instruction.  It'll be used to show breakdown of various data from
PERF_SAMPLE_DATA_SRC.  Note that this structure is generic and the
contents will be different depending on actual data it'll use later.

The information about the actual data will be saved in 'struct hists'
and its length is in nr_mem_stats.  This commit just adds ground works
and does nothing since hists->nr_mem_stats is 0 for now.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250430205548.789750-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-02 15:36:14 -03:00
Namhyung Kim
29e6392ec3 perf hist: Support multi-line header
This is a preparation to support multi-line headers in 'perf mem report'.

Normal sort keys and output fields that don't have contents for multi-
line will print the header string at the last line only.

As we don't use multi-line headers normally, it should not have any
changes in the output.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250430205548.789750-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-02 15:36:14 -03:00
Namhyung Kim
43a6446998 perf record: Add --sample-mem-info option
There's no way to enable PERF_SAMPLE_DATA_SRC without PERF_SAMPLE_ADDR
which brings a lot of overhead due to the number of MMAP[2] records.

Let's add a new option to enable this information separately.

Committer testing:

  # perf record -a --sample-mem-info
  ^C[ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 1.815 MB perf.data (2637 samples) ]
  #
  # perf evlist -v
  cycles:P: type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0 (PERF_COUNT_HW_CPU_CYCLES), { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD|IDENTIFIER|DATA_SRC, read_format: ID|LOST, disabled: 1, freq: 1, precise_ip: 2, sample_id_all: 1
  dummy:u: type: 1 (PERF_TYPE_SOFTWARE), size: 136, config: 0x9 (PERF_COUNT_SW_DUMMY), { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|IDENTIFIER|DATA_SRC, read_format: ID|LOST, exclude_kernel: 1, exclude_hv: 1, mmap: 1, comm: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
  #
  # perf report -D |& grep -w PERF_RECORD_SAMPLE -A3 -m1
  0 44675164447282 0x1a7590 [0x40]: PERF_RECORD_SAMPLE(IP, 0x4001): 107299/107299: 0xffffffffac4a5e11 period: 144 addr: 0
   . data_src: 0x229080142
   ... thread: perf:107299
   ...... dso: /lib/modules/6.15.0-rc4+/build/vmlinux
  #

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250430205548.789750-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-02 15:36:14 -03:00
Namhyung Kim
3761e7fe98 perf hist: Remove output field from sort-list properly
When it removes an output format for cancelled children or latency, it
should delete itself from the sort list as well.  Otherwise assertion
in fmt_free() will fire.

  $ perf report -H --stdio
  perf: ui/hist.c:603: fmt_free: Assertion `!(!list_empty(&fmt->sort_list))' failed.
  Aborted (core dumped)

Also convert to perf_hpp__column_unregister() for the same open codes.

Committer notes:

Before this patch:

  # perf test hierarchy
   83: perf report --hierarchy                                         : FAILED!
  # perf test -v hierarchy
  --- start ---
  test child forked, pid 102242
  perf report --hierarchy
  Linux
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.025 MB /tmp/perf-test-report.HX0N85TlPq/perf-report-hierarchy-perf.data (6 samples) ]
  perf: ui/hist.c:603: fmt_free: Assertion `!(!list_empty(&fmt->sort_list))' failed.
  /home/acme/libexec/perf-core/tests/shell/perf-report-hierarchy.sh: line 34: 102250 Aborted                 (core dumped) perf report --hierarchy > /dev/null
  --- Cleaning up ---
  ---- end(-1) ----
   83: perf report --hierarchy                                         : FAILED!
  #

After:

  # perf test hierarchy
   83: perf report --hierarchy                                         : Ok
  #

Fixes: dbd11b6bda ("perf hist: Remove formats in hierarchy when cancel children")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250430180321.736939-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-02 15:36:14 -03:00
Arnaldo Carvalho de Melo
bb5ae52e53 perf test perf-report-hierarchy: Add new test
Super simple test to check that at least we're not segfaulting when
trying to use 'perf report --hierarchy', more subtests should be added
to make sure the output is the expected one.

This is being merged right before a fix for that that this test detects:

  # perf test hierarchy
   83: perf report --hierarchy                                         : FAILED!
  # perf test -v hierarchy
  --- start ---
  test child forked, pid 102242
  perf report --hierarchy
  Linux
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.025 MB /tmp/perf-test-report.HX0N85TlPq/perf-report-hierarchy-perf.data (6 samples) ]
  perf: ui/hist.c:603: fmt_free: Assertion `!(!list_empty(&fmt->sort_list))' failed.
  /home/acme/libexec/perf-core/tests/shell/perf-report-hierarchy.sh: line 34: 102250 Aborted                 (core dumped) perf report --hierarchy > /dev/null
  --- Cleaning up ---
  ---- end(-1) ----
   83: perf report --hierarchy                                         : FAILED!
  #

Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/lkml/20250430180321.736939-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-02 15:36:01 -03:00
Ravi Bangoria
35db59fa8e perf test amd ibs: Add sample period unit test
IBS Fetch and IBS Op PMUs has various constraints on supported sample
periods. Add perf unit tests to test those.

Running it in parallel with other tests causes intermittent failures.
Mark it exclusive to force it to run sequentially. Sample output on a
Zen5 machine:

Without kernel fixes:

  $ sudo ./perf test -vv 112
  112: AMD IBS sample period:
  --- start ---
  test child forked, pid 8774
  Using CPUID AuthenticAMD-26-2-1

  IBS config tests:
  -----------------
  Fetch PMU tests:
  0xffff            : Ok   (nr samples: 1078)
  0x1000            : Ok   (nr samples: 17030)
  0xff              : Ok   (nr samples: 41068)
  0x1               : Ok   (nr samples: 40543)
  0x0               : Ok
  0x10000           : Ok
  Op PMU tests:
  0x0               : Ok
  0x1               : Fail
  0x8               : Fail
  0x9               : Ok   (nr samples: 40543)
  0xf               : Ok   (nr samples: 40543)
  0x1000            : Ok   (nr samples: 18736)
  0xffff            : Ok   (nr samples: 1168)
  0x10000           : Ok
  0x100000          : Fail (nr samples: 14)
  0xf00000          : Fail (nr samples: 1)
  0xf0ffff          : Fail (nr samples: 1)
  0x1f0ffff         : Fail (nr samples: 1)
  0x7f0ffff         : Fail (nr samples: 0)
  0x8f0ffff         : Ok
  0x17f0ffff        : Ok

  IBS sample period constraint tests:
  -----------------------------------
  Fetch PMU test:
  freq 0, sample_freq         0: Ok
  freq 0, sample_freq         1: Fail
  freq 0, sample_freq        15: Fail
  freq 0, sample_freq        16: Ok   (nr samples: 1604)
  freq 0, sample_freq        17: Ok   (nr samples: 1604)
  freq 0, sample_freq       143: Ok   (nr samples: 1604)
  freq 0, sample_freq       144: Ok   (nr samples: 1604)
  freq 0, sample_freq       145: Ok   (nr samples: 1604)
  freq 0, sample_freq      1234: Ok   (nr samples: 1566)
  freq 0, sample_freq      4103: Ok   (nr samples: 1119)
  freq 0, sample_freq     65520: Ok   (nr samples: 2264)
  freq 0, sample_freq     65535: Ok   (nr samples: 2263)
  freq 0, sample_freq     65552: Ok   (nr samples: 1166)
  freq 0, sample_freq   8388607: Ok   (nr samples: 268)
  freq 0, sample_freq 268435455: Ok   (nr samples: 8)
  freq 1, sample_freq         0: Ok
  freq 1, sample_freq         1: Ok   (nr samples: 4)
  freq 1, sample_freq        15: Ok   (nr samples: 4)
  freq 1, sample_freq        16: Ok   (nr samples: 4)
  freq 1, sample_freq        17: Ok   (nr samples: 4)
  freq 1, sample_freq       143: Ok   (nr samples: 5)
  freq 1, sample_freq       144: Ok   (nr samples: 5)
  freq 1, sample_freq       145: Ok   (nr samples: 5)
  freq 1, sample_freq      1234: Ok   (nr samples: 7)
  freq 1, sample_freq      4103: Ok   (nr samples: 35)
  freq 1, sample_freq     65520: Ok   (nr samples: 642)
  freq 1, sample_freq     65535: Ok   (nr samples: 636)
  freq 1, sample_freq     65552: Ok   (nr samples: 651)
  freq 1, sample_freq   8388607: Ok
  Op PMU test:
  freq 0, sample_freq         0: Ok
  freq 0, sample_freq         1: Fail
  freq 0, sample_freq        15: Fail
  freq 0, sample_freq        16: Fail
  freq 0, sample_freq        17: Fail
  freq 0, sample_freq       143: Fail
  freq 0, sample_freq       144: Ok   (nr samples: 1604)
  freq 0, sample_freq       145: Ok   (nr samples: 1604)
  freq 0, sample_freq      1234: Ok   (nr samples: 1604)
  freq 0, sample_freq      4103: Ok   (nr samples: 1604)
  freq 0, sample_freq     65520: Ok   (nr samples: 2227)
  freq 0, sample_freq     65535: Ok   (nr samples: 2296)
  freq 0, sample_freq     65552: Ok   (nr samples: 2213)
  freq 0, sample_freq   8388607: Ok   (nr samples: 250)
  freq 0, sample_freq 268435455: Ok   (nr samples: 8)
  freq 1, sample_freq         0: Ok
  freq 1, sample_freq         1: Fail (nr samples: 4)
  freq 1, sample_freq        15: Fail (nr samples: 4)
  freq 1, sample_freq        16: Fail (nr samples: 4)
  freq 1, sample_freq        17: Fail (nr samples: 4)
  freq 1, sample_freq       143: Fail (nr samples: 5)
  freq 1, sample_freq       144: Fail (nr samples: 5)
  freq 1, sample_freq       145: Fail (nr samples: 5)
  freq 1, sample_freq      1234: Fail (nr samples: 8)
  freq 1, sample_freq      4103: Fail (nr samples: 33)
  freq 1, sample_freq     65520: Fail (nr samples: 546)
  freq 1, sample_freq     65535: Fail (nr samples: 544)
  freq 1, sample_freq     65552: Fail (nr samples: 555)
  freq 1, sample_freq   8388607: Ok

  IBS ioctl() tests:
  ------------------
  Fetch PMU tests
  ioctl(period = 0x0      ): Ok
  ioctl(period = 0x1      ): Fail
  ioctl(period = 0xf      ): Fail
  ioctl(period = 0x10     ): Ok
  ioctl(period = 0x11     ): Fail
  ioctl(period = 0x1f     ): Fail
  ioctl(period = 0x20     ): Ok
  ioctl(period = 0x80     ): Ok
  ioctl(period = 0x8f     ): Fail
  ioctl(period = 0x90     ): Ok
  ioctl(period = 0x91     ): Fail
  ioctl(period = 0x100    ): Ok
  ioctl(period = 0xfff0   ): Ok
  ioctl(period = 0xffff   ): Fail
  ioctl(period = 0x10000  ): Ok
  ioctl(period = 0x1fff0  ): Ok
  ioctl(period = 0x1fff5  ): Fail
  ioctl(freq   = 0x0      ): Ok
  ioctl(freq   = 0x1      ): Ok
  ioctl(freq   = 0xf      ): Ok
  ioctl(freq   = 0x10     ): Ok
  ioctl(freq   = 0x11     ): Ok
  ioctl(freq   = 0x1f     ): Ok
  ioctl(freq   = 0x20     ): Ok
  ioctl(freq   = 0x80     ): Ok
  ioctl(freq   = 0x8f     ): Ok
  ioctl(freq   = 0x90     ): Ok
  ioctl(freq   = 0x91     ): Ok
  ioctl(freq   = 0x100    ): Ok
  Op PMU tests
  ioctl(period = 0x0      ): Ok
  ioctl(period = 0x1      ): Fail
  ioctl(period = 0xf      ): Fail
  ioctl(period = 0x10     ): Fail
  ioctl(period = 0x11     ): Fail
  ioctl(period = 0x1f     ): Fail
  ioctl(period = 0x20     ): Fail
  ioctl(period = 0x80     ): Fail
  ioctl(period = 0x8f     ): Fail
  ioctl(period = 0x90     ): Ok
  ioctl(period = 0x91     ): Fail
  ioctl(period = 0x100    ): Ok
  ioctl(period = 0xfff0   ): Ok
  ioctl(period = 0xffff   ): Fail
  ioctl(period = 0x10000  ): Ok
  ioctl(period = 0x1fff0  ): Ok
  ioctl(period = 0x1fff5  ): Fail
  ioctl(freq   = 0x0      ): Ok
  ioctl(freq   = 0x1      ): Ok
  ioctl(freq   = 0xf      ): Ok
  ioctl(freq   = 0x10     ): Ok
  ioctl(freq   = 0x11     ): Ok
  ioctl(freq   = 0x1f     ): Ok
  ioctl(freq   = 0x20     ): Ok
  ioctl(freq   = 0x80     ): Ok
  ioctl(freq   = 0x8f     ): Ok
  ioctl(freq   = 0x90     ): Ok
  ioctl(freq   = 0x91     ): Ok
  ioctl(freq   = 0x100    ): Ok

  IBS freq (negative) tests:
  --------------------------
  freq 1, sample_freq 200000: Fail

  IBS L3MissOnly test: (takes a while)
  --------------------
  Fetch L3MissOnly: Fail (nr_samples: 1213)
  Op L3MissOnly:    Ok   (nr_samples: 1193)
  ---- end(-1) ----
  112: AMD IBS sample period                                           : FAILED!

With kernel fixes:

  $ sudo ./perf test -vv 112
  112: AMD IBS sample period:
  --- start ---
  test child forked, pid 6939
  Using CPUID AuthenticAMD-26-2-1

  IBS config tests:
  -----------------
  Fetch PMU tests:
  0xffff            : Ok   (nr samples: 969)
  0x1000            : Ok   (nr samples: 15540)
  0xff              : Ok   (nr samples: 40555)
  0x1               : Ok   (nr samples: 40543)
  0x0               : Ok
  0x10000           : Ok
  Op PMU tests:
  0x0               : Ok
  0x1               : Ok
  0x8               : Ok
  0x9               : Ok   (nr samples: 40543)
  0xf               : Ok   (nr samples: 40543)
  0x1000            : Ok   (nr samples: 19156)
  0xffff            : Ok   (nr samples: 1169)
  0x10000           : Ok
  0x100000          : Ok   (nr samples: 1151)
  0xf00000          : Ok   (nr samples: 76)
  0xf0ffff          : Ok   (nr samples: 73)
  0x1f0ffff         : Ok   (nr samples: 33)
  0x7f0ffff         : Ok   (nr samples: 10)
  0x8f0ffff         : Ok
  0x17f0ffff        : Ok

  IBS sample period constraint tests:
  -----------------------------------
  Fetch PMU test:
  freq 0, sample_freq         0: Ok
  freq 0, sample_freq         1: Ok
  freq 0, sample_freq        15: Ok
  freq 0, sample_freq        16: Ok   (nr samples: 1203)
  freq 0, sample_freq        17: Ok   (nr samples: 1604)
  freq 0, sample_freq       143: Ok   (nr samples: 1604)
  freq 0, sample_freq       144: Ok   (nr samples: 1604)
  freq 0, sample_freq       145: Ok   (nr samples: 1604)
  freq 0, sample_freq      1234: Ok   (nr samples: 1604)
  freq 0, sample_freq      4103: Ok   (nr samples: 1343)
  freq 0, sample_freq     65520: Ok   (nr samples: 2254)
  freq 0, sample_freq     65535: Ok   (nr samples: 2136)
  freq 0, sample_freq     65552: Ok   (nr samples: 1158)
  freq 0, sample_freq   8388607: Ok   (nr samples: 257)
  freq 0, sample_freq 268435455: Ok   (nr samples: 8)
  freq 1, sample_freq         0: Ok
  freq 1, sample_freq         1: Ok   (nr samples: 4)
  freq 1, sample_freq        15: Ok   (nr samples: 4)
  freq 1, sample_freq        16: Ok   (nr samples: 4)
  freq 1, sample_freq        17: Ok   (nr samples: 4)
  freq 1, sample_freq       143: Ok   (nr samples: 5)
  freq 1, sample_freq       144: Ok   (nr samples: 5)
  freq 1, sample_freq       145: Ok   (nr samples: 5)
  freq 1, sample_freq      1234: Ok   (nr samples: 8)
  freq 1, sample_freq      4103: Ok   (nr samples: 34)
  freq 1, sample_freq     65520: Ok   (nr samples: 458)
  freq 1, sample_freq     65535: Ok   (nr samples: 628)
  freq 1, sample_freq     65552: Ok   (nr samples: 396)
  freq 1, sample_freq   8388607: Ok
  Op PMU test:
  freq 0, sample_freq         0: Ok
  freq 0, sample_freq         1: Ok
  freq 0, sample_freq        15: Ok
  freq 0, sample_freq        16: Ok
  freq 0, sample_freq        17: Ok
  freq 0, sample_freq       143: Ok
  freq 0, sample_freq       144: Ok   (nr samples: 1604)
  freq 0, sample_freq       145: Ok   (nr samples: 1604)
  freq 0, sample_freq      1234: Ok   (nr samples: 1604)
  freq 0, sample_freq      4103: Ok   (nr samples: 1604)
  freq 0, sample_freq     65520: Ok   (nr samples: 2250)
  freq 0, sample_freq     65535: Ok   (nr samples: 2158)
  freq 0, sample_freq     65552: Ok   (nr samples: 2296)
  freq 0, sample_freq   8388607: Ok   (nr samples: 243)
  freq 0, sample_freq 268435455: Ok   (nr samples: 6)
  freq 1, sample_freq         0: Ok
  freq 1, sample_freq         1: Ok   (nr samples: 4)
  freq 1, sample_freq        15: Ok   (nr samples: 4)
  freq 1, sample_freq        16: Ok   (nr samples: 4)
  freq 1, sample_freq        17: Ok   (nr samples: 4)
  freq 1, sample_freq       143: Ok   (nr samples: 4)
  freq 1, sample_freq       144: Ok   (nr samples: 5)
  freq 1, sample_freq       145: Ok   (nr samples: 4)
  freq 1, sample_freq      1234: Ok   (nr samples: 6)
  freq 1, sample_freq      4103: Ok   (nr samples: 27)
  freq 1, sample_freq     65520: Ok   (nr samples: 542)
  freq 1, sample_freq     65535: Ok   (nr samples: 550)
  freq 1, sample_freq     65552: Ok   (nr samples: 552)
  freq 1, sample_freq   8388607: Ok

  IBS ioctl() tests:
  ------------------
  Fetch PMU tests
  ioctl(period = 0x0      ): Ok
  ioctl(period = 0x1      ): Ok
  ioctl(period = 0xf      ): Ok
  ioctl(period = 0x10     ): Ok
  ioctl(period = 0x11     ): Ok
  ioctl(period = 0x1f     ): Ok
  ioctl(period = 0x20     ): Ok
  ioctl(period = 0x80     ): Ok
  ioctl(period = 0x8f     ): Ok
  ioctl(period = 0x90     ): Ok
  ioctl(period = 0x91     ): Ok
  ioctl(period = 0x100    ): Ok
  ioctl(period = 0xfff0   ): Ok
  ioctl(period = 0xffff   ): Ok
  ioctl(period = 0x10000  ): Ok
  ioctl(period = 0x1fff0  ): Ok
  ioctl(period = 0x1fff5  ): Ok
  ioctl(freq   = 0x0      ): Ok
  ioctl(freq   = 0x1      ): Ok
  ioctl(freq   = 0xf      ): Ok
  ioctl(freq   = 0x10     ): Ok
  ioctl(freq   = 0x11     ): Ok
  ioctl(freq   = 0x1f     ): Ok
  ioctl(freq   = 0x20     ): Ok
  ioctl(freq   = 0x80     ): Ok
  ioctl(freq   = 0x8f     ): Ok
  ioctl(freq   = 0x90     ): Ok
  ioctl(freq   = 0x91     ): Ok
  ioctl(freq   = 0x100    ): Ok
  Op PMU tests
  ioctl(period = 0x0      ): Ok
  ioctl(period = 0x1      ): Ok
  ioctl(period = 0xf      ): Ok
  ioctl(period = 0x10     ): Ok
  ioctl(period = 0x11     ): Ok
  ioctl(period = 0x1f     ): Ok
  ioctl(period = 0x20     ): Ok
  ioctl(period = 0x80     ): Ok
  ioctl(period = 0x8f     ): Ok
  ioctl(period = 0x90     ): Ok
  ioctl(period = 0x91     ): Ok
  ioctl(period = 0x100    ): Ok
  ioctl(period = 0xfff0   ): Ok
  ioctl(period = 0xffff   ): Ok
  ioctl(period = 0x10000  ): Ok
  ioctl(period = 0x1fff0  ): Ok
  ioctl(period = 0x1fff5  ): Ok
  ioctl(freq   = 0x0      ): Ok
  ioctl(freq   = 0x1      ): Ok
  ioctl(freq   = 0xf      ): Ok
  ioctl(freq   = 0x10     ): Ok
  ioctl(freq   = 0x11     ): Ok
  ioctl(freq   = 0x1f     ): Ok
  ioctl(freq   = 0x20     ): Ok
  ioctl(freq   = 0x80     ): Ok
  ioctl(freq   = 0x8f     ): Ok
  ioctl(freq   = 0x90     ): Ok
  ioctl(freq   = 0x91     ): Ok
  ioctl(freq   = 0x100    ): Ok

  IBS freq (negative) tests:
  --------------------------
  freq 1, sample_freq 200000: Ok

  IBS L3MissOnly test: (takes a while)
  --------------------
  Fetch L3MissOnly: Ok   (nr_samples: 1301)
  Op L3MissOnly:    Ok   (nr_samples: 1590)
  ---- end(0) ----
  112: AMD IBS sample period                                           : Ok

Committer notes:

Avoid using PAGE_SIZE as that define is also in sys/user.h

Make it a variable not to call sysconf() multiple times.

Also cast func to void * when passing it as the first arg to memcpy to
avoid this with some versions of clang:

  arch/x86/tests/amd-ibs-period.c:81:3: error: no matching function for call to 'memcpy'
                  memcpy(func, insn1, sizeof(insn1));
                  ^~~~~~
  /usr/include/string.h:27:7: note: candidate function not viable: no known conversion from 'int (*)(void)' to 'void *' for 1st argument
  void *memcpy (void *__restrict, const void *__restrict, size_t);
        ^
  /usr/include/fortify/string.h:40:27: note: candidate function not viable: no known conversion from 'int (*)(void)' to 'void *const' for 1st argument
  _FORTIFY_FN(memcpy) void *memcpy(void * _FORTIFY_POS0 __od,
                            ^
  arch/x86/tests/amd-ibs-period.c:87:3: error: no matching function for call to 'memcpy'

This one, for instance:

  Alpine clang version 19.1.4
  Target: x86_64-alpine-linux-musl
  Thread model: posix
  InstalledDir: /usr/lib/llvm19/bin
  Configuration file: /etc/clang19/x86_64-alpine-linux-musl.cfg
  System configuration file directory: /etc/clang19

Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Mario <jmario@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20250429035938.1301-5-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-29 22:30:46 -03:00
Ravi Bangoria
fa1332a801 perf mem/c2c amd: Add ldlat support
'perf mem/c2c' uses IBS Op PMU on AMD platforms.

IBS Op PMU on Zen5 uarch has added support for Load Latency filtering.

Implement 'perf mem/c2c' --ldlat using IBS Op Load Latency filtering
capability.

Some subtle differences between AMD and other arch:

o --ldlat is disabled by default on AMD

o Supported values are 128 to 2048.

Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Mario <jmario@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20250429035938.1301-4-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-29 22:30:46 -03:00
Ravi Bangoria
fc481adc97 perf amd ibs: Incorporate Zen5 DTLB and PageSize information
IBS Op PMU on Zen5 reports DTLB and page size information differently
compared to prior generation.

  IBS_OP_DATA3     Zen3/4                 Zen5
  ----------------------------------------------------------------
  19               IbsDcL2TlbHit1G        Reserved
  ----------------------------------------------------------------
   6               IbsDcL2tlbHit2M        Reserved
  ----------------------------------------------------------------
   5               IbsDcL1TlbHit1G        PageSize:
   4               IbsDcL1TlbHit2M          0 - 4K
                                            1 - 2M
                                            2 - 1G
                                            3 - Reserved
                                          Valid only if
                                            IbsDcPhyAddrValid = 1
  ----------------------------------------------------------------
   3               IbsDcL2TlbMiss         IbsDcL2TlbMiss
                                          Valid only if
                                            IbsDcPhyAddrValid = 1
  ----------------------------------------------------------------
   2               IbsDcL1tlbMiss         IbsDcL1tlbMiss
                                          Valid only if
                                            IbsDcPhyAddrValid = 1
  ----------------------------------------------------------------

Kernel expose this change as "dtlb_pgsize" capability in PMU sysfs.

Change IBS register raw-dump logic according to new bit definitions.

Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Mario <jmario@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20250429035938.1301-3-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-29 22:30:46 -03:00
Ravi Bangoria
eeefc13c71 perf amd ibs: Add Load Latency bits in raw dump
IBS OP PMU on Zen5 supports Load Latency filtering. Decode and dump Load
Latency filtering related bits into perf script raw dump.

Also add oneliner example in the perf-amd-ibs man page.

Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Mario <jmario@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20250429035938.1301-2-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-29 22:30:46 -03:00
Arnaldo Carvalho de Melo
4d728bb93b perf symbols: Handle 'u' and 'l' symbols in /proc/kallsyms
I started seeing this in recent Fedora 42 kernels:

  # uname -a
  Linux number 6.14.3-300.fc42.x86_64 #1 SMP PREEMPT_DYNAMIC Sun Apr 20 16:08:39 UTC 2025 x86_64 GNU/Linux
  #

  # perf test vmlinux
    1: vmlinux symtab matches kallsyms                                 : FAILED!
  #

Where we have Rust enabled:

  # grep CONFIG_RUST /boot/config-6.14.3-300.fc42.x86_64
  CONFIG_RUSTC_VERSION=108600
  CONFIG_RUST_IS_AVAILABLE=y
  CONFIG_RUSTC_LLVM_VERSION=200101
  CONFIG_RUSTC_HAS_COERCE_POINTEE=y
  CONFIG_RUST=y
  CONFIG_RUSTC_VERSION_TEXT="rustc 1.86.0 (05f9846f8 2025-03-31) (Fedora 1.86.0-1.fc42)"
  CONFIG_RUST_FW_LOADER_ABSTRACTIONS=y
  CONFIG_RUST_PHYLIB_ABSTRACTIONS=y
  # CONFIG_RUST_DEBUG_ASSERTIONS is not set
  CONFIG_RUST_OVERFLOW_CHECKS=y
  # CONFIG_RUST_BUILD_ASSERT_ALLOW is not set
  #

Looking at the reason for the failure:

  # perf test -v vmlinux |& grep ^ERR
  ERR : 0xffffffff99efc7d0: __pfx__RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_ not on kallsyms
  ERR : 0xffffffff99efc7e0: _RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_ not on kallsyms
  #

But:

  # grep -w u /proc/kallsyms
  ffffffff99efc7d0 u __pfx__RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_
  ffffffff99efc7e0 u _RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_
  #

The test checks that "vmlinux symtab matches kallsyms", so it finds those two
symbols in vmlinux:

  # pahole --running_kernel_vmlinux
  /usr/lib/debug/lib/modules/6.14.3-300.fc42.x86_64/vmlinux
  #

  # readelf -sW /usr/lib/debug/lib/modules/6.14.3-300.fc42.x86_64/vmlinux | grep -Ew '(__pfx__RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_|_RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_)'
 81844: ffffffff81efc7e0   524 FUNC    LOCAL  DEFAULT    1 _RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_
144259: ffffffff81efc7d0    16 FUNC    LOCAL  DEFAULT    1 __pfx__RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_
  #

It is there.

From the nm documentation we can see that:

           "U" The symbol is undefined.

           "u" The symbol is a unique global symbol.  This is a GNU extension to the
	       standard set of ELF symbol bindings.  For such a symbol the dynamic
	       linker will make sure that in the entire process there is just one
	       symbol with this name and type in use.

So lets consider 'u' symbols in /proc/kallsyms when loading it to cover this case.

Fedora:40 shows this as a 'l' symbol, so consider that as well.

With this patch 'perf test 1' is happy again:

  # perf test vmlinux
    1: vmlinux symtab matches kallsyms                                 : Ok
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/aBE_n0PGl3g6h-cS@x1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-29 22:30:44 -03:00
James Clark
8988c4b919 perf tools: Fix in-source libperf build
When libperf is built alone in-source, $(OUTPUT) isn't set. This causes
the generated uapi path to resolve to '/../arch' which results in a
permissions error:

  mkdir: cannot create directory '/../arch': Permission denied

Fix it by removing the preceding '/..' which means that it gets
generated either in the tools/lib/perf part of the tree or the OUTPUT
folder. Some other rules that rely on OUTPUT further refine this
conditionally depending on whether it's an in-source or out-of-source
build, but I don't think we need the extra complexity here. And this
rule is slightly different to others because the header is needed by
both libperf and Perf. This is further complicated by the fact that Perf
always passes O=... to libperf even for in source builds, meaning that
OUTPUT isn't set consistently between projects.

Because we're no longer going one level up to try to generate the file
in the tools/ folder, Perf's include rule needs to descend into libperf.
Also fix the clean rule while we're here.

Reported-by: Thorsten Leemhuis <linux@leemhuis.info>
Closes: https://lore.kernel.org/linux-perf-users/7703f88e-ccb7-4c98-9da4-8aad224e780f@leemhuis.info/
Fixes: bfb713ea53 ("perf tools: Fix arm64 build by generating unistd_64.h")
Signed-off-by: James Clark <james.clark@linaro.org>
Tested-by: Thorsten Leemhuis <linux@leemhuis.info>
Link: https://lore.kernel.org/r/20250429-james-perf-fix-libperf-in-source-build-v1-1-a1a827ac15e5@linaro.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-04-29 12:32:31 -07:00
Jakub Brnak
92b664dcef perf test probe_vfs_getname: Skip if no suitable line detected
In some cases when calling function add_probe_vfs_getname, line number
can't be detected by 'perf probe -L getname_flags':

  78         atomic_set(&result->refcnt, 1);

	     // one of the following lines should have line number
	     // but sometimes it does not because of optimization
	     result->uptr = filename;
             result->aname = NULL;

  81         audit_getname(result);

To prevent false failures, skip the affected tests if no suitable line
numbers can be detected.

Signed-off-by: Jakub Brnak <jbrnak@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tomas Glozar <tglozar@redhat.com>
Link: https://lore.kernel.org/r/20250324144523.597557-1-jbrnak@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-29 12:28:12 -03:00
Namhyung Kim
13f35928a4 perf lock contention: Symbolize zone->lock using BTF
The struct zone is embedded in struct pglist_data which can be allocated
for each NUMA node early in the boot process.  As it's not a slab object
nor a global lock, this was not symbolized.

Since the zone->lock is often contended, it'd be nice if we can
symbolize it.  On NUMA systems, node_data array will have pointers for
struct pglist_data.  By following the pointer, it can calculate the
address of each zone and its lock using BTF.  On UMA, it can just use
contig_page_data and its zones.

The following example shows the zone lock contention at the end.

  $ sudo ./perf lock con -abl -E 5 -- ./perf bench sched messaging
  # Running 'sched/messaging' benchmark:
  # 20 sender and receiver processes per group
  # 10 groups == 400 processes run

       Total time: 0.038 [sec]
   contended   total wait     max wait     avg wait            address   symbol

        5167     18.17 ms     10.27 us      3.52 us   ffff953340052d00   &kmem_cache_node (spinlock)
          38     11.75 ms    465.49 us    309.13 us   ffff95334060c480   &sock_inode_cache (spinlock)
        3916     10.13 ms     10.43 us      2.59 us   ffff953342aecb40   &kmem_cache_node (spinlock)
        2963     10.02 ms     13.75 us      3.38 us   ffff9533d2344098   &kmalloc-rnd-08-2k (spinlock)
         216      5.05 ms     99.49 us     23.39 us   ffff9542bf7d65d0   zone_lock (spinlock)

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: bpf@vger.kernel.org
Cc: linux-mm@kvack.org
Link: https://lore.kernel.org/r/20250401063055.7431-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-29 12:23:53 -03:00
Namhyung Kim
2d099ccaad perf test: Add perf trace summary test
$ sudo ./perf test -vv 'trace summary'
  109: perf trace summary:
  --- start ---
  test child forked, pid 3501572
  testing: perf trace -s -- true
  testing: perf trace -S -- true
  testing: perf trace -s --summary-mode=thread -- true
  testing: perf trace -S --summary-mode=total -- true
  testing: perf trace -as --summary-mode=thread --no-bpf-summary -- true
  testing: perf trace -as --summary-mode=total --no-bpf-summary -- true
  testing: perf trace -as --summary-mode=thread --bpf-summary -- true
  testing: perf trace -as --summary-mode=total --bpf-summary -- true
  testing: perf trace -aS --summary-mode=total --bpf-summary -- true
  ---- end(0) ----
  109: perf trace summary                                              : Ok

Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20250326044001.3503432-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-28 16:53:13 -03:00
Namhyung Kim
1bec43f523 perf trace: Implement syscall summary in BPF
When -s/--summary option is used, it doesn't need (augmented) arguments
of syscalls.  Let's skip the augmentation and load another small BPF
program to collect the statistics in the kernel instead of copying the
data to the ring-buffer to calculate the stats in userspace.  This will
be much more light-weight than the existing approach and remove any lost
events.

Let's add a new option --bpf-summary to control this behavior.  I cannot
make it default because there's no way to get e_machine in the BPF which
is needed for detecting different ABIs like 32-bit compat mode.

No functional changes intended except for no more LOST events. :)

  $ sudo ./perf trace -as --summary-mode=total --bpf-summary sleep 1

   Summary of events:

   total, 6194 events

     syscall            calls  errors  total       min       avg       max       stddev
                                       (msec)    (msec)    (msec)    (msec)        (%)
     --------------- --------  ------ -------- --------- --------- ---------     ------
     epoll_wait           561      0  4530.843     0.000     8.076   520.941     18.75%
     futex                693     45  4317.231     0.000     6.230   500.077     21.98%
     poll                 300      0  1040.109     0.000     3.467   120.928     17.02%
     clock_nanosleep        1      0  1000.172  1000.172  1000.172  1000.172      0.00%
     ppoll                360      0   872.386     0.001     2.423   253.275     41.91%
     epoll_pwait           14      0   384.349     0.001    27.453   380.002     98.79%
     pselect6              14      0   108.130     7.198     7.724     8.206      0.85%
     nanosleep             39      0    43.378     0.069     1.112    10.084     44.23%
     ...

Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20250326044001.3503432-1-namhyung@kernel.org
[ Added fixup sent from Namhyung in response to my report to make it also dependent on CONFIG_TRACE ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-28 16:53:11 -03:00
Junhao He
c756441c35 perf vendor events arm64: Drop hip08 PublicDescription if same as BriefDescription
If BriefDescription and PublicDescription are the same, only
BriefDescription is needed. It will be used for both long and short
format outputs.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Junhao He <hejunhao3@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250418070812.3771441-3-hejunhao3@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:32:19 -03:00
Junhao He
43fff3e948 perf vendor events arm64: Fill up Desc field for Hisi hip08 hha pmu
In the same PMU, when some JSON events have the "BriefDescription" field
populated while others do not, the cmp_sevent() function will split these
two types of events into separate groups. As a result, when using perf
list to display events, the two types of events cannot be grouped together
in the output.

before patch:
 $ perf list pmu
 ...
 uncore hha:
   hisi_sccl1_hha2/sdir-hit/
   hisi_sccl1_hha2/sdir-lookup/
 ...
 uncore hha:
   edir-hit
      [Count of The number of HHA E-Dir hit operations. Unit: hisi_sccl1_hha2]
 ...

after patch:
 $ perf list pmu
 ...
 uncore hha:
   edir-hit
      [Count of The number of HHA E-Dir hit operations. Unit: hisi_sccl1_hha2]
   sdir-hit
      [Count of The number of HHA S-Dir hit operations. Unit: hisi_sccl1_hha2]
   sdir-lookup
      [Count of the number of HHA S-Dir lookup operations. Unit: hisi_sccl1_hha2]
 ...

Reviewed-by: James Clark <james.clark@linaro.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Junhao He <hejunhao3@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250418070812.3771441-2-hejunhao3@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:32:16 -03:00
Ian Rogers
022d270bb6 perf bench evlist-open-close: Reduce scope of 2 variables
Make 2 global variables local. Reduces ELF binary size by removing
relocations. For a no flags build, the perf binary size is reduced by
4,144 bytes on x86-64.

Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Hao Ge <gehao@kylinos.cn>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Levi Yun <yeoreum.yun@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tengda Wu <wutengda@huaweicloud.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Veronika Molnarova <vmolnaro@redhat.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Link: https://lore.kernel.org/r/20250410173631.1713627-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:32:13 -03:00
Ian Rogers
be8aefad33 perf tests record: Cleanup improvements
Remove the script output file. Add a trap debug message. Minor style
consistency changes.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250410173631.1713627-2-irogers@google.com
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Levi Yun <yeoreum.yun@arm.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Tengda Wu <wutengda@huaweicloud.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Hao Ge <gehao@kylinos.cn>
Cc: James Clark <james.clark@linaro.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Veronika Molnarova <vmolnaro@redhat.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: bpf@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-perf-users@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:32:10 -03:00
Thomas Richter
ccd4b5cdf0 perf tests metric-only perf stat: Fix tests 84 and 86 s390
On s390x KVM and z/VM machines the CPU Measurement Facility is not
available. Events cycles and instructions do not exist.  Running above
tests on s390 KVM and z/VM guests always fail with this error:

  # ./perf test 84 86
  84: perf stat JSON output linter          : FAILED!
  86: perf stat STD output linter           : FAILED!
  #

Root cause is command:

  # perf stat -j --metric-only -e instructions,cycles -- true
  {"metric-value" : "none"}
  #

Which fails due to unsupported events and returns "none".
Do not execute this test case on s390 KVM and z/VM machines.

Output after:
  # ./perf test 84 86
  84: perf stat JSON output linter          : Ok
  86: perf stat STD output linter           : Ok
  #

Fixes: 45a86d017a ("perf test: Add --metric-only to perf stat output tests")
Suggested-by: Heiko Carstens <hca@linux.ibm.com>
Suggested-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20250424133310.37452-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:32:07 -03:00
Ian Rogers
68cb156743 perf tool_pmu: Fix aggregation on duration_time
evsel__count_has_error() fails counters when the enabled or running time
are 0. The duration_time event reads 0 when the cpu_map_idx != 0 to
avoid aggregating time over CPUs. Change the enable and running time
to always have a ratio of 100% so that evsel__count_has_error won't
fail.

Before:
```
$ sudo /tmp/perf/perf stat --per-core -a -M UNCORE_FREQ sleep 1

 Performance counter stats for 'system wide':

S0-D0-C0              1      2,615,819,485      UNC_CLOCK.SOCKET                 #     2.61 UNCORE_FREQ
S0-D0-C0              2      <not counted>      duration_time

       1.002111784 seconds time elapsed
```

After:
```
$ perf stat --per-core -a -M UNCORE_FREQ sleep 1

 Performance counter stats for 'system wide':

S0-D0-C0              1        758,160,296      UNC_CLOCK.SOCKET                 #     0.76 UNCORE_FREQ
S0-D0-C0              2      1,003,438,246      duration_time

       1.002486017 seconds time elapsed
```

Note: the metric reads the value a different way and isn't impacted.

Fixes: 240505b2d0 ("perf tool_pmu: Factor tool events into their own PMU")
Reported-by: Stephane Eranian <eranian@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20250423050358.94310-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:32:04 -03:00
Chun-Tse Shao
b1b26ce8bb perf session: Skip unsupported new event types
`perf report` currently halts with an error when encountering
unsupported new event types (`event.type >= PERF_RECORD_HEADER_MAX`).

This patch modifies the behavior to skip these samples and continue
processing the remaining events.

Additionally, stops reporting if the new event size is not 8-byte
aligned.

Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Suggested-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Chun-Tse Shao <ctshao@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ben Gainey <ben.gainey@arm.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250414173921.2905822-1-ctshao@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:32:01 -03:00
Namhyung Kim
0ef8091f17 perf hist: Allow custom output fields in hierarchy mode
Now it can handle multiple output fields and sort keys in separate
levels, so it should be ok to use it in the hierarchy mode.  This
allows fully customized output format.

  $ perf report -F latency,comm,parallelism -H --stdio
  ...
  #     Latency  Command / Parallelism
  # ...........  .....................
  #
      31.84%     cc1
         29.96%     5
          1.24%     4
          0.37%     6
          0.26%     3
          0.02%     2
      24.68%     as
         22.39%     5
          1.12%     2
          0.98%     4
          0.12%     3
          0.07%     6
          ...

Committer testing:

Before:

  $ perf report -F latency,comm,parallelism -H --stdio
  Error: --hierarchy and --fields options cannot be used together

   Usage: perf report [<options>]

      -F, --fields <key[,keys...]>
                            output field(s): overhead latency overhead_sys overhead_us
  			  overhead_guest_sys overhead_guest_us overhead_children
  			  latency_children sample period weight1 weight2 weight3
  <SNIP>
      -H, --hierarchy       Show entries in a hierarchy
  $

After:

  $ perf report -F latency,comm,parallelism -H --stdio
  # Total Lost Samples: 0
  #
  # Samples: 1K of event 'cycles:Pu'
  # Event count (approx.): 1581450138
  #
  #     Latency  Command / Parallelism
  # ...........  .....................
  #
      97.66%     git
         96.95%     1
          0.55%     2
          0.04%     5
          0.03%     8
          0.03%     4
          0.02%     3
          0.01%     9
          0.01%     7
          0.01%     6
          0.01%     10
          0.00%     12
       2.34%     git-remote-http
          2.24%     1
          0.07%     5
          0.02%     2
          0.00%     4

  #
  # (Tip: To analyze particular parallelism levels, try: perf report --latency --parallelism=32-64)

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250331073722.4695-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:31:58 -03:00
Namhyung Kim
390627dda7 perf hist: Set levels in output_field_add()
It turns out that the output fields didn't consider the hierarchy mode
and put all the fields in the same level.  To support hierarchy, each
non-output field should be in a separate level.

Pass a pointer to level to output_field_add() and make it increase the
level when it sees non-output fields.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250331073722.4695-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:31:54 -03:00
Namhyung Kim
b09124e2e1 perf hist: Remove formats in hierarchy when cancel latency
Likewise, it should remove latency output fields in hierarchy list.
Pass evlist to perf_hpp__cancel_latency() to handle them properly.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250331073722.4695-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:31:51 -03:00
Namhyung Kim
dbd11b6bda perf hist: Remove formats in hierarchy when cancel children
This is to support hierarchy options with custom output fields.
Currently perf_hpp__cancel_cumulate() only removes accumulated
overhead and latency fields from the global perf_hpp_list.

This is not used in the hierarchy mode because each evsel's hist
has its own separate hpp_list.  So it needs to remove the fields
from the lists too.  Pass evlist to the function so that it can
iterate the evsels.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250331073722.4695-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:31:47 -03:00
Ian Rogers
92504d927d perf record: Retirement latency cleanup in evsel__config
'perf record' will fail with retirement latency events as the open
doesn't do a perf_event_open system call.

Use evsel__config() to set up such events for recording by removing the
flag and enabling sample weights - the sample weights containing the
retirement latency.

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Weilin Wang <weilin.wang@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250414174134.3095492-17-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:31:44 -03:00
Ian Rogers
fc807b6bde perf pmu-events: Add retirement latency to JSON events inside of perf
The updated Intel vendor events add retirement latency for
graniterapids:

  https://lore.kernel.org/lkml/20250322063403.364981-14-irogers@google.com/

This change makes those values available within an alias/event within a
PMU and saves them into the evsel at event parse time.

When no TPEBS data is available the default values are substituted in
for TMA metrics that are using retirement latency events - currently
just those on graniterapids.

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Weilin Wang <weilin.wang@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250414174134.3095492-16-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:31:39 -03:00
Ian Rogers
f19306f065 perf stat: Add mean, min, max and last --tpebs-mode options
Add command line configuration option for how retirement latency
events are combined.

The default "mean" gives the average of retirement latency.

"min" or "max" give the smallest or largest retirment latency times
respectively.

"last" uses the last retirment latency sample's time.

Committer notes:

Enclose parse_tpebs_mode() under HAVE_ARCH_X86_64_SUPPORT to match the
ifdef block where it is used, fixing the build in systems like:

  20     5.60 debian:experimental-x-mips    : FAIL gcc version 14.2.0 (Debian 14.2.0-1)
    builtin-stat.c:2330:12: error: 'parse_tpebs_mode' defined but not used [-Werror=unused-function]
     2330 | static int parse_tpebs_mode(const struct option *opt, const char *str,
          |            ^~~~~~~~~~~~~~~~

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Weilin Wang <weilin.wang@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250414174134.3095492-15-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:31:36 -03:00
Ian Rogers
3533b56d22 perf intel-tpebs: Use stats for retirement latency statistics
struct stats provides access to mean, min and max.

It also provides uniformity with statistics code used elsewhere in perf.

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Weilin Wang <weilin.wang@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250414174134.3095492-14-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:31:32 -03:00