18433 Commits

Author SHA1 Message Date
Chuck Lever
35b16a7a2c perf synthetic-events: Fix stale build ID in module MMAP2 records
perf_event__synthesize_modules() allocates a single union perf_event and
reuses it across every kernel module callback.

After the first module is processed, perf_record_mmap2__read_build_id()
sets PERF_RECORD_MISC_MMAP_BUILD_ID in header.misc and writes that
module's build ID into the event.

On subsequent iterations the callback overwrites start, len, pid, and
filename for the next module but never clears the stale build ID fields
or the MMAP_BUILD_ID flag.

When perf_record_mmap2__read_build_id() runs for the second module it
sees the flag, reads the stale build ID into a dso_id, and
__dso__improve_id() permanently poisons the DSO with the wrong build ID.

Every module after the first therefore receives the first module's build
ID in its MMAP2 record.

On a system with the sunrpc and nfsd modules loaded, this causes perf
script and perf report to show [unknown] for all module symbols.

The latent bug has existed since commit d9f2ecbc5e ("perf dso:
Move build_id to dso_id") introduced the PERF_RECORD_MISC_MMAP_BUILD_ID
check in perf_record_mmap2__read_build_id().

Commit 53b00ff358 ("perf record: Make --buildid-mmap the default")
then exposed it to all users by making the MMAP2-with-build-ID path the
default.  Both commits were merged in the same series.

Clear the MMAP_BUILD_ID flag and zero the build_id union before each
call to perf_record_mmap2__read_build_id() so that every module starts
with a clean slate.

Fixes: d9f2ecbc5e ("perf dso: Move build_id to dso_id")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-03-11 17:47:42 -03:00
Ian Rogers
c7c92f76f9 perf annotate loongarch: Fix off-by-one bug in outside check
A copy-paste of a similar issue fixed by Peter Collingbourne in:
https://lore.kernel.org/linux-perf-users/20260304190613.2507582-1-pcc@google.com/

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Peter Collingbourne <pcc@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-03-10 12:06:57 -03:00
Chen Ni
be34705aa5 perf ftrace: Fix hashmap__new() error checking
The hashmap__new() function never returns NULL, it returns error
pointers. Fix the error checking to match.

Additionally, set ftrace->profile_hash to NULL on error, and return the
exact error code from hashmap__new().

Fixes: 0f223813ed ("perf ftrace: Add 'profile' command")
Suggested-by: Ian Rogers <irogers@google.com>
Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-03-10 11:53:27 -03:00
Chen Ni
bf29cb3641 perf annotate: Fix hashmap__new() error checking
The hashmap__new() function never returns NULL, it returns error
pointers. Fix the error checking to match.

Additionally, set src->samples to NULL to prevent any later code from
accidentally using the error pointer.

Fixes: d3e7cad6f3 ("perf annotate: Add a hashmap for symbol histogram")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tianyou Li <tianyou.li@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-03-10 10:19:44 -03:00
James Clark
3c3b41e591 perf cs-etm: Finish removal of ETM_OPT_*
These #defines have been removed from the kernel headers in favour of
the string based PMU format attributes. Usages were previously removed
from the recording side of cs-etm in Perf. Finish the removal by
removing usages from the decode side too.

It's a straight replacement of the old #defines with the new register
bit definitions. Except cs_etm__setup_timeless_decoding() which wasn't
looking at the saved metadata and was instead hard coding an access to
'attr.config'. This was vulnerable to the same issue of .config being
moved to .config2 etc that the original removal of ETM_OPT_* tried to
fix. So fix that too.

Signed-off-by: James Clark <james.clark@linaro.org>
Tested-by: Leo Yan <leo.yan@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-03-10 09:50:44 -03:00
Arnaldo Carvalho de Melo
c9d77f0a0c tools headers: Update the syscall tables and unistd.h, to support the new 'rseq_slice_yield' syscall
Picking up the changes from these csets:

  2153b2e891 ("sparc: Add architecture support for clone3")
  99d2592023 ("rseq: Implement sys_rseq_slice_yield()")
  4ac286c4a8 ("s390/syscalls: Switch to generic system call table generation")

This makes 'perf trace' support it, now its possible, for instance, to
do:

  # perf trace -e rseq_slice_yield --max-stack=16

Here is an example with the 'sendmmsg' syscall:

  root@x1:~# perf trace -e sendmmsg --max-stack 16 --max-events=1
       0.000 ( 0.062 ms): dbus-broker/1012 sendmmsg(fd: 150, mmsg: 0x7ffef57cca50, vlen: 1, flags: DONTWAIT|NOSIGNAL) = 1
                                         syscall_exit_to_user_mode_prepare ([kernel.kallsyms])
                                         syscall_exit_to_user_mode_prepare ([kernel.kallsyms])
                                         syscall_exit_to_user_mode ([kernel.kallsyms])
                                         do_syscall_64 ([kernel.kallsyms])
                                         entry_SYSCALL_64 ([kernel.kallsyms])
                                         [0x117ce7] (/usr/lib64/libc.so.6 (deleted))
  root@x1:~#

To do a system wide tracing of the new 'rseq_slice_yield' syscall with a
backtrace of at most 16 entries.

This addresses these perf tools build warnings:

  Warning: Kernel ABI header differences:
    diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
    diff -u tools/scripts/syscall.tbl scripts/syscall.tbl
    diff -u tools/perf/arch/x86/entry/syscalls/syscall_32.tbl arch/x86/entry/syscalls/syscall_32.tbl
    diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
    diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl
    diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl
    diff -u tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl arch/mips/kernel/syscalls/syscall_n64.tbl
    diff -u tools/perf/arch/arm/entry/syscalls/syscall.tbl arch/arm/tools/syscall.tbl
    diff -u tools/perf/arch/sh/entry/syscalls/syscall.tbl arch/sh/kernel/syscalls/syscall.tbl
    diff -u tools/perf/arch/sparc/entry/syscalls/syscall.tbl arch/sparc/kernel/syscalls/syscall.tbl
    diff -u tools/perf/arch/xtensa/entry/syscalls/syscall.tbl arch/xtensa/kernel/syscalls/syscall.tbl

Cc: Andreas Larsson <andreas@gaisler.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ludwig Rydberg <ludwig.rydberg@gaisler.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-03-05 17:20:23 -03:00
Peter Collingbourne
b3ce769203 perf disasm: Fix off-by-one bug in outside check
If a branch target points to one past the end of a function, the branch
should be treated as a branch to another function.

This can happen e.g. with a tail call to a function that is laid out
immediately after the caller.

Fixes: 751b1783da ("perf annotate: Mark jumps to outher functions with the call arrow")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Peter Collingbourne <pcc@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://linux-review.googlesource.com/id/Ide471112e82d68177e0faf08ca411d9fcf0a7bdf
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-03-05 16:51:09 -03:00
Arnaldo Carvalho de Melo
3abbb7cae8 perf beauty: Update the arch/x86/include/asm/irq_vectors.h copy with the kernel sources
To pick up the change in:

  a1fab3e69d ("x86/irq: Fix comment on IRQ vector layout")

That just adds one comment, so no changes in perf tooling, just silences
this build warning:

  diff -u tools/perf/trace/beauty/arch/x86/include/asm/irq_vectors.h arch/x86/include/asm/irq_vectors.h

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-03-04 11:49:24 -03:00
Arnaldo Carvalho de Melo
e367679f16 perf beauty: Sync UAPI linux/fs.h with kernel sources
To pick up changes from:

  0e6b7eae1f ("fs: add FS_XFLAG_VERITY for fs-verity files")

These are used to beautify fs syscall arguments, albeit the changes in
this update are not affecting those beautifiers.

This addresses these tools/perf build warnings:

  Warning: Kernel ABI header differences:
    diff -u tools/perf/trace/beauty/include/uapi/linux/fs.h include/uapi/linux/fs.h

Please see tools/include/uapi/README.

Cc: Andrey Albershteyn <aalbersh@kernel.org>
Cc: Christian Brauner <brauner@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-03-04 11:46:01 -03:00
Arnaldo Carvalho de Melo
6036165ab1 perf beauty: Sync linux/mount.h copy with the kernel sources
To pick the changes from:

  9b8a0ba682 ("mount: add OPEN_TREE_NAMESPACE")
  0e5032237e ("statmount: accept fd as a parameter")

That doesn't change anything in tools this time as nothing that is
harvested by the beauty scripts got changed:

  $ ls -1 tools/perf/trace/beauty/*mount*sh
  tools/perf/trace/beauty/fsmount.sh
  tools/perf/trace/beauty/mount_flags.sh
  tools/perf/trace/beauty/move_mount_flags.sh
  $

This addresses this perf build warning.

  Warning: Kernel ABI header differences:
    diff -u tools/include/uapi/linux/mount.h include/uapi/linux/mount.h

Please see tools/include/uapi/README for further details.

Cc: Christian Brauner <brauner@kernel.org>
Cc: Bhavik Sachdev <b.sachdev1904@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-03-04 11:41:12 -03:00
Dmitrii Dolgov
30f998c992 tools build: Fix rust cross compilation
Currently no target is specified to compile rust code when needed, which
breaks cross compilation. E.g. for arm64:

      LD      /tmp/build/tests/workloads/perf-test-in.o
    aarch64-linux-gnu-ld: /tmp/build/tests/workloads/code_with_type.a(code_with_type.code_with_type.d12f4324cb53c560-cgu.0.rcgu.o): Relocations in generic ELF (EM: 62)
    aarch64-linux-gnu-ld: /tmp/build/tests/workloads/code_with_type.a(code_with_type.code_with_type.d12f4324cb53c560-cgu.0.rcgu.o): Relocations in generic ELF (EM: 62)
    [...repeated...]
    aarch64-linux-gnu-ld: /tmp/build/tests/workloads/code_with_type.a(code_with_type.code_with_type.d12f4324cb53c560-cgu.0.rcgu.o): Relocations in generic ELF (EM: 62)
    aarch64-linux-gnu-ld: /tmp/build/tests/workloads/code_with_type.a(code_with_type.code_with_type.d12f4324cb53c560-cgu.0.rcgu.o): Relocations in generic ELF (EM: 62)
    aarch64-linux-gnu-ld: /tmp/build/tests/workloads/code_with_type.a: error adding symbols: file in wrong format
    make[5]: *** [/perf/tools/build/Makefile.build:162: /tmp/build/tests/workloads/perf-test-in.o] Error 1
    make[4]: *** [/perf/tools/build/Makefile.build:156: workloads] Error 2
    make[3]: *** [/perf/tools/build/Makefile.build:156: tests] Error 2
    make[2]: *** [Makefile.perf:785: /tmp/build/perf-test-in.o] Error 2
    make[2]: *** Waiting for unfinished jobs....
    make[1]: *** [Makefile.perf:289: sub-make] Error 2
    make: *** [Makefile:76: all] Error 2

Detect required target and pass it via rust_flags to the compiler.

Note that CROSS_COMPILE might be different from what rust compiler
expects, since it may omit the target vendor value, e.g.
"aarch64-linux-gnu" instead of "aarch64-unknown-linux-gnu".

Thus explicitly map supported CROSS_COMPILE values to corresponding Rust
versions, as suggested by Miguel Ojeda.

Tested using arm64 cross-compilation example from [1].

Fixes: 2e05bb52a1 ("perf test workload: Add code_with_type test workload")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Dmitrii Dolgov <9erthalion6@gmail.com>
Cc: Levi Zim <i@kxxt.dev>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nicolas Schier <nsc@kernel.org>
Link: https://perfwiki.github.io/main/arm64-cross-compilation-dockerfile/ [1]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-03-04 11:37:30 -03:00
Markus Mayer
b6712d91f8 perf build: Prevent "argument list too long" error
Due to a recent change, building perf may result in a build error when
it is trying to "prune orphans".

The file list passed to "rm" may exceed what the shell can handle.

The build will then abort with an error like this:

    TEST    [...]/arm64/build/linux-custom/tools/perf/pmu-events/metric_test.log
  make[5]: /bin/sh: Argument list too long
  make[5]: *** [pmu-events/Build:217: prune_orphans] Error 127
  make[5]: *** Waiting for unfinished jobs....
  make[4]: *** [Makefile.perf:773: [...]/tools/perf/pmu-events/pmu-events-in.o] Error 2
  make[4]: *** Waiting for unfinished jobs....
  make[3]: *** [Makefile.perf:289: sub-make] Error 2

Processing the arguments via "xargs", instead of passing the list of
files directly to "rm" via the shell, prevents this issue.

Fixes: 36a1b0061a ("perf build: Reduce pmu-events related copying and mkdirs")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Markus Mayer <mmayer@broadcom.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-03-04 11:22:31 -03:00
Arnaldo Carvalho de Melo
cfdf6456c0 tools headers: Sync uapi/linux/prctl.h with the kernel source
To pick up the changes in these csets:

  5ca243f6e3 ("prctl: add arch-agnostic prctl()s for indirect branch tracking")
  28621ec2d4 ("rseq: Add prctl() to enable time slice extensions")

That don't introduced these new prctls:

  $ tools/perf/trace/beauty/prctl_option.sh > before.txt
  $ cp include/uapi/linux/prctl.h tools/perf/trace/beauty/include/uapi/linux/prctl.h
  $ tools/perf/trace/beauty/prctl_option.sh > after.txt
  $ diff -u before.txt after.txt
  --- before.txt	2026-02-27 09:07:16.435611457 -0300
  +++ after.txt	2026-02-27 09:07:28.189816531 -0300
  @@ -73,6 +73,10 @@
   	[76] = "LOCK_SHADOW_STACK_STATUS",
   	[77] = "TIMER_CREATE_RESTORE_IDS",
   	[78] = "FUTEX_HASH",
  +	[79] = "RSEQ_SLICE_EXTENSION",
  +	[80] = "GET_INDIR_BR_LP_STATUS",
  +	[81] = "SET_INDIR_BR_LP_STATUS",
  +	[82] = "LOCK_INDIR_BR_LP_STATUS",
   };
   static const char *prctl_set_mm_options[] = {
   	[1] = "START_CODE",
  $

That now will be used to decode the syscall option and also to compose
filters, for instance:

  [root@five ~]# perf trace -e syscalls:sys_enter_prctl --filter option==SET_NAME
       0.000 Isolated Servi/3474327 syscalls:sys_enter_prctl(option: SET_NAME, arg2: 0x7f23f13b7aee)
       0.032 DOM Worker/3474327 syscalls:sys_enter_prctl(option: SET_NAME, arg2: 0x7f23deb25670)
       7.920 :3474328/3474328 syscalls:sys_enter_prctl(option: SET_NAME, arg2: 0x7f23e24fbb10)
       7.935 StreamT~s #374/3474328 syscalls:sys_enter_prctl(option: SET_NAME, arg2: 0x7f23e24fb970)
       8.400 Isolated Servi/3474329 syscalls:sys_enter_prctl(option: SET_NAME, arg2: 0x7f23e24bab10)
       8.418 StreamT~s #374/3474329 syscalls:sys_enter_prctl(option: SET_NAME, arg2: 0x7f23e24ba970)
  ^C[root@five ~]#

This addresses these perf build warnings:

  Warning: Kernel ABI header differences:
    diff -u tools/perf/trace/beauty/include/uapi/linux/prctl.h include/uapi/linux/prctl.h

Please see tools/include/uapi/README for further details.

Cc: Deepak Gupta <debug@rivosinc.com>
Cc: Paul Walmsley <pjw@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-03-04 11:14:10 -03:00
Linus Torvalds
c7decec2f2 Merge tag 'perf-tools-for-v7.0-1-2026-02-21' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools updates from Arnaldo Carvalho de Melo:

 - Introduce 'perf sched stats' tool with record/report/diff workflows
   using schedstat counters

 - Add a faster libdw based addr2line implementation and allow selecting
   it or its alternatives via 'perf config addr2line.style='

 - Data-type profiling fixes and improvements including the ability to
   select fields using 'perf report''s -F/-fields, e.g.:

     'perf report --fields overhead,type'

 - Add 'perf test' regression tests for Data-type profiling with C and
   Rust workloads

 - Fix srcline printing with inlines in callchains, make sure this has
   coverage in 'perf test'

 - Fix printing of leaf IP in LBR callchains

 - Fix display of metrics without sufficient permission in 'perf stat'

 - Print all machines in 'perf kvm report -vvv', not just the host

 - Switch from SHA-1 to BLAKE2s for build ID generation, remove SHA-1
   code

 - Fix 'perf report's histogram entry collapsing with '-F' option

 - Use system's cacheline size instead of a hardcoded value in 'perf
   report'

 - Allow filtering conversion by time range in 'perf data'

 - Cover conversion to CTF using 'perf data' in 'perf test'

 - Address newer glibc const-correctness (-Werror=discarded-qualifiers)
   issues

 - Fixes and improvements for ARM's CoreSight support, simplify ARM SPE
   event config in 'perf mem', update docs for 'perf c2c' including the
   ARM events it can be used with

 - Build support for generating metrics from arch specific python
   script, add extra AMD, Intel, ARM64 metrics using it

 - Add AMD Zen 6 events and metrics

 - Add JSON file with OpenHW Risc-V CVA6 hardware counters

 - Add 'perf kvm' stats live testing

 - Add more 'perf stat' tests to 'perf test'

 - Fix segfault in `perf lock contention -b/--use-bpf`

 - Fix various 'perf test' cases for s390

 - Build system cleanups, bump minimum shellcheck version to 0.7.2

 - Support building the capstone based annotation routines as a plugin

 - Allow passing extra Clang flags via EXTRA_BPF_FLAGS

* tag 'perf-tools-for-v7.0-1-2026-02-21' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (255 commits)
  perf test script: Add python script testing support
  perf test script: Add perl script testing support
  perf script: Allow the generated script to be a path
  perf test: perf data --to-ctf testing
  perf test: Test pipe mode with data conversion --to-json
  perf json: Pipe mode --to-ctf support
  perf json: Pipe mode --to-json support
  perf check: Add libbabeltrace to the listed features
  perf build: Allow passing extra Clang flags via EXTRA_BPF_FLAGS
  perf test data_type_profiling.sh: Skip just the Rust tests if code_with_type workload is missing
  tools build: Fix feature test for rust compiler
  perf libunwind: Fix calls to thread__e_machine()
  perf stat: Add no-affinity flag
  perf evlist: Reduce affinity use and move into iterator, fix no affinity
  perf evlist: Missing TPEBS close in evlist__close()
  perf evlist: Special map propagation for tool events that read on 1 CPU
  perf stat-shadow: In prepare_metric fix guard on reading NULL perf_stat_evsel
  Revert "perf tool_pmu: More accurately set the cpus for tool events"
  tools build: Emit dependencies file for test-rust.bin
  tools build: Make test-rust.bin be removed by the 'clean' target
  ...
2026-02-21 10:51:08 -08:00
Linus Torvalds
cb5573868e Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Paolo Bonzini:
 "Loongarch:

   - Add more CPUCFG mask bits

   - Improve feature detection

   - Add lazy load support for FPU and binary translation (LBT) register
     state

   - Fix return value for memory reads from and writes to in-kernel
     devices

   - Add support for detecting preemption from within a guest

   - Add KVM steal time test case to tools/selftests

  ARM:

   - Add support for FEAT_IDST, allowing ID registers that are not
     implemented to be reported as a normal trap rather than as an UNDEF
     exception

   - Add sanitisation of the VTCR_EL2 register, fixing a number of
     UXN/PXN/XN bugs in the process

   - Full handling of RESx bits, instead of only RES0, and resulting in
     SCTLR_EL2 being added to the list of sanitised registers

   - More pKVM fixes for features that are not supposed to be exposed to
     guests

   - Make sure that MTE being disabled on the pKVM host doesn't give it
     the ability to attack the hypervisor

   - Allow pKVM's host stage-2 mappings to use the Force Write Back
     version of the memory attributes by using the "pass-through'
     encoding

   - Fix trapping of ICC_DIR_EL1 on GICv5 hosts emulating GICv3 for the
     guest

   - Preliminary work for guest GICv5 support

   - A bunch of debugfs fixes, removing pointless custom iterators
     stored in guest data structures

   - A small set of FPSIMD cleanups

   - Selftest fixes addressing the incorrect alignment of page
     allocation

   - Other assorted low-impact fixes and spelling fixes

  RISC-V:

   - Fixes for issues discoverd by KVM API fuzzing in
     kvm_riscv_aia_imsic_has_attr(), kvm_riscv_aia_imsic_rw_attr(), and
     kvm_riscv_vcpu_aia_imsic_update()

   - Allow Zalasr, Zilsd and Zclsd extensions for Guest/VM

   - Transparent huge page support for hypervisor page tables

   - Adjust the number of available guest irq files based on MMIO
     register sizes found in the device tree or the ACPI tables

   - Add RISC-V specific paging modes to KVM selftests

   - Detect paging mode at runtime for selftests

  s390:

   - Performance improvement for vSIE (aka nested virtualization)

   - Completely new memory management. s390 was a special snowflake that
     enlisted help from the architecture's page table management to
     build hypervisor page tables, in particular enabling sharing the
     last level of page tables. This however was a lot of code (~3K
     lines) in order to support KVM, and also blocked several features.
     The biggest advantages is that the page size of userspace is
     completely independent of the page size used by the guest:
     userspace can mix normal pages, THPs and hugetlbfs as it sees fit,
     and in fact transparent hugepages were not possible before. It's
     also now possible to have nested guests and guests with huge pages
     running on the same host

   - Maintainership change for s390 vfio-pci

   - Small quality of life improvement for protected guests

  x86:

   - Add support for giving the guest full ownership of PMU hardware
     (contexted switched around the fastpath run loop) and allowing
     direct access to data MSRs and PMCs (restricted by the vPMU model).

     KVM still intercepts access to control registers, e.g. to enforce
     event filtering and to prevent the guest from profiling sensitive
     host state. This is more accurate, since it has no risk of
     contention and thus dropped events, and also has significantly less
     overhead.

     For more information, see the commit message for merge commit
     bf2c3138ae ("Merge tag 'kvm-x86-pmu-6.20' ...")

   - Disallow changing the virtual CPU model if L2 is active, for all
     the same reasons KVM disallows change the model after the first
     KVM_RUN

   - Fix a bug where KVM would incorrectly reject host accesses to PV
     MSRs when running with KVM_CAP_ENFORCE_PV_FEATURE_CPUID enabled,
     even if those were advertised as supported to userspace,

   - Fix a bug with protected guest state (SEV-ES/SNP and TDX) VMs,
     where KVM would attempt to read CR3 configuring an async #PF entry

   - Fail the build if EXPORT_SYMBOL_GPL or EXPORT_SYMBOL is used in KVM
     (for x86 only) to enforce usage of EXPORT_SYMBOL_FOR_KVM_INTERNAL.
     Only a few exports that are intended for external usage, and those
     are allowed explicitly

   - When checking nested events after a vCPU is unblocked, ignore
     -EBUSY instead of WARNing. Userspace can sometimes put the vCPU
     into what should be an impossible state, and spurious exit to
     userspace on -EBUSY does not really do anything to solve the issue

   - Also throw in the towel and drop the WARN on INIT/SIPI being
     blocked when vCPU is in Wait-For-SIPI, which also resulted in
     playing whack-a-mole with syzkaller stuffing architecturally
     impossible states into KVM

   - Add support for new Intel instructions that don't require anything
     beyond enumerating feature flags to userspace

   - Grab SRCU when reading PDPTRs in KVM_GET_SREGS2

   - Add WARNs to guard against modifying KVM's CPU caps outside of the
     intended setup flow, as nested VMX in particular is sensitive to
     unexpected changes in KVM's golden configuration

   - Add a quirk to allow userspace to opt-in to actually suppress EOI
     broadcasts when the suppression feature is enabled by the guest
     (currently limited to split IRQCHIP, i.e. userspace I/O APIC).
     Sadly, simply fixing KVM to honor Suppress EOI Broadcasts isn't an
     option as some userspaces have come to rely on KVM's buggy behavior
     (KVM advertises Supress EOI Broadcast irrespective of whether or
     not userspace I/O APIC supports Directed EOIs)

   - Clean up KVM's handling of marking mapped vCPU pages dirty

   - Drop a pile of *ancient* sanity checks hidden behind in KVM's
     unused ASSERT() macro, most of which could be trivially triggered
     by the guest and/or user, and all of which were useless

   - Fold "struct dest_map" into its sole user, "struct rtc_status", to
     make it more obvious what the weird parameter is used for, and to
     allow fropping these RTC shenanigans if CONFIG_KVM_IOAPIC=n

   - Bury all of ioapic.h, i8254.h and related ioctls (including
     KVM_CREATE_IRQCHIP) behind CONFIG_KVM_IOAPIC=y

   - Add a regression test for recent APICv update fixes

   - Handle "hardware APIC ISR", a.k.a. SVI, updates in
     kvm_apic_update_apicv() to consolidate the updates, and to
     co-locate SVI updates with the updates for KVM's own cache of ISR
     information

   - Drop a dead function declaration

   - Minor cleanups

  x86 (Intel):

   - Rework KVM's handling of VMCS updates while L2 is active to
     temporarily switch to vmcs01 instead of deferring the update until
     the next nested VM-Exit.

     The deferred updates approach directly contributed to several bugs,
     was proving to be a maintenance burden due to the difficulty in
     auditing the correctness of deferred updates, and was polluting
     "struct nested_vmx" with a growing pile of booleans

   - Fix an SGX bug where KVM would incorrectly try to handle EPCM page
     faults, and instead always reflect them into the guest. Since KVM
     doesn't shadow EPCM entries, EPCM violations cannot be due to KVM
     interference and can't be resolved by KVM

   - Fix a bug where KVM would register its posted interrupt wakeup
     handler even if loading kvm-intel.ko ultimately failed

   - Disallow access to vmcb12 fields that aren't fully supported,
     mostly to avoid weirdness and complexity for FRED and other
     features, where KVM wants enable VMCS shadowing for fields that
     conditionally exist

   - Print out the "bad" offsets and values if kvm-intel.ko refuses to
     load (or refuses to online a CPU) due to a VMCS config mismatch

  x86 (AMD):

   - Drop a user-triggerable WARN on nested_svm_load_cr3() failure

   - Add support for virtualizing ERAPS. Note, correct virtualization of
     ERAPS relies on an upcoming, publicly announced change in the APM
     to reduce the set of conditions where hardware (i.e. KVM) *must*
     flush the RAP

   - Ignore nSVM intercepts for instructions that are not supported
     according to L1's virtual CPU model

   - Add support for expedited writes to the fast MMIO bus, a la VMX's
     fastpath for EPT Misconfig

   - Don't set GIF when clearing EFER.SVME, as GIF exists independently
     of SVM, and allow userspace to restore nested state with GIF=0

   - Treat exit_code as an unsigned 64-bit value through all of KVM

   - Add support for fetching SNP certificates from userspace

   - Fix a bug where KVM would use vmcb02 instead of vmcb01 when
     emulating VMLOAD or VMSAVE on behalf of L2

   - Misc fixes and cleanups

  x86 selftests:

   - Add a regression test for TPR<=>CR8 synchronization and IRQ masking

   - Overhaul selftest's MMU infrastructure to genericize stage-2 MMU
     support, and extend x86's infrastructure to support EPT and NPT
     (for L2 guests)

   - Extend several nested VMX tests to also cover nested SVM

   - Add a selftest for nested VMLOAD/VMSAVE

   - Rework the nested dirty log test, originally added as a regression
     test for PML where KVM logged L2 GPAs instead of L1 GPAs, to
     improve test coverage and to hopefully make the test easier to
     understand and maintain

  guest_memfd:

   - Remove kvm_gmem_populate()'s preparation tracking and half-baked
     hugepage handling. SEV/SNP was the only user of the tracking and it
     can do it via the RMP

   - Retroactively document and enforce (for SNP) that
     KVM_SEV_SNP_LAUNCH_UPDATE and KVM_TDX_INIT_MEM_REGION require the
     source page to be 4KiB aligned, to avoid non-trivial complexity for
     something that no known VMM seems to be doing and to avoid an API
     special case for in-place conversion, which simply can't support
     unaligned sources

   - When populating guest_memfd memory, GUP the source page in common
     code and pass the refcounted page to the vendor callback, instead
     of letting vendor code do the heavy lifting. Doing so avoids a
     looming deadlock bug with in-place due an AB-BA conflict betwee
     mmap_lock and guest_memfd's filemap invalidate lock

  Generic:

   - Fix a bug where KVM would ignore the vCPU's selected address space
     when creating a vCPU-specific mapping of guest memory. Actually
     this bug could not be hit even on x86, the only architecture with
     multiple address spaces, but it's a bug nevertheless"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (267 commits)
  KVM: s390: Increase permitted SE header size to 1 MiB
  MAINTAINERS: Replace backup for s390 vfio-pci
  KVM: s390: vsie: Fix race in acquire_gmap_shadow()
  KVM: s390: vsie: Fix race in walk_guest_tables()
  KVM: s390: Use guest address to mark guest page dirty
  irqchip/riscv-imsic: Adjust the number of available guest irq files
  RISC-V: KVM: Transparent huge page support
  RISC-V: KVM: selftests: Add Zalasr extensions to get-reg-list test
  RISC-V: KVM: Allow Zalasr extensions for Guest/VM
  KVM: riscv: selftests: Add riscv vm satp modes
  KVM: riscv: selftests: add Zilsd and Zclsd extension to get-reg-list test
  riscv: KVM: allow Zilsd and Zclsd extensions for Guest/VM
  RISC-V: KVM: Skip IMSIC update if vCPU IMSIC state is not initialized
  RISC-V: KVM: Fix null pointer dereference in kvm_riscv_aia_imsic_rw_attr()
  RISC-V: KVM: Fix null pointer dereference in kvm_riscv_aia_imsic_has_attr()
  RISC-V: KVM: Remove unnecessary 'ret' assignment
  KVM: s390: Add explicit padding to struct kvm_s390_keyop
  KVM: LoongArch: selftests: Add steal time test case
  LoongArch: KVM: Add paravirt vcpu_is_preempted() support in guest side
  LoongArch: KVM: Add paravirt preempt feature in hypervisor side
  ...
2026-02-13 11:31:15 -08:00
Ian Rogers
dbf0108347 perf test script: Add python script testing support
Basic coverage of python script support from `perf script`.

Committer testing:

  $ perf test 'perf script python'
  107: perf script python tests                                        : Ok
  $ perf test -vv 'perf script python'
  107: perf script python tests:
  --- start ---
  test child forked, pid 595537
  Testing event: sched:sched_switch
  perf script python test [Skipped: failed to record sched:sched_switch]
  Testing event: task-clock
  Generating python script...
  generated Python script: /tmp/__perf_test_script.J4rWj.py
  Executing python script...
  perf script python test [Success: task-clock triggered param_dict]
  ---- end(0) ----
  107: perf script python tests                                        : Ok
  $

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Yujie Liu <yujie.liu@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-12 17:45:23 -03:00
Ian Rogers
2273697781 perf test script: Add perl script testing support
Basic coverage of perl script support from `perf script`. This is
disabled by default and so the test will most normally skip.

Committer testing:

  $ perf test 'perf script perl'
  106: perf script perl tests                                          : Skip
  $ perf test -vv 'perf script perl'
  106: perf script perl tests:
  --- start ---
  test child forked, pid 578323
  perf script perl test [Skipped: no libperl support]
  ---- end(-2) ----
  106: perf script perl tests                                          : Skip
  $ perf check feature libperl
                 libperl: [ OFF ]  # HAVE_LIBPERL_SUPPORT ( tip: Deprecated, use LIBPERL=1 and install perl-ExtUtils-Embed/libperl-dev to build with it )
  $

Install perl-ExtUtils-Embed, build with LIBPERL=1, rebuild:

  $ perf check feature libperl
                 libperl: [ on  ]  # HAVE_LIBPERL_SUPPORT
  $ perf test 'perf script perl'
  106: perf script perl tests                                          : Ok
  $ perf test -vv 'perf script perl'
  106: perf script perl tests:
  --- start ---
  test child forked, pid 588206
  Testing event: sched:sched_switch
  perf script perl test [Skipped: failed to record sched:sched_switch]
  Testing event: task-clock
  Generating perl script...
  generated Perl script: /tmp/__perf_test_script.RpMn5.pl
  Executing perl script...
  perf script perl test [Success: task-clock triggered $VAR1]
  ---- end(0) ----
  106: perf script perl tests                                          : Ok
  $

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Yujie Liu <yujie.liu@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-12 17:45:23 -03:00
Ian Rogers
22ca2f7f32 perf script: Allow the generated script to be a path
Allow the script generated by "perf script -g <language>" to be a file
path and the language determined by the file extension.

This is useful in testing so that the generated script file can be
written to a test directory.

Committer testing:

  $ perf record ls a.a
  ls: cannot access 'a.a': No such file or directory
  [ perf record: Woken up 2 times to write data ]
  [ perf record: Captured and wrote 0.003 MB perf.data (7 samples) ]
  $ perf script -g python
  generated Python script: perf-script.py
  $ perf script -g myscript.py
  generated Python script: myscript.py
  $ diff -u perf-script.py myscript.py
  $ tail myscript.py
  def trace_unhandled(event_name, context, event_fields_dict, perf_sample_dict):
  		print(get_dict_as_string(event_fields_dict))
  		print('Sample: {'+get_dict_as_string(perf_sample_dict['sample'], ', ')+'}')

  def print_header(event_name, cpu, secs, nsecs, pid, comm):
  	print("%-20s %5u %05u.%09u %8u %-20s " % \
  	(event_name, cpu, secs, nsecs, pid, comm), end="")

  def get_dict_as_string(a_dict, delimiter=' '):
  	return delimiter.join(['%s=%s'%(k,str(v))for k,v in sorted(a_dict.items())])
  $

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Yujie Liu <yujie.liu@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-12 17:45:22 -03:00
Ian Rogers
9083ce531a perf test: perf data --to-ctf testing
If babeltrace is detected check that --to-ctf functions with a data
file and in pipe mode.

Committer testing:

  $ perf test 'perf data convert --to-ctf'
  124: 'perf data convert --to-ctf' command test                       : Ok
  $ perf test -vv 'perf data convert --to-ctf'
  124: 'perf data convert --to-ctf' command test:
  --- start ---
  test child forked, pid 556008
           libbabeltrace: [ on  ]  # HAVE_LIBBABELTRACE_SUPPORT
  Testing Perf Data Conversion Command to CTF (File input)
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.021 MB /tmp/__perf_test.perf.data.9TxzZ (115 samples) ]
  [ perf data convert: Converted '/tmp/__perf_test.perf.data.9TxzZ' into CTF data '/tmp/__perf_test.ctf.f5EkS' ]
  [ perf data convert: Converted and wrote 0.012 MB (115 samples) ]
  Perf Data Converter Command to CTF (File input) [SUCCESS]
  Testing Perf Data Conversion Command to CTF (Pipe mode)
  [ perf record: Woken up 2 times to write data ]
  [ perf record: Captured and wrote 0.047 MB - ]
  Failed to setup all events.
  [ perf data convert: Converted '/tmp/__perf_test.perf.data.9TxzZ' into CTF data '/tmp/__perf_test.ctf.f5EkS' ]
  [ perf data convert: Converted and wrote 0.000 MB (0 samples) ]
  Perf Data Converter Command to CTF (Pipe mode) [SUCCESS]
  Unexpected signal in main
  ---- end(0) ----
  124: 'perf data convert --to-ctf' command test                       : Ok
  $

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Derek Foreman <derek.foreman@collabora.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-12 17:45:22 -03:00
Ian Rogers
fc4577b52a perf test: Test pipe mode with data conversion --to-json
Add pipe mode test for json data conversion. Tidy up exit and cleanup
code.

Committer testing:

  $ perf test 'perf data convert --to-json'
  124: 'perf data convert --to-json' command test                      : Ok
  $ perf test -vv 'perf data convert --to-json'
  124: 'perf data convert --to-json' command test:
  --- start ---
  test child forked, pid 548738
  Testing Perf Data Conversion Command to JSON
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.020 MB /tmp/__perf_test.perf.data.krxvl (104 samples) ]
  [ perf data convert: Converted '/tmp/__perf_test.perf.data.krxvl' into JSON data '/tmp/__perf_test.output.json.0z60p' ]
  [ perf data convert: Converted and wrote 0.075 MB (104 samples) ]
  Perf Data Converter Command to JSON [SUCCESS]
  Validating Perf Data Converted JSON file
  The file contains valid JSON format [SUCCESS]
  Testing Perf Data Conversion Command to JSON (Pipe mode)
  [ perf record: Woken up 2 times to write data ]
  [ perf record: Captured and wrote 0.046 MB - ]
  [ perf data convert: Converted '-' into JSON data '/tmp/__perf_test.output.json.0z60p' ]
  [ perf data convert: Converted and wrote 0.081 MB (110 samples) ]
  Perf Data Converter Command to JSON (Pipe mode) [SUCCESS]
  Validating Perf Data Converted JSON file
  The file contains valid JSON format [SUCCESS]
  ---- end(0) ----
  124: 'perf data convert --to-json' command test                      : Ok
  $

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Derek Foreman <derek.foreman@collabora.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-12 17:45:22 -03:00
Ian Rogers
5b92fc082c perf json: Pipe mode --to-ctf support
In pipe mode the environment may not be fully initialized so be robust
to fields being NULL.

Add default handling of attr events, use the feature events to populate
the ctf writer environment.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Derek Foreman <derek.foreman@collabora.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-12 17:45:22 -03:00
Ian Rogers
6db2f7c67b perf json: Pipe mode --to-json support
In pipe mode the environment may not be fully initialized so be robust
to fields being NULL. Add default handling of feature and attr events.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Derek Foreman <derek.foreman@collabora.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-12 17:45:22 -03:00
Ian Rogers
8772598b78 perf check: Add libbabeltrace to the listed features
This enables scripts to more easily determine if `perf data --to-ctf`
is supported.

Committer testing:

  $ perf check feature libbabeltrace
         libbabeltrace: [ on  ]  # HAVE_LIBBABELTRACE_SUPPORT
  $ perf check feature -q libbabeltrace && echo have libbabeltrace support
  have libbabeltrace support
  $

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Derek Foreman <derek.foreman@collabora.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-12 17:45:22 -03:00
hupu
9eb1760f84 perf build: Allow passing extra Clang flags via EXTRA_BPF_FLAGS
Add support for EXTRA_BPF_FLAGS in the eBPF skeleton build, allowing
users to pass additional clang options such as --sysroot or custom
include paths when cross-compiling perf.

This is primarily intended for cross-build scenarios where the default
host include paths do not match the target kernel version.

Example usage:

    make perf ARCH="arm64" EXTRA_BPF_FLAGS="--sysroot=..."

Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: hupu <hupu.gm@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-12 17:45:22 -03:00
Arnaldo Carvalho de Melo
adc1284bae perf test data_type_profiling.sh: Skip just the Rust tests if code_with_type workload is missing
Namhyung suggested skipping only the rust tests when the code_with_type
'perf test' workload is not built into perf, do it so that we can
continue to test the C based workloads:

With rust:

  root@number:/# perf test -vv "data type"
   83: perf data type profiling tests:
  --- start ---
  test child forked, pid 2645245
  Basic Rust perf annotate test
  Basic annotate test [Success]
  Pipe Rust perf annotate test
  Pipe annotate test [Success]
  Basic C perf annotate test
  Basic annotate test [Success]
  Pipe C perf annotate test
  Pipe annotate test [Success]
  ---- end(0) ----
   83: perf data type profiling tests                                  : Ok
  root@number:/#

Without:

  root@number:/# perf test "data type"
   83: perf data type profiling tests                                  : Ok
  root@number:/# perf test -vv "data type"
   83: perf data type profiling tests:
  --- start ---
  test child forked, pid 2634759
  Basic Rust perf annotate test
  Skip: code_with_type workload not built in 'perf test'
  Pipe Rust perf annotate test
  Skip: code_with_type workload not built in 'perf test'
  Basic C perf annotate test
  Basic annotate test [Success]
  Pipe C perf annotate test
  Pipe annotate test [Success]
  ---- end(0) ----
   83: perf data type profiling tests                                  : Ok
  root@number:/#

Suggested-by: Namhyung Kim <namhyung@kernel.org>
Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-12 17:45:22 -03:00
Ian Rogers
1a6c45969a perf libunwind: Fix calls to thread__e_machine()
Add the missing 'e_flags' option to fix the build.

Fixes: 4e66527f88 ("perf thread: Add optional e_flags output argument to thread__e_machine")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-12 17:45:22 -03:00
Linus Torvalds
41f1a08645 Merge tag 'kbuild-7.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux
Pull Kbuild/Kconfig updates from Nathan Chancellor:
 "Kbuild:

   - Drop '*_probe' pattern from modpost section check allowlist, which
     hid legitimate warnings (Johan Hovold)

   - Disable -Wtype-limits altogether, instead of enabling at W=2
     (Vincent Mailhol)

   - Improve UAPI testing to skip testing headers that require a libc
     when CONFIG_CC_CAN_LINK is not set, opening up testing of headers
     with no libc dependencies to more environments (Thomas Weißschuh)

   - Update gendwarfksyms documentation with required dependencies
     (Jihan LIN)

   - Reject invalid LLVM= values to avoid unintentionally falling back
     to system toolchain (Thomas Weißschuh)

   - Add a script to help run the kernel build process in a container
     for consistent environments and testing (Guillaume Tucker)

   - Simplify kallsyms by getting rid of the relative base (Ard
     Biesheuvel)

   - Performance and usability improvements to scripts/make_fit.py
     (Simon Glass)

   - Minor various clean ups and fixes

  Kconfig:

   - Move XPM icons to individual files, clearing up GTK deprecation
     warnings (Rostislav Krasny)

   - Support

        depends on FOO if BAR

     as syntactic sugar for

        depends on FOO || !BAR

     (Nicolas Pitre, Graham Roff)

   - Refactor merge_config.sh to use awk over shell/sed/grep,
     dramatically speeding up processing large number of config
     fragments (Anders Roxell, Mikko Rapeli)"

* tag 'kbuild-7.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux: (39 commits)
  kbuild: remove dependency of run-command on config
  scripts/make_fit: Compress dtbs in parallel
  scripts/make_fit: Support a few more parallel compressors
  kbuild: Support a FIT_EXTRA_ARGS environment variable
  scripts/make_fit: Move dtb processing into a function
  scripts/make_fit: Support an initial ramdisk
  scripts/make_fit: Speed up operation
  rust: kconfig: Don't require RUST_IS_AVAILABLE for rustc-option
  MAINTAINERS: Add scripts/install.sh into Kbuild entry
  modpost: Amend ppc64 save/restfpr symnames for -Os build
  MIPS: tools: relocs: Ship a definition of R_MIPS_PC32
  streamline_config.pl: remove superfluous exclamation mark
  kbuild: dummy-tools: Add python3
  scripts: kconfig: merge_config.sh: warn on duplicate input files
  scripts: kconfig: merge_config.sh: use awk in checks too
  scripts: kconfig: merge_config.sh: refactor from shell/sed/grep to awk
  kallsyms: Get rid of kallsyms relative base
  mips: Add support for PC32 relocations in vmlinux
  Documentation: dev-tools: add container.rst page
  scripts: add tool to run containerized builds
  ...
2026-02-11 13:40:35 -08:00
Paolo Bonzini
bf2c3138ae Merge tag 'kvm-x86-pmu-6.20' of https://github.com/kvm-x86/linux into HEAD
KVM mediated PMU support for 6.20

Add support for mediated PMUs, where KVM gives the guest full ownership of PMU
hardware (contexted switched around the fastpath run loop) and allows direct
access to data MSRs and PMCs (restricted by the vPMU model), but intercepts
access to control registers, e.g. to enforce event filtering and to prevent the
guest from profiling sensitive host state.

To keep overall complexity reasonable, mediated PMU usage is all or nothing
for a given instance of KVM (controlled via module param).  The Mediated PMU
is disabled default, partly to maintain backwards compatilibity for existing
setup, partly because there are tradeoffs when running with a mediated PMU that
may be non-starters for some use cases, e.g. the host loses the ability to
profile guests with mediated PMUs, the fastpath run loop is also a blind spot,
entry/exit transitions are more expensive, etc.

Versus the emulated PMU, where KVM is "just another perf user", the mediated
PMU delivers more accurate profiling and monitoring (no risk of contention and
thus dropped events), with significantly less overhead (fewer exits and faster
emulation/programming of event selectors) E.g. when running Specint-2017 on
a single-socket Sapphire Rapids with 56 cores and no-SMT, and using perf from
within the guest:

  Perf command:
  a. basic-sampling: perf record -F 1000 -e 6-instructions  -a --overwrite
  b. multiplex-sampling: perf record -F 1000 -e 10-instructions -a --overwrite

  Guest performance overhead:
  ---------------------------------------------------------------------------
  | Test case          | emulated vPMU | all passthrough | passthrough with |
  |                    |               |                 | event filters    |
  ---------------------------------------------------------------------------
  | basic-sampling     |   33.62%      |    4.24%        |   6.21%          |
  ---------------------------------------------------------------------------
  | multiplex-sampling |   79.32%      |    7.34%        |   10.45%         |
  ---------------------------------------------------------------------------
2026-02-11 12:45:40 -05:00
Linus Torvalds
4d84667627 Merge tag 'perf-core-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull performance event updates from Ingo Molnar:
 "x86 PMU driver updates:

   - Add support for the core PMU for Intel Diamond Rapids (DMR) CPUs
     (Dapeng Mi)

     Compared to previous iterations of the Intel PMU code, there's been
     a lot of changes, which center around three main areas:

      - Introduce the OFF-MODULE RESPONSE (OMR) facility to replace the
        Off-Core Response (OCR) facility

      - New PEBS data source encoding layout

      - Support the new "RDPMC user disable" feature

   - Likewise, a large series adds uncore PMU support for Intel Diamond
     Rapids (DMR) CPUs (Zide Chen)

     This centers around these four main areas:

      - DMR may have two Integrated I/O and Memory Hub (IMH) dies,
        separate from the compute tile (CBB) dies. Each CBB and each IMH
        die has its own discovery domain.

      - Unlike prior CPUs that retrieve the global discovery table
        portal exclusively via PCI or MSR, DMR uses PCI for IMH PMON
        discovery and MSR for CBB PMON discovery.

      - DMR introduces several new PMON types: SCA, HAMVF, D2D_ULA, UBR,
        PCIE4, CRS, CPC, ITC, OTC, CMS, and PCIE6.

      - IIO free-running counters in DMR are MMIO-based, unlike SPR.

   - Also add support for Add missing PMON units for Intel Panther Lake,
     and support Nova Lake (NVL), which largely maps to Panther Lake.
     (Zide Chen)

   - KVM integration: Add support for mediated vPMUs (by Kan Liang and
     Sean Christopherson, with fixes and cleanups by Peter Zijlstra,
     Sandipan Das and Mingwei Zhang)

   - Add Intel cstate driver to support for Wildcat Lake (WCL) CPUs,
     which are a low-power variant of Panther Lake (Zide Chen)

   - Add core, cstate and MSR PMU support for the Airmont NP Intel CPU
     (aka MaxLinear Lightning Mountain), which maps to the existing
     Airmont code (Martin Schiller)

  Performance enhancements:

   - Speed up kexec shutdown by avoiding unnecessary cross CPU calls
     (Jan H. Schönherr)

   - Fix slow perf_event_task_exit() with LBR callstacks (Namhyung Kim)

  User-space stack unwinding support:

   - Various cleanups and refactorings in preparation to generalize the
     unwinding code for other architectures (Jens Remus)

  Uprobes updates:

   - Transition from kmap_atomic to kmap_local_page (Keke Ming)

   - Fix incorrect lockdep condition in filter_chain() (Breno Leitao)

   - Fix XOL allocation failure for 32-bit tasks (Oleg Nesterov)

  Misc fixes and cleanups:

   - s390: Remove kvm_types.h from Kbuild (Randy Dunlap)

   - x86/intel/uncore: Convert comma to semicolon (Chen Ni)

   - x86/uncore: Clean up const mismatch (Greg Kroah-Hartman)

   - x86/ibs: Fix typo in dc_l2tlb_miss comment (Xiang-Bin Shi)"

* tag 'perf-core-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (58 commits)
  s390: remove kvm_types.h from Kbuild
  uprobes: Fix incorrect lockdep condition in filter_chain()
  x86/ibs: Fix typo in dc_l2tlb_miss comment
  x86/uprobes: Fix XOL allocation failure for 32-bit tasks
  perf/x86/intel/uncore: Convert comma to semicolon
  perf/x86/intel: Add support for rdpmc user disable feature
  perf/x86: Use macros to replace magic numbers in attr_rdpmc
  perf/x86/intel: Add core PMU support for Novalake
  perf/x86/intel: Add support for PEBS memory auxiliary info field in NVL
  perf/x86/intel: Add core PMU support for DMR
  perf/x86/intel: Add support for PEBS memory auxiliary info field in DMR
  perf/x86/intel: Support the 4 new OMR MSRs introduced in DMR and NVL
  perf/core: Fix slow perf_event_task_exit() with LBR callstacks
  perf/core: Speed up kexec shutdown by avoiding unnecessary cross CPU calls
  uprobes: use kmap_local_page() for temporary page mappings
  arm/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol()
  mips/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol()
  arm64/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol()
  riscv/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol()
  perf/x86/intel/uncore: Add Nova Lake support
  ...
2026-02-10 12:00:46 -08:00
Ian Rogers
5d1ab659fb perf stat: Add no-affinity flag
Add flag that disables affinity behavior.

Using sched_setaffinity() to place a perf thread on a CPU can avoid
certain interprocessor interrupts but may introduce a delay due to the
scheduling, particularly on loaded machines.

Add a command line option to disable the behavior.

This behavior is less present in other tools like `perf record`, as it
uses a ring buffer and doesn't make repeated system calls.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andres Freund <andres@anarazel.de>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-10 09:35:28 -03:00
Ian Rogers
d484361550 perf evlist: Reduce affinity use and move into iterator, fix no affinity
The evlist__for_each_cpu iterator will call sched_setaffitinity when
moving between CPUs to avoid IPIs.

If only 1 IPI is saved then this may be unprofitable as the delay to get
scheduled may be considerable.

This may be particularly true if reading an event group in `perf stat`
in interval mode.

Move the affinity handling completely into the iterator so that a single
evlist__use_affinity can determine whether CPU affinities will be used.

For `perf record` the change is minimal as the dummy event and the real
event will always make the use of affinities the thing to do.

In `perf stat`, tool events are ignored and affinities only used if >1
event on the same CPU occur.

Determining if affinities are useful is done by evlist__use_affinity
which tests per-event whether the event's PMU benefits from affinity use
- it is assumed only perf event using PMUs do.

Fix a bug where when there are no affinities that the CPU map iterator
may reference a CPU not present in the initial evsel. Fix by making the
iterator and non-iterator code common.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andres Freund <andres@anarazel.de>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-10 09:34:44 -03:00
Ian Rogers
47172912c9 perf evlist: Missing TPEBS close in evlist__close()
The libperf evsel close won't close TPEBS events properly.

Add a test to do this. The libperf close routine is used in
evlist__close() for affinity reasons.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andres Freund <andres@anarazel.de>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-10 09:34:11 -03:00
Ian Rogers
ff8548172f perf evlist: Special map propagation for tool events that read on 1 CPU
Tool events like duration_time don't need a perf_cpu_map that contains
all online CPUs.

Having such a perf_cpu_map causes overheads when iterating between
events for CPU affinity.

During parsing mark events that just read on a single CPU map index as
such, then during map propagation set up the evsel's CPUs and thereby
the evlists accordingly.

The setting cannot be done early in parsing as user CPUs are only fully
known when evlist__create_maps is called.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andres Freund <andres@anarazel.de>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-10 09:33:28 -03:00
Ian Rogers
63b320aaac perf stat-shadow: In prepare_metric fix guard on reading NULL perf_stat_evsel
The aggr value is setup to always be non-null creating a redundant
guard for reading from it. Switch to using the perf_stat_evsel (ps)
and narrow the scope of aggr so that it is known valid when used.

Fixes: 3d65f6445f ("perf stat-shadow: Read tool events directly")
Reported-by: Andres Freund <andres@anarazel.de>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-10 09:33:13 -03:00
Ian Rogers
bc105a8918 Revert "perf tool_pmu: More accurately set the cpus for tool events"
This reverts commit d8d8a0b360.

The setting of a user CPU map can cause an empty intersection when
combined with CPU 0 and the event removed. This later triggers a segv in
the stat-shadow logic. Let's put back a full online CPU map for now by
reverting this patch.

Closes: https://lore.kernel.org/linux-perf-users/cgja46br2smmznxs7kbeabs6zgv3b4olfqgh2fdp5mxk2yom4v@w6jjgov6hdi6/
Reported-by: Andres Freund <andres@anarazel.de>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-10 09:32:34 -03:00
Thomas Richter
3d012b8614 perf test: Fix test case perftool-testsuite_report for s390
Test case perftool-testsuite_report fails on s390 for some time
now.

Root cause is a time out which is too tight for large s390 machines.
The time out value addr2line_timeout_ms is per default set to 1 second.

This is the maximum time the function read_addr2line_record() waits for
a reply from the forked off tool addr2line, which is started as a child
in interactive mode.

It reads stdin (an address in hexadecimal) and replies on stdout with
function name, file name and line number. This might take more than one
second.

However one second is not always enough and the reply from addr2line
tool is not received. Function read_addr2line_record() fails and emits
a warning, which is not expected by the test case. It fails.

Output before:

 # perf test -F 133
 -- [ PASS ] -- perf_report :: setup :: prepare the perf.data file
 ==================
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.087 MB \
	/tmp/perftool-testsuite_report.FHz/perf_report/perf.data.1 \
	(207 samples) ]
 ==================
 -- [ PASS ] -- perf_report :: setup :: prepare the perf.data.1 file
 ## [ PASS ] ## perf_report :: setup SUMMARY
 -- [ SKIP ] -- perf_report :: test_basic :: help message :: testcase skipped
 Line did not match any pattern: "cmd__addr2line /usr/lib/debug/lib/modules/
 	6.19.0-20260205.rc8.git366.9845cf73f7db.300.fc43.s390x+next/
	vmlinux: could not read first record"
 Line did not match any pattern: "cmd__addr2line /usr/lib/debug/lib/modules/
	6.19.0-20260205.rc8.git366.9845cf73f7db.300.fc43.s390x+next/
	vmlinux: could not read first record"
 -- [ FAIL ] -- perf_report :: test_basic :: basic execution
	(output regexp parsing)
 ....
 133: perftool-testsuite_report      : FAILED!

Output after:

 # ./perf test -F 133
 -- [ PASS ] -- perf_report :: setup :: prepare the perf.data file
 ==================
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.087 MB \
	 /tmp/perftool-testsuite_report.Mlp/perf_report/perf.data.1
	 (188 samples) ]
 ==================
 -- [ PASS ] -- perf_report :: setup :: prepare the perf.data.1 file
 ## [ PASS ] ## perf_report :: setup SUMMARY
 -- [ SKIP ] -- perf_report :: test_basic :: help message :: testcase skipped
 -- [ PASS ] -- perf_report :: test_basic :: basic execution
 -- [ PASS ] -- perf_report :: test_basic :: number of samples
 -- [ PASS ] -- perf_report :: test_basic :: header
 -- [ PASS ] -- perf_report :: test_basic :: header timestamp
 -- [ PASS ] -- perf_report :: test_basic :: show CPU utilization
 -- [ PASS ] -- perf_report :: test_basic :: pid
 -- [ PASS ] -- perf_report :: test_basic :: non-existing symbol
 -- [ PASS ] -- perf_report :: test_basic :: symbol filter
 -- [ PASS ] -- perf_report :: test_basic :: latency header
 -- [ PASS ] -- perf_report :: test_basic :: default report for latency profile
 -- [ PASS ] -- perf_report :: test_basic :: latency report for latency profile
 -- [ PASS ] -- perf_report :: test_basic :: parallelism histogram
 ## [ PASS ] ## perf_report :: test_basic SUMMARY
 133: perftool-testsuite_report      : Ok
 #

Fixes: 257046a367 ("perf srcline: Fallback between addr2line implementations")
Reviewed-by: Jan Polensky <japo@linux.ibm.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: linux-s390@vger.kernel.org
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-09 15:51:35 -03:00
Arnaldo Carvalho de Melo
2a400eeba4 perf test code_with_type.sh: Skip test if rust wasn't available at build time
$ perf test 'perf data type profiling tests'
   83: perf data type profiling tests                         : Skip
  $ perf test -vv 'perf data type profiling tests'
   83: perf data type profiling tests:
  --- start ---
  test child forked, pid 977213
  Skip: code_with_type workload not built in 'perf test'
  ---- end(-2) ----
   83: perf data type profiling tests                         : Skip
  $

Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-09 15:47:03 -03:00
Dmitrii Dolgov
1d3ffe6233 perf tests workload: Formatting for code_with_type.rs
One part of the rust code for code_with_type workload wasn't properly
formatted.

Pass it through rustfmt to fix that.

Closes: https://lore.kernel.org/oe-kbuild-all/202602091357.oyRv6hgQ-lkp@intel.com/
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Dmitrii Dolgov <9erthalion6@gmail.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-09 15:36:00 -03:00
Paolo Bonzini
5490063269 Merge tag 'kvmarm-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 updates for 7.0

- Add support for FEAT_IDST, allowing ID registers that are not
  implemented to be reported as a normal trap rather than as an UNDEF
  exception.

- Add sanitisation of the VTCR_EL2 register, fixing a number of
  UXN/PXN/XN bugs in the process.

- Full handling of RESx bits, instead of only RES0, and resulting in
  SCTLR_EL2 being added to the list of sanitised registers.

- More pKVM fixes for features that are not supposed to be exposed to
  guests.

- Make sure that MTE being disabled on the pKVM host doesn't give it
  the ability to attack the hypervisor.

- Allow pKVM's host stage-2 mappings to use the Force Write Back
  version of the memory attributes by using the "pass-through'
  encoding.

- Fix trapping of ICC_DIR_EL1 on GICv5 hosts emulating GICv3 for the
  guest.

- Preliminary work for guest GICv5 support.

- A bunch of debugfs fixes, removing pointless custom iterators stored
  in guest data structures.

- A small set of FPSIMD cleanups.

- Selftest fixes addressing the incorrect alignment of page
  allocation.

- Other assorted low-impact fixes and spelling fixes.
2026-02-09 18:18:19 +01:00
Arnaldo Carvalho de Melo
3f5dfa472e tools build: Fix rust feature detection
Features in FEATURE_TESTS_BASIC will be set as being available if
test-all.c builds, so since the rust test isn't included in test-all.c,
we can't have 'rust' in there, remove it from FEATURE_TESTS_BASIC and
use feature-check so that it tries to build test-rust.bin, doing the
actual feature detection.

On a system lacking a rust compiler:

  Makefile.config:1158: Rust is not found. Test workloads with rust are disabled.

  Auto-detecting system features:
  ...                                   libdw: [ on  ]
  ...                                   glibc: [ on  ]
  ...                                  libelf: [ on  ]
  ...                                 libnuma: [ on  ]
  ...                  numa_num_possible_cpus: [ on  ]
  ...                               libpython: [ on  ]
  ...                             libcapstone: [ on  ]
  ...                               llvm-perf: [ on  ]
  ...                                    zlib: [ on  ]
  ...                                    lzma: [ on  ]
  ...                                     bpf: [ on  ]
  ...                                  libaio: [ on  ]
  ...                                 libzstd: [ on  ]
  ...                              libopenssl: [ on  ]
  ...                                    rust: [ OFF ]

  $ cat /tmp/build/perf-tools-next/feature/test-rust.make.output
  /bin/sh: line 1: rustc: command not found
  $ file /tmp/build/perf-tools-next/feature/test-rust.bin
  /tmp/build/perf-tools-next/feature/test-rust.bin: cannot open `/tmp/build/perf-tools-next/feature/test-rust.bin' (No such file or directory)
  $
  $ perf -vv | grep RUST
                  rust: [ OFF ]  # HAVE_RUST_SUPPORT
  $

And after installing it:

  ...                                    rust: [ on  ]

  $ cat /tmp/build/perf-tools-next/feature/test-rust.make.output
  $ file /tmp/build/perf-tools-next/feature/test-rust.bin
/tmp/build/perf-tools-next/feature/test-rust.bin: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=9c416edf673ee3705b97bae893a99a6fcf1ee258, for GNU/Linux 3.2.0, with debug_info, not stripped
  $
  $ perf -vv | grep RUST
                  rust: [ on  ]  # HAVE_RUST_SUPPORT
  $

Fixes: 6a32fa5ccd ("tools build: Add a feature test for rust compiler")
Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-09 11:10:56 -03:00
Dmitrii Dolgov
335047109d perf tests: Test annotate with data type profiling and C
Exercise the annotate command with data type profiling feature with C.

For that extend the existing data type profiling shell test to profile
the datasym workload, then annotate the result expecting to see some
data structures from the C code.

Committer testing:

  root@number:~# perf test 'perf data type profiling tests'
   83: perf data type profiling tests                                  : Ok
  root@number:~# perf test -vv 'perf data type profiling tests'
   83: perf data type profiling tests:
  --- start ---
  test child forked, pid 125028
  Basic Rust perf annotate test
  Basic annotate test [Success]
  Pipe Rust perf annotate test
  Pipe annotate test [Success]
  Basic C perf annotate test
  Basic annotate test [Success]
  Pipe C perf annotate test
  Pipe annotate test [Success]
  ---- end(0) ----
   83: perf data type profiling tests                                  : Ok
  root@number:~#

Signed-off-by: Dmitrii Dolgov <9erthalion6@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-08 19:16:28 -03:00
Dmitrii Dolgov
f60a5c2296 perf tests: Test annotate with data type profiling and rust
Exercise the annotate command with data type profiling feature on the
rust runtime. For that add a new shell test, which will profile the
code_with_type workload, then annotate the result expecting to see some
data structures from the rust code.

Committer testing:

  root@number:~# perf test 'perf data type profiling tests'
   83: perf data type profiling tests                            : Ok
  root@number:~# perf test -v 'perf data type profiling tests'
   83: perf data type profiling tests                            : Ok
  root@number:~# perf test -vv 'perf data type profiling tests'
   83: perf data type profiling tests:
  --- start ---
  test child forked, pid 111044
  Basic perf annotate test
  Basic annotate test [Success]
  Pipe perf annotate test
  Pipe annotate test [Success]
  ---- end(0) ----
   83: perf data type profiling tests                            : Ok
  root@number:~#

Signed-off-by: Dmitrii Dolgov <9erthalion6@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-08 19:16:28 -03:00
Dmitrii Dolgov
2e05bb52a1 perf test workload: Add code_with_type test workload
The purpose of the workload is to gather samples of rust runtime. To
achieve that it has a dummy rust library linked with it.

Per recommendations for such scenarios [1], the rust library is
statically linked.

An example:

$ perf record perf test -w code_with_type
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.160 MB perf.data (4074 samples) ]

$ perf report --stdio --dso perf -s srcfile,srcline
    45.16%  ub_checks.rs       ub_checks.rs:72
     6.72%  code_with_type.rs  code_with_type.rs:15
     6.64%  range.rs           range.rs:767
     4.26%  code_with_type.rs  code_with_type.rs:21
     4.23%  range.rs           range.rs:0
     3.99%  code_with_type.rs  code_with_type.rs:16
    [...]

[1]: https://doc.rust-lang.org/reference/linkage.html#mixed-rust-and-foreign-codebases

Signed-off-by: Dmitrii Dolgov <9erthalion6@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-08 19:16:24 -03:00
Dmitrii Dolgov
6a32fa5ccd tools build: Add a feature test for rust compiler
Add a feature test to identify if the rust compiler is available, so
that perf could build rust based worloads based on that.

Signed-off-by: Dmitrii Dolgov <9erthalion6@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-08 11:30:45 -03:00
Ian Rogers
ff9aeb6bd1 perf test parse-metric: Ensure aggregate counts appear to have run
Commit bb5a920b90 ("perf stat: Ensure metrics are displayed even
with failed events") with failed events") made it so that counters which
weren't enabled in the kernel were handled as NaN in metrics.

This caused the "Parse and process metrics" test to start failing as it
wasn't putting a non-zero value in these variables.

Add arbitrary values of 1 to fix the test.

Fixes: bb5a920b90 ("perf stat: Ensure metrics are displayed even with failed events")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-08 11:30:01 -03:00
Ian Rogers
c60ee958d6 perf test record.sh: Fix shellcheck warning
Add quotes to avoid the following warning:
```
In tests/shell/record.sh line 264:
 [ $(uname -m) = "s390x" ] && {
   ^---------^ SC2046 (warning): Quote this to prevent word splitting.

For more information:
 https://www.shellcheck.net/wiki/SC2046 -- Quote this to prevent word splitt...
```

Fixes: c73a56ed3c ("perf test: Fix test case Leader sampling on s390")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-08 11:29:58 -03:00
Ian Rogers
36a1b0061a perf build: Reduce pmu-events related copying and mkdirs
When building to an output directory the previous code would remove
files and then copy the source files over.

Each source file copy would have a rule to make its directory. All JSON
for every architecture was considered a source file.

This led to unnecessary copying as a file would be deleted and then the
same file copied again, unnecessary directory making, and copying of
files not used in the build.

A side-effect would be a lot of build messages.

This change makes it so that all computed output files are created and
then compared to all files in the OUTPUT directory.

By filtering out the files that would be copied, unnecessary files can
be determined and then deleted - note, this is a phony target which
would remake the pmu-events.c if always depended upon, and so the
dependency is conditional on there being files to remove.

This has some overhead as the $(OUTPUT)/pmu-events is "find" over rather
than just "rm -fr", but the savings from unnecessary copying, etc.
should make up for this new make overhead.

The copy target just does copying but has a dependency on the directory
it needs being built, avoiding repetitive mkdirs.

The source files for copying only consider the JEVENTS_ARCH unless the
JEVENTS_ARCH is all.

The metric JSON is only generated if appropriate, rather than always
being generated and jevents.py deciding whether or not to use the files.

The mypy and pylint targets are fixed as variable names had changed but
the rules not updated.

The line count of a build with "make -C tools/perf O=/tmp/perf clean all"
prior to this change was 2181 lines, after this change it is 1596
lines.

This is a reduction of 585 lines or about 27%.

The generated pmu-events.c for JEVENTS_ARCH "x86" and "all" were
validated as being identical after this change.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-06 19:11:36 -03:00
Tycho Andersen (AMD)
4479884d1f perf lock contention: fix segfault in lock contention -b/--use-bpf
When run on a kernel without BTF info, perf crashes:

    libbpf: kernel BTF is missing at '/sys/kernel/btf/vmlinux', was CONFIG_DEBUG_INFO_BTF enabled?
    libbpf: failed to find valid kernel BTF

    Program received signal SIGSEGV, Segmentation fault.
    0x00005555556915b7 in btf.type_cnt ()
    (gdb) bt
    #0  0x00005555556915b7 in btf.type_cnt ()
    #1  0x0000555555691fbc in btf_find_by_name_kind ()
    #2  0x00005555556920d0 in btf.find_by_name_kind ()
    #3  0x00005555558a1b7c in init_numa_data (con=0x7fffffffd0a0) at util/bpf_lock_contention.c:125
    #4  0x00005555558a264b in lock_contention_prepare (con=0x7fffffffd0a0) at util/bpf_lock_contention.c:313
    #5  0x0000555555620702 in __cmd_contention (argc=0, argv=0x7fffffffea10) at builtin-lock.c:2084
    #6  0x0000555555622c8d in cmd_lock (argc=0, argv=0x7fffffffea10) at builtin-lock.c:2755
    #7  0x0000555555651451 in run_builtin (p=0x555556104f00 <commands+576>, argc=3, argv=0x7fffffffea10)
        at perf.c:349
    #8  0x00005555556516ed in handle_internal_command (argc=3, argv=0x7fffffffea10) at perf.c:401
    #9  0x000055555565184e in run_argv (argcp=0x7fffffffe7fc, argv=0x7fffffffe7f0) at perf.c:445
    #10 0x0000555555651b9f in main (argc=3, argv=0x7fffffffea10) at perf.c:553

Check if btf loading failed, and don't do anything with it in
init_numa_data(). This leads to the following error message, instead of
just a crash:

    libbpf: kernel BTF is missing at '/sys/kernel/btf/vmlinux', was CONFIG_DEBUG_INFO_BTF enabled?
    libbpf: failed to find valid kernel BTF
    libbpf: kernel BTF is missing at '/sys/kernel/btf/vmlinux', was CONFIG_DEBUG_INFO_BTF enabled?
    libbpf: failed to find valid kernel BTF
    libbpf: Error loading vmlinux BTF: -ESRCH
    libbpf: failed to load BPF skeleton 'lock_contention_bpf': -ESRCH
    Failed to load lock-contention BPF skeleton
    lock contention BPF setup failed

Signed-off-by: Tycho Andersen (AMD) <tycho@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-06 18:57:26 -03:00
Ricky Ringler
920c5570a6 perf sort: Replace static cacheline size with sysconf cacheline size
Testing:
- Built perf
- Executed perf mem record and report

Committer notes:

This addresses a TODO and improves the situation where record and
report/c2c are performed on the same machine or in machines with the
same cacheline size, but the proper way is to store the cacheline size
in the perf.data header at 'record' time and then use it at post
processing time.

Signed-off-by: Ricky Ringler <ricky.ringler@proton.me>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20260129004223.26799-1-ricky.ringler@proton.me
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-06 18:51:15 -03:00
Thomas Richter
c73a56ed3c perf test: Fix test case Leader sampling on s390
The subtest 'Leader sampling' some time fails on s390.
- for z/VM guest: Disable the test for z/VM guest. There is no
  CPU Measurement facility to run the test successfully.
- for LPAR: Use correct event names.

A detailed analysis follows here:
Now to the debugging and investigation:
1. With command
       perf record -e '{cycles,cycles}:S' -- ....
   the first cycles event starts sampling.
   On s390 this sets up sampling with a frequency of 4000 Hz.
   This translates to hardware sample rate of 1377000 instructions per
   micro-second to meet a frequency of 4000 HZ.

2. With first event cycles now sampling into a hardware buffer, an
   interrupt is triggered each time a sampling buffer gets full.
   The interrupt handler is then invoked and debug output shows the
   processing of samples.  The size of one hardware sample is 32 bytes.
   With an interrupt triggered when the hardware buffer page of 4KB
   gets full, the interrupt handler processes 128 samples.
   (This is taken from s390 specific fast debug data gathering)
   2025-11-07 14:35:51.977248  000003ffe013cbfa \
	   perf_event_count_update event->count 0x0 count 0x1502e8
   2025-11-07 14:35:51.977248  000003ffe013cbfa \
	   perf_event_count_update event->count 0x1502e8 count 0x1502e8
   2025-11-07 14:35:51.977248  000003ffe013cbfa \
	   perf_event_count_update event->count 0x2a05d0 count 0x1502e8
   2025-11-07 14:35:51.977252  000003ffe013cbfa \
	   perf_event_count_update event->count 0x3f08b8 count 0x1502e8
   2025-11-07 14:35:51.977252  000003ffe013cbfa \
	   perf_event_count_update event->count 0x540ba0 count 0x1502e8
   2025-11-07 14:35:51.977253  000003ffe013cbfa \
	   perf_event_count_update event->count 0x690e88 count 0x1502e8
   2025-11-07 14:35:51.977254  000003ffe013cbfa \
	   perf_event_count_update event->count 0x7e1170 count 0x1502e8
   2025-11-07 14:35:51.977254  000003ffe013cbfa \
	   perf_event_count_update event->count 0x931458 count 0x1502e8
   2025-11-07 14:35:51.977254  000003ffe013cbfa \
	   perf_event_count_update event->count 0xa81740 count 0x1502e8

3. The value is constantly increasing by the number of instructions
   executed to generate a sample entry.  This is the first line of the
   pairs of lines. count 0x1502e8 --> 1377000

   # perf script | grep 1377000 | wc -l
   214
   # perf script | wc -l
   428
   #
   That is 428 lines in total, and half of the lines contain value
   1377000.

4. The second event cycles is opened against the counting PMU, which
   is an independent PMU and is not interrupt driven.  Once enabled it
   runs in the background and keeps running, incrementing silently
   about 400+ counters. The counter values are read via assembly
   instructions.

   This second counter PMU's read call back function is called when the
   interrupt handler of the sampling facility processes each sample. The
   function call sequence is:

   perf_event_overflow()
   +--> __perf_event_overflow()
        +--> __perf_event_output()
               +--> perf_output_sample()
                    +--> perf_output_read()
                         +--> perf_output_read_group()
	                          for_each_sibling_event(sub, leader) {
		values[n++] = perf_event_count(sub, self);
		printk("%s sub %p values %#lx\n", __func__, sub, values[n-1]);
			          }

   The last function perf_event_count() is invoked on the second event
   cylces *on* the counting PMU. An added printk statement shows the
   following lines in the dmesg output:

   # dmesg|grep perf_output_read_group |head -10
   [  332.368620] perf_output_read_group sub 00000000d80b7c1f values 0x3a80917 (1)
   [  332.368624] perf_output_read_group sub 00000000d80b7c1f values 0x3a86c7f (2)
   [  332.368627] perf_output_read_group sub 00000000d80b7c1f values 0x3a89c15 (3)
   [  332.368629] perf_output_read_group sub 00000000d80b7c1f values 0x3a8c895 (4)
   [  332.368631] perf_output_read_group sub 00000000d80b7c1f values 0x3a8f569 (5)
   [  332.368633] perf_output_read_group sub 00000000d80b7c1f values 0x3a9204b
   [  332.368635] perf_output_read_group sub 00000000d80b7c1f values 0x3a94790
   [  332.368637] perf_output_read_group sub 00000000d80b7c1f values 0x3a9704b
   [  332.368638] perf_output_read_group sub 00000000d80b7c1f values 0x3a99888
   #

   This correlates with the output of
   # perf report -D | grep 'id 00000000000000'|head -10
   ..... id 0000000000000006, value 00000000001502e8, lost 0
   ..... id 000000000000000e, value 0000000003a80917, lost 0 --> line (1) above
   ..... id 0000000000000006, value 00000000002a05d0, lost 0
   ..... id 000000000000000e, value 0000000003a86c7f, lost 0 --> line (2) above
   ..... id 0000000000000006, value 00000000003f08b8, lost 0
   ..... id 000000000000000e, value 0000000003a89c15, lost 0 --> line (3) above
   ..... id 0000000000000006, value 0000000000540ba0, lost 0
   ..... id 000000000000000e, value 0000000003a8c895, lost 0 --> line (4) above
   ..... id 0000000000000006, value 0000000000690e88, lost 0
   ..... id 000000000000000e, value 0000000003a8f569, lost 0 --> line (5) above

Summary:
- Above command starts the CPU sampling facility, with runs interrupt
  driven when a 4KB page is full. An interrupt processes the 128 samples
  and calls eventually perf_output_read_group() for each sample to save it
  in the event's ring buffer.

- At that time the CPU counting facility is invoked to read the value of
  the event cycles. This value is saved as the second value in the
  sample_read structure.

- The first and odd lines in the perf script output displays the period
  value between 2 samples being created by hardware. It is the number
  of instructions executes before the hardware writes a sample.

- The second and even lines in the perf script output displays the number
  of CPU cycles needed to process each sample and save it in the event's
  ring buffer.
These 2 different values can never be identical on s390.

Since event leader sampling is not possible on s390 the perf tool will
return EOPNOTSUPP soon. Perpare the test case for that.

Suggested-by: James Clark <james.clark@linaro.org>
Reviewed-by: Jan Polensky <japo@linux.ibm.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Tested-by: Jan Polensky <japo@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-06 18:40:24 -03:00