Commit Graph

17532 Commits

Author SHA1 Message Date
Ian Rogers
afffec6f03 perf dso: Add support for reading the e_machine type for a dso
For ELF file dsos read the e_machine from the ELF header. For kernel
types assume the e_machine matches the perf tool. In other cases
return EM_NONE.

When reading from the ELF header use DSO__SWAP that may need
dso->needs_swap initializing. Factor out dso__swap_init to allow this.
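
As a rough illustration of reading e_machine with a byte swap, the sketch below parses an in-memory ELF header. It is not the perf implementation: sketch_elf_e_machine() and host_is_little_endian() are hypothetical names, and the swap is decided from EI_DATA rather than from dso->needs_swap.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: non-zero when the host stores integers little-endian. */
static int host_is_little_endian(void)
{
	const uint16_t probe = 1;

	return *(const unsigned char *)&probe == 1;
}

/*
 * Hypothetical sketch: return e_machine from an in-memory ELF header,
 * or EM_NONE when the buffer does not start with a complete ELF header.
 */
static uint16_t sketch_elf_e_machine(const unsigned char *buf, size_t len)
{
	uint16_t e_machine;
	int file_is_le;

	assert(buf != NULL);
	if (len < sizeof(Elf32_Ehdr) || memcmp(buf, ELFMAG, SELFMAG) != 0)
		return EM_NONE;

	/* e_machine sits at the same offset in 32-bit and 64-bit headers. */
	memcpy(&e_machine, buf + offsetof(Elf32_Ehdr, e_machine), sizeof(e_machine));

	/* Swap the two bytes when file and host byte order differ. */
	file_is_le = buf[EI_DATA] == ELFDATA2LSB;
	if (file_is_le != host_is_little_endian())
		e_machine = (uint16_t)((e_machine >> 8) | (e_machine << 8));

	return e_machine;
}
```

The sketch treats anything other than ELFDATA2LSB as big-endian, which is a simplification.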

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: https://lore.kernel.org/r/20250319050741.269828-7-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 22:58:02 -07:00
Ian Rogers
5c2938fe78 perf syscalltbl: Remove struct syscalltbl
The syscalltbl held entries of system call name and number pairs,
generated from a native syscalltbl at start up. As there are gaps in
the system call number there is a notion of index into the
table. Going forward we want the system call table to be identifiable
by a machine type, for example, i386 vs x86-64. Change the interface
to the syscalltbl so that (1) a machine type (EM_HOST, currently unused)
is passed (2) the index to syscall number and system call name mapping
is computed at build time.

Two tables are used for this, an array of system call number to name,
an array of system call numbers sorted by the system call name. The
sorted array doesn't store strings in part to save memory and
relocations. The index notion is carried forward and is an index into
the sorted array of system call numbers, the data structures are
opaque (held only in syscalltbl.c), and so the number of indices for a
machine type is exposed as a new API.

The arrays are computed in the syscalltbl.sh script and so no start-up
time computation and storage is necessary.
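
The two-table layout described above can be sketched as follows. This is only an illustration under stated assumptions: the table contents are a made-up five-entry example, and every sketch_* name is hypothetical rather than taken from syscalltbl.c or the output of syscalltbl.sh.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical table 1: syscall number -> name. Number 4 is a gap (NULL). */
static const char *const sketch_num_to_name[] = {
	[0] = "read",
	[1] = "write",
	[2] = "open",
	[3] = "close",
	[5] = "fstat",
};

/*
 * Hypothetical table 2: syscall numbers sorted by name
 * (close, fstat, open, read, write). No strings are stored here.
 */
static const unsigned short sketch_sorted_names[] = { 3, 5, 2, 0, 1 };

/* bsearch() comparator: the key is a name, the element a syscall number. */
static int sketch_cmp(const void *key, const void *elem)
{
	const char *name = key;
	unsigned short num = *(const unsigned short *)elem;

	return strcmp(name, sketch_num_to_name[num]);
}

/* Name -> syscall number, or -1 when the name is unknown. */
static int sketch_syscall_id(const char *name)
{
	const unsigned short *hit;

	assert(name != NULL);
	hit = bsearch(name, sketch_sorted_names,
		      SKETCH_ARRAY_SIZE(sketch_sorted_names),
		      sizeof(sketch_sorted_names[0]), sketch_cmp);
	return hit ? *hit : -1;
}

/* Syscall number -> name, or NULL for a gap or an out-of-range number. */
static const char *sketch_syscall_name(int id)
{
	if (id < 0 || (size_t)id >= SKETCH_ARRAY_SIZE(sketch_num_to_name))
		return NULL;
	return sketch_num_to_name[id];
}

/* Number of valid indices, i.e. entries in the sorted array. */
static size_t sketch_num_idx(void)
{
	return SKETCH_ARRAY_SIZE(sketch_sorted_names);
}
```

Because both arrays are constant data, nothing has to be sorted or allocated at start-up in this sketch; lookups by name are a binary search over the sorted numbers.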

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: https://lore.kernel.org/r/20250319050741.269828-6-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 22:57:57 -07:00
Ian Rogers
3d94b8441c perf trace: Reorganize syscalls
Identify struct syscall information in the syscalls table by a machine
type and syscall number, not just system call number. Having the
machine type means that 32-bit system calls can be differentiated from
64-bit ones on a machine capable of both. Having a table for all
machine types and all system call numbers would be too large, so
maintain a sorted array of system calls as they are encountered.
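
A minimal sketch of keeping such a sorted array, keyed by machine type and then syscall number, might look like the following. The struct layout and all sketch_* names are assumptions for illustration, not the code in perf trace.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-syscall record; a real one would carry more state. */
struct sketch_syscall {
	int e_machine;
	int id;
};

struct sketch_table {
	struct sketch_syscall *entries;
	size_t nr;
};

/* Order by machine type first, then by syscall number. */
static int sketch_syscall_cmp(const void *a, const void *b)
{
	const struct sketch_syscall *sa = a, *sb = b;

	if (sa->e_machine != sb->e_machine)
		return sa->e_machine < sb->e_machine ? -1 : 1;
	if (sa->id != sb->id)
		return sa->id < sb->id ? -1 : 1;
	return 0;
}

/*
 * Look up (e_machine, id); on a miss, append the entry and re-sort.
 * Returns NULL on allocation failure. Pointers returned earlier are
 * invalidated when the table grows.
 */
static struct sketch_syscall *sketch_find_or_add(struct sketch_table *t,
						 int e_machine, int id)
{
	struct sketch_syscall key = { .e_machine = e_machine, .id = id };
	struct sketch_syscall *found, *grown;

	assert(t != NULL);
	if (t->nr) {
		found = bsearch(&key, t->entries, t->nr, sizeof(key),
				sketch_syscall_cmp);
		if (found)
			return found;
	}

	grown = realloc(t->entries, (t->nr + 1) * sizeof(key));
	if (!grown)
		return NULL;
	t->entries = grown;
	t->entries[t->nr++] = key;
	qsort(t->entries, t->nr, sizeof(key), sketch_syscall_cmp);

	return bsearch(&key, t->entries, t->nr, sizeof(key), sketch_syscall_cmp);
}
```

The table only grows with the (machine type, syscall number) pairs actually seen, which is the point of not pre-building a table for every combination.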

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: https://lore.kernel.org/r/20250319050741.269828-5-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 22:57:53 -07:00
Ian Rogers
af472d3c44 perf syscalltbl: Remove syscall_table.h
The definition of "static const char *const syscalltbl[] = {" is done
in a generated syscalls_32.h or syscalls_64.h that is architecture
dependent. In order to include the appropriate file a syscall_table.h
is found via the perf include path and it includes the syscalls_32.h
or syscalls_64.h as appropriate.

To support having multiple syscall tables, one for 32-bit and one for
64-bit, or for different architectures, an include path cannot be
used. Remove syscall_table.h because of this and inline what it does
into syscalltbl.c.

For architectures without a syscall_table.h this will cause a failure
to include either syscalls_32.h or syscalls_64.h rather than a failure
to include syscall_table.h. For architectures that only included one
or other, the behavior matches BITS_PER_LONG as previously done on
architectures supporting both syscalls_32.h and syscalls_64.h.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: https://lore.kernel.org/r/20250319050741.269828-4-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 22:57:35 -07:00
Ian Rogers
4773175c9d perf dso: kernel-doc for enum dso_binary_type
There are many and non-obvious meanings to the dso_binary_type enum
values. Add kernel-doc to speed interpreting their meanings.

Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20250319050741.269828-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 22:57:25 -07:00
Ian Rogers
f1794ecb0c perf dso: Move libunwind dso_data variables into ifdef
The variables elf_base_addr, debug_frame_offset, eh_frame_hdr_addr and
eh_frame_hdr_offset are only accessed in unwind-libunwind-local.c
which is conditionally built on having libunwind support. Make the
variables conditional on libunwind support too.

Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: https://lore.kernel.org/r/20250319050741.269828-2-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 22:56:29 -07:00
Namhyung Kim
d10a7aaaf8 perf report: Disable children column for data type profiling
I've realized that it doesn't make sense to accumulate the samples to
the parent in the callchain when data type profiling is enabled, because
the parent won't have the same data type access.  Otherwise it'd show
something like this:

  $ perf report -s type --stdio -g none
  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 2K of event 'cycles:Pu'
  # Event count (approx.): 8266456478
  #
  # Children  Latency      Self   Latency  Data Type
  # ........  .......  ........  ........  .........
  #
     698.97%   697.72%    99.80%    99.61%  (unknown)
       0.09%    0.18%     0.09%     0.18%  Elf64_Rela
       0.05%    0.10%     0.05%     0.10%  unsigned char
       0.05%    0.10%     0.05%     0.10%  struct exit_function_list
       0.00%    0.01%     0.00%     0.01%  struct rtld_global

Link: https://lore.kernel.org/r/20250307080829.354947-3-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 09:17:56 -07:00
Namhyung Kim
6df71c7237 perf report: Allow hierarchy mode for --children
It was prohibited because the output fields in the children mode were
not handled properly with hierarchy.  But now that the output fields can
be kept in the same level, the two modes can be allowed together.

For example, latency mode adds more output fields by default and now
they are displayed properly.

  $ perf record --latency -g -- perf test -w thloop

  $ perf report -H --stdio
  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 2K of event 'cycles:Pu'
  # Event count (approx.): 8266456478
  #
  #       Children  Latency  Overhead   Latency  Command / Shared Object / Symbol
  # ...........................................  ........................................................
  #
       0.08%    0.16%   100.00%   100.00%        perf
          0.08%    0.16%     0.24%     0.47%        ld-linux-x86-64.so.2
             0.12%    0.24%     0.12%     0.24%        [.] _dl_relocate_object
             0.08%    0.16%     0.08%     0.16%        [.] _dl_lookup_symbol_x
             0.03%    0.06%     0.03%     0.06%        [.] strcmp
             0.00%    0.01%     0.00%     0.01%        [.] _dl_start
             0.00%    0.00%     0.00%     0.00%        [.] _dl_start_user
             0.00%    0.00%     0.00%     0.00%        [.] _dl_sysdep_start
             0.00%    0.00%     0.00%     0.00%        [.] _start
             0.00%    0.00%     0.00%     0.00%        [.] dl_main
          0.03%    0.06%     0.03%     0.06%        libLLVM-16.so.1
             0.03%    0.06%     0.03%     0.06%        [.] llvm::StringMapImpl::RehashTable(unsigned int)
             0.00%    0.00%     0.00%     0.00%        [.] 0x00007f137ccd18e8
          0.00%    0.00%    99.66%    99.31%        perf
            99.66%   99.31%    99.66%    99.31%        [.] test_loop
              |
              |--49.86%--0x7f137b633d68
              |          0x55dbdbbb7d2c
              ...

Link: https://lore.kernel.org/r/20250307080829.354947-2-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 09:17:56 -07:00
Namhyung Kim
a1bbd66627 perf sort: Keep output fields in the same level
This is useful for hierarchy output mode where the first level is
considered as output fields.  We want them in the same level so that it
can show only the remaining groups in the hierarchy.

Before:
  $ perf report -s overhead,sample,period,comm,dso -H --stdio
  ...
  #          Overhead  Samples / Period / Command / Shared Object
  # .................  ..........................................
  #
     100.00%           4035
        100.00%           3835883066
           100.00%           perf
               99.37%           perf
                0.50%           ld-linux-x86-64.so.2
                0.06%           [unknown]
                0.04%           libc.so.6
                0.02%           libLLVM-16.so.1

After:
  $ perf report -s overhead,sample,period,comm,dso -H --stdio
  ...
  #    Overhead       Samples        Period  Command / Shared Object
  # .......................................  .......................
  #
     100.00%          4035    3835883066     perf
         99.37%          4005    3811826223     perf
          0.50%            19      19210014     ld-linux-x86-64.so.2
          0.06%             8       2367089     [unknown]
          0.04%             2       1720336     libc.so.6
          0.02%             1        759404     libLLVM-16.so.1

Acked-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250307080829.354947-1-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 09:17:56 -07:00
Thomas Richter
431db90a73 perf pmu: Handle memory failure in tool_pmu__new()
On linux-next
commit 72c6f57a41 ("perf pmu: Dynamically allocate tool PMU")
allocated PMU named "tool" dynamically. However that allocation
can fail and a NULL pointer is returned. That case is currently
not handled and would result in an invalid address reference.
Add a check for NULL pointer.

Fixes: 72c6f57a41 ("perf pmu: Dynamically allocate tool PMU")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250319122820.2898333-1-tmricht@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-19 17:00:16 -07:00
James Clark
6d2dcd6352 perf: intel-tpebs: Fix incorrect usage of zfree()
zfree() requires an address otherwise it frees what's in name, rather
than name itself. Pass the address of name to fix it.
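
To illustrate why the address matters, here is a simplified stand-in for a zfree()-style helper. sketch_zfree_str() is a hypothetical, string-only function written for this example; it is not the perf zfree() macro.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical zfree-style helper: takes the ADDRESS of the pointer so
 * it can both free the allocation and reset the caller's pointer.
 */
static void sketch_zfree_str(char **ptr)
{
	assert(ptr != NULL);
	free(*ptr);	/* free what the caller's pointer refers to */
	*ptr = NULL;	/* clear the caller's pointer to avoid reuse */
}
```

A caller with `char *name` would call `sketch_zfree_str(&name)`, after which `name` is NULL. Passing `name` itself would make the helper treat the string contents as the pointer to free, which is the class of bug the commit describes.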

This was the only incorrect occurrence in Perf found using a search.

Fixes: 8db5cabcf1 ("perf stat: Fork and launch 'perf record' when 'perf stat' needs to get retire latency value for a metric.")
Signed-off-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250319101614.190922-1-james.clark@linaro.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-19 16:56:56 -07:00
Ian Rogers
58b8b5d142 perf cpumap: Increment reference count for online cpumap
Thomas Richter <tmricht@linux.ibm.com> reported a double put on the
cpumap for the placeholder core PMU:
https://lore.kernel.org/lkml/20250318095132.1502654-3-tmricht@linux.ibm.com/
Requiring the caller to get the cpumap is not how these things are
usually done, switch cpu_map__online to do the get and then fix up any
use cases where a put is needed.
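
The ownership convention can be sketched with a toy reference-counted object. Everything here (struct sketch_map, sketch_online_map() and friends) is hypothetical and only illustrates the accessor taking the reference on the caller's behalf.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference-counted map. */
struct sketch_map {
	int refcnt;
};

/* Cached singleton; the cache itself holds one reference. */
static struct sketch_map sketch_online = { .refcnt = 1 };

static struct sketch_map *sketch_map_get(struct sketch_map *map)
{
	if (map)
		map->refcnt++;
	return map;
}

static void sketch_map_put(struct sketch_map *map)
{
	if (!map)
		return;
	assert(map->refcnt > 0);
	map->refcnt--;
}

/*
 * The accessor does the get, so every caller owns exactly one
 * reference and must balance it with exactly one put.
 */
static struct sketch_map *sketch_online_map(void)
{
	return sketch_map_get(&sketch_online);
}
```

With the get inside the accessor, a caller that later does a put cannot drop the cache's own reference, which is how a double put arises when the accessor returns the map without taking a reference.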

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20250318171914.145616-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-19 16:56:33 -07:00
Stephen Brennan
ebf0b33273 perf dso: fix dso__is_kallsyms() check
Kernel modules for which we cannot find a file on-disk will have a
dso->long_name that looks like "[module_name]". Prior to the commit
listed in the fixes, the dso->kernel field would be zero (for user
space), so dso__is_kallsyms() would return false. After the commit,
kernel module DSOs are correctly labeled, but the result is that
dso__is_kallsyms() erroneously returns true for those modules without a
filesystem path.

Later, build_id_cache__add() consults this value of is_kallsyms, and
when true, it copies /proc/kallsyms into the cache. Users with many
kernel modules without a filesystem path (e.g. ksplice or possibly
kernel live patch modules) have reported excessive disk space usage in
the build ID cache directory due to this behavior.

To reproduce the issue, it's enough to build a trivial out-of-tree hello
world kernel module, load it using insmod, and then use:

   perf record -ag -- sleep 1

In the build ID directory, there will be a directory for your module
name containing a kallsyms file.

Fix this up by changing dso__is_kallsyms() to consult the
dso_binary_type enumeration, which is also symmetric to the above checks
for dso__is_vmlinux() and dso__is_kcore(). With this change, kallsyms is
not cached in the build-id cache for out-of-tree modules.
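
The difference between a name-based and a type-based check can be sketched as below. The enum values, struct fields and both functions are hypothetical; in particular, the exact form of the old name-based test is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of binary types. */
enum sketch_binary_type {
	SKETCH_BINARY_TYPE_KALLSYMS,
	SKETCH_BINARY_TYPE_VMLINUX,
	SKETCH_BINARY_TYPE_NOT_FOUND,
};

struct sketch_dso {
	int kernel;			/* non-zero for kernel and module DSOs */
	const char *long_name;		/* path, or "[name]" when no file exists */
	enum sketch_binary_type binary_type;
};

/* Assumed old-style check: any kernel DSO without a filesystem path. */
static int sketch_is_kallsyms_by_name(const struct sketch_dso *dso)
{
	assert(dso != NULL && dso->long_name != NULL);
	return dso->kernel && dso->long_name[0] != '/';
}

/* Type-based check: only DSOs explicitly typed as kallsyms match. */
static int sketch_is_kallsyms_by_type(const struct sketch_dso *dso)
{
	assert(dso != NULL);
	return dso->binary_type == SKETCH_BINARY_TYPE_KALLSYMS;
}
```

A module named "[hello]" with no file on disk passes the name-based test but not the type-based one, so it would no longer trigger a kallsyms copy.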

Fixes: 02213cec64 ("perf maps: Mark module DSOs with kernel type")
Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Link: https://lore.kernel.org/r/20250318230012.2038790-1-stephen.s.brennan@oracle.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-19 16:56:05 -07:00
Xin Li (Intel)
8f97566c8a x86/cpufeatures: Remove {disabled,required}-features.h
The functionalities of {disabled,required}-features.h have been replaced with
the auto-generated generated/<asm/cpufeaturemasks.h> header.

Thus they are no longer needed and can be removed.

None of the macros defined in {disabled,required}-features.h is used in tools,
delete them too.

Signed-off-by: Xin Li (Intel) <xin@zytor.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250305184725.3341760-4-xin@zytor.com
2025-03-19 11:15:12 +01:00
Feng Yang
2b5b834cc3 perf kwork: Remove unreachable judgments
When s2[i] == '\0' and s1[i] != '\0', the loop is already terminated by
the ret check, and when s1[i] == '\0' it is terminated by the !s1[i]
check. So in reality the !s2[i] check never decides the outcome.
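
The reasoning can be checked against a small comparison loop of the same shape. sketch_strncmp() is a hypothetical stand-in written for this example, not the function touched by the patch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical bounded string compare. A separate !s2[i] test is
 * unnecessary: if s2[i] is NUL, then either s1[i] differs (ret != 0)
 * or s1[i] is NUL as well (!s1[i]), and both already end the loop.
 */
static int sketch_strncmp(const char *s1, size_t sz, const char *s2)
{
	int ret = 0;
	size_t i;

	assert(s1 != NULL && s2 != NULL);
	for (i = 0; i < sz; i++) {
		ret = (unsigned char)s1[i] - (unsigned char)s2[i];
		if (ret || !s1[i])
			break;
	}
	return ret;
}
```

Since the loop stops at the first difference or at the end of s1, it never reads past the terminating NUL of either string.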

Signed-off-by: Feng Yang <yangfeng@kylinos.cn>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250314031013.94480-1-yangfeng59949@163.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-18 16:55:30 -07:00
Arnaldo Carvalho de Melo
89aaeaf842 perf python: Check if there is space to copy all the event
The pyrf_event__new() method copies the event obtained from the perf
ring buffer to a structure that will then be turned into a python object
for further consumption, so it copies perf_event.header.size bytes to
its 'event' member:

  $ pahole -C pyrf_event /tmp/build/perf-tools-next/python/perf.cpython-312-x86_64-linux-gnu.so
  struct pyrf_event {
  	PyObject                   ob_base;              /*     0    16 */
  	struct evsel *             evsel;                /*    16     8 */
  	struct perf_sample         sample;               /*    24   312 */

  	/* XXX last struct has 7 bytes of padding, 2 holes */

  	/* --- cacheline 5 boundary (320 bytes) was 16 bytes ago --- */
  	union perf_event           event;                /*   336  4168 */

  	/* size: 4504, cachelines: 71, members: 4 */
  	/* member types with holes: 1, total: 2 */
  	/* paddings: 1, sum paddings: 7 */
  	/* last cacheline: 24 bytes */
  };

  $

It was doing so without checking if the event just obtained has more
than that space, fix it.

This isn't a proper, final solution, as we need to support larger
events, but for the time being we at least bounds check and document it.
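
A bounds check of this kind can be sketched as follows. The types are deliberately tiny stand-ins (a 64-byte union instead of union perf_event), and sketch_copy_event() is a hypothetical name, not the code in pyrf_event__new().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical event header carrying the total event size. */
struct sketch_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;
};

/* Hypothetical fixed-size destination, standing in for the event union. */
union sketch_event {
	struct sketch_header header;
	unsigned char bytes[64];
};

/*
 * Copy an event out of a buffer only if it fits in the destination.
 * Returns 0 on success, -1 when the advertised size is out of bounds.
 */
static int sketch_copy_event(union sketch_event *dst, const void *src)
{
	struct sketch_header hdr;

	assert(dst != NULL && src != NULL);
	memcpy(&hdr, src, sizeof(hdr));
	if (hdr.size < sizeof(hdr) || hdr.size > sizeof(*dst))
		return -1;

	memcpy(dst, src, hdr.size);
	return 0;
}
```

Rejecting oversized events keeps the copy inside the destination, at the cost of dropping events larger than the union, which matches the stopgap nature described above.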

Fixes: 877108e42b ("perf tools: Initial python binding")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250312203141.285263-7-acme@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-18 16:08:45 -07:00
Arnaldo Carvalho de Melo
f3fed3ae34 perf python: Don't keep a raw_data pointer to consumed ring buffer space
When processing tracepoints the perf python binding was parsing the
event before calling perf_mmap__consume(&md->core) in
pyrf_evlist__read_on_cpu().

But part of this event parsing was to set the perf_sample->raw_data
pointer to the payload of the event, which then could be overwritten by
other event before tracepoint fields were asked for via event.prev_comm
in a python program, for instance.

This also happened with other fields, but strings were where problems were
were surfacing, as there is UTF-8 validation for the potentially garbled
data.

This ended up showing up as (with some added debugging messages):

  ( field 'prev_comm' ret=0x7f7c31f65110, raw_size=68 )  ( field 'prev_pid' ret=0x7f7c23b1bed0, raw_size=68 )  ( field 'prev_prio' ret=0x7f7c239c0030, raw_size=68 )  ( field 'prev_state' ret=0x7f7c239c0250, raw_size=68 ) time 14771421785867 prev_comm= prev_pid=1919907691 prev_prio=796026219 prev_state=0x303a32313175 ==>
  ( XXX '��' len=16, raw_size=68)  ( field 'next_comm' ret=(nil), raw_size=68 ) Traceback (most recent call last):
   File "/home/acme/git/perf-tools-next/tools/perf/python/tracepoint.py", line 51, in <module>
     main()
   File "/home/acme/git/perf-tools-next/tools/perf/python/tracepoint.py", line 46, in main
     event.next_comm,
     ^^^^^^^^^^^^^^^
  AttributeError: 'perf.sample_event' object has no attribute 'next_comm'

When event.next_comm was asked for, the PyUnicode_FromString() python
API would fail and that tracepoint field wouldn't be available, stopping
the tools/perf/python/tracepoint.py test tool.

But, since we already do a copy of the whole event in pyrf_event__new,
just use it and while at it remove what was done in e8968e6541
("perf python: Fix pyrf_evlist__read_on_cpu event consuming") because we
don't really need to wait for parsing the sample before declaring the
event as consumed.

This copy is questionable as is now, as it limits the maximum event +
sample_type and tracepoint payload to sizeof(union perf_event), this all
has been "working" because 'struct perf_event_mmap2', the largest entry
in 'union perf_event' is:

  $ pahole -C perf_event ~/bin/perf | grep mmap2
	struct perf_record_mmap2   mmap2;              /*     0  4168 */
  $

Fixes: bae57e3825 ("perf python: Add support to resolve tracepoint fields")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250312203141.285263-6-acme@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-18 16:08:35 -07:00
Arnaldo Carvalho de Melo
3de5a2bf5b perf python: Decrement the refcount of just created event on failure
To avoid a leak if we have the python object but then something happens
and we need to fail the operation, decrement the refcount of the newly
created object.

Fixes: 377f698db1 ("perf python: Add struct evsel into struct pyrf_event")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250312203141.285263-5-acme@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-18 16:08:29 -07:00
Arnaldo Carvalho de Melo
a570da2148 perf python tracepoint.py: Change the COMM using setproctitle if available
Otherwise when debugging we see just "python" in perf, top, etc.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250312203141.285263-4-acme@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-18 16:08:22 -07:00
Arnaldo Carvalho de Melo
1882625c91 perf python: Remove some unused macros (_PyUnicode_FromString(arg), etc)
When python2 support was removed in e7e9943c87 ("perf python:
Remove python 2 scripting support"), all use of the
_PyUnicode_FromString(arg), _PyUnicode_FromFormat(...), and
_PyLong_FromLong(arg) macros was removed as well, so remove it.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250312203141.285263-3-acme@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-18 16:08:14 -07:00
Arnaldo Carvalho de Melo
1376c195e8 perf python: Fixup description of sample.id event member
Some old cut'n'paste error, it's "ip", so the description should be
"event ip", not "event type".

Fixes: 877108e42b ("perf tools: Initial python binding")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250312203141.285263-2-acme@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-18 16:08:05 -07:00
Ian Rogers
ca2182097e perf test dso-data: Correctly free test file in read test
The DSO data read test opens a file but as dsos__exit is used the test
file isn't closed. This causes the subsequent subtests in don't fork
(-F) mode to fail as one more than expected file descriptor is open.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250318043151.137973-4-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-17 22:07:18 -07:00
Ian Rogers
5ac22c35aa perf dso: Use lock annotations to fix asan deadlock
dso__list_del with address sanitizer and/or reference count checking
will call dso__put that can call dso__data_close reentrantly trying to
lock the dso__data_open_lock and deadlocking. Switch from pthread
mutexes to perf's mutex so that lock checking is performed in debug
builds. Add lock annotations that diagnosed the problem. Release the
dso__data_open_lock around the dso__put to avoid the deadlock.

Change the declaration of dso__data_get_fd to return a boolean,
indicating the fd is valid and the lock is held, to make it compatible
with the thread safety annotations as a try lock.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250318043151.137973-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-17 22:07:18 -07:00
Ian Rogers
c5ebf3a266 perf mutex: Add annotations for LOCKS_EXCLUDED and LOCKS_RETURNED
Used to annotate when locks shouldn't be held for a function or if a
function returns a lock that's used by later mutex lock unlock
operations.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250318043151.137973-2-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-17 22:07:18 -07:00
Ian Rogers
658b34cc9f perf test: Add pipe output testing for annotate
Parameterize the basic testing to generate directly a perf.data file
or to generate/use one from pipe input or output.  To simplify the
refactor move some of the head/grep logic around. Use "-q" with grep
to make the test output cleaner.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250311211635.541090-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-14 18:16:33 -07:00
Ian Rogers
3a86d63e6f perf test: Fixes to variable expansion and stdout for diff test
When make_data fails its error message needs to go to stderr rather
than stdout and the stdout value is captured in a variable.  Quote the
$err value so that it is always a valid input for test.  This error is
commonly encountered if no sample data is gathered by the test.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250312001841.1515779-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-14 18:15:13 -07:00
Arnaldo Carvalho de Melo
4e82c88a90 perf libunwind: Fixup conversion perf_sample->user_regs to a pointer
Commit dc6d2bc2d8 ("perf sample: Make user_regs and intr_regs optional") missed
the changes to one file, resulting in this problem:

  $ make LIBUNWIND=1 -C tools/perf O=/tmp/build/perf-tools-next install-bin
  <SNIP>
    CC      /tmp/build/perf-tools-next/util/unwind-libunwind-local.o
    CC      /tmp/build/perf-tools-next/util/unwind-libunwind.o
  <SNIP>
  util/unwind-libunwind-local.c: In function ‘access_mem’:
  util/unwind-libunwind-local.c:582:56: error: ‘ui->sample->user_regs’ is a pointer; did you mean to use ‘->’?
    582 |         if (__write || !stack || !ui->sample->user_regs.regs) {
        |                                                        ^
        |                                                        ->
  util/unwind-libunwind-local.c:587:38: error: passing argument 2 of ‘perf_reg_value’ from incompatible pointer type [-Wincompatible-pointer-types]
    587 |         ret = perf_reg_value(&start, &ui->sample->user_regs,
        |                                      ^~~~~~~~~~~~~~~~~~~~~~
        |                                      |
        |                                      struct regs_dump **
<SNIP>
  ⬢ [acme@toolbox perf-tools-next]$ git bisect bad
  dc6d2bc2d8 is the first bad commit
  commit dc6d2bc2d8 (HEAD)
  Author: Ian Rogers <irogers@google.com>
  Date:   Mon Jan 13 11:43:45 2025 -0800

      perf sample: Make user_regs and intr_regs optional

Detected using:

  make -C tools/perf build-test

Fixes: dc6d2bc2d8 ("perf sample: Make user_regs and intr_regs optional")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250313033121.758978-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-14 18:12:33 -07:00
Veronika Molnarova
02ba09c8ab perf test stat_all_pmu.sh: Correctly check 'perf stat' result
Test case "stat_all_pmu.sh" is not correctly checking 'perf stat' output
due to a poor design. Firstly, having the 'set -e' option with a trap
catching the sigexit causes the shell to exit immediately if 'perf stat' ends
with any non-zero value, which is then caught by the trap reporting an
unexpected signal. This causes events that should be parsed by the if-else
statement to be caught by the trap handler and are reported as errors:

    $ perf test -vv "perf all pmu"
    Testing i915/actual-frequency/
    Unexpected signal in main
    Error:
    Access to performance monitoring and observability operations is limited.

Secondly, the if-else branches are not exclusive, as the check for whether
the event is present in the output log also covers the "<not supported>"
events, which should be accepted, and the "Bad name events", which
should be rejected.

Remove the "set -e" option from the test case, correctly parse the
"perf stat" output log and check its return value. Add the missing
outputs for the 'perf stat' result and also add log messages to
report the branch that parsed the event for more info.

Fixes: 7e73ea4029 ("perf test: Ignore security failures in all PMU test")
Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com>
Tested-by: Qiao Zhao <qzhao@redhat.com>
Link: https://lore.kernel.org/r/20241122231233.79509-1-vmolnaro@redhat.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-14 10:41:34 -07:00
Yujie Liu
fa9bc517af perf script: Update brstack syntax documentation
The following commits added new fields/flags to the branch stack field
list:

commit 1f48989cdc ("perf script: Output branch sample type")
commit 6ade6c6460 ("perf script: Show branch speculation info")
commit 1e66dcff7b ("perf script: Add not taken event for branch stack")

Update brstack syntax documentation to be consistent with the latest
branch stack field list. Improve the descriptions to help users
interpret the fields accurately.

Signed-off-by: Yujie Liu <yujie.liu@intel.com>
Reviewed-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Sandipan Das <sandipan.das@amd.com>
Link: https://lore.kernel.org/r/20250312072329.419020-1-yujie.liu@intel.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-14 10:41:08 -07:00
Yujie Liu
2f39edece1 perf script: Fix typo in branch event mask
BRACH -> BRANCH

Fixes: 88b1473135 ("perf script: Separate events from branch types")
Signed-off-by: Yujie Liu <yujie.liu@intel.com>
Reviewed-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250312075636.429127-1-yujie.liu@intel.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-13 13:19:27 -07:00
Arnaldo Carvalho de Melo
2333cfa9f8 perf hist stdio: Do bounds check when printing callchains to avoid UB with new gcc versions
Do a simple bounds check to avoid this on new gcc versions:

  31    15.81 fedora:rawhide                : FAIL gcc version 15.0.1 20250225 (Red Hat 15.0.1-0) (GCC)
    In function 'callchain__fprintf_left_margin',
        inlined from 'callchain__fprintf_graph.constprop' at ui/stdio/hist.c:246:12:
    ui/stdio/hist.c:27:39: error: iteration 2147483647 invokes undefined behavior [-Werror=aggressive-loop-optimizations]
       27 |         for (i = 0; i < left_margin; i++)
          |                                      ~^~
    ui/stdio/hist.c:27:23: note: within this loop
       27 |         for (i = 0; i < left_margin; i++)
          |                     ~~^~~~~~~~~~~~~
    cc1: all warnings being treated as errors

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250310194534.265487-4-acme@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-13 00:30:14 -07:00
Arnaldo Carvalho de Melo
cf67629f7f perf units: Fix insufficient array space
No need to specify the array size, let the compiler figure that out.

This addresses this compiler warning that was noticed while build
testing on fedora rawhide:

  31    15.81 fedora:rawhide                : FAIL gcc version 15.0.1 20250225 (Red Hat 15.0.1-0) (GCC)
    util/units.c: In function 'unit_number__scnprintf':
    util/units.c:67:24: error: initializer-string for array of 'char' is too long [-Werror=unterminated-string-initialization]
       67 |         char unit[4] = "BKMG";
          |                        ^~~~~~
    cc1: all warnings being treated as errors
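
The effect of leaving the size to the compiler can be seen in a small sketch. sketch_unit_for() is a hypothetical helper written for this example and is not unit_number__scnprintf().

```c
#include <assert.h>
#include <string.h>

/*
 * With no explicit size the array has room for the four letters plus
 * the terminating NUL, so it is a proper C string.
 */
static const char sketch_unit[] = "BKMG";

/* Hypothetical helper: pick the unit letter for a byte count. */
static char sketch_unit_for(unsigned long long n)
{
	size_t i = 0;

	assert(sizeof(sketch_unit) == 5);
	while (n >= 1024 && i + 1 < strlen(sketch_unit)) {
		n /= 1024;
		i++;
	}
	return sketch_unit[i];
}
```

With `char unit[4] = "BKMG"` there is no room for the NUL, so string functions such as strlen() could not safely be used on the array; `sizeof(sketch_unit)` here is 5.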

Fixes: 9808143ba2 ("perf tools: Add unit_number__scnprintf function")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250310194534.265487-3-acme@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-13 00:30:08 -07:00
Namhyung Kim
bbf006d6d1 perf annotate: Add --code-with-type option.
This option is to show data type info in the regular (code) annotation.
It tries to find data type for each (memory) instruction in the
function.  It'd be useful to see function-level memory access pattern
and also to debug the data type profiling result.

The output would be added at the end of the line and have "# data-type:"
prefix.

For now, it only works with --stdio mode for simplicity.  I can work on
enabling it for TUI later.

  $ perf annotate --stdio --code-with-type
   Percent |      Source code & Disassembly of vmlinux for cpu/mem-loads/ppk (253 samples, percent: local period)
  ---------------------------------------------------------------------------------------------------------------
           : 0                0xffffffff81baa000 <check_preemption_disabled>:
      0.00 :   ffffffff81baa000:        pushq   %r12              # data-type: (stack operation)
      0.00 :   ffffffff81baa002:        pushq   %rbp              # data-type: (stack operation)
      0.00 :   ffffffff81baa003:        pushq   %rbx              # data-type: (stack operation)
      0.00 :   ffffffff81baa004:        subq    $0x8, %rsp
     18.00 :   ffffffff81baa008:        movl    %gs:0x7e48893d(%rip), %ebx  # 0x3294c <pcpu_hot+0xc>              # data-type: struct pcpu_hot +0xc (cpu_number)
     12.58 :   ffffffff81baa00f:        movl    %gs:0x7e488932(%rip), %eax  # 0x32948 <pcpu_hot+0x8>              # data-type: struct pcpu_hot +0x8 (preempt_count)
      0.00 :   ffffffff81baa016:        testl   $0x7fffffff, %eax
      0.00 :   ffffffff81baa01b:        je      0xffffffff81baa02c <check_preemption_disabled+0x2c>
      0.00 :   ffffffff81baa01d:        addq    $0x8, %rsp
      0.00 :   ffffffff81baa021:        movl    %ebx, %eax
     14.19 :   ffffffff81baa023:        popq    %rbx              # data-type: (stack operation)
     18.86 :   ffffffff81baa024:        popq    %rbp              # data-type: (stack operation)
     12.10 :   ffffffff81baa025:        popq    %r12              # data-type: (stack operation)
     17.78 :   ffffffff81baa027:        jmp     0xffffffff81bc1170 <__x86_return_thunk>
      6.49 :   ffffffff81baa02c:        callq   *0xc9139e(%rip)  # 0xffffffff8283b3d0 <pv_ops+0xf0>               # data-type: (stack operation)
      0.00 :   ffffffff81baa032:        testb   $0x2, %ah
      0.00 :   ffffffff81baa035:        je      0xffffffff81baa01d <check_preemption_disabled+0x1d>
      0.00 :   ffffffff81baa037:        movq    %rdi, %rbp
      0.00 :   ffffffff81baa03a:        movq    %gs:0x32940, %rax         # data-type: struct pcpu_hot +0 (current_task)
      0.00 :   ffffffff81baa043:        testb   $0x4, 0x2f(%rax)          # data-type: struct task_struct +0x2f (flags)
      0.00 :   ffffffff81baa047:        je      0xffffffff81baa052 <check_preemption_disabled+0x52>
      0.00 :   ffffffff81baa049:        cmpl    $0x1, 0x3d0(%rax)         # data-type: struct task_struct +0x3d0 (nr_cpus_allowed)
      0.00 :   ffffffff81baa050:        je      0xffffffff81baa01d <check_preemption_disabled+0x1d>
      0.00 :   ffffffff81baa052:        movq    %gs:0x32940, %r12         # data-type: struct pcpu_hot +0 (current_task)
      0.00 :   ffffffff81baa05b:        cmpw    $0x0, 0x7f0(%r12)         # data-type: struct task_struct +0x7f0 (migration_disabled)
      0.00 :   ffffffff81baa065:        movq    %rsi, (%rsp)
      0.00 :   ffffffff81baa069:        jne     0xffffffff81baa01d <check_preemption_disabled+0x1d>
      0.00 :   ffffffff81baa06b:        movl    0xe8dd13(%rip), %eax  # 0xffffffff82a37d84 <system_state>         # data-type: enum system_states +0
      0.00 :   ffffffff81baa071:        testl   %eax, %eax
      0.00 :   ffffffff81baa073:        je      0xffffffff81baa01d <check_preemption_disabled+0x1d>
      0.00 :   ffffffff81baa075:        incl    %gs:0x7e4888cc(%rip)  # 0x32948 <pcpu_hot+0x8>            # data-type: struct pcpu_hot +0x8 (preempt_count)
      0.00 :   ffffffff81baa07c:        movq    $-0x7e14a100, %rdi
      0.00 :   ffffffff81baa083:        callq   0xffffffff81148c40 <__printk_ratelimit>           # data-type: (stack operation)
      0.00 :   ffffffff81baa088:        testl   %eax, %eax
      0.00 :   ffffffff81baa08a:        je      0xffffffff81baa0d5 <check_preemption_disabled+0xd5>
      0.00 :   ffffffff81baa08c:        movl    0x958(%r12), %r9d         # data-type: struct task_struct +0x958 (pid)
      0.00 :   ffffffff81baa094:        movq    (%rsp), %rdx              # data-type: char* +0
      0.00 :   ffffffff81baa098:        movq    %rbp, %rsi
      0.00 :   ffffffff81baa09b:        leaq    0xb88(%r12), %r8          # data-type: struct task_struct +0xb88 (comm)
      0.00 :   ffffffff81baa0a3:        movl    %gs:0x7e48889e(%rip), %ecx  # 0x32948 <pcpu_hot+0x8>              # data-type: struct pcpu_hot +0x8 (preempt_count)
      0.00 :   ffffffff81baa0aa:        andl    $0x7fffffff, %ecx
      0.00 :   ffffffff81baa0b0:        movq    $-0x7dd3cdf0, %rdi
      0.00 :   ffffffff81baa0b7:        subl    $0x1, %ecx
      0.00 :   ffffffff81baa0ba:        callq   0xffffffff81149340 <_printk>              # data-type: (stack operation)
      0.00 :   ffffffff81baa0bf:        movq    0x20(%rsp), %rsi
      0.00 :   ffffffff81baa0c4:        movq    $-0x7ddb8c7e, %rdi
      0.00 :   ffffffff81baa0cb:        callq   0xffffffff81149340 <_printk>              # data-type: (stack operation)
      0.00 :   ffffffff81baa0d0:        callq   0xffffffff81b7ab60 <dump_stack>           # data-type: (stack operation)
      0.00 :   ffffffff81baa0d5:        decl    %gs:0x7e48886c(%rip)  # 0x32948 <pcpu_hot+0x8>            # data-type: struct pcpu_hot +0x8 (preempt_count)
      0.00 :   ffffffff81baa0dc:        jmp     0xffffffff81baa01d <check_preemption_disabled+0x1d>

Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250310224925.799005-8-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-13 00:19:51 -07:00
Namhyung Kim
30c5a3941d perf annotate: Implement code + data type annotation
Sometimes it's useful to see both instructions and their data type
together.  Let's extend the annotate code to use data type profiling
functions.

To make it easy to pass more arguments, introduce a struct to carry
necessary information together.  Also add a new annotation_option called
'code_with_type' to control the behavior.  This is not enabled yet but
it'll be set later from the command line.

For simplicity, this is implemented for --stdio only.

Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250310224925.799005-7-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-13 00:19:51 -07:00
Namhyung Kim
236ee2569a perf annotate: Factor out __hist_entry__get_data_type()
So that it handles only a single disasm_line and hopefully makes the
code simpler.  This is also preparation for calling it from different
places later.

The NO_TYPE macro was added to distinguish when it failed or needs retry.

Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250310224925.799005-6-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-13 00:19:51 -07:00
Namhyung Kim
fe8da6692a perf annotate: Pass hist_entry to annotate functions
It's a preparation to support code annotation and data type
annotation at the same time.  Data type annotation needs more
information in the hist_entry so it needs to be passed deeper.

Also rename a function with the same name in builtin-annotate.c
to hist_entry__stdio_annotate since it better matches the command
line option.  And change the condition inside to be simpler.

Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250310224925.799005-5-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-13 00:19:51 -07:00
Namhyung Kim
9aa3cbbffb perf annotate: Pass annotation_options to annotation_line__print()
The annotation_line__print() has many arguments.  But min_percent,
max_lines and percent_type are from struct annotation_options.  So let's
pass a pointer to the option instead of passing them separately to
reduce the number of function arguments.

Actually it has a recursive call if 'queue' is set.  Add a new option
instance to pass different values for the case.
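
As a rough sketch of the pattern (not the actual perf code), the struct
and function names below are hypothetical stand-ins. Only the three field
names come from the description above, and the overridden value in the
queue case is an assumption for illustration:

```c
#include <stdbool.h>

/* Hypothetical stand-in holding the three fields named above. */
struct sketch_annotation_options {
	double min_percent;
	int max_lines;
	int percent_type;
};

/* One pointer replaces three separate parameters. */
static bool sketch_line_is_printable(double percent, int printed,
				     const struct sketch_annotation_options *opts)
{
	if (percent < opts->min_percent)
		return false;
	if (opts->max_lines && printed >= opts->max_lines)
		return false;
	return true;
}

/*
 * A caller that needs different limits copies the options and overrides
 * only what differs, leaving the caller's instance untouched.
 */
static bool sketch_queued_line_is_printable(double percent,
					    const struct sketch_annotation_options *opts)
{
	struct sketch_annotation_options queue_opts = *opts;

	queue_opts.min_percent = 0.0;	/* assumed override, for illustration */
	return sketch_line_is_printable(percent, 0, &queue_opts);
}
```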

Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250310224925.799005-4-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-13 00:19:51 -07:00
Namhyung Kim
1f284082b1 perf annotate: Remove unused len parameter from annotation_line__print()
It's not used anywhere, let's get rid of it.

Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250310224925.799005-3-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-13 00:19:51 -07:00
Namhyung Kim
ce2289ad0a perf annotate-data: Add annotated_data_type__get_member_name()
Factor out a function to get the name of the member field at the given
offset.  This will be used in other places.

Also update the output of typeoff sort key a little bit.  As we know
that some special types like (stack operation), (stack canary) and
(unknown) won't have fields, skip printing the offset and field.

For example, the following change is expected.

  "(stack operation) +0 (no field)"   ==>   "(stack operation)"

Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250310224925.799005-2-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-13 00:19:51 -07:00
Namhyung Kim
e1cde2d5e9 perf ftrace: Use atomic inc to update histogram in BPF
It should use an atomic instruction to update even if the histogram is
keyed by delta as it's also used for stats.
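
A minimal user-space sketch of the idea, assuming the GCC/Clang
__sync_fetch_and_add() builtin.  The array, its size and the helper name
are hypothetical and do not mirror the BPF program's actual map layout:

```c
#include <stdint.h>

#define SKETCH_NUM_BUCKET 22	/* hypothetical bucket count */

static uint64_t sketch_hist[SKETCH_NUM_BUCKET];

/*
 * Increment one histogram bucket atomically so that concurrent updaters
 * (for example, different CPUs) cannot lose each other's increments.
 */
static void sketch_hist_inc(unsigned int key)
{
	if (key >= SKETCH_NUM_BUCKET)
		key = SKETCH_NUM_BUCKET - 1;	/* clamp to the last bucket */

	__sync_fetch_and_add(&sketch_hist[key], 1);
}
```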

Cc: Gabriele Monaco <gmonaco@redhat.com>
Link: https://lore.kernel.org/r/20250227191223.1288473-3-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-13 00:18:10 -07:00
Namhyung Kim
79056b3fe8 perf ftrace: Remove an unnecessary condition check in BPF
The bucket_num is set based on the {max,min}_latency already in
cmd_ftrace(), so no need to check it again in BPF.  Also I found
that it didn't pass the max_latency to BPF. :)

No functional changes intended.

Cc: Gabriele Monaco <gmonaco@redhat.com>
Link: https://lore.kernel.org/r/20250227191223.1288473-2-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-13 00:18:10 -07:00
Namhyung Kim
9c33441418 perf ftrace: Fix latency stats with BPF
When BPF collects the stats for the latency in usec, it first divides
the time by 1000.  But that means it would have 0 if the delta is small
and won't update the total time properly.

Let's keep the stats in nsec always and adjust to usec before printing.

Before:

  $ sudo ./perf ftrace latency -ab -T mutex_lock --hide-empty -- sleep 0.1
  #   DURATION     |      COUNT | GRAPH                                          |
       0 -    1 us |        765 | #############################################  |
       1 -    2 us |         10 |                                                |
       2 -    4 us |          2 |                                                |
       4 -    8 us |          5 |                                                |

  # statistics  (in usec)
    total time:                    0    <<<--- (here)
      avg time:                    0
      max time:                    6
      min time:                    0
         count:                  782

After:

  $ sudo ./perf ftrace latency -ab -T mutex_lock --hide-empty -- sleep 0.1
  #   DURATION     |      COUNT | GRAPH                                          |
       0 -    1 us |        880 | ############################################   |
       1 -    2 us |         13 |                                                |
       2 -    4 us |          8 |                                                |
       4 -    8 us |          3 |                                                |

  # statistics  (in usec)
    total time:                  268    <<<--- (here)
      avg time:                    0
      max time:                    6
      min time:                    0
         count:                  904
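
The truncation can be illustrated with a small sketch.  The function
names are hypothetical; the point is only that dividing each delta by
1000 before summing drops every sub-microsecond sample, while summing in
nsec and converting once does not:

```c
#include <stddef.h>
#include <stdint.h>

/* Old behavior: convert each delta to usec before accumulating. */
static uint64_t sketch_total_usec_truncating(const uint64_t *delta_ns, size_t n)
{
	uint64_t total = 0;

	for (size_t i = 0; i < n; i++)
		total += delta_ns[i] / 1000;	/* deltas below 1000 ns add 0 */

	return total;
}

/* New behavior: accumulate in nsec and convert once before printing. */
static uint64_t sketch_total_usec_from_nsec(const uint64_t *delta_ns, size_t n)
{
	uint64_t total_ns = 0;

	for (size_t i = 0; i < n; i++)
		total_ns += delta_ns[i];

	return total_ns / 1000;
}
```

With four deltas of 300, 400, 500 and 900 ns, the first function returns
0 while the second returns 2.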

Tested-by: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Gabriele Monaco <gmonaco@redhat.com>
Link: https://lore.kernel.org/r/20250227191223.1288473-1-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-13 00:18:10 -07:00
Ian Rogers
5b562763d7 perf test stat: Additional topdown grouping tests
Add a loop and helper function to avoid repetition; the loop uses
arrays, so switch the shell to bash. Add additional topdown group tests
where a topdown event needs to be moved beyond others and the slots
event isn't first in the target group. This replicates issues that
occur on hybrid systems where the other events are for the cpu_atom
PMU. Test with both PMU and software events. Place the slots event
later in the event list.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250307023906.1135613-5-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-11 19:05:04 -07:00
Dapeng Mi
16dd43dfd6 perf x86 evlist: Update comments on topdown regrouping
Remove the comments about groupings not working and update them with the:
```
perf stat -e "{instructions,slots},{cycles,topdown-retiring}"
```
case, which now works.

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250307023906.1135613-4-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-11 19:04:56 -07:00
Ian Rogers
9a1c57fe26 perf parse-events: Corrections to topdown sorting
In the case of '{instructions,slots},faults,topdown-retiring' the
first event that must be grouped, slots, is ignored causing the
topdown-retiring event not to be adjacent to the group it needs to be
inserted into. Don't ignore the group members when computing the
force_grouped_index.

Make the force_grouped_index be for the leader of the group it is
within and always use it first rather than a group leader index so
that topdown events may be sorted from one group into another.

As the PMU name comparison applies to moving events in the same group
ensure the name ordering is always respected.

Change the group splitting logic to not group if there are no other
topdown events and to fix cases where the force group leader wasn't
being grouped with the other members of its group.

Reported-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Closes: https://lore.kernel.org/lkml/20250224083306.71813-2-dapeng1.mi@linux.intel.com/
Closes: https://lore.kernel.org/lkml/f7e4f7e8-748c-4ec7-9088-0e844392c11a@linux.intel.com/
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20250307023906.1135613-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-11 19:00:50 -07:00
Dapeng Mi
b74683b3bb perf x86/topdown: Fix topdown leader sampling test error on hybrid
When running the topdown leader sampling test on Intel hybrid platforms,
such as LNL/ARL, we see the error below.

Topdown leader sampling test
Topdown leader sampling [Failed topdown events not reordered correctly]

It indicates that the command below fails.

perf record -o "${perfdata}" -e "{instructions,slots,topdown-retiring}:S" true

The root cause is that the perf tool creates a perf event for each PMU
type on which the event can be created.

As for this command, there would be 5 perf events created,
cpu_atom/instructions/,cpu_atom/topdown_retiring/,
cpu_core/slots/,cpu_core/instructions/,cpu_core/topdown-retiring/

For these 5 events, the 2 cpu_atom events are in a group and the other 3
cpu_core events are in another group.

When arch_topdown_sample_read() traverses all these 5 events, events
cpu_atom/instructions/ and cpu_core/slots/ don't have the same group
leader, so it returns false directly.  As a result the cpu_core/slots/
event is used to sample, which is not allowed by the PMU driver.

It's overkill to return false directly if "evsel->core.leader !=
leader->core.leader" since there could be multiple groups in the event
list.

Just "continue" instead of "return false" to fix this issue.
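
A simplified sketch of the traversal, using a hypothetical flat array
instead of the real evlist and evsel types.  Events from other groups are
skipped rather than ending the search:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an event in the event list. */
struct sketch_evsel {
	int leader;		/* index of this event's group leader */
	bool topdown_metric;	/* true for e.g. topdown-retiring */
};

/*
 * Return true if the group led by evs[leader] contains a topdown metric
 * event other than the leader itself.
 */
static bool sketch_sample_read(const struct sketch_evsel *evs, size_t n,
			       size_t leader)
{
	for (size_t i = 0; i < n; i++) {
		if (evs[i].leader != evs[leader].leader)
			continue;	/* other group: skip, don't give up */

		if (i != leader && evs[i].topdown_metric)
			return true;
	}
	return false;
}
```

With the five events listed above in that order (two cpu_atom events in
one group, three cpu_core events in another), a check starting from
cpu_core/slots/ now reaches cpu_core/topdown-retiring/ instead of
stopping at the first cpu_atom event.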

Fixes: 1e53e9d178 ("perf x86/topdown: Correct leader selection with sample_read enabled")
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Tested-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250307023906.1135613-2-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-11 19:00:50 -07:00
Ian Rogers
fd5de637a4 perf tools: Improve handling of hybrid PMUs in perf_event_attr__fprintf
Support the PMU name from the legacy hardware and hw_cache PMU
extended types.  Remove some macros and make variable names more
intention-revealing, rather than just being called "value".

Before:
```
$ perf stat -vv -e instructions true
...
------------------------------------------------------------
perf_event_attr:
  type                             0 (PERF_TYPE_HARDWARE)
  size                             136
  config                           0xa00000001
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  enable_on_exec                   1
  exclude_guest                    1
------------------------------------------------------------
sys_perf_event_open: pid 181636  cpu -1  group_fd -1  flags 0x8 = 5
------------------------------------------------------------
perf_event_attr:
  type                             0 (PERF_TYPE_HARDWARE)
  size                             136
  config                           0x400000001
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  enable_on_exec                   1
  exclude_guest                    1
------------------------------------------------------------
sys_perf_event_open: pid 181636  cpu -1  group_fd -1  flags 0x8 = 6
...
```

After:
```
$ perf stat -vv -e instructions true
...
------------------------------------------------------------
perf_event_attr:
  type                             0 (PERF_TYPE_HARDWARE)
  size                             136
  config                           0xa00000001 (cpu_atom/PERF_COUNT_HW_INSTRUCTIONS/)
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  enable_on_exec                   1
------------------------------------------------------------
sys_perf_event_open: pid 181724  cpu -1  group_fd -1  flags 0x8 = 5
------------------------------------------------------------
perf_event_attr:
  type                             0 (PERF_TYPE_HARDWARE)
  size                             136
  config                           0x400000001 (cpu_core/PERF_COUNT_HW_INSTRUCTIONS/)
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  enable_on_exec                   1
------------------------------------------------------------
sys_perf_event_open: pid 181724  cpu -1  group_fd -1  flags 0x8 = 6
...
```
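
For reference, the config values above can be split by hand.  This is a
sketch with hypothetical helper names: for extended hardware types the
PMU type is carried in the upper 32 bits of config and the hardware event
id in the lower 32 bits.  Which PMU a given type number maps to (10 and 4
in this output) depends on the machine:

```c
#include <stdint.h>

#define SKETCH_PMU_TYPE_SHIFT	32
#define SKETCH_HW_EVENT_MASK	0xffffffffULL

/* Upper 32 bits: PMU type (0xa and 0x4 in the output above). */
static uint32_t sketch_config_pmu_type(uint64_t config)
{
	return (uint32_t)(config >> SKETCH_PMU_TYPE_SHIFT);
}

/* Lower 32 bits: hardware event id (1 is PERF_COUNT_HW_INSTRUCTIONS). */
static uint64_t sketch_config_hw_id(uint64_t config)
{
	return config & SKETCH_HW_EVENT_MASK;
}
```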

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Tested-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250307023906.1135613-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-11 19:00:50 -07:00
Ian Rogers
f7cffbabf7 perf python tracepoint: Switch to using parse_events
Rather than manually configuring an evsel, switch to using
parse_events for greater commonality with the rest of the perf code.

Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250228222308.626803-12-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-11 18:55:38 -07:00
Ian Rogers
0dfcc7c86c perf python: Add evlist.config to set up record options
Add access to evlist__config that is used to configure an evlist with
record options.

Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250228222308.626803-11-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-11 18:55:38 -07:00
Ian Rogers
1a8356fbf8 perf python: Add evlist all_cpus accessor
Add a means to get the reference counted all_cpus CPU map from an
evlist in its python form.

Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250228222308.626803-10-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-11 18:55:38 -07:00