linux

mirror of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-03-27 09:56:48 +08:00

Author	SHA1	Message	Date
Ian Rogers	afffec6f03	perf dso: Add support for reading the e_machine type for a dso For ELF file dsos read the e_machine from the ELF header. For kernel types assume the e_machine matches the perf tool. In other cases return EM_NONE. When reading from the ELF header use DSO__SWAP that may need dso->needs_swap initializing. Factor out dso__swap_init to allow this. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org> Link: https://lore.kernel.org/r/20250319050741.269828-7-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-20 22:58:02 -07:00
Ian Rogers	5c2938fe78	perf syscalltbl: Remove struct syscalltbl The syscalltbl held entries of system call name and number pairs, generated from a native syscalltbl at start up. As there are gaps in the system call number there is a notion of index into the table. Going forward we want the system call table to be identifiable by a machine type, for example, i386 vs x86-64. Change the interface to the syscalltbl so (1) a (currently unused machine type of EM_HOST) is passed (2) the index to syscall number and system call name mapping is computed at build time. Two tables are used for this, an array of system call number to name, an array of system call numbers sorted by the system call name. The sorted array doesn't store strings in part to save memory and relocations. The index notion is carried forward and is an index into the sorted array of system call numbers, the data structures are opaque (held only in syscalltbl.c), and so the number of indices for a machine type is exposed as a new API. The arrays are computed in the syscalltbl.sh script and so no start-up time computation and storage is necessary. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Howard Chu <howardchu95@gmail.com> Reviewed-by: Charlie Jenkins <charlie@rivosinc.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org> Link: https://lore.kernel.org/r/20250319050741.269828-6-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-20 22:57:57 -07:00
Ian Rogers	3d94b8441c	perf trace: Reorganize syscalls Identify struct syscall information in the syscalls table by a machine type and syscall number, not just system call number. Having the machine type means that 32-bit system calls can be differentiated from 64-bit ones on a machine capable of both. Having a table for all machine types and all system call numbers would be too large, so maintain a sorted array of system calls as they are encountered. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Howard Chu <howardchu95@gmail.com> Reviewed-by: Charlie Jenkins <charlie@rivosinc.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org> Link: https://lore.kernel.org/r/20250319050741.269828-5-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-20 22:57:53 -07:00
Ian Rogers	af472d3c44	perf syscalltbl: Remove syscall_table.h The definition of "static const char *const syscalltbl[] = {" is done in a generated syscalls_32.h or syscalls_64.h that is architecture dependent. In order to include the appropriate file a syscall_table.h is found via the perf include path and it includes the syscalls_32.h or syscalls_64.h as appropriate. To support having multiple syscall tables, one for 32-bit and one for 64-bit, or for different architectures, an include path cannot be used. Remove syscall_table.h because of this and inline what it does into syscalltbl.c. For architectures without a syscall_table.h this will cause a failure to include either syscalls_32.h or syscalls_64.h rather than a failure to include syscall_table.h. For architectures that only included one or other, the behavior matches BITS_PER_LONG as previously done on architectures supporting both syscalls_32.h and syscalls_64.h. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Howard Chu <howardchu95@gmail.com> Reviewed-by: Charlie Jenkins <charlie@rivosinc.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org> Link: https://lore.kernel.org/r/20250319050741.269828-4-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-20 22:57:35 -07:00
Ian Rogers	4773175c9d	perf dso: kernel-doc for enum dso_binary_type There are many and non-obvious meanings to the dso_binary_type enum values. Add kernel-doc to speed interpreting their meanings. Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20250319050741.269828-3-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-20 22:57:25 -07:00
Ian Rogers	f1794ecb0c	perf dso: Move libunwind dso_data variables into ifdef The variables elf_base_addr, debug_frame_offset, eh_frame_hdr_addr and eh_frame_hdr_offset are only accessed in unwind-libunwind-local.c which is conditionally built on having libunwind support. Make the variables conditional on libunwind support too. Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org> Link: https://lore.kernel.org/r/20250319050741.269828-2-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-20 22:56:29 -07:00
Namhyung Kim	d10a7aaaf8	perf report: Disable children column for data type profiling I've realized that it doesn't make sense to accumulate the samples to parent in the callchain when data type profiling is enabled. Because it won't have the same data type access in the parent. Otherwise it'd see something like this: $ perf report -s type --stdio -g none # To display the perf.data header info, please use --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 2K of event 'cycles:Pu' # Event count (approx.): 8266456478 # # Children Latency Self Latency Data Type # ........ ....... ........ ........ ......... # 698.97% 697.72% 99.80% 99.61% (unknown) 0.09% 0.18% 0.09% 0.18% Elf64_Rela 0.05% 0.10% 0.05% 0.10% unsigned char 0.05% 0.10% 0.05% 0.10% struct exit_function_list 0.00% 0.01% 0.00% 0.01% struct rtld_global Link: https://lore.kernel.org/r/20250307080829.354947-3-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-20 09:17:56 -07:00
Namhyung Kim	6df71c7237	perf report: Allow hierarchy mode for --children It was prohibited because the output fields in the children mode were not handled properly with hierarchy. But we can have the output fields in the same level, it can allow them together. For example, latency mode adds more output fields by default and now they are displayed properly. $ perf record --latency -g -- perf test -w thloop $ perf report -H --stdio # To display the perf.data header info, please use --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 2K of event 'cycles:Pu' # Event count (approx.): 8266456478 # # Children Latency Overhead Latency Command / Shared Object / Symbol # ........................................... ........................................................ # 0.08% 0.16% 100.00% 100.00% perf 0.08% 0.16% 0.24% 0.47% ld-linux-x86-64.so.2 0.12% 0.24% 0.12% 0.24% [.] _dl_relocate_object 0.08% 0.16% 0.08% 0.16% [.] _dl_lookup_symbol_x 0.03% 0.06% 0.03% 0.06% [.] strcmp 0.00% 0.01% 0.00% 0.01% [.] _dl_start 0.00% 0.00% 0.00% 0.00% [.] _dl_start_user 0.00% 0.00% 0.00% 0.00% [.] _dl_sysdep_start 0.00% 0.00% 0.00% 0.00% [.] _start 0.00% 0.00% 0.00% 0.00% [.] dl_main 0.03% 0.06% 0.03% 0.06% libLLVM-16.so.1 0.03% 0.06% 0.03% 0.06% [.] llvm::StringMapImpl::RehashTable(unsigned int) 0.00% 0.00% 0.00% 0.00% [.] 0x00007f137ccd18e8 0.00% 0.00% 99.66% 99.31% perf 99.66% 99.31% 99.66% 99.31% [.] test_loop \| \|--49.86%--0x7f137b633d68 \| 0x55dbdbbb7d2c ... Link: https://lore.kernel.org/r/20250307080829.354947-2-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-20 09:17:56 -07:00
Namhyung Kim	a1bbd66627	perf sort: Keep output fields in the same level This is useful for hierarchy output mode where the first level is considered as output fields. We want them in the same level so that it can show only the remaining groups in the hierarchy. Before: $ perf report -s overhead,sample,period,comm,dso -H --stdio ... # Overhead Samples / Period / Command / Shared Object # ................. .......................................... # 100.00% 4035 100.00% 3835883066 100.00% perf 99.37% perf 0.50% ld-linux-x86-64.so.2 0.06% [unknown] 0.04% libc.so.6 0.02% libLLVM-16.so.1 After: $ perf report -s overhead,sample,period,comm,dso -H --stdio ... # Overhead Samples Period Command / Shared Object # ....................................... ....................... # 100.00% 4035 3835883066 perf 99.37% 4005 3811826223 perf 0.50% 19 19210014 ld-linux-x86-64.so.2 0.06% 8 2367089 [unknown] 0.04% 2 1720336 libc.so.6 0.02% 1 759404 libLLVM-16.so.1 Acked-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250307080829.354947-1-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-20 09:17:56 -07:00
Thomas Richter	431db90a73	perf pmu: Handle memory failure in tool_pmu__new() On linux-next commit `72c6f57a41` ("perf pmu: Dynamically allocate tool PMU") allocated PMU named "tool" dynamically. However that allocation can fail and a NULL pointer is returned. That case is currently not handled and would result in an invalid address reference. Add a check for NULL pointer. Fixes: `72c6f57a41` ("perf pmu: Dynamically allocate tool PMU") Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20250319122820.2898333-1-tmricht@linux.ibm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-19 17:00:16 -07:00
James Clark	6d2dcd6352	perf: intel-tpebs: Fix incorrect usage of zfree() zfree() requires an address otherwise it frees what's in name, rather than name itself. Pass the address of name to fix it. This was the only incorrect occurrence in Perf found using a search. Fixes: `8db5cabcf1` ("perf stat: Fork and launch 'perf record' when 'perf stat' needs to get retire latency value for a metric.") Signed-off-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20250319101614.190922-1-james.clark@linaro.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-19 16:56:56 -07:00
Ian Rogers	58b8b5d142	perf cpumap: Increment reference count for online cpumap Thomas Richter <tmricht@linux.ibm.com> reported a double put on the cpumap for the placeholder core PMU: https://lore.kernel.org/lkml/20250318095132.1502654-3-tmricht@linux.ibm.com/ Requiring the caller to get the cpumap is not how these things are usually done, switch cpu_map__online to do the get and then fix up any use cases where a put is needed. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Richter <tmricht@linux.ibm.com> Link: https://lore.kernel.org/r/20250318171914.145616-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-19 16:56:33 -07:00
Stephen Brennan	ebf0b33273	perf dso: fix dso__is_kallsyms() check Kernel modules for which we cannot find a file on-disk will have a dso->long_name that looks like "[module_name]". Prior to the commit listed in the fixes, the dso->kernel field would be zero (for user space), so dso__is_kallsyms() would return false. After the commit, kernel module DSOs are correctly labeled, but the result is that dso__is_kallsyms() erroneously returns true for those modules without a filesystem path. Later, build_id_cache__add() consults this value of is_kallsyms, and when true, it copies /proc/kallsyms into the cache. Users with many kernel modules without a filesystem path (e.g. ksplice or possibly kernel live patch modules) have reported excessive disk space usage in the build ID cache directory due to this behavior. To reproduce the issue, it's enough to build a trivial out-of-tree hello world kernel module, load it using insmod, and then use: perf record -ag -- sleep 1 In the build ID directory, there will be a directory for your module name containing a kallsyms file. Fix this up by changing dso__is_kallsyms() to consult the dso_binary_type enumeration, which is also symmetric to the above checks for dso__is_vmlinux() and dso__is_kcore(). With this change, kallsyms is not cached in the build-id cache for out-of-tree modules. Fixes: `02213cec64` ("perf maps: Mark module DSOs with kernel type") Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com> Link: https://lore.kernel.org/r/20250318230012.2038790-1-stephen.s.brennan@oracle.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-19 16:56:05 -07:00
Xin Li (Intel)	8f97566c8a	x86/cpufeatures: Remove {disabled,required}-features.h The functionalities of {disabled,required}-features.h have been replaced with the auto-generated generated/<asm/cpufeaturemasks.h> header. Thus they are no longer needed and can be removed. None of the macros defined in {disabled,required}-features.h is used in tools, delete them too. Signed-off-by: Xin Li (Intel) <xin@zytor.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250305184725.3341760-4-xin@zytor.com	2025-03-19 11:15:12 +01:00
Feng Yang	2b5b834cc3	perf kwork: Remove unreachable judgments When s2[i] = '\0', if s1[i] != '\0', it will be judged by ret, and if s1[i] = '\0', it will be judged by !s1[i]. So in reality, s2[i] will never make a judgment. Signed-off-by: Feng Yang <yangfeng@kylinos.cn> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250314031013.94480-1-yangfeng59949@163.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-18 16:55:30 -07:00
Arnaldo Carvalho de Melo	89aaeaf842	perf python: Check if there is space to copy all the event The pyrf_event__new() method copies the event obtained from the perf ring buffer to a structure that will then be turned into a python object for further consumption, so it copies perf_event.header.size bytes to its 'event' member: $ pahole -C pyrf_event /tmp/build/perf-tools-next/python/perf.cpython-312-x86_64-linux-gnu.so struct pyrf_event { PyObject ob_base; /* 0 16 */ struct evsel *evsel; /* 16 8 */ struct perf_sample sample; /* 24 312 */ /* XXX last struct has 7 bytes of padding, 2 holes */ /* --- cacheline 5 boundary (320 bytes) was 16 bytes ago --- */ union perf_event event; /* 336 4168 */ /* size: 4504, cachelines: 71, members: 4 */ /* member types with holes: 1, total: 2 */ /* paddings: 1, sum paddings: 7 */ /* last cacheline: 24 bytes */ }; $ It was doing so without checking if the event just obtained has more than that space, fix it. This isn't a proper, final solution, as we need to support larger events, but for the time being we at least bounds check and document it. Fixes: `877108e42b` ("perf tools: Initial python binding") Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250312203141.285263-7-acme@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-18 16:08:45 -07:00
Arnaldo Carvalho de Melo	f3fed3ae34	perf python: Don't keep a raw_data pointer to consumed ring buffer space When processing tracepoints the perf python binding was parsing the event before calling perf_mmap__consume(&md->core) in pyrf_evlist__read_on_cpu(). But part of this event parsing was to set the perf_sample->raw_data pointer to the payload of the event, which then could be overwritten by another event before tracepoint fields were asked for via event.prev_comm in a python program, for instance. This also happened with other fields, but strings were where problems were surfacing, as there is UTF-8 validation for the potentially garbled data. This ended up showing up as (with some added debugging messages): ( field 'prev_comm' ret=0x7f7c31f65110, raw_size=68 ) ( field 'prev_pid' ret=0x7f7c23b1bed0, raw_size=68 ) ( field 'prev_prio' ret=0x7f7c239c0030, raw_size=68 ) ( field 'prev_state' ret=0x7f7c239c0250, raw_size=68 ) time 14771421785867 prev_comm= prev_pid=1919907691 prev_prio=796026219 prev_state=0x303a32313175 ==> ( XXX '��' len=16, raw_size=68) ( field 'next_comm' ret=(nil), raw_size=68 ) Traceback (most recent call last): File "/home/acme/git/perf-tools-next/tools/perf/python/tracepoint.py", line 51, in <module> main() File "/home/acme/git/perf-tools-next/tools/perf/python/tracepoint.py", line 46, in main event.next_comm, ^^^^^^^^^^^^^^^ AttributeError: 'perf.sample_event' object has no attribute 'next_comm' When event.next_comm was asked for, the PyUnicode_FromString() python API would fail and that tracepoint field wouldn't be available, stopping the tools/perf/python/tracepoint.py test tool. But, since we already do a copy of the whole event in pyrf_event__new, just use it and while at it remove what was done in `e8968e6541` ("perf python: Fix pyrf_evlist__read_on_cpu event consuming") because we don't really need to wait for parsing the sample before declaring the event as consumed.
This copy is questionable as is now, as it limits the maximum event + sample_type and tracepoint payload to sizeof(union perf_event), this all has been "working" because 'struct perf_event_mmap2', the largest entry in 'union perf_event' is: $ pahole -C perf_event ~/bin/perf | grep mmap2 struct perf_record_mmap2 mmap2; /* 0 4168 */ $ Fixes: `bae57e3825` ("perf python: Add support to resolve tracepoint fields") Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250312203141.285263-6-acme@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-18 16:08:35 -07:00
Arnaldo Carvalho de Melo	3de5a2bf5b	perf python: Decrement the refcount of just created event on failure To avoid a leak if we have the python object but then something happens and we need to return the operation, decrement the refcount of the newly created object. Fixes: `377f698db1` ("perf python: Add struct evsel into struct pyrf_event") Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250312203141.285263-5-acme@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-18 16:08:29 -07:00
Arnaldo Carvalho de Melo	a570da2148	perf python tracepoint.py: Change the COMM using setproctitle if available Otherwise when debugging we see just "python" in perf, top, etc. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250312203141.285263-4-acme@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-18 16:08:22 -07:00
Arnaldo Carvalho de Melo	1882625c91	perf python: Remove some unused macros (_PyUnicode_FromString(arg), etc) When python2 support was removed in `e7e9943c87` ("perf python: Remove python 2 scripting support"), all use of the _PyUnicode_FromString(arg), _PyUnicode_FromFormat(...), and _PyLong_FromLong(arg) macros was removed as well, so remove it. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250312203141.285263-3-acme@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-18 16:08:14 -07:00
Arnaldo Carvalho de Melo	1376c195e8	perf python: Fixup description of sample.id event member Some old cut'n'paste error, it's "ip", so the description should be "event ip", not "event type". Fixes: `877108e42b` ("perf tools: Initial python binding") Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250312203141.285263-2-acme@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-18 16:08:05 -07:00
Ian Rogers	ca2182097e	perf test dso-data: Correctly free test file in read test The DSO data read test opens a file but as dsos__exit is used the test file isn't closed. This causes the subsequent subtests in don't fork (-F) mode to fail as one more than expected file descriptor is open. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250318043151.137973-4-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-17 22:07:18 -07:00
Ian Rogers	5ac22c35aa	perf dso: Use lock annotations to fix asan deadlock dso__list_del with address sanitizer and/or reference count checking will call dso__put that can call dso__data_close reentrantly trying to lock the dso__data_open_lock and deadlocking. Switch from pthread mutexes to perf's mutex so that lock checking is performed in debug builds. Add lock annotations that diagnosed the problem. Release the dso__data_open_lock around the dso__put to avoid the deadlock. Change the declaration of dso__data_get_fd to return a boolean, indicating the fd is valid and the lock is held, to make it compatible with the thread safety annotations as a try lock. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250318043151.137973-3-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-17 22:07:18 -07:00
Ian Rogers	c5ebf3a266	perf mutex: Add annotations for LOCKS_EXCLUDED and LOCKS_RETURNED Used to annotate when locks shouldn't be held for a function or if a function returns a lock that's used by later mutex lock unlock operations. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250318043151.137973-2-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-17 22:07:18 -07:00
Ian Rogers	658b34cc9f	perf test: Add pipe output testing for annotate Parameterize the basic testing to generate directly a perf.data file or to generate/use one from pipe input or output. To simplify the refactor move some of the head/grep logic around. Use "-q" with grep to make the test output cleaner. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250311211635.541090-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-14 18:16:33 -07:00
Ian Rogers	3a86d63e6f	perf test: Fixes to variable expansion and stdout for diff test When make_data fails its error message needs to go to stderr rather than stdout and the stdout value is captured in a variable. Quote the $err value so that it is always a valid input for test. This error is commonly encountered if no sample data is gathered by the test. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250312001841.1515779-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-14 18:15:13 -07:00
Arnaldo Carvalho de Melo	4e82c88a90	perf libunwind: Fixup conversion perf_sample->user_regs to a pointer The `dc6d2bc2d8` ("perf sample: Make user_regs and intr_regs optional") misses the changes to a file, resulting in this problem: $ make LIBUNWIND=1 -C tools/perf O=/tmp/build/perf-tools-next install-bin <SNIP> CC /tmp/build/perf-tools-next/util/unwind-libunwind-local.o CC /tmp/build/perf-tools-next/util/unwind-libunwind.o <SNIP> util/unwind-libunwind-local.c: In function ‘access_mem’: util/unwind-libunwind-local.c:582:56: error: ‘ui->sample->user_regs’ is a pointer; did you mean to use ‘->’? 582 | if (__write || !stack || !ui->sample->user_regs.regs) { | ^ | -> util/unwind-libunwind-local.c:587:38: error: passing argument 2 of ‘perf_reg_value’ from incompatible pointer type [-Wincompatible-pointer-types] 587 | ret = perf_reg_value(&start, &ui->sample->user_regs, | ^~~~~~~~~~~~~~~~~~~~~~ | | | struct regs_dump ** <SNIP> ⬢ [acme@toolbox perf-tools-next]$ git bisect bad `dc6d2bc2d8` is the first bad commit commit `dc6d2bc2d8` (HEAD) Author: Ian Rogers <irogers@google.com> Date: Mon Jan 13 11:43:45 2025 -0800 perf sample: Make user_regs and intr_regs optional Detected using: make -C tools/perf build-test Fixes: `dc6d2bc2d8` ("perf sample: Make user_regs and intr_regs optional") Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250313033121.758978-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-14 18:12:33 -07:00
Veronika Molnarova	02ba09c8ab	perf test stat_all_pmu.sh: Correctly check 'perf stat' result Test case "stat_all_pmu.sh" is not correctly checking 'perf stat' output due to a poor design. Firstly, having the 'set -e' option with a trap catching the sigexit causes the shell to exit immediately if 'perf stat' ends with any non-zero value, which is then caught by the trap reporting an unexpected signal. This causes events that should be parsed by the if-else statement to be caught by the trap handler and are reported as errors: $ perf test -vv "perf all pmu" Testing i915/actual-frequency/ Unexpected signal in main Error: Access to performance monitoring and observability operations is limited. Secondly, the if-else branches are not exclusive as the checking if the event is present in the output log covers also the "<not supported>" events, which should be accepted, and also the "Bad name events", which should be rejected. Remove the "set -e" option from the test case, correctly parse the "perf stat" output log and check its return value. Add the missing outputs for the 'perf stat' result and also add logs messages to report the branch that parsed the event for more info. Fixes: `7e73ea4029` ("perf test: Ignore security failures in all PMU test") Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com> Tested-by: Qiao Zhao <qzhao@redhat.com> Link: https://lore.kernel.org/r/20241122231233.79509-1-vmolnaro@redhat.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-14 10:41:34 -07:00
Yujie Liu	fa9bc517af	perf script: Update brstack syntax documentation The following commits added new fields/flags to the branch stack field list: commit `1f48989cdc` ("perf script: Output branch sample type") commit `6ade6c6460` ("perf script: Show branch speculation info") commit `1e66dcff7b` ("perf script: Add not taken event for branch stack") Update brstack syntax documentation to be consistent with the latest branch stack field list. Improve the descriptions to help users interpret the fields accurately. Signed-off-by: Yujie Liu <yujie.liu@intel.com> Reviewed-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Sandipan Das <sandipan.das@amd.com> Link: https://lore.kernel.org/r/20250312072329.419020-1-yujie.liu@intel.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-14 10:41:08 -07:00
Yujie Liu	2f39edece1	perf script: Fix typo in branch event mask BRACH -> BRANCH Fixes: `88b1473135` ("perf script: Separate events from branch types") Signed-off-by: Yujie Liu <yujie.liu@intel.com> Reviewed-by: Leo Yan <leo.yan@arm.com> Reviewed-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20250312075636.429127-1-yujie.liu@intel.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-13 13:19:27 -07:00
Arnaldo Carvalho de Melo	2333cfa9f8	perf hist stdio: Do bounds check when printing callchains to avoid UB with new gcc versions Do a simple bounds check to avoid this on new gcc versions: 31 15.81 fedora:rawhide : FAIL gcc version 15.0.1 20250225 (Red Hat 15.0.1-0) (GCC) In function 'callchain__fprintf_left_margin', inlined from 'callchain__fprintf_graph.constprop' at ui/stdio/hist.c:246:12: ui/stdio/hist.c:27:39: error: iteration 2147483647 invokes undefined behavior [-Werror=aggressive-loop-optimizations] 27 | for (i = 0; i < left_margin; i++) | ~^~ ui/stdio/hist.c:27:23: note: within this loop 27 | for (i = 0; i < left_margin; i++) | ~~^~~~~~~~~~~~~ cc1: all warnings being treated as errors Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://lore.kernel.org/r/20250310194534.265487-4-acme@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-13 00:30:14 -07:00
Arnaldo Carvalho de Melo	cf67629f7f	perf units: Fix insufficient array space No need to specify the array size, let the compiler figure that out. This addresses this compiler warning that was noticed while build testing on fedora rawhide: 31 15.81 fedora:rawhide : FAIL gcc version 15.0.1 20250225 (Red Hat 15.0.1-0) (GCC) util/units.c: In function 'unit_number__scnprintf': util/units.c:67:24: error: initializer-string for array of 'char' is too long [-Werror=unterminated-string-initialization] 67 | char unit[4] = "BKMG"; | ^~~~~~ cc1: all warnings being treated as errors Fixes: `9808143ba2` ("perf tools: Add unit_number__scnprintf function") Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://lore.kernel.org/r/20250310194534.265487-3-acme@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-13 00:30:08 -07:00
Namhyung Kim	bbf006d6d1	perf annotate: Add --code-with-type option. This option is to show data type info in the regular (code) annotation. It tries to find data type for each (memory) instruction in the function. It'd be useful to see function-level memory access pattern and also to debug the data type profiling result. The output would be added at the end of the line and have "# data-type:" prefix. For now, it only works with --stdio mode for simplicity. I can work on enabling it for TUI later. $ perf annotate --stdio --code-with-type Percent \| Source code & Disassembly of vmlinux for cpu/mem-loads/ppk (253 samples, percent: local period) --------------------------------------------------------------------------------------------------------------- : 0 0xffffffff81baa000 <check_preemption_disabled>: 0.00 : ffffffff81baa000: pushq %r12 # data-type: (stack operation) 0.00 : ffffffff81baa002: pushq %rbp # data-type: (stack operation) 0.00 : ffffffff81baa003: pushq %rbx # data-type: (stack operation) 0.00 : ffffffff81baa004: subq $0x8, %rsp 18.00 : ffffffff81baa008: movl %gs:0x7e48893d(%rip), %ebx # 0x3294c <pcpu_hot+0xc> # data-type: struct pcpu_hot +0xc (cpu_number) 12.58 : ffffffff81baa00f: movl %gs:0x7e488932(%rip), %eax # 0x32948 <pcpu_hot+0x8> # data-type: struct pcpu_hot +0x8 (preempt_count) 0.00 : ffffffff81baa016: testl $0x7fffffff, %eax 0.00 : ffffffff81baa01b: je 0xffffffff81baa02c <check_preemption_disabled+0x2c> 0.00 : ffffffff81baa01d: addq $0x8, %rsp 0.00 : ffffffff81baa021: movl %ebx, %eax 14.19 : ffffffff81baa023: popq %rbx # data-type: (stack operation) 18.86 : ffffffff81baa024: popq %rbp # data-type: (stack operation) 12.10 : ffffffff81baa025: popq %r12 # data-type: (stack operation) 17.78 : ffffffff81baa027: jmp 0xffffffff81bc1170 <__x86_return_thunk> 6.49 : ffffffff81baa02c: callq 0xc9139e(%rip) # 0xffffffff8283b3d0 <pv_ops+0xf0> # data-type: (stack operation) 0.00 : ffffffff81baa032: testb $0x2, %ah 0.00 : ffffffff81baa035: je 
0xffffffff81baa01d <check_preemption_disabled+0x1d> 0.00 : ffffffff81baa037: movq %rdi, %rbp 0.00 : ffffffff81baa03a: movq %gs:0x32940, %rax # data-type: struct pcpu_hot +0 (current_task) 0.00 : ffffffff81baa043: testb $0x4, 0x2f(%rax) # data-type: struct task_struct +0x2f (flags) 0.00 : ffffffff81baa047: je 0xffffffff81baa052 <check_preemption_disabled+0x52> 0.00 : ffffffff81baa049: cmpl $0x1, 0x3d0(%rax) # data-type: struct task_struct +0x3d0 (nr_cpus_allowed) 0.00 : ffffffff81baa050: je 0xffffffff81baa01d <check_preemption_disabled+0x1d> 0.00 : ffffffff81baa052: movq %gs:0x32940, %r12 # data-type: struct pcpu_hot +0 (current_task) 0.00 : ffffffff81baa05b: cmpw $0x0, 0x7f0(%r12) # data-type: struct task_struct +0x7f0 (migration_disabled) 0.00 : ffffffff81baa065: movq %rsi, (%rsp) 0.00 : ffffffff81baa069: jne 0xffffffff81baa01d <check_preemption_disabled+0x1d> 0.00 : ffffffff81baa06b: movl 0xe8dd13(%rip), %eax # 0xffffffff82a37d84 <system_state> # data-type: enum system_states +0 0.00 : ffffffff81baa071: testl %eax, %eax 0.00 : ffffffff81baa073: je 0xffffffff81baa01d <check_preemption_disabled+0x1d> 0.00 : ffffffff81baa075: incl %gs:0x7e4888cc(%rip) # 0x32948 <pcpu_hot+0x8> # data-type: struct pcpu_hot +0x8 (preempt_count) 0.00 : ffffffff81baa07c: movq $-0x7e14a100, %rdi 0.00 : ffffffff81baa083: callq 0xffffffff81148c40 <__printk_ratelimit> # data-type: (stack operation) 0.00 : ffffffff81baa088: testl %eax, %eax 0.00 : ffffffff81baa08a: je 0xffffffff81baa0d5 <check_preemption_disabled+0xd5> 0.00 : ffffffff81baa08c: movl 0x958(%r12), %r9d # data-type: struct task_struct +0x958 (pid) 0.00 : ffffffff81baa094: movq (%rsp), %rdx # data-type: char +0 0.00 : ffffffff81baa098: movq %rbp, %rsi 0.00 : ffffffff81baa09b: leaq 0xb88(%r12), %r8 # data-type: struct task_struct +0xb88 (comm) 0.00 : ffffffff81baa0a3: movl %gs:0x7e48889e(%rip), %ecx # 0x32948 <pcpu_hot+0x8> # data-type: struct pcpu_hot +0x8 (preempt_count) 0.00 : ffffffff81baa0aa: andl $0x7fffffff, %ecx 0.00 : 
ffffffff81baa0b0: movq $-0x7dd3cdf0, %rdi 0.00 : ffffffff81baa0b7: subl $0x1, %ecx 0.00 : ffffffff81baa0ba: callq 0xffffffff81149340 <_printk> # data-type: (stack operation) 0.00 : ffffffff81baa0bf: movq 0x20(%rsp), %rsi 0.00 : ffffffff81baa0c4: movq $-0x7ddb8c7e, %rdi 0.00 : ffffffff81baa0cb: callq 0xffffffff81149340 <_printk> # data-type: (stack operation) 0.00 : ffffffff81baa0d0: callq 0xffffffff81b7ab60 <dump_stack> # data-type: (stack operation) 0.00 : ffffffff81baa0d5: decl %gs:0x7e48886c(%rip) # 0x32948 <pcpu_hot+0x8> # data-type: struct pcpu_hot +0x8 (preempt_count) 0.00 : ffffffff81baa0dc: jmp 0xffffffff81baa01d <check_preemption_disabled+0x1d> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250310224925.799005-8-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-13 00:19:51 -07:00
Namhyung Kim	30c5a3941d	perf annotate: Implement code + data type annotation Sometimes it's useful to see both instructions and their data type together. Let's extend the annotate code to use data type profiling functions. To make it easy to pass more argument, introduce a struct to carry necessary information together. Also add a new annotation_option called 'code_with_type' to control the behavior. This is not enabled yet but it'll be set later from the command line. For simplicity, this is implemented for --stdio only. Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250310224925.799005-7-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-13 00:19:51 -07:00
Namhyung Kim	236ee2569a	perf annotate: Factor out __hist_entry__get_data_type() So that it can only handle a single disasm_line and hopefully make the code simpler. This is also a preparation to be called from different places later. The NO_TYPE macro was added to distinguish when it failed or needs retry. Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250310224925.799005-6-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-13 00:19:51 -07:00
Namhyung Kim	fe8da6692a	perf annotate: Pass hist_entry to annotate functions It's a preparation to support code annotation and data type annotation at the same time. Data type annotation needs more information in the hist_entry so it needs to be passed deeper. Also rename a function with the same name in the builtin-annotate.c to hist_entry__stdio_annotate since it matches better to the command line option. And change the condition inside to be simpler. Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250310224925.799005-5-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-13 00:19:51 -07:00
Namhyung Kim	9aa3cbbffb	perf annotate: Pass annotation_options to annotation_line__print() The annotation_line__print() has many arguments. But min_percent, max_lines and percent_type are from struct annotation_options. So let's pass a pointer to the option instead of passing them separately to reduce the number of function arguments. Actually it has a recursive call if 'queue' is set. Add a new option instance to pass different values for the case. Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250310224925.799005-4-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-13 00:19:51 -07:00
Namhyung Kim	1f284082b1	perf annotate: Remove unused len parameter from annotation_line__print() It's not used anywhere, let's get rid of it. Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250310224925.799005-3-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-13 00:19:51 -07:00
Namhyung Kim	ce2289ad0a	perf annotate-data: Add annotated_data_type__get_member_name() Factor out a function to get the name of member field at the given offset. This will be used in other places. Also update the output of typeoff sort key a little bit. As we know that some special types like (stack operation), (stack canary) and (unknown) won't have fields, skip printing the offset and field. For example, the following change is expected. "(stack operation) +0 (no field)" ==> "(stack operation)" Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250310224925.799005-2-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-13 00:19:51 -07:00
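The typeoff output change described in ce2289ad0a can be pictured with a small model. This is only a sketch of the formatting rule stated in the commit message, not the perf implementation: the function name `format_typeoff`, its parameters, and the `SPECIAL_TYPES` set are hypothetical names chosen for illustration, and the zero-offset spelling `+0` is taken from the sample output shown elsewhere in this log.

```python
# Hypothetical model of the typeoff formatting rule; not perf's actual code.
# The three special type names come from the commit message.
SPECIAL_TYPES = {"(stack operation)", "(stack canary)", "(unknown)"}


def format_typeoff(type_name, offset, member_name=None):
    """Format a data type, offset and member field as one string.

    Special types have no fields, so the offset and field are skipped.
    """
    if type_name in SPECIAL_TYPES:
        return type_name
    # Offset 0 is printed as "+0", other offsets in hex, as in the log output.
    off = format(offset, "#x") if offset else "0"
    field = member_name if member_name is not None else "no field"
    return f"{type_name} +{off} ({field})"


if __name__ == "__main__":
    print(format_typeoff("(stack operation)", 0))
    print(format_typeoff("struct pcpu_hot", 0xC, "cpu_number"))
```

Under this model a special type collapses to its bare name, while an ordinary struct keeps its offset and field.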
Namhyung Kim	e1cde2d5e9	perf ftrace: Use atomic inc to update histogram in BPF It should use an atomic instruction to update even if the histogram is keyed by delta as it's also used for stats. Cc: Gabriele Monaco <gmonaco@redhat.com> Link: https://lore.kernel.org/r/20250227191223.1288473-3-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-13 00:18:10 -07:00
Namhyung Kim	79056b3fe8	perf ftrace: Remove an unnecessary condition check in BPF The bucket_num is set based on the {max,min}_latency already in cmd_ftrace(), so no need to check it again in BPF. Also I found that it didn't pass the max_latency to BPF. :) No functional changes intended. Cc: Gabriele Monaco <gmonaco@redhat.com> Link: https://lore.kernel.org/r/20250227191223.1288473-2-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-13 00:18:10 -07:00
Namhyung Kim	9c33441418	perf ftrace: Fix latency stats with BPF When BPF collects the stats for the latency in usec, it first divides the time by 1000. But that means it would have 0 if the delta is small and won't update the total time properly. Let's keep the stats in nsec always and adjust to usec before printing. Before: $ sudo ./perf ftrace latency -ab -T mutex_lock --hide-empty -- sleep 0.1 # DURATION \| COUNT \| GRAPH \| 0 - 1 us \| 765 \| ############################################# \| 1 - 2 us \| 10 \| \| 2 - 4 us \| 2 \| \| 4 - 8 us \| 5 \| \| # statistics (in usec) total time: 0 <<<--- (here) avg time: 0 max time: 6 min time: 0 count: 782 After: $ sudo ./perf ftrace latency -ab -T mutex_lock --hide-empty -- sleep 0.1 # DURATION \| COUNT \| GRAPH \| 0 - 1 us \| 880 \| ############################################ \| 1 - 2 us \| 13 \| \| 2 - 4 us \| 8 \| \| 4 - 8 us \| 3 \| \| # statistics (in usec) total time: 268 <<<--- (here) avg time: 0 max time: 6 min time: 0 count: 904 Tested-by: Athira Rajeev <atrajeev@linux.ibm.com> Cc: Gabriele Monaco <gmonaco@redhat.com> Link: https://lore.kernel.org/r/20250227191223.1288473-1-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-13 00:18:10 -07:00
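The rounding problem fixed in 9c33441418 is plain integer arithmetic and can be reproduced outside BPF. The sketch below is an illustration under assumptions: the function names and the sample deltas are made up, and it models only the accumulation order (divide each sample by 1000 first, versus accumulate in nsec and convert once before printing), not the BPF program itself.

```python
# Illustrative model of the accumulation order; names and data are hypothetical.

def total_usec_divide_first(deltas_ns):
    """Old behaviour as described: convert each delta to usec, then sum.

    Any delta below 1000 ns truncates to 0 and is lost from the total.
    """
    return sum(delta // 1000 for delta in deltas_ns)


def total_usec_accumulate_nsec(deltas_ns):
    """New behaviour as described: keep the total in nsec, convert at the end."""
    return sum(deltas_ns) // 1000


if __name__ == "__main__":
    # Mostly sub-microsecond samples, similar in spirit to the 0 - 1 us bucket.
    deltas = [300, 450, 900, 1200, 6000]
    print(total_usec_divide_first(deltas))     # per-sample truncation
    print(total_usec_accumulate_nsec(deltas))  # single conversion
```

With many sub-microsecond samples the divide-first total stays near zero, which matches the "total time: 0" line in the Before output.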
Ian Rogers	5b562763d7	perf test stat: Additional topdown grouping tests Add a loop and helper function to avoid repetition, the loop uses arrays so switch the shell to bash. Add additional topdown group tests where a topdown event needs to be moved beyond others and the slots event isn't first in the target group. This replicates issues that occur on hybrid systems where the other events are for the cpu_atom PMU. Test with both PMU and software events. Place the slots event later in the event list. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250307023906.1135613-5-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-11 19:05:04 -07:00
Dapeng Mi	16dd43dfd6	perf x86 evlist: Update comments on topdown regrouping Update to remove comments about groupings not working and with the: ``` perf stat -e "{instructions,slots},{cycles,topdown-retiring}" ``` case that now works. Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250307023906.1135613-4-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-11 19:04:56 -07:00
Ian Rogers	9a1c57fe26	perf parse-events: Corrections to topdown sorting In the case of '{instructions,slots},faults,topdown-retiring' the first event that must be grouped, slots, is ignored causing the topdown-retiring event not to be adjacent to the group it needs to be inserted into. Don't ignore the group members when computing the force_grouped_index. Make the force_grouped_index be for the leader of the group it is within and always use it first rather than a group leader index so that topdown events may be sorted from one group into another. As the PMU name comparison applies to moving events in the same group ensure the name ordering is always respected. Change the group splitting logic to not group if there are no other topdown events and to fix cases where the force group leader wasn't being grouped with the other members of its group. Reported-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Closes: https://lore.kernel.org/lkml/20250224083306.71813-2-dapeng1.mi@linux.intel.com/ Closes: https://lore.kernel.org/lkml/f7e4f7e8-748c-4ec7-9088-0e844392c11a@linux.intel.com/ Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://lore.kernel.org/r/20250307023906.1135613-3-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-11 19:00:50 -07:00
Dapeng Mi	b74683b3bb	perf x86/topdown: Fix topdown leader sampling test error on hybrid When running the topdown leader sampling test on Intel hybrid platforms, such as LNL/ARL, we see the below error. Topdown leader sampling test Topdown leader sampling [Failed topdown events not reordered correctly] It indicates the below command fails. perf record -o "${perfdata}" -e "{instructions,slots,topdown-retiring}:S" true The root cause is that the perf tool creates a perf event for each PMU type if it can. As for this command, there would be 5 perf events created, cpu_atom/instructions/,cpu_atom/topdown_retiring/, cpu_core/slots/,cpu_core/instructions/,cpu_core/topdown-retiring/ For these 5 events, the 2 cpu_atom events are in a group and the other 3 cpu_core events are in another group. When arch_topdown_sample_read() traverses all these 5 events, events cpu_atom/instructions/ and cpu_core/slots/ don't have the same group leader, so it returns false directly, which leads to the cpu_core/slots/ event being used to sample, and this is not allowed by the PMU driver. It's overkill to return false directly if "evsel->core.leader != leader->core.leader" since there could be multiple groups in the event list. Just "continue" instead of "return false" to fix this issue. Fixes: `1e53e9d178` ("perf x86/topdown: Correct leader selection with sample_read enabled") Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Tested-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250307023906.1135613-2-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-11 19:00:50 -07:00
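The "continue" versus "return false" change in b74683b3bb can be modelled on the five-event list from the commit message. This is a simplified sketch, not the body of arch_topdown_sample_read(): the `Event` class, its fields, both function names, and the `is_topdown_metric` flag values are assumptions made for illustration; only the event names, the grouping, and the leader comparison come from the commit message.

```python
# Simplified model of the traversal; types, fields and flags are hypothetical.
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    name: str
    leader: str  # name of the group leader this event belongs to
    is_topdown_metric: bool = False


def sample_read_return_early(events, leader):
    """Old logic as described: bail out on the first event from another group."""
    for ev in events:
        if ev.leader != leader.leader:
            return False
        if ev is not leader and ev.is_topdown_metric:
            return True
    return False


def sample_read_skip_other_groups(events, leader):
    """Fixed logic as described: skip events that belong to other groups."""
    for ev in events:
        if ev.leader != leader.leader:
            continue
        if ev is not leader and ev.is_topdown_metric:
            return True
    return False


# Event list and grouping as listed in the commit message.
SLOTS = Event("cpu_core/slots/", "cpu_core/slots/")
EVENTS = [
    Event("cpu_atom/instructions/", "cpu_atom/instructions/"),
    Event("cpu_atom/topdown_retiring/", "cpu_atom/instructions/"),
    SLOTS,
    Event("cpu_core/instructions/", "cpu_core/slots/"),
    Event("cpu_core/topdown-retiring/", "cpu_core/slots/", is_topdown_metric=True),
]

if __name__ == "__main__":
    print(sample_read_return_early(EVENTS, SLOTS))
    print(sample_read_skip_other_groups(EVENTS, SLOTS))
```

In this model the early return gives up at cpu_atom/instructions/ before the cpu_core group is ever inspected, while skipping foreign groups lets the scan reach cpu_core/topdown-retiring/.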
Ian Rogers	fd5de637a4	perf tools: Improve handling of hybrid PMUs in perf_event_attr__fprintf Support the PMU name from the legacy hardware and hw_cache PMU extended types. Remove some macros and make variables more intention revealing, rather than just being called "value". Before: ``` $ perf stat -vv -e instructions true ... ------------------------------------------------------------ perf_event_attr: type 0 (PERF_TYPE_HARDWARE) size 136 config 0xa00000001 sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED\|TOTAL_TIME_RUNNING disabled 1 inherit 1 enable_on_exec 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid 181636 cpu -1 group_fd -1 flags 0x8 = 5 ------------------------------------------------------------ perf_event_attr: type 0 (PERF_TYPE_HARDWARE) size 136 config 0x400000001 sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED\|TOTAL_TIME_RUNNING disabled 1 inherit 1 enable_on_exec 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid 181636 cpu -1 group_fd -1 flags 0x8 = 6 ... ``` After: ``` $ perf stat -vv -e instructions true ... ------------------------------------------------------------ perf_event_attr: type 0 (PERF_TYPE_HARDWARE) size 136 config 0xa00000001 (cpu_atom/PERF_COUNT_HW_INSTRUCTIONS/) sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED\|TOTAL_TIME_RUNNING disabled 1 inherit 1 enable_on_exec 1 ------------------------------------------------------------ sys_perf_event_open: pid 181724 cpu -1 group_fd -1 flags 0x8 = 5 ------------------------------------------------------------ perf_event_attr: type 0 (PERF_TYPE_HARDWARE) size 136 config 0x400000001 (cpu_core/PERF_COUNT_HW_INSTRUCTIONS/) sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED\|TOTAL_TIME_RUNNING disabled 1 inherit 1 enable_on_exec 1 ------------------------------------------------------------ sys_perf_event_open: pid 181724 cpu -1 group_fd -1 flags 0x8 = 6 ... 
``` Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Tested-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20250307023906.1135613-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-11 19:00:50 -07:00
Ian Rogers	f7cffbabf7	perf python tracepoint: Switch to using parse_events Rather than manually configuring an evsel, switch to using parse_events for greater commonality with the rest of the perf code. Reviewed-by: Howard Chu <howardchu95@gmail.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://lore.kernel.org/r/20250228222308.626803-12-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-11 18:55:38 -07:00
Ian Rogers	0dfcc7c86c	perf python: Add evlist.config to set up record options Add access to evlist__config that is used to configure an evlist with record options. Reviewed-by: Howard Chu <howardchu95@gmail.com> Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://lore.kernel.org/r/20250228222308.626803-11-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-11 18:55:38 -07:00
Ian Rogers	1a8356fbf8	perf python: Add evlist all_cpus accessor Add a means to get the reference counted all_cpus CPU map from an evlist in its python form. Reviewed-by: Howard Chu <howardchu95@gmail.com> Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://lore.kernel.org/r/20250228222308.626803-10-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-11 18:55:38 -07:00


17532 Commits