484 Commits

Author SHA1 Message Date
Ian Rogers
c4f4392264 perf header: Add e_machine/e_flags to the header
Add 64-bits of feature data to record the ELF machine and flags.

This allows readers to initialize based on the data.

For example, `perf kvm stat` wants to initialize based on the kind of
data to be read, but at initialization time there are no threads to base
this data upon and using the host means cross platform support won't
work.

The values in the perf_env also act as a cache for these within the
session.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Aditya Bodkhe <aditya.b1@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Jones <ajones@ventanamicro.com>
Cc: Anubhav Shelat <ashelat@redhat.com>
Cc: Anup Patel <anup@brainfault.org>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <pjw@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quan Zhou <zhouquan@iscas.ac.cn>
Cc: Shimin Guo <shimin.guo@skydio.com>
Cc: Swapnil Sapkal <swapnil.sapkal@amd.com>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yunseong Kim <ysk@kzalloc.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-03 18:01:27 -03:00
Swapnil Sapkal
2a85bbaed0 perf header: Replace hardcoded max cpus by MAX_NR_CPUS
cpumask and cpulist from cpu-domain header have hardcoded max_cpus value
of 1024.

Current systems have more cpus than this value. Replace it with
MAX_NR_CPUS.

Also define a macro to represent domain name length.

Fixes: d40c68a49f ("perf header: Support CPU DOMAIN relation info")
Reported-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Reviewed-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Signed-off-by: Swapnil Sapkal <swapnil.sapkal@amd.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anubhav Shelat <ashelat@redhat.com>
Cc: Chen Yu <yu.c.chen@intel.com>
Cc: Gautham Shenoy <gautham.shenoy@amd.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-01-28 15:18:11 -03:00
Ian Rogers
d84f24c898 perf header: Fix memory leaks in process_cpu_domain_info()
do_read_string() returns a string in allocated memory, for some reason
there was unused memory allocations and unnecessary strdups.

Remove these and make the "perf annotate basic tests" leak sanitizer
clean.

Fixes: d40c68a49f ("perf header: Support CPU DOMAIN relation info")
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Aditya Bodkhe <aditya.b1@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Julia Lawall <Julia.Lawall@inria.fr>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Krzysztof Łopatowski <krzysztof.m.lopatowski@gmail.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-csky@vger.kernel.org
Cc: linux-riscv@lists.infradead.org
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <pjw@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sergei Trofimovich <slyich@gmail.com>
Cc: Shimin Guo <shimin.guo@skydio.com>
Cc: Suchit Karunakaran <suchitkarunakaran@gmail.com>
Cc: Swapnil Sapkal <swapnil.sapkal@amd.com>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Tianyou Li <tianyou.li@intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Zecheng Li <zecheng@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-01-23 16:58:38 -03:00
Swapnil Sapkal
00093b3133 perf sched stats: Add support for live mode
The live mode works similar to simple `perf stat` command, by profiling
the target and printing results on the terminal as soon as the target
finishes.

Example usage:

  # perf sched stats -- true
  Description
  ----------------------------------------------------------------------------------------------------
  DESC                          -> Description of the field
  COUNT                         -> Value of the field
  PCT_CHANGE                    -> Percent change with corresponding base value
  AVG_JIFFIES                   -> Avg time in jiffies between two consecutive occurrence of event
  ----------------------------------------------------------------------------------------------------

  Time elapsed (in jiffies)                                        :           1
  ----------------------------------------------------------------------------------------------------
  CPU: <ALL CPUS SUMMARY>
  ----------------------------------------------------------------------------------------------------
  DESC                                                                     COUNT   PCT_CHANGE
  ----------------------------------------------------------------------------------------------------
  yld_count                                                        :           0
  array_exp                                                        :           0
  sched_count                                                      :           0
  sched_goidle                                                     :           0  (     0.00% )
  ttwu_count                                                       :           0
  ttwu_local                                                       :           0  (     0.00% )
  rq_cpu_time                                                      :       27875
  run_delay                                                        :           0  (     0.00% )
  pcount                                                           :           0
  ----------------------------------------------------------------------------------------------------
  CPU: <ALL CPUS SUMMARY> | DOMAIN: SMT
  ----------------------------------------------------------------------------------------------------
  DESC                                                                     COUNT    AVG_JIFFIES
  ----------------------------------------- <Category busy> ------------------------------------------
  busy_lb_count                                                    :           0  $        0.00 $
  busy_lb_balanced                                                 :           0  $        0.00 $
  busy_lb_failed                                                   :           0  $        0.00 $
  busy_lb_imbalance_load                                           :           0
  busy_lb_imbalance_util                                           :           0
  busy_lb_imbalance_task                                           :           0
  busy_lb_imbalance_misfit                                         :           0
  busy_lb_gained                                                   :           0
  busy_lb_hot_gained                                               :           0
  busy_lb_nobusyq                                                  :           0  $        0.00 $
  busy_lb_nobusyg                                                  :           0  $        0.00 $
  *busy_lb_success_count                                           :           0
  *busy_lb_avg_pulled                                              :        0.00

  ... and so on. Output will show similar data for all the cpus in the
system.

Co-developed-by: Ravi Bangoria <ravi.bangoria@amd.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Signed-off-by: Swapnil Sapkal <swapnil.sapkal@amd.com>
Tested-by: Chen Yu <yu.c.chen@intel.com>
Tested-by: James Clark <james.clark@linaro.org>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Anubhav Shelat <ashelat@redhat.com>
Cc: Ben Gainey <ben.gainey@arm.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: David Vernet <void@manifault.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Gautham Shenoy <gautham.shenoy@amd.com>
Cc: Graham Woodward <graham.woodward@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Madadi Vineeth Reddy <vineethr@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Shrikanth Hegde <sshegde@linux.ibm.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Yang Jihong <yangjihong@bytedance.com>
Cc: Yujie Liu <yujie.liu@intel.com>
Cc: Zhongqiu Han <quic_zhonhan@quicinc.com>
[ Avoid potentially using 'sv' uninitialized by calling free_cpu_domain_info() only when build_cpu_domain_map() is called ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-01-22 12:44:03 -03:00
Swapnil Sapkal
d40c68a49f perf header: Support CPU DOMAIN relation info
The '/proc/schedstat' file gives info about load balancing statistics
within a given domain.

It also contains the cpu_mask giving information about the sibling cpus
and domain names after schedstat version 17.

Storing this information in perf header will help tools like `perf sched
stats` for better analysis.

Signed-off-by: Swapnil Sapkal <swapnil.sapkal@amd.com>
Tested-by: Chen Yu <yu.c.chen@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Anubhav Shelat <ashelat@redhat.com>
Cc: Ben Gainey <ben.gainey@arm.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: David Vernet <void@manifault.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Gautham Shenoy <gautham.shenoy@amd.com>
Cc: Graham Woodward <graham.woodward@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Madadi Vineeth Reddy <vineethr@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Shrikanth Hegde <sshegde@linux.ibm.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Yang Jihong <yangjihong@bytedance.com>
Cc: Yujie Liu <yujie.liu@intel.com>
Cc: Zhongqiu Han <quic_zhonhan@quicinc.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-01-21 21:55:09 -03:00
Linus Torvalds
9e906a9dea Merge tag 'perf-tools-for-v6.19-2025-12-06' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools updates from Namhyung Kim:
 "Perf event/metric description:

  Unify all event and metric descriptions in JSON format. Now event
  parsing and handling is greatly simplified by that.

  From users point of view, perf list will provide richer information
  about hardware events like the following.

    $ perf list hw

    List of pre-defined events (to be used in -e or -M):

    legacy hardware:
      branch-instructions
           [Retired branch instructions [This event is an alias of branches]. Unit: cpu]
      branch-misses
           [Mispredicted branch instructions. Unit: cpu]
      branches
           [Retired branch instructions [This event is an alias of branch-instructions]. Unit: cpu]
      bus-cycles
           [Bus cycles,which can be different from total cycles. Unit: cpu]
      cache-misses
           [Cache misses. Usually this indicates Last Level Cache misses; this is intended to be used in conjunction with the
            PERF_COUNT_HW_CACHE_REFERENCES event to calculate cache miss rates. Unit: cpu]
      cache-references
           [Cache accesses. Usually this indicates Last Level Cache accesses but this may vary depending on your CPU. This may include
            prefetches and coherency messages; again this depends on the design of your CPU. Unit: cpu]
      cpu-cycles
           [Total cycles. Be wary of what happens during CPU frequency scaling [This event is an alias of cycles]. Unit: cpu]
      cycles
           [Total cycles. Be wary of what happens during CPU frequency scaling [This event is an alias of cpu-cycles]. Unit: cpu]
      instructions
           [Retired instructions. Be careful,these can be affected by various issues,most notably hardware interrupt counts. Unit: cpu]
      ref-cycles
           [Total cycles; not affected by CPU frequency scaling. Unit: cpu]

  But most notable changes would be in the perf stat. On the right side,
  the default metrics are better named and aligned. :)

    $ perf stat -- perf test -w noploop

     Performance counter stats for 'perf test -w noploop':

                    11      context-switches                 #     10.8 cs/sec  cs_per_second
                     0      cpu-migrations                   #      0.0 migrations/sec  migrations_per_second
                 3,612      page-faults                      #   3532.5 faults/sec  page_faults_per_second
              1,022.51 msec task-clock                       #      1.0 CPUs  CPUs_utilized
               110,466      branch-misses                    #      0.0 %  branch_miss_rate         (88.66%)
         6,934,452,104      branches                         #   6781.8 M/sec  branch_frequency     (88.66%)
         4,657,032,590      cpu-cycles                       #      4.6 GHz  cycles_frequency       (88.65%)
        27,755,874,218      instructions                     #      6.0 instructions  insn_per_cycle  (89.03%)
                            TopdownL1                        #      0.3 %  tma_backend_bound
                                                             #      9.3 %  tma_bad_speculation      (89.05%)
                                                             #      9.7 %  tma_frontend_bound       (77.86%)
                                                             #     80.7 %  tma_retiring             (88.81%)

           1.025318171 seconds time elapsed

           1.013248000 seconds user
           0.012014000 seconds sys

  Deferred unwinding support:

  With the kernel support (commit c69993ecdd: "perf: Support deferred
  user unwind"), perf can use deferred callchains for userspace stack
  trace with frame pointers like below:

    $ perf record --call-graph fp,defer ...

  This will be transparent to users when it comes to other commands like
  perf report and perf script. They will merge the deferred callchains
  to the previous samples as if they were collected together.

  ARM SPE updates

   - Extensive enhancements to support various kinds of memory
     operations including GCS, MTE allocation tags, memcpy/memset,
     register access, and SIMD operations.

   - Add inverted data source filter (inv_data_src_filter) support to
     exclude certain data sources.

   - Improve documentation.

  Vendor event updates:

   - Intel: Updated event files for Sierra Forest, Panther Lake, Meteor
     Lake, Lunar Lake, Granite Rapids, and others.

   - Arm64: Added metrics for i.MX94 DDR PMU and Cortex-A720AE
     definitions.

   - RISC-V: Added JSON support for T-HEAD C920V2.

  Misc:

   - Improve pointer tracking in data type profiling. It'd give better
     output when the variable is using container_of() to convert type.

   - Annotation support for perf c2c report in TUI. Press 'a' key to
     enter annotation view from cacheline browser window. This will show
     which instruction is causing the cacheline contention.

   - Lots of fixes and test coverage improvements!"

* tag 'perf-tools-for-v6.19-2025-12-06' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (214 commits)
  libperf: Use 'extern' in LIBPERF_API visibility macro
  perf stat: Improve handling of termination by signal
  perf tests stat: Add test for error for an offline CPU
  perf stat: When no events, don't report an error if there is none
  perf tests stat: Add "--null" coverage
  perf cpumap: Add "any" CPU handling to cpu_map__snprint_mask
  libperf cpumap: Fix perf_cpu_map__max for an empty/NULL map
  perf stat: Allow no events to open if this is a "--null" run
  perf test kvm: Add some basic perf kvm test coverage
  perf tests evlist: Add basic evlist test
  perf tests script dlfilter: Add a dlfilter test
  perf tests kallsyms: Add basic kallsyms test
  perf tests timechart: Add a perf timechart test
  perf tests top: Add basic perf top coverage test
  perf tests buildid: Add purge and remove testing
  perf tests c2c: Add a basic c2c
  perf c2c: Clean up some defensive gets and make asan clean
  perf jitdump: Fix missed dso__put
  perf mem-events: Don't leak online CPU map
  perf hist: In init, ensure mem_info is put on error paths
  ...
2025-12-07 07:07:02 -08:00
Ian Rogers
1a6b0deb2b perf header: Switch "cpu" for find_core_pmu in caps feature writing
Writing currently fails on non-x86 and hybrid CPUs. Switch to the more
regular find_core_pmu that is normally used in this case. Tested on
hybrid alderlake system.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-19 17:49:45 -08:00
Thomas Falcon
85c894a80a perf header: Write bpf_prog (infos|btfs)_cnt to data file
With commit f0d0f978f3 ("perf header: Don't write empty BPF/BTF
info"), the write_bpf_( prog_info() | btf() ) functions exit without
writing anything if env->bpf_prog.(infos| btfs)_cnt is zero.

process_bpf_( prog_info() | btf() ), however, still expect a "count"
value to exist in the data file. If btf information is empty, for
example, process_bpf_btf will read garbage or some other data as the
number of btf nodes in the data file. As a result, the data file will
not be processed correctly.

Instead, write the count to the data file and exit if it is zero.

Fixes: f0d0f978f3 ("perf header: Don't write empty BPF/BTF info")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Thomas Falcon <thomas.falcon@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-11-13 17:16:33 -03:00
Ian Rogers
71062e282d perf tool: Add the perf_tool argument to all callbacks
Getting context for what a tool is doing, such as the perf_inject
instance, using container_of the tool is a common pattern in the
code. This isn't possible event_op2, event_op3 and event_op4 callbacks
as the tool isn't passed. Add the argument and then fix function
signatures to match. As tools maybe reading a tool from somewhere
else, change that code to use the passed in tool.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-07 13:25:05 -08:00
Namhyung Kim
6bd89ae7d1 perf record: Make sure to update build-ID cache
Recent change on enabling --buildid-mmap by default brought an issue
with build-id handling.  With build-ID in MMAP2 records, we don't need
to save the build-ID table in the header of a perf data file.

But the actual file contents still need to be cached in the debug
directory for annotation etc.  Split the build-ID header processing and
caching and make sure perf record to save hit DSOs in the build-ID cache
by moving perf_session__cache_build_ids() to the end of the record__
finish_output().

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-06 17:29:16 -08:00
Ian Rogers
b743a1368d perf header: Clean up use of perf_env
Always use the perf_env from the feat_fd's perf_header. Cache the
value on entry to a function in `env` and use `env->` consistently in
the code. Ensure the header is initialized for use in
perf_session__do_write_header.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250724163302.596743-12-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-07-25 10:37:57 -07:00
Ian Rogers
57ddb9cbb5 perf evlist: Change env variable to session
The session holds a perf_env pointer env. In UI code container_of is
used to turn the env to a session, but this assumes the session
header's env is in use. Rather than a dubious container_of, hold the
session in the evlist and derive the env from the session with
evsel__env, perf_session__env, etc.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250724163302.596743-11-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-07-25 10:37:56 -07:00
Ian Rogers
fccaaf6fbb perf build-id: Change sprintf functions to snprintf
Pass in a size argument rather than implying all build id strings must
be SBUILD_ID_SIZE.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250724163302.596743-4-irogers@google.com
[ fixed some build errors ]
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-07-25 10:37:13 -07:00
Ian Rogers
6d765f5f7e libperf evsel: Rename own_cpus to pmu_cpus
own_cpus is generally the cpumask from the PMU. Rename to pmu_cpus to
try to make this clearer. Variable rename with no other changes.

Reviewed-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250719030517.1990983-7-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-07-24 13:41:35 -07:00
Ian Rogers
6c21316e52 perf header: Fix pipe mode header dumping
The pipe mode header dumping was accidentally removed when tracing of
header feature events in pipe mode was added.

Minor spelling tweak to header test failure message.

Fixes: 61051f9a84 ("perf header: In pipe mode dump features without --header/-I")
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20250703042000.2740640-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-07-03 11:22:17 -07:00
Ian Rogers
f0d0f978f3 perf header: Don't write empty BPF/BTF info
If there are no values in bpf_prog_info or bpf_btf feature don't write
the data into the header.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250617223356.2752099-4-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-25 12:37:41 -07:00
Ian Rogers
4d2eefd7fb perf header: Display message if BPF/BTF info is empty
The perf.data file may contain a bpf_prog_info or bpf_btf feature. If
the contents of these are empty then nothing is displayed. Rather than
display nothing and not account for the file space, display an empty
message.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250617223356.2752099-4-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-25 12:37:35 -07:00
Ian Rogers
57cbd56e2e perf header: Allow tracing of attr events
In pipe mode attr events capture the perf_event_attr. Allow their
dumping as they normally start the file.

Before:
```
$ perf record -o - -a sleep 1 | perf script -D -i -
. ... raw event: size 272 bytes
.  0000:  40 00 00 00 00 00 10 01 00 00 00 00 88 00 00 00  @...............
.  0010:  00 00 00 00 00 00 00 00 a0 0f 00 00 00 00 00 00  ................
.  0020:  87 01 01 00 00 00 00 00 14 00 00 00 00 00 00 00  ................
.  0030:  01 84 05 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
.  0040:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
.  0050:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
.  0060:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
.  0070:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
.  0080:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
.  0090:  91 08 00 00 00 00 00 00 92 08 00 00 00 00 00 00  ................
.  00a0:  93 08 00 00 00 00 00 00 94 08 00 00 00 00 00 00  ................
.  00b0:  95 08 00 00 00 00 00 00 96 08 00 00 00 00 00 00  ................
.  00c0:  97 08 00 00 00 00 00 00 98 08 00 00 00 00 00 00  ................
.  00d0:  99 08 00 00 00 00 00 00 9a 08 00 00 00 00 00 00  ................
.  00e0:  9b 08 00 00 00 00 00 00 9c 08 00 00 00 00 00 00  ................
.  00f0:  9d 08 00 00 00 00 00 00 9e 08 00 00 00 00 00 00  ................
.  0100:  9f 08 00 00 00 00 00 00 a0 08 00 00 00 00 00 00  ................

-1 -1 0 [0x110]: PERF_RECORD_ATTR
0x110@pipe [0x110]: event: 64
...
```

After:
```
$ perf record -o - -a sleep 1 | perf script -D -i -
0@pipe [0x110]: event: 64
.
. ... raw event: size 272 bytes
.  0000:  40 00 00 00 00 00 10 01 00 00 00 00 88 00 00 00  @...............
.  0010:  00 00 00 00 00 00 00 00 a0 0f 00 00 00 00 00 00  ................
.  0020:  87 01 01 00 00 00 00 00 14 00 00 00 00 00 00 00  ................
.  0030:  01 84 05 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
.  0040:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
.  0050:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
.  0060:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
.  0070:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
.  0080:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
.  0090:  5c 08 00 00 00 00 00 00 5d 08 00 00 00 00 00 00  \.......].......
.  00a0:  5e 08 00 00 00 00 00 00 5f 08 00 00 00 00 00 00  ^......._.......
.  00b0:  60 08 00 00 00 00 00 00 61 08 00 00 00 00 00 00  `.......a.......
.  00c0:  62 08 00 00 00 00 00 00 63 08 00 00 00 00 00 00  b.......c.......
.  00d0:  64 08 00 00 00 00 00 00 65 08 00 00 00 00 00 00  d.......e.......
.  00e0:  66 08 00 00 00 00 00 00 67 08 00 00 00 00 00 00  f.......g.......
.  00f0:  68 08 00 00 00 00 00 00 69 08 00 00 00 00 00 00  h.......i.......
.  0100:  6a 08 00 00 00 00 00 00 6b 08 00 00 00 00 00 00  j.......k.......

-1 -1 0 [0x110]: PERF_RECORD_ATTR, type = 0 (PERF_TYPE_HARDWARE), size = 136, config = 0 (PERF_COUNT_HW_CPU_CYCLES), { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|CPU|PERIOD|IDENTIFIER, read_format = ID|LOST, disabled = 1, freq = 1, precise_ip = 3, sample_id_all = 1
0x110@pipe [0x110]: event: 64
...
```

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250617223356.2752099-4-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-25 12:37:29 -07:00
Ian Rogers
61051f9a84 perf header: In pipe mode dump features without --header/-I
In pipe mode the header features are contained within events. While
other events dump details the header features only dump if --header or
-I are passed, which doesn't make sense as in pipe mode there is no
perf file header. Make the printing of the information conditional on
dump_trace as with other events.

Before:
```
$ perf record -o - -a sleep 1 | perf script -D -i -
...
0x2c8@pipe [0x54]: event: 80
.
. ... raw event: size 84 bytes
.  0000:  50 00 00 00 00 00 54 00 05 00 00 00 00 00 00 00  P.....T.........
.  0010:  40 00 00 00 36 2e 31 35 2e 72 63 37 2e 67 61 64  @...6.15.rc7.gad
.  0020:  32 61 36 39 31 63 39 39 66 62 00 00 00 00 00 00  2a691c99fb......
.  0030:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
.  0040:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
.  0050:  00 00 00 00                                      ....

0 0 0x2c8 [0x54]: PERF_RECORD_FEATURE
```

After:
```
$ perf record -o - -a sleep 1 | perf script -D -i -
...
0x2c8@pipe [0x54]: event: 80
.
. ... raw event: size 84 bytes
.  0000:  50 00 00 00 00 00 54 00 05 00 00 00 00 00 00 00  P.....T.........
.  0010:  40 00 00 00 36 2e 31 35 2e 72 63 37 2e 67 61 64  @...6.15.rc7.gad
.  0020:  32 61 36 39 31 63 39 39 66 62 00 00 00 00 00 00  2a691c99fb......
.  0030:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
.  0040:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
.  0050:  00 00 00 00                                      ....

0 0 0x2c8 [0x54]: PERF_RECORD_FEATURE, # perf version : 6.15.rc7.gad2a691c99fb
```

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250617223356.2752099-4-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-25 12:37:23 -07:00
Blake Jones
fdc3441f2d perf record: collect BPF metadata from new programs
This collects metadata for any BPF programs that were loaded during a
"perf record" run, and emits it at the end of the run.

Signed-off-by: Blake Jones <blakejones@google.com>
Link: https://lore.kernel.org/r/20250612194939.162730-4-blakejones@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-20 14:48:49 -07:00
Anubhav Shelat
13b38e6b80 perf header: remove unecessary core id test
It is possible for systems to have a greater socket id number than the
number of cpus present on a machine, so this test is obselete and should
be removed.

Signed-off-by: Anubhav Shelat <ashelat@redhat.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250618142921.4053400-2-ashelat@redhat.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-20 13:49:46 -07:00
Namhyung Kim
9daa05c84a perf report: Fix a memory leak for perf_env on AMD
The env.pmu_mapping can be leaked when it reads data from a pipe on AMD.
For a pipe data, it reads the header data including pmu_mapping from
PERF_RECORD_HEADER_FEATURE runtime.  But it's already set in:

  perf_session__new()
    __perf_session__new()
      evlist__init_trace_event_sample_raw()
        evlist__has_amd_ibs()
          perf_env__nr_pmu_mappings()

Then it'll overwrite that when it processes the HEADER_FEATURE record.
Here's a report from address sanitizer.

  Direct leak of 2689 byte(s) in 1 object(s) allocated from:
    #0 0x7fed8f814596 in realloc ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:98
    #1 0x5595a7d416b1 in strbuf_grow util/strbuf.c:64
    #2 0x5595a7d414ef in strbuf_init util/strbuf.c:25
    #3 0x5595a7d0f4b7 in perf_env__read_pmu_mappings util/env.c:362
    #4 0x5595a7d12ab7 in perf_env__nr_pmu_mappings util/env.c:517
    #5 0x5595a7d89d2f in evlist__has_amd_ibs util/amd-sample-raw.c:315
    #6 0x5595a7d87fb2 in evlist__init_trace_event_sample_raw util/sample-raw.c:23
    #7 0x5595a7d7f893 in __perf_session__new util/session.c:179
    #8 0x5595a7b79572 in perf_session__new util/session.h:115
    #9 0x5595a7b7e9dc in cmd_report builtin-report.c:1603
    #10 0x5595a7c019eb in run_builtin perf.c:351
    #11 0x5595a7c01c92 in handle_internal_command perf.c:404
    #12 0x5595a7c01deb in run_argv perf.c:448
    #13 0x5595a7c02134 in main perf.c:556
    #14 0x7fed85833d67 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

Let's free the existing pmu_mapping data if any.

Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250311000416.817631-1-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-24 16:22:06 -07:00
Ian Rogers
d6cd7c9f02 perf header: Switch mem topology to io_dir__readdir
Switch memory_node__read and build_mem_topology from opendir/readdir
to io_dir__readdir, with smaller stack allocations. Reduces peak
memory consumption of perf record by 10kb.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250222061015.303622-5-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-24 15:46:33 -08:00
Zhongqiu Han
a7da6c7030 perf header: Fix one memory leakage in process_bpf_prog_info()
Function __perf_env__insert_bpf_prog_info() will return without inserting
bpf prog info node into perf env again due to a duplicate bpf prog info
node insertion, causing the temporary info_linear and info_node memory to
leak. Modify the return type of this function to bool and add a check to
ensure the memory is freed if the function returns false.

Fixes: 606f972b13 ("perf bpf: Save bpf_prog_info information as headers to perf.data")
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Zhongqiu Han <quic_zhonhan@quicinc.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20241205084500.823660-3-quic_zhonhan@quicinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-10 16:59:32 -03:00
Zhongqiu Han
875d22980a perf header: Fix one memory leakage in process_bpf_btf()
If __perf_env__insert_btf() returns false due to a duplicate btf node
insertion, the temporary node will leak. Add a check to ensure the memory
is freed if the function returns false.

Fixes: a70a112317 ("perf bpf: Save BTF information as headers to perf.data")
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Zhongqiu Han <quic_zhonhan@quicinc.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20241205084500.823660-2-quic_zhonhan@quicinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-10 16:59:32 -03:00
Ian Rogers
494c403ff1 perf header: Pass a perf_cpu rather than a PMU to get_cpuid_str
On ARM the cpuid is dependent on the core type of the CPU in
question. The PMU was passed for the sake of the CPU map but this
means in places a temporary PMU is created just to pass a CPU
value. Just pass the CPU and fix up the callers.

As there are no longer PMU users in header.h, shuffle forward
declarations earlier to work around build failures.

Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Xu Yang <xu.yang_2@nxp.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ben Zong-You Xie <ben717@andestech.com>
Cc: Benjamin Gray <bgray@linux.ibm.com>
Cc: Bibo Mao <maobibo@loongson.cn>
Cc: Clément Le Goffic <clement.legoffic@foss.st.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-riscv@lists.infradead.org
Link: https://lore.kernel.org/r/20241107162035.52206-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-11-16 16:40:30 -03:00
Ian Rogers
cec0d6572a perf header: Refactor get_cpuid to take a CPU for ARM
ARM BIG.little has no notion of a constant CPUID for both core
types. To reflect this reality, change the get_cpuid function to also
pass in a possibly unused logical cpu.

If the dummy value (-1) is passed in then ARM can, as currently happens,
select the first logical CPU's "CPUID".

The changes to ARM getcpuid happen in a follow up change.

Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Xu Yang <xu.yang_2@nxp.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ben Zong-You Xie <ben717@andestech.com>
Cc: Benjamin Gray <bgray@linux.ibm.com>
Cc: Bibo Mao <maobibo@loongson.cn>
Cc: Clément Le Goffic <clement.legoffic@foss.st.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-riscv@lists.infradead.org
Link: https://lore.kernel.org/r/20241107162035.52206-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-11-16 16:37:54 -03:00
Ian Rogers
c6fafe36ba perf header: Move is_cpu_online to numa bench
The helper function is only used in the NUMA benchmark as typically
online CPUs are determined through perf_cpu_map__new_online_cpus().

Reduce the scope of the function for now.

Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Xu Yang <xu.yang_2@nxp.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ben Zong-You Xie <ben717@andestech.com>
Cc: Benjamin Gray <bgray@linux.ibm.com>
Cc: Bibo Mao <maobibo@loongson.cn>
Cc: Clément Le Goffic <clement.legoffic@foss.st.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-riscv@lists.infradead.org
Link: https://lore.kernel.org/r/20241107162035.52206-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-11-16 16:36:47 -03:00
Yicong Yang
35de42cdfb perf build: Include libtraceevent headers directly indicated by pkg-config
Currently the libtraceevent's found by pkg-config, which give the
include path as:

  [root@localhost tmp]# pkg-config --cflags libtraceevent
  -I/usr/local/include/traceevent

So we should include the libtraceevent headers directly without
"traceevent/" prefix. Update all the users.

Fixes: 0f0e1f4456 ("perf build: Use pkg-config for feature check for libtrace{event,fs}")
Suggested-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/linux-perf-users/ZyF5_Hf1iL01kldE@google.com/
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Cc: leo.yan@arm.com
Cc: amadio@gentoo.org
Cc: linuxarm@huawei.com
Link: https://lore.kernel.org/r/20241105105649.45399-1-yangyicong@huawei.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-11-08 22:42:57 -08:00
Ian Rogers
2d57c32b32 perf header: Remove repipe option
No longer used by `perf inject` the repipe_fd is always -1 and repipe
is always false. Remove the options and associated code knowing the
constant values of the removed variables.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20240829150154.37929-8-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-30 09:24:27 -03:00
Ian Rogers
89d64e7273 perf inject: Overhaul handling of pipe files
Previously inject->is_pipe was set if the input or output were a
pipe. Determining the input was a pipe had to be done prior to
starting the session and opening the file. This was done by comparing
the input file name with '-' but it fails if the pipe file is written
to disk.

Opening a pipe file from disk will correctly set perf_data.is_pipe, but
this is too late for 'perf inject' and results in a broken file. A
workaround is 'cat pipe_perf|perf inject -i - ...'.

This change removes inject->is_pipe and changes the dependent
conditions to use the is_pipe flag on the input
(inject->session->data) and output files (inject->output). This
ensures the is_pipe condition reflects things like the header being
read.

The change removes the use of perf file header repiping, that is
writing the file header out while reading it in. The case of input
pipe and output file cannot repipe as the attributes for the file are
unknown. To resolve this, write the file header when writing to disk
and as the attributes may be unknown, write them after the data.

Update sessions repipe variable to be trace_event_repipe as those are
the only events now impacted by it. Update __perf_session__new as the
repipe_fd no longer needs passing. Fully removing repipe from session
header reading will be done in a later change.

Committer testing:

  root@number:~# perf record -e syscalls:sys_enter_*sleep/max-stack=4/ -o - sleep 0.01 | perf report -i -
  # To display the perf.data header info, please use --header/--header-only options.
  #
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.050 MB - ]
  #
  # Total Lost Samples: 0
  #
  # Samples: 1  of event 'syscalls:sys_enter_clock_nanosleep'
  # Event count (approx.): 1
  #
  # Overhead  Command  Shared Object  Symbol
  # ........  .......  .............  ...............................
  #
     100.00%  sleep    libc.so.6      [.] clock_nanosleep@GLIBC_2.2.5
              |
              ---__libc_start_main@@GLIBC_2.34
                 __libc_start_call_main
                 0x562fc2560a9f
                 clock_nanosleep@GLIBC_2.2.5

  #
  # (Tip: Create an archive with symtabs to analyse on other machine: perf archive)
  #
  root@number:~# perf record -e syscalls:sys_enter_*sleep/max-stack=4/ -o - sleep 0.01 > pipe.data
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.050 MB - ]
  root@number:~# perf report --stdio -i pipe.data
  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 1  of event 'syscalls:sys_enter_clock_nanosleep'
  # Event count (approx.): 1
  #
  # Overhead  Command  Shared Object  Symbol
  # ........  .......  .............  ...............................
  #
     100.00%  sleep    libc.so.6      [.] clock_nanosleep@GLIBC_2.2.5
              |
              ---__libc_start_main@@GLIBC_2.34
                 __libc_start_call_main
                 0x55f775975a9f
                 clock_nanosleep@GLIBC_2.2.5

  #
  # (Tip: To set sampling period of individual events use perf record -e cpu/cpu-cycles,period=100001/,cpu/branches,period=10001/ ...)
  #
  root@number:~#

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20240829150154.37929-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-30 09:23:51 -03:00
Ian Rogers
e9a7053da3 perf header: Allow attributes to be written after data
With a file, to write data an offset needs to be known. Typically data
follows the event attributes in a file.

However, if processing a pipe the number of event attributes may not be
known.

It is convenient in that case to write the attributes after the data.

Expand perf_session__do_write_header() to allow this when the data
offset and size are known.

This approach may be useful for more than just taking a pipe file to
write into a data file, `perf inject --itrace` will reserve and
additional 8kb for attributes, which would be unnecessary if the
attributes were written after the data.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20240829150154.37929-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-29 16:17:43 -03:00
Ian Rogers
10df481fda perf header: Fail read if header sections overlap
Buggy perf.data files can have the attributes and data
overlapping.

For example, when processing pipe data the attributes aren't known and
so file offset header calculations can consider them not present.

Later this can cause the attributes to overwrite the data. This can be
seen in:

  $ perf record -o - true > a.data
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.059 MB - ]
  $ perf inject -i a.data -o b.data
  $ perf report --stats -i b.data
  0x68 [0]: failed to process type: 510379 [Invalid argument]
  Error:
  failed to process sample
  $

This change makes reading the corrupt file fail:

  $ perf report --stats -i b.data
  Perf file header corrupt: Attributes and data overlap
  incompatible file format (rerun with -v to learn more)
  $

Which is more informative.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20240829150154.37929-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-29 16:15:29 -03:00
Ian Rogers
30f29bae91 perf tool: Constify tool pointers
The tool pointer (to a struct largely of function pointers) is passed
around but is unchanged except at initialization. Change parameter and
variable types to be const to lower the possibilities of what could
happen with a tool.

Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Leo Yan <leo.yan@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20240812204720.631678-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-12 18:05:14 -03:00
Ian Rogers
ee756ef749 perf dso: Add reference count checking and accessor functions
Add reference count checking to struct dso, this can help with
implementing correct reference counting discipline. To avoid
RC_CHK_ACCESS everywhere, add accessor functions for the variables in
struct dso.

The majority of the change is mechanical in nature and not easy to
split up.

Committer testing:

'perf test' up to this patch shows no regressions.

But:

  util/symbol.c: In function ‘dso__load_bfd_symbols’:
  util/symbol.c:1683:9: error: too few arguments to function ‘dso__set_adjust_symbols’
   1683 |         dso__set_adjust_symbols(dso);
        |         ^~~~~~~~~~~~~~~~~~~~~~~
  In file included from util/symbol.c:21:
  util/dso.h:268:20: note: declared here
    268 | static inline void dso__set_adjust_symbols(struct dso *dso, bool val)
        |                    ^~~~~~~~~~~~~~~~~~~~~~~
  make[6]: *** [/home/acme/git/perf-tools-next/tools/build/Makefile.build:106: /tmp/tmp.ZWHbQftdN6/util/symbol.o] Error 1
    MKDIR   /tmp/tmp.ZWHbQftdN6/tests/workloads/
  make[6]: *** Waiting for unfinished jobs....

This was updated:

  -       symbols__fixup_end(&dso->symbols, false);
  -       symbols__fixup_duplicate(&dso->symbols);
  -       dso->adjust_symbols = 1;
  +       symbols__fixup_end(dso__symbols(dso), false);
  +       symbols__fixup_duplicate(dso__symbols(dso));
  +       dso__set_adjust_symbols(dso);

But not build tested with BUILD_NONDISTRO and libbfd devel files installed
(binutils-devel on fedora).

Add the missing argument:

   	symbols__fixup_end(dso__symbols(dso), false);
   	symbols__fixup_duplicate(dso__symbols(dso));
  -	dso__set_adjust_symbols(dso);
  +	dso__set_adjust_symbols(dso, true);

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ben Gainey <ben.gainey@arm.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Chengen Du <chengen.du@canonical.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Li Dong <lidong@vivo.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paran Lee <p4ranlee@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: zhaimingbing <zhaimingbing@cmss.chinamobile.com>
Link: https://lore.kernel.org/r/20240504213803.218974-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-06 15:28:49 -03:00
Ian Rogers
9c51f8788b perf env: Avoid recursively taking env->bpf_progs.lock
Add variants of perf_env__insert_bpf_prog_info(), perf_env__insert_btf()
and perf_env__find_btf prefixed with __ to indicate the
env->bpf_progs.lock is assumed held.

Call these variants when the lock is held to avoid recursively taking it
and potentially having a thread deadlock with itself.

Fixes: f8dfeae009 ("perf bpf: Show more BPF program info in print_bpf_prog_info()")
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Song Liu <song@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Ming Wang <wangming01@loongson.cn>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20231207014655.1252484-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-01-03 17:54:54 -03:00
Yicong Yang
813900d19b perf header: Fix one memory leakage in perf_event__fprintf_event_update()
When dump the raw trace by `perf report -D` ASan reports a memory
leakage in perf_event__fprintf_event_update().

It shows that we allocated a temporary cpumap for dumping the CPUs but
doesn't release it and it's not used elsewhere. Fix this by free the
cpumap after the dumping.

Fixes: c853f9394b ("perf tools: Add perf_event__fprintf_event_update function")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Junhao He <hejunhao3@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linuxarm@huawei.com
Link: https://lore.kernel.org/r/20231207081635.8427-2-yangyicong@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-12-13 13:53:43 -03:00
Adrian Hunter
70df07838f perf header: Fix segfault on build_mem_topology() error path
Do not increase the node count unless a node has been successfully read,
because it can lead to a segfault if an error occurs.

For example, if perf exceeds the open file limit in memory_node__read(),
which, on a test system, could be made to happen by setting the file limit
to exactly 32:

 Before:

  $ ulimit -n 32
  $ perf mem record --all-user -- sleep 1
  [ perf record: Woken up 1 times to write data ]
  failed: can't open memory sysfs data
  perf: Segmentation fault
  Obtained 14 stack frames.
  perf(sighandler_dump_stack+0x48) [0x55f4b1f59558]
  /lib/x86_64-linux-gnu/libc.so.6(+0x42520) [0x7f4ba1c42520]
  /lib/x86_64-linux-gnu/libc.so.6(free+0x1e) [0x7f4ba1ca53fe]
  perf(+0x178ff4) [0x55f4b1f48ff4]
  perf(+0x179a70) [0x55f4b1f49a70]
  perf(+0x17ef5d) [0x55f4b1f4ef5d]
  perf(+0x85c0b) [0x55f4b1e55c0b]
  perf(cmd_record+0xe1d) [0x55f4b1e5920d]
  perf(cmd_mem+0xc96) [0x55f4b1e80e56]
  perf(+0x130460) [0x55f4b1f00460]
  perf(main+0x689) [0x55f4b1e427d9]
  /lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7f4ba1c29d90]
  /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80) [0x7f4ba1c29e40]
  perf(_start+0x25) [0x55f4b1e42a25]
  Segmentation fault (core dumped)
  $

After:

  $ ulimit -n 32
  $ perf mem record --all-user -- sleep 1
  [ perf record: Woken up 1 times to write data ]
  failed: can't open memory sysfs data
  [ perf record: Captured and wrote 0.005 MB perf.data (11 samples) ]
  $

Fixes: f8e502b9d1 ("perf header: Ensure bitmaps are freed")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20231123075848.9652-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-27 15:38:45 -03:00
Arnaldo Carvalho de Melo
dd678532f9 perf header: Additional note on AMD IBS for max_precise pmu cap
x86 core PMU exposes supported maximum precision level via max_precise
PMU capability. Although, AMD core PMU does not support precise mode,
certain core PMU events with precise_ip > 0 are allowed and forwarded to
IBS OP PMU.

Display a note about this in the 'perf report' header output and
document the details in the perf-list man page.

Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Ming Wang <wangming01@loongson.cn>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ross Zwisler <zwisler@chromium.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Link: https://lore.kernel.org/r/20231107083331.901-2-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-10 08:31:13 -03:00
Kan Liang
ac9cd7245f perf header: Support num and width of branch counters
To support the branch counters feature, the information of the maximum
number of supported counters and the width of the counters is exposed in
the sysfs caps folder. The perf tool can use the information to parse
the logged counters in each branch.

Store the information in the perf_env for later usage.

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tinghao Zhang <tinghao.zhang@intel.com>
Link: https://lore.kernel.org/r/20231025201626.3000228-7-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-06 18:15:16 -03:00
Ian Rogers
1052545017 perf header: Fix various error path memory leaks
Memory leaks were detected by clang-tidy.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: llvm@lists.linux.dev
Cc: Ming Wang <wangming01@loongson.cn>
Cc: Tom Rix <trix@redhat.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20231009183920.200859-19-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-10-12 10:01:57 -07:00
Namhyung Kim
f174341d0d perf tools: Convert to perf_record_header_attr_id()
Instead of accessing the attr.id directly, use the
perf_record_header_attr_id() helper to handle old versions.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230825152552.112913-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-29 14:16:14 -03:00
Namhyung Kim
9bf63282ea perf tools: Handle old data in PERF_RECORD_ATTR
The PERF_RECORD_ATTR is used for a pipe mode to describe an event with
attribute and IDs.  The ID table comes after the attr and it calculate
size of the table using the total record size and the attr size.

  n_ids = (total_record_size - end_of_the_attr_field) / sizeof(u64)

This is fine for most use cases, but sometimes it saves the pipe output
in a file and then process it later.  And it becomes a problem if there
is a change in attr size between the record and report.

  $ perf record -o- > perf-pipe.data  # old version
  $ perf report -i- < perf-pipe.data  # new version

For example, if the attr size is 128 and it has 4 IDs, then it would
save them in 168 byte like below:

   8 byte: perf event header { .type = PERF_RECORD_ATTR, .size = 168 },
 128 byte: perf event attr { .size = 128, ... },
  32 byte: event IDs [] = { 1234, 1235, 1236, 1237 },

But when report later, it thinks the attr size is 136 then it only read
the last 3 entries as ID.

   8 byte: perf event header { .type = PERF_RECORD_ATTR, .size = 168 },
 136 byte: perf event attr { .size = 136, ... },
  24 byte: event IDs [] = { 1235, 1236, 1237 },  // 1234 is missing

So it should use the recorded version of the attr.  The attr has the
size field already then it should honor the size when reading data.

Fixes: 2c46dbb517 ("perf: Convert perf header attrs into attr events")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tom Zanussi <zanussi@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230825152552.112913-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-29 14:16:14 -03:00
Yanteng Si
c56f286f24 perf tools: Allow to use cpuinfo on LoongArch
Define these macros so that the CPU name can be displayed when running
'perf report' and 'perf timechart'.

Committer notes:

No need to have:

	if (strcasestr(buf, "Model Name")) {
		strlcpy(cpu_m, &buf[13], 255);
		break;
	} else if (strcasestr(buf, "model name")) {
		strlcpy(cpu_m, &buf[13], 255);
		break;
	}

As the point of strcasestr() is to be case insensitive to both the
haystack and the needle, so simplify the above to just:

	if (strcasestr(buf, "model name")) {
		strlcpy(cpu_m, &buf[13], 255);
		break;
	}

Signed-off-by: Yanteng Si <siyanteng@loongson.cn>
Acked-by: Huacai Chen <chenhuacai@loongson.cn>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: loongarch@lists.linux.dev
Cc: loongson-kernel@lists.loongnix.cn
Link: https://lore.kernel.org/r/db968a186a10e4629fe10c26a1210f7126ad41ec.1692962043.git.siyanteng@loongson.cn
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-29 14:13:48 -03:00
Ian Rogers
c091ee9089 perf pmu: Remove logic for PMU name being NULL
The PMU name could be NULL in the case of the fake_pmu. Initialize the
name for the fake_pmu to "fake" so that all other logic can assume it
is initialized. Add a const to the type of name so that a literal can
be used to avoid additional initialization code. Propagate the cost
through related routines and remove now unnecessary "(char *)"
casts. Doing this located a bug in builtin-list for the pmu_glob that
was missing a strdup.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20230825024002.801955-3-irogers@google.com
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: James Clark <james.clark@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Wei Li <liwei391@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Ming Wang <wangming01@loongson.cn>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-perf-users@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-25 10:22:16 -03:00
Ian Rogers
9897009eec perf header: Fix missing PMU caps
PMU caps are written as HEADER_PMU_CAPS or for the special case of the
PMU "cpu" as HEADER_CPU_PMU_CAPS. As the PMU "cpu" is special, and not
any "core" PMU, the logic had become broken and core PMUs not called
"cpu" were not having their caps written.

This affects ARM and s390 non-hybrid PMUs.

Simplify the PMU caps writing logic to scan one fewer time and to be
more explicit in its behavior.

Fixes: 178ddf3bad ("perf header: Avoid hybrid PMU list in write_pmu_caps")
Reported-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Ming Wang <wangming01@loongson.cn>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230825024002.801955-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-25 10:21:33 -03:00
Ian Rogers
d7ba60a4e5 perf header: Avoid out-of-bounds read
intel-pt tests were failing:

  -- Test virtual LBR ---
  Linux
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.126 MB /tmp/perf-test-intel-pt-sh.FW57CXnCqQ/test-perf.data ]
  Failed with virtual lbr
  ...
  ```

  The root cause is an out-of-bounds read in header (where maxbrstack.py
  is from test_intel_pt.sh):
  ```
  $ perf --no-pager script --itrace=L -s maxbrstack.py
  =================================================================
  ==3907930==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6020000095a8 at pc 0x563c26c840bb bp 0x7fff43582710 sp 0x7fff43582708
  READ of size 4 at 0x6020000095a8 thread T0
      #0 0x563c26c840ba in process_group_desc util/header.c:2847
      #1 0x563c26c8bc78 in perf_file_section__process util/header.c:4037
      #2 0x563c26c8aa9b in perf_header__process_sections util/header.c:3813
      #3 0x563c26c8d028 in perf_session__read_header util/header.c:4286
      #4 0x563c26cbab29 in perf_session__open util/session.c:113
      #5 0x563c26cbb3d0 in __perf_session__new util/session.c:221
      #6 0x563c26aacb14 in perf_session__new util/session.h:73
      #7 0x563c26acf7f1 in cmd_script tools/perf/builtin-script.c:4212
      #8 0x563c26bb58ff in run_builtin tools/perf/perf.c:323
      #9 0x563c26bb5e70 in handle_internal_command tools/perf/perf.c:377
      #10 0x563c26bb6238 in run_argv tools/perf/perf.c:421
      #11 0x563c26bb67a0 in main tools/perf/perf.c:537
      #12 0x7f34bde46189 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
      #13 0x7f34bde46244 in __libc_start_main_impl ../csu/libc-start.c:381
      #14 0x563c26a33390 in _start (/tmp/perf/perf+0x1eb390)

  0x6020000095a8 is located 8 bytes to the right of 16-byte region [0x602000009590,0x6020000095a0)
  allocated by thread T0 here:
      #0 0x7f34beeb83b7 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:77
      #1 0x563c26c83df8 in process_group_desc util/header.c:2824
      #2 0x563c26c8bc78 in perf_file_section__process util/header.c:4037
      #3 0x563c26c8aa9b in perf_header__process_sections util/header.c:3813
      #4 0x563c26c8d028 in perf_session__read_header util/header.c:4286
      #5 0x563c26cbab29 in perf_session__open util/session.c:113
      #6 0x563c26cbb3d0 in __perf_session__new util/session.c:221
      #7 0x563c26aacb14 in perf_session__new util/session.h:73
      #8 0x563c26acf7f1 in cmd_script tools/perf/builtin-script.c:4212
      #9 0x563c26bb58ff in run_builtin tools/perf/perf.c:323
      #10 0x563c26bb5e70 in handle_internal_command tools/perf/perf.c:377
      #11 0x563c26bb6238 in run_argv tools/perf/perf.c:421
      #12 0x563c26bb67a0 in main tools/perf/perf.c:537
      #13 0x7f34bde46189 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
  ```

Avoid the out-of-bounds read checking for the leader. Leave the 'nr'
check intact as nr will be 0 or the counting down and evsel be a group
member.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ali Saidi <alisaidi@amazon.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Brian Robbins <brianrob@linux.microsoft.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
Cc: Fangrui Song <maskray@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Babrou <ivan@cloudflare.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Wenyu Liu <liuwenyu7@huawei.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Ye Xingchen <ye.xingchen@zte.com.cn>
Cc: Yuan Can <yuancan@huawei.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/lkml/20230608232823.4027869-24-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-06-12 15:57:54 -03:00
Ian Rogers
f8e502b9d1 perf header: Ensure bitmaps are freed
memory_node bitmaps need a bitmap_free to avoid memory leaks. Caught
by leak sanitizer.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ali Saidi <alisaidi@amazon.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Brian Robbins <brianrob@linux.microsoft.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
Cc: Fangrui Song <maskray@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Babrou <ivan@cloudflare.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Wenyu Liu <liuwenyu7@huawei.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Ye Xingchen <ye.xingchen@zte.com.cn>
Cc: Yuan Can <yuancan@huawei.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230608232823.4027869-11-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-06-12 15:57:53 -03:00
Ian Rogers
5c6e7c21ae perf header: Make nodes dynamic in write_mem_topology()
Avoid a large static array, dynamically allocate the nodes avoiding a
hard coded limited as well.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Ross Zwisler <zwisler@chromium.org>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Yang Jihong <yangjihong1@huawei.com>
Link: https://lore.kernel.org/r/20230526183401.2326121-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-05-28 10:20:38 -03:00
Ian Rogers
94f9eb95d9 perf pmus: Remove perf_pmus__has_hybrid
perf_pmus__has_hybrid was used to detect when there was >1 core PMU,
this can be achieved with perf_pmus__num_core_pmus that doesn't depend
upon is_pmu_hybrid and PMU name comparisons. When modifying the
function calls take the opportunity to improve comments,
enable/simplify tests that were previously failing for hybrid but now
pass and to simplify generic code.

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ali Saidi <alisaidi@amazon.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kang Minchul <tegongkang@gmail.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Ming Wang <wangming01@loongson.cn>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230527072210.2900565-34-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-05-27 09:42:38 -03:00