2
0
mirror of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-09-04 20:19:47 +08:00
Commit Graph

1243 Commits

Author SHA1 Message Date
Linus Torvalds
7685b334d1 perf-tools changes for v6.14
There are a lot of changes in the perf tools in this cycle.
 
 build
 -----
 * Use generic syscall table to generate syscall numbers on supported archs.
 * This also enables to get rid of libaudit which was used for syscall numbers.
 * Remove python2 support as it's deprecated for years.
 * Fix issues on static build with libzstd.
 
 perf record
 -----------
 * Intel-PT supports "aux-action" config term to pause or resume tracing in
   the aux-buffer.  Users can start the intel_pt event as "started-paused" and
   configure other events to control the Intel-PT tracing.
 
     # perf record --kcore -e intel_pt/aux-action=start-paused/   \
         -e syscalls:sys_enter_newuname/aux-action=resume/        \
         -e syscalls:sys_exit_newuname/aux-action=pause/ -- uname
 
   This requires the kernel support (which was added in v6.13).
 
 perf lock
 ---------
 * 'perf lock contention' command has an ability to symbolize locks in
   dynamically allocated objects using slab cache name when it runs with BPF.
   Those dynamic locks would have "&" prefix in the name to distinguish them
   from ordinary (static) locks.
 
     # perf lock con -abl -E 5 sleep 1
        contended   total wait     max wait     avg wait            address   symbol
 
                2      1.95 us      1.77 us       975 ns   ffff9d5e852d3498   &task_struct (mutex)
                1      1.18 us      1.18 us      1.18 us   ffff9d5e852d3538   &task_struct (mutex)
                4      1.12 us       354 ns       279 ns   ffff9d5e841ca800   &kmalloc-cg-512 (mutex)
                2       859 ns       617 ns       429 ns   ffffffffa41c3620   delayed_uprobe_lock (mutex)
                3       691 ns       388 ns       230 ns   ffffffffa41c0940   pack_mutex (mutex)
 
   This also requires the kernel/BPF support (which was added in v6.13).
 
 perf ftrace
 -----------
 * 'perf ftrace latency' command gets a couple of options to support linear
   buckets instead of exponential.  Also it's possible to specify max and
   min latency for the linear buckets.
 
     # perf ftrace latency -abn -T switch_mm_irqs_off --bucket-range=100   \
         --min-latency=200 --max-latency=800 -- sleep 1
     #   DURATION     |      COUNT | GRAPH                                  |
          0 -  200 ns |        186 | ###                                    |
        200 -  300 ns |        256 | #####                                  |
        300 -  400 ns |        364 | #######                                |
        400 -  500 ns |        223 | ####                                   |
        500 -  600 ns |        111 | ##                                     |
        600 -  700 ns |         41 |                                        |
        700 -  800 ns |        141 | ##                                     |
        800 -  ... ns |        169 | ###                                    |
 
     # statistics  (in nsec)
       total time:              2162212
         avg time:                  967
         max time:                16817
         min time:                  132
            count:                 2236
 
 * As you can see in the above example, it nows shows the statistics at the
   end so that users can see the avg/max/min latencies easily.
 
 * 'perf ftrace profile' command has --graph-opts option like 'perf ftrace
   trace' so that it can control the tracing behaviors in the same way.
   For example, it can limit the function call depth or threshold.
 
 perf script
 -----------
 * Improve physical memory resolution in 'mem-phys-addr' script by parsing
   /proc/iomem file.
 
     # perf script mem-phys-addr -- find /
     ...
     Event: mem_inst_retired.all_loads:P
     Memory type                                    count  percentage
     ----------------------------------------  ----------  ----------
     100000000-85f7fffff : System RAM                8929        69.7
       547600000-54785d23f : Kernel data             1240         9.7
       546a00000-5474bdfff : Kernel rodata            490         3.8
       5480ce000-5485fffff : Kernel bss               121         0.9
     0-fff : Reserved                                3860        30.1
     100000-89c01fff : System RAM                      18         0.1
     8a22c000-8df6efff : System RAM                     5         0.0
 
 Others
 ------
 * 'perf test' gets --runs-per-test option to run the test cases repeatedly.
   This would be helpful to see if it's flaky.
 
 * Add 'parse_events' method to Python perf extension module, so that users
   can use the same event parsing logic in the python code.  One more step
   towards implementing perf tools in Python. :)
 
 * Support opening tracepoint events without libtraceevent.  This will be
   helpful if it won't use the tracing data like in 'perf stat'.
 
 * Update ARM Neoverse N2/V2 JSON events and metrics
 
 Signed-off-by: Namhyung Kim <namhyung@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSo2x5BnqMqsoHtzsmMstVUGiXMgwUCZ5AgiQAKCRCMstVUGiXM
 g0WhAP43Dpfatrm1jicTyAogk5D/JrIMOgjGtrJJi5RXG/r0gwD8DSWFzLppS9xy
 KGtjLHrN6v6BqR4DCubdlZmRfh9Qjgg=
 =M0Kz
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-for-v6.14-2025-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools

Pull perf-tools updates from Namhyung Kim:
 "There are a lot of changes in the perf tools in this cycle.

  build:

   - Use generic syscall table to generate syscall numbers on supported
     archs

   - This also enables to get rid of libaudit which was used for syscall
     numbers

   - Remove python2 support as it's deprecated for years

   - Fix issues on static build with libzstd

  perf record:

   - Intel-PT supports "aux-action" config term to pause or resume
     tracing in the aux-buffer. Users can start the intel_pt event as
     "started-paused" and configure other events to control the Intel-PT
     tracing:

         # perf record --kcore -e intel_pt/aux-action=start-paused/   \
             -e syscalls:sys_enter_newuname/aux-action=resume/        \
             -e syscalls:sys_exit_newuname/aux-action=pause/ -- uname

     This requires kernel support (which was added in v6.13)

  perf lock:

   - 'perf lock contention' command has an ability to symbolize locks in
     dynamically allocated objects using slab cache name when it runs
     with BPF. Those dynamic locks would have "&" prefix in the name to
     distinguish them from ordinary (static) locks

        # perf lock con -abl -E 5 sleep 1
           contended   total wait     max wait     avg wait            address   symbol

                   2      1.95 us      1.77 us       975 ns   ffff9d5e852d3498   &task_struct (mutex)
                   1      1.18 us      1.18 us      1.18 us   ffff9d5e852d3538   &task_struct (mutex)
                   4      1.12 us       354 ns       279 ns   ffff9d5e841ca800   &kmalloc-cg-512 (mutex)
                   2       859 ns       617 ns       429 ns   ffffffffa41c3620   delayed_uprobe_lock (mutex)
                   3       691 ns       388 ns       230 ns   ffffffffa41c0940   pack_mutex (mutex)

     This also requires kernel/BPF support (which was added in v6.13)

  perf ftrace:

   - 'perf ftrace latency' command gets a couple of options to support
     linear buckets instead of exponential. Also it's possible to
     specify max and min latency for the linear buckets:

        # perf ftrace latency -abn -T switch_mm_irqs_off --bucket-range=100   \
            --min-latency=200 --max-latency=800 -- sleep 1
        #   DURATION     |      COUNT | GRAPH                                  |
             0 -  200 ns |        186 | ###                                    |
           200 -  300 ns |        256 | #####                                  |
           300 -  400 ns |        364 | #######                                |
           400 -  500 ns |        223 | ####                                   |
           500 -  600 ns |        111 | ##                                     |
           600 -  700 ns |         41 |                                        |
           700 -  800 ns |        141 | ##                                     |
           800 -  ... ns |        169 | ###                                    |

        # statistics  (in nsec)
          total time:              2162212
            avg time:                  967
            max time:                16817
            min time:                  132
               count:                 2236

   - As you can see in the above example, it nows shows the statistics
     at the end so that users can see the avg/max/min latencies easily

   - 'perf ftrace profile' command has --graph-opts option like 'perf
     ftrace trace' so that it can control the tracing behaviors in the
     same way. For example, it can limit the function call depth or
     threshold

  perf script:

   - Improve physical memory resolution in 'mem-phys-addr' script by
     parsing /proc/iomem file

        # perf script mem-phys-addr -- find /
        ...
        Event: mem_inst_retired.all_loads:P
        Memory type                                    count  percentage
        ----------------------------------------  ----------  ----------
        100000000-85f7fffff : System RAM                8929        69.7
          547600000-54785d23f : Kernel data             1240         9.7
          546a00000-5474bdfff : Kernel rodata            490         3.8
          5480ce000-5485fffff : Kernel bss               121         0.9
        0-fff : Reserved                                3860        30.1
        100000-89c01fff : System RAM                      18         0.1
        8a22c000-8df6efff : System RAM                     5         0.0

  Others:

   - 'perf test' gets --runs-per-test option to run the test cases
     repeatedly. This would be helpful to see if it's flaky

   - Add 'parse_events' method to Python perf extension module, so that
     users can use the same event parsing logic in the python code. One
     more step towards implementing perf tools in Python. :)

   - Support opening tracepoint events without libtraceevent. This will
     be helpful if it won't use the tracing data like in 'perf stat'

   - Update ARM Neoverse N2/V2 JSON events and metrics"

* tag 'perf-tools-for-v6.14-2025-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (176 commits)
  perf test: Update event_groups test to use instructions
  perf bench: Fix undefined behavior in cmpworker()
  perf annotate: Prefer passing evsel to evsel->core.idx
  perf lock: Rename fields in lock_type_table
  perf lock: Add percpu-rwsem for type filter
  perf lock: Fix parse_lock_type which only retrieve one lock flag
  perf lock: Fix return code for functions in __cmd_contention
  perf hist: Fix width calculation in hpp__fmt()
  perf hist: Fix bogus profiles when filters are enabled
  perf hist: Deduplicate cmp/sort/collapse code
  perf test: Improve verbose documentation
  perf test: Add a runs-per-test flag
  perf test: Fix parallel/sequential option documentation
  perf test: Send list output to stdout rather than stderr
  perf test: Rename functions and variables for better clarity
  perf tools: Expose quiet/verbose variables in Makefile.perf
  perf config: Add a function to set one variable in .perfconfig
  perf test perftool_testsuite: Return correct value for skipping
  perf test perftool_testsuite: Add missing description
  perf test record+probe_libc_inet_pton: Make test resilient
  ...
2025-01-24 05:45:40 -08:00
Chun-Tse Shao
e9188ae3cd perf lock: Add percpu-rwsem for type filter
percpu-rwsem was missing in man page. And for backward compatibility,
replace `pcpu-sem` with `percpu-rwsem` before parsing lock name.
Tested `./perf lock con -ab -Y pcpu-sem` and `./perf lock con -ab -Y
percpu-rwsem`

Fixes: 4f701063bf ("perf lock contention: Show lock type with address")
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Chun-Tse Shao <ctshao@google.com>
Cc: nick.forrington@arm.com
Link: https://lore.kernel.org/r/20250116235838.2769691-2-ctshao@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-01-17 10:12:40 -08:00
Ian Rogers
4e38f2814f perf test: Improve verbose documentation
Add a little more detail on the output expectations for each verbose
level.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250110045736.598281-6-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-01-16 11:01:03 -08:00
Ian Rogers
1c0d9816e9 perf test: Add a runs-per-test flag
To detect flakes it is useful to run tests more than once. Add a
runs-per-test flag that will run each test multiple times. Example
output:

```
$ perf test -r 3 lbr -v
122: perf record LBR tests                                           : Ok
122: perf record LBR tests                                           : Ok
122: perf record LBR tests                                           : Ok
```

Update the documentation for the runs-per-test option.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250110045736.598281-5-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-01-16 11:01:03 -08:00
Ian Rogers
4dd8bc4bf5 perf test: Fix parallel/sequential option documentation
The parallel option was removed in commit 94d1a913bd ("perf test:
Make parallel testing the default"). Update the sequential
documentation to reflect it isn't the default except for "exclusive"
tests.

Fixes: 94d1a913bd ("perf test: Make parallel testing the default")
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250110045736.598281-4-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-01-16 11:01:03 -08:00
James Clark
ba113ecad8 perf docs: arm_spe: Document new discard mode
Document the flag along with PMU events to hint what it's used for and
give an example with other useful options to get minimal output.

Reviewed-by: Yeoreum Yun <yeoreum.yun@arm.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250108142904.401139-3-james.clark@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
2025-01-10 14:50:55 +00:00
Charlie Jenkins
3cc550f5bb perf tools: Remove dependency on libaudit
All architectures now support HAVE_SYSCALL_TABLE_SUPPORT, so the flag is
no longer needed. With the removal of the flag, the related
GENERIC_SYSCALL_TABLE can also be removed.

libaudit was only used as a fallback for when HAVE_SYSCALL_TABLE_SUPPORT
was not defined, so libaudit is also no longer needed for any
architecture.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Günther Noack <gnoack@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mickaël Salaün <mic@digikod.net>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-16-7543b5293098@rivosinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-10 10:59:42 -03:00
Namhyung Kim
e5f2024cb9 perf ftrace profile: Add --graph-opts option
Like trace subcommand, it should be able to pass some options to control
the tracing behavior for the function graph tracer.

But some options are limited in order to maintain the internal behavior.

For example, it can limit the function call depth like below:

  # perf ftrace profile --graph-opts depth=5 -- myprog

Committer testing:

  root@number:~# perf ftrace profile --graph-opts thresh=1000 -- sleep 1
  # Total (us)   Avg (us)   Max (us)      Count   Function
   1001419.301 500709.650 1000032.000          2   x64_sys_call
   1000032.000 1000032.000 1000032.000          1   __x64_sys_clock_nanosleep
   1000032.000 1000032.000 1000032.000          1   common_nsleep
   1000031.000 1000031.000 1000031.000          1   do_nanosleep
   1000031.000 1000031.000 1000031.000          1   hrtimer_nanosleep
   1000024.000 1000024.000 1000024.000          1   schedule
      1387.208   1387.208   1387.208          1   __x64_sys_execve
      1386.691   1386.691   1386.691          1   do_execveat_common.isra.0
      1334.170   1334.170   1334.170          1   bprm_execve
      1258.413   1258.413   1258.413          1   load_elf_binary
      1123.068   1123.068   1123.068          1   begin_new_exec
      1113.550   1113.550   1113.550          1   mmput
      1109.237   1109.237   1109.237          1   exit_mmap
  root@number:~# perf ftrace profile --graph-opts thresh=1200 -- sleep 1
  # Total (us)   Avg (us)   Max (us)      Count   Function
   1001448.204 500724.102 1000018.000          2   x64_sys_call
   1000017.000 1000017.000 1000017.000          1   __x64_sys_clock_nanosleep
   1000017.000 1000017.000 1000017.000          1   common_nsleep
   1000017.000 1000017.000 1000017.000          1   hrtimer_nanosleep
   1000016.000 1000016.000 1000016.000          1   do_nanosleep
   1000012.000 1000012.000 1000012.000          1   schedule
      1430.112   1430.112   1430.112          1   __x64_sys_execve
      1429.581   1429.581   1429.581          1   do_execveat_common.isra.0
      1376.289   1376.289   1376.289          1   bprm_execve
      1301.743   1301.743   1301.743          1   load_elf_binary
  root@number:~#

Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250107224352.1128669-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-08 17:20:42 -03:00
Howard Chu
00c640595e perf docs: Add documentation for --force-btf option
The --force-btf option is intended for debugging purposes and is
currently undocumented. Add documentation for it.

Committer notes:

We need a follow up patch expanding on what can be done via BTF and what
isn't possible and thus needs further work to convert kernel C source
code into tables that can then be associated with syscall integer args
and struct members, as discussed in:

  https://lore.kernel.org/all/20241215190712.787847-3-howardchu95@gmail.com/T/#mcfbba653200775c59c730705229a49b34a153db7

Signed-off-by: Howard Chu <howardchu95@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20241215190712.787847-3-howardchu95@gmail.com
Link: https://lore.kernel.org/all/20241215190712.787847-3-howardchu95@gmail.com/T/#mcfbba653200775c59c730705229a49b34a153db7
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-26 12:19:26 -03:00
Adrian Hunter
f8b301e0a4 perf intel-pt: Add documentation for pause / resume
Document the use of aux-action config term and provide a simple example.

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20241216070244.14450-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-18 16:24:32 -03:00
Adrian Hunter
f38ec2274c perf intel-pt: Improve man page format
Improve format of config terms and section references.

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20241216070244.14450-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-18 16:24:32 -03:00
Adrian Hunter
8a0f49a7f1 perf tools: Parse aux-action
Add parsing for aux-action to accept "pause", "resume" or "start-paused"
values.

"start-paused" is valid only for AUX area events.

"pause" and "resume" are valid only for events grouped with an AUX area
event as the group leader.  However, like with aux-output, the events
will be automatically grouped if they are not currently in a group, and
the AUX area event precedes the other events.

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20241216070244.14450-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-18 16:24:32 -03:00
Gabriele Monaco
690a052a6d perf ftrace latency: Add --max-latency option
This patch adds a max-latency option as discussed, in case the number of
buckets is more than 22, we don't observe the setting (for now, let's
say).

By default or if 0 is passed, the value is automatically determined
based on the number of buckets, range and minimum, so that we fill all
available buffers (equivalent to the behaviour before this patch).

We now get something like this:

  # perf ftrace latency --bucket-range=20 \
			--min-latency 10 \
			--max-latency=100 \
			-T switch_mm_irqs_off -a sleep 2
  #   DURATION     |      COUNT | GRAPH             |
       0 -   10 us |       1731 | ################  |
      10 -   30 us |          1 |                   |
      30 -   50 us |          0 |                   |
      50 -   70 us |          0 |                   |
      70 -   90 us |          0 |                   |
      90 -  100 us |          0 |                   |
     100 -  ... us |          0 |                   |

Note the maximum is observed also if it doesn't cover completely a full
range (the second to last range is 10us long to let the last start at
100 sharp), this looks to me more sensible and eases the computations,
since we don't need to account for the range while filling the buckets.

Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20241112181214.1171244-5-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-10 15:16:40 -03:00
Arnaldo Carvalho de Melo
08b875b6bf perf ftrace latency: Introduce --min-latency to narrow down into a latency range
Things below and over will be in the first and last, outlier, buckets.

Without it:

  # perf ftrace latency --use-nsec --use-bpf \
			--bucket-range=200 \
			-T switch_mm_irqs_off -a sleep 2
  #   DURATION     |      COUNT | GRAPH                                   |
       0 -  200 ns |          0 |                                         |
     200 -  400 ns |         44 |                                         |
     400 -  600 ns |        291 | #                                       |
     600 -  800 ns |        506 | ##                                      |
     800 - 1000 ns |        148 |                                         |
    1.00 - 1.20 us |        581 | ##                                      |
    1.20 - 1.40 us |       2199 | ##########                              |
    1.40 - 1.60 us |       1048 | ####                                    |
    1.60 - 1.80 us |       1448 | ######                                  |
    1.80 - 2.00 us |       1091 | #####                                   |
    2.00 - 2.20 us |        517 | ##                                      |
    2.20 - 2.40 us |        318 | #                                       |
    2.40 - 2.60 us |        370 | #                                       |
    2.60 - 2.80 us |        271 | #                                       |
    2.80 - 3.00 us |        150 |                                         |
    3.00 - 3.20 us |         85 |                                         |
    3.20 - 3.40 us |         48 |                                         |
    3.40 - 3.60 us |         40 |                                         |
    3.60 - 3.80 us |         22 |                                         |
    3.80 - 4.00 us |         13 |                                         |
    4.00 - 4.20 us |         14 |                                         |
    4.20 - ...  us |        626 | ##                                      |
  #
  # perf ftrace latency --use-nsec --use-bpf \
			--bucket-range=20 --min-latency=1200 \
			-T switch_mm_irqs_off -a sleep 2
  #   DURATION     |      COUNT | GRAPH                                   |
       0 - 1200 ns |       1243 | #####                                   |
    1.20 - 1.22 us |        141 |                                         |
    1.22 - 1.24 us |        202 |                                         |
    1.24 - 1.26 us |        209 |                                         |
    1.26 - 1.28 us |        219 |                                         |
    1.28 - 1.30 us |        208 |                                         |
    1.30 - 1.32 us |        245 | #                                       |
    1.32 - 1.34 us |        246 | #                                       |
    1.34 - 1.36 us |        224 | #                                       |
    1.36 - 1.38 us |        219 |                                         |
    1.38 - 1.40 us |        206 |                                         |
    1.40 - 1.42 us |        190 |                                         |
    1.42 - 1.44 us |        190 |                                         |
    1.44 - 1.46 us |        146 |                                         |
    1.46 - 1.48 us |        140 |                                         |
    1.48 - 1.50 us |        125 |                                         |
    1.50 - 1.52 us |        115 |                                         |
    1.52 - 1.54 us |        102 |                                         |
    1.54 - 1.56 us |         87 |                                         |
    1.56 - 1.58 us |         90 |                                         |
    1.58 - 1.60 us |         85 |                                         |
    1.60 - ...  us |       5487 | ########################                |
  #

Now we want focus on the latencies starting at 1.2us, with a finer
grained range of 20ns:

This is all on a live system, so statistically interesting, but not
narrowing down on the same numbers, so a 'perf ftrace latency record'
seems interesting to then use all on the same snapshot of latencies.

A --max-latency counterpart should come next, at first limiting the
max-latency to 20 * bucket-size, as we have a fixed buckets array with
20 + 2 entries (+ for the outliers) and thus would need to make it
larger for higher latencies.

We also may need a way to ask for not considering the out of range
values (first and last buckets) when drawing the buckets bars.

Co-developed-by: Gabriele Monaco <gmonaco@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20241112181214.1171244-4-acme@kernel.org
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-10 15:16:27 -03:00
Arnaldo Carvalho de Melo
e8536dd47a perf ftrace latency: Introduce --bucket-range to ask for linear bucketing
In addition to showing it exponentially, using log2() to figure out the
histogram index, allow for showing it linearly:

The preexisting more, the default:

  # perf ftrace latency --use-nsec --use-bpf \
  			-T switch_mm_irqs_off -a sleep 2
  #   DURATION     |      COUNT | GRAPH                                   |
       0 -    1 ns |          0 |                                         |
       1 -    2 ns |          0 |                                         |
       2 -    4 ns |          0 |                                         |
       4 -    8 ns |          0 |                                         |
       8 -   16 ns |          0 |                                         |
      16 -   32 ns |          0 |                                         |
      32 -   64 ns |          0 |                                         |
      64 -  128 ns |        238 | #                                       |
     128 -  256 ns |       1704 | ##########                              |
     256 -  512 ns |        672 | ###                                     |
     512 - 1024 ns |       4458 | ##########################              |
       1 -    2 us |        677 | ####                                    |
       2 -    4 us |          5 |                                         |
       4 -    8 us |          0 |                                         |
       8 -   16 us |          0 |                                         |
      16 -   32 us |          0 |                                         |
      32 -   64 us |          0 |                                         |
      64 -  128 us |          0 |                                         |
     128 -  256 us |          0 |                                         |
     256 -  512 us |          0 |                                         |
     512 - 1024 us |          0 |                                         |
       1 - ...  ms |          0 |                                         |
  #

The new histogram mode:

  # perf ftrace latency --bucket-range=150 --use-nsec --use-bpf \
  			-T switch_mm_irqs_off -a sleep 2
  #   DURATION     |      COUNT | GRAPH                                   |
       0 -    1 ns |          0 |                                         |
       1 -  151 ns |        265 | #                                       |
     151 -  301 ns |       1797 | ###########                             |
     301 -  451 ns |        258 | #                                       |
     451 -  601 ns |        289 | #                                       |
     601 -  751 ns |       2049 | #############                           |
     751 -  901 ns |        967 | ######                                  |
     901 - 1051 ns |        513 | ###                                     |
    1.05 - 1.20 us |        114 |                                         |
    1.20 - 1.35 us |        559 | ###                                     |
    1.35 - 1.50 us |        189 | #                                       |
    1.50 - 1.65 us |        137 |                                         |
    1.65 - 1.80 us |         32 |                                         |
    1.80 - 1.95 us |          2 |                                         |
    1.95 - 2.10 us |          0 |                                         |
    2.10 - 2.25 us |          1 |                                         |
    2.25 - 2.40 us |          1 |                                         |
    2.40 - 2.55 us |          0 |                                         |
    2.55 - 2.70 us |          0 |                                         |
    2.70 - 2.85 us |          0 |                                         |
    2.85 - 3.00 us |          1 |                                         |
    3.00 - ...  us |          4 |                                         |
  #

Co-developed-by: Gabriele Monaco <gmonaco@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20241112181214.1171244-3-acme@kernel.org
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-10 15:16:01 -03:00
Arnaldo Carvalho de Melo
161c3402fd perf config: Fix trival typo 'an' -> 'can'
Just a trivial typo, should be 'can', did a spell check on the rest of
the file just in case, nothing more stood out.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-09 17:51:53 -03:00
Arnaldo Carvalho de Melo
a6e8a58de6 perf disasm: Allow configuring what disassemblers to use
The perf tools annotation code used for a long time parsing the output
of binutils's objdump (or its reimplementations, like llvm's) to then
parse and augment it with samples, allow navigation, etc.

More recently disassemblers from the capstone and llvm (libraries, not
parsing the output of tools using those libraries to mimic binutils's
objdump output) were introduced.

So when all those methods are available, there is a static preference
for a series of attempts of disassembling a binary, with the 'llvm,
capstone, objdump' sequence being hard coded.

This patch allows users to change that sequence, specifying via a 'perf
config' 'annotate.disassemblers' entry which and in what order
disassemblers should be attempted.

As alluded to in the comments in the source code of this series, this
flexibility is useful for users and developers alike, elliminating the
requirement to rebuild the tool with some specific set of libraries to
see how the output of disassembling would be for one of these methods.

  root@x1:~# rm -f ~/.perfconfig
  root@x1:~# perf annotate -v --stdio2 update_load_avg
  <SNIP>
  symbol__disassemble:
    filename=/usr/lib/debug/lib/modules/6.11.4-201.fc40.x86_64/vmlinux,
    sym=update_load_avg, start=0xffffffffb6148fe0, en>
  annotating [0x6ff7170]
    /usr/lib/debug/lib/modules/6.11.4-201.fc40.x86_64/vmlinux :
    [0x7407ca0] update_load_avg
  Disassembled with llvm
  annotate.disassemblers=llvm,capstone,objdump
  Samples: 66  of event 'cpu_atom/cycles/P', 10000 Hz,
	Event count (approx.): 5185444, [percent: local period]
  update_load_avg()
    /usr/lib/debug/lib/modules/6.11.4-201.fc40.x86_64/vmlinux
  Percent       0xffffffff81148fe0 <update_load_avg>:
     1.61         pushq   %r15
                  pushq   %r14
     1.00         pushq   %r13
                  movl    %edx,%r13d
     1.90         pushq   %r12
                  pushq   %rbp
                  movq    %rsi,%rbp
                  pushq   %rbx
                  movq    %rdi,%rbx
                  subq    $0x18,%rsp
    15.14         movl    0x1a4(%rdi),%eax

  root@x1:~# perf config annotate.disassemblers=capstone
  root@x1:~# cat ~/.perfconfig
  # this file is auto-generated.
  [annotate]
	  disassemblers = capstone
  root@x1:~#
  root@x1:~# perf annotate -v --stdio2 update_load_avg
  <SNIP>
  Disassembled with capstone
  annotate.disassemblers=capstone
  Samples: 66  of event 'cpu_atom/cycles/P', 10000 Hz,
  Event count (approx.): 5185444, [percent: local period]
  update_load_avg()
  /usr/lib/debug/lib/modules/6.11.4-201.fc40.x86_64/vmlinux
  Percent       0xffffffff81148fe0 <update_load_avg>:
     1.61         pushq   %r15
                  pushq   %r14
     1.00         pushq   %r13
                  movl    %edx,%r13d
     1.90         pushq   %r12
                  pushq   %rbp
                  movq    %rsi,%rbp
                  pushq   %rbx
                  movq    %rdi,%rbx
                  subq    $0x18,%rsp
    15.14         movl    0x1a4(%rdi),%eax
  root@x1:~# perf config annotate.disassemblers=objdump,capstone
  root@x1:~# perf config annotate.disassemblers
  annotate.disassemblers=objdump,capstone
  root@x1:~# cat ~/.perfconfig
  # this file is auto-generated.
  [annotate]
	  disassemblers = objdump,capstone
  root@x1:~# perf annotate -v --stdio2 update_load_avg
  Executing: objdump  --start-address=0xffffffff81148fe0 \
		      --stop-address=0xffffffff811497aa  \
		      -d --no-show-raw-insn -S -C "$1"
  Disassembled with objdump
  annotate.disassemblers=objdump,capstone
  Samples: 66  of event 'cpu_atom/cycles/P', 10000 Hz,
  Event count (approx.): 5185444, [percent: local period]
  update_load_avg()
  /usr/lib/debug/lib/modules/6.11.4-201.fc40.x86_64/vmlinux
  Percent

                Disassembly of section .text:

                ffffffff81148fe0 <update_load_avg>:
                #define DO_ATTACH       0x4

                ffffffff81148fe0 <update_load_avg>:
                #define DO_ATTACH       0x4
                #define DO_DETACH       0x8

                /* Update task and its cfs_rq load average */
                static inline void update_load_avg(struct cfs_rq *cfs_rq,
						   struct sched_entity *se,
						   int flags)
                {
     1.61         push   %r15
                  push   %r14
     1.00         push   %r13
                  mov    %edx,%r13d
     1.90         push   %r12
                  push   %rbp
                  mov    %rsi,%rbp
                  push   %rbx
                  mov    %rdi,%rbx
                  sub    $0x18,%rsp
                }

                /* rq->task_clock normalized against any time
		   this cfs_rq has spent throttled */
                static inline u64 cfs_rq_clock_pelt(struct cfs_rq *cfs_rq)
                {
                if (unlikely(cfs_rq->throttle_count))
    15.14         mov    0x1a4(%rdi),%eax
  root@x1:~#

After adding a way to select the disassembler from the command line a
'perf test' comparing the output of the various diassemblers should be
introduced, to test these codebases.

Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Steinar H. Gunderson <sesse@google.com>
Link: https://lore.kernel.org/r/20241111151734.1018476-4-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-11-13 16:27:35 -03:00
Ian Rogers
6d5d90a6ab perf docs: Document tool and hwmon events
Add a few paragraphs on tool and hwmon events.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Yoshihiro Furudera <fj5100bi@fujitsu.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Junhao He <hejunhao3@huawei.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Link: https://lore.kernel.org/r/20241109003759.473460-8-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-11-09 08:28:03 -08:00
Graham Woodward
35f5aa9ccc perf arm-spe: Update --itrace help text
The --itrace help now needs updating to reflect that
the --itrace=b argument sythesises branches as well
as branch misses.

Signed-off-by: Graham Woodward <graham.woodward@arm.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Tested-by: Leo Yan <leo.yan@arm.com>
Cc: nd@arm.com
Cc: mike.leach@linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20241025143009.25419-5-graham.woodward@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-29 16:10:17 -07:00
Arnaldo Carvalho de Melo
915a377627 perf test: Document the -w/--workload option
Wasn't documented so far, mention that it is mostly used in the shell
regression tests.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Clark Williams <williams@redhat.com>
Link: https://lore.kernel.org/r/20241020021842.1752770-4-acme@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-21 21:10:50 -07:00
Ian Rogers
8838abf626 perf build: Rename HAVE_DWARF_SUPPORT to HAVE_LIBDW_SUPPORT
In Makefile.config for unwinding the name dwarf implies either
libunwind or libdw. Make it clearer that HAVE_DWARF_SUPPORT is really
just defined when libdw is present by renaming to HAVE_LIBDW_SUPPORT.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Leo Yan <leo.yan@arm.com>
Cc: Anup Patel <anup@brainfault.org>
Cc: Yang Jihong <yangjihong@bytedance.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Shenlin Liang <liangshenlin@eswincomputing.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Guilherme Amadio <amadio@gentoo.org>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Alexander Lobakin <aleksander.lobakin@intel.com>
Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Chen Pei <cp0613@linux.alibaba.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Aditya Gupta <adityag@linux.ibm.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-riscv@lists.infradead.org
Cc: Bibo Mao <maobibo@loongson.cn>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Atish Patra <atishp@rivosinc.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: linux-csky@vger.kernel.org
Link: https://lore.kernel.org/r/20241017001354.56973-11-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-18 10:17:40 -07:00
Ian Rogers
5eb2242513 perf libdw: Remove unnecessary defines
As HAVE_DWARF_GETLOCATIONS_SUPPORT and HAVE_DWARF_CFI_SUPPORT always
match HAVE_DWARF_SUPPORT remove the macros and use
HAVE_DWARF_SUPPORT. If building the file is guarded by CONFIG_DWARF
then remove all ifs.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Anup Patel <anup@brainfault.org>
Cc: Yang Jihong <yangjihong@bytedance.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Shenlin Liang <liangshenlin@eswincomputing.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Guilherme Amadio <amadio@gentoo.org>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Alexander Lobakin <aleksander.lobakin@intel.com>
Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Chen Pei <cp0613@linux.alibaba.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Aditya Gupta <adityag@linux.ibm.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-riscv@lists.infradead.org
Cc: Bibo Mao <maobibo@loongson.cn>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Atish Patra <atishp@rivosinc.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: linux-csky@vger.kernel.org
Link: https://lore.kernel.org/r/20241017001354.56973-10-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-18 10:17:40 -07:00
Madadi Vineeth Reddy
cd912ab3b6 perf sched timehist: Add pre-migration wait time option
pre-migration wait time is the time that a task unnecessarily spends
on the runqueue of a CPU but doesn't get switched-in there. In terms
of tracepoints, it is the time between sched:sched_wakeup and
sched:sched_migrate_task.

Let's say a task woke up on CPU2, then it got migrated to CPU4 and
then it's switched-in to CPU4. So, here pre-migration wait time is
time that it was waiting on runqueue of CPU2 after it is woken up.

The general pattern for pre-migration to occur is:
sched:sched_wakeup
sched:sched_migrate_task
sched:sched_switch

The sched:sched_waking event is used to capture the wakeup time,
as it aligns with the existing code and only introduces a negligible
time difference.

pre-migrations are generally not useful and it increases migrations.
This metric would be helpful in testing patches mainly related to wakeup
and load-balancer code paths as better wakeup logic would choose an
optimal CPU where task would be switched-in and thereby reducing pre-
migrations.

The sample output(s) when -P or --pre-migrations is used:
=================
           time    cpu  task name                       wait time  sch delay   run time  pre-mig time
                        [tid/pid]                          (msec)     (msec)     (msec)     (msec)
--------------- ------  ------------------------------  ---------  ---------  ---------  ---------
   38456.720806 [0001]  schbench[28634/28574]               4.917      4.768      1.004      0.000
   38456.720810 [0001]  rcu_preempt[18]                     3.919      0.003      0.004      0.000
   38456.721800 [0006]  schbench[28779/28574]              23.465     23.465      1.999      0.000
   38456.722800 [0002]  schbench[28773/28574]              60.371     60.237      3.955     60.197
   38456.722806 [0001]  schbench[28634/28574]               0.004      0.004      1.996      0.000
   38456.722811 [0001]  rcu_preempt[18]                     1.996      0.005      0.005      0.000
   38456.723800 [0000]  schbench[28833/28574]               4.000      4.000      3.999      0.000
   38456.723800 [0004]  schbench[28762/28574]              42.951     42.839      3.999     39.867
   38456.723802 [0007]  schbench[28812/28574]              43.947     43.817      3.999     40.866
   38456.723804 [0001]  schbench[28587/28574]               7.935      7.822      0.993      0.000

Signed-off-by: Madadi Vineeth Reddy <vineethr@linux.ibm.com>
Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
Link: https://lore.kernel.org/r/20241004170756.18064-1-vineethr@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-14 12:04:31 -07:00
Thomas Falcon
48966a5a48 perf report: Display columns Predicted/Abort/Cycles in --branch-history
The original commit message:

"
Use current sort mechanism but the real .se_cmp() just returns 0 so
that new columns "Predicted", "Abort" and "Cycles" are created in display
but actually these keys are not the sort keys.

For example:

Overhead  Source:Line   Symbol    Shared Object  Predicted  Abort  Cycles
........  ............  ........  .............  .........  .....  ......

  38.25%  div.c:45      [.] main  div            97.6%      0      3
"

Update missed commit from series "perf report: Show branch flags/cycles
in --branch-history callgraph view" to apply to current repository so that
new columns described above are visible.

Link to original series:
https://lore.kernel.org/lkml/1477876794-30749-1-git-send-email-yao.jin@linux.intel.com/

Reported-by: Dr. David Alan Gilbert <linux@treblig.org>
Suggested-by: Kan Liang <kan.liang@linux.intel.com>
Co-developed-by: Jin Yao <yao.jin@linux.intel.com>
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20241010184046.203822-1-thomas.falcon@intel.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-10 23:41:23 -07:00
Yoshihiro Furudera
f7ef062fe1 perf list: update option desc in man page
There is a difference between the SYNOPSIS section of the help message
and the man page (tools/perf/Documentation/perf-list.txt) for the perf
list command. After checking, we found that the help message reflected
the latest specifications. Therefore, revised the SYNOPSIS section of
the man page to match the help message.

Signed-off-by: Yoshihiro Furudera <fj5100bi@fujitsu.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Liang
Link: https://lore.kernel.org/r/20241003002404.2592094-1-fj5100bi@fujitsu.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-03 10:01:01 -07:00
James Clark
9943581c64 perf scripting python: Add function to get a config value
This can be used to get config values like which objdump Perf uses for
disassembly.

Reviewed-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Tested-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
Cc: Ben Gainey <ben.gainey@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Ruidong Tian <tianruidong@linux.alibaba.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Benjamin Gray <bgray@linux.ibm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: coresight@lists.linaro.org
Cc: John Garry <john.g.garry@oracle.com>
Cc: scclevenger@os.amperecomputing.com
Link: https://lore.kernel.org/r/20240916135743.1490403-4-james.clark@linaro.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-09-24 11:47:03 -07:00
Aditya Gupta
35439fe4e2 perf check: Fix inconsistencies in feature names
Fix two inconsistencies in feature names as discussed in [1]:

1. Rename "dwarf-unwind-support" to "dwarf-unwind"

2. 'get_cpuid' feature and 'HAVE_AUXTRACE_SUPPORT' names don't
   look related, change the feature name to 'auxtrace' to match the
   macro name, as 'get_cpuid' string is not used anywhere to check the
   feature presence

[1]: https://lore.kernel.org/linux-perf-users/ZoRw5we4HLSTZND6@x1/

Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Aditya Gupta <adityag@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240904190132.415212-7-adityag@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-09-04 16:19:53 -03:00
Aditya Gupta
98ad0b7732 perf check: Introduce 'check' subcommand
Currently the presence of a feature is checked with a combination of
perf version --build-options and greps, such as:

    perf version --build-options | grep " on .* HAVE_FEATURE"

Instead of this, introduce a subcommand "perf check feature", with which
scripts can test for presence of a feature, such as:

    perf check feature HAVE_FEATURE

'perf check feature' command is expected to have exit status of 0 if
feature is built-in, and 1 if it's not built-in or if feature is not known.

Multiple features can also be passed as a comma-separated list, in which
case the exit status will be 1 only if all of the passed features are
built-in. For example, with below command, it will have exit status of 0
only if both libtraceevent and bpf are enabled, else 1 in all other cases

    perf check feature libtraceevent,bpf

The arguments are case-insensitive.
An array 'supported_features' has also been introduced that can be used by
other commands like 'perf version --build-options', so that new features
can be added in one place, with the array

Committer testing:

  $ perf check feature libtraceevent,bpf
           libtraceevent: [ on  ]  # HAVE_LIBTRACEEVENT
                     bpf: [ on  ]  # HAVE_LIBBPF_SUPPORT
  $ perf check feature libtraceevent
           libtraceevent: [ on  ]  # HAVE_LIBTRACEEVENT
  $ perf check feature bpf
                     bpf: [ on  ]  # HAVE_LIBBPF_SUPPORT
  $ perf check -q feature bpf && echo "BPF support is present"
  BPF support is present
  $ perf check -q feature Bogus && echo "Bogus support is present"
  $

Reviewed-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Aditya Gupta <adityag@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240904061836.55873-3-adityag@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-09-04 09:56:05 -03:00
Yang Jihong
9b3a48bbe2 perf sched timehist: Add --prio option
The --prio option is used to only show events for the given task priority(ies).
The default is to show events for all priority tasks, which is consistent with
the previous behavior.

Testcase:
  # perf sched record nice -n 9 perf bench sched messaging -l 10000
  # Running 'sched/messaging' benchmark:
  # 20 sender and receiver processes per group
  # 10 groups == 400 processes run

       Total time: 3.435 [sec]
  [ perf record: Woken up 270 times to write data ]
  [ perf record: Captured and wrote 618.688 MB perf.data (5729036 samples) ]

  # perf sched timehist -h

   Usage: perf sched timehist [<options>]

      -C, --cpu <cpu>       list of cpus to profile
      -D, --dump-raw-trace  dump raw trace in ASCII
      -f, --force           don't complain, do it
      -g, --call-graph      Display call chains if present (default on)
      -I, --idle-hist       Show idle events only
      -i, --input <file>    input file name
      -k, --vmlinux <file>  vmlinux pathname
      -M, --migrations      Show migration events
      -n, --next            Show next task
      -p, --pid <pid[,pid...]>
                            analyze events only for given process id(s)
      -s, --summary         Show only syscall summary with statistics
      -S, --with-summary    Show all syscalls and summary with statistics
      -t, --tid <tid[,tid...]>
                            analyze events only for given thread id(s)
      -V, --cpu-visual      Add CPU visual
      -v, --verbose         be more verbose (show symbol address, etc)
      -w, --wakeups         Show wakeup events
          --kallsyms <file>
                            kallsyms pathname
          --max-stack <n>   Maximum number of functions to display backtrace.
          --prio <prio>     analyze events only for given task priority(ies)
          --show-prio       Show task priority
          --state           Show task state when sched-out
          --symfs <directory>
                            Look for files with symbols relative to this directory
          --time <str>      Time span for analysis (start,stop)

  # perf sched timehist --prio 140
  Samples of sched_switch event do not have callchains.
  Invalid prio string

  # perf sched timehist --show-prio --prio 129
  Samples of sched_switch event do not have callchains.
             time    cpu  task name                       prio      wait time  sch delay   run time
                          [tid/pid]                                    (msec)     (msec)     (msec)
  --------------- ------  ------------------------------  --------  ---------  ---------  ---------
   2090450.765421 [0002]  sched-messaging[1229618]        129           0.000      0.000      0.029
   2090450.765445 [0007]  sched-messaging[1229616]        129           0.000      0.062      0.043
   2090450.765448 [0014]  sched-messaging[1229619]        129           0.000      0.000      0.032
   2090450.765478 [0013]  sched-messaging[1229617]        129           0.000      0.065      0.048
   2090450.765503 [0014]  sched-messaging[1229622]        129           0.000      0.000      0.017
   2090450.765550 [0002]  sched-messaging[1229624]        129           0.000      0.000      0.021
   2090450.765562 [0007]  sched-messaging[1229621]        129           0.000      0.071      0.028
   2090450.765570 [0005]  sched-messaging[1229620]        129           0.000      0.064      0.066
   2090450.765583 [0001]  sched-messaging[1229625]        129           0.000      0.001      0.031
   2090450.765595 [0013]  sched-messaging[1229623]        129           0.000      0.060      0.028
   2090450.765637 [0014]  sched-messaging[1229628]        129           0.000      0.000      0.019
   2090450.765665 [0007]  sched-messaging[1229627]        129           0.000      0.038      0.030
  <SNIP>

  # perf sched timehist --show-prio --prio 0,120-129
  Samples of sched_switch event do not have callchains.
             time    cpu  task name                       prio      wait time  sch delay   run time
                          [tid/pid]                                    (msec)     (msec)     (msec)
  --------------- ------  ------------------------------  --------  ---------  ---------  ---------
   2090450.763231 [0000]  perf[1229608]                   120           0.000      0.000      0.000
   2090450.763235 [0000]  migration/0[15]                 0             0.000      0.001      0.003
   2090450.763263 [0001]  perf[1229608]                   120           0.000      0.000      0.000
   2090450.763268 [0001]  migration/1[21]                 0             0.000      0.001      0.004
   2090450.763302 [0002]  perf[1229608]                   120           0.000      0.000      0.000
   2090450.763309 [0002]  migration/2[27]                 0             0.000      0.001      0.007
   2090450.763338 [0003]  perf[1229608]                   120           0.000      0.000      0.000
   2090450.763343 [0003]  migration/3[33]                 0             0.000      0.001      0.004
   2090450.763459 [0004]  perf[1229608]                   120           0.000      0.000      0.000
   2090450.763469 [0004]  migration/4[39]                 0             0.000      0.002      0.010
   2090450.763496 [0005]  perf[1229608]                   120           0.000      0.000      0.000
   2090450.763501 [0005]  migration/5[45]                 0             0.000      0.001      0.004
   2090450.763613 [0006]  perf[1229608]                   120           0.000      0.000      0.000
   2090450.763622 [0006]  migration/6[51]                 0             0.000      0.001      0.008
   2090450.763652 [0007]  perf[1229608]                   120           0.000      0.000      0.000
   2090450.763660 [0007]  migration/7[57]                 0             0.000      0.001      0.008
  <SNIP>
   2090450.765665 [0001]  <idle>                          120           0.031      0.031      0.081
   2090450.765665 [0007]  sched-messaging[1229627]        129           0.000      0.038      0.030
   2090450.765667 [0000]  s1-perf[8235/7168]              120           0.008      0.000      0.004
   2090450.765684 [0013]  <idle>                          120           0.028      0.028      0.088
   2090450.765685 [0001]  sched-messaging[1229630]        129           0.000      0.001      0.020
   2090450.765688 [0000]  <idle>                          120           0.004      0.004      0.020
   2090450.765689 [0002]  <idle>                          120           0.021      0.021      0.138
   2090450.765691 [0005]  sched-messaging[1229626]        129           0.000      0.085      0.029

Signed-off-by: Yang Jihong <yangjihong@bytedance.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240819033016.2427235-3-yangjihong@bytedance.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-09-03 15:45:59 -03:00
Yang Jihong
3fcd740990 perf sched timehist: Add --show-prio option
The --show-prio option is used to display the priority of task.

It is disabled by default, which is consistent with original behavior.

The display format is xxx (priority does not change during task running)
or xxx->yyy (priority changes during task running)

Testcase:

  # perf sched record nice -n 9 true
  [ perf record: Woken up 0 times to write data ]
  [ perf record: Captured and wrote 0.497 MB perf.data ]

  # perf sched timehist -h

   Usage: perf sched timehist [<options>]

      -C, --cpu <cpu>       list of cpus to profile
      -D, --dump-raw-trace  dump raw trace in ASCII
      -f, --force           don't complain, do it
      -g, --call-graph      Display call chains if present (default on)
      -I, --idle-hist       Show idle events only
      -i, --input <file>    input file name
      -k, --vmlinux <file>  vmlinux pathname
      -M, --migrations      Show migration events
      -n, --next            Show next task
      -p, --pid <pid[,pid...]>
                            analyze events only for given process id(s)
      -s, --summary         Show only syscall summary with statistics
      -S, --with-summary    Show all syscalls and summary with statistics
      -t, --tid <tid[,tid...]>
                            analyze events only for given thread id(s)
      -V, --cpu-visual      Add CPU visual
      -v, --verbose         be more verbose (show symbol address, etc)
      -w, --wakeups         Show wakeup events
          --kallsyms <file>
                            kallsyms pathname
          --max-stack <n>   Maximum number of functions to display backtrace.
          --show-prio       Show task priority
          --state           Show task state when sched-out
          --symfs <directory>
                            Look for files with symbols relative to this directory
          --time <str>      Time span for analysis (start,stop)

  # perf sched timehist
  Samples of sched_switch event do not have callchains.
             time    cpu  task name                       wait time  sch delay   run time
                          [tid/pid]                          (msec)     (msec)     (msec)
  --------------- ------  ------------------------------  ---------  ---------  ---------
     23952.006537 [0000]  perf[534]                           0.000      0.000      0.000
     23952.006593 [0000]  migration/0[19]                     0.000      0.014      0.056
     23952.006899 [0001]  perf[534]                           0.000      0.000      0.000
     23952.006947 [0001]  migration/1[22]                     0.000      0.015      0.047
     23952.007138 [0002]  perf[534]                           0.000      0.000      0.000
  <SNIP>

  # perf sched timehist --show-prio
  Samples of sched_switch event do not have callchains.
             time    cpu  task name                       prio      wait time  sch delay   run time
                          [tid/pid]                                    (msec)     (msec)     (msec)
  --------------- ------  ------------------------------  --------  ---------  ---------  ---------
     23952.006537 [0000]  perf[534]                       120           0.000      0.000      0.000
     23952.006593 [0000]  migration/0[19]                 0             0.000      0.014      0.056
     23952.006899 [0001]  perf[534]                       120           0.000      0.000      0.000
  <SNIP>
     23952.034843 [0003]  nice[535]                       120->129      0.189      0.024     23.314
  <SNIP>
     23952.053838 [0005]  rcu_preempt[16]                 120           3.993      0.000      0.023
     23952.053990 [0005]  <idle>                          120           0.023      0.023      0.152
     23952.054137 [0006]  <idle>                          120           1.427      1.427     17.855
     23952.054278 [0007]  <idle>                          120           0.506      0.506      1.650

Signed-off-by: Yang Jihong <yangjihong@bytedance.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240819033016.2427235-2-yangjihong@bytedance.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-09-03 15:45:34 -03:00
Kan Liang
6f9d8d1de2 perf script: Add branch counters
It's useful to print the branch counter information for each jump in
the brstackinsn when it's available.

Add a new field 'brcntr' to display the branch counter information.

By default, the abbreviation will be used to indicate the branch
counter. In the verbose mode, the real event name is shown.

  $ perf script -F +brstackinsn,+brcntr

  # Branch counter abbr list:
  # branch-instructions:ppp = A
  # branch-misses = B
  # '-' No event occurs
  # '+' Event occurrences may be lost due to branch counter saturated
      tchain_edit  332203 3366329.405674:      53030 branch-instructions:ppp:            401781 f3+0x2c (home/sdp/test/tchain_edit)
         f3+31:
         0000000000401774        insn: eb 04                     br_cntr: AA     # PRED 5 cycles [5]
         000000000040177a        insn: 81 7d fc 0f 27 00 00
         0000000000401781        insn: 7e e3                     br_cntr: A      # PRED 1 cycles [6] 2.00 IPC
         0000000000401766        insn: 8b 45 fc
         0000000000401769        insn: 83 e0 01
         000000000040176c        insn: 85 c0
         000000000040176e        insn: 74 06                     br_cntr: A      # PRED 1 cycles [7] 4.00 IPC
         0000000000401776        insn: 83 45 fc 01
         000000000040177a        insn: 81 7d fc 0f 27 00 00
         0000000000401781        insn: 7e e3                     br_cntr: A      # PRED 7 cycles [14] 0.43 IPC

  $ perf script -F +brstackinsn,+brcntr -v

     tchain_edit  332203 3366329.405674:      53030 branch-instructions:ppp:            401781 f3+0x2c (/home/sdp/os.linux.perf.test-suite/kernels/lbr_kernel/tchain_edit)
        f3+31:
        0000000000401774        insn: eb 04                     br_cntr: branch-instructions:ppp 2 branch-misses 0      # PRED 5 cycles [5]
        000000000040177a        insn: 81 7d fc 0f 27 00 00
        0000000000401781        insn: 7e e3                     br_cntr: branch-instructions:ppp 1 branch-misses 0      # PRED 1 cycles [6] 2.00 IPC
        0000000000401766        insn: 8b 45 fc
        0000000000401769        insn: 83 e0 01
        000000000040176c        insn: 85 c0
        000000000040176e        insn: 74 06                     br_cntr: branch-instructions:ppp 1 branch-misses 0      # PRED 1 cycles [7] 4.00 IPC
        0000000000401776        insn: 83 45 fc 01
        000000000040177a        insn: 81 7d fc 0f 27 00 00
        0000000000401781        insn: 7e e3                     br_cntr: branch-instructions:ppp 1 branch-misses 0      # PRED 7 cycles [14] 0.43 IPC

Originally-by: Tinghao Zhang <tinghao.zhang@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20240813160208.2493643-9-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-14 10:20:40 -03:00
Kan Liang
20d6f55528 perf report: Display the branch counter histogram
Reusing the existing --total-cycles option to display the branch
counters. Add a new PERF_HPP_REPORT__BLOCK_BRANCH_COUNTER to display
the logged branch counter events. They are shown right after all the
cycle-related annotations.

Extend the 'struct block_info' to store and pass the branch counter
related information.

The annotation_br_cntr_entry() is to print the histogram of each branch
counter event. If the number of logged events is less than 4, the exact
number of the abbr name is printed. Otherwise, using '+' to stands for
more than 3 events.

Assume the number of logged events is less than 4.

The annotation_br_cntr_abbr_list() prints the branch counter's
abbreviation list. Press 'B' to display the list in the TUI mode.

  $ perf record -e "{branch-instructions:ppp,branch-misses}:S" -j any,counter
  $ perf report  --total-cycles --stdio

  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 1M of events 'anon group { branch-instructions:ppp, branch-misses }'
  # Event count (approx.): 1610046
  #
  # Branch counter abbr list:
  # branch-instructions:ppp = A
  # branch-misses = B
  # '-' No event occurs
  # '+' Event occurrences may be lost due to branch counter saturated
  #
  # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles  Branch Counter [Program Block Range]
  # ...............  ..............  ...........  ..........  ..............  ..................
  #
            57.55%            2.5M        0.00%           3     |A   |-   |                 ...
            25.27%            1.1M        0.00%           2     |AA  |-   |                 ...
            15.61%          667.2K        0.00%           1     |A   |-   |                 ...
             0.16%            6.9K        0.81%         575     |A   |-   |                 ...
             0.16%            6.8K        1.38%         977     |AA  |-   |                 ...
             0.16%            6.8K        0.04%          28     |AA  |B   |                 ...
             0.15%            6.6K        1.33%         946     |A   |-   |                 ...
             0.11%            4.5K        0.06%          46     |AAA+|-   |                 ...
             0.10%            4.4K        0.88%         624     |A   |-   |                 ...
             0.09%            3.7K        0.74%         524     |AAA+|B   |                 ...

With -v applied,

  # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles  Branch Counter [Program Block Range]
  # ...............  ..............  ...........  ..........  ..............  ..................
  #
            57.55%            2.5M        0.00%           3       A=1 ,B=-                  ...
            25.27%            1.1M        0.00%           2       A=2 ,B=-                  ...
            15.61%          667.2K        0.00%           1       A=1 ,B=-                  ...
             0.16%            6.9K        0.81%         575       A=1 ,B=-                  ...
             0.16%            6.8K        1.38%         977       A=2 ,B=-                  ...
             0.16%            6.8K        0.04%          28       A=2 ,B=1                  ...
             0.15%            6.6K        1.33%         946       A=1 ,B=-                  ...
             0.11%            4.5K        0.06%          46       A=3+,B=-                  ...
             0.10%            4.4K        0.88%         624       A=1 ,B=-                  ...
             0.09%            3.7K        0.74%         524       A=3+,B=1                  ...

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20240813160208.2493643-7-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-14 10:20:40 -03:00
Weilin Wang
169f18fd98 perf Document: Add TPEBS (Timed PEBS(Precise Event-Based Sampling)) to Documents
TPEBS (Timed PEBS(Precise Event-Based Sampling)) is a new feature Intel
PMU from Granite Rapids microarchitecture.

It will be used in new TMA (Top-Down Microarchitecture Analysis)
releases.

Add related introduction to documents while adding new code to support
it in 'perf stat'.

Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Weilin Wang <weilin.wang@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Samantha Alt <samantha.alt@intel.com>
Link: https://lore.kernel.org/r/20240720062102.444578-8-weilin.wang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-13 15:25:33 -03:00
Weilin Wang
d546e3acf3 perf stat: Add command line option for enabling TPEBS recording
With this command line option, TPEBS recording is turned off in 'perf
stat' on default. It will only be turned on when this option is given in
'perf stat' command.

Example with --record-tpebs:

  perf stat -M tma_split_loads -C1-4 --record-tpebs sleep 1

  [ perf record: Woken up 2 times to write data ]
  [ perf record: Captured and wrote 0.044 MB - ]

   Performance counter stats for 'CPU(s) 1-4':

      53,259,156,071      cpu_core/TOPDOWN.SLOTS/          #      1.6 %  tma_split_loads   (50.00%)
      15,867,565,250      cpu_core/topdown-retiring/                                       (50.00%)
      15,655,580,731      cpu_core/topdown-mem-bound/                                      (50.00%)
      11,738,022,218      cpu_core/topdown-bad-spec/                                       (50.00%)
       6,151,265,424      cpu_core/topdown-fe-bound/                                       (50.00%)
      20,445,917,581      cpu_core/topdown-be-bound/                                       (50.00%)
       6,925,098,013      cpu_core/L1D_PEND_MISS.PENDING/                                  (50.00%)
       3,838,653,421      cpu_core/MEMORY_ACTIVITY.STALLS_L1D_MISS/                        (50.00%)
       4,797,059,783      cpu_core/EXE_ACTIVITY.BOUND_ON_LOADS/                            (50.00%)
      11,931,916,714      cpu_core/CPU_CLK_UNHALTED.THREAD/                                (50.00%)
         102,576,164      cpu_core/MEM_LOAD_COMPLETED.L1_MISS_ANY/                         (50.00%)
          64,071,854      cpu_core/MEM_INST_RETIRED.SPLIT_LOADS/                           (50.00%)
                   3      cpu_core/MEM_INST_RETIRED.SPLIT_LOADS/R

         1.003049679 seconds time elapsed

Example without --record-tpebs:

  perf stat -M tma_contested_accesses -C1 sleep 1

   Performance counter stats for 'CPU(s) 1':

          50,203,891      cpu_core/TOPDOWN.SLOTS/          #      0.0 %  tma_contested_accesses   (63.60%)
          10,040,777      cpu_core/topdown-retiring/                                              (63.60%)
           6,890,729      cpu_core/topdown-mem-bound/                                             (63.60%)
           2,756,463      cpu_core/topdown-bad-spec/                                              (63.60%)
          10,828,288      cpu_core/topdown-fe-bound/                                              (63.60%)
          28,350,432      cpu_core/topdown-be-bound/                                              (63.60%)
                  98      cpu_core/OCR.DEMAND_DATA_RD.L3_HIT.SNOOP_HITM/                          (63.70%)
             577,520      cpu_core/MEMORY_ACTIVITY.STALLS_L2_MISS/                                (54.62%)
             313,339      cpu_core/MEMORY_ACTIVITY.STALLS_L3_MISS/                                (54.62%)
              14,155      cpu_core/MEM_LOAD_RETIRED.L1_MISS/                                      (45.54%)
                   0      cpu_core/OCR.DEMAND_DATA_RD.L3_HIT.SNOOP_HIT_WITH_FWD/                  (36.30%)
           8,468,077      cpu_core/CPU_CLK_UNHALTED.THREAD/                                       (45.38%)
                 198      cpu_core/MEM_LOAD_L3_HIT_RETIRED.XSNP_MISS/                             (45.38%)
               8,324      cpu_core/MEM_LOAD_RETIRED.FB_HIT/                                       (45.38%)
       3,388,031,520      TSC
          23,226,785      cpu_core/CPU_CLK_UNHALTED.REF_TSC/                                      (54.46%)
                  80      cpu_core/MEM_LOAD_L3_HIT_RETIRED.XSNP_FWD/                              (54.46%)
                   0      cpu_core/MEM_LOAD_L3_HIT_RETIRED.XSNP_FWD/R
                   0      cpu_core/MEM_LOAD_L3_HIT_RETIRED.XSNP_MISS/R
       1,006,816,667 ns   duration_time

         1.002537737 seconds time elapsed

Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Weilin Wang <weilin.wang@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Samantha Alt <samantha.alt@intel.com>
Link: https://lore.kernel.org/r/20240720062102.444578-7-weilin.wang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-13 15:25:32 -03:00
Leo Yan
043da846c2 perf docs: Refine the description for the buffer size
Current description for the AUX trace buffer size is misleading. When a
user specifies the option '-m,512M', it represents a size value in bytes
(512MiB) but not 512M pages (512M x 4KiB regard to a page of 4KiB).

Make the document clear that the normal buffer and the AUX tracing
buffer share the same semantics. Syncs the documents for consistent
text.

Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240812093459.2575278-1-leo.yan@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-12 13:59:22 -03:00
Martin Liška
e6b56ae7c2 perf script: add --addr2line option
Similarly to other subcommands (like report, top), it would be handy to
provide a path for addr2line command.

Signed-off-by: Martin Liska <martin.liska@hey.com>
Cc: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/eadc3e36-029d-4848-9d69-272fe5a83a26@foxlink.cz
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-12 13:59:22 -03:00
Namhyung Kim
ce533c9bc6 perf annotate: Add --skip-empty option
Like in 'perf report', we want to hide empty events in the 'perf annotate'
output.  This is consistent when the option is set in perf report.

For example, the following command would use 3 events including dummy.

  $ perf mem record -a -- perf test -w noploop

  $ perf evlist
  cpu/mem-loads,ldlat=30/P
  cpu/mem-stores/P
  dummy:u

Just using perf annotate with --group will show the all 3 events.

  $ perf annotate --group --stdio | head
   Percent                 |	Source code & Disassembly of ...
  --------------------------------------------------------------
                           : 0     0xe060 <_dl_relocate_object>:
      0.00    0.00    0.00 :    e060:       pushq   %rbp
      0.00    0.00    0.00 :    e061:       movq    %rsp, %rbp
      0.00    0.00    0.00 :    e064:       pushq   %r15
      0.00    0.00    0.00 :    e066:       movq    %rdi, %r15
      0.00    0.00    0.00 :    e069:       pushq   %r14
      0.00    0.00    0.00 :    e06b:       pushq   %r13
      0.00    0.00    0.00 :    e06d:       movl    %edx, %r13d

Now with --skip-empty, it'll hide the last dummy event.

  $ perf annotate --group --stdio --skip-empty | head
   Percent         |	Source code & Disassembly of ...
  ------------------------------------------------------
                   : 0     0xe060 <_dl_relocate_object>:
      0.00    0.00 :    e060:       pushq   %rbp
      0.00    0.00 :    e061:       movq    %rsp, %rbp
      0.00    0.00 :    e064:       pushq   %r15
      0.00    0.00 :    e066:       movq    %rdi, %r15
      0.00    0.00 :    e069:       pushq   %r14
      0.00    0.00 :    e06b:       pushq   %r13
      0.00    0.00 :    e06d:       movl    %edx, %r13d

Committer testing:

  root@x1:~# perf evlist
  cpu_atom/mem-loads,ldlat=30/P
  cpu_atom/mem-stores/P
  dummy:u
  root@x1:~#

Before:

  root@x1:~# perf annotate --group --stdio2 do_lookup_x | head -25
  Samples: 20  of events 'cpu_atom/mem-loads,ldlat=30/P, cpu_atom/mem-stores/P, dummy:u', 4000 Hz, Event count (approx.): 769079, [percent: local period]
  do_lookup_x() /usr/lib64/ld-linux-x86-64.so.2
  Percent                       0x9900 <do_lookup_x>:
                                  pushq      %rbp
                                  movq       %rsp,%rbp
                                  pushq      %r15
                                  pushq      %r14
                                  pushq      %r13
                                  pushq      %r12
                                  pushq      %rbx
                                  subq       $0x88,%rsp
                                  movq       %rdi,-0x50(%rbp)
                                  movl       8(%r9),%edi
                                  movq       0x10(%rbp),%r12
                                  movq       0x28(%rbp),%r10
                                  movq       %rdx,-0x70(%rbp)
                                  movq       %rcx,-0x58(%rbp)
                                  movq       %rdi,%r11
     0.00    5.73    0.00         movq       %r8,-0x68(%rbp)
                                  movq       (%r9),%r8
                                  movl       %esi,%eax
     8.30    0.00    0.00         movl       0x30(%rbp),%r9d
                                  movl       %esi,%r15d
                                  shrl       $6, %eax
                                  movq       %r8,%r13
  root@x1:~#

After:

  root@x1:~# perf annotate --group --skip-empty --stdio2 do_lookup_x | head -25
  Samples: 20  of events 'cpu_atom/mem-loads,ldlat=30/P, cpu_atom/mem-stores/P', 4000 Hz, Event count (approx.): 769079, [percent: local period]
  do_lookup_x() /usr/lib64/ld-linux-x86-64.so.2
  Percent               0x9900 <do_lookup_x>:
                          pushq      %rbp
                          movq       %rsp,%rbp
                          pushq      %r15
                          pushq      %r14
                          pushq      %r13
                          pushq      %r12
                          pushq      %rbx
                          subq       $0x88,%rsp
                          movq       %rdi,-0x50(%rbp)
                          movl       8(%r9),%edi
                          movq       0x10(%rbp),%r12
                          movq       0x28(%rbp),%r10
                          movq       %rdx,-0x70(%rbp)
                          movq       %rcx,-0x58(%rbp)
                          movq       %rdi,%r11
     0.00    5.73         movq       %r8,-0x68(%rbp)
                          movq       (%r9),%r8
                          movl       %esi,%eax
     8.30    0.00         movl       0x30(%rbp),%r9d
                          movl       %esi,%r15d
                          shrl       $6, %eax
                          movq       %r8,%r13
  root@x1:~#

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240803211332.1107222-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-05 16:14:01 -03:00
Namhyung Kim
13159a139d perf mem: Update documentation for new options
Add a common options section and move some items to the section.  Also
add description of new options to report options.

Suggested-by: Ian Rogers <irogers@google.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/lkml/20240802180913.1023886-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-05 11:40:20 -03:00
Namhyung Kim
3dee4b83a6 perf record: Add --setup-filter option
To allow BPF filters for unprivileged users it needs to pin the BPF
objects to BPF-fs first.  Let's add a new option to pin and unpin the
objects easily.  I'm not sure 'perf record' is a right place to do this
but I don't have a better idea right now.

  $ sudo perf record --setup-filter pin

The above command would pin BPF program and maps for the filter when the
system has BPF-fs (usually at /sys/fs/bpf/).  To unpin the objects,
users can run the following command (as root).

  $ sudo perf record --setup-filter unpin

Committer testing:

  root@number:~# perf record --setup-filter pin
  root@number:~# ls -la /sys/fs/bpf/perf_filter/
  total 0
  drwxr-xr-x. 2 root root 0 Jul 31 10:43 .
  drwxr-xr-t. 3 root root 0 Jul 31 10:43 ..
  -rw-rw-rw-. 1 root root 0 Jul 31 10:43 dropped
  -rw-rw-rw-. 1 root root 0 Jul 31 10:43 filters
  -rwxrwxrwx. 1 root root 0 Jul 31 10:43 perf_sample_filter
  -rw-rw-rw-. 1 root root 0 Jul 31 10:43 pid_hash
  -rw-------. 1 root root 0 Jul 31 10:43 sample_f_rodata
  root@number:~# ls -la /sys/fs/bpf/perf_filter/perf_sample_filter
  -rwxrwxrwx. 1 root root 0 Jul 31 10:43 /sys/fs/bpf/perf_filter/perf_sample_filter
  root@number:~#

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20240703223035.2024586-8-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-01 12:11:33 -03:00
Namhyung Kim
74ae366c37 perf ftrace profile: Add -s/--sort option
The -s/--sort option is to sort the output by given column.

  $ sudo perf ftrace profile -s max sync | head
  # Total (us)   Avg (us)   Max (us)      Count   Function
      6301.811   6301.811   6301.811          1   __do_sys_sync
      6301.328   6301.328   6301.328          1   ksys_sync
      5320.300   1773.433   2858.819          3   iterate_supers
      2755.875     17.012   2610.633        162   sync_fs_one_sb
      2728.351    682.088   2610.413          4   ext4_sync_fs [ext4]
      2603.654   2603.654   2603.654          1   jbd2_log_wait_commit [jbd2]
      4750.615    593.827   2597.427          8   schedule
      2164.986     26.728   2115.673         81   sync_inodes_one_sb
      2143.842     26.467   2115.438         81   sync_inodes_sb

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Changbin Du <changbin.du@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Link: https://lore.kernel.org/lkml/20240729004127.238611-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-07-31 16:58:18 -03:00
Namhyung Kim
0f223813ed perf ftrace: Add 'profile' command
The 'perf ftrace profile' command is to get function execution profiles
using function-graph tracer so that users can see the total, average,
max execution time as well as the number of invocations easily.

The following is a profile for the perf_event_open syscall.

  $ sudo perf ftrace profile -G __x64_sys_perf_event_open -- \
    perf stat -e cycles -C1 true 2> /dev/null | head
  # Total (us)   Avg (us)   Max (us)      Count   Function
        65.611     65.611     65.611          1   __x64_sys_perf_event_open
        30.527     30.527     30.527          1   anon_inode_getfile
        30.260     30.260     30.260          1   __anon_inode_getfile
        29.700     29.700     29.700          1   alloc_file_pseudo
        17.578     17.578     17.578          1   d_alloc_pseudo
        17.382     17.382     17.382          1   __d_alloc
        16.738     16.738     16.738          1   kmem_cache_alloc_lru
        15.686     15.686     15.686          1   perf_event_alloc
        14.012      7.006     11.264          2   obj_cgroup_charge
  #

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Changbin Du <changbin.du@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Link: https://lore.kernel.org/lkml/20240729004127.238611-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-07-31 16:58:18 -03:00
Namhyung Kim
c77800894b perf ftrace: Add 'tail' option to --graph-opts
The 'graph-tail' option is to print function name as a comment at the end.
This is useful when a large function is mixed with other functions
(possibly from different CPUs).

For example,

  $ sudo perf ftrace -- perf stat true
  ...
   1)               |    get_unused_fd_flags() {
   1)               |      alloc_fd() {
   1)   0.178 us    |        _raw_spin_lock();
   1)   0.187 us    |        expand_files();
   1)   0.169 us    |        _raw_spin_unlock();
   1)   1.211 us    |      }
   1)   1.503 us    |    }

  $ sudo perf ftrace --graph-opts tail -- perf stat true
  ...
   1)               |    get_unused_fd_flags() {
   1)               |      alloc_fd() {
   1)   0.099 us    |        _raw_spin_lock();
   1)   0.083 us    |        expand_files();
   1)   0.081 us    |        _raw_spin_unlock();
   1)   0.601 us    |      } /* alloc_fd */
   1)   0.751 us    |    } /* get_unused_fd_flags */

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Changbin Du <changbin.du@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Link: https://lore.kernel.org/lkml/20240729004127.238611-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-07-31 16:58:18 -03:00
Leo Yan
d27087c76e perf docs: Document cross compilation
Records the commands for cross compilation with two methods.

The first method relies on Multiarch. The second approach is to explicitly
specify the PKG_CONFIG variables, which is widely used in build system
(like Buildroot, Yocto, etc).

Co-developed-by: James Clark <james.clark@arm.com>
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Tested-by: Ian Rogers <irogers@google.com>
Cc: amadio@gentoo.org
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20240717082211.524826-7-leo.yan@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-07-26 11:15:55 -07:00
Madadi Vineeth Reddy
306f921e87 perf sched map: Add --fuzzy-name option for fuzzy matching in task names
The --fuzzy-name option can be used if fuzzy name matching is required.
For example, "taskname" can be matched to any string that contains
"taskname" as its substring.

Sample output for --task-name wdav --fuzzy-name
=============
 .  *A0  .   .   .   .   -   .   131040.641346 secs A0 => wdavdaemon:62509
 .   A0 *B0  .   .   .   -   .   131040.641378 secs B0 => wdavdaemon:62274
 .  *-   B0  .   .   .   -   .   131040.641379 secs
*C0  .   B0  .   .   .   .   .   131040.641572 secs C0 => wdavdaemon:62283
 C0  .   B0  .  *D0  .   .   .   131040.641572 secs D0 => wdavdaemon:62277
 C0  .   B0  .   D0  .  *E0  .   131040.641578 secs E0 => wdavdaemon:62270
*-   .   B0  .   D0  .   E0  .   131040.641581 secs

Suggested-by: Chen Yu <yu.c.chen@intel.com>
Reviewed-and-tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Madadi Vineeth Reddy <vineethr@linux.ibm.com>
Link: https://lore.kernel.org/r/20240707182716.22054-4-vineethr@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-07-12 09:38:40 -07:00
Madadi Vineeth Reddy
9cc0afed6f perf sched map: Add support for multiple task names using CSV
To track the scheduling patterns of multiple tasks simultaneously,
multiple task names can be specified using a comma separator
without any whitespace.

Sample output for --task-name perf,wdavdaemon
=============
 .  *A0  .   .   .   .   -   .   131040.641346 secs A0 => wdavdaemon:62509
 .   A0 *B0  .   .   .   -   .   131040.641378 secs B0 => wdavdaemon:62274
 .  *-   B0  .   .   .   -   .   131040.641379 secs
*C0  .   B0  .   .   .   .   .   131040.641572 secs C0 => wdavdaemon:62283

...

 .  *-   .   .   .   .   .   .   131041.395649 secs
 .   .   .   .   .   .   .  *X2  131041.403969 secs X2 => perf:70211
 .   .   .   .   .   .   .  *-   131041.404006 secs

Suggested-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-and-tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Madadi Vineeth Reddy <vineethr@linux.ibm.com>
Cc: Chen Yu <yu.c.chen@intel.com>
Link: https://lore.kernel.org/r/20240707182716.22054-3-vineethr@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-07-12 09:38:40 -07:00
Madadi Vineeth Reddy
3116d60910 perf sched map: Add task-name option to filter the output map
By default, perf sched map prints sched-in events for all the tasks
which may not be required all the time as it prints lot of symbols
and rows to the terminal.

With --task-name option, one could specify the specific task name
for which the map has to be shown. This would help in analyzing the
CPU usage patterns easier for that specific task. Since multiple
PID's might have the same task name, using task-name filter
would be more useful for debugging.

For other tasks, instead of printing the symbol, '-' is printed and
the same '.' is used to represent idle. '-' is used instead of symbol
for other tasks because it helps in clear visualization of task
of interest and secondly the symbol itself doesn't mean anything
because the sched-in of that symbol will not be printed(first sched-in
contains pid and the corresponding symbol).

When using the --task-name option, the sched-out time is represented
by a '*-'. Since not all task sched-in events are printed, the sched-out
time of the relevant task might be lost. This representation ensures
that the sched-out time of the interested task is not overlooked.

6.10.0-rc1
==========
*A0                              131040.639793 secs A0 => migration/0:19
*.                               131040.639801 secs .  => swapper:0
 .  *B0                          131040.639830 secs B0 => migration/1:24
 .  *.                           131040.639836 secs
 .   .  *C0                      131040.640108 secs C0 => migration/2:30
 .   .  *.                       131040.640163 secs
 .   .   .  *D0                  131040.640386 secs D0 => migration/3:36
 .   .   .  *.                   131040.640395 secs

6.10.0-rc1 + patch (--task-name wdavdaemon)
=============
 .  *A0  .   .   .   .   -   .   131040.641346 secs A0 => wdavdaemon:62509
 .   A0 *B0  .   .   .   -   .   131040.641378 secs B0 => wdavdaemon:62274
 -  *-   B0  .   .   .   -   .   131040.641379 secs
*C0  .   B0  .   .   .   .   .   131040.641572 secs C0 => wdavdaemon:62283
 C0  .   B0  .  *D0  .   .   .   131040.641572 secs D0 => wdavdaemon:62277
 C0  .   B0  .   D0  .  *E0  .   131040.641578 secs E0 => wdavdaemon:62270
*-   .   B0  .   D0  .   E0  .   131040.641581 secs
 .   .   B0  .   D0  .  *-   .   131040.641583 secs

Reviewed-and-tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Madadi Vineeth Reddy <vineethr@linux.ibm.com>
Cc: Chen Yu <yu.c.chen@intel.com>
Link: https://lore.kernel.org/r/20240707182716.22054-2-vineethr@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-07-12 09:38:40 -07:00
Madadi Vineeth Reddy
a7cacaa088 perf sched replay: Fix -r/--repeat command line option for infinity
Currently, the -r/--repeat option accepts values from 0 and complains
for -1. The help section specifies:
-r, --repeat <n>      repeat the workload replay N times (-1: infinite)

The -r -1 option raises an error because replay_repeat is defined as
an unsigned int.

In the current implementation, the workload is repeated n times when
-r <n> is used, except when n is 0.

When -r is set to 0, the workload is also repeated once. This happens
because when -r=0, the run_one_test function is not called. (Note that
mutex unlocking, which is essential for child threads spawned to emulate
the workload, happens in run_one_test.) However, mutex unlocking is
still performed in the destroy_tasks function. Thus, -r=0 results in the
workload running once coincidentally.

To clarify and maintain the existing logic for -r >= 1 (which runs the
workload the specified number of times) and to fix the issue with infinite
runs, make -r=0 perform an infinite run.

Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Madadi Vineeth Reddy <vineethr@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Link: https://lore.kernel.org/r/20240628071821.15264-1-vineethr@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-06-28 12:55:46 -07:00
Fernand Sieber
d363c2a880 perf: Timehist account sch delay for scheduled out running
When using perf timehist, sch delay is only computed for a waking task,
not for a pre empted task. This patches changes sch delay to account for
both. This makes sense as testing scheduling policy need to consider the
effect of scheduling delay globally, not only for waking tasks.

Example of `perf timehist` report before the patch for `stress` task
competing with each other.

First column is wait time, second column sch delay, third column
runtime.

1.492060 [0000]  s    stress[81]                          1.999      0.000      2.000      R  next: stress[83]
1.494060 [0000]  s    stress[83]                          2.000      0.000      2.000      R  next: stress[81]
1.496060 [0000]  s    stress[81]                          2.000      0.000      2.000      R  next: stress[83]
1.498060 [0000]  s    stress[83]                          2.000      0.000      1.999      R  next: stress[81]

After the patch, it looks like this (note that all wait time is not zero
anymore):

1.492060 [0000]  s    stress[81]                          1.999      1.999      2.000      R  next: stress[83]
1.494060 [0000]  s    stress[83]                          2.000      2.000      2.000      R  next: stress[81]
1.496060 [0000]  s    stress[81]                          2.000      2.000      2.000      R  next: stress[83]
1.498060 [0000]  s    stress[83]                          2.000      2.000      1.999      R  next: stress[81]

Signed-off-by: Fernand Sieber <sieberf@amazon.com>
Reviewed-by: Madadi Vineeth Reddy <vineethr@linux.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240618090339.87482-1-sieberf@amazon.com
2024-06-25 11:06:20 -07:00
Ravi Bangoria
b739759c4e perf doc: Add AMD IBS usage document
Add a perf man page document that describes how to exploit AMD IBS with
Linux perf. Brief intro about IBS and simple one-liner examples will help
naive users to get started. This is not meant to be an exhaustive IBS
guide. User should refer latest AMD64 Architecture Programmer's Manual
for detailed description of IBS.

Usage:

  $ man perf-amd-ibs

Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: ananth.narayan@amd.com
Cc: sandipan.das@amd.com
Cc: santosh.shukla@amd.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620054104.815-1-ravi.bangoria@amd.com
2024-06-20 16:51:55 -07:00
Nick Forrington
f7d4485fce perf lock info: Display both map and thread by default
Change "perf lock info" argument handling to:

Display both map and thread info (rather than an error) when neither are
specified.

Display both map and thread info (rather than just thread info) when
both are requested.

Signed-off-by: Nick Forrington <nick.forrington@arm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240513091413.738537-2-nick.forrington@arm.com
2024-06-03 22:01:00 -07:00