mirror of
				git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
				synced 2025-09-04 20:19:47 +08:00 
			
		
		
		
	
			
				
					
						
					
					dab55604ae
				
			
			
		
	
	
		
			273 Commits
		
	
	
	| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|  James Clark | ae8e4f4048 | perf scripts python cs-etm: Restore first sample log in verbose mode The linked commit moved the early return on the first sample to before
the verbose log, so move the log earlier too. Now the first sample is
also logged and not skipped.
Fixes:  | ||
|  Ian Rogers | e467705a9f | perf util: Make util its own library Make the util directory into its own library. This is done to avoid compiling code twice, once for the perf tool and once for the perf python module. For convenience: arch/common.c scripts/perl/Perf-Trace-Util/Context.c scripts/python/Perf-Trace-Util/Context.c are made part of this library. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Kees Cook <keescook@chromium.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Nick Terrell <terrelln@fb.com> Cc: Gary Guo <gary@garyguo.net> Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Wedson Almeida Filho <wedsonaf@gmail.com> Cc: Ze Gao <zegao2021@gmail.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andrei Vagin <avagin@google.com> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: Jonathan Cameron <jonathan.cameron@huawei.com> Cc: Guo Ren <guoren@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Leo Yan <leo.yan@linux.dev> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: John Garry <john.g.garry@oracle.com> Cc: Benno Lossin <benno.lossin@proton.me> Cc: Björn Roy Baron <bjorn3_gh@protonmail.com> Cc: Andreas Hindborg <a.hindborg@samsung.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240625214117.953777-7-irogers@google.com | ||
|  Athira Rajeev | 7d49ced808 | tools/perf: Fix parallel-perf python script to replace new python syntax ":=" usage perf test "perf script tests" fails as below in systems
with python 3.6
	File "/home/athira/linux/tools/perf/tests/shell/../../scripts/python/parallel-perf.py", line 442
	if line := p.stdout.readline():
             ^
	SyntaxError: invalid syntax
	--- Cleaning up ---
	---- end(-1) ----
	92: perf script tests: FAILED!
This happens because ":=" is a new syntax that assigns values
to variables as part of a larger expression. This is introduced
from python 3.8 and hence fails in setup with python 3.6
Address this by splitting the large expression and check the
value in two steps:
Previous line: if line := p.stdout.readline():
Current change:
	line = p.stdout.readline()
	if line:
With patch
	./perf test "perf script tests"
	 93: perf script tests:  Ok
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: akanksha@linux.ibm.com
Cc: kjain@linux.ibm.com
Cc: maddy@linux.ibm.com
Cc: disgoel@linux.vnet.ibm.com
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240623064850.83720-3-atrajeev@linux.vnet.ibm.com | ||
|  Lucas Stach | a9700511fd | perf script: netdev-times: add location parameter to consume_skb  | ||
|  Adrian Hunter | e0c48bf9e8 | perf scripts python: Add a script to run instances of 'perf script' in parallel Add a Python script to run a perf script command multiple times in
parallel, using perf script options --cpu and --time so that each job
processes a different chunk of the data.
Extend perf script tests to test also the new script.
The script supports the use of normal 'perf script' options like
--dlfilter and --script, so that the benefit of running parallel jobs
naturally extends to them also. In addition, a command can be provided
(refer --pipe-to option) to pipe standard output to a custom command.
Refer to the script's own help text at the end of the patch for more
details.
The script is useful for Intel PT traces, that can be efficiently
decoded by 'perf script' when split by CPU and/or time ranges. Running
jobs in parallel can decrease the overall decoding time.
Committer testing:
  Ian reported that shellcheck found some issues, I installed it as there
  are no warnings about it not being available, but when available it
  fails the build with:
    TEST    /tmp/build/perf-tools-next/tests/shell/script.sh.shellcheck_log
    CC      /tmp/build/perf-tools-next/util/header.o
  In tests/shell/script.sh line 20:
                  rm -rf "${temp_dir}/"*
                         ^-------------^ SC2115 (warning): Use "${var:?}" to ensure this never expands to /* .
  In tests/shell/script.sh line 83:
          output1_dir="${temp_dir}/output1"
          ^---------^ SC2034 (warning): output1_dir appears unused. Verify use (or export if used externally).
  In tests/shell/script.sh line 84:
          output2_dir="${temp_dir}/output2"
          ^---------^ SC2034 (warning): output2_dir appears unused. Verify use (or export if used externally).
  In tests/shell/script.sh line 86:
          python3 "${pp}" -o "${output_dir}" --jobs 4 --verbose -- perf script -i "${perf_data}"
                              ^-----------^ SC2154 (warning): output_dir is referenced but not assigned (did you mean 'output1_dir'?).
  For more information:
    https://www.shellcheck.net/wiki/SC2034 -- output1_dir appears unused. Verif...
    https://www.shellcheck.net/wiki/SC2115 -- Use "${var:?}" to ensure this nev...
    https://www.shellcheck.net/wiki/SC2154 -- output_dir is referenced but not ...
Did these fixes:
  -               rm -rf "${temp_dir}/"*
  +               rm -rf "${temp_dir:?}/"*
And:
   @@ -83,8 +83,8 @@ test_parallel_perf()
          output1_dir="${temp_dir}/output1"
          output2_dir="${temp_dir}/output2"
          perf record -o "${perf_data}" --sample-cpu uname
  -       python3 "${pp}" -o "${output_dir}" --jobs 4 --verbose -- perf script -i "${perf_data}"
  -       python3 "${pp}" -o "${output_dir}" --jobs 4 --verbose --per-cpu -- perf script -i "${perf_data}"
  +       python3 "${pp}" -o "${output1_dir}" --jobs 4 --verbose -- perf script -i "${perf_data}"
  +       python3 "${pp}" -o "${output2_dir}" --jobs 4 --verbose --per-cpu -- perf script -i "${perf_data}"
After that:
  root@number:~# perf test -vv "perf script tests"
   97: perf script tests:
  --- start ---
  test child forked, pid 4084139
  DB test
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.032 MB /tmp/perf-test-script.T4MJDr0L6J/perf.data (7 samples) ]
  <SNIP>
  DB test [Success]
  parallel-perf test
  Linux
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.034 MB /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data (7 samples) ]
  Starting: perf script --time=,91898.301878499 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Starting: perf script --time=91898.301878500,91898.301905999 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Starting: perf script --time=91898.301906000,91898.301933499 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Starting: perf script --time=91898.301933500, -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Finished: perf script --time=91898.301878500,91898.301905999 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Finished: perf script --time=91898.301906000,91898.301933499 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  There are 4 jobs: 2 completed, 2 running
  Finished: perf script --time=,91898.301878499 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Finished: perf script --time=91898.301933500, -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  There are 4 jobs: 4 completed, 0 running
  All jobs finished successfully
  parallel-perf.py done
  Starting: perf script --cpu=0 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Starting: perf script --cpu=1 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Starting: perf script --cpu=2 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Starting: perf script --cpu=3 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Finished: perf script --cpu=0 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Finished: perf script --cpu=1 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Finished: perf script --cpu=2 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Finished: perf script --cpu=3 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  There are 28 jobs: 4 completed, 0 running
  Starting: perf script --cpu=4 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Starting: perf script --cpu=5 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Starting: perf script --cpu=6 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Starting: perf script --cpu=7 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Finished: perf script --cpu=4 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Finished: perf script --cpu=5 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Finished: perf script --cpu=6 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Finished: perf script --cpu=7 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  There are 28 jobs: 8 completed, 0 running
  Starting: perf script --cpu=8 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Starting: perf script --cpu=9 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Starting: perf script --cpu=10 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Starting: perf script --cpu=11 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Finished: perf script --cpu=8 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Finished: perf script --cpu=9 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Finished: perf script --cpu=10 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Finished: perf script --cpu=11 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  There are 28 jobs: 12 completed, 0 running
  Starting: perf script --cpu=12 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Starting: perf script --cpu=13 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Starting: perf script --cpu=14 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Starting: perf script --cpu=15 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Finished: perf script --cpu=12 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Finished: perf script --cpu=13 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Finished: perf script --cpu=14 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Finished: perf script --cpu=15 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  There are 28 jobs: 16 completed, 0 running
  Starting: perf script --cpu=16 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Starting: perf script --cpu=17 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Starting: perf script --cpu=18 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Starting: perf script --cpu=19 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Finished: perf script --cpu=16 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Finished: perf script --cpu=17 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Finished: perf script --cpu=18 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Finished: perf script --cpu=19 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  There are 28 jobs: 20 completed, 0 running
  Starting: perf script --cpu=20 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Starting: perf script --cpu=21 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Starting: perf script --cpu=22 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Starting: perf script --cpu=23 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Finished: perf script --cpu=20 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Finished: perf script --cpu=21 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Finished: perf script --cpu=22 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Finished: perf script --cpu=23 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  There are 28 jobs: 24 completed, 0 running
  Starting: perf script --cpu=24 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Starting: perf script --cpu=25 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Starting: perf script --cpu=26 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Starting: perf script --cpu=27 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Finished: perf script --cpu=25 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Finished: perf script --cpu=26 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  Finished: perf script --cpu=27 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  There are 28 jobs: 27 completed, 1 running
  Finished: perf script --cpu=24 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data
  There are 28 jobs: 28 completed, 0 running
  All jobs finished successfully
  parallel-perf.py done
  parallel-perf test [Success]
  --- Cleaning up ---
  ---- end(0) ----
   97: perf script tests                                               : Ok
  root@number:~#
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240423133248.10206-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | ||
|  Ruidong Tian | 2d98dbb4c9 | perf scripts python arm-cs-trace-disasm.py: Do not ignore disam first sample arm-cs-trace-disasm ignore disam the first branch sample, For example as
follow, the instructions beteween 0x0000ffffae878750 and
0x0000ffffae878754 is lose:
  ARM CoreSight Trace Data Assembler Dump
  Event type: branches:uH
  Sample = { cpu: 0000 addr: 0x0000ffffae878750 phys_addr: 0x0000000000000000 ip: 0x0000000000000000 pid: 4003489 tid: 4003489 period: 1 time: 26765151766034 }
  Event type: branches:uH
  Sample = { cpu: 0000 addr: 0x0000000000000000 phys_addr: 0x0000000000000000 ip: 0x0000ffffae878754 pid: 4003489 tid: 4003489 period: 1 time: 26765151766034 }
Initialize cpu_data earlier to fix it:
  ARM CoreSight Trace Data Assembler Dump
  Event type: branches:uH
  Sample = { cpu: 0000 addr: 0x0000000000000000 phys_addr: 0x0000000000000000 ip: 0x0000ffffae878754 pid: 4003489 tid: 4003489 period: 1 time: 26765151766034 }
        0000000000028740 <ioctl>: (base address is 0x0000ffffae850000)
           28750: b13ffc1f      cmn     x0, #4095
           28754: 54000042      b.hs    0x2875c <ioctl+0x1c>
            test 4003489/4003489 [0000]     26765.151766034  __GI___ioctl+0x14                        /usr/lib64/libc-2.32.so
  Event type: branches:uH
  Sample = { cpu: 0000 addr: 0x0000ffffa67535ac phys_addr: 0x0000000000000000 ip: 0x0000000000000000 pid: 4003489 tid: 4003489 period: 1 time: 26765151766034 }
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ruidong Tian <tianruidong@linux.alibaba.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Al Grant <al.grant@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Tor Jeremiassen <tor@ti.com>
Link: https://lore.kernel.org/r/20231214123304.34087-4-tianruidong@linux.alibaba.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | ||
|  Ruidong Tian | c344675ad2 | perf scripts python arm-cs-trace-disasm.py: Set start vm addr of exectable file to 0 For exectable ELF file, which e_type is ET_EXEC, dso start address is a absolute address other than offset. Just set vm_start to zero when dso start is 0x400000, which means it is a exectable file. Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Ruidong Tian <tianruidong@linux.alibaba.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Al Grant <al.grant@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Tor Jeremiassen <tor@ti.com> Link: https://lore.kernel.org/r/20231214123304.34087-3-tianruidong@linux.alibaba.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | ||
|  Benjamin Gray | 280b4e4a9e | perf tools: Address python 3.6 DeprecationWarning for string scapes Python 3.6 introduced a DeprecationWarning for invalid escape sequences. This is upgraded to a SyntaxWarning in Python 3.12, and will eventually be a syntax error. Fix these now to get ahead of it before it's an error. Signed-off-by: Benjamin Gray <bgray@linux.ibm.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Hartley Sweeten <hsweeten@visionengravers.com> Cc: Ian Abbott <abbotti@mev.co.uk> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jan Kiszka <jan.kiszka@siemens.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kieran Bingham <kbingham@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mykola Lysenko <mykolal@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Todd E Brandt <todd.e.brandt@linux.intel.com> Cc: Tom Rix <trix@redhat.com> Cc: linux-doc@vger.kernel.org Cc: linux-ia64@vger.kernel.org Cc: linux-kselftest@vger.kernel.org Cc: linux-pm@vger.kernel.org Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20230912060801.95533-6-bgray@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | ||
|  Anup Sharma | f208b2c6f9 | perf scripts python gecko: Launch the profiler UI on the default browser with the appropriate URL All required libraries have been imported and make sure that none of them are external dependencies. To achieve this, created a virt env and verified. Modified usage information and added combined command. Modified the main() function to read the --save-only command-line option and set the output_file variable accordingly. Modified the trace_end() function to check for the output_file variable. If it is set, the profiler data is saved to a local file in Gecko Profile format, or the profiler.firefox.com is opened on the default browser. Included trace_begin() to initialize the Firefox Profiler and launch the default browser to display the profiler.firefox.com. Added a new function launchFirefox() to start a local server and launch the profiler UI on the default browser with the appropriate URL. Created the "CORSRequestHandler" class to enable Cross-Origin Resource Sharing. Summary: This integration now includes a exiting feature to conveniently host the Gecko Profile data on a local server and open it directly in the default web browser. This means that users can now effortlessly visualize and analyze the profiler results with just a single click. The addition of the --save-only command-line option allows users to save the profiler output to a local file in Gecko Profile format, but the real highlight lies in the capability to seamlessly launch a local server, making the data accessible to Firefox Profiler via a web browser. In addition, it's important to highlight that all data are hosted locally, eliminating any concerns about data privacy rules and regulations. Signed-off-by: Anup Sharma <anupnewsmail@gmail.com> Tested-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/ZNOS0vo58DnVLpD8@yoga Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | ||
|  Anup Sharma | 43803cb16f | perf scripts python: Add support for input args in gecko script Refines the argument handling mechanism in the "gecko-report" script to enable better compatibility and improved user experience. The script now differentiates between scenarios where arguments are provided for record and report cases where gecko.py arguments are passed. Signed-off-by: Anup Sharma <anupnewsmail@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/ZNf7W+EIrrCSHZN0@yoga Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | ||
|  Wei Li | 41a37430f6 | perf scripts python: Update audit-libs package name for python3 'audit-libs-python' is the package for python2, update it for python3. On Ubuntu and Fedora, the new package is 'python3-audit'. Signed-off-by: Wei Li <liwei391@huawei.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Li Bin <huawei.libin@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230815131805.1237491-1-liwei391@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | ||
|  Wei Li | 708a3e8b80 | perf scripts python: Support syscall name parsing on arm64 In the result of "perf script syscall-counts" on arm64, the syscall events are not resolved currently. Add "aarch64" to audit uname list to support name parsing. * After the patch: [root@localhost ~]# perf script syscall-counts sleep 1 Press control+C to stop and show the summary syscall events: event count ---------------------------------------- ----------- mmap 6 close 5 mprotect 4 brk 3 newfstatat 3 openat 3 getrandom 1 prlimit64 1 munmap 1 clock_nanosleep 1 set_robust_list 1 set_tid_address 1 exit_group 1 read 1 faccessat 1 Signed-off-by: Wei Li <liwei391@huawei.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Li Bin <huawei.libin@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230815131735.1237221-1-liwei391@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | ||
|  Arnaldo Carvalho de Melo | c43888e739 | perf script python: Cope with declarations after statements found in Python.h With -Werror the build was failing on fedora rawhide: [perfbuilder@27cfe44d67ed perf-6.5.0-rc2]$ gcc -v Using built-in specs. COLLECT_GCC=gcc COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/13/lto-wrapper OFFLOAD_TARGET_NAMES=nvptx-none OFFLOAD_TARGET_DEFAULT=1 Target: x86_64-redhat-linux Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,fortran,objc,obj-c++,ada,go,d,m2,lto --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --enable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-gcc-major-version-only --enable-libstdcxx-backtrace --with-libstdcxx-zoneinfo=/usr/share/zoneinfo --with-linker-hash-style=gnu --enable-plugin --enable-initfini-array --with-isl=/builddir/build/BUILD/gcc-13.2.1-20230728/obj-x86_64-redhat-linux/isl-install --enable-offload-targets=nvptx-none --without-cuda-driver --enable-offload-defaulted --enable-gnu-indirect-function --enable-cet --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux --with-build-config=bootstrap-lto --enable-link-serialization=1 Thread model: posix Supported LTO compression algorithms: zlib zstd gcc version 13.2.1 20230728 (Red Hat 13.2.1-1) (GCC) [perfbuilder@27cfe44d67ed perf-6.5.0-rc2]$ In file included from /usr/include/python3.12/Python.h:44, from scripts/python/Perf-Trace-Util/Context.c:14: /usr/include/python3.12/object.h: In function 'Py_SIZE': /usr/include/python3.12/object.h:217:5: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement] 217 | PyVarObject *var_ob = _PyVarObject_CAST(ob); | ^~~~~~~~~~~ In file included from /usr/include/python3.12/Python.h:53: /usr/include/python3.12/cpython/longintrepr.h: In function '_PyLong_CompactValue': /usr/include/python3.12/cpython/longintrepr.h:121:5: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement] 121 | Py_ssize_t sign = 1 - (op->long_value.lv_tag & _PyLong_SIGN_MASK); | ^~~~~~~~~~ <SNIP> In file included from /usr/include/python3.12/Python.h:44, from util/scripting-engines/trace-event-python.c:22: /usr/include/python3.12/object.h: In function 'Py_SIZE': /usr/include/python3.12/object.h:217:5: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement] 217 | PyVarObject *var_ob = _PyVarObject_CAST(ob); | ^~~~~~~~~~~ CC /tmp/build/perf/util/units.o CC /tmp/build/perf/util/time-utils.o In file included from /usr/include/python3.12/Python.h:53: /usr/include/python3.12/cpython/longintrepr.h: In function '_PyLong_CompactValue': /usr/include/python3.12/cpython/longintrepr.h:121:5: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement] 121 | Py_ssize_t sign = 1 - (op->long_value.lv_tag & _PyLong_SIGN_MASK); | ^~~~~~~~~~ So add -Wno-declaration-after-statement to the python scripting CFLAGS. Reviewed-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/ZMpdKeO8gU%2FcWDqH@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | ||
|  Anup Sharma | f9f72b2ab7 | perf scripts python: Add command execution for gecko script This will enable the execution of gecko.py script using record and
report commands in 'perf script'.  And this will be also reflected at
"perf script -l" command.
For Example:
    perf script record gecko
    perf script report gecko
Committer notes:
As discussed on the perf tools office hours, I made -F 99 the default
for the record script and removed the double -- on the report script so
that the existing 'perf script' protocol for the combined operation:
    # perf script gecko
Works, i.e. the record script pipes its stdout into the stdin of the
report script, basically:
  /bin/sh /usr/libexec/perf-core/scripts/python/bin/gecko-record -F 99 -g -a -q -o - | \
  /bin/sh /usr/libexec/perf-core/scripts/python/bin/gecko-report -i -
Testing it:
The resulting JSON file needs to be uploaded to
https://profiler.firefox.com, Anup already has code to start a local
http server on the trace_begin handler of the gecko python script, start
firefox and feed it the JSON.
The example below only collects sample for the specified workload, so
that we don't produce thousands of lines, to collect system wide
samples, use instead:
  # perf script gecko -a sleep 0.5
  # nohup perf script gecko sleep 0.5
  {
    "meta": {
      "interval": 1,
      "processType": 0,
      "product": "x86_64 GNU/Linux",
      "stackwalk": 1,
      "debug": 0,
      "gcpoison": 0,
      "asyncstack": 1,
      "startTime": 274601692.636,
      "shutdownTime": null,
      "version": 24,
      "presymbolicated": true,
      "categories": [
        {
          "name": "User",
          "color": "yellow",
          "subcategories": [
            "Other"
          ]
        },
        {
          "name": "Kernel",
          "color": "orange",
          "subcategories": [
            "Other"
          ]
        }
      ],
      "markerSchema": []
    },
    "libs": [],
    "threads": [
      {
        "tid":  | ||
|  Anup Sharma | 2d889c6af1 | perf scripts python: Implement add sample function and thread processing The stack has been created for storing func and dso from the callchain. The sample has been added to a specific thread. It first checks if the thread exists in the Thread class. Then it call _add_sample function which is responsible for appending a new entry to the samples list. Also callchain parsing and storing part is implemented. Moreover removed the comment from thread. Signed-off-by: Anup Sharma <anupnewsmail@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/5a112be85ccdcdcd611e343f6a7a7482d01f6299.1689961706.git.anupnewsmail@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | ||
|  Anup Sharma | 258dfd41c1 | perf scripts python: Implement add sample function and thread processing The intern_stack function is responsible for retrieving or creating a stack_id based on the provided frame_id and prefix_id. It first generates a key using the frame_id and prefix_id values. If the stack corresponding to the key is found in the stackMap, it is returned. Otherwise, a new stack is created by appending the prefix_id and frame_id to the stackTable. The key and the index of the newly created stack are added to the stackMap for future reference. The _intern_frame function is responsible for retrieving or creating a frame_id based on the provided frame string. If the frame_id corresponding to the frameString is found in the frameMap, it is returned. Otherwise, a new frame is created by appending relevant information to the frameTable and adding the frameString to the string_id through _intern_string. The _intern_string function will gets a matching string, or saves the new string and returns a String ID. Signed-off-by: Anup Sharma <anupnewsmail@gmail.com> Link: https://lore.kernel.org/r/4442f4b1ab4c7317cf940560a3a285fcdfbeeb08.1689961706.git.anupnewsmail@gmail.com Cc: Mark Rutland <mark.rutland@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: linux-kernel@vger.kernel.org Cc: linux-perf-users@vger.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | ||
|  Anup Sharma | 833daec7e6 | perf scripts python: Add trace end processing and PRODUCT and CATEGORIES information The final output will now be presented in JSON format following the Gecko profile structure. Additionally, the inclusion of PRODUCT allows easy retrieval of header information for UI. Furthermore, CATEGORIES have been introduced to enable customization of kernel and user colors using input arguments. To facilitate this functionality, an argparse-based parser has been implemented. Note: The implementation of threads will be addressed in subsequent commits for now I have commented it out. Signed-off-by: Anup Sharma <anupnewsmail@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/fa6d027e4134c48e8a2ea45dd8f6b21e6a3418e4.1689961706.git.anupnewsmail@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | ||
|  Anup Sharma | 5aacd7f08a | perf scripts python: Add classes and conversion functions This commit introduces new classes and conversion functions to facilitate the representation of Gecko profile information. The new classes Frame, Stack, Sample, and Thread are added to handle specific components of the profile data, also link to the origin docs has been commented out. Additionally, Inside the Thread class _to_json_dict() method has been created that converts the current thread data into the corresponding format expected by the GeckoThread JSON schema, as per the Gecko profile format specification. Signed-off-by: Anup Sharma <anupnewsmail@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/ab7b40bd32df7101a6f8b4a3aa41570b63b831ac.1689961706.git.anupnewsmail@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | ||
|  Anup Sharma | 0a02e44cc2 | perf scripts python: Extact necessary information from process event The script takes in a sample event dictionary(param_dict) and retrieves relevant data such as time stamp, PID, TID, and comm for each event. Also start time is defined as a global variable as it need to be passed to trace_end for gecko meta information field creation. Signed-off-by: Anup Sharma <anupnewsmail@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/19910fefcfe4be03cd5c2aa3fec11d3f86c0381b.1689961706.git.anupnewsmail@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | ||
|  Anup Sharma | 1699d3efe1 | perf scripts python: Add initial script file with usage information Added necessary modules, including the Perf-Trace-Util library, and defines the required functions and variables for using perf script python. The perf_trace_context and Core modules for tracing and processing events has been also imported. Added usage information. Signed-off-by: Anup Sharma <anupnewsmail@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/f2f1a62f1cc69f44a5414da46a26a4cf124d2744.1689961706.git.anupnewsmail@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | ||
|  Sourabh Jain | 75782e8253 | perf python scripting: Get rid of unused import in arm-cs-trace-disasm The arm-cs-trace-disasm.py script doesn't use the sys library, so remove the import. Report by pylint: W0611: Unused import sys (unused-import) Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Disha Goel <disgoel@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/linux-perf-users/20230613164145.50488-2-atrajeev@linux.vnet.ibm.com Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | ||
|  Ian Rogers | ee84a3032b | perf thread: Add accessor functions for thread Using accessors will make it easier to add reference count checking in later patches. Committer notes: thread->nsinfo wasn't wrapped as it is used together with nsinfo__zput(), where does a trick to set the field with a refcount being dropped to NULL, and that doesn't work well with using thread__nsinfo(thread), that loses the &thread->nsinfo pointer. When refcount checking is added to 'struct thread', later in this series, nsinfo__zput(RC_CHK_ACCESS(thread)->nsinfo) will be used to check the thread pointer. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ali Saidi <alisaidi@amazon.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Brian Robbins <brianrob@linux.microsoft.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: Fangrui Song <maskray@google.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ivan Babrou <ivan@cloudflare.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Wenyu Liu <liuwenyu7@huawei.com> Cc: Will Deacon <will@kernel.org> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Ye Xingchen <ye.xingchen@zte.com.cn> Cc: Yuan Can <yuancan@huawei.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230608232823.4027869-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | ||
|  Sriram Yagnaraman | 69b0e11261 | perf script: Add new parameter in kfree_skb tracepoint to the python scripts using it Include reason parameter that was added in commit  | ||
|  Colin Ian King | de047c1091 | perf script task-analyzer: Fix spelling mistake "miliseconds" -> "milliseconds" There is a spelling mistake in the help for the --ms option. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Acked-by: Hagen Paul Pfeifer <hagen@jauu.net> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Petar Gligoric <petar.gligoric@rohde-schwarz.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: kernel-janitors@vger.kernel.org Link: https://lore.kernel.org/r/20230417174826.52963-1-colin.i.king@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | ||
|  Alexander Pantyukhin | 984abd349d | perf scripts python intel-pt-events: Delete unused 'event_attr variable The 'event_attr' is never used later, the var is ok be deleted. Additional code simplification is to substitute string slice comparison with "substring" function. This case no need to know the length specific words. Signed-off-by: Alexander Pantyukhin <apantykhin@gmail.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20230114130533.2877-1-apantykhin@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | ||
|  Ian Rogers | 63df0e4bc3 | perf map: Add accessor for dso Later changes will add reference count checking for struct map, with dso being the most frequently accessed variable. Add an accessor so that the reference count check is only necessary in one place. Additional changes: - add a dso variable to avoid repeated map__dso calls. - in builtin-mem.c dump_raw_samples, code only partially tested for dso == NULL. Make the possibility of NULL consistent. - in thread.c thread__memcpy fix use of spaces and use tabs. Committer notes: Did missing conversions on these files: tools/perf/arch/powerpc/util/skip-callchain-idx.c tools/perf/arch/powerpc/util/sym-handling.c tools/perf/ui/browsers/hists.c tools/perf/ui/gtk/annotate.c tools/perf/util/cs-etm.c tools/perf/util/thread.c tools/perf/util/unwind-libunwind-local.c tools/perf/util/unwind-libunwind.c Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: German Gomez <german.gomez@arm.com> Cc: Hao Luo <haoluo@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miaoqian Lin <linmq006@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Cc: Song Liu <song@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Yury Norov <yury.norov@gmail.com> Link: https://lore.kernel.org/r/20230320212248.1175731-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | ||
|  Ian Rogers | 5ab6d715c3 | perf maps: Add functions to access maps Introduce functions to access struct maps. These functions reduce the number of places reference counting is necessary. While tidying APIs do some small const-ification, in particlar to unwind_libunwind_ops. Committer notes: Fixed up tools/perf/util/unwind-libunwind.c: - return ops->get_entries(cb, arg, thread, data, max_stack); + return ops->get_entries(cb, arg, thread, data, max_stack, best_effort); Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: German Gomez <german.gomez@arm.com> Cc: Hao Luo <haoluo@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miaoqian Lin <linmq006@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Cc: Song Liu <song@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Yury Norov <yury.norov@gmail.com> Link: https://lore.kernel.org/r/20230320212248.1175731-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | ||
|  Adrian Hunter | 80c3a7d9f2 | perf script: Fix Python support when no libtraceevent Python scripting can be used without libtraceevent. In particular,
scripting for Intel PT does not use tracepoints, and so does not need
libtraceevent support.
Alter the build and employ conditional compilation to allow Python
scripting without libtraceevent.
Example:
 Before:
    $ ldd `which perf` | grep -i python
    $ ldd `which perf` | grep -i libtraceevent
    $ perf record -e intel_pt//u uname
    Linux
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.031 MB perf.data ]
    $ perf script intel-pt-events.py |& head -3
      Error: Couldn't find script `intel-pt-events.py'
     See perf script -l for available scripts.
 After:
    $ ldd `which perf` | grep -i python
            libpython3.10.so.1.0 => /lib/x86_64-linux-gnu/libpython3.10.so.1.0 (0x00007f4bac400000)
    $ ldd `which perf` | grep -i libtraceevent
    $ perf script intel-pt-events.py | head
    Intel PT Branch Trace, Power Events, Event Trace and PTWRITE
         Switch In    8021/8021  [000]     11234.097713404     0/0
           perf-exec  8021/8021  [000]     11234.098041726       psb                        offset: 0x0                0 [unknown] ([unknown])
           perf-exec  8021/8021  [000]     11234.098041726       cbr                         45  freq: 4505 MHz  (161%)                0 [unknown] ([unknown])
               uname  8021/8021  [000]     11234.098082170  branches:uH  tr strt                              0 [unknown] ([unknown]) => 7f3a8b9422b0 _start+0x0 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)
               uname  8021/8021  [000]     11234.098082379  branches:uH  tr end                    7f3a8b9422b0 _start+0x0 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) => 0 [unknown] ([unknown])
               uname  8021/8021  [000]     11234.098083629  branches:uH  tr strt                              0 [unknown] ([unknown]) => 7f3a8b9422b0 _start+0x0 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)
               uname  8021/8021  [000]     11234.098083629  branches:uH  call                      7f3a8b9422b3 _start+0x3 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) => 7f3a8b943050 _dl_start+0x0 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)
               uname  8021/8021  [000]     11234.098083837  branches:uH  tr end                    7f3a8b943060 _dl_start+0x10 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) => 0 [unknown] ([unknown])  IPC: 0.01 (9/938)
               uname  8021/8021  [000]     11234.098084670  branches:uH  tr strt                              0 [unknown] ([unknown]) => 7f3a8b943060 _dl_start+0x10 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)
Fixes:  | ||
|  Roman Lozko | 1f64cfdebf | perf scripts intel-pt-events.py: Fix IPC output for Python 2 Integers are not converted to floats during division in Python 2 which
results in incorrect IPC values. Fix by switching to new division
behavior.
Fixes:  | ||
|  Ian Rogers | b430d24367 | perf script flamegraph: Avoid d3-flame-graph package dependency Currently flame graph generation requires a d3-flame-graph template to
be installed. Unfortunately this is hard to come by for things like
Debian [1].
If the template isn't installed then ask if it should be downloaded from
jsdelivr CDN. The downloaded HTML file is validated against an md5sum.
If the download fails, generate a minimal flame graph with the
javascript coming from links to jsdelivr CDN.
v3. Adds a warning message and quits before download in live mode.
v2. Change the warning to a prompt about downloading and add the
    --allow-download command line flag. Add an md5sum check for the
    downloaded HTML.
[1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=996839
Reviewed-by: Andreas Gerstmayr <agerstmayr@redhat.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: 996839@bugs.debian.org
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Brendan Gregg <brendan@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Spier <spiermar@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230118072409.147786-1-irogers@google.com # v3 discussion
Link: https://lore.kernel.org/r/20230112220024.32709-1-irogers@google.com # v2 discussion
Link: https://lore.kernel.org/r/CAP-5=fXi_9zdhTAoYApiFQoLURAvpEatFzU3uL23o3zs=z25ZQ@mail.gmail.com # v1 discussion
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | ||
|  Petar Gligoric | fdd0f81f05 | perf script: task-analyzer add csv support This patch adds the possibility to write the trace and the summary as csv files
to a user specified file. A format as such simplifies further data processing.
This is achieved by having ";" as separators instead of spaces and solely one
header per file.
Additional parameters are being considered, like in the normal usage of the
script. Colors are turned off in the case of a csv output, thus the highlight
option is also being ignored.
Usage:
Write standard task to csv file:
  $ perf script report tasks-analyzer --csv <file>
write limited output to csv file in nanoseconds:
  $ perf script report tasks-analyzer --csv <file> --ns --limit-to-tasks 1337
Write summary to a csv file:
  $ perf script report tasks-analyzer --csv-summary <file>
Write summary to csv file with additional schedule information:
  $ perf script report tasks-analyzer --csv-summary <file> --summary-extended
Write both summary and standard task to a csv file:
  $ perf script report tasks-analyzer --csv --csv-summary
The following examples illustrate what is possible with the CSV output.  The
first command sequence will record all scheduler switch events for 10 seconds,
the task-analyzer calculates task information like runtimes as CSV.  A small
python snippet using pandas and matplotlib will visualize the most frequent
task (e.g. kworker/1:1) runtimes - each runtime as a bar in a bar chart:
  $ perf record -e sched:sched_switch -a -- sleep 10
  $ perf script report tasks-analyzer --ns --csv tasks.csv
  $ cat << EOF > /tmp/freq-comm-runtimes-bar.py
    import pandas as pd
    import matplotlib.pyplot as plt
    df = pd.read_csv("tasks.csv", sep=';')
    most_freq_comm = df["COMM"].value_counts().idxmax()
    most_freq_runtimes = df[df["COMM"]==most_freq_comm]["Runtime"]
    plt.title(f"Runtimes for Task {most_freq_comm} in Nanoseconds")
    plt.bar(range(len(most_freq_runtimes)), most_freq_runtimes)
    plt.show()
  $ python3 /tmp/freq-comm-runtimes-bar.py
As a seconds example, the subsequent script generates a pie chart of all
accumulated tasks runtimes for 10 seconds of system recordings:
  $ perf record -e sched:sched_switch -a -- sleep 10
  $ perf script report tasks-analyzer --csv-summary task-summary.csv
  $ cat << EOF > /tmp/accumulated-task-pie.py
    import pandas as pd
    from matplotlib.pyplot import pie, axis, show
    df = pd.read_csv("task-summary.csv", sep=';')
    sums = df.groupby(df["Comm"])["Accumulated"].sum()
    axis("equal")
    pie(sums, labels=sums.index);
    show()
  EOF
  $ python3 /tmp/accumulated-task-pie.py
A variety of other visualizations are possible in matplotlib and other
environments. Of course, pandas, numpy and co. also allow easy
statistical analysis of the data!
Signed-off-by: Petar Gligoric <petar.gligoric@rohde-schwarz.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20221206154406.41941-3-petar.gligor@gmail.com
Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | ||
|  Hagen Paul Pfeifer | e76aff0523 | perf script: Introduce task analyzer python script Introduce a new 'perf script' to analyze task scheduling behavior.
During the task analysis, some data is always needed - which goes beyond
the simple time of switching on and off a task (process/thread). This
concerns for example the runtime of a process or the frequency with
which the process was called. This script serves to simplify this
recurring analyze process. It immediately provides the user with helpful
task characteristic information about the tasks runtimes.
Usage:
Recorded can be in two ways:
  $ perf script record tasks-analyzer -- sleep 10
  $ perf record -e sched:sched_switch -a -- sleep 10
The script can parse all perf.data files, most important: sched:sched_switch
events are mandatory, other events will be ignored.
Most simple report use case is to just call the script without arguments:
  $ perf script report tasks-analyzer
      Switched-In      Switched-Out CPU      PID      TID             Comm    Runtime     Time Out-In
  15576.658891407   15576.659156086   4     2412     2428            gdbus        265            1949
  15576.659111320   15576.659455410   0     2412     2412      gnome-shell        344            2267
  15576.659491326   15576.659506173   2       74       74      kworker/2:1         15           13145
  15576.659506173   15576.659825748   2     2858     2858  gnome-terminal-        320           63263
  15576.659871270   15576.659902872   6    20932    20932    kworker/u16:0         32         2314582
  15576.659909951   15576.659945501   3    27264    27264               sh         36              -1
  15576.659853285   15576.659971052   7    27265    27265             perf        118         5050741
  [...]
What is not shown here are the ASCII color sequences. For example, if
the task consists of only one thread, the TID is grayed out.
Runtime is the time the task was running on the CPU, Time Out-In is the
time between the process being scheduled *out* and scheduled back *in*.
So the last time span between two executions. If -1 is printed, then the
task simply ran the first time in the measurements - a Out-In delta
could not be calculated.
In addition to the chronological representation, there is a summary on
task level. This output can be additionally switched on via the
--summary option and provides information such as max, min & average
runtime per process. The maximum runtime is often important for
debugging. The call looks like this:
  $ perf script report tasks-analyzer --summary
  Summary
       Task Information                       Runtime Information
    PID   TID            Comm Runs Accumulated    Mean  Median  Min   Max          Max At
     14    14     ksoftirqd/0   13         334      26      15    9   127 15571.621211956
     15    15     rcu_preempt  133        1778      13      13    2    33 15572.581176024
     16    16     migration/0    3          49      16      13   12    24 15571.608915425
     20    20     migration/1    3          34      11      13    8    13 15571.639101555
     25    25     migration/2    3          32      11      12    9    12 15575.639239896
  [...]
Besides these two options, there are a number of other options that change the
output and behavior. This can be queried via --help. Options worth mentioning include:
- filter-tasks         - filter out unneeded tasks, --filter-task 1337,/sbin/init
- highlight-tasks      - more pleasant focusing, --highlight-tasks 1:red,mutt:yellow
- extended-times       - show combinations of elapsed times between schedule in/schedule out
- summary-extended     - summary with additional information, like maximum delta time statistics
- rename-comms-by-tids - handy for inexpressive processnames like python, --rename 1337:my-python-app
- ms                   - show timestamps in milliseconds, nanoseconds is also possible (--ns)
- time-limit           - limit the analyzer to a time range, --time-limit 15576.0:15576.1
Script is tested and prime time ready for python2 & python3:
- make PYTHON=python3 prefix=/usr/local install
- make PYTHON=python2 prefix=/usr/local install
Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20221206154406.41941-2-petar.gligor@gmail.com
Signed-off-by: Petar Gligoric <petar.gligoric@rohde-schwarz.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | ||
|  Ian Rogers | 378ef0f5d9 | perf build: Use libtraceevent from the system Remove the LIBTRACEEVENT_DYNAMIC and LIBTRACEFS_DYNAMIC make command line variables. If libtraceevent isn't installed or NO_LIBTRACEEVENT=1 is passed to the build, don't compile in libtraceevent and libtracefs support. This also disables CONFIG_TRACE that controls "perf trace". CONFIG_LIBTRACEEVENT is used to control enablement in Build/Makefiles, HAVE_LIBTRACEEVENT is used in C code. Without HAVE_LIBTRACEEVENT tracepoints are disabled and as such the commands kmem, kwork, lock, sched and timechart are removed. The majority of commands continue to work including "perf test". Committer notes: Fixed up a tools/perf/util/Build reject and added: #include <traceevent/event-parse.h> to tools/perf/util/scripting-engines/trace-event-perl.c. Committer testing: $ rpm -qi libtraceevent-devel Name : libtraceevent-devel Version : 1.5.3 Release : 2.fc36 Architecture: x86_64 Install Date: Mon 25 Jul 2022 03:20:19 PM -03 Group : Unspecified Size : 27728 License : LGPLv2+ and GPLv2+ Signature : RSA/SHA256, Fri 15 Apr 2022 02:11:58 PM -03, Key ID 999f7cbf38ab71f4 Source RPM : libtraceevent-1.5.3-2.fc36.src.rpm Build Date : Fri 15 Apr 2022 10:57:01 AM -03 Build Host : buildvm-x86-05.iad2.fedoraproject.org Packager : Fedora Project Vendor : Fedora Project URL : https://git.kernel.org/pub/scm/libs/libtrace/libtraceevent.git/ Bug URL : https://bugz.fedoraproject.org/libtraceevent Summary : Development headers of libtraceevent Description : Development headers of libtraceevent-libs $ Default build: $ ldd ~/bin/perf | grep tracee libtraceevent.so.1 => /lib64/libtraceevent.so.1 (0x00007f1dcaf8f000) $ # perf trace -e sched:* --max-events 10 0.000 migration/0/17 sched:sched_migrate_task(comm: "", pid: 1603763 (perf), prio: 120, dest_cpu: 1) 0.005 migration/0/17 sched:sched_wake_idle_without_ipi(cpu: 1) 0.011 migration/0/17 sched:sched_switch(prev_comm: "", prev_pid: 17 (migration/0), prev_state: 1, next_comm: "", next_prio: 120) 1.173 :0/0 sched:sched_wakeup(comm: "", pid: 3138 (gnome-terminal-), prio: 120) 1.180 :0/0 sched:sched_switch(prev_comm: "", prev_prio: 120, next_comm: "", next_pid: 3138 (gnome-terminal-), next_prio: 120) 0.156 migration/1/21 sched:sched_migrate_task(comm: "", pid: 1603763 (perf), prio: 120, orig_cpu: 1, dest_cpu: 2) 0.160 migration/1/21 sched:sched_wake_idle_without_ipi(cpu: 2) 0.166 migration/1/21 sched:sched_switch(prev_comm: "", prev_pid: 21 (migration/1), prev_state: 1, next_comm: "", next_prio: 120) 1.183 :0/0 sched:sched_wakeup(comm: "", pid: 1602985 (kworker/u16:0-f), prio: 120, target_cpu: 1) 1.186 :0/0 sched:sched_switch(prev_comm: "", prev_prio: 120, next_comm: "", next_pid: 1602985 (kworker/u16:0-f), next_prio: 120) # Had to tweak tools/perf/util/setup.py to make sure the python binding shared object links with libtraceevent if -DHAVE_LIBTRACEEVENT is present in CFLAGS. Building with NO_LIBTRACEEVENT=1 uncovered some more build failures: - Make building of data-convert-bt.c to CONFIG_LIBTRACEEVENT=y - perf-$(CONFIG_LIBTRACEEVENT) += scripts/ - bpf_kwork.o needs also to be dependent on CONFIG_LIBTRACEEVENT=y - The python binding needed some fixups and util/trace-event.c can't be built and linked with the python binding shared object, so remove it in tools/perf/util/setup.py and exclude it from the list of dependencies in the python/perf.so Makefile.perf target. Building without libtraceevent-devel installed uncovered more build failures: - The python binding tools/perf/util/python.c was assuming that traceevent/parse-events.h was always available, which was the case when we defaulted to using the in-kernel tools/lib/traceevent/ files, now we need to enclose it under ifdef HAVE_LIBTRACEEVENT, just like the other parts of it that deal with tracepoints. - We have to ifdef the rules in the Build files with CONFIG_LIBTRACEEVENT=y to build builtin-trace.c and tools/perf/trace/beauty/ as we only ifdef setting CONFIG_TRACE=y when setting NO_LIBTRACEEVENT=1 in the make command line, not when we don't detect libtraceevent-devel installed in the system. Simplification here to avoid these two ways of disabling builtin-trace.c and not having CONFIG_TRACE=y when libtraceevent-devel isn't installed is the clean way. From Athira: <quote> tools/perf/arch/powerpc/util/Build -perf-y += kvm-stat.o +perf-$(CONFIG_LIBTRACEEVENT) += kvm-stat.o </quote> Then, ditto for arm64 and s390, detected by container cross build tests. - s/390 uses test__checkevent_tracepoint() that is now only available if HAVE_LIBTRACEEVENT is defined, enclose the callsite with ifder HAVE_LIBTRACEEVENT. Also from Athira: <quote> With this change, I could successfully compile in these environment: - Without libtraceevent-devel installed - With libtraceevent-devel installed - With “make NO_LIBTRACEEVENT=1” </quote> Then, finally rename CONFIG_TRACEEVENT to CONFIG_LIBTRACEEVENT for consistency with other libraries detected in tools/perf/. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: bpf@vger.kernel.org Link: http://lore.kernel.org/lkml/20221205225940.3079667-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | ||
|  Adrian Hunter | ad7ad6b5dd | perf scripts python: intel-pt-events.py: Add ability interleave output Intel PT timestamps are not provided for every branch, let alone every instruction, so there can be many samples with the same timestamp. With per-cpu contexts, decoding is done for each CPU in turn, which can make it difficult to see what is happening on different CPUs at the same time. Currently the interleaving from perf script --itrace=i0ns is quite coarse grained. There are often long stretches executing on one CPU and nothing on another. Some people are interested in seeing what happened on multiple CPUs before a crash to debug races etc. To improve perf script interleaving for parallel execution, the intel-pt-events.py script has been enhanced to enable interleaving the output with the same timestamp from different CPUs. It is understood that interleaving is not perfect or causal. Add parameter --interleave [<n>] to interleave sample output for the same timestamp so that no more than n samples for a CPU are displayed in a row. 'n' defaults to 4. Note this only affects the order of output, and only when the timestamp is the same. Example: $ perf script intel-pt-events.py --insn-trace --interleave 3 ... bash 2267/2267 [004] 9323.692625625 563caa3c86f0 jz 0x563caa3c89c7 run_pending_traps+0x30 (/usr/bin/bash) IPC: 1.52 (38/25) bash 2267/2267 [004] 9323.692625625 563caa3c89c7 movq 0x118(%rsp), %rax run_pending_traps+0x307 (/usr/bin/bash) bash 2267/2267 [004] 9323.692625625 563caa3c89cf subq %fs:0x28, %rax run_pending_traps+0x30f (/usr/bin/bash) bash 2270/2270 [007] 9323.692625625 55dc58cabf02 jz 0x55dc58cabf48 unquoted_glob_pattern_p+0x102 (/usr/bin/bash) IPC: 1.56 (25/16) bash 2270/2270 [007] 9323.692625625 55dc58cabf04 cmp $0x5d, %al unquoted_glob_pattern_p+0x104 (/usr/bin/bash) bash 2270/2270 [007] 9323.692625625 55dc58cabf06 jnz 0x55dc58cabf10 unquoted_glob_pattern_p+0x106 (/usr/bin/bash) bash 2264/2264 [001] 9323.692625625 7fd556a4376c jbe 0x7fd556a43ac8 round_and_return+0x3fc (/usr/lib/x86_64-linux-gnu/libc.so.6) IPC: 4.30 (43/10) bash 2264/2264 [001] 9323.692625625 7fd556a43772 and $0x8, %edx round_and_return+0x402 (/usr/lib/x86_64-linux-gnu/libc.so.6) bash 2264/2264 [001] 9323.692625625 7fd556a43775 jnz 0x7fd556a43ac8 round_and_return+0x405 (/usr/lib/x86_64-linux-gnu/libc.so.6) bash 2267/2267 [004] 9323.692625625 563caa3c89d8 jnz 0x563caa3c8b11 run_pending_traps+0x318 (/usr/bin/bash) bash 2267/2267 [004] 9323.692625625 563caa3c89de add $0x128, %rsp run_pending_traps+0x31e (/usr/bin/bash) bash 2267/2267 [004] 9323.692625625 563caa3c89e5 popq %rbx run_pending_traps+0x325 (/usr/bin/bash) ... Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20221020152509.5298-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | ||
|  Arnaldo Carvalho de Melo | 18808564aa | Merge remote-tracking branch 'torvalds/master' into perf/core To pick up the fixes that went upstream via acme/perf/urgent and to get to v5.19. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | ||
|  Leo Yan | b226521923 | perf scripts python: Let script to be python2 compliant The mainline kernel can be used for relative old distros, e.g. RHEL 7.
The distro doesn't upgrade from python2 to python3, this causes the
building error that the python script is not python2 compliant.
To fix the building failure, this patch changes from the python f-string
format to traditional string format.
Fixes:  | ||
|  Arnaldo Carvalho de Melo | 41d0914d86 | perf python: Ignore unused command line arguments when building with clang Noticed after switching to python3 by default on some older fedora
releases:
  35    38.20 fedora:27                     : FAIL clang version 5.0.2 (tags/RELEASE_502/final)
    clang-5.0: error: argument unused during compilation: '-specs=/usr/lib/rpm/redhat/redhat-hardened-cc1' [-Werror,-Wunused-command-line-argument]
    clang-5.0: error: argument unused during compilation: '-specs=/usr/lib/rpm/redhat/redhat-hardened-cc1' [-Werror,-Wunused-command-line-argument]
    error: command 'clang' failed with exit status 1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | ||
|  Arnaldo Carvalho de Melo | 63a4354ae7 | perf scripting perl: Ignore some warnings to keep building with perl headers On gcc 12 we started seeing this:
  In file included from /usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/perl.h:2999,
                   from util/scripting-engines/trace-event-perl.c:35:
  /usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/inline.h: In function 'Perl_is_utf8_valid_partial_char_flags':
  /usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/handy.h:125:23: error: cast from function call of type 'STRLEN' {aka 'long unsigned int'} to non-matching type '_Bool' [-Werror=bad-function-cast]
    125 | #define cBOOL(cbool) ((bool) (cbool))
        |                       ^
  /usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/inline.h:2363:12: note: in expansion of macro 'cBOOL'
   2363 |     return cBOOL(is_utf8_char_helper_(s0, e, flags));
        |            ^~~~~
  In file included from /usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/perl.h:7242:
  /usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/inline.h: In function 'Perl_cop_file_avn':
  /usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/inline.h:3489:5: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement]
   3489 |     const char *file = CopFILE(cop);
        |     ^~~~~
  In file included from /usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/perl.h:7243:
  /usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/sv_inline.h: In function 'Perl_newSV_type':
  /usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/sv_inline.h:376:5: error: enumeration value 'SVt_LAST' not handled in switch [-Werror=switch-enum]
    376 |     switch (type) {
        |     ^~~~~~
So disable those warnings to keep building with perl devel headers.
Noticed, among other distros, on opensuse tumbleweed:
gcc version 12.1.1 20220629 [revision 7811663964aa7e31c3939b859bbfa2e16919639f] (SUSE Linux)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | ||
|  Adrian Hunter | 13a133b255 | perf script python: intel-pt-events: Add machine_pid and vcpu Add machine_pid and vcpu to the intel-pt-events.py script. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-20-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | ||
|  Leo Yan | 12fdd6c009 | perf scripts python: Support Arm CoreSight trace data disassembly This commit adds python script to parse CoreSight tracing event and
print out source line and disassembly, it generates readable program
execution flow for easier humans inspecting.
The script receives CoreSight tracing packet with below format:
                +------------+------------+------------+
  packet(n):    |    addr    |    ip      |    cpu     |
                +------------+------------+------------+
  packet(n+1):  |    addr    |    ip      |    cpu     |
                +------------+------------+------------+
packet::addr presents the start address of the coming branch sample, and
packet::ip is the last address of the branch smple.  Therefore, a code
section between branches starts from packet(n)::addr and it stops at
packet(n+1)::ip.  As results we combines the two continuous packets to
generate the address range for instructions:
  [ sample(n)::addr .. sample(n+1)::ip ]
The script supports both objdump or llvm-objdump for disassembly with
specifying option '-d'.  If doesn't specify option '-d', the script
simply outputs source lines and symbols.
Below shows usages with llvm-objdump or objdump to output disassembly.
  # perf script -s scripts/python/arm-cs-trace-disasm.py -- -d llvm-objdump-11 -k ./vmlinux
  ARM CoreSight Trace Data Assembler Dump
  	ffff800008eb3198 <etm4_enable_hw>:
  	ffff800008eb3310: c0 38 00 35  	cbnz	w0, 0xffff800008eb3a28 <etm4_enable_hw+0x890>
  	ffff800008eb3314: 9f 3f 03 d5  	dsb	sy
  	ffff800008eb3318: df 3f 03 d5  	isb
  	ffff800008eb331c: f5 5b 42 a9  	ldp	x21, x22, [sp, #32]
  	ffff800008eb3320: fb 73 45 a9  	ldp	x27, x28, [sp, #80]
  	ffff800008eb3324: e0 82 40 39  	ldrb	w0, [x23, #32]
  	ffff800008eb3328: 60 00 00 34  	cbz	w0, 0xffff800008eb3334 <etm4_enable_hw+0x19c>
  	ffff800008eb332c: e0 03 19 aa  	mov	x0, x25
  	ffff800008eb3330: 8c fe ff 97  	bl	0xffff800008eb2d60 <etm4_cs_lock.isra.0.part.0>
              main  6728/6728  [0004]         0.000000000  etm4_enable_hw+0x198                    [kernel.kallsyms]
  	ffff800008eb2d60 <etm4_cs_lock.isra.0.part.0>:
  	ffff800008eb2d60: 1f 20 03 d5  	nop
  	ffff800008eb2d64: 1f 20 03 d5  	nop
  	ffff800008eb2d68: 3f 23 03 d5  	hint	#25
  	ffff800008eb2d6c: 00 00 40 f9  	ldr	x0, [x0]
  	ffff800008eb2d70: 9f 3f 03 d5  	dsb	sy
  	ffff800008eb2d74: 00 c0 3e 91  	add	x0, x0, #4016
  	ffff800008eb2d78: 1f 00 00 b9  	str	wzr, [x0]
  	ffff800008eb2d7c: bf 23 03 d5  	hint	#29
  	ffff800008eb2d80: c0 03 5f d6  	ret
              main  6728/6728  [0004]         0.000000000  etm4_cs_lock.isra.0.part.0+0x20
  # perf script -s scripts/python/arm-cs-trace-disasm.py -- -d objdump -k ./vmlinux
  ARM CoreSight Trace Data Assembler Dump
  	ffff800008eb3310 <etm4_enable_hw+0x178>:
  	ffff800008eb3310:	350038c0 	cbnz	w0, ffff800008eb3a28 <etm4_enable_hw+0x890>
  	ffff800008eb3314:	d5033f9f 	dsb	sy
  	ffff800008eb3318:	d5033fdf 	isb
  	ffff800008eb331c:	a9425bf5 	ldp	x21, x22, [sp, #32]
  	ffff800008eb3320:	a94573fb 	ldp	x27, x28, [sp, #80]
  	ffff800008eb3324:	394082e0 	ldrb	w0, [x23, #32]
  	ffff800008eb3328:	34000060 	cbz	w0, ffff800008eb3334 <etm4_enable_hw+0x19c>
  	ffff800008eb332c:	aa1903e0 	mov	x0, x25
  	ffff800008eb3330:	97fffe8c 	bl	ffff800008eb2d60 <etm4_cs_lock.isra.0.part.0>
              main  6728/6728  [0004]         0.000000000  etm4_enable_hw+0x198                    [kernel.kallsyms]
  	ffff800008eb2d60 <etm4_cs_lock.isra.0.part.0>:
  	ffff800008eb2d60:	d503201f 	nop
  	ffff800008eb2d64:	d503201f 	nop
  	ffff800008eb2d68:	d503233f 	paciasp
  	ffff800008eb2d6c:	f9400000 	ldr	x0, [x0]
  	ffff800008eb2d70:	d5033f9f 	dsb	sy
  	ffff800008eb2d74:	913ec000 	add	x0, x0, #0xfb0
  	ffff800008eb2d78:	b900001f 	str	wzr, [x0]
  	ffff800008eb2d7c:	d50323bf 	autiasp
  	ffff800008eb2d80:	d65f03c0 	ret
              main  6728/6728  [0004]         0.000000000  etm4_cs_lock.isra.0.part.0+0x20
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Co-authored-by: Al Grant <al.grant@arm.com>
Co-authored-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Co-authored-by: Tor Jeremiassen <tor@ti.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Eelco Chaudron <echaudro@redhat.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Tanmay Jagdale <tanmay@marvell.com>
Cc: coresight@lists.linaro.org
Cc: zengshun . wu <zengshun.wu@outlook.com>
Link: https://lore.kernel.org/r/20220521130446.4163597-3-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | ||
|  Adrian Hunter | 75659c6fb5 | perf scripts python: intel-pt-events.py: Print ptwrite value as a string if it is ASCII It can be convenient to put a string value into a ptwrite payload as
a quick and easy way to identify what is being printed.
To make that useful, if the Intel ptwrite payload value contains only
printable ASCII characters padded with NULLs, then print it also as a
string.
Using the example program from the "Emulated PTWRITE" section of
tools/perf/Documentation/perf-intel-pt.txt:
 $ echo -n "Hello" | od -t x8
 0000000 0000006f6c6c6548
 0000005
 $ perf record -e intel_pt//u ./eg_ptw 0x0000006f6c6c6548
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.016 MB perf.data ]
 $ perf script --itrace=ew intel-pt-events.py
 Intel PT Branch Trace, Power Events, Event Trace and PTWRITE
      Switch In   38524/38524 [001]     24166.044995916     0/0
           eg_ptw 38524/38524 [001]     24166.045380004   ptwrite  jmp                   IP: 0 payload: 0x6f6c6c6548 Hello     56532c7ce196 perf_emulate_ptwrite+0x16 (/home/ahunter/git/work/eg_ptw)
 End
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220509152400.376613-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | ||
|  Adrian Hunter | 28924a232a | perf scripts python: export-to-postgresql.py: Export all sample flags Add sample flags to the PostgreSQL database definition and export. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20220124084201.2699795-25-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | ||
|  Adrian Hunter | 761836cb87 | perf scripts python: export-to-sqlite.py: Export all sample flags Add sample flags to the SQLite database definition and export. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20220124084201.2699795-24-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | ||
|  Adrian Hunter | 95f9bfcf84 | perf scripts python: intel-pt-events.py: Add Event Trace Add Event Trace to the intel-pt-events.py script. This shows how to unpack the raw data from the new sample events in a Python script. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20220124084201.2699795-22-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | ||
|  Adrian Hunter | 0f80bfbf49 | perf scripts python: intel-pt-events.py: Fix printing of switch events The intel-pt-events.py script displays only the last of consecutive switch
statements but that may not be the last switch event for the CPU. Fix by
keeping a dictionary of last context switch keyed by CPU, and make it
possible to see all switch events by adding option --all-switch-events.
Fixes:  | ||
|  Michael Petlan | 51ae7fa62d | perf scripts python: Fix passing arguments to stackcollapse report The '--' prevented arguments from being passed to the script, such as: $ perf script report stackcollapse -i my_perf.data Signed-off-by: Michael Petlan <mpetlan@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> LPU-Reference: 20200427142327.21172-1-mpetlan@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | ||
|  Andreas Gerstmayr | c611e4f24c | perf flamegraph: flamegraph.py script improvements * display perf.data header * display PIDs of user stacks * added option to change color scheme * default to blue/green color scheme to improve accessibility * correctly identify kernel stacks when kernel-debuginfo is installed Signed-off-by: Andreas Gerstmayr <agerstmayr@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210830164729.116049-1-agerstmayr@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | ||
|  Adrian Hunter | a483e64c0b | perf scripting python: intel-pt-events.py: Add --insn-trace and --src-trace Add an instruction trace and a source trace to the intel-pt-events.py script. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20210530192308.7382-14-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | ||
|  Adrian Hunter | 2b87386c7a | perf scripting python: exported-sql-viewer.py: Factor out libxed.py Factor out libxed.py so it can be reused. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20210530192308.7382-13-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | ||
|  Adrian Hunter | e79457a526 | perf scripting python: Add perf_sample_srcline() and perf_sample_srccode() Add perf_sample_srcline() and perf_sample_srccode() to the perf_trace_context module so that a script can get the srcline or srccode information. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20210530192308.7382-11-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |