mirror of
				git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
				synced 2025-09-04 20:19:47 +08:00 
			
		
		
		
	 99f753f048
			
		
	
	
		99f753f048
		
	
	
	
	
		
			
			Add a ftrace style --graph-function argument to 'perf script' that
allows to print itrace function calls only below a given function. This
makes it easier to find the code of interest in a large trace.
% perf record -e intel_pt//k -a sleep 1
% perf script --graph-function group_sched_in --call-trace
            perf   900 [000] 194167.205652203: ([kernel.kallsyms])          group_sched_in
            perf   900 [000] 194167.205652203: ([kernel.kallsyms])              __x86_indirect_thunk_rax
            perf   900 [000] 194167.205652203: ([kernel.kallsyms])              event_sched_in.isra.107
            perf   900 [000] 194167.205652203: ([kernel.kallsyms])                  perf_event_set_state.part.71
            perf   900 [000] 194167.205652203: ([kernel.kallsyms])                      perf_event_update_time
            perf   900 [000] 194167.205652203: ([kernel.kallsyms])                  perf_pmu_disable
            perf   900 [000] 194167.205652203: ([kernel.kallsyms])                  perf_log_itrace_start
            perf   900 [000] 194167.205652203: ([kernel.kallsyms])                  __x86_indirect_thunk_rax
            perf   900 [000] 194167.205652203: ([kernel.kallsyms])                      perf_event_update_userpage
            perf   900 [000] 194167.205652203: ([kernel.kallsyms])                          calc_timer_values
            perf   900 [000] 194167.205652203: ([kernel.kallsyms])                              sched_clock_cpu
            perf   900 [000] 194167.205652203: ([kernel.kallsyms])                          __x86_indirect_thunk_rax
            perf   900 [000] 194167.205652203: ([kernel.kallsyms])                          arch_perf_update_userpage
            perf   900 [000] 194167.205652203: ([kernel.kallsyms])                              __fentry__
            perf   900 [000] 194167.205652203: ([kernel.kallsyms])                              using_native_sched_clock
            perf   900 [000] 194167.205652203: ([kernel.kallsyms])                              sched_clock_stable
            perf   900 [000] 194167.205652203: ([kernel.kallsyms])                  perf_pmu_enable
            perf   900 [000] 194167.205652203: ([kernel.kallsyms])              __x86_indirect_thunk_rax
         swapper     0 [001] 194167.205660693: ([kernel.kallsyms])          group_sched_in
         swapper     0 [001] 194167.205660693: ([kernel.kallsyms])              __x86_indirect_thunk_rax
         swapper     0 [001] 194167.205660693: ([kernel.kallsyms])              event_sched_in.isra.107
         swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                  perf_event_set_state.part.71
         swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                      perf_event_update_time
         swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                  perf_pmu_disable
         swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                  perf_log_itrace_start
         swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                  __x86_indirect_thunk_rax
         swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                      perf_event_update_userpage
         swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                          calc_timer_values
         swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                              sched_clock_cpu
         swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                          __x86_indirect_thunk_rax
         swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                          arch_perf_update_userpage
         swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                              __fentry__
         swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                              using_native_sched_clock
         swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                              sched_clock_stable
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Kim Phillips <kim.phillips@arm.com>
Link: http://lkml.kernel.org/r/20180920180540.14039-5-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
		
	
			
		
			
				
	
	
		
			408 lines
		
	
	
		
			14 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
			
		
		
	
	
			408 lines
		
	
	
		
			14 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
| perf-script(1)
 | |
| =============
 | |
| 
 | |
| NAME
 | |
| ----
 | |
| perf-script - Read perf.data (created by perf record) and display trace output
 | |
| 
 | |
| SYNOPSIS
 | |
| --------
 | |
| [verse]
 | |
| 'perf script' [<options>]
 | |
| 'perf script' [<options>] record <script> [<record-options>] <command>
 | |
| 'perf script' [<options>] report <script> [script-args]
 | |
| 'perf script' [<options>] <script> <required-script-args> [<record-options>] <command>
 | |
| 'perf script' [<options>] <top-script> [script-args]
 | |
| 
 | |
| DESCRIPTION
 | |
| -----------
 | |
| This command reads the input file and displays the trace recorded.
 | |
| 
 | |
| There are several variants of perf script:
 | |
| 
 | |
|   'perf script' to see a detailed trace of the workload that was
 | |
|   recorded.
 | |
| 
 | |
|   You can also run a set of pre-canned scripts that aggregate and
 | |
|   summarize the raw trace data in various ways (the list of scripts is
 | |
|   available via 'perf script -l').  The following variants allow you to
 | |
|   record and run those scripts:
 | |
| 
 | |
|   'perf script record <script> <command>' to record the events required
 | |
|   for 'perf script report'.  <script> is the name displayed in the
 | |
|   output of 'perf script --list' i.e. the actual script name minus any
 | |
|   language extension.  If <command> is not specified, the events are
 | |
|   recorded using the -a (system-wide) 'perf record' option.
 | |
| 
 | |
|   'perf script report <script> [args]' to run and display the results
 | |
|   of <script>.  <script> is the name displayed in the output of 'perf
 | |
|   script --list' i.e. the actual script name minus any language
 | |
|   extension.  The perf.data output from a previous run of 'perf script
 | |
|   record <script>' is used and should be present for this command to
 | |
|   succeed.  [args] refers to the (mainly optional) args expected by
 | |
|   the script.
 | |
| 
 | |
|   'perf script <script> <required-script-args> <command>' to both
 | |
|   record the events required for <script> and to run the <script>
 | |
|   using 'live-mode' i.e. without writing anything to disk.  <script>
 | |
|   is the name displayed in the output of 'perf script --list' i.e. the
 | |
|   actual script name minus any language extension.  If <command> is
 | |
|   not specified, the events are recorded using the -a (system-wide)
 | |
|   'perf record' option.  If <script> has any required args, they
 | |
|   should be specified before <command>.  This mode doesn't allow for
 | |
|   optional script args to be specified; if optional script args are
 | |
|   desired, they can be specified using separate 'perf script record'
 | |
|   and 'perf script report' commands, with the stdout of the record step
 | |
|   piped to the stdin of the report script, using the '-o -' and '-i -'
 | |
|   options of the corresponding commands.
 | |
| 
 | |
|   'perf script <top-script>' to both record the events required for
 | |
|   <top-script> and to run the <top-script> using 'live-mode'
 | |
|   i.e. without writing anything to disk.  <top-script> is the name
 | |
|   displayed in the output of 'perf script --list' i.e. the actual
 | |
|   script name minus any language extension; a <top-script> is defined
 | |
|   as any script name ending with the string 'top'.
 | |
| 
 | |
|   [<record-options>] can be passed to the record steps of 'perf script
 | |
|   record' and 'live-mode' variants; this isn't possible however for
 | |
|   <top-script> 'live-mode' or 'perf script report' variants.
 | |
| 
 | |
|   See the 'SEE ALSO' section for links to language-specific
 | |
|   information on how to write and run your own trace scripts.
 | |
| 
 | |
| OPTIONS
 | |
| -------
 | |
| <command>...::
 | |
| 	Any command you can specify in a shell.
 | |
| 
 | |
| -D::
 | |
| --dump-raw-trace=::
 | |
|         Display verbose dump of the trace data.
 | |
| 
 | |
| -L::
 | |
| --Latency=::
 | |
|         Show latency attributes (irqs/preemption disabled, etc).
 | |
| 
 | |
| -l::
 | |
| --list=::
 | |
|         Display a list of available trace scripts.
 | |
| 
 | |
| -s ['lang']::
 | |
| --script=::
 | |
|         Process trace data with the given script ([lang]:script[.ext]).
 | |
| 	If the string 'lang' is specified in place of a script name, a
 | |
|         list of supported languages will be displayed instead.
 | |
| 
 | |
| -g::
 | |
| --gen-script=::
 | |
|         Generate perf-script.[ext] starter script for given language,
 | |
|         using current perf.data.
 | |
| 
 | |
| -a::
 | |
|         Force system-wide collection.  Scripts run without a <command>
 | |
|         normally use -a by default, while scripts run with a <command>
 | |
|         normally don't - this option allows the latter to be run in
 | |
|         system-wide mode.
 | |
| 
 | |
| -i::
 | |
| --input=::
 | |
|         Input file name. (default: perf.data unless stdin is a fifo)
 | |
| 
 | |
| -d::
 | |
| --debug-mode::
 | |
|         Do various checks like samples ordering and lost events.
 | |
| 
 | |
| -F::
 | |
| --fields::
 | |
|         Comma separated list of fields to print. Options are:
 | |
|         comm, tid, pid, time, cpu, event, trace, ip, sym, dso, addr, symoff,
 | |
|         srcline, period, iregs, uregs, brstack, brstacksym, flags, bpf-output, brstackinsn,
 | |
|         brstackoff, callindent, insn, insnlen, synth, phys_addr, metric, misc.
 | |
|         Field list can be prepended with the type, trace, sw or hw,
 | |
|         to indicate to which event type the field list applies.
 | |
|         e.g., -F sw:comm,tid,time,ip,sym  and -F trace:time,cpu,trace
 | |
| 
 | |
| 		perf script -F <fields>
 | |
| 
 | |
| 	is equivalent to:
 | |
| 
 | |
| 		perf script -F trace:<fields> -F sw:<fields> -F hw:<fields>
 | |
| 
 | |
| 	i.e., the specified fields apply to all event types if the type string
 | |
| 	is not given.
 | |
| 
 | |
| 	In addition to overriding fields, it is also possible to add or remove
 | |
| 	fields from the defaults. For example
 | |
| 
 | |
| 		-F -cpu,+insn
 | |
| 
 | |
| 	removes the cpu field and adds the insn field. Adding/removing fields
 | |
| 	cannot be mixed with normal overriding.
 | |
| 
 | |
| 	The arguments are processed in the order received. A later usage can
 | |
| 	reset a prior request. e.g.:
 | |
| 
 | |
| 		-F trace: -F comm,tid,time,ip,sym
 | |
| 
 | |
| 	The first -F suppresses trace events (field list is ""), but then the
 | |
| 	second invocation sets the fields to comm,tid,time,ip,sym. In this case a
 | |
| 	warning is given to the user:
 | |
| 
 | |
| 		"Overriding previous field request for all events."
 | |
| 
 | |
| 	Alternatively, consider the order:
 | |
| 
 | |
| 		-F comm,tid,time,ip,sym -F trace:
 | |
| 
 | |
| 	The first -F sets the fields for all events and the second -F
 | |
| 	suppresses trace events. The user is given a warning message about
 | |
| 	the override, and the result of the above is that only S/W and H/W
 | |
| 	events are displayed with the given fields.
 | |
| 
 | |
| 	For the 'wildcard' option if a user selected field is invalid for an
 | |
| 	event type, a message is displayed to the user that the option is
 | |
| 	ignored for that type. For example:
 | |
| 
 | |
| 		$ perf script -F comm,tid,trace
 | |
| 		'trace' not valid for hardware events. Ignoring.
 | |
| 		'trace' not valid for software events. Ignoring.
 | |
| 
 | |
| 	Alternatively, if the type is given an invalid field is specified it
 | |
| 	is an error. For example:
 | |
| 
 | |
|         perf script -v -F sw:comm,tid,trace
 | |
|         'trace' not valid for software events.
 | |
| 
 | |
| 	At this point usage is displayed, and perf-script exits.
 | |
| 
 | |
| 	The flags field is synthesized and may have a value when Instruction
 | |
| 	Trace decoding. The flags are "bcrosyiABEx" which stand for branch,
 | |
| 	call, return, conditional, system, asynchronous, interrupt,
 | |
| 	transaction abort, trace begin, trace end, and in transaction,
 | |
| 	respectively. Known combinations of flags are printed more nicely e.g.
 | |
| 	"call" for "bc", "return" for "br", "jcc" for "bo", "jmp" for "b",
 | |
| 	"int" for "bci", "iret" for "bri", "syscall" for "bcs", "sysret" for "brs",
 | |
| 	"async" for "by", "hw int" for "bcyi", "tx abrt" for "bA", "tr strt" for "bB",
 | |
| 	"tr end" for "bE". However the "x" flag will be display separately in those
 | |
| 	cases e.g. "jcc     (x)" for a condition branch within a transaction.
 | |
| 
 | |
| 	The callindent field is synthesized and may have a value when
 | |
| 	Instruction Trace decoding. For calls and returns, it will display the
 | |
| 	name of the symbol indented with spaces to reflect the stack depth.
 | |
| 
 | |
| 	When doing instruction trace decoding insn and insnlen give the
 | |
| 	instruction bytes and the instruction length of the current
 | |
| 	instruction.
 | |
| 
 | |
| 	The synth field is used by synthesized events which may be created when
 | |
| 	Instruction Trace decoding.
 | |
| 
 | |
| 	Finally, a user may not set fields to none for all event types.
 | |
| 	i.e., -F "" is not allowed.
 | |
| 
 | |
| 	The brstack output includes branch related information with raw addresses using the
 | |
| 	/v/v/v/v/cycles syntax in the following order:
 | |
| 	FROM: branch source instruction
 | |
| 	TO  : branch target instruction
 | |
|         M/P/-: M=branch target mispredicted or branch direction was mispredicted, P=target predicted or direction predicted, -=not supported
 | |
| 	X/- : X=branch inside a transactional region, -=not in transaction region or not supported
 | |
| 	A/- : A=TSX abort entry, -=not aborted region or not supported
 | |
| 	cycles
 | |
| 
 | |
| 	The brstacksym is identical to brstack, except that the FROM and TO addresses are printed in a symbolic form if possible.
 | |
| 
 | |
| 	When brstackinsn is specified the full assembler sequences of branch sequences for each sample
 | |
| 	is printed. This is the full execution path leading to the sample. This is only supported when the
 | |
| 	sample was recorded with perf record -b or -j any.
 | |
| 
 | |
| 	The brstackoff field will print an offset into a specific dso/binary.
 | |
| 
 | |
| 	With the metric option perf script can compute metrics for
 | |
| 	sampling periods, similar to perf stat. This requires
 | |
| 	specifying a group with multiple metrics with the :S option
 | |
| 	for perf record. perf will sample on the first event, and
 | |
| 	compute metrics for all the events in the group. Please note
 | |
| 	that the metric computed is averaged over the whole sampling
 | |
| 	period, not just for the sample point.
 | |
| 
 | |
| 	For sample events it's possible to display misc field with -F +misc option,
 | |
| 	following letters are displayed for each bit:
 | |
| 
 | |
| 	  PERF_RECORD_MISC_KERNEL               K
 | |
| 	  PERF_RECORD_MISC_USER                 U
 | |
| 	  PERF_RECORD_MISC_HYPERVISOR           H
 | |
| 	  PERF_RECORD_MISC_GUEST_KERNEL         G
 | |
| 	  PERF_RECORD_MISC_GUEST_USER           g
 | |
| 	  PERF_RECORD_MISC_MMAP_DATA*           M
 | |
| 	  PERF_RECORD_MISC_COMM_EXEC            E
 | |
| 	  PERF_RECORD_MISC_SWITCH_OUT           S
 | |
| 	  PERF_RECORD_MISC_SWITCH_OUT_PREEMPT   Sp
 | |
| 
 | |
| 	  $ perf script -F +misc ...
 | |
| 	   sched-messaging  1414 K     28690.636582:       4590 cycles ...
 | |
| 	   sched-messaging  1407 U     28690.636600:     325620 cycles ...
 | |
| 	   sched-messaging  1414 K     28690.636608:      19473 cycles ...
 | |
| 	  misc field ___________/
 | |
| 
 | |
| -k::
 | |
| --vmlinux=<file>::
 | |
|         vmlinux pathname
 | |
| 
 | |
| --kallsyms=<file>::
 | |
|         kallsyms pathname
 | |
| 
 | |
| --symfs=<directory>::
 | |
|         Look for files with symbols relative to this directory.
 | |
| 
 | |
| -G::
 | |
| --hide-call-graph::
 | |
|         When printing symbols do not display call chain.
 | |
| 
 | |
| --stop-bt::
 | |
|         Stop display of callgraph at these symbols
 | |
| 
 | |
| -C::
 | |
| --cpu:: Only report samples for the list of CPUs provided. Multiple CPUs can
 | |
| 	be provided as a comma-separated list with no space: 0,1. Ranges of
 | |
| 	CPUs are specified with -: 0-2. Default is to report samples on all
 | |
| 	CPUs.
 | |
| 
 | |
| -c::
 | |
| --comms=::
 | |
| 	Only display events for these comms. CSV that understands
 | |
| 	file://filename entries.
 | |
| 
 | |
| --pid=::
 | |
| 	Only show events for given process ID (comma separated list).
 | |
| 
 | |
| --tid=::
 | |
| 	Only show events for given thread ID (comma separated list).
 | |
| 
 | |
| -I::
 | |
| --show-info::
 | |
| 	Display extended information about the perf.data file. This adds
 | |
| 	information which may be very large and thus may clutter the display.
 | |
| 	It currently includes: cpu and numa topology of the host system.
 | |
| 	It can only be used with the perf script report mode.
 | |
| 
 | |
| --show-kernel-path::
 | |
| 	Try to resolve the path of [kernel.kallsyms]
 | |
| 
 | |
| --show-task-events
 | |
| 	Display task related events (e.g. FORK, COMM, EXIT).
 | |
| 
 | |
| --show-mmap-events
 | |
| 	Display mmap related events (e.g. MMAP, MMAP2).
 | |
| 
 | |
| --show-namespace-events
 | |
| 	Display namespace events i.e. events of type PERF_RECORD_NAMESPACES.
 | |
| 
 | |
| --show-switch-events
 | |
| 	Display context switch events i.e. events of type PERF_RECORD_SWITCH or
 | |
| 	PERF_RECORD_SWITCH_CPU_WIDE.
 | |
| 
 | |
| --show-lost-events
 | |
| 	Display lost events i.e. events of type PERF_RECORD_LOST.
 | |
| 
 | |
| --show-round-events
 | |
| 	Display finished round events i.e. events of type PERF_RECORD_FINISHED_ROUND.
 | |
| 
 | |
| --demangle::
 | |
| 	Demangle symbol names to human readable form. It's enabled by default,
 | |
| 	disable with --no-demangle.
 | |
| 
 | |
| --demangle-kernel::
 | |
| 	Demangle kernel symbol names to human readable form (for C++ kernels).
 | |
| 
 | |
| --header
 | |
| 	Show perf.data header.
 | |
| 
 | |
| --header-only
 | |
| 	Show only perf.data header.
 | |
| 
 | |
| --itrace::
 | |
| 	Options for decoding instruction tracing data. The options are:
 | |
| 
 | |
| include::itrace.txt[]
 | |
| 
 | |
| 	To disable decoding entirely, use --no-itrace.
 | |
| 
 | |
| --full-source-path::
 | |
| 	Show the full path for source files for srcline output.
 | |
| 
 | |
| --max-stack::
 | |
|         Set the stack depth limit when parsing the callchain, anything
 | |
|         beyond the specified depth will be ignored. This is a trade-off
 | |
|         between information loss and faster processing especially for
 | |
|         workloads that can have a very long callchain stack.
 | |
|         Note that when using the --itrace option the synthesized callchain size
 | |
|         will override this value if the synthesized callchain size is bigger.
 | |
| 
 | |
|         Default: 127
 | |
| 
 | |
| --ns::
 | |
| 	Use 9 decimal places when displaying time (i.e. show the nanoseconds)
 | |
| 
 | |
| -f::
 | |
| --force::
 | |
| 	Don't do ownership validation.
 | |
| 
 | |
| --time::
 | |
| 	Only analyze samples within given time window: <start>,<stop>. Times
 | |
| 	have the format seconds.microseconds. If start is not given (i.e., time
 | |
| 	string is ',x.y') then analysis starts at the beginning of the file. If
 | |
| 	stop time is not given (i.e, time string is 'x.y,') then analysis goes
 | |
| 	to end of file.
 | |
| 
 | |
| 	Also support time percent with multipe time range. Time string is
 | |
| 	'a%/n,b%/m,...' or 'a%-b%,c%-%d,...'.
 | |
| 
 | |
| 	For example:
 | |
| 	Select the second 10% time slice:
 | |
| 	perf script --time 10%/2
 | |
| 
 | |
| 	Select from 0% to 10% time slice:
 | |
| 	perf script --time 0%-10%
 | |
| 
 | |
| 	Select the first and second 10% time slices:
 | |
| 	perf script --time 10%/1,10%/2
 | |
| 
 | |
| 	Select from 0% to 10% and 30% to 40% slices:
 | |
| 	perf script --time 0%-10%,30%-40%
 | |
| 
 | |
| --max-blocks::
 | |
| 	Set the maximum number of program blocks to print with brstackasm for
 | |
| 	each sample.
 | |
| 
 | |
| --per-event-dump::
 | |
| 	Create per event files with a "perf.data.EVENT.dump" name instead of
 | |
|         printing to stdout, useful, for instance, for generating flamegraphs.
 | |
| 
 | |
| --inline::
 | |
| 	If a callgraph address belongs to an inlined function, the inline stack
 | |
| 	will be printed. Each entry has function name and file/line. Enabled by
 | |
| 	default, disable with --no-inline.
 | |
| 
 | |
| --insn-trace::
 | |
| 	Show instruction stream for intel_pt traces. Combine with --xed to
 | |
| 	show disassembly.
 | |
| 
 | |
| --xed::
 | |
| 	Run xed disassembler on output. Requires installing the xed disassembler.
 | |
| 
 | |
| --call-trace::
 | |
| 	Show call stream for intel_pt traces. The CPUs are interleaved, but
 | |
| 	can be filtered with -C.
 | |
| 
 | |
| --call-ret-trace::
 | |
| 	Show call and return stream for intel_pt traces.
 | |
| 
 | |
| --graph-function::
 | |
| 	For itrace only show specified functions and their callees for
 | |
| 	itrace. Multiple functions can be separated by comma.
 | |
| 
 | |
| SEE ALSO
 | |
| --------
 | |
| linkperf:perf-record[1], linkperf:perf-script-perl[1],
 | |
| linkperf:perf-script-python[1]
 |