It's not displayed in TUI now, putting it into generic part.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-5fk88kejqgi50ye7xdkhiloz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If write_backward attribute is set, records are written into kernel
ring buffer from end to beginning, but read from beginning to end.
To avoid 'XX out of order events recorded' warning message (timestamps
of records is in reverse order when using write_backward), suppress the
warning message if write_backward is selected by at lease one event.
Result:
Before this patch:
# perf record -m 1 -e raw_syscalls:sys_exit/overwrite/ \
-e raw_syscalls:sys_enter \
dd if=/dev/zero of=/dev/null count=300
300+0 records in
300+0 records out
153600 bytes (154 kB) copied, 0.000601617 s, 255 MB/s
[ perf record: Woken up 5 times to write data ]
Warning:
40 out of order events recorded.
[ perf record: Captured and wrote 0.096 MB perf.data (696 samples) ]
After this patch:
# perf record -m 1 -e raw_syscalls:sys_exit/overwrite/ \
-e raw_syscalls:sys_enter \
dd if=/dev/zero of=/dev/null count=300
300+0 records in
300+0 records out
153600 bytes (154 kB) copied, 0.000644873 s, 238 MB/s
[ perf record: Woken up 5 times to write data ]
[ perf record: Captured and wrote 0.096 MB perf.data (696 samples) ]
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1468485287-33422-15-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To match the semantics for list.h in the kernel, that are used to
implement those macros.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-qbcjlgj0ffxquxscahbpddi3@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
And do nothing, just like free(), to avoid having to test it in callers,
usually in error paths.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-dyuupcj0hnoyt96vma8b3anv@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-5i07ivw1yjsweb7gztr255jd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
ordered_events__free() leaves linked lists and timestamps not cleared,
so unable to be reused after ordered_events__free(). Which is inconvenient
after 'perf record' supports generating multiple perf.data output and
process build-ids for each of them.
Use ordered_events__reinit() for this.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1460535673-159866-2-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
[ Split from larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Those were converted to be evsel methods long ago, move the
source to where it belongs.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-vja8rjmkw3gd5ungaeyb5s2j@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The fprintf_sym() and fprintf_callchain() methods now allow users to
change the existing behaviour of showing "[unknown]" as the name of
unresolved symbols to instead show "[0x123456]", i.e. its address.
The current patch doesn't change tools to use this facility, the results
from 'perf trace' and 'perf script' cotinue like:
70.109 ( 0.001 ms): qemu-system-x8/10153 poll(ufds: 0x7f2d93ffe870, nfds: 1) = 0 Timeout
[unknown] (/usr/lib64/libc-2.22.so)
[unknown] (/usr/lib64/libspice-server.so.1.10.0)
[unknown] (/usr/lib64/libspice-server.so.1.10.0)
[unknown] (/usr/lib64/libspice-server.so.1.10.0)
start_thread+0xca (/usr/lib64/libpthread-2.22.so)
__clone+0x6d (/usr/lib64/libc-2.22.so)
The next patch will make 'perf trace' use the new formatting.
Suggested-by: Milian Wolff <milian.wolff@kdab.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-fja1ods5vqpg42mdz09xcz3r@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In 'perf trace' we're just interested in printing callchains, and we
don't want to use the symbol_conf.use_callchain, so move the callchain
part to a new method.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-kcn3romzivcpxb3u75s9nz33@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As it receives a FILE, and its more than just the IP, which can even be
requested not to be printed.
For consistency with other similar methods in tools/perf/, name it as
perf_evsel__fprintf_sym() and make it return the number of bytes
printed, just like 'fprintf(3)'
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-84gawlqa3lhk63nf0t9vnqnn@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For callchains, etc where we want it to align just below the syscall
name, for instance, in 'perf trace'
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-uk9ekchd67651c625ltaur5y@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Intel PT uses the time members from the perf_event_mmap_page to convert
between TSC and perf time.
Due to a lack of foresight when Intel PT was implemented, those time
members were recorded in the (implementation dependent) AUXTRACE_INFO
event, the structure of which is generally inaccessible outside of the
Intel PT decoder. However now the conversion between TSC and perf time
is needed when processing a jitdump file when Intel PT has been used for
tracing.
So add a user event to record the time members. 'perf record' will
synthesize the event if the information is available. And session
processing will put a copy of the event on the session so that tools
like 'perf inject' can easily access it.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1457426324-30158-1-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To avoid parsing event->header.misc in many locations.
This will also allow setting perf.sample.{ip,cpumode} in a single place,
from tracepoint fields, as needed by 'perf kvm' with PPC guests, where
the guest hardware counters is not available at the host.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-qp3yradhyt6q3wl895b1aat0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Some of the stubs are identical so just have one function for them.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1457005856-6143-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
While recording guest samples in host using perf kvm record, it will
populate unprocessable sample error, though samples will be recorded
properly. While generating report using perf kvm report, no samples will
be processed and same error will populate. We have seen this behaviour
with upstream perf(4.4-rc3) on x86 and ppc64 hardware.
Reason behind this failure is, when it tries to fetch machine from
rb_tree of machines, it fails. As a part of tracing a bug, we figured
out that this code was incorrectly refactored in commit 54245fdc35
("perf session: Remove wrappers to machines__find").
This patch will change the functionality such that if it can't fetch
machine in first trial, it will create one node of machine and add that to
rb_tree. So next time when it tries to fetch same machine from rb_tree,
it won't fail. Actually it was the case before refactoring of code in
aforementioned commit.
This patch is generated from acme perf/core branch.
Below I've mention an example that demonstrate the behaviour before and
after applying patch.
Before applying patch:
[Note: One needs to run guest before recording data in host]
ravi@ravi-bangoria:~$ ./perf kvm record -a
Warning:
5903 unprocessable samples recorded.
Do you have a KVM guest running and not using 'perf kvm'?
[ perf record: Captured and wrote 1.409 MB perf.data.guest (285 samples) ]
ravi@ravi-bangoria:~$ ./perf kvm report --stdio
Warning:
5903 unprocessable samples recorded.
Do you have a KVM guest running and not using 'perf kvm'?
# To display the perf.data header info, please use --header/--header-only options.
#
# Total Lost Samples: 0
#
# Samples: 285 of event 'cycles'
# Event count (approx.): 88715406
#
# Overhead Command Shared Object Symbol
# ........ ....... ............. ......
#
# (For a higher level overview, try: perf report --sort comm,dso)
#
After applying patch:
ravi@ravi-bangoria:~$ ./perf kvm record -a
[ perf record: Captured and wrote 1.188 MB perf.data.guest (17 samples) ]
ravi@ravi-bangoria:~$ ./perf kvm report --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
# Total Lost Samples: 0
#
# Samples: 17 of event 'cycles'
# Event count (approx.): 700746
#
# Overhead Command Shared Object Symbol
# ........ ....... ................ ......................
#
34.19% :5758 [unknown] [g] 0xffffffff818682ab
22.79% :5758 [unknown] [g] 0xffffffff812dc7f8
22.79% :5758 [unknown] [g] 0xffffffff818650d0
14.83% :5758 [unknown] [g] 0xffffffff8161a1b6
2.49% :5758 [unknown] [g] 0xffffffff818692bf
0.48% :5758 [unknown] [g] 0xffffffff81869253
0.05% :5758 [unknown] [g] 0xffffffff81869250
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org # v3.19+
Fixes: 54245fdc35 ("perf session: Remove wrappers to machines__find")
Link: http://lkml.kernel.org/r/1449471302-11283-1-git-send-email-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Disabling all non stat related features.
Also as we now enable STAT feature in the data file, adding code to
instruct session open to skip sample type checking for stat data files.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1446734469-11352-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The 'perf report -D' command will now display detailed output for these
newly added events:
event_update
thread_map
cpu_map
stat
stat_config
stat_round
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1445784728-21732-27-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It'll serve as a base event for additional event attributes details,
that are not part of the attr event.
At the moment this event is just a dummy one without any specific
functionality. The type value will distinguish the update event details.
It'll come in the following patches.
The idea for this event is to be extensible for any update that the
event might need in the future.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1445784728-21732-21-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding the stat round event to be stored after each stat interval round,
so that report tools (report/script) gets notified and process interval
data.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1445784728-21732-18-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding a stat event to store a 'struct perf_counter_values' for a given
event/cpu/thread.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1445784728-21732-15-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding the stat config event to pass/store stat config data, so report
tools (report/script) know how to interpret stat data.
The config data is stored in a 'tag|value' way to allow for easy
extension and backwards compatibility.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1445784728-21732-12-git-send-email-jolsa@kernel.org
[ stat_config_term_event -> stat_config_event_entry ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding the cpu_map event to pass/store cpu maps as data in
a pipe/perf.data.
We store maps in 2 formats:
- list of cpus
- mask of cpus
The format that takes less space is selected transparently in the
following patch.
The interface is made generic, so we could add the cpumap event data
into another event in the following patches.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1445784728-21732-8-git-send-email-jolsa@kernel.org
[ cpu_map_data_cpus -> cpu_map_entries, cpu_map_data_mask -> cpu_map_mask ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding the thread_map event to pass/store thread maps as data in
the pipe/perf.data.
Storing the thread ID along with the standard comm[16] thread name string.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1445784728-21732-4-git-send-email-jolsa@kernel.org
[ Renamed thread_map_data_event to thread_map_event_entry ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Note that since the thread was already inserted to the session
list, it will be released when the session is released.
Also, in perf_session__register_idle_thread() failure path,
the thread should be put before returning.
Refcnt debugger shows that the perf_session__register_idle_thread
gets the returned thread, but the caller (__cmd_top) does not
put the returned idle thread.
----
==== [0] ====
Unreclaimed thread@0x24e6240
Refcount +1 => 0 at
./perf(thread__new+0xe5) [0x4c8a75]
./perf(machine__findnew_thread+0x9a) [0x4bbdba]
./perf(perf_session__register_idle_thread+0x28) [0x4c63c8]
./perf(cmd_top+0xd7d) [0x43cf6d]
./perf() [0x47ba35]
./perf(main+0x617) [0x4225b7]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f06027c5af5]
./perf() [0x42272d]
Refcount +1 => 1 at
./perf(thread__get+0x2c) [0x4c8bcc]
./perf(machine__findnew_thread+0xee) [0x4bbe0e]
./perf(perf_session__register_idle_thread+0x28) [0x4c63c8]
./perf(cmd_top+0xd7d) [0x43cf6d]
./perf() [0x47ba35]
./perf(main+0x617) [0x4225b7]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f06027c5af5]
./perf() [0x42272d]
Refcount +1 => 2 at
./perf(thread__get+0x2c) [0x4c8bcc]
./perf(machine__findnew_thread+0x112) [0x4bbe32]
./perf(perf_session__register_idle_thread+0x28) [0x4c63c8]
./perf(cmd_top+0xd7d) [0x43cf6d]
./perf() [0x47ba35]
./perf(main+0x617) [0x4225b7]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f06027c5af5]
./perf() [0x42272d]
----
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20151209021122.10245.69707.stgit@localhost.localdomain
[ Drop the refcount in perf_session__register_idle_thread() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Before:
[acme@zoo linux]$ perf evlist
WARNING: The perf.data file's data size field is 0 which is unexpected.
Was the 'perf record' command properly terminated?
non matching sample_type[acme@zoo linux]$
After:
[acme@zoo linux]$ perf evlist
WARNING: The perf.data file's data size field is 0 which is unexpected.
Was the 'perf record' command properly terminated?
non matching sample_type
[acme@zoo linux]$
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-wscok3a2s7yrj8156oc2r6qe@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The perf top didn't add the idle/swapper thread to the machine's thread
list and its comm was displayed as ':0'. Fix it.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1443577526-3240-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
By default 'perf record' will postprocess the perf.data file to
determine build-ids. When that happens, the number of lost perf events
is displayed.
Make that also happen for AUX events.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-7-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If a session contains no events, we can get stuck in an infinite loop in
__perf_session__process_events, with a non-zero file_size and data_offset, but
a zero data_size.
In this case, we can mmap the entirety of the file (consisting of the file and
attribute headers), and fetch_mmaped_event will correctly refuse to read any
(unmapped and non-existent) event headers. This causes
__perf_session__process_events to unmap the file and retry with the exact same
parameters, getting stuck in an infinite loop.
This has been observed to result in an exit-time hang when counting
rare/unschedulable events with perf record, and can be triggered artificially
with the script below:
----
#!/bin/sh
printf "REPRO: launching perf\n";
./perf record -e software/config=9/ sleep 1 &
PERF_PID=$!;
sleep 0.002;
kill -2 $PERF_PID;
printf "REPRO: waiting for perf (%d) to exit...\n" "$PERF_PID";
wait $PERF_PID;
printf "REPRO: perf exited\n";
----
To avoid this, have __perf_session__process_events bail out early when
the file has no data (i.e. it has no events).
Commiter note:
I only managed to reproduce this when setting
/proc/sys/kernel/kptr_restrict to '1' and changing the code to
purposefully not process any samples and no synthesized samples, i.e.
kptr_restrict prevents 'record' from synthesizing the kernel mmaps for
vmlinux + modules and since it is a workload started from perf, we don't
synthesize mmap/comm records for existing threads.
Adrian Hunter managed to reproduce it in his environment tho.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Tested-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1442423929-12253-1-git-send-email-mark.rutland@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The 'struct machine' represents the machine where the samples were/are
being collected, and we also have a 'struct perf_env' with extra details
about such machine, that we were collecting at 'perf.data' creation time
but we also needed when no perf.data file is being used, such as in
'perf top'.
So, get those structs closer together, as they provide a bigger picture
of the sample's environment.
In 'perf session', when the file argument is NULL, we can assume that
the tool is sampling the running machine, so point machine->env to
the global put in place in previous patches, while set it to the
perf_header.env one when reading from a file.
This paves the way for machine->env to be used in
perf_event__preprocess_sample to populate addr_location.socket.
Tested-by: Wang Nan <wangnan0@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-2ajotl0khscutm68exictoy9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Since it can be used separately from 'perf_session' and 'perf_header',
move it to separate include file and object, next csets will try to move
a perf_env__init() routine.
Tested-by: Wang Nan <wangnan0@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-ff2rw99tsn670y1b6gxbwdsi@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Need to check evsel before passing it to dump_sample().
Signed-off-by: Kan Liang <kan.liang@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1441283463-51050-5-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch stores the cpu socket_id and core_id in a perf.data header,
and reads them into the perf_env struct when processing perf.data files.
The changes modifies the CPU_TOPOLOGY section, making sure it is
backward/forward compatible.
The patch checks the section size before reading the core and socket ids.
It never reads data crossing the section boundary. An old perf binary
without this patch can also correctly read the perf.data from a new perf
with this patch.
Because the new info is added at the end of the cpu_topology section, an
old perf tool ignores the extra data.
Examples:
1. New perf with this patch read perf.data from an old perf without the
patch:
$ perf_new report -i perf_old.data --header-only -I
......
# sibling threads : 33
# sibling threads : 34
# sibling threads : 35
# Core ID and Socket ID information is not available
# node0 meminfo : total = 32823872 kB, free = 29315548 kB
# node0 cpu list : 0-17,36-53
......
2. Old perf without the patch reads perf.data from a new perf with the
patch:
$ perf_old report -i perf_new.data --header-only -I
......
# sibling threads : 33
# sibling threads : 34
# sibling threads : 35
# node0 meminfo : total = 32823872 kB, free = 29190932 kB
# node0 cpu list : 0-17,36-53
......
3. New perf read new perf.data:
$ perf_new report -i perf_new.data --header-only -I
......
# sibling threads : 33
# sibling threads : 34
# sibling threads : 35
# CPU 0: Core ID 0, Socket ID 0
# CPU 1: Core ID 1, Socket ID 0
......
# CPU 61: Core ID 10, Socket ID 1
# CPU 62: Core ID 11, Socket ID 1
# CPU 63: Core ID 16, Socket ID 1
# node0 meminfo : total = 32823872 kB, free = 29190932 kB
# node0 cpu list : 0-17,36-53
Signed-off-by: Kan Liang <kan.liang@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1441115893-22006-2-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As it is not necessarily tied to a perf.data file and needs using in
places where a perf_session is not required.
Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Kan Liang <kan.liang@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1440755289-30939-4-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
cycles is a new branch_info field available on some CPUs that indicates
the time deltas between branches in the LBR.
Add a sort key and output code for the cycles to allow to display the
basic block cycles individually in perf report.
We also pass in the cycles for weight when LBRs are processed, which
allows to get global and local weight, to get an estimate of the total
cost.
And also print the cycles information for perf report -D. I also added
printing for the previously missing LBR flags (mispredict etc.)
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1437233094-12844-2-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The semantic associated in tools/perf/ with foo__delete(instance) is to
release all resources referenced by 'instance' members and then release
the memory for 'instance' itself.
The perf_session_env__delete() function isn't doing this, it just does
the first part, but the space used by 'instance' itself isn't freed, as
it is embedded in a larger structure, that will be freed at other stage.
For these cases we se foo__exit(), i.e. the usage is:
void foo__delete(foo)
{
if (foo) {
foo__exit(foo);
free(foo);
}
}
But when we have something like:
struct bar {
struct foo foo;
. . .
}
Then we can't really call foo__delete(&bar.foo), we must have this
instead:
void bar__exit(bar)
{
foo__exit(&bar.foo);
/* free other bar-> resources */
}
void bar__delete(bar)
{
if (bar) {
bar__exit(bar);
free(bar);
}
}
So just rename perf_session_env__delete() to perf_session_env__exit().
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-djbgpcfo5udqptx3q0flwtmk@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Support processing of PERF_RECORD_SWITCH events and
PERF_RECORD_SWITCH_CPU_WIDE events. There is a single
tools callback for them both so that the tool must
check the event type before using the extra members
in PERF_RECORD_SWITCH_CPU_WIDE.
There is still no way to select the events, though.
That is added in a subsequest patch.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1437471846-26995-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We will reuse argv style data in following change to display counters
header showing monitored command line.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1437481927-29538-12-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding refference counting for cpu_map object, so it could be easily
shared among other objects.
Using cpu_map__put instead cpu_map__delete and making cpu_map__delete
static.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1435012588-9007-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When dumping events with 'perf report -D' the event print always starts
with a newline (see dump_event()).
Do the same with the "Aggregated stats" print so that it is not jammed
up against the last event print.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1435045969-15999-2-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
With 'perf report -D' the PERF_RECORD_FINISHED_ROUND event was printed
without a newline, resulting in:
0x91a18 [0x8]: PERF_RECORD_FINISHED_ROUNDAggregated stats
Other events print their details, but PERF_RECORD_FINISHED_ROUND doesn't
have any so just add a print for a newline.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1435045969-15999-1-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The time out to limit the individual proc map processing was hard code
to 500ms. This patch introduce a new option --proc-map-timeout to make
the time limit configurable.
Signed-off-by: Kan Liang <kan.liang@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ying Huang <ying.huang@intel.com>
Link: http://lkml.kernel.org/r/1434549071-25611-2-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
System wide sampling like 'perf top' or 'perf record -a' read all
threads /proc/xxx/maps before sampling. If there are any threads which
generating a keeping growing huge maps, perf will do infinite loop
during synthesizing. Nothing will be sampled.
This patch fixes this issue by adding per-thread timeout to force stop
this kind of endless proc map processing.
PERF_RECORD_MISC_PROC_MAP_PARSE_TIME_OUT is introduced to indicate that
the mmap record are truncated by time out. User will get warning
notification when truncated mmap records are detected.
Reported-by: Ying Huang <ying.huang@intel.com>
Signed-off-by: Kan Liang <kan.liang@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ying Huang <ying.huang@intel.com>
Link: http://lkml.kernel.org/r/1434549071-25611-1-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The thread-stack represents a thread's current stack. When a thread
exits there can still be many functions on the stack e.g. exit() can be
called many levels deep, so all the callers will never return. To get
that information output, the thread-stack must be flushed.
Previously it was assumed the thread-stack would be flushed when the
struct thread was deleted. With thread ref-counting it is no longer
clear when that will be, if ever. So instead explicitly flush all the
thread-stacks at the end of a session.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1432906425-9911-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Following error occurs when trying to use 'perf report' on x86_64 to
cross analysis a perf.data generated by an old perf on a big-endian
machine:
# perf report
*** Error in `/home/w00229757/perf': free(): invalid next size (fast): 0x00000000032c99f0 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x6eeef)[0x7ff6ff7e2eef]
/lib64/libc.so.6(+0x78cae)[0x7ff6ff7eccae]
/lib64/libc.so.6(+0x79987)[0x7ff6ff7ed987]
/path/to/perf[0x4ac734]
/path/to/perf[0x4ac829]
/path/to/perf(perf_header__process_sections+0x129)[0x4ad2c9]
/path/to/perf(perf_session__read_header+0x2e1)[0x4ad9e1]
/path/to/perf(perf_session__new+0x168)[0x4bd458]
/path/to/perf(cmd_report+0xfa0)[0x43eb70]
/path/to/perf[0x47adc3]
/path/to/perf(main+0x5f6)[0x42fd06]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7ff6ff795bd5]
/path/to/perf[0x42fe35]
======= Memory map: ========
[SNIP]
The bug is in perf_event__attr_swap(). It swaps all fields in 'struct
perf_event_attr' without checking whether the swapped field exist or
not. In addition, in read_event_desc() allocs memory for attr according
to size read from perf.data.
Therefore, if the perf.data is collected by an old perf (without
aux_watermark, for example), when perf_event__attr_swap() swaping
attr->aux_watermark it destroy malloc's metadata.
This patch introduces boundary checking in perf_event__attr_swap(). It
adds macros bswap_field_64 and bswap_field_32 into
perf_event__attr_swap() to make it only swap exist fields.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1434534999-85347-1-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Failed in 32bit arch build like this:
CC /opt/h00206996/output/perf/arm32/builtin-record.o
util/session.c: In function ‘perf_session__warn_about_errors’:
util/session.c:1304:9: error: format ‘%lu’ expects argument of type ‘long unsigned int’,
but argument 2 has type ‘long long unsigned int’ [-Werror=format=]
builtin-report.c: In function ‘perf_evlist__tty_browse_hists’:
builtin-report.c:323:2: error: format ‘%lu’ expects argument of type ‘long unsigned int’,
but argument 3 has type ‘u64’ [-Werror=format=]
Replace %lu format strings in warning message with PRIu64 for u64
'total_lost_samples' to fix this problem.
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1434026664-71642-1-git-send-email-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch modifies the perf tool to handle the new RECORD type,
PERF_RECORD_LOST_SAMPLES.
The number of lost-sample events is stored in
.nr_events[PERF_RECORD_LOST_SAMPLES]. The exact number of samples
which the kernel dropped is stored in total_lost_samples.
When the percentage of dropped samples is greater than 5%, a warning
is printed.
Here are some examples:
Eg 1, Recording different frequently-occurring events is safe with the
patch. Only a very low drop rate is associated with such actions.
$ perf record -e '{cycles:p,instructions:p}' -c 20003 --no-time ~/tchain ~/tchain
$ perf report -D | tail
SAMPLE events: 120243
MMAP2 events: 5
LOST_SAMPLES events: 24
FINISHED_ROUND events: 15
cycles:p stats:
TOTAL events: 59348
SAMPLE events: 59348
instructions:p stats:
TOTAL events: 60895
SAMPLE events: 60895
$ perf report --stdio --group
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 24
#
# Samples: 120K of event 'anon group { cycles:p, instructions:p }'
# Event count (approx.): 24048600000
#
# Overhead Command Shared Object Symbol
# ................ ........... ................
..................................
#
99.74% 99.86% tchain_edit tchain_edit [.] f3
0.09% 0.02% tchain_edit tchain_edit [.] f2
0.04% 0.00% tchain_edit [kernel.vmlinux] [k] ixgbe_read_reg
Eg 2, Recording the same thing multiple times can lead to high drop
rate, but it is not a useful configuration.
$ perf record -e '{cycles:p,cycles:p}' -c 20003 --no-time ~/tchain
Warning: Processed 600592 samples and lost 99.73% samples!
[perf record: Woken up 148 times to write data]
[perf record: Captured and wrote 36.922 MB perf.data (1206322 samples)]
[perf record: Woken up 1 times to write data]
[perf record: Captured and wrote 0.121 MB perf.data (1629 samples)]
Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: acme@infradead.org
Cc: eranian@google.com
Link: http://lkml.kernel.org/r/1431285195-14269-9-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
perf_session__peek_event() generally leverages there being a single mmap
of the perf.data file, however on 32-bit platforms when there is more
that 32MiB of data, then there are multiple mmaps, so
perf_session__peek_event() reads from the file.
In that case a couple of bugs were exposed (note how the seg. fault
appears with >32M of data):
$ perf record --per-thread -e intel_bts// ../rtit-tests/loopy 1000000
[ perf record: Woken up 13 times to write data ]
[ perf record: Captured and wrote 24.568 MB perf.data ]
$ perf script > /dev/null
$ perf record --per-thread -e intel_bts// ../rtit-tests/loopy 10000000
[ perf record: Woken up 136 times to write data ]
[ perf record: Captured and wrote 270.794 MB perf.data ]
$ perf script > /dev/null
Segmentation fault (core dumped)
The wrong address was being passed to the readn() function and the
buffer size was not being checked.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Link: http://lkml.kernel.org/r/1432040746-1755-5-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add support for the PERF_RECORD_ITRACE_START event type. This event can
be used to determine the pid and tid that are running when Instruction
Tracing starts. Generally that information would come from a
sched_switch event but, at the start, no sched_switch events may yet
have been recorded.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1430404667-10593-8-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add support for the PERF_RECORD_AUX event type.
PERF_RECORD_AUX is a new kernel event that records when new data lands
in the AUX buffer. Currently it is assumed that AUX data follows the
same ring buffer conventions used by the perf events buffer, and
consequently the AUX event is not processed during recording.
It is processed during session processing so that the information in the
'flags' member is made available.
The format of PERF_RECORD_AUX is outlined in the linux/perf_events.h
header file. The 'flags' are also enumerated.
Intel PT and Intel BTS use the flag named PERF_AUX_FLAG_TRUNCATED to
determine if data has been lost because the buffer became full as perf
was not able to empty it fast enough.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1430404667-10593-7-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add an index of AUX area tracing events within a perf.data file.
perf record uses a special user event PERF_RECORD_FINISHED_ROUND to
enable sorting of events in chunks instead of having to sort all events
altogether.
AUX area tracing events contain data that can span back to the very
beginning of the recording period. i.e. they do not obey the rules of
PERF_RECORD_FINISHED_ROUND.
By adding an index, AUX area tracing events can be found in advance and
the PERF_RECORD_FINISHED_ROUND approach works as usual.
The index is recorded with the auxtrace feature in the perf.data file.
A session reads the index but does not process it. An AUX area decoder
can queue all the AUX area data in advance using
auxtrace_queues__process_index() or otherwise process the index in some
custom manner.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1430404667-10593-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add new AUX area member (aux_watermark) of struct perf_event_attr to
debug prints and byte swapping.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1428594864-29309-27-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add functions to synthesize, count and print AUX area tracing error
events.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1428594864-29309-11-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Hook into session processing so that AUX area decoding can synthesize
events transparently to the tools.
The advantages of transparent decoding are that tools can be used
directly with perf.data files containing AUX area tracing data, which is
easier for the user and more efficient than having a separate decoding
tool.
This will work as follows:
1. Tools will feed auxtrace events to the decoder using
perf_tool->auxtrace() (support for that still to come).
2. The decoder can process side-band events as needed due
to the auxtrace->process_event() hook.
3. The decoder can deliver synthesized events into the
event stream using perf_session__deliver_synth_event().
Note the expectation is that decoding will work on data that is
time-ordered with respect to the per-cpu or per-thread contexts that
were recorded.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1428594864-29309-9-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Errors encountered when decoding an AUX area trace need to be reported
to the user. However the "user" might be a script or another tool, so
provide a new user event to capture those errors.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1428594864-29309-8-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add two user events for AUX area tracing.
PERF_RECORD_AUXTRACE_INFO contains metadata, consisting primarily the
type of the AUX area tracing data plus some amount of
architecture-specific information. There should be only one
PERF_RECORD_AUXTRACE_INFO event.
PERF_RECORD_AUXTRACE identifies AUX area tracing data copied from the
mmapped AUX area tracing region. The actual data is not part of the
event but immediately follows it.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1428594864-29309-4-git-send-email-adrian.hunter@intel.com
[ s/MIN/min/g and use cast to fix up wrt -Werror=sign-compare till we adopt min_t() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch add checks in places where map__kmap is used to get kmaps
from struct kmap.
Error messages are added at map__kmap to warn invalid accessing of kmap
(for the case of !map->dso->kernel, kmap(map) does not exists at all).
Also, introduces map__kmaps() to warn uninitialized kmaps.
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: pi3orama@163.com
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Link: http://lkml.kernel.org/r/1428394966-131044-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As these can be obtained from the ordered_events pointer, via
container_of, reducing the cross section of ordered_samples.
These were added to ordered_samples in:
commit b7b61cbebd
Author: Arnaldo Carvalho de Melo <acme@redhat.com>
Date: Tue Mar 3 11:58:45 2015 -0300
perf ordered_events: Shorten function signatures
By keeping pointers to machines, evlist and tool in ordered_events.
But that was more a transitional patch while moving stuff out from
perf_session.c to ordered_events.c and possibly not even needed by then,
as we could use the container_of() method and instead of having the
nr_unordered_samples stats in events_stats, we can have it in
ordered_samples.
Based-on-a-patch-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-4lk0t9js82g0tfc0x1onpkjt@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Even when it is not used to actually reorder events, some of its fields
are used, like session->ordered_events->tool, to shorten function
signatures where tool, for instance, was being passed, as the tool is
needed for the ordered_events code, we need it there and might as well
use it for other perf_session needs.
This fixes a problem where 'perf script' had some condition that made
session->ordered_events not to be initialized even with its
script->tool ordered_events related flags asking for it to be, which
looks like another bug and needs to be investigated further.
Always initializing session->ordered_events at least leaves the current
assumptions in place, so do it now.
Reported-by: David Ahern <dsahern@gmail.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Tested-by: David Ahern <dsahern@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-b1xxk0rwkz2a0gip1uufmjqg@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
From perf_session, will be used in 'trace'.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-mfihndzaumx44h6y37ng2irb@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It is all about flushing the ordered queue or piping it thru, no need
for a perf_session pointer.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-g47fx3ys0t9271cp0dcabjc7@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
So that we can simplify the deliver method to pass just:
(ordered_events, ordered_event, sample);
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-j0s4bpxs5qza5tnkvjwom9rw@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
By keeping pointers to machines, evlist and tool in ordered_events.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-0c6huyaf59mqtm2ek9pmposl@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For use by tools that are not perf.data based, as maybe 'perf trace' in
live mode.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-nedqe7cmii5w82etfi36urfz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We need to do that to stop accumulating entries in the dead_threads
linked list, i.e. we were keeping references to threads in struct hists
that continue to exist even after a thread exited and was removed from
the machine threads rbtree.
We still keep the dead_threads list, but just for debugging, allowing us
to iterate at any given point over the threads that still are referenced
by things like struct hist_entry.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-3ejvfyed0r7ue61dkurzjux4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
All it wants is session->evlist.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-6w9663gka3jb1j1rfxxd5jcq@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For tools that don't deal with perf.data files, thus do not need to
use perf_session.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-kglq67gvauq9tak02a4se00r@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Start to untangle session from delivering samples, as there are
tools that want to use ordered_events and don't use perf_session at all.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-rn4pk3pjxd78sgzrkn19tktp@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
LBR call stack only has user-space callchains. It is output in the
PERF_SAMPLE_BRANCH_STACK data format. For kernel callchains, it's
still in the form of PERF_SAMPLE_CALLCHAIN.
The perf tool has to handle both data sources to construct a
complete callstack.
For the "perf report -D" option, both lbr and fp information will be
displayed.
A new call chain recording option "lbr" is introduced into the perf
tool for LBR call stack. The user can use --call-graph lbr to get
the call stack information from hardware.
Here are some examples.
When profiling bc(1) on Fedora 19:
echo 'scale=2000; 4*a(1)' > cmd; perf record --call-graph lbr bc -l < cmd
If enabling LBR, perf report output looks like:
50.36% bc bc [.] bc_divide
|
--- bc_divide
execute
run_code
yyparse
main
__libc_start_main
_start
33.66% bc bc [.] _one_mult
|
--- _one_mult
bc_divide
execute
run_code
yyparse
main
__libc_start_main
_start
7.62% bc bc [.] _bc_do_add
|
--- _bc_do_add
|
|--99.89%-- 0x2000186a8
--0.11%-- [...]
6.83% bc bc [.] _bc_do_sub
|
--- _bc_do_sub
|
|--99.94%-- bc_add
| execute
| run_code
| yyparse
| main
| __libc_start_main
| _start
--0.06%-- [...]
0.46% bc libc-2.17.so [.] __memset_sse2
|
--- __memset_sse2
|
|--54.13%-- bc_new_num
| |
| |--51.00%-- bc_divide
| | execute
| | run_code
| | yyparse
| | main
| | __libc_start_main
| | _start
| |
| |--30.46%-- _bc_do_sub
| | bc_add
| | execute
| | run_code
| | yyparse
| | main
| | __libc_start_main
| | _start
| |
| --18.55%-- _bc_do_add
| bc_add
| execute
| run_code
| yyparse
| main
| __libc_start_main
| _start
|
--45.87%-- bc_divide
execute
run_code
yyparse
main
__libc_start_main
_start
If using FP, perf report output looks like:
echo 'scale=2000; 4*a(1)' > cmd; perf record --call-graph fp bc -l < cmd
50.49% bc bc [.] bc_divide
|
--- bc_divide
33.57% bc bc [.] _one_mult
|
--- _one_mult
7.61% bc bc [.] _bc_do_add
|
--- _bc_do_add
0x2000186a8
6.88% bc bc [.] _bc_do_sub
|
--- _bc_do_sub
0.42% bc libc-2.17.so [.] __memcpy_ssse3_back
|
--- __memcpy_ssse3_back
If using LBR, perf report -D output looks like:
3458145275743 0x2fd750 [0xd8]: PERF_RECORD_SAMPLE(IP, 0x2): 9748/9748: 0x408ea8 period: 609644 addr: 0
... LBR call chain: nr:8
..... 0: fffffffffffffe00
..... 1: 0000000000408e50
..... 2: 000000000040a458
..... 3: 000000000040562e
..... 4: 0000000000408590
..... 5: 00000000004022c0
..... 6: 00000000004015dd
..... 7: 0000003d1cc21b43
... FP chain: nr:2
..... 0: fffffffffffffe00
..... 1: 0000000000408ea8
... thread: bc:9748
...... dso: /usr/bin/bc
The LBR call stack has the following known limitations:
- Zero length calls are not filtered out by the hardware
- Exception handing such as setjmp/longjmp will have calls/returns not
match
- Pushing different return address onto the stack will have
calls/returns not match
- If callstack is deeper than the LBR, only the last entries are
captured
Tested-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Simon Que <sque@chromium.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1420482185-29830-3-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
It's only used for perf record to process build-id because its file size
it's not fixed at this time due to remaining header features.
However data offset and size is available so that we can use the
perf_session__process_events() once we set the file size as the current
offset like for now.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1422518843-25818-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull leftover perf fixes from Ingo Molnar:
"Two perf fixes left over from the previous cycle"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf session: Do not fail on processing out of order event
x86/asm/traps: Disable tracing and kprobes in fixup_bad_iret and sync_regs
Linus reported perf report command being interrupted due to processing
of 'out of order' event, with following error:
Timestamp below last timeslice flush
0x5733a8 [0x28]: failed to process type: 3
I could reproduce the issue and in my case it was caused by one CPU
(mmap) being behind during record and userspace mmap reader seeing the
data after other CPUs data were already stored.
This is expected under some circumstances because we need to limit the
number of events that we queue for reordering when we receive a
PERF_RECORD_FINISHED_ROUND or when we force flush due to memory
pressure.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt.fleming@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1417016371-30249-1-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add the infrastructure to setup, collect and report the interrupt
machine state regs which can be captured by the kernel.
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: cebbert.lkml@gmail.com
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Waiman Long <Waiman.Long@hp.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1411559322-16548-4-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Add a function to deliver synthesized events from within a session.
Intel PT decoding works by synthesizing events (primarily branch events)
that can then be consumed by existing tools. This function will be used
to deliver those events.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1414417770-18602-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Where direct use of the longer form using list_for_entry() was being
used.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-v4fw80flg25nkl8jgeod3ot9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add an index of the event identifiers, in preparation for Intel PT.
The event id (also called the sample id) is a unique number
allocated by the kernel to the event created by perf_event_open(). Events
can include the event id by having a sample type including PERF_SAMPLE_ID or
PERF_SAMPLE_IDENTIFIER.
Currently the main use of the event id is to match an event back to the
evsel to which it belongs i.e. perf_evlist__id2evsel()
The purpose of this patch is to make it possible to match an event back to
the mmap from which it was read. The reason that is useful is because the
mmap represents a time-ordered context (either for a cpu or for a thread).
Intel PT decodes trace information on that basis. In full-trace mode, that
information can be recorded when the Intel PT trace is read, but in
sample-mode the Intel PT trace data is embedded in a sample and it is in
that case that the "id index" is needed.
So the mmaps are numbered (idx) and the cpu and tid recorded against the id
by perf_evlist__set_sid_idx() which is called by perf_evlist__mmap_per_evsel().
That information is recorded on the perf.data file in the new "id index".
idx, cpu and tid are added to struct perf_sample_id (which is the node of
evlist's hash table to match ids to evsels). The information can be
retrieved using perf_evlist__id2sid(). Note however this all depends on
having a sample type including PERF_SAMPLE_ID or PERF_SAMPLE_IDENTIFIER,
otherwise ids are not recorded.
The "id index" is a synthesized event record which will be created when
Intel PT sampling is used by calling perf_event__synthesize_id_index().
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1414417770-18602-2-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Shortening function signature lenght too, since a thread's machine can be
obtained from thread->mg->machine, no need to pass thread, machine.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-5wb6css280ty0cel5p0zo2b1@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When processing events the session code has an ordered samples queue
which is used to time-sort events coming in across multiple mmaps. At a
later point in time samples on the queue are flushed up to some
timestamp at which point the event is actually processed.
When analyzing events live (ie., record/analysis path in the same
command) there is a race that leads to corrupted events and parse errors
which cause perf to terminate. The problem is that when the event is
placed in the ordered samples queue it is only a reference to the event
which is really sitting in the mmap buffer. Even though the event is
queued for later processing the mmap tail pointer is updated which
indicates to the kernel that the event has been processed. The race is
flushing the event from the queue before it gets overwritten by some
other event. For commands trying to process events live (versus just
writing to a file) and processing a high rate of events this leads to
parse failures and perf terminates.
Examples hitting this problem are 'perf kvm stat live', especially with
nested VMs which generate 100,000+ traces per second, and a command
processing scheduling events with a high rate of context switching --
e.g., running 'perf bench sched pipe'.
This patch offers live commands an option to copy the event when it is
placed in the ordered samples queue.
Based on a patch from David Ahern <dsahern@gmail.com>
Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1412347212-28237-2-git-send-email-yarygin@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now perf_session doesn't require that the evsels in its evlist are hists
containing ones.
Tools that are hists based and want to do per evsel events_stats
updates, if at some point this turns into a necessity, should do it in
the tool specific code, keeping the session class hists agnostic.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-cli1bgwpo82mdikuhy3djsuy@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
PERF_RECORD_SAMPLE was not being counted here and is the only per-evsel
thing anyway, the other events were not mapping to a evsel.
With this we don't require that evsels used with a perf_session need to
have space for hists, like the ones in annotate, report, top.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-kzchpz0l1mhrsfpkirz086m2@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Not all tools need a hists instance per perf_evsel, so lets pave the way
to remove evsel->hists while leaving a way to access the hists from a
specially allocated evsel, one that comes with space at the end where
lives the evsel.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-qlktkhe31w4mgtbd84035sr2@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently perf record always errors out when you run it as non-root with
kptr_restrict == 1, which is often the default.
Make it only warn instead and fix the kernel resolve code to not
segfault later. Profiling works still fine, except kernel symbols are
not resolved.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1411594794-7229-1-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add machine__thread_exec_comm() to return the comm that matches the last
exec, if the comm_exec flag is present, or the last comm otherwise.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1406786474-9306-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a function to peek at other events in the event stream.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1406786474-9306-15-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In forced flush (OE_FLUSH__HALF) we break the rules of the flush
timestamp via PERF_RECORD_FINISHED_ROUND event, so we could get out of
order event.
Do not force error in this case plus changing the output warning to use
WARN_ONCE.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-8q8794a8nlmpd1u8xrqmcyd2@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding some prints for ordered events queue, to help debug issues.
Adding debug_ordered_events debug variable to be able to enable ordered
events debug messages using:
$ perf --debug ordered-events=2 report ...
Also using oe pointer in perf_session__queue_event instead of chained
session variable dereferencing.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-7p3mnnopjvsp9nmk9msqcfkm@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding ordered_events__free function to release all the struct
ordered_events data. It's replacement for former
perf_session_free_sample_buffers function.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-urraa8ccay4o14wambjraws7@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Making perf_session__deliver_event global function, as it will be called
from another object in following patch.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-rz7s2b8uwv567bigckh75gvk@git.kernel.org
[ Fixup naming to match class__method schema, as now is more widely exposed ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In previous patches we added a limit for ordered events queue allocation
size. If we reach this size we need to flush (part of) the queue to get
some free buffers.
The current functionality is not affected, because the limit is hard
coded to (u64) -1. The configuration code for size will come in
following patches.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-ggcas0xdq847fi85bz73do2e@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add limit to the ordered events queue allocation. This way we will be
able to control the size of the queue buffers.
There's no limit at the moment (it's set to (u64) -1). The config code
will come in following patches.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-lw1ny3mk4ctb6su5ght5rsng@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rename 'struct ordered_events' members to fit better the ordered events
style.
No functional change was intended.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-v0eb2hsmrxbolnoawu5fn92z@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Following up with ordered_samples rename for ordered_samples and
sample_queue structs to ordered_events and ordered_event structs
respectively.
Also changing flush_sample_queue function name to ordered_events_flush.
No functional change was intended.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-2dkrdvh0bbmzxdse437fcgls@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The time ordering is generic for all kinds of events, so using generic
name 'ordered_events' for ordered_samples bool in perf_tool struct.
No functional change was intended.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-07mrqzcuhsks9wfmxrzsvemz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The VDSO temporary file is unlinked when a session is deleted. That
precludes the possibilities that there is no session or there is more
than one session.
Correctly the vdso belongs to the machine so put the information on
'struct machine' and get rid of the global variables.
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/53CF9B14.7040408@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
A session can be made to skip portions of the input file. Do not limit
that size to 32-bits.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1406143198-20732-2-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
A piped event stream may contain arbitary sized tracepoint information
following a PERF_RECORD_HEADER_TRACING_DATA event. The position in the
stream has to be 'skipped' to match the start of the next event.
Provide the same ability to a non-piped event stream to allow for
Instruction Trace data that may also be in a non-piped event stream.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1406143198-20732-1-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Flag if the event stream is a file that has been mmapped in one go.
This is useful, for example, if a tool needs to keep an event for later
reference. If the new flag is set, a pointer to the event can be
retained, otherwise the event must be copied.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1405332185-4050-28-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The value used for unknown pids cannot be zero because that is used by
the "idle" task.
Use -1 instead. Also handle the unknown pid case when creating map
groups.
Note that, threads with an unknown pid should not occur because fork (or
synthesized) events precede the thread's existence.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1405332185-4050-2-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
cppcheck detected following warning:
[tools/perf/util/session.c:1628] -> [tools/perf/util/session.c:1632]:
(warning) Possible null pointer dereference: session - otherwise it
is redundant to check it against null.
In order to avoide null pointer, check the pointer before use.
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Link: http://lkml.kernel.org/r/1400087618-13628-1-git-send-email-standby24x7@gmail.com
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
When printing the raw dump of a data file, the header.misc is
printed as a decimal. Unfortunately, that field is a bit mask, so
it is hard to interpret as a decimal.
Print in hex, so the user can easily see what bits are set and more
importantly what type of info it is conveying.
V2: add 0x in front per Jiri Olsa
Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1393386227-149412-3-git-send-email-dzickus@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding mask info into struct regs_dump to make the registers information
compact.
The mask was always passed along, so logically the mask info fits more
into the struct regs_dump.
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Jean Pihet <jean.pihet@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1389098853-14466-9-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We removed event types from data file in following commits:
6065210 perf tools: Remove event types framework completely
44b3c57 perf tools: Remove event types from perf data file
We no longer need this information, because we can get it directly from
tracepoints.
But we still need to handle PERF_RECORD_HEADER_EVENT_TYPE event for the
sake of old perf data files created in pipe mode like:
$ perf.3.4 record -o - foo >perf.data
$ perf.312 report -i - < perf.data
Reported-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1391524668-12546-1-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This method uses a temporary struct cpu_map to figure out the cpus
present in the received cpu list in string form, but it failed to free
it after returning. Fix it.
Signed-off-by: Stanislav Fomichev <stfomichev@yandex-team.ru>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1390217980-22424-3-git-send-email-stfomichev@yandex-team.ru
[ Use goto + err = -1 to do the delete just once, in the normal exit path ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For the common evsel list traversal, so that it becomes more compact.
Use the opportunity to start ditching the 'perf_' from 'perf_evlist__',
as discussed, as the whole conversion touches a lot of places, lets do
it piecemeal when we have the chance due to other work, like in this
case.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-qnkx7dzm2h6m6uptkfk03ni6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Several areas already used this technique, so do some audit to
consistently use it elsewhere.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-9sbere0kkplwe45ak6rk4a1f@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently, if we use perf kvm --guestkallsyms --guestmodules report, we
can not get the perf information from perf data file. All sample are
shown as unknown.
Reproducing steps:
# perf kvm --guestkallsyms /tmp/kallsyms --guestmodules /tmp/modules record -a sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.624 MB perf.data.guest (~27260 samples) ]
# perf kvm --guestkallsyms /tmp/kallsyms --guestmodules /tmp/modules report |grep %
100.00% [guest/6471] [unknown] [g] 0xffffffff8164f330
This bug was introduced by 207b57926 (perf kvm: Fix regression with guest machine creation).
In original code, it uses perf_session__find_machine(), it means we deliver symbol to machine
which has the same pid, if no machine found, deliver it to *default* guest. But if we use
perf_session__findnew_machine() here, if no machine was found, new machine with pid will be built
and added. Then the default guest which with pid == 0 will never get a symbol.
And because the new machine initialized here has no kernel map created, the symbol delivered to
it will be marked as "unknown".
This patch here is to revert commit 207b57926 and fix the SEGFAULT bug in another way.
Verification steps:
# ./perf kvm --guestkallsyms /home/kallsyms --guestmodules /home/modules record -a sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.651 MB perf.data.guest (~28437 samples) ]
# ./perf kvm --guestkallsyms /home/kallsyms --guestmodules /home/modules report |grep %
22.64% :6471 [guest.kernel.kallsyms] [g] update_rq_clock.part.70
19.99% :6471 [guest.kernel.kallsyms] [g] d_free
18.46% :6471 [guest.kernel.kallsyms] [g] bio_phys_segments
16.25% :6471 [guest.kernel.kallsyms] [g] dequeue_task
12.78% :6471 [guest.kernel.kallsyms] [g] __switch_to
7.91% :6471 [guest.kernel.kallsyms] [g] scheduler_tick
1.75% :6471 [guest.kernel.kallsyms] [g] native_apic_mem_write
0.21% :6471 [guest.kernel.kallsyms] [g] apic_timer_interrupt
Signed-off-by: Dongsheng Yang <yangds.fnst@cn.fujitsu.com>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: stable@vger.kernel.org # 3.3+
Cc: David Ahern <dsahern@gmail.com>
Link: http://lkml.kernel.org/r/1387564907-3045-1-git-send-email-yangds.fnst@cn.fujitsu.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The addr_location struct should fully qualify an address, and to do that
it should have in it the machine where the thread was found.
Thus all functions that receive an addr_location now don't need to also
receive a 'machine', those functions just need to access al->machine
instead, just like it does with the other parts of an address location:
al->thread, al->map, etc.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-o51iiee7vyq4r3k362uvuylg@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move functions mem_bswap_32() and mem_bswap_64() so they can be reused.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1386765443-26966-21-git-send-email-alexander.shishkin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add field 'srcline' that displays the source file name and line number
associated with the sample ip. The information displayed is the same as
from addr2line.
$ perf script -f comm,tid,pid,time,ip,sym,dso,symoff,srcline
grep 10701/10701 2497321.421013: ffffffff81043ffa native_write_msr_safe+0xa ([kernel.kallsyms])
/usr/src/debug/kernel-3.9.fc17/linux-3.9.10-100.fc17.x86_64/arch/x86/include/asm/msr.h:95
grep 10701/10701 2497321.421984: ffffffff8165b6b3 _raw_spin_lock+0x13 ([kernel.kallsyms])
/usr/src/debug/kernel-3.9.fc17/linux-3.9.10-100.fc17.x86_64/arch/x86/include/asm/spinlock.h:54
grep 10701/10701 2497321.421990: ffffffff810b64b3 tick_sched_timer+0x53 ([kernel.kallsyms])
/usr/src/debug/kernel-3.9.fc17/linux-3.9.10-100.fc17.x86_64/kernel/time/tick-sched.c:840
grep 10701/10701 2497321.421992: ffffffff8106f63f run_timer_softirq+0x2f ([kernel.kallsyms])
/usr/src/debug/kernel-3.9.fc17/linux-3.9.10-100.fc17.x86_64/kernel/timer.c:1372
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1386315778-11633-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The address being used to calculate the offset was the memory address
but the address needed is the address mapped to the dso. i.e. the 'addr'
member of 'struct addr_location'
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1386315778-11633-2-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The perf_event__preprocess_sample() function is called in
process_sample_event(). Instead of calling it again in
perf_evsel__print_ip(), pass through the resultant addr_location.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/529F3944.9050007@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Changing readn function return type to ssize_t because read returns
ssize_t not int.
Changing callers holding variable types as well.
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1385634619-8129-3-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Allows a command to have a symbol_filter controlled by the user to skip
certain functions in a backtrace. One example is to allow the user to
reduce repeating patterns like:
do_select core_sys_select sys_select
to just sys_select when dumping callchains, consuming less real estate
on the screen while still conveying the essential message - the process
is in a select call.
This option is leveraged by the upcoming timehist command.
Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1384806771-2945-2-git-send-email-dsahern@gmail.com
[ Checked if al.sym is NULL before touching al.sym->ignored, as noted by Adrian Hunter ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Not needed since this cset:
fcf65bf149: perf evsel: Cache associated event_format
So lets trim this struct a bit.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-j8setslokt0goiwxq9dogzqm@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
They convey no information, perhaps I was bitten by some snake at some
point, complete the detox by naming the last of those arguments more
sensibly.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-u1r0dnjoro08dgztiy2g3t2q@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This way we can later delimit a lifecycle for the COMM and map a hist to
a precise COMM:timeslice couple.
PERF_RECORD_COMM and PERF_RECORD_FORK events that don't have
PERF_SAMPLE_TIME samples can only send 0 value as a timestamp and thus
should overwrite any previous COMM on a given thread because there is no
sensible way to keep track of all the comms lifecycles in a thread
without time informations.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-6tyow99vgmmtt9qwr2u2lqd7@git.kernel.org
[ Made it cope with PERF_RECORD_MMAP2 ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
That will ease using a progress bar across multiple functions, like in
the upcoming patches that will present a progress bar when collapsing
histograms.
Based on a previous patch by Namhyung Kim.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-cr7lq7ud9fj21bg7wvq27w1u@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When callgraph data was included in the perf data file, it may take a
long time to scan all those data and merge them together especially if
the stored callchains are long and the perf data file itself is large,
like a Gbyte or so.
The callchain stack is currently limited to PERF_MAX_STACK_DEPTH (127).
This is a large value. Usually the callgraph data that developers are
most interested in are the first few levels, the rests are usually not
looked at.
This patch adds a new --max-stack option to perf-report to limit the
depth of callchain stack data to look at to reduce the time it takes for
perf-report to finish its processing. It trades the presence of trailing
stack information with faster speed.
The following table shows the elapsed time of doing perf-report on a
perf.data file of size 985,531,828 bytes.
--max_stack Elapsed Time Output data size
----------- ------------ ----------------
not set 88.0s 124,422,651
64 87.5s 116,303,213
32 87.2s 112,023,804
16 86.6s 94,326,380
8 59.9s 33,697,248
4 40.7s 10,116,637
-g none 27.1s 2,555,810
Signed-off-by: Waiman Long <Waiman.Long@hp.com>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Aswin Chandramouleeswaran <aswin@hp.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Scott J Norton <scott.norton@hp.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1382107129-2010-4-git-send-email-Waiman.Long@hp.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding perf_data_file__open interface to data object to open the
perf.data file for both read and write.
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1381847254-28809-3-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch is adding 'struct perf_data_file' object as a placeholder for
all attributes regarding perf.data file handling. Changing
perf_session__new to take it as an argument.
The rest of the functionality will be added later to keep this change
simple enough, because all the places using perf_session are changed
now.
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1381847254-28809-2-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The perf_event__attr_swap() method needs to swap all members of struct
perf_event_attr. Add missing ones.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1382099356-4918-7-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Piped events can be sorted so a final flush is needed.
Add that and remove a redundant 'err = 0'.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1382099356-4918-6-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Moving 'struct throttle_event' out of python code and making it global
as any other event.
There's no usage of throttling events in any perf commands so far
(besides python support), but we'll need this event data backup for
upcoming test.
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1378031796-17892-5-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf-record updates the header in the perf.data file at termination.
Without this update perf-report (and other processing built-ins) it
caused an infinite loop when perf report (or something like) called.
This is because the algorithm in __perf_session__process_events()
depends on the data_size which is read from file header. Use file size
directly instead in this case to do the best-effort processing.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: David Ahern <dsahern@gmail.com>
Tested-by: Sonny Rao <sonnyrao@chromium.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Sonny Rao <sonnyrao@chromium.org>
Link: http://lkml.kernel.org/r/1380529188-27193-1-git-send-email-namhyung@kernel.org
Signed-off-by: David Ahern <dsahern@gmail.com>
[ Reworded warning as per Ingo Molnar suggestion, replaces 'perf.data'
with session->filename, to precisely identify the data file involved ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Commands that do not implement an mmap2 handler should at least not die
with a segfault when processing files with MMAP2 events.
Signed-off-by: David Ahern <dsahern@gmail.com>
Link: http://lkml.kernel.org/r/1379900700-5186-5-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add support for recording and displaying the transaction flags.
They are essentially a new sort key. Also display them
in a nice way to the user.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1379688044-14173-6-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
When processing big files we were not checking if session_done was set
by the SIGINT signal handler, for instance in 'perf report'. Fix it.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-pyad42lgrtq7xhg2dpsoauq7@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch adds support for the new PERF_RECORD_MMAP2 record type
exposed by the kernel. This is an extended PERF_RECORD_MMAP record.
It adds for each file-backed mapping the device major, minor number and
the inode number and generation.
This triplet uniquely identifies the source of a file-backed mapping. It
can be used to detect identical virtual mappings between processes, for
instance.
The patch will prefer MMAP2 over MMAP.
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1377079825-19057-3-git-send-email-eranian@google.com
[ Cope with 314add6 "Change machine__findnew_thread() to set thread pid",
fix 'perf test' regression test entry affected,
use perf_missing_features.mmap2 to fallback to not using .mmap2 in older kernels,
so that new tools can work with kernels where this feature is not present ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Before:
perf trace -i perf.data
Segmentation fault (core dumped)
#
After:
# perf trace -i perf.data
Data file does not have raw_syscalls:sys_enter events
#
When there are no tracepoints in a perf.data file the struct pevent
that contains the list of tracepoints that will be used to lookup the
tracepoint id by name will not be populated, causing a NULL deref.
And we don't need to do all that dance to look at pevents for an entry
with a slighly different name to then lookup the tracepoint by its id on
the evlist, just use the perf_evlist__find_tracepoint_by_name() routine,
that will find the tracepoint, if present.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-egcm21k1e6gcyxpcgjxtmsq3@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently when processing events in the __perf_session__process_events
function we update a progress bar based on the file_size. During the
same processing we update the progress bar from within
flush_sample_queue which is based on number of samples count.
Having 2 different based updates is causing the progress bar to jump
heavily back and forth giving not much usefull info.
Fixing this by keeping only __perf_session__process_events based
progress bar update. And turning on flush_sample_queue progress bar
update only for final flushing.
This reduces the number of time the progress bar update function is
called and it significantly reduces the loading time for TUI, where the
progress bar update takes quite a lot of time.
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20130905091449.GC1100@krava.brq.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Enable parsing of samples with sample format bit PERF_SAMPLE_IDENTIFIER.
In addition, if the kernel supports it, prefer it to selecting
PERF_SAMPLE_ID thereby allowing non-matching sample types.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1377591794-30553-8-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The ip_event struct assumes fixed positions for ip, pid and tid. That
is no longer true with the addition of PERF_SAMPLE_IDENTIFIER. The
information is anyway in struct sample, so use that instead.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1377591794-30553-5-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now that the sample parsing correctly checks data sizes there is no
reason for it to be done again for callchains.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1377591794-30553-4-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a new parameter for 'pid' to machine__findnew_thread().
Change callers to pass 'pid' when it is known.
Note that callers sometimes want to find the main thread
which has the memory maps. The main thread has tid == pid
so the usage in that case is:
machine__findnew_thread(machine, pid, pid)
whereas the usage to find the specific thread is:
machine__findnew_thread(machine, pid, tid)
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1377591794-30553-2-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now that the symbol filter is recorded on the machine there is no need
to pass it to perf_event__preprocess_sample(). So remove it.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1375961547-30267-7-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Any event can have RAW data attribute set. The intent of the function is
to determine if the session has tracepoints, so check for the type of
each event explicitly.
Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1375930261-77273-17-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Make print options based on flags. Simplifies addition of more print
options which is the subject of upcoming patches.
Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1375930261-77273-10-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Taking a lesson from perf-trace and bringing in control of event
processing to perf-kvm-stat-live: parse the sample to get access the
time leaving just the need to queue it to the ordered samples list. For
that the queue_event function needs to be exported.
Unexport perf_session__process_event.
Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Runzhen Wang <runzhen@linux.vnet.ibm.com>
Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1375753297-69645-2-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Allows kvm live mode to reuse the event processing and ordered samples
processing used by the perf-report path.
v2: removed flush_sample_queue as noticed by Jiri
Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Runzhen Wang <runzhen@linux.vnet.ibm.com>
Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1375473947-64285-4-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Symbol offset is one of the fields that can be requested in perf-script.
Currently you do not get that data when requested. e.g.,
perf script -f comm,tid,pid,time,cpu,sym,symoff,ip
...
gcc 6201/6201 [006] 762250.617897:
ffffffff81090d95 update_curr
ffffffff810911b8 dequeue_entity
ffffffff81091825 dequeue_task_fair
ffffffff81087163 dequeue_task
ffffffff81087c03 deactivate_task
...
With this patch you get the offset:
...
gcc 6201/6201 [006] 762250.617897:
ffffffff81090d95 update_curr+0x1c5
ffffffff810911b8 dequeue_entity+0x28
ffffffff81091825 dequeue_task_fair+0x45
ffffffff81087163 dequeue_task+0x93
ffffffff81087c03 deactivate_task+0x23
...
Signed-off-by: David Ahern <dsahern@gmail.com>
Link: http://lkml.kernel.org/r/1375024474-45726-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For sample with sample type PERF_SAMPLE_READ the period value is stored
in the 'struct sample_read'.
Moreover if the read format has PERF_FORMAT_GROUP, the 'struct
sample_read' contains period values for all events in the group (for
which the sample's event is a leader).
We deliver separated samples for all the values contained within the
'struct sample_read'.
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-6mdm5xkrm6kypouh1c33cyys@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding support to parse out the PERF_SAMPLE_READ sample bits. The code
contains both single and group format specification.
This code parse out and prepare PERF_SAMPLE_READ data into the
perf_sample struct. It will be used for group leader sampling feature
comming in shortly.
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-0tgdoln5rwk3wocshb442cl3@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Removing event types framework completely. The only remainder (apart
from few comments) is following enum:
enum perf_user_event_type {
...
PERF_RECORD_HEADER_EVENT_TYPE = 65, /* deprecated */
...
}
It's kept as deprecated, resulting in error when processed in
perf_session__process_user_event function.
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1373556513-3000-6-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For example, in an application with an expensive function implemented
with deeply nested recursive calls, the default call-graph presentation
is dominated by the different callchains within that function. By
ignoring these callees, we can collect the callchains leading into the
function and compactly identify what to blame for expensive calls.
For example, in this report the callers of garbage_collect() are
scattered across the tree:
$ perf report -d ruby 2>- | grep -m10 ^[^#]*[a-z]
22.03% ruby [.] gc_mark
--- gc_mark
|--59.40%-- mark_keyvalue
| st_foreach
| gc_mark_children
| |--99.75%-- rb_gc_mark
| | rb_vm_mark
| | gc_mark_children
| | gc_marks
| | |--99.00%-- garbage_collect
If we ignore the callees of garbage_collect(), its callers are coalesced:
$ perf report --ignore-callees garbage_collect -d ruby 2>- | grep -m10 ^[^#]*[a-z]
72.92% ruby [.] garbage_collect
--- garbage_collect
vm_xmalloc
|--47.08%-- ruby_xmalloc
| st_insert2
| rb_hash_aset
| |--98.45%-- features_index_add
| | rb_provide_feature
| | rb_require_safe
| | vm_call_method
Signed-off-by: Greg Price <price@mit.edu>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20130623031720.GW22203@biohazard-cafe.mit.edu
Link: http://lkml.kernel.org/r/20130708115746.GO22203@biohazard-cafe.mit.edu
Cc: Fengguang Wu <fengguang.wu@intel.com>
[ remove spaces at beginning of line, reported by Fengguang Wu ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The 'size' variable includes the header so must be at least
'sizeof(struct perf_event_header)'. Error out immediately if that is
not the case. Also don't byte-swap the header until it is actually
"fetched" from the mmap region.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1372944040-32690-9-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The 'inject' command expects to get a reference to 'struct perf_inject'
from its 'tool' member. For that to work, 'tool' needs to be a
parameter of all tool callbacks. Make it so.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1372944040-32690-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Header files of libtraceevent or no longer local headers. Thus, use
default path notation for them. Also removing extra traceevent include
path and instead handle this similar to liblk.
Signed-off-by: Robert Richter <robert.richter@linaro.org>
Signed-off-by: Robert Richter <rric@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Robert Richter <rric@kernel.org>
Link: http://lkml.kernel.org/r/1370964558-8599-1-git-send-email-rric@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch adds the sorting and histogram support
functions to enable profiling of memory accesses.
The following sorting orders are added:
- symbol_daddr: data address symbol (or raw address)
- dso_daddr: data address shared object
- locked: access uses locked transaction
- tlb : TLB access
- mem : memory level of the access (L1, L2, L3, RAM, ...)
- snoop: access snoop mode
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1359040242-8269-12-git-send-email-eranian@google.com
[ committer note: changed to cope with fc5871ed, the move of methods to
machine.[ch], and the rename of dsrc to data_src, to match the change
made in the PERF_SAMPLE_DSRC in a previous patch. ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf record has a new option -W that enables weightened sampling.
Add sorting support in top/report for the average weight per sample and the
total weight sum. This allows to both compare relative cost per event
and the total cost over the measurement period.
Add the necessary glue to perf report, record and the library.
v2: Merge with new hist refactoring.
v3: Fix manpage. Remove value check.
Rename global_weight to weight and weight to local_weight.
v4: Readd sort keys to manpage
v5: Move weight to end
v6: Move weight to template
v7: Rename weight key.
Original patch from Andi modified by Stephane Eranian <eranian@google.com>
to include ONLY the weight supporting code and apply to pristine 3.8.0-rc4.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1359040242-8269-6-git-send-email-eranian@google.com
[ committer note: changed to cope with fc5871ed and the hists_link perf test entry ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We define it in the Makefile so no need to duplicate it.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1363686376-29525-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It is always there, no sense in calling a function named
"perf_session__find_host_machine".
Also no sense in checking if that function return is NULL, so ditch
needless error handling.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-a6a3zx3afbrxo8p2zqm5mxo8@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
That consolidates the grouping of host + guests, isolating a bit more of
functionality now centered on 'perf_session' that can be used
independently in tools that don't need a 'perf_session' instance, but
needs to have all the thread/map/symbol machinery.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-c700rsiphpmzv8klogojpfut@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It was being used just for its stats member, so ditch session->hists and
use just what is needed, session->stats.
This completes the move support multiple events in the hists layer, the
last user of session->hists was 'perf diff' but Jiri Olsa has fixed that
some time ago.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-pimk92kek8kcp4dmb1jakoro@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As this function deals exclusively with hists->stats.
Preparatory patch for removing the by now needless session->hists, that
should be just session->stats.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-be0o8si9f1z40cwoa534f7me@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We had that 'with_hits' filter to show just the build ids for DSOs that
had samples, make that generic so that we can use it in the upcoming
buildid-cache --missing feature, to show just the build ids that are not
in the cache.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-9nfesdfpnx7zp96yn3tmfbx0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This is a suggested patch to fix the bug I reported at:
http://marc.info/?l=linux-kernel&m=135033028924652&w=2
Essentially, there is a hard requirement that when perf analyzes a
trace, it must have the entire thing mmap()'d.
Therefore the scheme used on 32-bit where we have a fixed (8) number of
32MB mmaps, and cycle through them, simply does not work.
One of the reasons this requirement exists is because the iterators
maintain references to perf entry objects and those references don't
just simply go away when this mmap code decides to cycle an old mmap
area out and reuse it. At this point, those entry pointers now point to
garbage resulting in unpredictable behavior and crashes.
It is better to try to mmap() as much as we can and if we do actually
run into address space limitations, the failure of the mmap() call will
indicate that and stop processing.
I noticed that perf_session->mmap_window is set to a constant in one
location, and only used in one other location. So I got rid of it
altogether.
So we adjust the size of the mmaps[] array to the maximum we could need.
On 64-bit we only need one slot. On 32-bit we could need up to 128 (128
* 32MB == 4GB).
I've verified that this allows a large (~600MB) perf.data file to be
analyzed properly with a 32-bit perf binary, which previously was not
possible.
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20121110.141219.582924082787523608.davem@davemloft.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The perf session environment information was saved (so allocated) during
perf_session__open, but was not freed. As free(3) handles NULL pointer
input properly it won't cause a issue for writing modes - e.g. perf
record
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Feng Tang <feng.tang@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1353472999-23042-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Sometimes we need to know when the progress bar should disappear.
Checking curr >= total wasn't enough since there're cases not met that
condition for the last call.
So add a new ->finish callback to identify this explicitly. Currently
only GTK frontend needs it.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1352813436-14173-4-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Its such a common need that we might as well have a global with that
value.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-mwfqji9f17k5j81l1404dk3q@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Instead of passing it around for parsing as an explicit parameter, will
help with reading tracepoint fields when not using a perf session or
pevent structure, i.e. for non perf.data centered workflows.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-qa67ikv2sm49cwa7dyjhhp6g@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf defines both __used and __unused variables to use for marking
unused variables. The variable __used is defined to
__attribute__((__unused__)), which contradicts the kernel definition to
__attribute__((__used__)) for new gcc versions. On Android, __used is
also defined in system headers and this leads to warnings like: warning:
'__used__' attribute ignored
__unused is not defined in the kernel and is not a standard definition.
If __unused is included everywhere instead of __used, this leads to
conflicts with glibc headers, since glibc has a variables with this name
in its headers.
The best approach is to use __maybe_unused, the definition used in the
kernel for __attribute__((unused)). In this way there is only one
definition in perf sources (instead of 2 definitions that point to the
same thing: __used and __unused) and it works on both Linux and Android.
This patch simply replaces all instances of __used and __unused with
__maybe_unused.
Signed-off-by: Irina Tirdea <irina.tirdea@intel.com>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1347315303-29906-7-git-send-email-irina.tirdea@intel.com
[ committer note: fixed up conflict with a116e05 in builtin-sched.c ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Storing data for VDSO shared object, because we need it for the post
unwind processing.
The VDSO shared object is same for all process on a running system, so
it makes no difference when we store it inside the tracer - perf.
When [vdso] map memory is hit, we retrieve [vdso] DSO image and store it
into temporary file.
During the build-id processing phase, the [vdso] DSO image is stored in
build-id db, and build-id reference is made inside perf.data. The
build-id vdso file object is called '[vdso]'. We don't use temporary
file name which gets removed when record is finished.
During report phase the vdso build-id object is treated as any other
build-id DSO object.
Adding following API for vdso object:
bool is_vdso_map(const char *filename)
- returns true if the filename matches vdso map name
struct dso *vdso__dso_findnew(struct list_head *head)
- find/create proper vdso DSO object
vdso__exit(void)
- removes temporary VDSO image if there's any
This change makes backtrace dwarf post unwind possible from [vdso] maps.
Following output is current report of [vdso] sample dwarf backtrace:
# Overhead Command Shared Object Symbol
# ........ ....... ................. .............................
#
99.52% ex [vdso] [.] 0x00007fff3ace89af
|
--- 0x7fff3ace89af
Following output is new report of [vdso] sample dwarf backtrace:
# Overhead Command Shared Object Symbol
# ........ ....... ................. .............................
#
99.52% ex [vdso] [.] 0x00000000000009af
|
--- 0x7fff3ace89af
main
__libc_start_main
_start
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1347295819-23177-5-git-send-email-jolsa@redhat.com
[ committer note: s/ALIGN/PERF_ALIGN/g to cope with the android build changes ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Bail out without error if we want to do backtrace post unwind, but were
not able to capture user registers or user stack during the record
phase, which is possible and valid case.
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1347295819-23177-2-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
On some systems (e.g. Android), ALIGN is defined in system headers as
ALIGN(p). The definition of ALIGN used in perf takes 2 parameters:
ALIGN(x,a). This leads to redefinition conflicts.
Redefinition error on Android:
In file included from util/include/linux/list.h:1:0,
from util/callchain.h:5,
from util/hist.h:6,
from util/session.h:4,
from util/build-id.h:4,
from util/annotate.c:11:
util/include/linux/kernel.h:11:0: error: "ALIGN" redefined [-Werror]
bionic/libc/include/sys/param.h:38:0: note: this is the location of
the previous definition
Conflics with system defined ALIGN in Android:
util/event.c: In function 'perf_event__synthesize_comm':
util/event.c:115:32: error: macro "ALIGN" passed 2 arguments, but takes just 1
util/event.c:115:9: error: 'ALIGN' undeclared (first use in this function)
util/event.c:115:9: note: each undeclared identifier is reported only once for
each function it appears in
In order to avoid this redefinition, ALIGN is renamed to PERF_ALIGN.
Signed-off-by: Irina Tirdea <irina.tirdea@intel.com>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Irina Tirdea <irina.tirdea@intel.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1347315303-29906-5-git-send-email-irina.tirdea@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Allows errors to propogate through event processing code and back to
commands.
Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1346005487-62961-2-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This brings the support for DWARF cfi unwinding on perf post
processing. Call frame informations are retrieved and then passed
to libunwind that requests memory and register content from the
applications.
Adding unwind object to handle the user stack backtrace based
on the user register values and user stack dump.
The unwind object access the libunwind via remote interface
and provides to it all the necessary data to unwind the stack.
The unwind interface provides following function:
unwind__get_entries
And callback (specified in above function) to retrieve
the backtrace entries:
typedef int (*unwind_entry_cb_t)(struct unwind_entry *entry,
void *arg);
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Original-patch-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: "Frank Ch. Eigler" <fche@redhat.com>
Cc: Arun Sharma <asharma@fb.com>
Cc: Benjamin Redelings <benjamin.redelings@nescent.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Frank Ch. Eigler <fche@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Ulrich Drepper <drepper@gmail.com>
Link: http://lkml.kernel.org/r/1344345647-11536-12-git-send-email-jolsa@redhat.com
[ Replaced use of perf_session by usage of perf_evsel ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding following info to be parsed out of the event sample:
- user register set
- user stack dump
Both are global and specific to all events within the session.
This info will be used in the unwind patches coming in shortly.
Adding simple output printout (report -D) for both register and
stack dumps.
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Original-patch-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: "Frank Ch. Eigler" <fche@redhat.com>
Cc: Arun Sharma <asharma@fb.com>
Cc: Benjamin Redelings <benjamin.redelings@nescent.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Frank Ch. Eigler <fche@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Ulrich Drepper <drepper@gmail.com>
Link: http://lkml.kernel.org/r/1344345647-11536-11-git-send-email-jolsa@redhat.com
[ Use evsel->attr.sample_regs_user ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
That is a more compact form of perf_session__parse_sample and to support
multiple evlists per perf_session is the way to go anyway.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-vkxx3j5qktoj11bvcwmfjj13@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Removing perf_session->id_hdr_size, as it can be obtained from the
evsel/evlist.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-1nwc2kslu7gsfblu98xbqbll@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Removing perf_session->sample_id_all, as it can be obtained from the
evsel/evlist.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-ok58u1mlc5ci9b6p36r52uh1@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Removing perf_session->sample_type, as it can be obtained from the
evsel/evlist.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-mnt1zwlik7sp7z6ljc9kyefg@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
So that we don't have to store it in the perf_session instance, because
in the future perf_session instances may have multiple evlists, each
with different sample_type/sizes.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-ptod86fxkpgq3h62m9refkv4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Guest kernel symbols are not resolved despite passing the information
needed to resolve them. e.g.,
perf kvm --guest --guestmount=/tmp/guest-mount record -a -- sleep 1
perf kvm --guest --guestmount=/tmp/guest-mount report --stdio
36.55% [guest/11399] [unknown] [g] 0xffffffff81600bc8
33.19% [guest/10474] [unknown] [g] 0x00000000c0116e00
30.26% [guest/11094] [unknown] [g] 0xffffffff8100a288
43.69% [guest/10474] [unknown] [g] 0x00000000c0103d90
37.38% [guest/11399] [unknown] [g] 0xffffffff81600bc8
12.24% [guest/11094] [unknown] [g] 0xffffffff810aa91d
6.69% [guest/11094] [unknown] [u] 0x00007fa784d721c3
which is just pathetic.
After a maddening 2 days sifting through perf minutia I found it --
id_hdr_size is not initialized for guest machines. This shows up on the
report side as random garbage for the cpu and timestamp, e.g.,
29816 7310572949125804849 0x1ac0 [0x50]: PERF_RECORD_MMAP ...
That messes up the sample sorting such that synthesized guest maps are
processed last.
With this patch you get a much more helpful report:
12.11% [guest/11399] [guest.kernel.kallsyms.11399] [g] irqtime_account_process_tick
10.58% [guest/11399] [guest.kernel.kallsyms.11399] [g] run_timer_softirq
6.95% [guest/11094] [guest.kernel.kallsyms.11094] [g] printk_needs_cpu
6.50% [guest/11094] [guest.kernel.kallsyms.11094] [g] do_timer
6.45% [guest/11399] [guest.kernel.kallsyms.11399] [g] idle_balance
4.90% [guest/11094] [guest.kernel.kallsyms.11094] [g] native_read_tsc
...
v2:
- changed rbtree walk to use rb_first per Namhyung's suggestion
Tested-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1342826756-64663-5-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Commit 743eb86865 reworked when the
machines were created. Prior to this commit guest machines could be
created in perf_event__process_kernel_mmap() while processing kernel
MMAP events. This commit assumes that the machines exist by the time
perf_session_deliver_event is called (e.g., during processing of build
id events) - which is not always correct.
One example is the use of default guest args (--guestkallsyms and
--guestmodules) for short times where no samples hit within a guest
module. For this case no build id is added to the file header. No build
id == no machine created. That leads to the next example -- the use of
no-buildid (-B) on the record for all perf-kvm invocations. In both
cases perf report dies with a SEGFAULT of the form:
(gdb) bt
0 0x000000000046dd7b in machine__mmap_name (self=0x0, bf=0x7fffffffbd20 "q\021", size=4096) at util/map.c:715
1 0x0000000000444161 in perf_event__process_kernel_mmap (tool=0x7fffffffdd80, event=0x7ffff7fb4120, machine=0x0) at util/event.c:562
2 0x0000000000444642 in perf_event__process_mmap (tool=0x7fffffffdd80, event=0x7ffff7fb4120, sample=0x7fffffffd210, machine=0x0)
at util/event.c:668
3 0x0000000000470e0b in perf_session_deliver_event (session=0x915ca0, event=0x7ffff7fb4120, sample=0x7fffffffd210, tool=0x7fffffffdd80,
file_offset=8480) at util/session.c:979
4 0x000000000047032e in flush_sample_queue (s=0x915ca0, tool=0x7fffffffdd80) at util/session.c:679
5 0x0000000000471c8d in __perf_session__process_events (session=0x915ca0, data_offset=400, data_size=150448, file_size=150848, tool=
0x7fffffffdd80) at util/session.c:1363
6 0x0000000000471d42 in perf_session__process_events (self=0x915ca0, tool=0x7fffffffdd80) at util/session.c:1379
7 0x000000000042484a in __cmd_report (rep=0x7fffffffdd80) at builtin-report.c:368
8 0x0000000000425bf1 in cmd_report (argc=0, argv=0x915b00, prefix=0x0) at builtin-report.c:756
9 0x0000000000438505 in __cmd_report (argc=4, argv=0x7fffffffe260) at builtin-kvm.c:84
10 0x000000000043882a in cmd_kvm (argc=4, argv=0x7fffffffe260, prefix=0x0) at builtin-kvm.c:131
11 0x00000000004152cd in run_builtin (p=0x7a54e8, argc=9, argv=0x7fffffffe260) at perf.c:273
12 0x00000000004154c7 in handle_internal_command (argc=9, argv=0x7fffffffe260) at perf.c:345
13 0x0000000000415613 in run_argv (argcp=0x7fffffffe14c, argv=0x7fffffffe140) at perf.c:389
14 0x0000000000415899 in main (argc=9, argv=0x7fffffffe260) at perf.c:487
Fix by allowing the machine to be created in perf_session_deliver_event.
Tested with --guestmount option and default guest args, with and without
-B arg on record for both and for short (10 seconds) and long (10
minutes) windows.
Reported-by: Pradeep Kumar Surisetty <psuriset@linux.vnet.ibm.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Pradeep Kumar Surisetty <psuriset@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1341180697-64515-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The pevent thing is per perf.data file, so I made it stop being static
and become a perf_session member, so tools processing perf.data files
use perf_session and _there_ we read the trace events description into
session->pevent and then change everywhere to stop using that single
global pevent variable and use the per session one.
Note that it _doesn't_ fall backs to trace__event_id, as we're not
interested at all in what is present in the
/sys/kernel/debug/tracing/events in the workstation doing the analysis,
just in what is in the perf.data file.
This patch also introduces perf_session__set_tracepoints_handlers that
is the perf perf.data/session way to associate handlers to tracepoint
events by resolving their IDs using the events descriptions stored in a
perf.data file. Make 'perf sched' use it.
Reported-by: Dmitry Antipov <dmitry.antipov@linaro.org>
Tested-by: Dmitry Antipov <dmitry.antipov@linaro.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linaro-dev@lists.linaro.org
Cc: patches@linaro.org
Link: http://lkml.kernel.org/r/20120625232016.GA28525@infradead.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull perf improvements from Arnaldo Carvalho de Melo:
* Replace event_name with perf_evsel__name, that handles the event
modifiers and doesn't use static variables.
* GTK browser improvements, from Namhyung Kim
* Fix possible NULL pointer deref in the TUI annotate browser, from
Samuel Liao
* Add sort by source file:line number, using addr2line.
* Allow printing histogram text snapshots at any point in top/report.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
So that we don't use global variables that could make us misreport event
names when having a multi window top, for instance.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-mccancovi1u0wdkg8ncth509@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Based on Jiri's latest attempt:
https://lkml.org/lkml/2012/5/16/61
Basically, adds_features should be byte swapped assuming unsigned
longs are either 8-bytes (u64) or 4-bytes (u32).
Fixes 32-bit ppc dumping 64-bit x86 feature data:
========
captured on: Sun May 20 19:23:23 2012
hostname : nxos-vdc-dev3
os release : 3.4.0-rc7+
perf version : 3.4.rc4.137.g978da3
arch : x86_64
nrcpus online : 16
nrcpus avail : 16
cpudesc : Intel(R) Xeon(R) CPU E5540 @ 2.53GHz
cpuid : GenuineIntel,6,26,5
total memory : 24680324 kB
...
Verified 64-bit x86 can still dump feature data for 32-bit ppc.
Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/4FBBB539.5010805@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding endianity swapping for event header attached via sample_id_all.
Currently we dont do that and it's causing wrong data to be read when
running report on architecture with different endianity than the record.
The perf is currently able to process 32-bit PPC samples on 32-bit
and 64-bit x86.
Together with other endianity patches, this change fixies perf report
discrepancies on origin and target systems as described in test 1
below, e.g. following perf report diff:
...
0.12% ps [kernel.kallsyms] [k] clear_page
- 0.12% awk bash [.] alloc_word_desc
+ 0.12% awk bash [.] yyparse
0.11% beah-rhts-task libpython2.6.so.1.0 [.] 0x5560e
0.10% perf libc-2.12.so [.] __ctype_toupper_loc
- 0.09% rhts-test-runne bash [.] maybe_make_export_env
+ 0.09% rhts-test-runne bash [.] 0x385a0
0.09% ps [kernel.kallsyms] [k] page_fault
...
Note, running following to test perf endianity handling:
test 1)
- origin system:
# perf record -a -- sleep 10 (any perf record will do)
# perf report > report.origin
# perf archive perf.data
- copy the perf.data, report.origin and perf.data.tar.bz2
to a target system and run:
# tar xjvf perf.data.tar.bz2 -C ~/.debug
# perf report > report.target
# diff -u report.origin report.target
- the diff should produce no output
(besides some white space stuff and possibly different
date/TZ output)
test 2)
- origin system:
# perf record -ag -fo /tmp/perf.data -- sleep 1
- mount origin system root to the target system on /mnt/origin
- target system:
# perf script --symfs /mnt/origin -I -i /mnt/origin/tmp/perf.data \
--kallsyms /mnt/origin/proc/kallsyms
- complete perf.data header is displayed
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Tested-by: David Ahern <dsahern@gmail.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1338380624-7443-3-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We faced segmentation fault on perf top -G at very high sampling rate
due to a corrupted callchain. While the root cause was not revealed (I
failed to figure it out), this patch tries to protect us from the
segfault on such cases.
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Sunjin Yang <fan4326@gmail.com>
Link: http://lkml.kernel.org/r/1338443007-24857-2-git-send-email-namhyung.kim@lge.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf top -G has a race on callchain cursor between main thread and
display thread. Since the callchain cursors are used locally make them
thread-local data would solve the problem.
Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Reported-by: Sunjin Yang <fan4326@gmail.com>
Suggested-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Sunjin Yang <fan4326@gmail.com>
Link: http://lkml.kernel.org/r/1338443007-24857-1-git-send-email-namhyung.kim@lge.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
$ perf script -i /tmp/perf.data
...
gcc 13623 544315.062858: context-switches:
ffffffff815f65c9 __schedule ([kernel.kallsyms])
ffffffff81087cea __cond_resched ([kernel.kallsyms])
ffffffff815f6b92 _cond_resched ([kernel.kallsyms])
ffffffff815fb87a do_page_fault ([kernel.kallsyms])
ffffffff815f8465 page_fault ([kernel.kallsyms])
2b7a71ea0303 _dl_lookup_symbol_x ([kernel.kallsyms])
2b7a71ea1eb5 _dl_relocate_object ([kernel.kallsyms])
2b7a71e99b2e dl_main ([kernel.kallsyms])
2b7a71eab7f4 _dl_sysdep_start ([kernel.kallsyms])
All DSO's in a callchain are printed as [kernel.kallsyms].
git bisect chased it to:
547a92e0ae is the first bad commit
commit 547a92e0ae
Author: Akihiro Nagai <akihiro.nagai.hw@hitachi.com>
Date: Mon Jan 30 13:42:57 2012 +0900
perf script: Unify the expressions indicating "unknown"
The perf script command uses various expressions to indicate "unknown".
It is unfriendly for user scripts to parse it. So, this patch unifies
the expressions to "[unknown]".
Looks like a copy-paste in that the other references use al.map but this one
should be node->map.
With this patch you get:
$ perf script -i /tmp/perf.data
...
gcc 13623 544315.062858: context-switches:
ffffffff815f65c9 __schedule ([kernel.kallsyms])
ffffffff81087cea __cond_resched ([kernel.kallsyms])
ffffffff815f6b92 _cond_resched ([kernel.kallsyms])
ffffffff815fb87a do_page_fault ([kernel.kallsyms])
ffffffff815f8465 page_fault ([kernel.kallsyms])
2b7a71ea0303 _dl_lookup_symbol_x (/lib64/ld-2.14.90.so)
2b7a71ea1eb5 _dl_relocate_object (/lib64/ld-2.14.90.so)
2b7a71e99b2e dl_main (/lib64/ld-2.14.90.so)
2b7a71eab7f4 _dl_sysdep_start (/lib64/ld-2.14.90.so)
Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Akihiro Nagai <akihiro.nagai.hw@hitachi.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1338353906-60706-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In __perf_session__process_pipe_events(), there was a risk we would read
more than what a union perf_event struct can hold. this could happen in
case, perf is reading a file which contains new record types it does not
know about and which are larger than anything it knows about.
In general, perf is supposed to skip records it does not understand, but
in pipe mode, those have to be read and ignored. The fixed size header
contains the size of the record, but that size may be larger than union
perf_event, yet it was used as the backing to the read in:
union perf_event event;
void *p;
size = event->header.size;
p = &event;
p += sizeof(struct perf_event_header);
if (size - sizeof(struct perf_event_header)) {
err = readn(self->fd, p, size - sizeof(struct perf_event_header));
We fix this by allocating a buffer based on the size reported in the
header. We reuse the buffer as much as we can. We realloc in case it
becomes too small. In the common case, the performance impact is
negligible.
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1337081295-10303-3-git-send-email-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When the perf data file is read cross architectures, the
perf_event__attr_swap function takes care about endianness of all the
struct fields except the bitfield flags.
The bitfield flags need to be transformed as well, since the bitfield
binary storage differs for both endians.
ABI says:
Bit-fields are allocated from right to left (least to most significant)
on little-endian implementations and from left to right (most to least
significant) on big-endian implementations.
The above seems to be byte specific, so we need to reverse each byte of
the bitfield. 'Internet' also says this might be implementation specific
and we probably need proper fix and carry perf_event_attr bitfield flags
in separate data file FEAT_ section. Thought this seems to work for now.
Note, running following to test perf endianity handling:
test 1)
- origin system:
# perf record -a -- sleep 10 (any perf record will do)
# perf report > report.origin
# perf archive perf.data
- copy the perf.data, report.origin and perf.data.tar.bz2
to a target system and run:
# tar xjvf perf.data.tar.bz2 -C ~/.debug
# perf report > report.target
# diff -u report.origin report.target
- the diff should produce no output
(besides some white space stuff and possibly different
date/TZ output)
test 2)
- origin system:
# perf record -ag -fo /tmp/perf.data -- sleep 1
- mount origin system root to the target system on /mnt/origin
- target system:
# perf script --symfs /mnt/origin -I -i /mnt/origin/tmp/perf.data \
--kallsyms /mnt/origin/proc/kallsyms
- complete perf.data header is displayed
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Tested-by: David Ahern <dsahern@gmail.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1337151548-2396-3-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
. perf_target: abstraction for --uid, --pid, --tid, --cpu, --all-cpus handling,
eliminating code duplicated in the tools, having constraints that apply to
all of them, from Namhyung Kim
. Fixes for handling fallback to cpu-clock on PPC, from David Ahern
. Fix for processing events with unknown size, from Jiri Olsa
. Compilation fix on 32-bit, from Jiri Olsa
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.14 (GNU/Linux)
iQIcBAABAgAGBQJPq+zRAAoJENZQFvNTUqpAe9YP/2VuVQ8OcIS6h9NjdfdAc3y/
RX2KDvnCaqa8YFIAl9OOWwy6G35mu8NEr3Lpzw2kkKvdUE0HYFUa91kF4DDLvG/w
6gpWUtH+9jtnydkX4z11qi1a+Pfq0irc83oaDgfFNArZKdf080OxoCfzaxOjZ3fH
zs4JR2L8TYYxIiWWWl0vMFic9+hTIM3dyFYJswHGy/ppqDX6DWfq4XndTi1gjo99
dxT+hKUvIMDI6Ztdyafkvfvge47ooI3OCuPlvdQzPQlh5ZY/BSZMDhUVHPaBBAV3
qQY6tCGPgk6sEIoNGf3juRSQyaGoo5V5zKgNFsAG5rsZqIpmlErJTeGMSGeOW3ww
TIAOXBVij98c8iwuGPAKR3ZUuNxXHBTAo4Sq/DjRGJ8RKILVaYe6Sp5DOjaFA+3U
374XGvEIqqYViNZm3gq1JfVAjJ4179hk7K+zUK55IeSgsEnwQ9s379dIpwaxCtL4
4Sb0FqyVbx7joo8NfLokXqblcBa/MRECrtVbHHEWuygzuV2Au437mOLOiTSGyyZr
N6FsDNFUcx044QZAYsAcbAXIxE+DssOcEERiFD1gG8MoUohW/DnpbMMaQi9bfUSi
b8SgFIpy9q9LlADRFYUZ+rVs4B9RCJNMtaa2UpcaRuUcXHrqZEWzxbPbSDjCbl4G
/yoSHxH5QJu6bkYz5deW
=ZZUs
-----END PGP SIGNATURE-----
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Fixes and improvements for perf/core:
- perf_target: abstraction for --uid, --pid, --tid, --cpu, --all-cpus handling,
eliminating code duplicated in the tools, having constraints that apply to
all of them, from Namhyung Kim
- Fixes for handling fallback to cpu-clock on PPC, from David Ahern
- Fix for processing events with unknown size, from Jiri Olsa
- Compilation fix on 32-bit, from Jiri Olsa
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Currently if we cannot decide the size of the event, we guess next
event possition by:
"... check alignment, and increment a single u64 in the hope
to catch on again 'soon'"
This usually ends up with segfault or endless loop. It's better
to admit the failure right away, then pretend nothing happened.
It makes the life easier ;)
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20120416184251.GA11503@m.brq.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In case the perf_session__process_event function fails, we estimate the
next event offset.
This is not necessary for sample event failing on unknown ID or machine.
In such case we know proper size of the event, so we dont need to guess.
Also failure statistics are updated correctly so we don't miss any
information.
Forcing perf_session__process_event to return 0 in case of unknown ID or
machine.
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1334233262-5679-3-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Running 'perf kvm --host --guest --guestmount /tmp/guestmount record -a -g -- sleep 2'
Was resulting in a segfault. For event type PERF_RECORD_MMAP,
event->ip.pid is being used in perf_session__find_machine_for_cpumode,
which is not correct.
The event->ip.pid field happens to be 0 in this case and results in
returning a NULL machine object. Finally, access to self->pid in
machine__mmap_name, results in a segfault later.
For PERF_RECORD_MMAP type, pass event->mmap.pid.
Signed-off-by: Nikunj A. Dadhania <nikunj@linux.vnet.ibm.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Tested-by: David Ahern <dsahern@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Nikunj A. Dadhania <nikunj@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20120409081835.10576.22018.stgit@abhimanyu.in.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The perf diff command is broken since:
perf hists: Threaded addition and sorting of entries
commit 1980c2ebd7
Several places were broken:
- hists data need to be collected into opened sessions instead
of into events
- session's hists data need to be initialized properly when the
session is created
- hist_entry__pcnt_snprintf: the percentage and displacement
buffer preparation must not use 'ret' because it's used
as a pointer to the final buffer
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20120322133726.GB1601@m.brq.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch allows perf to process perf.data files generated
using an ABI that has a different perf_event_attr struct size,
i.e., a different ABI version.
The perf_event_attr can be extended, yet perf needs to cope with
older perf.data files. Similarly, perf must be able to cope with
a perf.data file which is using a newer version of the ABI than
what it knows about.
This patch adds read_attr(), a routine that reads a
perf_event_attr struct from a file incrementally based on its
advertised size. If the on-file struct is smaller than what perf
knows, then the extra fields are zeroed. If the on-file struct
is bigger, then perf only uses what it knows about, the rest is
skipped.
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: acme@redhat.com
Cc: robert.richter@amd.com
Cc: ming.m.lin@intel.com
Cc: andi@firstfloor.org
Cc: asharma@fb.com
Cc: ravitillo@lbl.gov
Cc: vweaver1@eecs.utk.edu
Cc: khandual@linux.vnet.ibm.com
Cc: dsahern@gmail.com
Link: http://lkml.kernel.org/r/1328826068-11713-17-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The perf sample processing code relies on a valid machine object. Make
sure that this path is only entered when such a object exists.
A counter for samples where no machine object exits is also introduced
to give the user a message about these samples.
Reported-by: David Ahern <dsahern@gmail.com>
Reported-by: Jason Wang <jasowang@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1328893505-4115-2-git-send-email-joerg.roedel@amd.com
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The perf script command uses various expressions to indicate "unknown".
It is unfriendly for user scripts to parse it. So, this patch unifies
the expressions to "[unknown]".
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: yrl.pp-manager.tt@hitachi.com
Link: http://lkml.kernel.org/r/20120130044257.2384.62905.stgit@linux3
Signed-off-by: Akihiro Nagai <akihiro.nagai.hw@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The 'size' cannot be 0 because it was set to 8 on the above line in case
it was 0 and never changed.
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1325000151-4463-1-git-send-email-namhyung@gmail.com
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The default input file for perf report is not handled the same way as
perf record does it for its output file. This leads to unexpected
behavior of perf report, etc. E.g.:
# perf record -a -e cpu-cycles sleep 2 | perf report | cat
failed to open perf.data: No such file or directory (try 'perf record' first)
While perf record writes to a fifo, perf report expects perf.data to be
read. This patch changes this to accept fifos as input file.
Applies to the following commands:
perf annotate
perf buildid-list
perf evlist
perf kmem
perf lock
perf report
perf sched
perf script
perf timechart
Also fixes char const* -> const char* type declaration for filename
strings.
v2:
* Prevent potential null pointer access to input_name in
builtin-report.c. Needed due to removal of patch "perf report: Setup
browser if stdout is a pipe"
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1323248577-11268-5-git-send-email-robert.richter@amd.com
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If filename is NULL there is an out-of-bound access to struct
perf_session if it would be used with perf_session__open(). Shouldn't
actually happen in current implementation as filename is always !NULL.
Fixing this by always null-terminating filename.
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1323248577-11268-3-git-send-email-robert.richter@amd.com
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To better reflect that it became the base class for all tools, that must
be in each tool struct and where common stuff will be put.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-qgpc4msetqlwr8y2k7537cxe@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reducing the exposure of perf_session further, so that we can use the
classes in cases where no perf.data file is created.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-stua66dcscsezzrcdugvbmvd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
So that we don't need to have that many globals.
Next steps will remove the 'session' pointer, that in most cases is
not needed.
Then we can rename perf_event_ops to 'perf_tool' that better describes
this class hierarchy.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-wp4djox7x6w1i2bab1pt4xxp@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Since we have it in evsel->hists.callchain_cursor, remove it from
perf_session.
One more step in disentangling several places from requiring a
perf_session pointer.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-rxr5dj3di7ckyfmnz0naku1z@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Removing another case where a perf_session is required when processing
events.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-ug1wtjbnva4bxwknflkkrlrh@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We will need this when not using perf_session in cases like 'perf top'
and strace where no perf.data file is created nor consumed.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-za923wjc41q5xot5vrhuhj3j@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The 'machine' abstraction was introduced with 'perf kvm' where we could
have samples for the host and multiple guests, but at the time we ended
up keeping the list of all machines threads all in
session->host_machine.
Move the threads rb_tree to struct machine to separate the namespaces.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-mdg7sm6j3va09vtgj49gbsrp@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
commit 5d67be9 added the option to specify a range of CPUs of interest,
but does not catch an invalid CPU list:
$ perf script -c foo
Segmentation fault (core dumped)
Cc: Anton Blanchard <anton@samba.org>
Link: http://lkml.kernel.org/r/1321206327-5881-1-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
So that for large perf.data files the user can have visual feedback that
activity is being performed.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-3ysn01mpspfrbsy56gznzqqz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Just like the old perf top --tui and the --stdio version.
But because we have the initial menu to choose which event to show in a
session with multiple events we can see how many chunks were lost in
each of the event types, clarifying which events are being affected the
most.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-47yyqbubmjzch2chezmb21m6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Just another step in stopping the use of libnewt in perf.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-vkb9jh5kkzl5ep3puoatd6an@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The goal of this patch is to include more information about the host
environment into the perf.data so it is more self-descriptive. Overtime,
profiles are captured on various machines and it becomes hard to track
what was recorded, on what machine and when.
This patch provides a way to solve this by extending the perf.data file
with basic information about the host machine. To add those extensions,
we leverage the feature bits capabilities of the perf.data format. The
change is backward compatible with existing perf.data files.
We define the following useful new extensions:
- HEADER_HOSTNAME: the hostname
- HEADER_OSRELEASE: the kernel release number
- HEADER_ARCH: the hw architecture
- HEADER_CPUDESC: generic CPU description
- HEADER_NRCPUS: number of online/avail cpus
- HEADER_CMDLINE: perf command line
- HEADER_VERSION: perf version
- HEADER_TOPOLOGY: cpu topology
- HEADER_EVENT_DESC: full event description (attrs)
- HEADER_CPUID: easy-to-parse low level CPU identication
The small granularity for the entries is to make it easier to extend
without breaking backward compatiblity. Many entries are provided as
ASCII strings.
Perf report/script have been modified to print the basic information as
easy-to-parse ASCII strings. Extended information about CPU and NUMA
topology may be requested with the -I option.
Thanks to David Ahern for reviewing and testing the many versions of
this patch.
$ perf report --stdio
# ========
# captured on : Mon Sep 26 15:22:14 2011
# hostname : quad
# os release : 3.1.0-rc4-tip
# perf version : 3.1.0-rc4
# arch : x86_64
# nrcpus online : 4
# nrcpus avail : 4
# cpudesc : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
# cpuid : GenuineIntel,6,15,11
# total memory : 8105360 kB
# cmdline : /home/eranian/perfmon/official/tip/build/tools/perf/perf record date
# event : name = cycles, type = 0, config = 0x0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 29, 30, 31,
# HEADER_CPU_TOPOLOGY info available, use -I to display
# HEADER_NUMA_TOPOLOGY info available, use -I to display
# ========
#
...
$ perf report --stdio -I
# ========
# captured on : Mon Sep 26 15:22:14 2011
# hostname : quad
# os release : 3.1.0-rc4-tip
# perf version : 3.1.0-rc4
# arch : x86_64
# nrcpus online : 4
# nrcpus avail : 4
# cpudesc : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
# cpuid : GenuineIntel,6,15,11
# total memory : 8105360 kB
# cmdline : /home/eranian/perfmon/official/tip/build/tools/perf/perf record date
# event : name = cycles, type = 0, config = 0x0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 29, 30, 31,
# sibling cores : 0-3
# sibling threads : 0
# sibling threads : 1
# sibling threads : 2
# sibling threads : 3
# node0 meminfo : total = 8320608 kB, free = 7571024 kB
# node0 cpu list : 0-3
# ========
#
...
Reviewed-by: David Ahern <dsahern@gmail.com>
Tested-by: David Ahern <dsahern@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/20110930134040.GA5575@quad
Signed-off-by: Stephane Eranian <eranian@google.com>
[ committer notes: Use --show-info in the tools as was in the docs, rename
perf_header_fprintf_info to perf_file_section__fprintf_info, fixup
conflict with f69b64f7 "perf: Support setting the disassembler style" ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The perf_event_attr struct has two __u32's at the top and
they need to be swapped individually.
With this change I was able to analyze a perf.data collected in a
32-bit PPC VM on an x86 system. I tested both 32-bit and 64-bit
binaries for the Intel analysis side; both read the PPC perf.data
file correctly.
-v2:
- changed the existing perf_event__attr_swap() to swap only elements
of perf_event_attr and exported it for use in swapping the
attributes in the file header
- updated swap_ops used for processing events
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: acme@ghostprotocols.net
Cc: peterz@infradead.org
Cc: paulus@samba.org
Cc: <stable@kernel.org>
Link: http://lkml.kernel.org/r/1310754849-12474-1-git-send-email-dsahern@gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add an option to perf report/annotate/script to specify which
CPUs to operate on. This enables us to take a single system wide
profile and analyse each CPU (or group of CPUs) in isolation.
This was useful when profiling a multiprocess workload where the
bottleneck was on one CPU but this was hidden in the overall
profile. Per process and per thread breakdowns didn't help
because multiple processes were running on each CPU and no
single process consumed an entire CPU.
The patch converts the list of CPUs returned by cpu_map__new
into a bitmap for fast lookup. I wanted to use -C to be
consistent with perf top/record/stat, but unfortunately perf
report already uses -C <comms>.
v2: Incorporate suggestions from David Ahern:
- Added -c to perf script
- Check that SAMPLE_CPU is set when -c is used
- Update documentation
v3: Create perf_session__cpu_bitmap()
Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Link: http://lkml.kernel.org/r/20110704215750.11647eb9@kryten
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add "caller/callee" option to support inverted butterfly report,
in the inverted report (with caller option), the call graph start
from the callee's ancestor. Users can use such view to catch system's
performance bottleneck from a sysprof like view. Using this option
with specified sort order like pid gives us high level view of call
graph statistics.
Also add "-G" alias for inverted call graph.
Signed-off-by: Sam Liao <phyomh@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Resolve to a function or variable if possible and if the sym option is
enabled.
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1306782503-22002-1-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The 'sym' option displays both the function name and the DSO it comes
from. Split the display of the dso into a separate option. This allows
display of the ip address and symbol without the dso, thus shortening
line lengths - and decluttering the output a bit.
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1306528124-25861-3-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently the "sym" output field is used to dump instruction pointers
and callchain stack. Sample addresses can also be converted to symbols,
so the meaning of "sym" needs to be fixed. This patch adds an "ip"
option and if it is selected the user can also opt to dump symbols for
them. If the user opts to dump IP without syms only the address is
shown.
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1306528124-25861-2-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fixes two more cases where the python binding would not load:
. Not finding die(), which it shouldn't anyway, not good to just stop the
world because some particular perf.data file is invalid, just propagate
the error to the caller.
. Not finding perf_sample_size: fix it by moving it from event.c to evsel,
where it belongs, as most cases are moving to operate on an evsel object.o
One of the fixed problems:
[root@emilia ~]# python
>>> import perf
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: perf_sample_size
>>>
[root@emilia ~]#
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-1hkj7b2cvgbfnoizsekjb6c9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Commit eac9eacee1 "perf tools: Check we are able to read the event
size on mmap" brought a check to ensure we can read the size of the
event before dereferencing it, and do a remap otherwise to move the
buffer forward.
However that remap was ommitting all the necessary work to
update the new page offset, head, and to unmap previous pages,
etc...
To fix this, gather all the code that fetches the event in a
seperate helper which does all the necessary checks about the
header/event size and tells us anytime a remap is needed.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1306148788-6179-3-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Better handle event parsing error by propagating the details
in upper layers or by dumping some failure message. So that
the user knows he has some crazy events in the batch.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Check that the total size of the sample fields having a fixed
size do not exceed the one of the whole event. This robustifies
the sample parsing.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
size is overriden later and used only then. Those
lines are only junk, probably a leftover.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>