2
0
mirror of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-09-04 20:19:47 +08:00
Commit Graph

269 Commits

Author SHA1 Message Date
Arnaldo Carvalho de Melo
a41794cdd7 perf tools: Remove some unused functions
Without the bloated cplus_demangle from binutils, i.e building with:

$ make NO_DEMANGLE=1 O=~acme/git/build/perf -j3 -C tools/perf/ install

Before:

   text	   data	    bss	    dec	    hex	filename
 471851	  29280	4025056	4526187	 45106b	/home/acme/bin/perf

After:

[acme@doppio linux-2.6-tip]$ size ~/bin/perf
   text	   data	    bss	    dec	    hex	filename
 446886	  29232	4008576	4484694	 446e56	/home/acme/bin/perf

So its a 5.3% size reduction in code, but the interesting part is in the git
diff --stat output:

 19 files changed, 20 insertions(+), 1909 deletions(-)

If we ever need some of the things we got from git but weren't using, we just
have to go to the git repo and get fresh, uptodate source code bits.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-18 23:03:35 -03:00
Arnaldo Carvalho de Melo
1967936d68 perf options: Check v type in OPT_U?INTEGER
To avoid problems like the one fixed by Stephane Eranian in 3de29ca, now
we'll got this instead:

	bench/sched-messaging.c:259: error: negative width in bit-field ‘<anonymous>’
	bench/sched-messaging.c:261: error: negative width in bit-field ‘<anonymous>’

Which is rather cryptic, but is how BUILD_BUG_ON_ZERO works, so kernel
hackers should be already used to this.

With it in place found some problems, fixed by changing the affected
variables to sensible types or changed some OPT_INTEGER to OPT_UINTEGER.

Next csets will go thru converting each of the remaining OPT_ so that
review can be made easier by grouping changes per type per patch.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-17 15:43:38 -03:00
Stephane Eranian
3de29cab1f perf record: Fix bug mismatch with -c option definition
The -c option defines the user requested sampling period. It was implemented
using an unsigned int variable but the type of the option was OPT_LONG. Thus,
the option parser was overwriting memory belonging to other variables, namely
the mmap_pages leading to a zero page sampling buffer. The bug was exposed only
when compiling at -O0, probably because the compiler was padding variables at
higher optimization levels.

This patch fixes this problem by declaring user_interval as u64. This also
avoids wrap-around issues for large period on 32-bit systems.

Commiter note:

Made it use OPT_U64(user_interval) after implementing OPT_U64 in the
previous patch.

Cc: David S. Miller <davem@davemloft.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <4bf11ae9.e88cd80a.06b0.ffffa8e3@mx.google.com>
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-17 12:23:18 -03:00
Stephane Eranian
2e6cdf996b perf tools: change event inheritance logic in stat and record
By default, event inheritance across fork and pthread_create was on but the -i
option of stat and record, which enabled inheritance, led to believe it was off
by default.

This patch fixes this logic by inverting the meaning of the -i option.  By
default inheritance is on whether you attach to a process (-p), a thread (-t)
or start a process. If you pass -i, then you turn off inheritance. Turning off
inheritance if you don't need it, helps limit perf resource usage as well.

The patch also fixes perf stat -t xxxx and perf record -t xxxx which did not
start the counters.

Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <4bea9d2f.d60ce30a.0b5b.08e1@mx.google.com>
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-13 16:39:12 -03:00
Frederic Weisbecker
9840280757 perf: Introduce a new "round of buffers read" pseudo event
In order to provide a more rubust and deterministic reordering
algorithm, we need to know when we reach a point where we just
did a pass through over every counter buffers to read every thing
they had.

This patch introduces a new PERF_RECORD_FINISHED_ROUND pseudo event
that only consist in an event header and doesn't need to contain
anything.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
2010-05-09 13:43:42 +02:00
Tom Zanussi
db620b1c2f perf/record: simplify TRACE_INFO tracepoint check
Fix a couple of inefficiencies and redundancies related to
have_tracepoints() and its use when checking whether to write
TRACE_INFO.

First, there's no need to use get_tracepoints_path() in
have_tracepoints() - we really just want the part that checks whether
any attributes correspondo to tracepoints.

Second, we really don't care about raw_samples per se - tracepoints
are always raw_samples.  In any case, the have_tracepoints() check
should be sufficient to decide whether or not to write TRACE_INFO.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>,
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1273030770.6383.6.camel@tropicana>
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-05 11:12:53 -03:00
Tom Zanussi
63e0c7715a perf: record TRACE_INFO only if using tracepoints and SAMPLE_RAW
The current perf code implicitly assumes SAMPLE_RAW means tracepoints
are being used, but doesn't check for that.  It happily records the
TRACE_INFO even if SAMPLE_RAW is used without tracepoints, but when the
perf data is read it won't go any further when it finds TRACE_INFO but
no tracepoints, and displays misleading errors.

This adds a check for both in perf-record, and won't record TRACE_INFO
unless both are true.  This at least allows perf report -D to dump raw
events, and avoids triggering a misleading error condition in perf
trace.  It doesn't actually enable the non-tracepoint raw events to be
displayed in perf trace, since perf trace currently only deals with
tracepoint events.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1272865861.7932.16.camel@tropicana>
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-03 10:31:48 -03:00
Arnaldo Carvalho de Melo
2c9faa0600 perf record: Don't exit in live mode when no tracepoints are enabled
With this I was able to actually test Tom Zanussi's two previous patches
in my usual perf testing ways, i.e. without any tracepoints activated.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-02 13:37:24 -03:00
Tom Zanussi
454c407ec1 perf: add perf-inject builtin
Currently, perf 'live mode' writes build-ids at the end of the
session, which isn't actually useful for processing live mode events.

What would be better would be to have the build-ids sent before any of
the samples that reference them, which can be done by processing the
event stream and retrieving the build-ids on the first hit.  Doing
that in perf-record itself, however, is off-limits.

This patch introduces perf-inject, which does the same job while
leaving perf-record untouched.  Normal mode perf still records the
build-ids at the end of the session as it should, but for live mode,
perf-inject can be injected in between the record and report steps
e.g.:

perf record -o - ./hackbench 10 | perf inject -v -b | perf report -v -i -

perf-inject reads a perf-record event stream and repipes it to stdout.
At any point the processing code can inject other events into the
event stream - in this case build-ids (-b option) are read and
injected as needed into the event stream.

Build-ids are just the first user of perf-inject - potentially
anything that needs userspace processing to augment the trace stream
with additional information could make use of this facility.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1272696080-16435-3-git-send-email-tzanussi@gmail.com>
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-02 13:36:56 -03:00
Tom Zanussi
789688faef perf/live: don't synthesize build ids at the end of a live mode trace
It doesn't really make sense to record the build ids at the end of a
live mode session - live mode samples need that information during the
trace rather than at the end.

Leave event__synthesize_build_id() in place, however; we'll still be
using that to synthesize build ids in a more timely fashion in a
future patch.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1272696080-16435-2-git-send-email-tzanussi@gmail.com>
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-02 12:04:05 -03:00
Arnaldo Carvalho de Melo
23346f21b2 perf tools: Rename "kernel_info" to "machine"
struct kernel_info and kerninfo__ are too vague, what they really
describe are machines, virtual ones or hosts.

There are more changes to introduce helpers to shorten function calls
and to make more clear what is really being done, but I left that for
subsequent patches.

Cc: Avi Kivity <avi@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Zhang, Yanmin <yanmin_zhang@linux.intel.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-27 21:17:50 -03:00
Zhang, Yanmin
a1645ce12a perf: 'perf kvm' tool for monitoring guest performance from host
Here is the patch of userspace perf tool.

Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-04-19 12:37:24 +03:00
Ingo Molnar
84b13fd596 Merge branch 'perf/live' into perf/core
Conflicts:
	tools/perf/builtin-record.c

Merge reason: add the live tracing feature, resolve conflict.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-15 09:13:26 +02:00
Frederic Weisbecker
f921281930 perf: Make the trace events sample period default to 1
Trace events are mostly used for tracing and then require not to
be lost when possible. As opposite to hardware events that really
require to trigger after a given sample period, trace events mostly
need to trigger everytime.

It is a frustrating experience to trace with perf and realize we
lost a lot of events because we forgot the "-c 1" option.

Then default sample_period to 1 for trace events but let the user
override it.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
2010-04-15 04:12:52 +02:00
Frederic Weisbecker
7865e817e9 perf: Make -f the default for perf record
Force the overwriting mode by default if append mode is not explicit.
Adding -f every time one uses perf on a daily basis quickly becomes a
burden.

Keep the -f among the options though to avoid breaking some random
users scripts.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
2010-04-15 04:12:51 +02:00
Tom Zanussi
c7929e4727 perf: Convert perf header build_ids into build_id events
Bypasses the build_id perf header code and replaces it with a
synthesized event and processing function that accomplishes the
same thing, used when reading/writing perf data to/from a pipe.

Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fweisbec@gmail.com
Cc: rostedt@goodmis.org
Cc: k-keiichi@bx.jp.nec.com
Cc: acme@ghostprotocols.net
LKML-Reference: <1270184365-8281-9-git-send-email-tzanussi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-14 11:56:08 +02:00
Tom Zanussi
9215545e99 perf: Convert perf tracing data into a tracing_data event
Bypasses the tracing_data perf header code and replaces it with
a synthesized event and processing function that accomplishes
the same thing, used when reading/writing perf data to/from a
pipe.

The tracing data is pretty large, and this patch doesn't attempt
to break it down into component events.  The tracing_data event
itself doesn't actually contain the tracing data, rather it
arranges for the event processing code to skip over it after
it's read, using the skip return value added to the event
processing loop in a previous patch.

Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fweisbec@gmail.com
Cc: rostedt@goodmis.org
Cc: k-keiichi@bx.jp.nec.com
Cc: acme@ghostprotocols.net
LKML-Reference: <1270184365-8281-8-git-send-email-tzanussi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-14 11:56:07 +02:00
Tom Zanussi
cd19a035f3 perf: Convert perf event types into event type events
Bypasses the event type perf header code and replaces it with a
synthesized event and processing function that accomplishes the
same thing, used when reading/writing perf data to/from a pipe.

Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fweisbec@gmail.com
Cc: rostedt@goodmis.org
Cc: k-keiichi@bx.jp.nec.com
Cc: acme@ghostprotocols.net
LKML-Reference: <1270184365-8281-7-git-send-email-tzanussi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-14 11:56:07 +02:00
Tom Zanussi
2c46dbb517 perf: Convert perf header attrs into attr events
Bypasses the attr perf header code and replaces it with a
synthesized event and processing function that accomplishes the
same thing, used when reading/writing perf data to/from a pipe.

Making the attrs into events allows them to be streamed over a
pipe along with the rest of the header data (in later patches).
It also paves the way to allowing events to be added and removed
from perf sessions dynamically.

Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fweisbec@gmail.com
Cc: rostedt@goodmis.org
Cc: k-keiichi@bx.jp.nec.com
Cc: acme@ghostprotocols.net
LKML-Reference: <1270184365-8281-6-git-send-email-tzanussi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-14 11:56:07 +02:00
Tom Zanussi
529870e374 perf record: Introduce special handling for pipe output
Adds special treatment for stdout - if the user specifies '-o -'
to perf record, the intent is that the event stream be written
to stdout rather than to a disk file.

Also, redirect stdout of forked child to stderr - in pipe mode,
stdout of the forked child interferes with the stdout perf
stream, so redirect it to stderr where it can still be seen but
won't be mixed in with the perf output.

Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fweisbec@gmail.com
Cc: rostedt@goodmis.org
Cc: k-keiichi@bx.jp.nec.com
Cc: acme@ghostprotocols.net
LKML-Reference: <1270184365-8281-3-git-send-email-tzanussi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-14 11:56:06 +02:00
Tom Zanussi
8dc58101f2 perf: Add pipe-specific header read/write and event processing code
This patch makes several changes to allow the perf event stream
to be sent and received over a pipe:

- adds pipe-specific versions of the header read/write code

- adds pipe-specific version of the event processing code

- adds a range of event types to be used for header or other
  pseudo events, above the range used by the kernel

- checks the return value of event handlers, which they can use
  to skip over large events during event processing rather than actually
  reading them into event objects.

- unifies the multiple do_read() functions and updates its
  users.

Note that none of these changes affect the existing perf data
file format or processing - this code only comes into play if
perf output is sent to stdout (or is read from stdin).

Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fweisbec@gmail.com
Cc: rostedt@goodmis.org
Cc: k-keiichi@bx.jp.nec.com
Cc: acme@ghostprotocols.net
LKML-Reference: <1270184365-8281-2-git-send-email-tzanussi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-14 11:56:05 +02:00
Ian Munsie
c055564217 perf: Fix endianness argument compatibility with OPT_BOOLEAN() and introduce OPT_INCR()
Parsing an option from the command line with OPT_BOOLEAN on a
bool data type would not work on a big-endian machine due to the
manner in which the boolean was being cast into an int and
incremented. For example, running 'perf probe --list' on a
PowerPC machine would fail to properly set the list_events bool
and would therefore print out the usage information and
terminate.

This patch makes OPT_BOOLEAN work as expected with a bool
datatype. For cases where the original OPT_BOOLEAN was
intentionally being used to increment an int each time it was
passed in on the command line, this patch introduces OPT_INCR
with the old behaviour of OPT_BOOLEAN (the verbose variable is
currently the only such example of this).

I have reviewed every use of OPT_BOOLEAN to verify that a true
C99 bool was passed. Where integers were used, I verified that
they were only being used for boolean logic and changed them to
bools to ensure that they would not be mistakenly used as ints.
The major exception was the verbose variable which now uses
OPT_INCR instead of OPT_BOOLEAN.

Signed-off-by: Ian Munsie <imunsie@au.ibm.com>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: <stable@kernel.org> # NOTE: wont apply to .3[34].x cleanly, please backport
Cc: Git development list <git@vger.kernel.org>
Cc: Ian Munsie <imunsie@au1.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Eric B Munson <ebmunson@us.ibm.com>
Cc: Valdis.Kletnieks@vt.edu
Cc: WANG Cong <amwang@redhat.com>
Cc: Thiago Farina <tfransosi@gmail.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Cc: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: John Kacur <jkacur@redhat.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1271147857-11604-1-git-send-email-imunsie@au.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-14 11:26:44 +02:00
Arnaldo Carvalho de Melo
e206d556c5 perf tools: Move the prototypes in util/string.h to util.h
So that we avoid conflict with libc's string.h header.

Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Suggested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-03 10:19:26 -03:00
Arnaldo Carvalho de Melo
70162138c9 perf record: Add a fallback to the reference relocation symbol
Usually "_text" is enough, but I received reports that its not always
available, so fallback to "_stext" for the symbol we use to check if we
need to apply any relocation to all the symbols in the kernel symtab,
for when, for instance, kexec is being used.

Reported-by: Darren Hart <dvhltc@us.ibm.com>
Cc: Darren Hart <dvhltc@us.ibm.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-02 16:28:06 -03:00
Zhang, Yanmin
5a10317483 perf record: Zero out mmap_array to fix segfault
Reported-by: Li Zefan <lizf@cn.fujitsu.com>
Tested-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1269557941-15617-6-git-send-email-acme@infradead.org>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-26 08:52:59 +01:00
Zhang, Yanmin
d6d901c23a perf events: Change perf parameter --pid to process-wide collection instead of thread-wide
Parameter --pid (or -p) of perf currently means a thread-wide
collection. For exmaple, if a process whose id is 8888 has 10
threads, 'perf top -p 8888' just collects the main thread
statistics. That's misleading. Users are used to attach a whole
process when debugging a process by gdb. To follow normal usage
style, the patch change --pid to process-wide collection and add
--tid (-t) to mean a thread-wide collection.

Usage example is:

 # perf top -p 8888
 # perf record -p 8888 -f sleep 10
 # perf stat -p 8888 -f sleep 10

Above commands collect the statistics of all threads of process
8888.

Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Sheng Yang <sheng@linux.intel.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Jes Sorensen <Jes.Sorensen@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: zhiteng.huang@intel.com
Cc: Zachary Amsden <zamsden@redhat.com>
LKML-Reference: <1268922965-14774-3-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-18 16:21:12 +01:00
Zhang, Yanmin
46be604b5b perf record: Enable counters only when kernel is execing subcommand
'perf record' starts counters before subcommand is execed, so
the statistics is not precise because it includes data of some
preparation steps. I fix it with the patch.

In addition, change the condition to fork/exec subcommand. If
there is a subcommand parameter, perf always fork/exec it. The
usage example is:

 # perf record -f -a sleep 10

So this command could collect statistics for 10 seconds
precisely. User still could stop it by CTRL+C. Without the new
capability, user could only input CTRL+C to stop it without
precise time clock.

Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Sheng Yang <sheng@linux.intel.com>
Cc: oerg Roedel <joro@8bytes.org>
Cc: Jes Sorensen <Jes.Sorensen@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: <zhiteng.huang@intel.com>
Cc: Zachary Amsden <zamsden@redhat.com>
LKML-Reference: <1268922965-14774-2-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-18 16:21:11 +01:00
Eric B Munson
bedbfdea31 perf record: Enable the enable_on_exec flag if record forks the target
When forking its target, perf record can capture data from
before the target application is started.  Perf stat uses the
enable_on_exec flag in the event attributes to keep from
displaying events from before the target program starts, this
patch adds the same functionality to perf record when it is will
fork the target process.

Signed-off-by: Eric B Munson <ebmunson@us.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1268664418-28328-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-15 16:08:22 +01:00
Ingo Molnar
937779db13 Merge branch 'perf/urgent' into perf/core
Merge reason: We want to queue up a dependent patch.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-12 10:20:59 +01:00
Arnaldo Carvalho de Melo
6230f2c7ef perf record: Mention paranoid sysctl when failing to create counter
[acme@mica linux-2.6-tip]$ perf record -a -f
   Fatal: Permission error - are you root?
 	 Consider tweaking /proc/sys/kernel/perf_event_paranoid.

 [acme@mica linux-2.6-tip]$

Suggested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1268333592-30872-2-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-11 20:00:48 +01:00
Arnaldo Carvalho de Melo
9f591fd76a perf record: Don't try to find buildids in a zero sized file
Fixing this symptom:

 [acme@mica linux-2.6-tip]$ perf record -a -f
   Fatal: Permission error - are you root?

 Bus error
 [acme@mica linux-2.6-tip]$

I.e. if for some reason no data is collected, in this case a non
root user trying to do systemwide profiling, no data will be
collected, and then we end up trying to mmap a zero sized file
and access the file header, b00m.

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: <stable@kernel.org>
LKML-Reference: <1268333592-30872-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-11 20:00:32 +01:00
Paul Mackerras
a12b51c478 perf tools: Fix sparse CPU numbering related bugs
At present, the perf subcommands that do system-wide monitoring
(perf stat, perf record and perf top) don't work properly unless
the online cpus are numbered 0, 1, ..., N-1.  These tools ask
for the number of online cpus with sysconf(_SC_NPROCESSORS_ONLN)
and then try to create events for cpus 0, 1, ..., N-1.

This creates problems for systems where the online cpus are
numbered sparsely.  For example, a POWER6 system in
single-threaded mode (i.e. only running 1 hardware thread per
core) will have only even-numbered cpus online.

This fixes the problem by reading the /sys/devices/system/cpu/online
file to find out which cpus are online.  The code that does that is in
tools/perf/util/cpumap.[ch], and consists of a read_cpu_map()
function that sets up a cpumap[] array and returns the number of
online cpus.  If /sys/devices/system/cpu/online can't be read or
can't be parsed successfully, it falls back to using sysconf to
ask how many cpus are online and sets up an identity map in cpumap[].

The perf record, perf stat and perf top code then calls
read_cpu_map() in the system-wide monitoring case (instead of
sysconf) and uses cpumap[] to get the cpu numbers to pass to
perf_event_open.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
LKML-Reference: <20100310093609.GA3959@brick.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-11 13:36:53 +01:00
Eric B Munson
8907fd607b perf record: Add ID and to recorded event data when recording multiple events
Currently perf record does not write the ID or the to disk for
events. This doesn't allow report to tell if an event stream
contains one or more types of events.  This patch adds this
entry to the list of data that record will write to disk if more
than one event was requested.

Signed-off-by: Eric B Munson <ebmunson@us.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1267804269-22660-2-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-10 13:53:46 +01:00
austin_zhang@linux.intel.com
f7e7ee3675 perf record: Fix existing process callgraph symbol
When 'perf record -g' a existing process, even with debuginfo
packages, still cannnot get symbol from 'perf report'.

try:

 perf record -g -p `pidof xxx` -f
 perf report

    68.26%    :1181           b74870f2  [.] 0x000000b74870f2
              |
              |--32.09%-- 0xb73b5b44
              |          0xb7487102
              |          0xb748a4e2
              |          0xb748633d
              |          0xb73b41cd
              |          0xb73b4467
              |          0xb747d531

The reason is: for existing process, in __cmd_record(),
the pid is 0 rather than the existing process id.

Signed-off-by: Austin Zhang <austin_zhang@linux.intel.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <4710.10.255.24.35.1265389362.squirrel@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-02-08 16:55:52 +01:00
Xiao Guangrong
f887f3019e perf tools: Clean up O_LARGEFILE et al usage
Setting _FILE_OFFSET_BITS and using O_LARGEFILE, lseek64, etc,
is redundant. Thanks H. Peter Anvin for pointing it out.

So, this patch removes O_LARGEFILE, lseek64, etc.

Suggested-by: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <4B6A8972.3070605@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-02-04 10:03:03 +01:00
Arnaldo Carvalho de Melo
6122e4e4f5 perf record: Stop intercepting events, use postprocessing to get build-ids
We want to stream events as fast as possible to perf.data, and
also in the future we want to have splice working, when no
interception will be possible.

Using build_id__mark_dso_hit_ops to create the list of DSOs that
back MMAPs we also optimize disk usage in the build-id cache by
only caching DSOs that had hits.

Suggested-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1265223128-11786-6-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-02-04 09:33:27 +01:00
Xiao Guangrong
b8f46c5a34 perf tools: Use O_LARGEFILE to open perf data file
Open perf data file with O_LARGEFILE flag since its size is
easily larger that 2G.

For example:

 # rm -rf perf.data
 # ./perf kmem record sleep 300

 [ perf record: Woken up 0 times to write data ]
 [ perf record: Captured and wrote 3142.147 MB perf.data
 (~137282513 samples) ]

 # ll -h perf.data
 -rw------- 1 root root 3.1G .....

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <4B68F32A.9040203@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-02-03 09:03:59 +01:00
Hitoshi Mitake
a8e6f734ce Revert "perf record: Intercept all events"
This reverts commit f5a2c3dce0.

This patch is required for making "perf lock rec" work.
The commit f5a2c3dce0 changes write_event() of builtin-record.c
. And changed write_event() sometimes doesn't stop with perf
lock rec.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
[ that commit also causes perf record to not be Ctrl-C-able,
  and it's concetually wrong to parse the data at record time
  (unconditionally - even when not needed), as we eventually
  want to be able to do zero-copy recording, at least for
  non-archive recordings.  ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-01-31 08:27:52 +01:00
Arnaldo Carvalho de Melo
64abebf731 perf session: Create kernel maps in the constructor
Removing one extra step needed in the tools that need this,
fixing a bug in 'perf probe' where this was not being done.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1264633557-17597-4-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-01-29 09:20:58 +01:00
Arnaldo Carvalho de Melo
f5a2c3dce0 perf record: Intercept all events
The event interception we need to do in 'perf record' to create
a list of all DSOs in PERF_RECORD_MMAP events wasn't seeing all
events, make sure that happens by checking size agains
event_t->header.size.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1263586107-1756-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-01-16 10:58:50 +01:00
Arnaldo Carvalho de Melo
18c3daa496 perf record: Encode the domain while synthesizing MMAP events
In the past 'perf record' had to process only userspace MMAP
events, the ones generated in the kernel, but after we reused
the MMAP events to encode the module mapings we ended up adding
them first to the list of userspace DSOs (dsos__user) and to the
kernel one (dsos__kernel).

Fix this by encoding the header.misc field and then using it,
like other parts to decide the right DSOs list to insert/find.

The gotcha here is that since the kernel puts zero in .misc,
which isn't PERF_RECORD_MISC_KERNEL (1 << 1), to differentiate,
we put 1 in .misc.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1263519930-22803-2-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-01-16 10:58:48 +01:00
Arnaldo Carvalho de Melo
b7cece7678 perf tools: Encode kernel module mappings in perf.data
We were always looking at the running machine /proc/modules,
even when processing a perf.data file, which only makes sense
when we're doing 'perf record' and 'perf report' on the same
machine, and in close sucession, or if we don't use modules at
all, right Peter? ;-)

Now, at 'perf record' time we read /proc/modules, find the long
path for modules, and put them as PERF_MMAP events, just like we
did to encode the reloc reference symbol for vmlinux. Talking
about that now it is encoded in .pgoff, so that we can use
.{start,len} to store the address boundaries for the kernel so
that when we reconstruct the kmaps tree we can do lookups right
away, without having to fixup the end of the kernel maps like we
did in the past (and now only in perf record).

One more step in the 'perf archive' direction when we'll finally
be able to collect data in one machine and analyse in another.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1263396139-4798-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-01-13 17:39:43 +01:00
Arnaldo Carvalho de Melo
56b03f3c4d perf tools: Handle relocatable kernels
DSOs don't have this problem because the kernel emits a
PERF_MMAP for each new executable mapping it performs on
monitored threads.

To fix the kernel case we simulate the same behaviour, by having
'perf record' to synthesize a PERF_MMAP for the kernel, encoded
like this:

[root@doppio ~]# perf record -a -f sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.344 MB perf.data (~15038 samples) ]
[root@doppio ~]# perf report -D | head -10

0xd0 [0x40]: event: 1
.
. ... raw event: size 64 bytes
.  0000:  01 00 00 00 00 00 40 00 00 00 00 00 00 00 00 00 ......@........
.  0010:  00 00 00 81 ff ff ff ff 00 00 00 00 00 00 00 00 ...............
.  0020:  00 00 00 00 00 00 00 00 5b 6b 65 72 6e 65 6c 2e ........  [kernel
.  0030:  6b 61 6c 6c 73 79 6d 73 2e 5f 74 65 78 74 5d 00  kallsyms._text]
.  0xd0
[0x40]: PERF_RECORD_MMAP 0/0: [0xffffffff81000000((nil)) @ (nil)]: [kernel.kallsyms._text]

I.e. we identify such event as having:

 .pid      = 0
 .filename = [kernel.kallsyms.REFNAME]
 .start    = REFNAME addr in /proc/kallsyms at 'perf record' time

and use now a hardcoded value of '.text' for REFNAME.

Then, later, in 'perf report', if there are any kernel hits and
thus we need to resolve kernel symbols, we search for REFNAME
and if its address changed, relocation happened and we thus must
change the kernel mapping routines to one that uses .pgoff as
the relocation to apply.

This way we use the same mechanism used for the other DSOs and
don't have to do a two pass in all the kernel symbols.

Reported-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
LKML-Reference: <1262717431-1246-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-01-13 10:09:11 +01:00
Arnaldo Carvalho de Melo
d4db3f1645 perf record: We should fork only if a program was specified to run
IOW: Now 'perf record -a' works, this was a bug introduced in:

856e96608a
"perf record: Properly synchronize child creation"

Also fix the -C usage, i.e. allow for profiling all the tasks in
one CPU.

Reported-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1261957026-15580-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-28 09:02:50 +01:00
Peter Zijlstra
60ab271617 perf record: Use per-task-per-cpu events for inherited events
Create events with a pid and cpu contraint for inherited events
so that we get a stream per cpu, instead of all cpus contending
on a single stream.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: fweisbec@gmail.com
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <20091216165904.987643843@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-16 18:30:13 +01:00
Peter Zijlstra
856e96608a perf record: Properly synchronize child creation
Remove that ugly usleep and provide proper serialization between
parent and child just like perf-stat does.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: fweisbec@gmail.com
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <20091216165904.908184135@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-16 18:30:12 +01:00
Arnaldo Carvalho de Melo
655000e7c7 perf symbols: Adopt the strlists for dso, comm
Will be used in perf diff too.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1260914682-29652-2-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-16 08:53:49 +01:00
Arnaldo Carvalho de Melo
75be6cf487 perf symbols: Make symbol_conf global
This simplifies a lot of functions, less stuff to be done by
tool writers.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1260914682-29652-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-16 08:53:48 +01:00
Arnaldo Carvalho de Melo
b38d34645c perf record: Rename perf.data to perf.data.old if --force/-f is used
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1260828571-3613-2-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-15 08:50:29 +01:00
Arnaldo Carvalho de Melo
4aa6563641 perf session: Move kmaps to perf_session
There is still some more work to do to disentangle map creation
from DSO loading, but this happens only for the kernel, and for
the early adopters of perf diff, where this disentanglement
matters most, we'll be testing different kernels, so no problem
here.

Further clarification: right now we create the kernel maps for
the various modules and discontiguous kernel text maps when
loading the DSO, we should do it as a two step process, first
creating the maps, for multiple mappings with the same DSO
store, then doing the dso load just once, for the first hit on
one of the maps sharing this DSO backing store.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1260741029-4430-6-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-14 16:57:17 +01:00
Arnaldo Carvalho de Melo
d8f66248d6 perf session: Pass the perf_session to the event handling operations
They will need it to get the right threads list, etc.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1260741029-4430-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-14 16:57:13 +01:00
Arnaldo Carvalho de Melo
94c744b6c0 perf tools: Introduce perf_session class
That does all the initialization boilerplate, opening the file,
reading the header, checking if it is valid, etc.

And that will as well have the threads list, kmap (now) global
variable, etc, so that we can handle two (or more) perf.data files
describing sessions to compare.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1260573842-19720-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-12 07:42:12 +01:00
Simon Kaempflein
bfd451184d perf record, x86: Print more intelligent error message when sampling fails
Print more accurate error message when "perf record" fails because
there is no APIC support, on x86.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-23 09:40:13 +01:00
Arnaldo Carvalho de Melo
d5eed904bb perf tools: Eliminate some more die() uses in library functions
This time in perf_header__adds_write, propagating the do_write
error returns.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1258649757-17554-2-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-19 18:47:17 +01:00
Arnaldo Carvalho de Melo
4dc0a04bb1 perf tools: perf_header__read() shouldn't die()
And also don't call the constructor in it, this way it adheres
to the model the other methods follow.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1258649757-17554-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-19 18:47:17 +01:00
Arnaldo Carvalho de Melo
a9a70bbce7 perf tools: Don't die() in perf_header__new()
Propagate the errors instead, the users are the ones to decide
what to do if a library call fails.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <1258427892-16312-3-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-17 07:19:56 +01:00
Arnaldo Carvalho de Melo
5875412152 perf tools: Don't die() in perf_header_attr__add_id()
Propagate the errors instead, the users are the ones to decide
what to do if a library call fails.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <1258427892-16312-2-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-17 07:19:55 +01:00
Arnaldo Carvalho de Melo
11deb1f9f6 perf tools: Don't die() in perf_header__add_attr()
Propagate the errors instead, the users are the ones to decide
what to do if a library call fails.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <1258427892-16312-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-17 07:19:54 +01:00
Arnaldo Carvalho de Melo
dc79c0fc08 perf tools: Don't die in perf_header_attr__new()
We really should propagate such kinds of errors so that users of
these library functions decide what to do in such cases instead
of exiting in random places like now.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <1258407027-384-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-17 07:19:52 +01:00
Ingo Molnar
39dc78b651 Merge commit 'v2.6.32-rc7' into perf/core
Merge reason: pick up perf fixlets

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-15 09:50:41 +01:00
Frederic Weisbecker
3e13ab2d83 perf tools: Use perf_header__set/has_feat whenever possible
And drop the alternate checks/sets using set_bit or other kind
of helpers.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
LKML-Reference: <1257911467-28276-5-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-11 07:30:19 +01:00
Frederic Weisbecker
8671dab9d5 perf tools: Move the build-id storage operations to headers
So that it makes easier to control it. Especially because we
plan to give it a feature section.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
LKML-Reference: <1257911467-28276-2-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-11 07:30:17 +01:00
Frederic Weisbecker
de8967214d perf tools: Synthetize the targeted process
Don't forget to also synthetize the targeted process from perf
record or we'll miss its dso in the events and then we won't be
able to deal with its build-id.

We are missing it because it is created after the existing
synthetized tasks but before the counters are enabled and can
send its mapping event.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
LKML-Reference: <1257911467-28276-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-11 07:30:17 +01:00
Pekka Enberg
c10edee2e1 perf tools: Fix permission checks
The perf_event_open() system call returns EACCES if the user is
not root which results in a very confusing error message:

  $ perf record -A -a -f

    Error: perfcounter syscall returned with -1 (Permission denied)

    Fatal: No CONFIG_PERF_EVENTS=y kernel support configured?

It turns out that's because perf tools are checking only for
EPERM. Fix that up to get a much better error message:

  $ perf record -A -a -f
    Fatal: Permission error - are you root?

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1257696066-4046-1-git-send-email-penberg@cs.helsinki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-08 17:04:54 +01:00
Arnaldo Carvalho de Melo
8d06367fa7 perf symbols: Use the buildids if present
With this change 'perf record' will intercept PERF_RECORD_MMAP
calls, creating a linked list of DSOs, then when the session
finishes, it will traverse this list and read the buildids,
stashing them at the end of the file and will set up a new
feature bit in the header bitmask.

'perf report' will then notice this feature and populate the
'dsos' list and set the build ids.

When reading the symtabs it will refuse to load from a file that
doesn't have the same build id. This improves the
reliability of the profiler output, as symbols and profiling
data is more guaranteed to match.

Example:

 [root@doppio ~]# perf report | head
 /home/acme/bin/perf with build id b1ea544ac3746e7538972548a09aadecc5753868 not found, continuing without symbols
  # Samples: 2621434559
  #
  # Overhead          Command                  Shared Object  Symbol
  # ........  ...............  .............................  ......
  #
       7.91%             init  [kernel]        [k] read_hpet
       7.64%             init  [kernel]        [k] mwait_idle_with_hints
       7.60%          swapper  [kernel]        [k] read_hpet
       7.60%          swapper  [kernel]        [k] mwait_idle_with_hints
       3.65%             init  [kernel]        [k] 0xffffffffa02339d9
[root@doppio ~]#

In this case the 'perf' binary was an older one, vanished,
so its symbols probably wouldn't match or would cause subtly
different (and misleading) output.

Next patches will support the kernel as well, reading the build
id notes for it and the modules from /sys.

Another patch should also introduce a new plumbing command:

'perf list-buildids'

that will then be used in porcelain that is distro specific to
fetch -debuginfo packages where such buildids are present. This
will in turn allow for one to run 'perf record' in one machine
and 'perf report' in another.

Future work on having the buildid sent directly from the kernel
in the PERF_RECORD_MMAP event is needed to close races, as the
DSO can be changed during a 'perf record' session, but this
patch at least helps with non-corner cases and current/older
kernels.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Frank Ch. Eigler <fche@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Jim Keniston <jkenisto@us.ibm.com>
Cc: K. Prasad <prasad@linux.vnet.ibm.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Roland McGrath <roland@redhat.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1257367843-26224-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-08 10:44:36 +01:00
Arnaldo Carvalho de Melo
234fbbf508 perf tools: Generalize event synthesizing routines
Because we will need it in 'perf top' to support userspace
symbols for existing threads.

Now we pass a callback that will receive the synthesized event
and then write it to the output file in 'perf record' and in the
upcoming patch for 'perf top' we will just immediatelly create
the in memory representation of threads and maps.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1256592199-9608-2-git-send-email-acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-27 13:51:53 +01:00
Arnaldo Carvalho de Melo
7f3bedcc93 perf record: Fix race where process can disappear while reading its /proc/pid/tasks
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1256592199-9608-1-git-send-email-acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-27 13:51:53 +01:00
Arnaldo Carvalho de Melo
6beba7adbe perf tools: Unify debug messages mechanisms
We were using eprintf in some places, that looks at a global
'verbose' level, and at other places passing a 'v' parameter to
specify the verbosity level, unify it by introducing
pr_{err,warning,debug,etc}, just like in the kernel.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1256153646-10097-1-git-send-email-acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-23 08:22:47 +02:00
Frederic Weisbecker
2ba0825075 perf tools: Introduce bitmask'ed additional headers
This provides a new set of bitmasked headers. A new field is
added in the perf headers that implements a bitmap storing
optional features present in the perf.data file.

The layout can be pictured like this:

(Usual perf headers)(Features bitmap)[Feature 0][Feature
n][Feature 255]

If the bit n is set, then the feature n is used in this file.
They are all set in order. This brings a backward and forward
compatibility.

The trace_info section has moved into such optional features,
this is the first and only one for now.

This is backward compatible with the .32 file version although
it doesn't support the previous separate trace.info file.

And finally it doesn't support the current interim development
version.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1255792354-11304-2-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-19 09:26:35 +02:00
Frederic Weisbecker
5a116dd279 perf tools: Use kernel bitmap library
Use the kernel bitmap library for internal perf tools uses.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1255792354-11304-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-19 09:26:34 +02:00
Li Zefan
c171b552a7 perf trace: Add filter Suppport
Add a new option "--filter <filter_str>" to perf record, and
it should be right after "-e trace_point":

 #./perf record -R -f -e irq:irq_handler_entry --filter irq==18
 ^C
 # ./perf trace
            perf-4303  ... irq_handler_entry: irq=18 handler=eth0
            init-0     ... irq_handler_entry: irq=18 handler=eth0
            init-0     ... irq_handler_entry: irq=18 handler=eth0
            init-0     ... irq_handler_entry: irq=18 handler=eth0
            init-0     ... irq_handler_entry: irq=18 handler=eth0

See Documentation/trace/events.txt for the syntax of filter
expressions.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <4AD6955F.90602@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15 11:35:23 +02:00
Mike Galbraith
7e4ff9e3e8 perf tools: Fix counter sample frequency breakage
Commit 42e59d7d19 switched to a default sample frequency of
1KHz, which overrides any user supplied count, causing sched, top
and timechart to miss events due to their discrete events
being flagged PERF_SAMPLE_PERIOD.

Override default sample frequency when the user profides a
period count, and make both record and top honor that user
supplied option.

Signed-off-by: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arjan van de Ven <arjan@infradead.org>
LKML-Reference: <1255326963.15107.2.camel@marge.simson.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-12 09:10:55 +02:00
Frederic Weisbecker
03456a158d perf tools: Merge trace.info content into perf.data
This drops the trace.info file and move its contents into the
common perf.data file.

This is done by creating a new trace_info section into this file. A
user of perf headers needs to call perf_header__set_trace_info() to
save the trace meta informations into the perf.data file.

A file created by perf after his patch is unsupported by previous
version because the size of the headers have increased.

That said, it's two new fields that have been added in the end of
the headers, and those could be ignored by previous versions if
they just handled the dynamic header size and then ignore the
unknow part. The offsets guarantee the compatibility. We'll do a
-stable fix for that.

But current previous versions handle the header size using its
static size, not dynamic, then it's not backward compatible with
trace records.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <20091006213643.GA5343@nowhere>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-07 08:36:10 +02:00
Ingo Molnar
42e59d7d19 perf tools: Default to 1 KHz auto-sampling freq events
Use auto-freq events by default in perf record and
perf top.

This allows more consistent hardware event sampling,
regardless of the intensity of the underlying event.

It also keeps us from over-sampling on larger/busier
systems.

(also make surrounding initializations more consistent)

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-06 15:41:09 +02:00
Chris Wilson
933da83aa1 perf: Propagate term signal to child
If we launch the child on behalf of the user, ensure that it dies
along with ourselves when we are interrupted.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
LKML-Reference: <1254616502-4728-1-git-send-email-chris@chris-wilson.co.uk>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-04 19:37:39 +02:00
Ingo Molnar
cdd6c482c9 perf: Do the big rename: Performance Counters -> Performance Events
Bye-bye Performance Counters, welcome Performance Events!

In the past few months the perfcounters subsystem has grown out its
initial role of counting hardware events, and has become (and is
becoming) a much broader generic event enumeration, reporting, logging,
monitoring, analysis facility.

Naming its core object 'perf_counter' and naming the subsystem
'perfcounters' has become more and more of a misnomer. With pending
code like hw-breakpoints support the 'counter' name is less and
less appropriate.

All in one, we've decided to rename the subsystem to 'performance
events' and to propagate this rename through all fields, variables
and API names. (in an ABI compatible fashion)

The word 'event' is also a bit shorter than 'counter' - which makes
it slightly more convenient to write/handle as well.

Thanks goes to Stephane Eranian who first observed this misnomer and
suggested a rename.

User-space tooling and ABI compatibility is not affected - this patch
should be function-invariant. (Also, defconfigs were not touched to
keep the size down.)

This patch has been generated via the following script:

  FILES=$(find * -type f | grep -vE 'oprofile|[^K]config')

  sed -i \
    -e 's/PERF_EVENT_/PERF_RECORD_/g' \
    -e 's/PERF_COUNTER/PERF_EVENT/g' \
    -e 's/perf_counter/perf_event/g' \
    -e 's/nb_counters/nb_events/g' \
    -e 's/swcounter/swevent/g' \
    -e 's/tpcounter_event/tp_event/g' \
    $FILES

  for N in $(find . -name perf_counter.[ch]); do
    M=$(echo $N | sed 's/perf_counter/perf_event/g')
    mv $N $M
  done

  FILES=$(find . -name perf_event.*)

  sed -i \
    -e 's/COUNTER_MASK/REG_MASK/g' \
    -e 's/COUNTER/EVENT/g' \
    -e 's/\<event\>/event_id/g' \
    -e 's/counter/event/g' \
    -e 's/Counter/Event/g' \
    $FILES

... to keep it as correct as possible. This script can also be
used by anyone who has pending perfcounters patches - it converts
a Linux kernel tree over to the new naming. We tried to time this
change to the point in time where the amount of pending patches
is the smallest: the end of the merge window.

Namespace clashes were fixed up in a preparatory patch - and some
stylistic fallout will be fixed up in a subsequent patch.

( NOTE: 'counters' are still the proper terminology when we deal
  with hardware registers - and these sed scripts are a bit
  over-eager in renaming them. I've undone some of that, but
  in case there's something left where 'counter' would be
  better than 'event' we can undo that on an individual basis
  instead of touching an otherwise nicely automated patch. )

Suggested-by: Stephane Eranian <eranian@google.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Paul Mackerras <paulus@samba.org>
Reviewed-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: <linux-arch@vger.kernel.org>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-21 14:28:04 +02:00
Peter Zijlstra
8b412664d0 perf record: Disable profiling before draining the buffer
I noticed that perf-record continues profiling itself after the
child terminated and we're draining the buffer.

This can cause a _lot_ of overhead with --all recording - we keep
and keep recording, which produces new and new events.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-17 22:08:27 +02:00
Ingo Molnar
ea57c4f520 perf tools: Implement counter output multiplexing
Finish the -M/--multiplex option implementation:

 - separate it out from group_fd

 - correctly set it via the ioctl and dont mmap counters that
   are multiplexed

 - modify the perf record event loop to deal with buffer-less
   counters.

 - remove the -g option from perf sched record

 - account for unordered events in perf sched latency

 - (add -f to perf sched record to ease measurements)

 - skip idle threads (pid==0) in latency output

The result is better latency output by 'perf sched latency':

 -----------------------------------------------------------------------------------
  Task              |  Runtime ms | Switches | Average delay ms | Maximum delay ms |
 -----------------------------------------------------------------------------------
  ksoftirqd/8       |    0.071 ms |        2 | avg:    0.458 ms | max:    0.913 ms |
  at-spi-registry   |    0.609 ms |       19 | avg:    0.013 ms | max:    0.023 ms |
  perf              |    3.316 ms |       16 | avg:    0.013 ms | max:    0.054 ms |
  Xorg              |    0.392 ms |       19 | avg:    0.011 ms | max:    0.018 ms |
  sleep             |    0.537 ms |        2 | avg:    0.009 ms | max:    0.009 ms |
 -----------------------------------------------------------------------------------
  TOTAL:            |    4.925 ms |       58 |
 ---------------------------------------------

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-14 15:45:11 +02:00
Frederic Weisbecker
d13025222c perf tools: Add an option to multiplex counters in a single channel
Add an option to multiplex counters output in the channel of
the group leader, ie: the first counter opened:

	-M --multiplex

The effect is better serialized samples. This is especially
useful for tracepoint samples that need to be well serialized
for their post-processing.

Also make use of this option in 'perf sched'.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-14 09:52:23 +02:00
Ingo Molnar
6ddf259da7 perf trace: Sample timestamps as well
Before:

            perf-21082 [013]     0.000000: sched_wakeup_new: task perf:21083 [120] success=1 [015]
            perf-21082 [013]     0.000000: sched_migrate_task: task perf:21082 [120] from: 13  to: 15
            perf-21082 [013]     0.000000: sched_process_fork: parent perf:21082  child perf:21083
            true-21083 [015]     0.000000: sched_wakeup: task migration/15:33 [0] success=1 [015]
            perf-21082 [013]     0.000000: sched_switch: task perf:21082 [120] (S) ==> swapper:0 [140]
            true-21083 [015]     0.000000: sched_switch: task perf:21083 [120] (R) ==> migration/15:33 [0]
            true-21083 [011]     0.000000: sched_process_exit: task true:21083 [120]

After:

            perf-21082 [013] 14674.797613: sched_wakeup_new: task perf:21083 [120] success=1 [015]
            perf-21082 [013] 14674.797506: sched_migrate_task: task perf:21082 [120] from: 13  to: 15
            perf-21082 [013] 14674.797610: sched_process_fork: parent perf:21082  child perf:21083
            true-21083 [015] 14674.797725: sched_wakeup: task migration/15:33 [0] success=1 [015]
            perf-21082 [013] 14674.797722: sched_switch: task perf:21082 [120] (S) ==> swapper:0 [140]
            true-21083 [015] 14674.797729: sched_switch: task perf:21083 [120] (R) ==> migration/15:33 [0]
            true-21083 [011] 14674.798159: sched_process_exit: task true:21083 [120]

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-03 15:45:49 +02:00
Ingo Molnar
cd6feeeafd perf trace: Sample the CPU too
Sample, record, parse and print the CPU field - it had all zeroes before.

Before (watch the second column, the CPU values):

            perf-32685 [000]     0.000000: sched_wakeup_new: task perf:32686 [120] success=1 [011]
            perf-32685 [000]     0.000000: sched_migrate_task: task perf:32685 [120] from: 1  to: 11
            perf-32685 [000]     0.000000: sched_process_fork: parent perf:32685  child perf:32686
            true-32686 [000]     0.000000: sched_wakeup: task migration/11:25 [0] success=1 [011]
            true-32686 [000]     0.000000: sched_wakeup: task distccd:12793 [125] success=1 [015]
            true-32686 [000]     0.000000: sched_wakeup: task distccd:12793 [125] success=1 [015]
            perf-32685 [000]     0.000000: sched_switch: task perf:32685 [120] (S) ==> swapper:0 [140]
            true-32686 [000]     0.000000: sched_switch: task perf:32686 [120] (R) ==> migration/11:25 [0]
            true-32686 [000]     0.000000: sched_switch: task perf:32686 [120] (R) ==> distccd:12793 [125]
            true-32686 [000]     0.000000: sched_switch: task true:32686 [120] (R) ==> distccd:12793 [125]
            true-32686 [000]     0.000000: sched_process_exit: task true:32686 [120]
            true-32686 [000]     0.000000: sched_stat_wait: task: distccd:12793 wait: 6767985949080 [ns]
            true-32686 [000]     0.000000: sched_stat_wait: task: distccd:12793 wait: 6767986139446 [ns]
            true-32686 [000]     0.000000: sched_stat_sleep: task: distccd:12793 sleep: 132844 [ns]
            true-32686 [000]     0.000000: sched_stat_sleep: task: distccd:12793 sleep: 131724 [ns]

After:

            perf-32685 [001]     0.000000: sched_wakeup_new: task perf:32686 [120] success=1 [011]
            perf-32685 [001]     0.000000: sched_migrate_task: task perf:32685 [120] from: 1  to: 11
            perf-32685 [001]     0.000000: sched_process_fork: parent perf:32685  child perf:32686
            true-32686 [011]     0.000000: sched_wakeup: task migration/11:25 [0] success=1 [011]
            true-32686 [015]     0.000000: sched_wakeup: task distccd:12793 [125] success=1 [015]
            true-32686 [015]     0.000000: sched_wakeup: task distccd:12793 [125] success=1 [015]
            perf-32685 [001]     0.000000: sched_switch: task perf:32685 [120] (S) ==> swapper:0 [140]
            true-32686 [011]     0.000000: sched_switch: task perf:32686 [120] (R) ==> migration/11:25 [0]
            true-32686 [015]     0.000000: sched_switch: task perf:32686 [120] (R) ==> distccd:12793 [125]
            true-32686 [015]     0.000000: sched_switch: task true:32686 [120] (R) ==> distccd:12793 [125]
            true-32686 [015]     0.000000: sched_process_exit: task true:32686 [120]
            true-32686 [015]     0.000000: sched_stat_wait: task: distccd:12793 wait: 6767985949080 [ns]
            true-32686 [015]     0.000000: sched_stat_wait: task: distccd:12793 wait: 6767986139446 [ns]
            true-32686 [015]     0.000000: sched_stat_sleep: task: distccd:12793 sleep: 132844 [ns]
            true-32686 [015]     0.000000: sched_stat_sleep: task: distccd:12793 sleep: 131724 [ns]

So we can now see how this workload migrated between CPUs.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-02 21:28:50 +02:00
Frederic Weisbecker
1ef2ed1066 perf tools: Only save the event formats we need
While opening a trace event counter, every events are saved in
the trace.info file. But we only want to save the
specifications of the events we are using.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1251421798-9101-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-28 07:58:11 +02:00
Frederic Weisbecker
9df37ddd81 perf tools: Record events info also when :record suffix is used.
You can enable a counter's PERF_SAMPLE_RAW attribute in two
fashions:

- using the -R option (every counters get PERF_SAMPLE_RAW)
- using the :record suffix in a trace event counter name

Currently we record the events info in a trace.info file from
perf record when the former method is used but we omit it with
the latter.

Check both situations.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1250543271-8383-3-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-18 00:00:19 +02:00
Frederic Weisbecker
5f9c39dca5 perf tools: Add perf trace
This adds perf trace into the set of perf tools.

It is written to fetch the tracepoint samples from perf events
and display them, according to the events information given by
the debugfs files through the util/trace* tools.

It is a rough first shot and doesn't yet handle the cpu,
timestamps fields and some other things.

Example:

 perf record -f -e workqueue:workqueue_execution:record -F 1 -a
 perf trace

       kblockd/0-236   [000]     0.000000: workqueue_execution: thread=:236 func=cfq_kick_queue+0x0
     kondemand/0-360   [000]     0.000000: workqueue_execution: thread=:360 func=do_dbs_timer+0x0
     kondemand/0-360   [000]     0.000000: workqueue_execution: thread=:360 func=do_dbs_timer+0x0
     kondemand/1-361   [000]     0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0
     kondemand/1-361   [000]     0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0
     kondemand/1-361   [000]     0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0
     kondemand/1-361   [000]     0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0
     kondemand/1-361   [000]     0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0
     kondemand/1-361   [000]     0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0
     kondemand/1-361   [000]     0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0
     kondemand/1-361   [000]     0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0
     kondemand/1-361   [000]     0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0
     kondemand/1-361   [000]     0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0
     kondemand/1-361   [000]     0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0
     kondemand/1-361   [000]     0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0
     kondemand/1-361   [000]     0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0

Todo:

- A lot of things!

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: "Luis Claudio R. Goncalves" <lclaudio@uudg.org>
Cc: Clark Williams <williams@redhat.com>
Cc: Jon Masters <jonathan@jonmasters.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Cc: Zhaolei <zhaolei@cn.fujitsu.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: "Frank Ch. Eigler" <fche@redhat.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Jiaying Zhang <jiayingz@google.com>
Cc: Anton Blanchard <anton@samba.org>
LKML-Reference: <1250518688-7207-4-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-17 16:32:39 +02:00
Frederic Weisbecker
8f28827a16 perf tools: Librarize trace_event() helper
Librarize trace_event() helper so that perf trace can use it
too. Also clean up the debug.h includes a bit.

It's not good to have it included in perf.h because it doesn't
make it flexible against other headers it may need (headers
that can also depend on perf.h and then create a recursive
header dependency).

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1250453149-664-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-16 23:06:45 +02:00
Ingo Molnar
be750231ce Merge branch 'perfcounters/urgent' into perfcounters/core
Conflicts:
	kernel/perf_counter.c

Merge reason: update to latest upstream (-rc6) and resolve
              the conflict with urgent fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-15 12:06:12 +02:00
Arnaldo Carvalho de Melo
39e6dd7350 perf record: Fix typo in pid_synthesize_comm_event
We were using 'fd' locally, but there was a global 'fd' too, so
when converting from open to fopen the test made against fd
should be made against 'fp', but since we have that global
it didnt get discovered ...

Reported-by: Ulrich Drepper <drepper@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20090814182632.GF3490@ghostprotocols.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-15 11:59:06 +02:00
Frederic Weisbecker
daac07b2e6 perf tools: Add a general option to enable raw sample records
While we can enable the perf sample records per tracepoint
counter, we may also want to enable this option for every
tracepoint counters to open, so that we don't need to add a
:record flag for all of them.

Add the -R, --raw-samples options for this purpose.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1250152039-7284-2-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-13 10:37:25 +02:00
Frederic Weisbecker
3a9f131fb0 perf tools: Add a per tracepoint counter attribute to get raw sample
Add a new flag field while opening a tracepoint perf counter:

	-e tracepoint_subsystem:tracepoint_name:flags

This is intended to be generic although for now it only supports the
r[e[c[o[r[d]]]]] flag:

	./perf record -e workqueue:workqueue_insertion:record
	./perf record -e workqueue:workqueue_insertion:r

will have the same effect: enabling the raw samples record for
the given tracepoint counter.

In the future, we may want to support further flags, separated
by commas.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1250152039-7284-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-13 10:37:25 +02:00
Jens Axboe
0a5ac84650 perf record: Add missing -C option support for specifying profile cpu
perf top supports a -C for setting the profile CPU, but perf
record does not. This adds the same option for perf record,
allowing the user to specify a specific target profile CPU.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20090812091801.GC12579@kernel.dk>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-12 14:10:51 +02:00
Arnaldo Carvalho de Melo
2a8083f063 perf record: Fix .tid and .pid fill-in when synthesizing events
Noticed when trying to record events for a firefox thread. We
were synthesizing both .tid and .pid with the pid passed via
--pid.

Fix it by reading /proc/PID/status and getting the tgid
to use in .pid, .tid gets the specified "pid".

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <20090811192200.GF18061@ghostprotocols.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-12 14:10:48 +02:00
Frederic Weisbecker
66e274f3b8 perf tools: Factorize the map helpers
Factorize the dso mapping helpers into a single purpose common file
"util/map.c"

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Brice Goglin <Brice.Goglin@inria.fr>
2009-08-12 12:37:37 +02:00
Frederic Weisbecker
1fe2c1066c perf tools: Factorize the event structure definitions in a single file
Factorize the multiple definition of the events structures into a
single util/event.h file.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Brice Goglin <Brice.Goglin@inria.fr>
2009-08-12 12:04:39 +02:00
Frederic Weisbecker
cd84c2ac6d perf tools: Factorize high level dso helpers
Factorize multiple definitions of high level dso helpers into the
symbol source file.

The side effect is a general export of the verbose and eprintf
debugging helpers into a new file dedicated to debugging purposes.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Brice Goglin <Brice.Goglin@inria.fr>
2009-08-12 12:02:38 +02:00
Pierre Habouzit
266e0e2198 perf record: Fix the -A UI for empty or non-existent perf.data
1. Ignore the -A argument if there is no perf.data file
2. Treat an empty file like a non existent file.

Else, perf will try to read the perf.data header, and fail with
an error.

Treating an empty file like a non-existent file makes sense,
since an interupted (as in SIGKILLed) perf could leave such
files around, and you don't want to annoy the user with errors
for files with no data in it.

Signed-off-by: Pierre Habouzit <pierre.habouzit@intersec.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09 12:54:40 +02:00
Frederic Weisbecker
f413cdb80c perf_counter: Fix/complete ftrace event records sampling
This patch implements the kernel side support for ftrace event
record sampling.

A new counter sampling attribute is added:

   PERF_SAMPLE_TP_RECORD

which requests ftrace events record sampling. In this case
if a PERF_TYPE_TRACEPOINT counter is active and a tracepoint
fires, we emit the tracepoint binary record to the
perfcounter event buffer, as a sample.

Result, after setting PERF_SAMPLE_TP_RECORD attribute from perf
record:

 perf record -f -F 1 -a -e workqueue:workqueue_execution
 perf report -D

 0x21e18 [0x48]: event: 9
 .
 . ... raw event: size 72 bytes
 .  0000:  09 00 00 00 01 00 48 00 d0 c7 00 81 ff ff ff ff  ......H........
 .  0010:  0a 00 00 00 0a 00 00 00 21 00 00 00 00 00 00 00  ........!......
 .  0020:  2b 00 01 02 0a 00 00 00 0a 00 00 00 65 76 65 6e  +...........eve
 .  0030:  74 73 2f 31 00 00 00 00 00 00 00 00 0a 00 00 00  ts/1...........
 .  0040:  e0 b1 31 81 ff ff ff ff                          .......
.
0x21e18 [0x48]: PERF_EVENT_SAMPLE (IP, 1): 10: 0xffffffff8100c7d0 period: 33

The raw ftrace binary record starts at offset 0020.

Translation:

 struct trace_entry {
	type		= 0x2b = 43;
	flags		= 1;
	preempt_count	= 2;
	pid		= 0xa = 10;
	tgid		= 0xa = 10;
 }

 thread_comm = "events/1"
 thread_pid  = 0xa = 10;
 func	    = 0xffffffff8131b1e0 = flush_to_ldisc()

What will come next?

 - Userspace support ('perf trace'), 'flight data recorder' mode
   for perf trace, etc.

 - The unconditional copy from the profiling callback brings
   some costs however if someone wants no such sampling to
   occur, and needs to be fixed in the future. For that we need
   to have an instant access to the perf counter attribute.
   This is a matter of a flag to add in the struct ftrace_event.

 - Take care of the events recursivity! Don't ever try to record
   a lock event for example, it seems some locking is used in
   the profiling fast path and lead to a tracing recursivity.
   That will be fixed using raw spinlock or recursivity
   protection.

 - [...]

 - Profit! :-)

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Gabriel Munteanu <eduard.munteanu@linux360.ro>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09 12:53:48 +02:00
Anton Blanchard
a0541234f8 perf_counter: Improve perf stat and perf record option parsing
perf stat and perf record currently look for all options on the command
line. This can lead to some confusion:

# perf stat ls -l
  Error: unknown switch `l'

While we can work around this by adding '--' before the command, the git
option parsing code can stop at the first non option:

# perf stat ls -l
 Performance counter stats for 'ls -l':
....

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20090722130412.GD9029@kryten>
2009-07-22 18:05:56 +02:00
Anton Blanchard
4bba828dd9 perf_counter: Add perf record option to log addresses
Add the -d or --data option to log event addresses (eg page
faults).

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20090716104817.697698033@samba.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-18 11:21:32 +02:00
Anton Blanchard
11b5f81e1b perf_counter: Synthesize VDSO mmap event
perf record synthesizes mmap events for the running process.
Right now it just catches file mappings, but we can check for
the vdso symbol and add that too.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20090716104817.517264409@samba.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-18 11:21:30 +02:00
Ingo Molnar
f37a291c52 perf_counter tools: Add more warnings and fix/annotate them
Enable -Wextra. This found a few real bugs plus a number
of signed/unsigned type mismatches/uncleanlinesses. It
also required a few annotations

All things considered it was still worth it so lets try with
this enabled for now.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-01 12:49:48 +02:00
Frederic Weisbecker
3928ddbe99 perf record: Fix unhandled io return value
Building latest perfcounter fails on the following error:

 builtin-record.c: In function ‘create_counter’:
 builtin-record.c:451: erreur: ignoring return value of ‘read’, declared with attribute warn_unused_result
 make: *** [builtin-record.o] Erreur 1

Just check if we successfully read the perf file descriptor.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1245961287-5327-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-25 22:25:55 +02:00
Peter Zijlstra
649c48a9e7 perf-report: Add modes for inherited stats and no-samples
Now that we can collect per task statistics, add modes that
make use of that facility.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-25 21:39:08 +02:00
Peter Zijlstra
7c6a1c65bb perf_counter tools: Rework the file format
Create a structured file format that includes the full
perf_counter_attr and all its relevant counter IDs so that
the reporting program has full information.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-25 21:39:04 +02:00
Johannes Weiner
76c64c5e4c perf record: Fix filemap pathname parsing in /proc/pid/maps
Looking backward for the first space from the end of a line in
/proc/pid/maps does not find the start of the pathname of the mapped
file if it contains a space.

Since the only slashes we have in this file occur in the (absolute!)
pathname column of file mappings, looking for the first slash in a
line is a safe method to find the name.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Stefani Seibold <stefani@seibold.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20090624190835.GA25548@cmpxchg.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-25 11:35:58 +02:00
Frederic Weisbecker
eadc84cc01 perfcounter: Handle some IO return values
Building perfcounter tools raises the following warnings:

 builtin-record.c: In function ‘atexit_header’:
 builtin-record.c:464: erreur: ignoring return value of ‘pwrite’, declared with attribute warn_unused_result
 builtin-record.c: In function ‘__cmd_record’:
 builtin-record.c:503: erreur: ignoring return value of ‘read’, declared with attribute warn_unused_result

 builtin-report.c: In function ‘__cmd_report’:
 builtin-report.c:1403: erreur: ignoring return value of ‘read’, declared with attribute warn_unused_result

This patch handles these IO return values.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1245456100-5477-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-20 12:30:33 +02:00
Paul Mackerras
9cffa8d533 perf_counter tools: Define and use our own u64, s64 etc. definitions
On 64-bit powerpc, __u64 is defined to be unsigned long rather than
unsigned long long.  This causes compiler warnings every time we
print a __u64 value with %Lx.

Rather than changing __u64, we define our own u64 to be unsigned long
long on all architectures, and similarly s64 as signed long long.
For consistency we also define u32, s32, u16, s16, u8 and s8.  These
definitions are put in a new header, types.h, because these definitions
are needed in util/string.h and util/symbol.h.

The main change here is the mechanical change of __[us]{64,32,16,8}
to remove the "__".  The other changes are:

* Create types.h
* Include types.h in perf.h, util/string.h and util/symbol.h
* Add types.h to the LIB_H definition in Makefile
* Added (u64) casts in process_overflow_event() and print_sym_table()
  to kill two remaining warnings.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: benh@kernel.crashing.org
LKML-Reference: <19003.33494.495844.956580@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-19 18:25:47 +02:00
Peter Zijlstra
f5970550d5 perf_counter tools: Add a data file header
Add a data file header so we can transfer data between record and report.

LKML-Reference: <new-submission>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-19 13:42:36 +02:00
Peter Zijlstra
9d91a6f7a4 perf_counter tools: Handle lost events
Make use of the new ->data_tail mechanism to tell kernel-space
about user-space draining the data stream. Emit lost events
(and display them) if they happen.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-18 14:46:11 +02:00
Ingo Molnar
613d860229 perf record: Fix fast task-exit race
Recording with -a (or with -p) can race with tasks going away:

   couldn't open /proc/8440/maps

Causing an early exit() and no recording done.

Do not abort the recording session - instead just skip that task.

Also, only print the warnings under -v.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-15 09:08:31 +02:00
Ingo Molnar
3efa1cc99e perf record/report: Add call graph / call chain profiling
Add the first steps of call-graph profiling:

 - add the -c (--call-graph) option to perf record
 - parse the call-graph record and printout out under -D (--dump-trace)

The call-graph data is not put into the histogram yet, but it
can be seen that it's being processed correctly:

0x3ce0 [0x38]: event: 35
.
. ... raw event: size 56 bytes
.  0000:  23 00 00 00 05 00 38 00 d4 df 0e 81 ff ff ff ff  #.....8........
.  0010:  60 0b 00 00 60 0b 00 00 03 00 00 00 01 00 02 00  `...`..........
.  0020:  d4 df 0e 81 ff ff ff ff a0 61 ed 41 36 00 00 00  .........a.A6..
.  0030:  04 92 e6 41 36 00 00 00                          .a.A6..
.
0x3ce0 [0x38]: PERF_EVENT (IP, 5): 2912: 0xffffffff810edfd4 period: 1
... chain: u:2, k:1, nr:3
.....  0: 0xffffffff810edfd4
.....  1: 0x3641ed61a0
.....  2: 0x3641e69204
 ... thread: perf:2912
 ...... dso: [kernel]

This shows a 3-entry call-graph: with 1 kernel-space and two user-space
entries

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-14 20:34:06 +02:00
Peter Zijlstra
bbd36e5e6a perf record: Explicity program a default counter
Up until now record has worked on the assumption that type=0, config=0
was a suitable configuration - which it is. Lets make this a little more
explicit and more readable via the use of proper symbols.

[ Impact: cleanup ]

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-12 14:28:52 +02:00
Peter Zijlstra
f4dbfa8f31 perf_counter: Standardize event names
Pure renames only, to PERF_COUNT_HW_* and PERF_COUNT_SW_*.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-11 17:54:15 +02:00
Ingo Molnar
729ff5e2aa perf_counter tools: Clean up u64 usage
A build error slipped in:

 builtin-report.c: In function ‘hist_entry__fprintf’:
 builtin-report.c:711: error: format ‘%12d’ expects type ‘int’, but argument 3 has type ‘uint64_t’

Because we got a bit sloppy with those types. uint64_t really sucks,
because there's no printf format for it. So standardize on __u64
instead - for all types that go to or come from the ABI (which is __u64),
or for values that need to be large enough even on 32-bit.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-11 16:48:38 +02:00
Peter Zijlstra
ea1900e571 perf_counter tools: Normalize data using per sample period data
When we use variable period sampling, add the period to the sample
data and use that to normalize the samples.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-11 02:39:01 +02:00
Peter Zijlstra
f7b7c26e01 perf_counter tools: Propagate signals properly
Currently report and stat catch SIGINT (and others) without altering
their exit state. This means that things like:

   while :; do perf stat ./foo ; done

Loops become hard-to-interrupt, because bash never sees perf terminate
due to interruption. Fix this.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-10 16:55:27 +02:00
Peter Zijlstra
4502d77c1d perf_counter tools: Small frequency related fixes
Create the counter in a disabled state and only enable it after we
mmap() the buffer, this allows us to see the first few samples (and
observe the frequency ramp).

Furthermore, print the period in the verbose report.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-10 16:55:26 +02:00
Ingo Molnar
30c806a094 perf_counter tools: Handle kernels with !CONFIG_PERF_COUNTER
If perf is run on a !CONFIG_PERF_COUNTER kernel right now it
bails out with no messages or with confusing messages.

Standardize this case some more and explain the situation.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-07 17:46:24 +02:00
Ingo Molnar
3da297a60f perf record: Fall back to cpu-clock-ticks if no PMU
On architectures/CPUs without PMU support but with perfcounters
enabled 'perf record' currently fails because it cannot create a
cycle based hw-perfcounter.

Fall back to the cpu-clock-tick sw-perfcounter in this case, which
is hrtimer based and will always work (as long as perfcounters
are enabled).

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-07 17:39:02 +02:00
Ingo Molnar
864709302a perf_counter tools: Move from Documentation/perf_counter/ to tools/perf/
Several people have suggested that 'perf' has become a full-fledged
tool that should be moved out of Documentation/. Move it to the
(new) tools/ directory.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-06 20:33:43 +02:00