summaryrefslogtreecommitdiff
path: root/tools/perf/builtin-stat.c
AgeCommit message (Collapse)AuthorFilesLines
2016-03-10perf stat: Add --metric-only support for -AAndi Kleen1-8/+37
Add metric only support for -A too. This requires a new print function that prints the metrics in the right order. v2: Fix manpage v3: Simplify nrcpus computation Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/1457049458-28956-7-git-send-email-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-10perf stat: Implement --metric-only modeAndi Kleen1-10/+201
Add a new mode to only print metrics. Sometimes we don't care about the raw values, just want the computed metrics. This allows more compact printing, so with -I each sample is only a single line. This also allows easier plotting and processing with other tools. The main target is with using --topdown, but it also works with -T and standard perf stat. A few metrics are not supported. To avoiding having to hardcode all the metrics in the code it uses a two pass approach: first compute dummy metrics and only print the headers in the print_metric callback. Then use the callback to print the actual values. There are some additional changes in the stat printout code to handle all metrics being on a single line. One issue is that the column code doesn't know in advance what events are not supported by the CPU, and it would be hard to find out as this could change based on dynamic conditions. That causes empty columns in some cases. The output can be fairly wide, often you may need more than 80 columns. Example: % perf stat -a -I 1000 --metric-only 1.001452803 frontend cycles idle insn per cycle stalled cycles per insn branch-misses of all branches 1.001452803 158.91% 0.66 2.39 2.92% 2.002192321 180.63% 0.76 2.08 2.96% 3.003088282 150.59% 0.62 2.57 2.84% 4.004369835 196.20% 0.98 1.62 3.79% 5.005227314 231.98% 0.84 1.90 4.71% v2: Lots of updates. v3: Use slightly narrower columns v4: Add comment Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/1457049458-28956-6-git-send-email-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-03perf stat: Check for frontend stalled for metricsAndi Kleen1-0/+1
Add an extra check for frontend stalled in the metrics. This avoids an extra column for the --metric-only case when the CPU does not support frontend stalled. v2: Add separate init function Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/1456858672-21594-8-git-send-email-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-03perf stat: Support metrics in --per-core/socket modeAndi Kleen1-8/+56
Enable metrics printing in --per-core / --per-socket mode. We need to save the shadow metrics in a unique place. Always use the first CPU in the aggregation. Then use the same CPU to retrieve the shadow value later. Example output: % perf stat --per-core -a ./BC1s Performance counter stats for 'system wide': S0-C0 2 2966.020381 task-clock (msec) # 2.004 CPUs utilized (100.00%) S0-C0 2 49 context-switches # 0.017 K/sec (100.00%) S0-C0 2 4 cpu-migrations # 0.001 K/sec (100.00%) S0-C0 2 467 page-faults # 0.157 K/sec S0-C0 2 4,599,061,773 cycles # 1.551 GHz (100.00%) S0-C0 2 9,755,886,883 instructions # 2.12 insn per cycle (100.00%) S0-C0 2 1,906,272,125 branches # 642.704 M/sec (100.00%) S0-C0 2 81,180,867 branch-misses # 4.26% of all branches S0-C1 2 2965.995373 task-clock (msec) # 2.003 CPUs utilized (100.00%) S0-C1 2 62 context-switches # 0.021 K/sec (100.00%) S0-C1 2 8 cpu-migrations # 0.003 K/sec (100.00%) S0-C1 2 281 page-faults # 0.095 K/sec S0-C1 2 6,347,290 cycles # 0.002 GHz (100.00%) S0-C1 2 4,654,156 instructions # 0.73 insn per cycle (100.00%) S0-C1 2 947,121 branches # 0.319 M/sec (100.00%) S0-C1 2 37,322 branch-misses # 3.94% of all branches 1.480409747 seconds time elapsed v2: Rebase to older patches v3: Document shadow cpus. Fix aggr_get_id argument. Fix -A shadows (Jiri) Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/1456785386-19481-4-git-send-email-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-03perf stat: Implement CSV metrics outputAndi Kleen1-4/+69
Now support CSV output for metrics. With the new output callbacks this is relatively straight forward by creating new callbacks. This allows to easily plot metrics from CSV files. The new line callback needs to know the number of fields to skip them correctly Example output before: % perf stat -x, true 0.200687,,task-clock,200687,100.00 0,,context-switches,200687,100.00 0,,cpu-migrations,200687,100.00 40,,page-faults,200687,100.00 730871,,cycles,203601,100.00 551056,,stalled-cycles-frontend,203601,100.00 <not supported>,,stalled-cycles-backend,0,100.00 385523,,instructions,203601,100.00 78028,,branches,203601,100.00 3946,,branch-misses,203601,100.00 After: % perf stat -x, true .502457,,task-clock,502457,100.00,0.485,CPUs utilized 0,,context-switches,502457,100.00,0.000,K/sec 0,,cpu-migrations,502457,100.00,0.000,K/sec 45,,page-faults,502457,100.00,0.090,M/sec 644692,,cycles,509102,100.00,1.283,GHz 423470,,stalled-cycles-frontend,509102,100.00,65.69,frontend cycles idle <not supported>,,stalled-cycles-backend,0,100.00,,,, 492701,,instructions,509102,100.00,0.76,insn per cycle ,,,,,0.86,stalled cycles per insn 97767,,branches,509102,100.00,194.578,M/sec 4788,,branch-misses,509102,100.00,4.90,of all branches or easier readable $ perf stat -x, -o x.csv true $ column -s, -t x.csv 0.490635 task-clock 490635 100.00 0.489 CPUs utilized 0 context-switches 490635 100.00 0.000 K/sec 0 cpu-migrations 490635 100.00 0.000 K/sec 45 page-faults 490635 100.00 0.092 M/sec 629080 cycles 497698 100.00 1.282 GHz 409498 stalled-cycles-frontend 497698 100.00 65.09 frontend cycles idle <not supported> stalled-cycles-backend 0 100.00 491424 instructions 497698 100.00 0.78 insn per cycle 0.83 stalled cycles per insn 97278 branches 497698 100.00 198.270 M/sec 4569 branch-misses 497698 100.00 4.70 of all branches Two new fields are added: metric value and metric name. v2: Split out function argument changes v3: Reenable metrics for real. v4: Fix wrong hunk from refactoring. v5: Remove extra "noise" printing (Jiri), but add it to the not counted case. Print empty metrics for not counted. v6: Avoid outputting metric on empty format. v7: Print metric at the end v8: Remove extra run, ena fields v9: Avoid extra new line for unsupported counters Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: http://lkml.kernel.org/r/1456785386-19481-3-git-send-email-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-03perf stat: Check existence of frontend/backed stalled cyclesAndi Kleen1-2/+20
Only put the frontend/backend stalled cycles into the default perf stat events when the CPU actually supports them. This avoids empty columns with --metric-only on newer Intel CPUs. Committer note: Before: $ perf stat ls Performance counter stats for 'ls': 1.080893 task-clock (msec) # 0.619 CPUs utilized 0 context-switches # 0.000 K/sec 0 cpu-migrations # 0.000 K/sec 97 page-faults # 0.090 M/sec 3,327,741 cycles # 3.079 GHz <not supported> stalled-cycles-frontend <not supported> stalled-cycles-backend 1,609,544 instructions # 0.48 insn per cycle 319,117 branches # 295.235 M/sec 12,246 branch-misses # 3.84% of all branches 0.001746508 seconds time elapsed $ After: $ perf stat ls Performance counter stats for 'ls': 0.693948 task-clock (msec) # 0.662 CPUs utilized 0 context-switches # 0.000 K/sec 0 cpu-migrations # 0.000 K/sec 95 page-faults # 0.137 M/sec 1,792,509 cycles # 2.583 GHz 1,599,047 instructions # 0.89 insn per cycle 316,328 branches # 455.838 M/sec 12,453 branch-misses # 3.94% of all branches 0.001048987 seconds time elapsed $ Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1456532881-26621-2-git-send-email-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-19perf stat: Bail out on unsupported event config modifiersWang Nan1-0/+1
'perf stat' accepts some config terms but doesn't apply them. For example: # perf stat -e 'instructions/no-inherit/' -e 'instructions/inherit/' bash # ls # exit Performance counter stats for 'bash': 266258061 instructions/no-inherit/ 266258061 instructions/inherit/ 1.402183915 seconds time elapsed The result is confusing, because user may expect the first 'instructions' event exclude the 'ls' command. This patch forbid most of these config terms for 'perf stat'. Result: # ./perf stat -e 'instructions/no-inherit/' -e 'instructions/inherit/' bash event syntax error: 'instructions/no-inherit/' \___ 'no-inherit' is not usable in 'perf stat' ... We can add blocked config terms back when 'perf stat' really supports them. This patch also removes unavailable config term from error message: # ./perf stat -e 'instructions/badterm/' ls event syntax error: 'instructions/badterm/' \___ unknown term valid terms: config,config1,config2,name # ./perf stat -e 'cpu/badterm/' ls event syntax error: 'cpu/badterm/' \___ unknown term valid terms: pc,any,inv,edge,cmask,event,in_tx,ldlat,umask,in_tx_cp,offcore_rsp,config,config1,config2,name Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Cody P Schafer <dev@codyps.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kirill Smelkov <kirr@nexedi.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1455882283-79592-11-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-19perf stat: Handled scaled == -1 case for countersAndi Kleen1-1/+1
Arnaldo pointed out that the earlier cb110f471025 ("perf stat: Move noise/running printing into printout") change changed behavior for not counted counters. This patch fixes it again. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Stephane Eranian <eranian@google.com> Fixes: cb110f471025 ("perf stat: Move noise/running printing into printout") Link: http://lkml.kernel.org/r/1455749045-18098-2-git-send-email-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-16perf stat: Move noise/running printing into printoutAndi Kleen1-91/+32
Move the running/noise printing into printout to avoid duplicated code in the callers. v2: Merged with other patches. Remove unnecessary hunk. Readd hunk that ended in earlier patch. v3: Fix noise/running output in CSV mode v4: Merge with later patch that also moves not supported printing. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1454173616-17710-4-git-send-email-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-16perf stat: Add support for metrics in interval modeAndi Kleen1-7/+13
Now that we can modify the metrics printout functions easily, it's straight forward to support metric printing for interval mode. All that is needed is to print the time stamp on every new line. Pass the prefix into the context and print it out. v2: Move wrong hunk to here. Committer note: Before: [root@jouet ~]# perf stat -I 1000 -e instructions,cycles sleep 1 # time counts unit events 1.000168216 538,913 instructions 1.000168216 748,765 cycles 1.000660048 153,741 instructions 1.000660048 214,066 cycles After: # perf stat -I 1000 -e instructions,cycles sleep 1 # time counts unit events 1.000215928 519,620 instructions # 0.69 insn per cycle 1.000215928 752,003 cycles 1.000946033 148,502 instructions # 0.33 insn per cycle 1.000946033 160,104 cycles Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1454173616-17710-3-git-send-email-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-16perf stat: Abstract stat metrics printingAndi Kleen1-6/+65
Abstract the printing of shadow metrics. Instead of every metric calling fprintf directly and taking care of indentation, use two call backs: one to print metrics and another to start a new line. This will allow adding metrics to CSV mode and also using them for other purposes. The computation of padding is now done in the central callback, instead of every metric doing it manually. This makes it easier to add new metrics. v2: Refactor functions, printout now does more. Move shadow printing. Improve fallback callbacks. Don't use void * callback data. v3: Remove unnecessary hunk. Add typedef for new_line v4: Remove unnecessary hunk. Don't print metrics for CSV/interval mode yet. Move printout change to separate patch. v5: Fix bisect bugs. Avoid bogus frontend cycles printing. Fix indentation in different aggregation modes. v6: Delay newline handling Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1454173616-17710-2-git-send-email-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-12perf stat: Fix recort_usage typoJiri Olsa1-4/+4
Markus reported gcc 6 complains when compiling perf stat command: builtin-stat.c:1591:27: error: ‘recort_usage’ defined but not used [-Werror=unused-const-variable] static const char * const recort_usage[] = { ^~~~~~~~~~~~ I fixed the typo and realized we already export record_usage, so I also prefixed it with stat (and included report_usage). Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1452591329-27620-1-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-06perf stat record: Keep sample_type 0 for pipe sessionJiri Olsa1-1/+8
For pipe sessions we need to keep sample_type zero, because script's perf_evsel__check_attr is triggered by sample_type != 0, and the check would fail on stat session. I was tempted to keep it zero unconditionally, but the pipe session is sufficient. In perf.data session we are guarded by HEADER_STAT feature. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1452028152-26762-2-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-17perf stat report: Allow to override aggr_modeJiri Olsa1-0/+17
Allowing to override record aggr_mode. It's possible to use perf stat like: $ perf stat report -A $ perf stat report --per-core $ perf stat report --per-socket To customize the recorded aggregate mode regardless what was used during the stat record command. Reported-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1446734469-11352-19-git-send-email-jolsa@kernel.org [ Renamed 'stat' parameter to 'st' to fix 'already defined' build error with older distros (e.g. RHEL6.7) ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-17perf stat report: Process event update eventsJiri Olsa1-0/+1
Adding processing of event update events, so perf stat report can store additional info for events - unit,scale,name. Committer note: Before: # perf stat record -e power/energy-cores/ -a ^C Performance counter stats for 'system wide': 77.41 Joules power/energy-cores/ 1.597176695 seconds time elapsed # perf stat report Performance counter stats for '/home/acme/bin/perf stat record -e power/energy-cores/ -a': 332,488,114,176 power/energy-cores/ 1.597176695 seconds time elapsed # After, using the same perf.data file generated in the "Before" case above: # perf stat report Performance counter stats for '/home/acme/bin/perf stat record -e power/energy-cores/ -a': 77.41 Joules power/energy-cores/ 1.597176695 seconds time elapsed # Reported-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1446734469-11352-17-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-17perf stat report: Process stat and stat round eventsJiri Olsa1-0/+28
Adding processing of stat and stat round events. The stat data com in stat events, using generic function process_stat_round_event to store data under perf_evsel object. The stat-round events comes each interval or as last event in non interval mode. The function process_stat_round_event process stored data for each perf_evsel object and print it out. Committer note: After this patch: $ perf stat record usleep 1 Performance counter stats for 'usleep 1': 0.498381 task-clock (msec) # 0.571 CPUs utilized 2 context-switches # 0.004 M/sec 0 cpu-migrations # 0.000 K/sec 149 page-faults # 0.299 M/sec 1,271,635 cycles # 2.552 GHz 928,712 stalled-cycles-frontend # 73.03% frontend cycles idle 663,286 stalled-cycles-backend # 52.16% backend cycles idle 792,614 instructions # 0.62 insns per cycle # 1.17 stalled cycles per insn 136,850 branches # 274.589 M/sec <not counted> branch-misses (0.00%) 0.000873419 seconds time elapsed $ $ perf stat report Performance counter stats for '/home/acme/bin/perf stat record usleep 1': 0.498381 task-clock (msec) # 0.571 CPUs utilized 2 context-switches # 0.004 M/sec 0 cpu-migrations # 0.000 K/sec 149 page-faults # 0.299 M/sec 1,271,635 cycles # 2.552 GHz 928,712 stalled-cycles-frontend # 73.03% frontend cycles idle 663,286 stalled-cycles-backend # 52.16% backend cycles idle 792,614 instructions # 0.62 insns per cycle # 1.17 stalled cycles per insn 136,850 branches # 274.589 M/sec <not counted> branch-misses (0.00%) 0.000873419 seconds time elapsed $ Reported-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1446734469-11352-16-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-17perf stat report: Move csv_sep initialization before report commandJiri Olsa1-7/+7
So we have csv_sep properly initialized before report command leg. Reported-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1446734469-11352-18-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-17perf stat report: Add support to initialize aggr_map from fileJiri Olsa1-0/+103
Using perf.data's perf_env data to initialize aggregate config. Reported-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1446734469-11352-15-git-send-email-jolsa@kernel.org [ s/stat/st/g, s/socket/socket_id/g to fix 'already defined' build error with older distros (e.g. RHEL6.7) ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-17perf stat report: Process stat config eventJiri Olsa1-0/+10
Adding processing of stat config event and initialize stat_config object. Reported-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1446734469-11352-14-git-send-email-jolsa@kernel.org [ Renamed 'stat' parameter to 'st' to fix 'already defined' build error with older distros (e.g. RHEL6.7) ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-17perf stat report: Process cpu/threads mapsJiri Olsa1-2/+64
Adding processing of cpu/threads maps. Configuring session's evlist with these maps. Reported-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1446734469-11352-13-git-send-email-jolsa@kernel.org [ s/stat/st/g, s/time/tm/g parameters to fix 'already defined' build error with older distros (e.g. RHEL6.7) ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-17perf stat report: Add report commandJiri Olsa1-4/+57
Adding 'perf stat report' command support. ATM it only processes attr events and display nothing. Reported-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1446734469-11352-12-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-17perf stat record: Synthesize event update eventsJiri Olsa1-0/+59
Synthesize other events stuff not carried within attr event - unit, scale, name. Reported-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1446734469-11352-11-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-17perf stat record: Do not allow record with multiple runs modeJiri Olsa1-0/+5
We currently don't support storing multiple session in perf.data, so we can't allow -r option in stat record. $ perf stat -e cycles -r 2 record ls Cannot use -r option with perf stat record. Committer note: Before this patch we would a perf.data file such as: $ perf stat -e cycles -r 2 record ls <SNIP> Performance counter stats for 'ls' (2 runs): 3,935,236 cycles 0.002353261 seconds time elapsed ( +- 4.76% ) $ perf report -D | grep PERF_RECORD | grep ROUND 0xf0 [0]: failed to process type: 16 Error: failed to process sample $ Reported-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1446734469-11352-10-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-17perf stat record: Write stat round events on recordJiri Olsa1-0/+20
Writing stat round events on 'perf stat record' for each interval round. In non interval mode we store round event after the last stat event. Committer note: After the patch: $ perf report -D | grep PERF_RECORD | grep ROUND 0x852 [0x18]: PERF_RECORD_STAT_ROUND $ Reported-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1446734469-11352-9-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-17perf stat record: Write stat events on recordJiri Olsa1-0/+19
Writing stat events on 'perf stat record' at the time we read counter values from kernel. Committer note: After the patch: $ perf stat record usleep 1 Performance counter stats for 'usleep 1': 0.598006 task-clock (msec) # 0.484 CPUs utilized 1 context-switches # 0.002 M/sec 0 cpu-migrations # 0.000 K/sec 52 page-faults # 0.087 M/sec 882,744 cycles # 1.476 GHz 581,416 stalled-cycles-frontend # 65.86% frontend cycles idle <not supported> stalled-cycles-backend 636,479 instructions # 0.72 insns per cycle # 0.91 stalled cycles per insn 129,334 branches # 216.275 M/sec 7,512 branch-misses # 5.81% of all branches 0.001235157 seconds time elapsed $ oldperf evlist task-clock context-switches cpu-migrations page-faults cycles stalled-cycles-frontend stalled-cycles-backend instructions branches branch-misses $ oldperf report --stdio Error: The perf.data file has no samples! # To display the perf.data header info, please use --header/--header-only options. # $ perf report -D | grep PERF_RECORD 0x5b0 [0x28]: PERF_RECORD_THREAD_MAP nr: 1 thread: 5504 0x5d8 [0x12]: PERF_RECORD_CPU_MAP nr: 1 cpu: 65535 0x5ea [0x40]: PERF_RECORD_STAT_CONFIG 0x62a [0x30]: PERF_RECORD_STAT 0x65a [0x30]: PERF_RECORD_STAT 0x68a [0x30]: PERF_RECORD_STAT 0x6ba [0x30]: PERF_RECORD_STAT 0x6ea [0x30]: PERF_RECORD_STAT 0x71a [0x30]: PERF_RECORD_STAT 0x74a [0x30]: PERF_RECORD_STAT 0x77a [0x30]: PERF_RECORD_STAT 0x7aa [0x30]: PERF_RECORD_STAT -1 -1 0x7da [0x40]: PERF_RECORD_MMAP -1/0: [0xffffffff81000000(0x1f000000) @ 0xffffffff81000000]: x [kernel.kallsyms]_text $ Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Kan Liang <kan.liang@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1446734469-11352-8-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-17perf stat record: Add pipe support for record commandJiri Olsa1-11/+28
Allowing storing stat record data into pipe, so report tools (report/script) could read data directly from record. Committer note: Before this patch: $ perf stat record -o - usleep 1 | perf report -i - incompatible file format (rerun with -v to learn more) $ perf stat record -o - usleep 1 | perf script -i - incompatible file format (rerun with -v to learn more) $ ls -la perf.data ls: cannot access perf.data: No such file or directory $ After: $ perf stat record -o - usleep 1 | perf report -i - # To display the perf.data header info, please use # --header/--header-only options. # Error: The - file has no samples! $ perf stat record -o - usleep 1 | perf script -i - Display of symbols requested but neither sample IP nor sample address is selected. Hence, no addresses to convert to symbols. 0 [0x80]: failed to process type: 64 $ ls -la perf.data ls: cannot access perf.data: No such file or directory $ Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Kan Liang <kan.liang@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1446734469-11352-7-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-17perf stat record: Store events IDs in perf data fileJiri Olsa1-0/+35
Store event IDs in evlist object so it get stored into perf.data file. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Kan Liang <kan.liang@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1446734469-11352-6-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-17perf stat record: Synthesize stat record dataJiri Olsa1-13/+43
Synthesizing needed stat record data for report/script: - cpu/thread maps - stat config Committer note: New records generated on a perf.data file with this patch: $ perf report -D | grep PERF_RECORD_ 0x568 [0x28]: PERF_RECORD_THREAD_MAP nr: 1 thread: 29097 0x590 [0x12]: PERF_RECORD_CPU_MAP nr: 1 cpu: 65535 0x5a2 [0x40]: PERF_RECORD_STAT_CONFIG $ Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Kan Liang <kan.liang@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1446734469-11352-5-git-send-email-jolsa@kernel.org [ Adjusted wrt kernel PERF_RECORD_MMAP added when introducing 'perf stat record' ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-17perf stat record: Initialize record featuresJiri Olsa1-0/+15
Disabling all non stat related features. Also as we now enable STAT feature in the data file, adding code to instruct session open to skip sample type checking for stat data files. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Kan Liang <kan.liang@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1446734469-11352-4-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-17perf stat record: Add record commandJiri Olsa1-4/+116
Add 'perf stat record' command support. It creates simple (header only) perf.data file ATM. The record command could be specified anywhere among stat options. All stat command options are valid for stat record command with '-o' option exception. If specified for record command it denotes the perf data file name. Committer note: Set sample_type to PERF_SAMPLE_IDENTIFIER, which should be harmless while avoiding that older tools show confusing messages, for instance, with sample_type = 0, we get: $ perf stat record usleep 1 Performance counter stats for 'usleep 1': 0.630237 task-clock (msec) # 0.528 CPUs utilized 1 context-switches # 0.002 M/sec 0 cpu-migrations # 0.000 K/sec 52 page-faults # 0.083 M/sec 978,312 cycles # 1.552 GHz 671,931 stalled-cycles-frontend # 68.68% frontend cycles idle <not supported> stalled-cycles-backend 646,379 instructions # 0.66 insns per cycle # 1.04 stalled cycles per insn 131,046 branches # 207.931 M/sec 7,073 branch-misses # 5.40% of all branches 0.001193240 seconds time elapsed $ oldperf evlist WARNING: The perf.data file's data size field is 0 which is unexpected. Was the 'perf record' command properly terminated? non matching sample_type $ While with sample_type set to PERF_SAMPLE_IDENTIFIER, after we re-run 'perf stat record usleep' we get: $ oldperf evlist WARNING: The perf.data file's data size field is 0 which is unexpected. Was the 'perf record' command properly terminated? task-clock context-switches cpu-migrations page-faults cycles stalled-cycles-frontend stalled-cycles-backend instructions branches branch-misses $ Which at least shows the names of the events in the perf.data file. Additionally, such files, when passed to 'perf report' will produce: $ oldperf report --stdio WARNING: The perf.data file's data size field is 0 which is unexpected. Was the 'perf record' command properly terminated? Warning: Kernel address maps (/proc/{kallsyms,modules}) were restricted. Check /proc/sys/kernel/kptr_restrict before running 'perf record'. As no suitable kallsyms nor vmlinux was found, kernel samples can't be resolved. Samples in kernel modules can't be resolved as well. Error: The perf.data file has no samples! # To display the perf.data header info, please use --header/--header-only options. # $ Which is confusing and can be solved by just adding the kernel mmap record, which will also remove that warning about the data size field being equal to zero, after generating the mmap record: $ perf stat record usleep 1 Performance counter stats for 'usleep 1': 0.600796 task-clock (msec) # 0.478 CPUs utilized 1 context-switches # 0.002 M/sec 0 cpu-migrations # 0.000 K/sec 54 page-faults # 0.090 M/sec 886,844 cycles # 1.476 GHz 582,169 stalled-cycles-frontend # 65.65% frontend cycles idle <not supported> stalled-cycles-backend 638,344 instructions # 0.72 insns per cycle # 0.91 stalled cycles per insn 130,204 branches # 216.719 M/sec 7,500 branch-misses # 5.76% of all branches 0.001255897 seconds time elapsed $ oldperf evlist task-clock context-switches cpu-migrations page-faults cycles stalled-cycles-frontend stalled-cycles-backend instructions branches branch-misses $ oldperf report --stdio Error: The perf.data file has no samples! # To display the perf.data header info, please use --header/--header-only options. # [acme@zoo linux]$ No warnings, sensible output about what are the events in the perf.data file and also a "file has no samples" message, which indeed it doesn't. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Kan Liang <kan.liang@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: htp://lkml.kernel.org/r/1446734469-11352-3-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-17perf subcmd: Create subcmd libraryJosh Poimboeuf1-1/+1
Move the subcommand-related files from perf to a new library named libsubcmd.a. Since we're moving files anyway, go ahead and rename 'exec_cmd.*' to 'exec-cmd.*' to be consistent with the naming of all the other files. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/c0a838d4c878ab17fee50998811612b2281355c1.1450193761.git.jpoimboe@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-09perf stat: Fix cmd_stat to release cpu_mapMasami Hiramatsu1-0/+9
Fix cmd_stat() to release cpu_map objects (aggr_map and cpus_aggr_map) afterwards. refcnt debugger shows that the cmd_stat initializes cpu_map but not puts it. ---- # ./perf stat -v ls .... REFCNT: BUG: Unreclaimed objects found. ==== [0] ==== Unreclaimed cpu_map@0x29339c0 Refcount +1 => 1 at ./perf(cpu_map__empty_new+0x6d) [0x4e64bd] ./perf(cmd_stat+0x5fe) [0x43594e] ./perf() [0x47b785] ./perf(main+0x617) [0x422587] /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f2dff420af5] ./perf() [0x4226fd] REFCNT: Total 1 objects are not reclaimed. "cpu_map" leaks 1 objects ---- Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20151209021127.10245.93697.stgit@localhost.localdomain [ Remove NULL checks before calling the put operation, it checks it already ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-07perf stat: Move enable_on_exec setup under earlier codeJiri Olsa1-6/+9
It's more readable this way and we can save one perf_evsel__is_group_leader condition in current code. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1449133606-14429-7-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-07perf stat: Create events as disabledJiri Olsa1-6/+19
Currently we have 2 kinds of stat counters based on when the event is enabled: 1) tracee command events, which are enable once the tracee executes exec syscall (enable_on_exec bit) 2) all other events which get alive within the perf_event_open syscall And 2) case could raise a problem in case we want additional filter to be attached for event. In this case we want the event to be enabled after it's configured with filter. Changing the behaviour of 2) events, so they all are created as disabled (disabled bit). Adding extra enable call to make them alive once they finish setup. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1449133606-14429-6-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-07perf stat: Use perf_evlist__enable in handle_initial_delayJiri Olsa1-4/+1
No need to mimic the behaviour of perf_evlist__enable, we can use it directly. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1449133606-14429-5-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-07perf evsel: Use event maps directly in perf_evsel__enableJiri Olsa1-4/+1
All events now share proper cpu and thread maps. There's no need to pass those maps from evlist, it's safe to use evsel maps for enabling event. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1449133606-14429-2-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-11-26perf stat: Clear sample_(type|period) for countingJiri Olsa1-0/+7
Clear sample_(type|period) for counting, as it only confuses debug output with unwanted sampling details: Before: $ sudo perf stat -e 'raw_syscalls:sys_enter' -vv ls ------------------------------------------------------------ perf_event_attr: type 2 size 112 config 0x11 { sample_period, sample_freq } 1 sample_type TIME|CPU|PERIOD|RAW read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 enable_on_exec 1 exclude_guest 1 ... After: $ sudo perf stat -e 'raw_syscalls:sys_enter' -vv ls ------------------------------------------------------------ perf_event_attr: type 2 size 112 config 0x11 read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 enable_on_exec 1 exclude_guest 1 ... Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1448465815-27404-1-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-11-05perf stat: Make stat options globalJiri Olsa1-81/+82
So they can be used in perf stat record command in following patch. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Kan Liang <kan.liang@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1446734469-11352-2-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-11-04perf stat: Use common printout function to avoid duplicated codeAndi Kleen1-37/+20
Instead of every caller deciding whether to call abs or nsec printout do it all in a single central function. No functional changes. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/1446515428-7450-3-git-send-email-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-11-04perf stat: Move sw clock metrics printout to stat-shadowAndi Kleen1-5/+3
The sw clock metrics printing was missed in the earlier move to stat-shadow of all the other metric printouts. Move it too. v2: Fix metrics printing in this version to make bisect safe. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/1446515428-7450-2-git-send-email-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-27perf stat: Cache aggregated map entries in extra cpumapJiri Olsa1-4/+55
Currently any time we need to access socket or core id for given cpu, we access the sysfs topology file. Adding a cpus_aggr_map cpu_map to cache those entries. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Kan Liang <kan.liang@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1445784728-21732-3-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-19perf cpu_map: Add data arg to cpu_map__build_map callbackJiri Olsa1-2/+12
Adding data arg to cpu_map__build_map callback, so we could pass data along to the callback. It'll be needed in following patches to retrieve topology info from perf.data. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Kan Liang <kan.liang@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1444992092-17897-41-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-19perf stat: Add AGGR_UNSET modeJiri Olsa1-0/+5
Adding AGGR_UNSET mode, so we could distinguish unset aggr_mode in following patches. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Kan Liang <kan.liang@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1444992092-17897-30-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-19perf stat: Rename perf_stat struct into perf_stat_evselJiri Olsa1-2/+2
It's used as the perf_evsel::priv data, so the name suits better. Also we'll need the perf_stat name free for more generic struct. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Kan Liang <kan.liang@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1444992092-17897-29-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-02perf stat: Reduce min --interval-print to 10msKan Liang1-4/+9
The --interval-print parameter was limited to 100ms. However, for example, 10ms is required to do sophisticated bandwidth analysis using uncore events. The test shows that the overhead of the system-wide uncore monitoring with 10ms interval is only ~2%. So this patch reduces the minimal interval-print allowd to 10ms. But 10ms may not work well for all cases. For example, when the cpus/threads number is very large, for system-wide core event monitoring the overhead could be high. To handle this issue, a warning will be displayed when the interval-print is set between 10ms to 100ms. So users can make a decision according to their specific cases. # perf stat -e uncore_imc_1/cas_count_read/ -a --interval-print 10 -- sleep 1 print interval < 100ms. The overhead percentage could be high in some cases. Please proceed with caution. # time counts unit events 0.010200451 0.10 MiB uncore_imc_1/cas_count_read/ 0.020475117 0.02 MiB uncore_imc_1/cas_count_read/ 0.030692800 0.01 MiB uncore_imc_1/cas_count_read/ 0.040948161 0.02 MiB uncore_imc_1/cas_count_read/ 0.051159564 0.00 MiB uncore_imc_1/cas_count_read/ Signed-off-by: Kan Liang <kan.liang@intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1443776674-42511-1-git-send-email-kan.liang@intel.com [ Added warning about overhead when using sub 100ms intervals to the man page ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-02perf stat: Quieten failed to read counter messageAndi Kleen1-1/+1
Since 3b3eb0445 running perf stat on a system without backend-stalled-cycles spits out ugly warnings by default. Since that is quite common, make the message a debug message only. We know anyways that the counter wasn't read by the normal <unsupported> output. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/1441147966-14917-1-git-send-email-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-31Merge branch 'perf/urgent' into perf/core, to pick up fixesIngo Molnar1-3/+2
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-28perf stat: Get correct cpu id for print_aggrKan Liang1-3/+2
print_aggr() fails to print per-core/per-socket statistics after commit 582ec0829b3d ("perf stat: Fix per-socket output bug for uncore events") if events have differnt cpus. Because in print_aggr(), aggr_get_id needs index (not cpu id) to find core/pkg id. Also, evsel cpu maps should be used to get aggregated id. Here is an example: Counting events cycles,uncore_imc_0/cas_count_read/. (Uncore event has cpumask 0,18) $ perf stat -e cycles,uncore_imc_0/cas_count_read/ -C0,18 --per-core sleep 2 Without this patch, it failes to get CPU 18 result. Performance counter stats for 'CPU(s) 0,18': S0-C0 1 7526851 cycles S0-C0 1 1.05 MiB uncore_imc_0/cas_count_read/ S1-C0 0 <not counted> cycles S1-C0 0 <not counted> MiB uncore_imc_0/cas_count_read/ With this patch, it can get both CPU0 and CPU18 result. Performance counter stats for 'CPU(s) 0,18': S0-C0 1 6327768 cycles S0-C0 1 0.47 MiB uncore_imc_0/cas_count_read/ S1-C0 1 330228 cycles S1-C0 1 0.29 MiB uncore_imc_0/cas_count_read/ Signed-off-by: Kan Liang <kan.liang@intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Stephane Eranian <eranian@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Fixes: 582ec0829b3d ("perf stat: Fix per-socket output bug for uncore events") Link: http://lkml.kernel.org/r/1435820925-51091-1-git-send-email-kan.liang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-08perf stat: Move perf_counts struct and functions into separate objectJiri Olsa1-0/+1
Moving 'struct perf_counts' and associated functions into separate object, so we could remove stat.c object dependency from python build. It makes the python code to build properly, because it fails to load due to missing stat-shadow.c object dependency if some patches from Kan Liang are applied. So apply this one, then Kan's. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20150807105103.GB8624@krava.brq.redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-06perf stat: Move counter processing code into stat objectJiri Olsa1-140/+1
Moving counter processing code into stat object as perf_stat__process_counter. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1437481927-29538-8-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>