summaryrefslogtreecommitdiff
path: root/tools/perf
AgeCommit message (Collapse)AuthorFilesLines
2016-05-20Merge tag 'powerpc-4.7-1' of ↵Linus Torvalds6-4/+226
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "Highlights: - Support for Power ISA 3.0 (Power9) Radix Tree MMU from Aneesh Kumar K.V - Live patching support for ppc64le (also merged via livepatching.git) Various cleanups & minor fixes from: - Aaro Koskinen, Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Chris Smart, Daniel Axtens, Frederic Barrat, Gavin Shan, Ian Munsie, Lennart Sorensen, Madhavan Srinivasan, Mahesh Salgaonkar, Markus Elfring, Michael Ellerman, Oliver O'Halloran, Paul Gortmaker, Paul Mackerras, Rashmica Gupta, Russell Currey, Suraj Jitindar Singh, Thiago Jung Bauermann, Valentin Rothberg, Vipin K Parashar. General: - Update LMB associativity index during DLPAR add/remove from Nathan Fontenot - Fix branching to OOL handlers in relocatable kernel from Hari Bathini - Add support for userspace Power9 copy/paste from Chris Smart - Always use STRICT_MM_TYPECHECKS from Michael Ellerman - Add mask of possible MMU features from Michael Ellerman PCI: - Enable pass through of NVLink to guests from Alexey Kardashevskiy - Cleanups in preparation for powernv PCI hotplug from Gavin Shan - Don't report error in eeh_pe_reset_and_recover() from Gavin Shan - Restore initial state in eeh_pe_reset_and_recover() from Gavin Shan - Revert "powerpc/eeh: Fix crash in eeh_add_device_early() on Cell" from Guilherme G Piccoli - Remove the dependency on EEH struct in DDW mechanism from Guilherme G Piccoli selftests: - Test cp_abort during context switch from Chris Smart - Add several tests for transactional memory support from Rashmica Gupta perf: - Add support for sampling interrupt register state from Anju T - Add support for unwinding perf-stackdump from Chandan Kumar cxl: - Configure the PSL for two CAPI ports on POWER8NVL from Philippe Bergheaud - Allow initialization on timebase sync failures from Frederic Barrat - Increase timeout for detection of AFU mmio hang from Frederic Barrat - Handle num_of_processes larger than can fit in the SPA from Ian Munsie - Ensure PSL interrupt is configured for contexts with no AFU IRQs from Ian Munsie - Add kernel API to allow a context to operate with relocate disabled from Ian Munsie - Check periodically the coherent platform function's state from Christophe Lombard Freescale: - Updates from Scott: "Contains 86xx fixes, minor device tree fixes, an erratum workaround, and a kconfig dependency fix." * tag 'powerpc-4.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (192 commits) powerpc/86xx: Fix PCI interrupt map definition powerpc/86xx: Move pci1 definition to the include file powerpc/fsl: Fix build of the dtb embedded kernel images powerpc/fsl: Fix rcpm compatible string powerpc/fsl: Remove FSL_SOC dependency from FSL_LBC powerpc/fsl-pci: Add a workaround for PCI 5 errata powerpc/fsl: Fix SPI compatible on t208xrdb and t1040rdb powerpc/powernv/npu: Add PE to PHB's list powerpc/powernv: Fix insufficient memory allocation powerpc/iommu: Remove the dependency on EEH struct in DDW mechanism Revert "powerpc/eeh: Fix crash in eeh_add_device_early() on Cell" powerpc/eeh: Drop unnecessary label in eeh_pe_change_owner() powerpc/eeh: Ignore handlers in eeh_pe_reset_and_recover() powerpc/eeh: Restore initial state in eeh_pe_reset_and_recover() powerpc/eeh: Don't report error in eeh_pe_reset_and_recover() Revert "powerpc/powernv: Exclude root bus in pnv_pci_reset_secondary_bus()" powerpc/powernv/npu: Enable NVLink pass through powerpc/powernv/npu: Rework TCE Kill handling powerpc/powernv/npu: Add set/unset window helpers powerpc/powernv/ioda2: Export debug helper pe_level_printk() ...
2016-05-17Merge branch 'for-linus' of ↵Linus Torvalds5-5/+5
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial tree updates from Jiri Kosina. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (21 commits) gitignore: fix wording mfd: ab8500-debugfs: fix "between" in printk memstick: trivial fix of spelling mistake on management cpupowerutils: bench: fix "average" treewide: Fix typos in printk IB/mlx4: printk fix pinctrl: sirf/atlas7: fix printk spelling serial: mctrl_gpio: Grammar s/lines GPIOs/line GPIOs/, /sets/set/ w1: comment spelling s/minmum/minimum/ Blackfin: comment spelling s/divsor/divisor/ metag: Fix misspellings in comments. ia64: Fix misspellings in comments. hexagon: Fix misspellings in comments. tools/perf: Fix misspellings in comments. cris: Fix misspellings in comments. c6x: Fix misspellings in comments. blackfin: Fix misspelling of 'register' in comment. avr32: Fix misspelling of 'definitions' in comment. treewide: Fix typos in printk Doc: treewide : Fix typos in DocBook/filesystem.xml ...
2016-05-16Merge branch 'perf-core-for-linus' of ↵Linus Torvalds164-2010/+5696
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar: "Bigger kernel side changes: - Add backwards writing capability to the perf ring-buffer code, which is preparation for future advanced features like robust 'overwrite support' and snapshot mode. (Wang Nan) - Add pause and resume ioctls for the perf ringbuffer (Wang Nan) - x86 Intel cstate code cleanups and reorgnization (Thomas Gleixner) - x86 Intel uncore and CPU PMU driver updates (Kan Liang, Peter Zijlstra) - x86 AUX (Intel PT) related enhancements and updates (Alexander Shishkin) - x86 MSR PMU driver enhancements and updates (Huang Rui) - ... and lots of other changes spread out over 40+ commits. Biggest tooling side changes: - 'perf trace' features and enhancements. (Arnaldo Carvalho de Melo) - BPF tooling updates (Wang Nan) - 'perf sched' updates (Jiri Olsa) - 'perf probe' updates (Masami Hiramatsu) - ... plus 200+ other enhancements, fixes and cleanups to tools/ The merge commits, the shortlog and the changelogs contain a lot more details" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (249 commits) perf/core: Disable the event on a truncated AUX record perf/x86/intel/pt: Generate PMI in the STOP region as well perf buildid-cache: Use lsdir() for looking up buildid caches perf symbols: Use lsdir() for the search in kcore cache directory perf tools: Use SBUILD_ID_SIZE where applicable perf tools: Fix lsdir to set errno correctly perf trace: Move seccomp args beautifiers to tools/perf/trace/beauty/ perf trace: Move flock op beautifier to tools/perf/trace/beauty/ perf build: Add build-test for debug-frame on arm/arm64 perf build: Add build-test for libunwind cross-platforms support perf script: Fix export of callchains with recursion in db-export perf script: Fix callchain addresses in db-export perf script: Fix symbol insertion behavior in db-export perf symbols: Add dso__insert_symbol function perf scripting python: Use Py_FatalError instead of die() perf tools: Remove xrealloc and ALLOC_GROW perf help: Do not use ALLOC_GROW in add_cmd_list perf pmu: Make pmu_formats_string to check return value of strbuf perf header: Make topology checkers to check return value of strbuf perf tools: Make alias handler to check return value of strbuf ...
2016-05-12perf stat: Fallback to user only counters when perf_event_paranoid > 1Arnaldo Carvalho de Melo1-1/+6
After 0161028b7c8a ("perf/core: Change the default paranoia level to 2") 'perf stat' fails for users without CAP_SYS_ADMIN, so just use 'perf_evsel__fallback()' to have the same behaviour as 'perf record', i.e. set perf_event_attr.exclude_kernel to 1. Now: [acme@jouet linux]$ perf stat usleep 1 Performance counter stats for 'usleep 1': 0.352536 task-clock:u (msec) # 0.423 CPUs utilized 0 context-switches:u # 0.000 K/sec 0 cpu-migrations:u # 0.000 K/sec 49 page-faults:u # 0.139 M/sec 309,407 cycles:u # 0.878 GHz 243,791 instructions:u # 0.79 insn per cycle 49,622 branches:u # 140.757 M/sec 3,884 branch-misses:u # 7.83% of all branches 0.000834174 seconds time elapsed [acme@jouet linux]$ Reported-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-b20jmx4dxt5hpaa9t2rroi0o@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-12perf evsel: Handle EACCESS + perf_event_paranoid=2 in fallback()Arnaldo Carvalho de Melo1-0/+18
Now with the default for the kernel.perf_event_paranoid sysctl being 2 [1] we need to fall back to :u, i.e. to set perf_event_attr.exclude_kernel to 1. Before: [acme@jouet linux]$ perf record usleep 1 Error: You may not have permission to collect stats. Consider tweaking /proc/sys/kernel/perf_event_paranoid, which controls use of the performance events system by unprivileged users (without CAP_SYS_ADMIN). The current value is 2: -1: Allow use of (almost) all events by all users >= 0: Disallow raw tracepoint access by users without CAP_IOC_LOCK >= 1: Disallow CPU event access by users without CAP_SYS_ADMIN >= 2: Disallow kernel profiling by users without CAP_SYS_ADMIN [acme@jouet linux]$ After: [acme@jouet linux]$ perf record usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.016 MB perf.data (7 samples) ] [acme@jouet linux]$ perf evlist cycles:u [acme@jouet linux]$ perf evlist -v cycles:u: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, exclude_kernel: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 [acme@jouet linux]$ And if the user turns on verbose mode, an explanation will appear: [acme@jouet linux]$ perf record -v usleep 1 Warning: kernel.perf_event_paranoid=2, trying to fall back to excluding kernel samples mmap size 528384B [ perf record: Woken up 1 times to write data ] Looking at the vmlinux_path (8 entries long) Using /lib/modules/4.6.0-rc7+/build/vmlinux for symbols [ perf record: Captured and wrote 0.016 MB perf.data (7 samples) ] [acme@jouet linux]$ [1] 0161028b7c8a ("perf/core: Change the default paranoia level to 2") Reported-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-b20jmx4dxt5hpaa9t2rroi0o@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-12perf evsel: Improve EPERM error handling in open_strerror()Arnaldo Carvalho de Melo1-2/+3
We were showing a hardcoded default value for the kernel.perf_event_paranoid sysctl, now that it became more paranoid (1 -> 2 [1]), this would need to be updated, instead show the current value: [acme@jouet linux]$ perf record ls Error: You may not have permission to collect stats. Consider tweaking /proc/sys/kernel/perf_event_paranoid, which controls use of the performance events system by unprivileged users (without CAP_SYS_ADMIN). The current value is 2: -1: Allow use of (almost) all events by all users >= 0: Disallow raw tracepoint access by users without CAP_IOC_LOCK >= 1: Disallow CPU event access by users without CAP_SYS_ADMIN >= 2: Disallow kernel profiling by users without CAP_SYS_ADMIN [acme@jouet linux]$ [1] 0161028b7c8a ("perf/core: Change the default paranoia level to 2") Reported-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-0gc4rdpg8d025r5not8s8028@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-12perf probe: Check if dwarf_getlocations() is availableArnaldo Carvalho de Melo2-0/+15
If not, tell the user that: config/Makefile:273: Old libdw.h, finding variables at given 'perf probe' point will not work, install elfutils-devel/libdw-dev >= 0.157 And return -ENOTSUPP in die_get_var_range(), failing features that need it, like the one pointed out above. This fixes the build on older systems, such as Ubuntu 12.04.5. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Vinson Lee <vlee@freedesktop.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-9l7luqkq4gfnx7vrklkq4obs@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-12perf dwarf: Guard !x86_64 definitions under #ifdef else clauseArnaldo Carvalho de Melo1-4/+4
To fix the build on Fedora Rawhide (gcc 6.0.0 20160311 (Red Hat 6.0.0-0.17): CC /tmp/build/perf/arch/x86/util/dwarf-regs.o arch/x86/util/dwarf-regs.c:66:36: error: 'x86_32_regoffset_table' defined but not used [-Werror=unused-const-variable=] static const struct pt_regs_offset x86_32_regoffset_table[] = { ^~~~~~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-fghuksc1u8ln82bof4lwcj0o@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-12perf tools: Use readdir() instead of deprecated readdir_r()Arnaldo Carvalho de Melo1-30/+30
The readdir() function is thread safe as long as just one thread uses a DIR, which is the case when parsing tracepoint event definitions, to avoid breaking the build with glibc-2.23.90 (upcoming 2.24), use it instead of readdir_r(). See: http://man7.org/linux/man-pages/man3/readdir.3.html "However, in modern implementations (including the glibc implementation), concurrent calls to readdir() that specify different directory streams are thread-safe. In cases where multiple threads must read from the same directory stream, using readdir() with external synchronization is still preferable to the use of the deprecated readdir_r(3) function." Noticed while building on a Fedora Rawhide docker container. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-wddn49r6bz6wq4ee3dxbl7lo@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-12perf thread_map: Use readdir() instead of deprecated readdir_r()Arnaldo Carvalho de Melo1-4/+4
The readdir() function is thread safe as long as just one thread uses a DIR, which is the case in thread_map, so, to avoid breaking the build with glibc-2.23.90 (upcoming 2.24), use it instead of readdir_r(). See: http://man7.org/linux/man-pages/man3/readdir.3.html "However, in modern implementations (including the glibc implementation), concurrent calls to readdir() that specify different directory streams are thread-safe. In cases where multiple threads must read from the same directory stream, using readdir() with external synchronization is still preferable to the use of the deprecated readdir_r(3) function." Noticed while building on a Fedora Rawhide docker container. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-del8h2a0f40z75j4r42l96l0@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-12perf script: Use readdir() instead of deprecated readdir_r()Arnaldo Carvalho de Melo1-36/+34
The readdir() function is thread safe as long as just one thread uses a DIR, which is the case in 'perf script', so, to avoid breaking the build with glibc-2.23.90 (upcoming 2.24), use it instead of readdir_r(). See: http://man7.org/linux/man-pages/man3/readdir.3.html "However, in modern implementations (including the glibc implementation), concurrent calls to readdir() that specify different directory streams are thread-safe. In cases where multiple threads must read from the same directory stream, using readdir() with external synchronization is still preferable to the use of the deprecated readdir_r(3) function." Noticed while building on a Fedora Rawhide docker container. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-mt3xz7n2hl49ni2vx7kuq74g@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-12perf tools: Use readdir() instead of deprecated readdir_r()Arnaldo Carvalho de Melo1-6/+6
The readdir() function is thread safe as long as just one thread uses a DIR, which is the case when synthesizing events for pre-existing threads by traversing /proc, so, to avoid breaking the build with glibc-2.23.90 (upcoming 2.24), use it instead of readdir_r(). See: http://man7.org/linux/man-pages/man3/readdir.3.html "However, in modern implementations (including the glibc implementation), concurrent calls to readdir() that specify different directory streams are thread-safe. In cases where multiple threads must read from the same directory stream, using readdir() with external synchronization is still preferable to the use of the deprecated readdir_r(3) function." Noticed while building on a Fedora Rawhide docker container. CC /tmp/build/perf/util/event.o util/event.c: In function '__event__synthesize_thread': util/event.c:466:2: error: 'readdir_r' is deprecated [-Werror=deprecated-declarations] while (!readdir_r(tasks, &dirent, &next) && next) { ^~~~~ In file included from /usr/include/features.h:368:0, from /usr/include/stdint.h:25, from /usr/lib/gcc/x86_64-redhat-linux/6.0.0/include/stdint.h:9, from /git/linux/tools/include/linux/types.h:6, from util/event.c:1: /usr/include/dirent.h:189:12: note: declared here Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-i1vj7nyjp2p750rirxgrfd3c@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-11perf buildid-cache: Use lsdir() for looking up buildid cachesMasami Hiramatsu1-26/+4
Use new lsdir() for looking up buildid caches. This changes logic a bit to ignore all dot files, since the build-id cache must not start with dot. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20160511135217.23943.94596.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-11perf symbols: Use lsdir() for the search in kcore cache directoryMasami Hiramatsu1-12/+14
Use lsdir() to search in kcore cache directory. This also avoids checking hidden dot directory entries, because kcore cache directories must always have the name from timestamps when taking the kcore snapshots, and it never start with dot. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20160511135208.23943.68071.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-11perf tools: Use SBUILD_ID_SIZE where applicableMasami Hiramatsu6-7/+7
Use the existing SBUILD_ID_SIZE macro instead of the equivalent BUILD_ID_SIZE * 2 + 1 expression for allocating a buffer for build-id strings. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20160511135159.23943.57120.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-11perf tools: Fix lsdir to set errno correctlyMasami Hiramatsu1-1/+1
Fix lsdir() to set correct positive error number (ENOMEM). Since "errno" must have a positive error number instead of negative number, fix lsdir to set it correctly. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: e1ce726e1db2 ("perf tools: Add lsdir() helper to read a directory") Link: http://lkml.kernel.org/r/20160511135127.23943.40644.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-11perf trace: Move seccomp args beautifiers to tools/perf/trace/beauty/Arnaldo Carvalho de Melo2-52/+53
To reduce the size of builtin-trace.c. Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/n/tip-ovxifncj34ynrjjseg33lil3@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-11perf trace: Move flock op beautifier to tools/perf/trace/beauty/Arnaldo Carvalho de Melo2-31/+32
To reduce the size of builtin-trace.c. Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/n/tip-c4c47w2a2jx13terl2p2hros@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-11perf script: Fix export of callchains with recursion in db-exportChris Phlipot1-0/+4
When an IP with an unresolved symbol occurs in the callchain more than once (ie. recursion), then duplicate symbols can be created because the callchain nodes are never updated after they are first created. To fix this issue we call dso__find_symbol whenever we encounter a NULL symbol, in case we already added a symbol at that IP since we started traversing the callchain. This change prevents duplicate symbols from being exported when duplicate IPs are present in the callchain. Signed-off-by: Chris Phlipot <cphlipot0@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1462937209-6032-5-git-send-email-cphlipot0@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-11perf script: Fix callchain addresses in db-exportChris Phlipot1-4/+1
Remove the call to map_ip() to adjust al.addr, because it has already been called when assembling the callchain, in: thread__resolve_callchain_sample(perf_sample) add_callchain_ip(ip = perf_sample->callchain->ips[j]) thread__find_addr_location(addr = ip) thread__find_addr_map(addr) { al->addr = addr if (al->map) al->addr = al->map->map_ip(al->map, al->addr); } Calling it a second time can result in incorrect addresses being used. This can have effects such as duplicate symbols being created and exported. Signed-off-by: Chris Phlipot <cphlipot0@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1462937209-6032-4-git-send-email-cphlipot0@gmail.com [ Show the callchain where it is done, to help reviewing this change down the line ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-11perf script: Fix symbol insertion behavior in db-exportChris Phlipot1-2/+1
Use the dso__insert_symbol function instead of symbols__insert() in order to properly update the dso symbol cache. If the cache is not updated, then duplicate symbols can be unintentionally created, inserted, and exported. This change prevents duplicate symbols from being exported due to dso__find_symbol() using a stale symbol cache. Signed-off-by: Chris Phlipot <cphlipot0@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1462937209-6032-3-git-send-email-cphlipot0@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-11perf symbols: Add dso__insert_symbol functionChris Phlipot2-0/+15
The current method for inserting symbols is to use the symbols__insert() function. However symbols__insert() does not update the dso symbol cache. This causes problems in the following scenario: 1. symbol not found at addr using dso__find_symbol 2. symbol inserted at addr using the existing symbols__insert function 3. symbol still not found at addr using dso__find_symbol() because cache isn't updated. This is undesired behavior. The undesired behavior in (3) is addressed by creating a new function, dso__insert_symbol() to both insert the symbol and update the symbol cache if necessary. If dso__insert_symbol() is used in (2) instead of symbols__insert(), then the undesired behavior in (3) is avoided. Signed-off-by: Chris Phlipot <cphlipot0@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1462937209-6032-2-git-send-email-cphlipot0@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-11perf scripting python: Use Py_FatalError instead of die()Arnaldo Carvalho de Melo1-2/+5
It probably is equivalent, but that seems to be the "pythonic" way of dieing? Anyway, one less die() in the tools/perf codebase. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Cc: Chris Phlipot <cphlipot0@gmail.com> Link: http://lkml.kernel.org/n/tip-nlzgepdv2818zs4e7faif9tu@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-11Merge tag 'perf-core-for-mingo-20160510' of ↵Ingo Molnar27-245/+510
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: User visible changes: - Recording 'dwarf' callchains do not need DWARF unwinding support (He Kuang) - Print recently added perf_event_attr.write_backward bit flag in -vv verbose mode (Arnaldo Carvalho de Melo) - Fix incorrect python db-export error message in 'perf script' (Chris Phlipot) - Fix handling of zero-length symbols (Chris Phlipot) - perf stat: Scale values by unit before metrics (Andi Kleen) Infrastructure changes: - Rewrite strbuf not to die(), making tools using it to check its return value instead (Masami Hiramatsu) - Support reading from backward ring buffer, add a 'perf test' entry for it (Wang Nan) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-11Merge branch 'perf/urgent' into perf/core, to pick up fixesIngo Molnar1-0/+3
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-11perf diff: Fix duplicated output columnNamhyung Kim1-0/+3
The commit b97511c5bc94 ("perf tools: Add overhead/overhead_children keys defaults via string") moved initialization of column headers but it missed to check the sort__mode. As 'perf diff' doesn't call perf_hpp__init(), the setup_overhead() also should not be called. Before: # Baseline Delta Children Overhead Shared Object Symbol # ........ ....... ........ ........ ................... ....................... # 28.48% -28.47% 28.48% 28.48% [kernel.vmlinux ] [k] intel_idle 11.51% -11.47% 11.51% 11.51% libxul.so [.] 0x0000000001a360f7 3.49% -3.49% 3.49% 3.49% [kernel.vmlinux] [k] generic_exec_single 2.91% -2.89% 2.91% 2.91% libdbus-1.so.3.8.11 [.] 0x000000000000cdc2 2.86% -2.85% 2.86% 2.86% libxcb.so.1.1.0 [.] 0x000000000000c890 2.44% -2.39% 2.44% 2.44% [kernel.vmlinux] [k] perf_event_aux_ctx After: # Baseline Delta Shared Object Symbol # ........ ....... ................... ....................... # 28.48% -28.47% [kernel.vmlinux] [k] intel_idle 11.51% -11.47% libxul.so [.] 0x0000000001a360f7 3.49% -3.49% [kernel.vmlinux] [k] generic_exec_single 2.91% -2.89% libdbus-1.so.3.8.11 [.] 0x000000000000cdc2 2.86% -2.85% libxcb.so.1.1.0 [.] 0x000000000000c890 2.44% -2.39% [kernel.vmlinux] [k] perf_event_aux_ctx Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: <stable@vger.kernel.org> # 4.5+ Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: b97511c5bc94 ("perf tools: Add overhead/overhead_children keys defaults via string") Link: http://lkml.kernel.org/r/1462890384-12486-2-git-send-email-acme@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-11perf tools: Fix perf regs mask generationNaveen N. Rao1-4/+4
On some architectures (powerpc in particular), the number of registers exceeds what can be represented in an integer bitmask. Ensure we generate the proper bitmask on such platforms. Fixes: 71ad0f5e4 ("perf tools: Support for DWARF CFI unwinding on post processing") Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11perf/powerpc: Add support for unwinding perf-stackdumpChandan Kumar3-0/+98
Adds support for unwinding user stack dump by linking with libunwind. Signed-off-by: Chandan Kumar <chandan.kumar@linux.vnet.ibm.com> Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-10perf tools: Remove xrealloc and ALLOC_GROWMasami Hiramatsu4-55/+0
Remove unused xrealloc() and ALLOC_GROW() from libperf. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20160510054801.6158.6204.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-10perf help: Do not use ALLOC_GROW in add_cmd_listMasami Hiramatsu1-8/+22
Replace ALLOC_GROW with normal realloc code in add_cmd_list() so that it can handle errors directly. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20160510054752.6158.30562.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-10perf pmu: Make pmu_formats_string to check return value of strbufMasami Hiramatsu1-5/+5
Make pmu_formats_string() to check return value of strbuf APIs so that it can detect errors in it. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20160510054744.6158.37810.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-10perf header: Make topology checkers to check return value of strbufMasami Hiramatsu1-11/+20
Make topology checkers to check the return value of strbuf APIs so that it can detect errors in it. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20160510054735.6158.98650.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-10perf tools: Make alias handler to check return value of strbufMasami Hiramatsu3-20/+26
Make alias handler and sq_quote_argv to check the return value of strbuf APIs. In sq_quote_argv() calls die(), but this fix handles strbuf failure as a special case and returns to caller, since the caller - handle_alias() also has to check the return value of other strbuf APIs and those checks can be merged to one if() statement. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20160510054725.6158.84597.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-10perf help: Make check_emacsclient_version to check strbuf APIsMasami Hiramatsu1-8/+10
Make check_emacsclient_version() to check the return value of strbuf APIs so that it can handle errors in strbuf. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20160510054716.6158.11755.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-10perf probe: Check the return value of strbuf APIsMasami Hiramatsu3-97/+128
Check the return value of strbuf APIs in perf-probe related code, so that it can handle errors in strbuf. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20160510054707.6158.69861.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-10perf tools: Rewrite strbuf not to die()Masami Hiramatsu2-36/+82
Rewrite strbuf implementation not to use die() nor xrealloc(). Instead of die(), now most of the API returns error code or 0 if succeeded. Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20160510054658.6158.24080.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-09perf symbols: Fix handling of zero-length symbols.Chris Phlipot1-1/+1
This change introduces a fix to symbols__find, so that it is able to find symbols of length zero (where start == end). The current code has the following problem: - The current implementation of symbols__find is unable to find any symbols of length zero. - The db-export framework explicitly creates zero length symbols at locations where no symbol currently exists. The combination of the two above behaviors results in behavior similar to the example below. 1. addr_location is created for a sample, but symbol is unable to be resolved. 2. db export creates an "unknown" symbol of length zero at that address and inserts it into the dso. 3. A new sample comes in at the same address, but symbol__find is unable to find the zero length symbol, so it is still unresolved. 4. db export sees the symbol is unresolved, and allocated a duplicate symbol, even though it already did this in step 2. This behavior continues every time an address without symbol information is seen, which causes a very large number of these symbols to be allocated. The effect of this fix can be observed by looking at the contents of an exported database before/after the fix (generated with scripts/python/export-to-postgresql.py) Ex. BEFORE THE CHANGE: example_db=# select count(*) from symbols; count -------- 900213 (1 row) example_db=# select count(*) from symbols where symbols.name='unknown'; count -------- 897355 (1 row) example_db=# select count(*) from symbols where symbols.name!='unknown'; count ------- 2858 (1 row) AFTER THE CHANGE: example_db=# select count(*) from symbols; count ------- 25217 (1 row) example_db=# select count(*) from symbols where name='unknown'; count ------- 22359 (1 row) example_db=# select count(*) from symbols where name!='unknown'; count ------- 2858 (1 row) Signed-off-by: Chris Phlipot <cphlipot0@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1462612620-25008-1-git-send-email-cphlipot0@gmail.com [ Moved the test to later in the rb_tree tests, as this not the likely case ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-09perf evsel: Print state of perf_event_attr.write_backwardArnaldo Carvalho de Melo1-0/+1
Now we can see if it is set when using verbose mode in various tools, such as 'perf test': # perf test -vv back 45: Test backward reading from ring buffer : --- start --- <SNIP> ------------------------------------------------------------ perf_event_attr: type 2 size 112 config 0x98 { sample_period, sample_freq } 1 sample_type IP|TID|TIME|CPU|PERIOD|RAW disabled 1 mmap 1 comm 1 task 1 sample_id_all 1 exclude_guest 1 mmap2 1 comm_exec 1 write_backward 1 ------------------------------------------------------------ sys_perf_event_open: pid 20911 cpu -1 group_fd -1 flags 0x8 <SNIP> ---- end ---- Test backward reading from ring buffer: Ok # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-kxv05kv9qwl5of7rzfeiiwbv@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-09perf tests: Add test to check backward ring bufferWang Nan4-0/+157
This test checks reading from backward ring buffer. Test result: # ~/perf test 'ring buffer' 45: Test backward reading from ring buffer : Ok The test case is a while loop which calls prctl(PR_SET_NAME) multiple times. Each prctl should issue 2 events: one PERF_RECORD_SAMPLE, one PERF_RECORD_COMM. The first round creates a relative large ring buffer (256 pages). It can afford all events. Read from it and check the count of each type of events. The second round creates a small ring buffer (1 page) and makes it overwritable. Check the correctness of the buffer. Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1462758471-89706-3-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-09perf tools: Support reading from backward ring bufferWang Nan2-0/+54
perf_evlist__mmap_read_backward() is introduced for reading backward ring buffer. Since direction for reading such ring buffer is different from the direction kernel writing to it, and since user need to fetch most recent record from it, a perf_evlist__mmap_read_catchup() is introduced to move the reading pointer to the end of the buffer. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1462758471-89706-2-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-09perf script: Fix incorrect python db-export error messageChris Phlipot1-1/+1
Fix the error message printed when attempting and failing to create the call path root incorrectly references the call return process. This change fixes the message to properly reference the failure to create the call path root. Signed-off-by: Chris Phlipot <cphlipot0@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1462612620-25008-2-git-send-email-cphlipot0@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-09perf stat: Scale values by unit before metricsAndi Kleen1-1/+3
Scale values by unit before passing them to the metrics printing functions. This is needed for TopDown, because it needs to scale the slots correctly by pipeline width / SMTness. For existing metrics it shouldn't make any difference, as those generally use events that don't have any units. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1462489447-31832-8-git-send-email-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-09perf callchain: Recording 'dwarf' callchains do not need DWARF unwinding supportHe Kuang1-2/+0
There is no need to check for DWARF unwinding support when using the 'dwarf' callchain record method, as this will only ask the kernel to collect stack dumps for later DWARF CFI processing, which can be done in another machine, where the support for DWARF unwinding need to be present. Signed-off-by: He Kuang <hekuang@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1462525154-125656-2-git-send-email-hekuang@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06perf trace: Move futex_op beautifier to tools/perf/trace/beauty/Arnaldo Carvalho de Melo2-45/+46
To reduce the size of builtin-trace.c. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-vb8dpy7bptkf219q5c25ulfp@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06perf trace: Move open_flags beautifier to tools/perf/trace/beauty/Arnaldo Carvalho de Melo2-56/+57
To reduce the size of builtin-trace.c. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-jt293541hv9od7gqw6lilioh@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06perf trace: Move signum beautifier to tools/perf/trace/beauty/Arnaldo Carvalho de Melo2-53/+54
To reduce the size of builtin-trace.c. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-qecqxwwtreio6eaatfv58yq5@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06perf stat: Add extra output of counter values with -vvAndi Kleen1-0/+8
Add debug output of raw counter values per CPU when perf stat -v is specified, together with their cpu numbers. This is very useful to debug problems with per core counters, where we can normally only see aggregated values. v2: Make it depend on -vv, not -v Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1461787251-6702-12-git-send-email-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06perf script: Update export-to-postgresql to support callchain exportChris Phlipot1-17/+30
Update the export-to-postgresql.py to support the newly introduced callchain export. callchains are added into the existing call_paths table and can now be associated with samples when the "callpaths" commandline option is used with the script. Ex.: $ perf script -s export-to-postgresql.py example_db all callchains Includes the following changes to enable callchain export via the python export APIs: - Add the "callchains" commandline option, which is used to enable callchain export by setting the perf_db_export_callchains global - Add perf_db_export_callchains checks for call_path table creation and population. - Add call_path_id to samples_table to conform with the new API example usage and output using a small test app: test_app.c: volatile int x = 0; void inc_x_loop() { int i; for(i=0; i<100000000; i++) x++; } void a() { inc_x_loop(); } void b() { inc_x_loop(); } int main() { a(); b(); return 0; } example usage: $ gcc -g -O0 test_app.c $ perf record --call-graph=dwarf ./a.out [ perf record: Woken up 77 times to write data ] [ perf record: Captured and wrote 19.373 MB perf.data (2404 samples) ] $ perf script -s scripts/python/export-to-postgresql.py example_db all callchains $ psql example_db example_db=# SELECT (SELECT name FROM symbols WHERE id = cps.symbol_id) as symbol, (SELECT name FROM symbols WHERE id = (SELECT symbol_id from call_paths where id = cps.parent_id)) as parent_symbol, sum(period) as event_count FROM samples join call_paths as cps on call_path_id = cps.id GROUP BY cps.id,evsel_id ORDER BY event_count DESC LIMIT 5; symbol | parent_symbol | event_count ------------------+--------------------------+------------- inc_x_loop | a | 734250982 inc_x_loop | b | 731028057 unknown | unknown | 1335858 task_tick_fair | scheduler_tick | 1238842 update_wall_time | tick_do_update_jiffies64 | 650373 (5 rows) The above data shows total "self time" in cycles for each call path that was sampled. It is intended to demonstrate how it accounts separately for the two ways to reach the "inc_x_loop" function(via "a" and "b"). Recursive common table expressions can be used as well to get cumulative time spent in a function as well, but that is beyond the scope of this basic example. Signed-off-by: Chris Phlipot <cphlipot0@gmail.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1461831551-12213-7-git-send-email-cphlipot0@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06perf script: Expose usage of the callchain db export via the python apiChris Phlipot3-16/+46
This change allows python scripts to be able to utilize the recent changes to the db export api allowing the export of call_paths derived from sampled callchains. These call paths are also now associated with the samples from which they were derived. - This feature is enabled by setting "perf_db_export_callchains" to true - When enabled, samples that have callchain information will have the callchains exported via call_path_table - The call_path_id field is added to sample_table to enable association of samples with the corresponding callchain stored in the call paths table. A call_path_id of 0 will be exported if there is no corresponding callchain. - When "perf_db_export_callchains" and "perf_db_export_calls" are both set to True, the call path root data structure will be shared. This prevents duplicating of data and call path ids that would result from building two separate call path trees in memory. - The call_return_processor structure definition was relocated to the header file to make its contents visible to db-export.c. This enables the sharing of call path trees between the two features, as mentioned above. This change is visible to python scripts using the python db export api. The change is backwards compatible with scripts written against the previous API, assuming that the scripts model the sample_table function after the one in export-to-postgresql.py script by allowing for additional arguments to be added in the future. ie. using *x as the final argument of the sample_table function. Signed-off-by: Chris Phlipot <cphlipot0@gmail.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1461831551-12213-6-git-send-email-cphlipot0@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06perf script: Add call path id to exported sample in db exportChris Phlipot2-1/+4
The exported sample now contains a reference to the call_path_id that represents its callchain. While callchains themselves are nice to have, being able to associate them with samples makes them much more useful, and can allow for such things as determining how much cumulative time is spent in a particular function. This information is normally possible to get from the call return processor. However, when doing normal sampling, call/return information is not available, thus necessitating the need for associating samples directly with call paths. This commit include changes to db-export layer to make this information available for subsequent patches in this change set, but by itself, does not make any changes visible to the user. Signed-off-by: Chris Phlipot <cphlipot0@gmail.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1461831551-12213-5-git-send-email-cphlipot0@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>