summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2014-02-18perf callchain: Separate libunwind code to special objectJiri Olsa6-3/+3
We are going to add libdw library support to do dwarf post unwind. Making the code ready by moving libunwind dwarf post unwind stuff into separate object. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Jean Pihet <jean.pihet@linaro.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jean Pihet <jean.pihet@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1389098853-14466-10-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-02-18perf callchain: Add mask into struct regs_dumpJiri Olsa10-37/+32
Adding mask info into struct regs_dump to make the registers information compact. The mask was always passed along, so logically the mask info fits more into the struct regs_dump. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Jean Pihet <jean.pihet@linaro.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jean Pihet <jean.pihet@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1389098853-14466-9-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-02-18perf callchain: Do not report zero address in unwindJiri Olsa1-1/+1
We are not interested in zero addresses in callchain, do not report them. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Jean Pihet <jean.pihet@linaro.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jean Pihet <jean.pihet@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1389098853-14466-8-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-02-18perf tools: Fix dwarf unwind max_stack processingJiri Olsa1-1/+1
The 'unwind__get_entries' function currently returns 'max_stack + 1' entries (instead of exact max_stack entries), because max_stack value does not get decremented for the first entry. This fix makes dwarf-unwind test pass. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Jean Pihet <jean.pihet@linaro.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jean Pihet <jean.pihet@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1389098853-14466-7-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-02-18perf tests x86: Add dwarf unwind testJiri Olsa7-0/+232
Adding dwarf unwind test, that setups live machine data over the perf test thread and does the remote unwind. At this moment this test fails due to bug in the max_stack processing in unwind__get_entries function. This is fixed in following patch. Need to use -fno-optimize-sibling-calls for test compilation, otherwise 'krava_*' function calls are optimized into jumps and ommited from the stack unwind. So far it's enabled only for x86. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Jean Pihet <jean.pihet@linaro.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jean Pihet <jean.pihet@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1389098853-14466-6-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-02-18perf tests x86: Introduce perf_regs_load functionJiri Olsa3-0/+95
Introducing perf_regs_load function, which is going to be used for dwarf unwind test in following patches. It takes single argument as a pointer to the regs dump buffer and populates it with current registers values. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Jean Pihet <jean.pihet@linaro.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jean Pihet <jean.pihet@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1389098853-14466-5-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-02-18perf tools: Fix memory leak in event_format__print functionJiri Olsa1-0/+1
Properly destroying trace_seq object. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1391377150-23920-2-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-02-18perf record: Add readable output for callchain debugJiri Olsa2-2/+5
Adding people readable output for callchain debug, to get following '-v' output: $ perf record -v -g ls callchain: type DWARF callchain: stack dump size 4096 ... Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1391427883-13443-3-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-02-18perf tools: Add call-graph option support into .perfconfigJiri Olsa4-1/+30
Adding call-graph option support into .perfconfig file, so it's now possible use call-graph option like: [top] call-graph = fp [record] call-graph = dwarf,8192 Above options ONLY setup the unwind method. To enable perf record/top to actually use it the command line option -g/-G must be specified. The --call-graph option overloads .perfconfig setup. Assuming above configuration: $ perf record -g ls - enables dwarf unwind with user stack size dump 8192 bytes $ perf top -G - enables frame pointer unwind $ perf record --call-graph=fp ls - enables frame pointer unwind $ perf top --call-graph=dwarf,4096 ls - enables dwarf unwind with user stack size dump 4096 bytes Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1391427883-13443-2-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-02-18perf tools: Put proper period for for samples without PERIOD sample_typeJiri Olsa1-1/+1
We use PERF_SAMPLE_PERIOD sample type only for frequency setup -F (default) option. The -c does not need store period, because it's always the same. In -c case the report code uses '1' as period. Fixing it to perf_event_attr::sample_period. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1391427883-13443-1-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-02-18perf report: Remove some needless container_of usageArnaldo Carvalho de Melo1-9/+6
Since all it wants is to get the 'struct record' from the received 'struct perf_tool', and this is already done at the callers of these functions, short circuit it. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-xz8p659sjpad396vye5t24gx@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-02-18perf tools: Shorten sample symbol resolving function signatureArnaldo Carvalho de Melo4-13/+10
Since two of the parameters come from the same 'struct addr_location', rename machine__resolve_bstack() to sample__resolve_bstack() and pass the that addr_location instead. This is also for consistency with the same change that resulted in the sample__resolve_mem() function. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-99ecqt8jiyyksiyx3se7l5ia@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-02-18perf tools: Shorten sample symbol resolving function signatureArnaldo Carvalho de Melo4-11/+9
Since three of the parameters come from the same 'struct addr_location', rename machine__resolve_mem() to sample__resolve_mem() and pass the that addr_location instead. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-3f5otpssefh9l5hi1t259h8n@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-02-18perf report: Use al->cpumode where applicableArnaldo Carvalho de Melo1-5/+3
We don't need to recalculate cpumode from the perf_event->header field, as this is already available in the struct addr_location->cpumode field. Remove the function signature of functions that receive both perf_event and addr_location parameters but use perf_event just to extract the cpumode. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-tmct07y7mka54allj82trlnx@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-02-18Merge remote-tracking branch 'acme/perf/urgent' into perf/coreArnaldo Carvalho de Melo4-3/+44
To have some 'perf probe' related fixes needed for further devel work in this tool. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-02-13perf trace: Fix ioctl 'request' beautifier build problems on !(i386 || ↵Arnaldo Carvalho de Melo1-0/+18
x86_64) arches Supporting decoding the ioctl 'request' parameter needs more work to properly support more architectures, the current approach doesn't work on at least powerpc and sparc, as reported by Ben Hutchings in http://lkml.kernel.org/r/1391593985.3003.48.camel@deadeye.wl.decadent.org.uk . Work around that by making it to be ifdefed for the architectures known to work with the current, limited approach, i386 and x86_64 till better code is written. Reported-by: Ben Hutchings <ben@decadent.org.uk> Acked-by: Ben Hutchings <ben@decadent.org.uk> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: <stable@vger.kernel.org> # 3.13 Fixes: 78645cf3ed32 ("perf trace: Initial beautifier for ioctl's 'cmd' arg") Link: http://lkml.kernel.org/n/tip-ss04k11insqlu329xh5g02q0@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-02-10perf trace: Add fallback definition of EFD_SEMAPHOREBen Hutchings1-0/+4
glibc 2.17 is missing this on sparc, despite the fact that it's not architecture-specific. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Fixes: 49af9e93adfa ('perf trace: Beautify eventfd2 'flags' arg') Cc: <stable@vger.kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1391648435.3003.100.camel@deadeye.wl.decadent.org.uk Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-02-10perf list: Fix checking for supported events on older kernelsVince Weaver1-2/+15
"perf list" listing of hardware events doesn't work on older ARM devices. The change enabling event detection: commit b41f1cec91c37eeea6fdb15effbfa24ea0a5536b Author: Namhyung Kim <namhyung.kim@lge.com> Date: Tue Aug 27 11:41:53 2013 +0900 perf list: Skip unsupported events uses the following code in tools/perf/util/parse-events.c: struct perf_event_attr attr = { .type = type, .config = config, .disabled = 1, .exclude_kernel = 1, }; On ARM machines pre-dating the Cortex-A15 this doesn't work, as these machines don't support .exclude_kernel. So starting with 3.12 "perf list" does not report any hardware events at all on older machines (seen on Rasp-Pi, Pandaboard, Beagleboard, etc). This version of the patch makes changes suggested by Namhyung Kim to check for EACCESS and retry (instead of just dropping the exclude_kernel) so we can properly handle machines where /proc/sys/kernel/perf_event_paranoid is set to 2. Reported-by: Chad Paradis <chad.paradis@umit.maine.edu> Signed-off-by: Vince Weaver <vincent.weaver@maine.edu> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Chad Paradis <chad.paradis@umit.maine.edu> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1312301536150.28814@vincent-weaver-1.um.maine.edu Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-02-10perf tools: Handle PERF_RECORD_HEADER_EVENT_TYPE properlyJiri Olsa1-0/+6
We removed event types from data file in following commits: 6065210 perf tools: Remove event types framework completely 44b3c57 perf tools: Remove event types from perf data file We no longer need this information, because we can get it directly from tracepoints. But we still need to handle PERF_RECORD_HEADER_EVENT_TYPE event for the sake of old perf data files created in pipe mode like: $ perf.3.4 record -o - foo >perf.data $ perf.312 report -i - < perf.data Reported-by: Stephane Eranian <eranian@google.com> Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1391524668-12546-1-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-02-10perf probe: Do not add offset twice to uprobe addressMasami Hiramatsu1-1/+1
Fix perf-probe not to add offset value twice to uprobe probe address when post processing. The tevs[i].point.address struct member is the address of symbol+offset, but current perf-probe adjusts the point.address by adding the offset. As a result, the probe address becomes symbol+offset+offset. This may cause unexpected code corruption. Urgent fix is needed. Without this fix: --- # ./perf probe -x ./perf dso__load_vmlinux+4 # ./perf probe -l probe_perf:dso__load_vmlinux (on 0x000000000006d2b8) # nm ./perf.orig | grep dso__load_vmlinux\$ 000000000046d0a0 T dso__load_vmlinux --- You can see the given offset is 3 but the actual probed address is dso__load_vmlinux+8. With this fix: --- # ./perf probe -x ./perf dso__load_vmlinux+4 # ./perf probe -l probe_perf:dso__load_vmlinux (on 0x000000000006d2b4) --- Now the problem is fixed. Note: This bug is introduced by commit fb7345bbf7fad9bf72ef63a19c707970b9685812 Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: "David A. Long" <dave.long@linaro.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: yrl.pp-manager.tt@hitachi.com Link: http://lkml.kernel.org/r/20140205051858.6519.27314.stgit@kbuild-fedora.yrl.intra.hitachi.co.jp Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-02-09perf/x86/p4: Block PMIs on init to prevent a stream of unkown NMIsDon Zickus1-0/+15
A bunch of unknown NMIs have popped up on a Pentium4 recently when booting into a kdump kernel. This was exposed because the watchdog timer went from 60 seconds down to 10 seconds (increasing the ability to reproduce this problem). What is happening is on boot up of the second kernel (the kdump one), the previous nmi_watchdogs were enabled on thread 0 and thread 1. The second kernel only initializes one cpu but the perf counter on thread 1 still counts. Normally in a kdump scenario, the other cpus are blocking in an NMI loop, but more importantly their local apics have the performance counters disabled (iow LVTPC is masked). So any counters that fire are masked and never get through to the second kernel. However, on a P4 the local apic is shared by both threads and thread1's PMI (despite being configured to only interrupt thread1) will generate an NMI on thread0. Because thread0 knows nothing about this NMI, it is seen as an unknown NMI. This would be fine because it is a kdump kernel, strange things happen what is the big deal about a single unknown NMI. Unfortunately, the P4 comes with another quirk: clearing the overflow bit to prevent a stream of NMIs. This is the problem. The kdump kernel can not execute because of the endless NMIs that happen. To solve this, I instrumented the p4 perf init code, to walk all the counters and zero them out (just like a normal reset would). Now when the counters go off, they do not generate anything and no unknown NMIs are seen. I tested this on a P4 we have in our lab. After two or three crashes, I could normally reproduce the problem. Now after 10 crashes, everything continues to boot correctly. Signed-off-by: Don Zickus <dzickus@redhat.com> Cc: Dave Young <dyoung@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20140120154115.GZ25953@redhat.com [ Fixed a stylistic detail. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-02-09perf/x86/p4: Fix counter corruption when using lots of perf groupsDon Zickus1-1/+18
On a P4 box stressing perf with: ./perf record -o perf.data ./perf stat -v ./perf bench all it was noticed that a slew of unknown NMIs would pop out rather quickly. Painfully debugging this ancient platform, led me to notice cross cpu counter corruption. The P4 machine is special in that it has 18 counters, half are used for cpu0 and the other half is for cpu1 (or all 18 if hyperthreading is disabled). But the splitting of the counters has to be actively managed by the software. In this particular bug, one of the cpu0 specific counters was being used by cpu1 and caused all sorts of random unknown nmis. I am not entirely sure on the corruption path, but what happens is: o perf schedules a group with p4_pmu_schedule_events() o inside p4_pmu_schedule_events(), it notices an hwc pointer is being reused but for a different cpu, so it 'swaps' the config bits and returns the updated 'assign' array with a _new_ index. o perf schedules another group with p4_pmu_schedule_events() o inside p4_pmu_schedule_events(), it notices an hwc pointer is being reused (the same one as above) but for the _same_ cpu [BUG!!], so it updates the 'assign' array to use the _old_ (wrong cpu) index because the _new_ index is in an earlier part of the 'assign' array (and hasn't been committed yet). o perf commits the transaction using the wrong index and corrupts the other cpu The [BUG!!] is because the 'hwc->config' is updated but not the 'hwc->idx'. So the check for 'p4_should_swap_ts()' is correct the first time around but incorrect the second time around (because hwc->config was updated in between). I think the spirit of perf was to not modify anything until all the transactions had a chance to 'test' if they would succeed, and if so, commit atomically. However, P4 breaks this spirit by touching the hwc->config element. So my fix is to continue the un-perf like breakage, by assigning hwc->idx to -1 on swap to tell follow up group scheduling to find a new index. Of course if the transaction fails rolling this back will be difficult, but that is not different than how the current code works. :-) And I wasn't sure how much effort to cleanup the code I should do for a platform that is almost 10 years old by now. Hence the lazy fix. Signed-off-by: Don Zickus <dzickus@redhat.com> Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1391024270-19469-1-git-send-email-dzickus@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-02-09x86/nmi: Push duration printk() to irq contextPeter Zijlstra2-13/+27
Calling printk() from NMI context is bad (TM), so move it to IRQ context. In doing so we slightly change (probably wreck) the debugfs nmi_longest_ns thingy, in that it doesn't update to reflect the longest, nor does writing to it reset the count. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Don Zickus <dzickus@redhat.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Link: http://lkml.kernel.org/n/tip-rdw0au56a5ymis1u8p48c12d@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-02-09perf/x86: Push the duration-logging printk() to IRQ contextPeter Zijlstra2-7/+23
Calling printk() from NMI context is bad (TM), so move it to IRQ context. This also avoids the problem where the printk() time is measured by the generic NMI duration goo and triggers a second warning. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Don Zickus <dzickus@redhat.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Link: http://lkml.kernel.org/n/tip-75dv35xf6dhhmeb7nq6fua31@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-02-09Merge branch 'linus' into perf/coreIngo Molnar9201-226445/+544207
Refresh the branch to a v3.14-rc base before queueing up new devel patches. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-02-09perf/x86: Fix Userspace RDPMC switchPeter Zijlstra1-1/+1
The current code forgets to change the CR4 state on the current CPU. Use on_each_cpu() instead of smp_call_function(). Reported-by: Mark Davies <junk@eslaf.co.uk> Suggested-by: Mark Davies <junk@eslaf.co.uk> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: fweisbec@gmail.com Link: http://lkml.kernel.org/n/tip-69efsat90ibhnd577zy3z9gh@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-02-09perf/x86/intel/p6: Add userspace RDPMC quirk for PProPeter Zijlstra3-16/+39
PPro machines can die hard when PCE gets enabled due to a CPU erratum. The safe way it so disable it by default and keep it disabled. See erratum 26 in: http://download.intel.com/design/archives/processors/pro/docs/24268935.pdf Reported-and-Tested-by: Mark Davies <junk@eslaf.co.uk> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Stephane Eranian <eranian@google.com> Cc: Vince Weaver <vince@deater.net> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20140206170815.GW2936@laptop.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-02-08Merge tag 'pinctrl-v3.14-2' of ↵Linus Torvalds6-15/+32
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pinctrl fixes from Linus Walleij: "First round of pin control fixes for v3.14: - Protect pinctrl_list_add() with the proper mutex. This was identified by RedHat. Caused nasty locking warnings was rootcased by Stanislaw Gruszka. - Avoid adding dangerous debugfs files when either half of the subsystem is unused: pinmux or pinconf. - Various fixes to various drivers: locking, hardware particulars, DT parsing, error codes" * tag 'pinctrl-v3.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: pinctrl: tegra: return correct error type pinctrl: do not init debugfs entries for unimplemented functionalities pinctrl: protect pinctrl_list add pinctrl: sirf: correct the pin index of ac97_pins group pinctrl: imx27: fix offset calculation in imx_read_2bit pinctrl: vt8500: Change devicetree data parsing pinctrl: imx27: fix wrong offset to ICONFB pinctrl: at91: use locked variant of irq_set_handler
2014-02-08Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fix from Thomas Gleixner: "Add a missing Kconfig dependency" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq: Generic irq chip requires IRQ_DOMAIN
2014-02-08Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds18-84/+138
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Peter Anvin: "Quite a varied little collection of fixes. Most of them are relatively small or isolated; the biggest one is Mel Gorman's fixes for TLB range flushing. A couple of AMD-related fixes (including not crashing when given an invalid microcode image) and fix a crash when compiled with gcov" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, microcode, AMD: Unify valid container checks x86, hweight: Fix BUG when booting with CONFIG_GCOV_PROFILE_ALL=y x86/efi: Allow mapping BGRT on x86-32 x86: Fix the initialization of physnode_map x86, cpu hotplug: Fix stack frame warning in check_irq_vectors_for_cpu_disable() x86/intel/mid: Fix X86_INTEL_MID dependencies arch/x86/mm/srat: Skip NUMA_NO_NODE while parsing SLIT mm, x86: Revisit tlb_flushall_shift tuning for page flushes except on IvyBridge x86: mm: change tlb_flushall_shift for IvyBridge x86/mm: Eliminate redundant page table walk during TLB range flushing x86/mm: Clean up inconsistencies when flushing TLB ranges mm, x86: Account for TLB flushes only when debugging x86/AMD/NB: Fix amd_set_subcaches() parameter type x86/quirks: Add workaround for AMD F16h Erratum792 x86, doc, kconfig: Fix dud URL for Microcode data
2014-02-08Merge tag 'jfs-3.14-rc2' of git://github.com/kleikamp/linux-shaggyLinus Torvalds1-7/+7
Pull jfs fix from David Kleikamp: "Fix regression" * tag 'jfs-3.14-rc2' of git://github.com/kleikamp/linux-shaggy: jfs: fix generic posix ACL regression
2014-02-08jfs: fix generic posix ACL regressionDave Kleikamp1-7/+7
I missed a couple errors in reviewing the patches converting jfs to use the generic posix ACL function. Setting ACL's currently fails with -EOPNOTSUPP. Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com> Reported-by: Michael L. Semon <mlsemon35@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2014-02-08watchdog: dw_wdt: Add dependency on HAS_IOMEMRichard Weinberger1-0/+1
On archs like S390 or um this driver cannot build nor work. Make it depend on HAS_IOMEM to bypass build failures. drivers/built-in.o: In function `dw_wdt_drv_probe': drivers/watchdog/dw_wdt.c:302: undefined reference to `devm_ioremap_resource' Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2014-02-07Merge tag 'driver-core-3.14-rc2' of ↵Linus Torvalds1-4/+8
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core fix from Greg KH: "Here is a single kernfs fix to resolve a much-reported lockdep issue with the removal of entries in sysfs" * tag 'driver-core-3.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: kernfs: make kernfs_deactivate() honor KERNFS_LOCKDEP flag
2014-02-07Merge branch 'for-linus' of ↵Linus Torvalds2-32/+60
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull ceph fixes from Sage Weil: "There is an RBD fix for a crash due to the immutable bio changes, an error path fix, and a locking fix in the recent redirect support" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: libceph: do not dereference a NULL bio pointer libceph: take map_sem for read in handle_reply() libceph: factor out logic from ceph_osdc_start_request() libceph: fix error handling in ceph_osdc_init()
2014-02-07Merge tag 'arm64-fixes' of ↵Linus Torvalds20-67/+102
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: - Relax VDSO alignment requirements so that the kernel-picked one (4K) does not conflict with the dynamic linker's one (64K) - VDSO gettimeofday fix - Barrier fixes for atomic operations and cache flushing - TLB invalidation when overriding early page mappings during boot - Wired up new 32-bit arm (compat) syscalls - LSM_MMAP_MIN_ADDR when COMPAT is enabled - defconfig update - Clean-up (comments, pgd_alloc). * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: defconfig: Expand default enabled features arm64: asm: remove redundant "cc" clobbers arm64: atomics: fix use of acquire + release for full barrier semantics arm64: barriers: allow dsb macro to take option parameter security: select correct default LSM_MMAP_MIN_ADDR on arm on arm64 arm64: compat: Wire up new AArch32 syscalls arm64: vdso: update wtm fields for CLOCK_MONOTONIC_COARSE arm64: vdso: fix coarse clock handling arm64: simplify pgd_alloc arm64: fix typo: s/SERRROR/SERROR/ arm64: Invalidate the TLB when replacing pmd entries during boot arm64: Align CMA sizes to PAGE_SIZE arm64: add DSB after icache flush in __flush_icache_all() arm64: vdso: prevent ld from aligning PT_LOAD segments to 64k
2014-02-07Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linusLinus Torvalds7-11/+24
Pull MIPS updates from Ralf Baechle: "hree minor patches. All have sat in -next for a few days" * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: MIPS: fpu.h: Fix build when CONFIG_BUG is not set MIPS: Wire up sched_setattr/sched_getattr syscalls MIPS: Alchemy: Fix DB1100 GPIO registration
2014-02-07Merge branch 'v4l_for_linus' of ↵Linus Torvalds32-72/+84
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fixes from Mauro Carvalho Chehab: "A series of small fixes. Mostly driver ones. There is one core regression fix on a patch that was meant to fix some race issues on vb2, but that actually caused more harm than good. So, we're just reverting it for now" * 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: [media] adv7842: Composite free-run platfrom-data fix [media] v4l2-dv-timings: fix GTF calculation [media] hdpvr: Fix memory leak in debug [media] af9035: add ID [2040:f900] Hauppauge WinTV-MiniStick 2 [media] mxl111sf: Fix compile when CONFIG_DVB_USB_MXL111SF is unset [media] mxl111sf: Fix unintentional garbage stack read [media] cx24117: use a valid dev pointer for dev_err printout [media] cx24117: remove dead code in always 'false' if statement [media] update Michael Krufky's email address [media] vb2: Check if there are buffers before streamon [media] Revert "[media] videobuf_vm_{open,close} race fixes" [media] go7007-loader: fix usb_dev leak [media] media: bt8xx: add missing put_device call [media] exynos4-is: Compile in fimc-lite runtime PM callbacks conditionally [media] exynos4-is: Compile in fimc runtime PM callbacks conditionally [media] exynos4-is: Fix error paths in probe() for !pm_runtime_enabled() [media] s5p-jpeg: Fix wrong NV12 format parameters [media] s5k5baf: allow to handle arbitrary long i2c sequences
2014-02-07Merge tag 'hwmon-for-linus' of ↵Linus Torvalds2-36/+36
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull hwmon fixes from Guenter Roeck: "Fix PMBus driver problem with some multi-page voltage sensors and fix da9055 interrupt initialization" * tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: hwmon: (da9055) Remove use of regmap_irq_get_virq() hwmon: (pmbus) Support per-page exponent in linear mode
2014-02-07Merge tag 'pm+acpi-3.14-rc2' of ↵Linus Torvalds7-23/+75
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI and power management fixes from Rafael Wysocki: "These include a fix for a recent ACPI hotplug regression, four concurrency related fixes and one PCI device removal fix for ACPI-based PCI hotplug (ACPIPHP), intel_pstate fix that should go into stable, three simple ACPI cleanups and a new entry for the ACPI video blacklist. Specifics: - Fix for a recent ACPI hotplug regression causing a NULL pointer dereference to occur while handling ACPI eject notifications for already ejected devices. From Toshi Kani. - Four concurrency-related fixes for ACPIPHP. Two of them add missing locking and the other two fix race conditions related to reference counting. - ACPIPHP fix to avoid NULL pointer dereferences during device removal involving Virtual Funcions. - intel_pstate fix to make it compute the percentage of time the CPU is busy properly. From Dirk Brandewie. - Removal of two unnecessary NULL pointer checks in ACPI code and a fix for sscanf() format string from Dan Carpenter and Luis G.F. - New ACPI video blacklist entry for HP EliteBook Revolve 810 from Mika Westerberg" * tag 'pm+acpi-3.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI / hotplug: Fix panic on eject to ejected device ACPI / battery: Fix incorrect sscanf() string in acpi_battery_init_alarm() ACPI / proc: remove unneeded NULL check ACPI / utils: remove a pointless NULL check ACPI / video: Add HP EliteBook Revolve 810 to the blacklist intel_pstate: Take core C0 time into account for core busy calculation ACPI / hotplug / PCI: Fix bridge removal race vs dock events ACPI / hotplug / PCI: Fix bridge removal race in handle_hotplug_event() ACPI / hotplug / PCI: Scan root bus under the PCI rescan-remove lock ACPI / hotplug / PCI: Move PCI rescan-remove locking to hotplug_event() ACPI / hotplug / PCI: Remove entries from bus->devices in reverse order
2014-02-07libceph: do not dereference a NULL bio pointerIlya Dryomov1-2/+6
Commit f38a5181d9f3 ("ceph: Convert to immutable biovecs") introduced a NULL pointer dereference, which broke rbd in -rc1. Fix it. Cc: Kent Overstreet <kmo@daterainc.com> Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
2014-02-07Merge tag 'efi-urgent' into x86/urgentH. Peter Anvin9129-225914/+541665
* Avoid WARN_ON() when mapping BGRT on Baytrail (EFI 32-bit). Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-02-07libceph: take map_sem for read in handle_reply()Ilya Dryomov1-6/+11
Handling redirect replies requires both map_sem and request_mutex. Taking map_sem unconditionally near the top of handle_reply() avoids possible race conditions that arise from releasing request_mutex to be able to acquire map_sem in redirect reply case. (Lock ordering is: map_sem, request_mutex, crush_mutex.) Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
2014-02-07libceph: factor out logic from ceph_osdc_start_request()Ilya Dryomov1-23/+39
Factor out logic from ceph_osdc_start_request() into a new helper, __ceph_osdc_start_request(). ceph_osdc_start_request() now amounts to taking locks and calling __ceph_osdc_start_request(). Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
2014-02-07arm64: defconfig: Expand default enabled featuresMark Rutland2-4/+15
FPGA implementations of the Cortex-A57 and Cortex-A53 are now available in the form of the SMM-A57 and SMM-A53 Soft Macrocell Models (SMMs) for Versatile Express. As these attach to a Motherboard Express V2M-P1 it would be useful to have support for some V2M-P1 peripherals enabled by default. Additionally a couple of of features have been introduced since the last defconfig update (CMA, jump labels) that would be good to have enabled by default to ensure they are build and boot tested. This patch updates the arm64 defconfig to enable support for these devices and features. The arm64 Kconfig is modified to select HAVE_PATA_PLATFORM, which is required to enable support for the CompactFlash controller on the V2M-P1. A few options which don't need to appear in defconfig are trimmed: * BLK_DEV - selected by default * EXPERIMENTAL - otherwise gone from the kernel * MII - selected by drivers which require it * USB_SUPPORT - selected by default Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-02-07arm64: asm: remove redundant "cc" clobbersWill Deacon4-25/+21
cbnz/tbnz don't update the condition flags, so remove the "cc" clobbers from inline asm blocks that only use these instructions to implement conditional branches. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-02-07arm64: atomics: fix use of acquire + release for full barrier semanticsWill Deacon5-18/+35
Linux requires a number of atomic operations to provide full barrier semantics, that is no memory accesses after the operation can be observed before any accesses up to and including the operation in program order. On arm64, these operations have been incorrectly implemented as follows: // A, B, C are independent memory locations <Access [A]> // atomic_op (B) 1: ldaxr x0, [B] // Exclusive load with acquire <op(B)> stlxr w1, x0, [B] // Exclusive store with release cbnz w1, 1b <Access [C]> The assumption here being that two half barriers are equivalent to a full barrier, so the only permitted ordering would be A -> B -> C (where B is the atomic operation involving both a load and a store). Unfortunately, this is not the case by the letter of the architecture and, in fact, the accesses to A and C are permitted to pass their nearest half barrier resulting in orderings such as Bl -> A -> C -> Bs or Bl -> C -> A -> Bs (where Bl is the load-acquire on B and Bs is the store-release on B). This is a clear violation of the full barrier requirement. The simple way to fix this is to implement the same algorithm as ARMv7 using explicit barriers: <Access [A]> // atomic_op (B) dmb ish // Full barrier 1: ldxr x0, [B] // Exclusive load <op(B)> stxr w1, x0, [B] // Exclusive store cbnz w1, 1b dmb ish // Full barrier <Access [C]> but this has the undesirable effect of introducing *two* full barrier instructions. A better approach is actually the following, non-intuitive sequence: <Access [A]> // atomic_op (B) 1: ldxr x0, [B] // Exclusive load <op(B)> stlxr w1, x0, [B] // Exclusive store with release cbnz w1, 1b dmb ish // Full barrier <Access [C]> The simple observations here are: - The dmb ensures that no subsequent accesses (e.g. the access to C) can enter or pass the atomic sequence. - The dmb also ensures that no prior accesses (e.g. the access to A) can pass the atomic sequence. - Therefore, no prior access can pass a subsequent access, or vice-versa (i.e. A is strictly ordered before C). - The stlxr ensures that no prior access can pass the store component of the atomic operation. The only tricky part remaining is the ordering between the ldxr and the access to A, since the absence of the first dmb means that we're now permitting re-ordering between the ldxr and any prior accesses. From an (arbitrary) observer's point of view, there are two scenarios: 1. We have observed the ldxr. This means that if we perform a store to [B], the ldxr will still return older data. If we can observe the ldxr, then we can potentially observe the permitted re-ordering with the access to A, which is clearly an issue when compared to the dmb variant of the code. Thankfully, the exclusive monitor will save us here since it will be cleared as a result of the store and the ldxr will retry. Notice that any use of a later memory observation to imply observation of the ldxr will also imply observation of the access to A, since the stlxr/dmb ensure strict ordering. 2. We have not observed the ldxr. This means we can perform a store and influence the later ldxr. However, that doesn't actually tell us anything about the access to [A], so we've not lost anything here either when compared to the dmb variant. This patch implements this solution for our barriered atomic operations, ensuring that we satisfy the full barrier requirements where they are needed. Cc: <stable@vger.kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-02-06hwmon: (da9055) Remove use of regmap_irq_get_virq()Adam Thomson1-4/+0
Remove use of regmap_irq_get_virq() in driver probe which was conflicting with use of platform_get_irq_byname(). platform_get_irq_byname() already returns the VIRQ number due to MFD core translation so using regmap_irq_get_virq() on that returned value results in an incorrect IRQ being requested. The driver probes then fail because of this. Signed-off-by: Adam Thomson <Adam.Thomson.Opensource@diasemi.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2014-02-06Merge branches 'acpi-cleanup' and 'acpi-video'Rafael J. Wysocki4-6/+10
* acpi-cleanup: ACPI / battery: Fix incorrect sscanf() string in acpi_battery_init_alarm() ACPI / proc: remove unneeded NULL check ACPI / utils: remove a pointless NULL check * acpi-video: ACPI / video: Add HP EliteBook Revolve 810 to the blacklist
2014-02-06Merge branch 'pm-cpufreq'Rafael J. Wysocki1-4/+17
* pm-cpufreq: intel_pstate: Take core C0 time into account for core busy calculation