- 17 Feb, 2012 2 commits
-
-
Srikar Dronamraju authored
Add uprobes support to the core kernel, with x86 support. This commit adds the kernel facilities, the actual uprobes user-space ABI and perf probe support comes in later commits. General design: Uprobes are maintained in an rb-tree indexed by inode and offset (the offset here is from the start of the mapping). For a unique (inode, offset) tuple, there can be at most one uprobe in the rb-tree. Since the (inode, offset) tuple identifies a unique uprobe, more than one user may be interested in the same uprobe. This provides the ability to connect multiple 'consumers' to the same uprobe. Each consumer defines a handler and a filter (optional). The 'handler' is run every time the uprobe is hit, if it matches the 'filter' criteria. The first consumer of a uprobe causes the breakpoint to be inserted at the specified address and subsequent consumers are appended to this list. On subsequent probes, the consumer gets appended to the existing list of consumers. The breakpoint is removed when the last consumer unregisters. For all other unregisterations, the consumer is removed from the list of consumers. Given a inode, we get a list of the mms that have mapped the inode. Do the actual registration if mm maps the page where a probe needs to be inserted/removed. We use a temporary list to walk through the vmas that map the inode. - The number of maps that map the inode, is not known before we walk the rmap and keeps changing. - extending vm_area_struct wasn't recommended, it's a size-critical data structure. - There can be more than one maps of the inode in the same mm. We add callbacks to the mmap methods to keep an eye on text vmas that are of interest to uprobes. When a vma of interest is mapped, we insert the breakpoint at the right address. Uprobe works by replacing the instruction at the address defined by (inode, offset) with the arch specific breakpoint instruction. We save a copy of the original instruction at the uprobed address. This is needed for: a. executing the instruction out-of-line (xol). b. instruction analysis for any subsequent fixups. c. restoring the instruction back when the uprobe is unregistered. We insert or delete a breakpoint instruction, and this breakpoint instruction is assumed to be the smallest instruction available on the platform. For fixed size instruction platforms this is trivially true, for variable size instruction platforms the breakpoint instruction is typically the smallest (often a single byte). Writing the instruction is done by COWing the page and changing the instruction during the copy, this even though most platforms allow atomic writes of the breakpoint instruction. This also mirrors the behaviour of a ptrace() memory write to a PRIVATE file map. The core worker is derived from KSM's replace_page() logic. In essence, similar to KSM: a. allocate a new page and copy over contents of the page that has the uprobed vaddr b. modify the copy and insert the breakpoint at the required address c. switch the original page with the copy containing the breakpoint d. flush page tables. replace_page() is being replicated here because of some minor changes in the type of pages and also because Hugh Dickins had plans to improve replace_page() for KSM specific work. Instruction analysis on x86 is based on instruction decoder and determines if an instruction can be probed and determines the necessary fixups after singlestep. Instruction analysis is done at probe insertion time so that we avoid having to repeat the same analysis every time a probe is hit. A lot of code here is due to the improvement/suggestions/inputs from Peter Zijlstra. Changelog: (v10): - Add code to clear REX.B prefix as suggested by Denys Vlasenko and Masami Hiramatsu. (v9): - Use insn_offset_modrm as suggested by Masami Hiramatsu. (v7): Handle comments from Peter Zijlstra: - Dont take reference to inode. (expect inode to uprobe_register to be sane). - Use PTR_ERR to set the return value. - No need to take reference to inode. - use PTR_ERR to return error value. - register and uprobe_unregister share code. (v5): - Modified del_consumer as per comments from Peter. - Drop reference to inode before dropping reference to uprobe. - Use i_size_read(inode) instead of inode->i_size. - Ensure uprobe->consumers is NULL, before __uprobe_unregister() is called. - Includes errno.h as recommended by Stephen Rothwell to fix a build issue on sparc defconfig - Remove restrictions while unregistering. - Earlier code leaked inode references under some conditions while registering/unregistering. - Continue the vma-rmap walk even if the intermediate vma doesnt meet the requirements. - Validate the vma found by find_vma before inserting/removing the breakpoint - Call del_consumer under mutex_lock. - Use hash locks. - Handle mremap. - Introduce find_least_offset_node() instead of close match logic in find_uprobe - Uprobes no more depends on MM_OWNER; No reference to task_structs while inserting/removing a probe. - Uses read_mapping_page instead of grab_cache_page so that the pages have valid content. - pass NULL to get_user_pages for the task parameter. - call SetPageUptodate on the new page allocated in write_opcode. - fix leaking a reference to the new page under certain conditions. - Include Instruction Decoder if Uprobes gets defined. - Remove const attributes for instruction prefix arrays. - Uses mm_context to know if the application is 32 bit. Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Also-written-by: Jim Keniston <jkenisto@us.ibm.com> Reviewed-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Roland McGrath <roland@hack.frob.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Anton Arapov <anton@redhat.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Denys Vlasenko <vda.linux@googlemail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linux-mm <linux-mm@kvack.org> Link: http://lkml.kernel.org/r/20120209092642.GE16600@linux.vnet.ibm.com [ Made various small edits to the commit log ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Ingo Molnar authored
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Includes smaller fixes and improvements plus the exclude_{host,guest} feature test and fallback to handle older kernels. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 14 Feb, 2012 15 commits
-
-
Arnaldo Carvalho de Melo authored
Instead of requiring that users of perf_record_opts set .sample_id_all_avail to true, just invert the logic, using .sample_id_all_missing, that doesn't need to be explicitely initialized since gcc will zero members ommitted in a struct initialization. Just like the newly introduced .exclude_{guest,host} feature test. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-ab772uzk78cwybihf0vt7kxw@git.kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
Just fall back to resetting those fields, if set, warning the user that that feature is not available. If guest samples appear they will just be discarded because no struct machine will be found and thus the event will be accounted as not handled and dropped, see 0c095715. Reported-by: Namhyung Kim <namhyung@gmail.com> Tested-by: Joerg Roedel <joerg.roedel@amd.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Joerg Roedel <joerg.roedel@amd.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-vuwxig36mzprl5n7nzvnxxsh@git.kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Stephane Eranian authored
The perf_event_attr size needs to be initialized in all cases because it captures the ABI version. This patch moves the initialization of the field from the perf_event_open() syscall stub to its proper location in the event_attr_init(). Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20120209151238.GA10272@quadSigned-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Robert Richter authored
There is individual code for each feature to process header sections. Adding a function pointer .process to struct feature_ops for keeping the implementation in separate functions. Code to process header sections is now a generic function. Cc: Ingo Molnar <mingo@elte.hu> Link: http://lkml.kernel.org/r/1328884916-5901-2-git-send-email-robert.richter@amd.comSigned-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Robert Richter authored
Needed for later changes. No modified functionality. Cc: Ingo Molnar <mingo@elte.hu> Link: http://lkml.kernel.org/r/1328884916-5901-1-git-send-email-robert.richter@amd.comSigned-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Jiri Olsa authored
Adding implementation os bitmap_or function to the bitmap object. It is stolen from the kernel lib/bitmap.o object. It is used in upcomming patches. Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1327674868-10486-5-git-send-email-jolsa@redhat.comSigned-off-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Jiri Olsa authored
Adding sysfs object to provide sysfs mount information in the same way as debugfs object does. The object provides following function: sysfs_find_mountpoint which returns the sysfs mount mount. Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1327674868-10486-4-git-send-email-jolsa@redhat.comSigned-off-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Jiri Olsa authored
Following debugfs object functions are not referenced within the code: int debugfs_valid_entry(const char *path); int debugfs_umount(void); int debugfs_write(const char *entry, const char *value); int debugfs_read(const char *entry, char *buffer, size_t size); void debugfs_force_cleanup(void); int debugfs_make_path(const char *element, char *buffer, int size); Removing them. Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1327674868-10486-3-git-send-email-jolsa@redhat.comSigned-off-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Namhyung Kim authored
The ctype.h in symbol.c was needed because of isupper(). However we now have it in util.h, it can be changed to use our implementation. Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1328836217-9118-3-git-send-email-namhyung.kim@lge.comSigned-off-by: Namhyung Kim <namhyung.kim@lge.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Namhyung Kim authored
The implementation of sane ctype macros only depends on symbols in util.h not cache.h. Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1328836217-9118-2-git-send-email-namhyung.kim@lge.comSigned-off-by: Namhyung Kim <namhyung.kim@lge.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Namhyung Kim authored
The util.h header provides various ctype macros but lacks those two. Add them. Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1328836217-9118-1-git-send-email-namhyung.kim@lge.comSigned-off-by: Namhyung Kim <namhyung.kim@lge.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Joerg Roedel authored
Setting perf_guest to true by default makes no sense because the perf subcommands can not setup guest symbol information and thus not process and guest samples. The only exception is perf-kvm which changes the perf_guest value on its own. So change the default for perf_guest back to false. Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jason Wang <jasowang@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1328893505-4115-3-git-send-email-joerg.roedel@amd.comSigned-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Joerg Roedel authored
The perf sample processing code relies on a valid machine object. Make sure that this path is only entered when such a object exists. A counter for samples where no machine object exits is also introduced to give the user a message about these samples. Reported-by: David Ahern <dsahern@gmail.com> Reported-by: Jason Wang <jasowang@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jason Wang <jasowang@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1328893505-4115-2-git-send-email-joerg.roedel@amd.comSigned-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
David Ahern authored
Allow a user to collect events for multiple threads or processes using a comma separated list. e.g., collect data on a VM and its vhost thread: perf top -p 21483,21485 perf stat -p 21483,21485 -ddd perf record -p 21483,21485 or monitoring vcpu threads perf top -t 21488,21489 perf stat -t 21488,21489 -ddd perf record -t 21488,21489 Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1328718772-16688-1-git-send-email-dsahern@gmail.comSigned-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
David Ahern authored
For latest tip/perf/core tree Compiles are failing on: GEN common-cmds.h make: *** No rule to make target `../../arch/x86/lib/memset_64.S', needed by `builtin-annotate.o'. Stop. Resolve by adding memset.* to the tar file. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1329145057-26302-1-git-send-email-dsahern@gmail.comSigned-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 13 Feb, 2012 1 commit
-
-
Namhyung Kim authored
The perf python extention (perf.so) file lacks its dependencies in the Makefile so that it cannot be refreshed if one of source files it depends is changed. Fix it by putting them in a separate file and processing it in both of Makefile and setup.py. Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1329043524-12470-1-git-send-email-namhyung@gmail.comSigned-off-by: Namhyung Kim <namhyung@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 11 Feb, 2012 4 commits
-
-
Masami Hiramatsu authored
Fix to decode grouped AVX with VEX pp bits which should be handled as same as last-prefixes. This fixes below warnings in posttest with CONFIG_CRYPTO_SHA1_SSSE3=y. Warning: arch/x86/tools/test_get_len found difference at <sha1_transform_avx>:ffffffff810d5fc0 Warning: ffffffff810d6069: c5 f9 73 de 04 vpsrldq $0x4,%xmm6,%xmm0 Warning: objdump says 5 bytes, but insn_get_length() says 4 ... With this change, test_get_len can decode it correctly. $ arch/x86/tools/test_get_len -v -y ffffffff810d6069: c5 f9 73 de 04 vpsrldq $0x4,%xmm6,%xmm0 Succeed: decoded and checked 1 instructions Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: yrl.pp-manager.tt@hitachi.com Link: http://lkml.kernel.org/r/20120210053340.30429.73410.stgit@localhost.localdomainSigned-off-by: Ingo Molnar <mingo@elte.hu>
-
Fernando Luis Vázquez Cao authored
Reflect the change in the soft and hard lockup thresholds and their relation to the frequency of the hrtimer and NMI events in the code comments. While at it, remove references to files that do not exist anymore. Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Signed-off-by: Don Zickus <dzickus@redhat.com> Link: http://lkml.kernel.org/r/1328827342-6253-3-git-send-email-dzickus@redhat.comSigned-off-by: Ingo Molnar <mingo@elte.hu>
-
Fernando Luis Vázquez Cao authored
The soft and hard lockup thresholds have changed so the corresponding Kconfig entries need to be updated accordingly. Add a reference to watchdog_thresh while at it. Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Signed-off-by: Don Zickus <dzickus@redhat.com> Link: http://lkml.kernel.org/r/1328827342-6253-2-git-send-email-dzickus@redhat.comSigned-off-by: Ingo Molnar <mingo@elte.hu>
-
Fernando Luis Vázquez Cao authored
The soft and hard lockup detectors are now built on top of the hrtimer and perf subsystems. Update the documentation accordingly. Signed-off-by: Fernando Luis Vazquez Cao<fernando@oss.ntt.co.jp> Acked-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Don Zickus <dzickus@redhat.com> Link: http://lkml.kernel.org/r/1328827342-6253-1-git-send-email-dzickus@redhat.comSigned-off-by: Ingo Molnar <mingo@elte.hu>
-
- 09 Feb, 2012 2 commits
-
-
David Ahern authored
A recent refactoring of perf-record introduced the following: perf record -a -B Couldn't generating buildids. Use --no-buildid to profile anyway. sleep: Terminated I believe the triple negative was meant to be only a double negative. :-) While I'm there, fixed the grammar on the error message. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Richter <robert.richter@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1328567272-13190-1-git-send-email-dsahern@gmail.comSigned-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Stephane Eranian authored
The current version of perf detects whether or not the perf.data file is written in a different endianness using the attr_size field in the header of the file. This field represents sizeof(struct perf_event_attr) as known to perf record. If the sizes do not match, then perf tries the byte-swapped version. If they match, then the tool assumes a different endianness. The issue with the approach is that it assumes the size of perf_event_attr always has to match between perf record and perf report. However, the kernel perf_event ABI is extensible. New fields can be added to struct perf_event_attr. Consequently, it is not possible to use attr_size to detect endianness. This patch takes another approach by using the magic number written at the beginning of the perf.data file to detect endianness. The magic number is an eight-byte signature. It's primary purpose is to identify (signature) a perf.data file. But it could also be used to encode the endianness. The patch introduces a new value for this signature. The key difference is that the signature is written differently in the file depending on the endianness. Thus, by comparing the signature from the file with the tool's own signature it is possible to detect endianness. The new signature is "PERFILE2". Backward compatiblity with existing perf.data file is ensured. Tested-by: David Ahern <dsahern@gmail.com> Acked-by: David Ahern <dsahern@gmail.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com> Cc: Arun Sharma <asharma@fb.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Lin Ming <ming.m.lin@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Roberto Agostino Vitillo <ravitillo@lbl.gov> Cc: Robert Richter <robert.richter@amd.com> Cc: Vince Weaver <vweaver1@eecs.utk.edu> Link: http://lkml.kernel.org/r/1328187288-24395-15-git-send-email-eranian@google.comSigned-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 07 Feb, 2012 3 commits
-
-
Borislav Petkov authored
Stephane Eranian reported that doing a scheduler latency measurements with perf on AMD doesn't work out as expected due to the fact that the sched_clock() granularity is too coarse, i.e. done in jiffies due to the sched_clock_stable not set, which, if set, would mean that we get to use the TSC as sample source which would give us much higher precision. However, there's no reason not to set sched_clock_stable on AMD because all families from F10h and upwards do have an invariant TSC and have the CPUID flag to prove (CPUID_8000_0007_EDX[8]). Make it so, #1. Signed-off-by: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@amd64.org> Cc: Venki Pallipadi <venki@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Robert Richter <robert.richter@amd.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Link: http://lkml.kernel.org/r/20120206132546.GA30854@quad [ Should any non-standard system break the TSC, we should mark them so explicitly, in their platform init handler, or in a DMI quirk. ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Ingo Molnar authored
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core perf/core fixes and improvements. Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Ingo Molnar authored
Linux 3.3-rc2 Pick up the latest fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 06 Feb, 2012 8 commits
-
-
Namhyung Kim authored
The output of cpu-clock event is controlled in nsec_printout(), but its alignment was broken: Performance counter stats for 'sleep 1': 6,038,774 instructions # 0.00 insns per cycle 180 faults # 0.007 K/sec [99.95%] 1,282,201 branches # 0.053 M/sec [99.84%] 24126.221811 cpu-clock [99.62%] 24121.689540 task-clock # 24.098 CPUs utilized [99.52%] 1.001001017 seconds time elapsed This patch fixes this: Performance counter stats for 'sleep 1': 13,540,843 instructions # 0.00 insns per cycle 180 faults # 0.007 K/sec [99.94%] 2,875,386 branches # 0.119 M/sec [99.82%] 24144.221137 cpu-clock [99.61%] 24133.515366 task-clock # 24.109 CPUs utilized [99.52%] 1.001020946 seconds time elapsed Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1328514285-26232-2-git-send-email-namhyung.kim@lge.comSigned-off-by: Namhyung Kim <namhyung.kim@lge.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Namhyung Kim authored
The default 'M/sec' unit is not useful if the result is small enough. Adjust it dynamically according to the value. Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1328514285-26232-1-git-send-email-namhyung.kim@lge.comSigned-off-by: Namhyung Kim <namhyung.kim@lge.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Franck Bui-Huu authored
Currently we can put the object files in a different directory by using 'O=' comand line argument. However the generated documentation files don't honor this directive, This patch fixes that. It's been tested for man target but the others seems currently broken so no tests have been done on them so far. Link: http://lkml.kernel.org/r/1328541443-18003-1-git-send-email-fbuihuu@gmail.comSigned-off-by: Franck Bui-Huu <fbuihuu@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Jiri Olsa authored
By adding following objects: bench/mem-memset-x86-64-asm.o bench/mem-memcpy-x86-64-asm.o the x86_64 perf binary ended up with executable stack. The reason was that above objects are assembler sourced and are missing the GNU-stack note section. In such case the linker assumes that the final binary should not be restricted at all and mark the stack as RWX. Adding section ".note.GNU-stack" definition to mentioned objects, with all flags disabled, thus omiting those objects from linker stack flags decision. Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=783570Reported-by: Clark Williams <williams@redhat.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1328100848-5630-1-git-send-email-jolsa@redhat.comSigned-off-by: Jiri Olsa <jolsa@redhat.com> [ committer note: Remaining bits after what was already added to perf/urgent ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
So that we can get the perf bench exec stack fixes and then apply the remaining fix for the files added after what is in perf/urgent. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Naveen N. Rao authored
This patch fixes an issue where perf report shows nan% for certain perf.data files. The below is from a report for a do_fork probe: -nan% sshd [kernel.kallsyms] [k] do_fork -nan% packagekitd [kernel.kallsyms] [k] do_fork -nan% dbus-daemon [kernel.kallsyms] [k] do_fork -nan% bash [kernel.kallsyms] [k] do_fork A git bisect shows commit f3bda2c9 as the cause. However, looking back through the git history, I saw commit 640c03ce which seems to have removed the required initialization for perf_sample->period. The problem only started showing after commit f3bda2c9. The below patch re-introduces the initialization and it fixes the problem for me. With the below patch, for the same perf.data: 73.08% bash [kernel.kallsyms] [k] do_fork 8.97% 11-dhclient [kernel.kallsyms] [k] do_fork 6.41% sshd [kernel.kallsyms] [k] do_fork 3.85% 20-chrony [kernel.kallsyms] [k] do_fork 2.56% sendmail [kernel.kallsyms] [k] do_fork This patch applies over current linux-tip commit 9949284. Problem introduced in: $ git describe 640c03ce v2.6.37-rc3-83-g640c03ce Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Robert Richter <robert.richter@amd.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: stable@kernel.org Link: http://lkml.kernel.org/r/20120203170113.5190.25558.stgit@localhost6.localdomain6Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Jiri Olsa authored
In some perf ancient versions we used '[kernel.kallsyms._text]' as the name for the kernel map. This got changed with commit: perf: 'perf kvm' tool for monitoring guest performance from host commit a1645ce1 Author: Zhang, Yanmin <yanmin_zhang@linux.intel.com> and we started to use following name '[kernel.kallsyms]_text'. This name change is important for the report code dealing with ancient perf data. When processing the kernel map event, we need to recognize the old naming (dont match the last ']') and initialize the kernel map correctly. The subsequent call to maps__set_kallsyms_ref_reloc_sym deals with the superfluous ']' to get correct symbol name. Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1328461865-6127-1-git-send-email-jolsa@redhat.comSigned-off-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Jiri Olsa authored
By adding following objects: bench/mem-memcpy-x86-64-asm.o the x86_64 perf binary ended up with executable stack. The reason was that above object are assembler sourced and is missing the GNU-stack note section. In such case the linker assumes that the final binary should not be restricted at all and mark the stack as RWX. Adding section ".note.GNU-stack" definition to mentioned object, with all flags disabled, thus omiting this object from linker stack flags decision. Problem introduced in: $ git describe ea7872b9 v2.6.37-rc2-19-gea7872b9 Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=783570Reported-by: Clark Williams <williams@redhat.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: stable@kernel.org Link: http://lkml.kernel.org/r/1328100848-5630-1-git-send-email-jolsa@redhat.comSigned-off-by: Jiri Olsa <jolsa@redhat.com> [ committer note: Backported fix to perf/urgent (3.3-rc2+) ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 03 Feb, 2012 1 commit
-
-
Stephane Eranian authored
With the new throttling/unthrottling code introduced with commit: e050e3f0 ("perf: Fix broken interrupt rate throttling") we occasionally hit two WARN_ON_ONCE() checks in: - intel_pmu_pebs_enable() - intel_pmu_lbr_enable() - x86_pmu_start() The assertions are no longer problematic. There is a valid path where they can trigger but it is harmless. The assertion can be triggered with: $ perf record -e instructions:pp .... Leading to paths: intel_pmu_pebs_enable intel_pmu_enable_event x86_perf_event_set_period x86_pmu_start perf_adjust_freq_unthr_context perf_event_task_tick scheduler_tick And: intel_pmu_lbr_enable intel_pmu_enable_event x86_perf_event_set_period x86_pmu_start perf_adjust_freq_unthr_context. perf_event_task_tick scheduler_tick cpuc->enabled is always on because when we get to perf_adjust_freq_unthr_context() the PMU is not totally disabled. Furthermore when we need to adjust a period, we only stop the event we need to change and not the entire PMU. Thus, when we re-enable, cpuc->enabled is already set. Note that when we stop the event, both pebs and lbr are stopped if necessary (and possible). Signed-off-by: Stephane Eranian <eranian@google.com> Cc: peterz@infradead.org Link: http://lkml.kernel.org/r/20120202110401.GA30911@quadSigned-off-by: Ingo Molnar <mingo@elte.hu>
-
- 02 Feb, 2012 4 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-clientLinus Torvalds authored
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: rbd: fix safety of rbd_put_client() rbd: fix a memory leak in rbd_get_client() ceph: create a new session lock to avoid lock inversion ceph: fix length validation in parse_reply_info() ceph: initialize client debugfs outside of monc->mutex ceph: change "ceph.layout" xattr to be "ceph.file.layout"
-
Josh Triplett authored
Signed-off-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Alex Elder authored
The rbd_client structure uses a kref to arrange for cleaning up and freeing an instance when its last reference is dropped. The cleanup routine is rbd_client_release(), and one of the things it does is delete the rbd_client from rbd_client_list. It acquires node_lock to do so, but the way it is done is still not safe. The problem is that when attempting to reuse an existing rbd_client, the structure found might already be in the process of getting destroyed and cleaned up. Here's the scenario, with "CLIENT" representing an existing rbd_client that's involved in the race: Thread on CPU A | Thread on CPU B --------------- | --------------- rbd_put_client(CLIENT) | rbd_get_client() kref_put() | (acquires node_lock) kref->refcount becomes 0 | __rbd_client_find() returns CLIENT calls rbd_client_release() | kref_get(&CLIENT->kref); | (releases node_lock) (acquires node_lock) | deletes CLIENT from list | ...and starts using CLIENT... (releases node_lock) | and frees CLIENT | <-- but CLIENT gets freed here Fix this by having rbd_put_client() acquire node_lock. The result could still be improved, but at least it avoids this problem. Signed-off-by: Alex Elder <elder@dreamhost.com> Signed-off-by: Sage Weil <sage@newdream.net>
-
Christopher Yeoh authored
This fixes the race in process_vm_core found by Oleg (see http://article.gmane.org/gmane.linux.kernel/1235667/ for details). This has been updated since I last sent it as the creation of the new mm_access() function did almost exactly the same thing as parts of the previous version of this patch did. In order to use mm_access() even when /proc isn't enabled, we move it to kernel/fork.c where other related process mm access functions already are. Signed-off-by: Chris Yeoh <yeohc@au1.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-