- 21 Jul, 2010 5 commits
-
-
Mike Frysinger authored
Add more details to the dynamic function tracing design implementation. Signed-off-by: Mike Frysinger <vapier@gentoo.org> LKML-Reference: <1279610015-10250-1-git-send-email-vapier@gentoo.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
KOSAKI Motohiro authored
Documentation/trace/ftrace.txt says buffer_size_kb: This sets or displays the number of kilobytes each CPU buffer can hold. The tracer buffers are the same size for each CPU. The displayed number is the size of the CPU buffer and not total size of all buffers. The trace buffers are allocated in pages (blocks of memory that the kernel uses for allocation, usually 4 KB in size). If the last page allocated has room for more bytes than requested, the rest of the page will be used, making the actual allocation bigger than requested. ( Note, the size may not be a multiple of the page size due to buffer management overhead. ) This can only be updated when the current_tracer is set to "nop". But it's incorrect. currently total memory consumption is 'buffer_size_kb x CPUs x 2'. Why two times difference is there? because ftrace implicitly allocate the buffer for max latency too. That makes sad result when admin want to use large buffer. (If admin want full logging and makes detail analysis). example, If admin have 24 CPUs machine and write 200MB to buffer_size_kb, the system consume ~10GB memory (200MB x 24 x 2). umm.. 5GB memory waste is usually unacceptable. Fortunatelly, almost all users don't use max latency feature. The max latency buffer can be disabled easily. This patch shrink buffer size of the max latency buffer if unnecessary. Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> LKML-Reference: <20100701104554.DA2D.A69D9226@jp.fujitsu.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
Lai Jiangshan authored
__print_flags() and __print_symbolic() use percpu trace_seq: 1) Its memory is allocated at compile time, it wastes memory if we don't use tracing. 2) It is percpu data and it wastes more memory for multi-cpus system. 3) It disables preemption when it executes its core routine "trace_seq_printf(s, "%s: ", #call);" and introduces latency. So we move this trace_seq to struct trace_iterator. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> LKML-Reference: <4C078350.7090106@cn.fujitsu.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
Richard Kennedy authored
Reorder structure to remove 8 bytes of padding on 64 bit builds. This shrinks the size to 128 bytes so allowing allocation from a smaller slab & needed one fewer cache lines. Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk> LKML-Reference: <1269516456.2054.8.camel@localhost> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
Li Zefan authored
We found that even enabling a single trace event that will rarely be triggered can add big overhead to context switch. (lmbench context switch test) ------------------------------------------------- 2p/0K 2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K ctxsw ctxsw ctxsw ctxsw ctxsw ctxsw ctxsw ------ ------ ------ ------ ------ ------- ------- 2.19 2.3 2.21 2.56 2.13 2.54 2.07 2.39 2.51 2.35 2.75 2.27 2.81 2.24 The overhead is 6% ~ 11%. It's because when a trace event is enabled 3 tracepoints (sched_switch, sched_wakeup, sched_wakeup_new) will be activated to map pid to cmdname. We'd like to avoid this overhead, so add a trace option '(no)record-cmd' to allow to disable cmdline recording. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> LKML-Reference: <4C2D57F4.2050204@cn.fujitsu.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
- 17 Jul, 2010 4 commits
-
-
Arnaldo Carvalho de Melo authored
Introducing hists__remove_entry_filter. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
Right now ENTER doesn't always exits the newt tree widget, as it is used for expanding/collapsing branches, but with the new tree widget being developed we need to regain control to handle it, expanding/collapsing branches. In fact its really up to the ui_browser user to state what extra keys should stop ui_browser__run, and it should handle just the ones needed for basic browsing. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Ingo Molnar authored
Merge branch 'perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/core
-
- 16 Jul, 2010 3 commits
-
-
Masami Hiramatsu authored
Invert the return value of die_compare_name(), because it returns a 'bool' result which should be expeced true if the die's name is same as compared string. LKML-Reference: <4C36EBED.1000006@hitachi.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Masami Hiramatsu authored
Gcc generates DW_AT_comp_dir and stores relative source path if building kernel without O= option. In that case, perf probe --line sometimes doesn't work without --source option, because it tries to access relative source path. This adds DW_AT_comp_dir support to perf probe for finding an absolute source path when no --source option. LKML-Reference: <4C36EBE7.3060802@hitachi.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Masami Hiramatsu authored
Perf probe -L shows incorrect error message (Dwarf error) if it fails to find source file. This can confuse users. # ./perf probe -s /nowhere -L vfs_read Debuginfo analysis failed. (-2) Error: Failed to show lines. (-2) With this patch, it shows correct message. # ./perf probe -s /nowhere -L vfs_read Failed to find source file. (-2) Error: Failed to show lines. (-2) LKML-Reference: <4C36EBDB.4020308@hitachi.com> Cc: Chase Douglas <chase.douglas@canonical.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Acked-by: Chase Douglas <chase.douglas@canonical.com> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 15 Jul, 2010 3 commits
-
-
Frederic Weisbecker authored
Markers have been removed, but we forgot to remove their section. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
-
Frederic Weisbecker authored
The ksym (breakpoint) ftrace plugin has been superseded by perf tools that are much more poweful to use the cpu breakpoints. This tracer doesn't bring more feature. It has been deprecated for a while now, lets remove it. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Prasad <prasad@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@elte.hu>
-
Frederic Weisbecker authored
ftrace and perf events now use the same development branch. Don't show a stale branch to developers. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ingo Molnar <mingo@redhat.com>
-
- 12 Jul, 2010 1 commit
-
-
Matt Fleming authored
Implement get_arch_regstr() for SH so that, given a DWARF register number, the corresponding symbolic name of that register can be looked up. Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Ian Munsie <imunsie@au.ibm.com> LKML-Reference: <e55812819ad18c2ceca5651ac7698a2af46180d7.1278774279.git.matt@console-pimps.org> Signed-off-by: Matt Fleming <matt@console-pimps.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 06 Jul, 2010 1 commit
-
-
Ingo Molnar authored
Merge branch 'perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux-2.6 into perf/core
-
- 05 Jul, 2010 14 commits
-
-
Masami Hiramatsu authored
Add static and global variables support to perf probe. This allows user to trace non-local variables (and structure members) at probe points. Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20100519195749.2885.17451.stgit@localhost6.localdomain6> Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Masami Hiramatsu authored
Add array-entry tracing support to perf probe. This enables to trace an entry of array which is indexed by constant value, e.g. array[0]. For example: $ perf probe -a 'bio_split bi->bi_io_vec[0]' Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20100519195742.2885.5344.stgit@localhost6.localdomain6> Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Masami Hiramatsu authored
Support string type casting to event argument. If perf-probe finds an argument casted as string, it ensures the target variable is "(unsigned/signed) char *(or []). perf-probe also adds dereference if the target is a pointer. So, both of 'char buf[10];' and 'char *buf;' can be accessed by 'buf:string' Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20100519195734.2885.1666.stgit@localhost6.localdomain6> Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Masami Hiramatsu authored
Support string type tracing and printing in kprobe-tracer. This allows user to trace string data in kernel including __user data. Note that sometimes __user data may not be accessed if it is paged-out (sorry, but kprobes operation should be done in atomic, we can not wait for page-in). Commiter note: Fixed up conflicts with b7e2ecef. Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20100519195724.2885.18788.stgit@localhost6.localdomain6> Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Cyrill Gorcunov authored
To support cache events we have reserved the low 6 bits in hw_perf_event::config (which is a part of CCCR register configuration actually). These bits represent Replay Event mertic enumerated in enum P4_PEBS_METRIC. The caller should not care about which exact bits should be set and how -- the caller just chooses one P4_PEBS_METRIC entity and puts it into the config. The kernel will track it and set appropriate additional MSR registers (metrics) when needed. The reason for this redesign was the PEBS enable bit, which should not be set until DS (and PEBS sampling) support will be implemented properly. TODO ==== - PEBS sampling (note it's tricky and works with _one_ counter only so for HT machines it will be not that easy to handle both threads) - tracking of PEBS registers state, a user might need to turn PEBS off completely (ie no PEBS enable, no UOP_tag) but some other event may need it, such events clashes and should not run simultaneously, at moment we just don't support such events - eventually export user space bits in separate header which will allow user apps to configure raw events more conveniently. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Lin Ming <ming.m.lin@intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <1278295769.9540.15.camel@minggr.sh.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Ingo Molnar authored
Merge reason: Pick up the latest perf fixes Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Linus Torvalds authored
-
Linus Torvalds authored
* master.kernel.org:/home/rmk/linux-2.6-arm: ARM: 6205/1: perf: ensure counter delta is treated as unsigned ARM: 6202/1: Do not ARM_DMA_MEM_BUFFERABLE on RealView boards with L210/L220 ARM: 6201/1: RealView: Do not use outer_sync() on ARM11MPCore boards with L220 ARM: 6195/1: OMAP3: pmu: make CPU_HAS_PMU dependent on OMAP3_EMU ARM: 6194/1: change definition of cpu_relax() for ARM11MPCore ARM: 6193/1: RealView: Align the machine_desc.phys_io to 1MB section ARM: 6192/1: VExpress: Align the machine_desc.phys_io to 1MB section ARM: 6188/1: Add a config option for the ARM11MPCore DMA cache maintenance workaround ARM: 6187/1: The v6_dma_inv_range() function must preserve data on SMP ARM: 6186/1: Avoid the CONSISTENT_DMA_SIZE warning on noMMU builds ARM: mx3: mx31lilly: fix build error for !CONFIG_USB_ULPI [ARM] mmp: fix build failure due to IRQ_PMU depends on ARCH_PXA [ARM] pxa/mioa701: fix camera regression [ARM] pxa/z2: fix flash layout to final version [ARM] pxa/z2: fix missing include in battery driver [ARM] pxa: fix incorrect gpio type in udc_pxa2xx.h
-
Linus Torvalds authored
Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf, x86: Fix incorrect branches event on AMD CPUs perf tools: Fix find tids routine by excluding "." and ".." x86: Send a SIGTRAP for user icebp traps
-
Yehuda Sadeh authored
We should initialize the module dynamic debug datastructures only after determining that the module is not loaded yet. This fixes a bug that introduced in 2.6.35-rc2, where when a trying to load a module twice, we also load it's dynamic printing data twice which causes all sorts of nasty issues. Also handle the dynamic debug cleanup later on failure. Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (removed a #ifdef) Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
git://oss.sgi.com/xfs/xfsLinus Torvalds authored
* 'for-linus' of git://oss.sgi.com/xfs/xfs: xfs: remove block number from inode lookup code xfs: rename XFS_IGET_BULKSTAT to XFS_IGET_UNTRUSTED xfs: validate untrusted inode numbers during lookup xfs: always use iget in bulkstat xfs: prevent swapext from operating on write-only files
-
git://git.secretlab.ca/git/linux-2.6Linus Torvalds authored
* 'merge-devicetree' of git://git.secretlab.ca/git/linux-2.6: of/dma: fix build breakage in ppc4xx adma driver
-
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/i7coreLinus Torvalds authored
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/i7core: MAINTAINERS: Add an entry for i7core_edac i7core_edac: Avoid doing multiple probes for the same card i7core_edac: Properly discover the first QPI device
-
git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6Linus Torvalds authored
* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6: kbuild: Propagate LOCALVERSION= down to scripts/setlocalversion kbuild: Clean up and speed up the localversion logic
-
- 04 Jul, 2010 1 commit
-
-
Will Deacon authored
Hardware performance counters on ARM are 32-bits wide but atomic64_t variables are used to represent counter data in the hw_perf_event structure. The armpmu_event_update function right-shifts a signed 64-bit delta variable and adds the result to the event count. This can lead to shifting in sign-bits if the MSB of the 32-bit counter value is set. This results in perf output such as: Performance counter stats for 'sleep 20': 18446744073460670464 cycles <-- 0xFFFFFFFFF12A6000 7783773 instructions # 0.000 IPC 465 context-switches 161 page-faults 1172393 branches 20.154242147 seconds time elapsed This patch ensures that the delta value is treated as unsigned so that the right shift sets the upper bits to zero. Cc: <stable@kernel.org> Acked-by: Jamie Iles <jamie.iles@picochip.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
- 03 Jul, 2010 1 commit
-
-
Vince Weaver authored
While doing some performance counter validation tests on some assembly language programs I noticed that the "branches:u" count was very wrong on AMD machines. It looks like the wrong event was selected. Signed-off-by: Vince Weaver <vweaver1@eecs.utk.edu> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Robert Richter <robert.richter@amd.com> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: <stable@kernel.org> LKML-Reference: <alpine.DEB.2.00.1007011526010.23160@cl320.eecs.utk.edu> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 02 Jul, 2010 7 commits
-
-
Dan Williams authored
Convert ppc4xx adma driver to use new node pointer location Signed-off-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Anatolij Gustschin <agust@denx.de> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
-
Mauro Carvalho Chehab authored
While here, fixes the mailing list for i5400_edac Acked-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
-
Mauro Carvalho Chehab authored
As Nehalem/Nehalem-EP/Westmere devices uses several devices for the same functionality (memory controller), the default way of proping devices doesn't work. So, instead of a per-device probe, all devices should be probed at once. This means that we should block any new attempt of probe, otherwise, it will try to register the same device several times. Acked-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
-
Mauro Carvalho Chehab authored
On Nehalem/Nehalem-EP/Westmere, the first QPI device is the last PCI bus. The last bus is generally at 0x3f or 0xff, but there are also other systems using different setups. For example, HP Z800 has 0x7f as the last bus. This patch adds a logic to discover the last bus, dynamically detecting it at runtime. Acked-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
-
Linus Torvalds authored
Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: Cure nr_iowait_cpu() users init: Fix comment init, sched: Fix race between init and kthreadd
-
git://git.kernel.org/pub/scm/linux/kernel/git/bp/bpLinus Torvalds authored
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: amd64_edac: Fix syndrome calculation on K8
-
Borislav Petkov authored
When calculating the DCT channel from the syndrome we need to know the syndrome type (x4 vs x8). On F10h, this is read out from extended PCI cfg space register F3x180 while on K8 we only support x4 syndromes and don't have extended PCI config space anyway. Make the code accessing F3x180 F10h only and fall back to x4 syndromes on everything else. Cc: <stable@kernel.org> # .33.x .34.x Reported-by: Jeffrey Merkey <jeffmerkey@gmail.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
-