- 14 Oct, 2008 40 commits
-
-
Frederic Weisbecker authored
When one try to set a nonexistent tracer, no error is returned as if the name of the tracer was correct. We should return -EINVAL. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Steven Rostedt authored
Now that the ring buffer is reentrant, some of the ftrace tracers (sched_swich, debugging traces) can also be reentrant. Note: Never make the function tracer reentrant, that can cause recursion problems all over the kernel. The function tracer must disable reentrancy. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Steven Rostedt authored
This patch replaces the local_irq_save/restore with preempt_disable/ enable. This allows for interrupts to enter while recording. To write to the ring buffer, you must reserve data, and then commit it. During this time, an interrupt may call a trace function that will also record into the buffer before the commit is made. The interrupt will reserve its entry after the first entry, even though the first entry did not finish yet. The time stamp delta of the interrupt entry will be zero, since in the view of the trace, the interrupt happened during the first field anyway. Locking still takes place when the tail/write moves from one page to the next. The reader always takes the locks. A new page pointer is added, called the commit. The write/tail will always point to the end of all entries. The commit field will point to the last committed entry. Only this commit entry may update the write time stamp. The reader can only go up to the commit. It cannot go past it. If a lot of interrupts come in during a commit that fills up the buffer, and it happens to make it all the way around the buffer back to the commit, then a warning is printed and new events will be dropped. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Steven Rostedt authored
Remove the global head and tail indexes and move them into the page header. Each page will now keep track of where the last write and read was made. We also rename the head and tail to read and write for better clarification. This patch is needed for future enhancements to move the ring buffer to a lockless solution. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Frederic Weisbecker authored
At this time, only built-in initcalls interest us. We can't really produce a relevant graph if we include the modules initcall too. I had good results after this patch (see svg in attachment). Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Steven Rostedt authored
The assigning of the pc counter is in the wrong spot in the check_critical_timing function. The pc variable is used in the out jump. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Steven Rostedt authored
My original patch had a compile bug when NUMA was configured. I referenced cpu when it should have been cpu_buffer->cpu. Ingo quickly fixed this bug by replacing cpu with 'i' because that was the loop counter. Unfortunately, the 'i' was the counter of pages, not CPUs. This caused a crash when the number of pages allocated for the buffers exceeded the number of pages, which would usually be the case. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Steven Noonan authored
Including <linux/ktime.h> eliminates the following error: include/linux/ftrace.h:220: error: expected specifier-qualifier-list before 'ktime_t' Signed-off-by: Steven Noonan <steven@uplinklabs.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Ingo Molnar authored
fix: In file included from kernel/sysctl.c:52: include/linux/ftrace.h:217: error: 'KSYM_NAME_LEN' undeclared here (not in a function) Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Frederic Weisbecker authored
After some initcall traces, some initcall names may be inconsistent. That's because these functions will disappear from the .init section and also their name from the symbols table. So we have to copy the name of the function in a buffer large enough during the trace appending. It is not costly for the ring_buffer because the number of initcall entries is commonly not really large. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Frederic Weisbecker authored
Change the boot tracer printing to make it parsable for the scripts/bootgraph.pl script. We have now to output two lines for each initcall, according to the printk in do_one_initcall() in init/main.c We need now the call's time and the return's time. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Ingo Molnar authored
fix: kernel/trace/ring_buffer.c: In function ‘rb_allocate_pages’: kernel/trace/ring_buffer.c:235: error: ‘cpu’ undeclared (first use in this function) kernel/trace/ring_buffer.c:235: error: (Each undeclared identifier is reported only once kernel/trace/ring_buffer.c:235: error: for each function it appears in.) Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Steven Rostedt authored
With the new ring buffer infrastructure in ftrace, I'm trying to make ftrace a little more light weight. This patch converts a lot of the local_irq_save/restore into preempt_disable/enable. The original preempt count in a lot of cases has to be sent in as a parameter so that it can be recorded correctly. Some places were recording it incorrectly before anyway. This is also laying the ground work to make ftrace a little bit more reentrant, and remove all locking. The function tracers must still protect from reentrancy. Note: All the function tracers must be careful when using preempt_disable. It must do the following: resched = need_resched(); preempt_disable_notrace(); [...] if (resched) preempt_enable_no_resched_notrace(); else preempt_enable_notrace(); The reason is that if this function traces schedule() itself, the preempt_enable_notrace() will cause a schedule, which will lead us into a recursive failure. If we needed to reschedule before calling preempt_disable, we should have already scheduled. Since we did not, this is most likely that we should not and are probably inside a schedule function. If resched was not set, we still need to catch the need resched flag being set when preemption was off and the if case at the end will catch that for us. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Steven Rostedt authored
The current method of overlaying the page frame as the buffer page pointer can be very dangerous and limits our ability to do other things with a page from the buffer, like send it off to disk. This patch allocates the buffer_page instead of overlaying the page's page frame. The use of the buffer_page has hardly changed due to this. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Steven Rostedt authored
The mmiotrace map had a bug that would typecast the entry from the trace to the wrong type. That is a known danger of C typecasts, there's absolutely zero checking done on them. Help that problem a bit by using a GCC extension to implement a type filter that restricts the types that a trace record can be cast into, and by adding a dynamic check (in debug mode) to verify the type of the entry. This patch adds a macro to assign all entries of ftrace using the type of the variable and checking the entry id. The typecasts are now done in the macro for only those types that it knows about, which should be all the types that are allowed to be read from the tracer. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Frederic Weisbecker authored
Correct the value's type of trace_empty function Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Steven Rostedt authored
The old "lock always" scheme had issues with lockdep, and was not very efficient anyways. This patch does a new design to be partially lockless on writes. Writes will add new entries to the per cpu pages by simply disabling interrupts. When a write needs to go to another page than it will grab the lock. A new "read page" has been added so that the reader can pull out a page from the ring buffer to read without worrying about the writer writing over it. This allows us to not take the lock for all reads. The lock is now only taken when a read needs to go to a new page. This is far from lockless, and interrupts still need to be disabled, but it is a step towards a more lockless solution, and it also solves a lot of the issues that were noticed by the first conversion of ftrace to the ring buffers. Note: the ring_buffer_{un}lock API has been removed. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Steven Rostedt authored
The raw_local_irq_save causes issues with lockdep. We don't need it so replace them with local_irq_save. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Frederic Weisbecker authored
This patch adapts the boot tracer to the new type of the print_line callback. It still relays entries it doesn't support to default output functions. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Pekka Paalanen <pq@iki.fi> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Frederic Weisbecker authored
Adapt mmiotrace to the new print_line type. By default, it ignores (and consumes) types it doesn't support. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Pekka Paalanen <pq@iki.fi> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Pekka Paalanen authored
This patch fixes a bug which break the pipe when the seq is empty. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Frederic Weisbecker authored
We need a kind of disambiguation when a print_line callback returns 0. _There is not enough space to print all the entry. Please flush the seq and retry. _I can't handle this type of entry This patch changes the type of this callback for better information. Also some changes have been made in this V2. _ Only relay to default functions after the print_line callback fails. _ This patch doesn't fix the issue with the broken pipe (see patch 2/4 for that) Some things are still in discussion: _ Find better names for the enum print_line_t values _ Change the type of print_trace_line into boolean. Patches to change that can be sent later. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Pekka Paalanen <pq@iki.fi> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Steven Rostedt authored
Now that the underlining ring buffer for ftrace now hold variable length entries, we can take advantage of this by only storing the size of the actual event into the buffer. This happens to increase the number of entries in the buffer dramatically. We can also get rid of the "trace_cont" operation, but I'm keeping that until we have no more users. Some of the ftrace tracers can now change their code to adapt to this new feature. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Steven Rostedt authored
This patch ports ftrace over to the new ring buffer. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Steven Rostedt authored
Mathieu Desnoyers pointed out that the freeing of the page frame needs to be reset otherwise we might trigger BUG_ON in the page free code. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Steven Rostedt authored
If for some strange reason the buffer_page gets bigger, or the page struct gets smaller, I want to know this ASAP. The best way is to not let the kernel compile. This patch adds code to test the size of the struct buffer_page against the page struct and will cause compile issues if the buffer_page ever gets bigger than the page struct. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Steven Rostedt authored
This is a unified tracing buffer that implements a ring buffer that hopefully everyone will eventually be able to use. The events recorded into the buffer have the following structure: struct ring_buffer_event { u32 type:2, len:3, time_delta:27; u32 array[]; }; The minimum size of an event is 8 bytes. All events are 4 byte aligned inside the buffer. There are 4 types (all internal use for the ring buffer, only the data type is exported to the interface users). RINGBUF_TYPE_PADDING: this type is used to note extra space at the end of a buffer page. RINGBUF_TYPE_TIME_EXTENT: This type is used when the time between events is greater than the 27 bit delta can hold. We add another 32 bits, and record that in its own event (8 byte size). RINGBUF_TYPE_TIME_STAMP: (Not implemented yet). This will hold data to help keep the buffer timestamps in sync. RINGBUF_TYPE_DATA: The event actually holds user data. The "len" field is only three bits. Since the data must be 4 byte aligned, this field is shifted left by 2, giving a max length of 28 bytes. If the data load is greater than 28 bytes, the first array field holds the full length of the data load and the len field is set to zero. Example, data size of 7 bytes: type = RINGBUF_TYPE_DATA len = 2 time_delta: <time-stamp> - <prev_event-time-stamp> array[0..1]: <7 bytes of data> <1 byte empty> This event is saved in 12 bytes of the buffer. An event with 82 bytes of data: type = RINGBUF_TYPE_DATA len = 0 time_delta: <time-stamp> - <prev_event-time-stamp> array[0]: 84 (Note the alignment) array[1..14]: <82 bytes of data> <2 bytes empty> The above event is saved in 92 bytes (if my math is correct). 82 bytes of data, 2 bytes empty, 4 byte header, 4 byte length. Do not reference the above event struct directly. Use the following functions to gain access to the event table, since the ring_buffer_event structure may change in the future. ring_buffer_event_length(event): get the length of the event. This is the size of the memory used to record this event, and not the size of the data pay load. ring_buffer_time_delta(event): get the time delta of the event This returns the delta time stamp since the last event. Note: Even though this is in the header, there should be no reason to access this directly, accept for debugging. ring_buffer_event_data(event): get the data from the event This is the function to use to get the actual data from the event. Note, it is only a pointer to the data inside the buffer. This data must be copied to another location otherwise you risk it being written over in the buffer. ring_buffer_lock: A way to lock the entire buffer. ring_buffer_unlock: unlock the buffer. ring_buffer_alloc: create a new ring buffer. Can choose between overwrite or consumer/producer mode. Overwrite will overwrite old data, where as consumer producer will throw away new data if the consumer catches up with the producer. The consumer/producer is the default. ring_buffer_free: free the ring buffer. ring_buffer_resize: resize the buffer. Changes the size of each cpu buffer. Note, it is up to the caller to provide that the buffer is not being used while this is happening. This requirement may go away but do not count on it. ring_buffer_lock_reserve: locks the ring buffer and allocates an entry on the buffer to write to. ring_buffer_unlock_commit: unlocks the ring buffer and commits it to the buffer. ring_buffer_write: writes some data into the ring buffer. ring_buffer_peek: Look at a next item in the cpu buffer. ring_buffer_consume: get the next item in the cpu buffer and consume it. That is, this function increments the head pointer. ring_buffer_read_start: Start an iterator of a cpu buffer. For now, this disables the cpu buffer, until you issue a finish. This is just because we do not want the iterator to be overwritten. This restriction may change in the future. But note, this is used for static reading of a buffer which is usually done "after" a trace. Live readings would want to use the ring_buffer_consume above, which will not disable the ring buffer. ring_buffer_read_finish: Finishes the read iterator and reenables the ring buffer. ring_buffer_iter_peek: Look at the next item in the cpu iterator. ring_buffer_read: Read the iterator and increment it. ring_buffer_iter_reset: Reset the iterator to point to the beginning of the cpu buffer. ring_buffer_iter_empty: Returns true if the iterator is at the end of the cpu buffer. ring_buffer_size: returns the size in bytes of each cpu buffer. Note, the real size is this times the number of CPUs. ring_buffer_reset_cpu: Sets the cpu buffer to empty ring_buffer_reset: sets all cpu buffers to empty ring_buffer_swap_cpu: swaps a cpu buffer from one buffer with a cpu buffer of another buffer. This is handy when you want to take a snap shot of a running trace on just one cpu. Having a backup buffer, to swap with facilitates this. Ftrace max latencies use this. ring_buffer_empty: Returns true if the ring buffer is empty. ring_buffer_empty_cpu: Returns true if the cpu buffer is empty. ring_buffer_record_disable: disable all cpu buffers (read only) ring_buffer_record_disable_cpu: disable a single cpu buffer (read only) ring_buffer_record_enable: enable all cpu buffers. ring_buffer_record_enabl_cpu: enable a single cpu buffer. ring_buffer_entries: The number of entries in a ring buffer. ring_buffer_overruns: The number of entries removed due to writing wrap. ring_buffer_time_stamp: Get the time stamp used by the ring buffer ring_buffer_normalize_time_stamp: normalize the ring buffer time stamp into nanosecs. I still need to implement the GTOD feature. But we need support from the cpu frequency infrastructure. But this can be done at a later time without affecting the ring buffer interface. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Steven Rostedt authored
It is possible that the testing thread in the ftrace wakeup test does not run before we stop the trace. This will cause the trace to fail since nothing will be in the buffers. This patch adds a small wait in the wakeup test to allow for the woken task to run and be traced. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Frédéric Weisbecker authored
When the boot tracer can't handle an entry output, it returns 1. It should return 0 to relay on other output functions. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Frédéric Weisbecker authored
The tracing engine resets the ring buffer and the tracers touch it too during self-tests. These self-tests happen during tracers registering and work against boot tracing which is logging initcalls. We have to disable tracing self-tests if the boot-tracer is selected. Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Frédéric Weisbecker authored
Launch the boot tracing inside the initcall_debug area. Old printk have not been removed to keep the old way of initcall tracing for backward compatibility. [ mingo@elte.hu: resolved conflicts ] Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Frédéric Weisbecker authored
Bring the entry to choose the boot tracer on the kernel config. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Frédéric Weisbecker authored
The tracing engine have now to be init in early_initcall to set the boot tracer. Only the debugfs settings will be initialized at fs_initcall time. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Frédéric Weisbecker authored
Add the boot/initcall tracer. It's primary purpose is to be able to trace the initcalls. It is intended to be used with scripts/bootgraph.pl after some small improvements. Note that it is not active after its init. To avoid tracing (and so crashing) before the whole tracing engine init, you have to explicitly call start_boot_trace() after do_pre_smp_initcalls() to enable it. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Arjan van de Ven authored
When optimizing the kernel boot time, it's very valuable to visualize what is going on at which time. In addition, with the fastboot asynchronous initcall level, it's very valuable to see which initcall gets run where and when. This patch adds a script to turn a dmesg into a SVG graph (that can be shown with tools such as InkScape, Gimp or Firefox) and a small change to the initcall code to print the PID of the thread calling the initcall (so that the script can work out the parallelism). Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
-
Lai Jiangshan authored
bit-field is not thread-safe nor smp-safe. struct marker_entry.rcu_pending is not protected by any lock in rcu-callback free_old_closure(). so we must turn it into a safe type. detail: I suppose rcu_pending and ptype are store in struct marker_entry.tmp1 free_old_closure() side: change ptype side: | load struct marker_entry.tmp1 --------------------------------|-------------------------------- | change ptype bit in tmp1 load struct marker_entry.tmp1 | change rcu_pending bit in tmp1 | store tmp1 | --------------------------------|-------------------------------- | store tmp1 now this result equals that free_old_closure() do not change rcu_pending bit, bug! This bug will cause redundant rcu_barrier_sched() called. not too harmful. ----- corresponding: free_old_closure() side: change ptype side: load struct marker_entry.tmp1 | --------------------------------|-------------------------------- | load struct marker_entry.tmp1 change rcu_pending bit in tmp1 | | change ptype bit in tmp1 | store tmp1 --------------------------------|-------------------------------- store tmp1 | now this result equals that change ptype side do not change ptype bit, bug! this bug cause marker_probe_cb() access to invalid memory. oops! see also: http://en.wikipedia.org/wiki/Bit_fieldSigned-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Lai Jiangshan authored
when the second, third... probe is registered, its format is not checked, this patch fixes it. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Mathieu Desnoyers authored
Turn marker synchronize unregister into a static inline. There is no reason to keep it as a macro over a static inline. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Mathieu Desnoyers authored
Lai Jiangshan discovered a reentrancy issue with markers and fixed it by adding synchronize_sched() calls at each registration/unregistraiton. It works, but it removes the ability to do batch registration/unregistration and can cause registration of ~100 markers to take about 30 seconds on a loaded machine (synchronize_sched() is much slower on such workloads). This patch implements a version of the fix which won't slow down marker batch registration/unregistration. It also go back to the original non-synchronized reg/unreg. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Mathieu Desnoyers authored
We need a marker_synchronize_unregister() before the end of exit() to make sure every probe callers have exited the non preemptible section and thus are not executing the probe code anymore. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Acked-by: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-