Commit 968ea6d8 authored by Rusty Russell

Merge ../linux-2.6-x86

Conflicts:

	arch/x86/kernel/io_apic.c
	kernel/sched.c
	kernel/sched_stats.h
parents 7be75853 8299608f
CPU Accounting Controller
-------------------------
The CPU accounting controller is used to group tasks using cgroups and
account the CPU usage of these groups of tasks.
The CPU accounting controller supports multi-hierarchy groups. An accounting
group accumulates the CPU usage of all of its child groups and the tasks
directly present in its group.
Accounting groups can be created by first mounting the cgroup filesystem.
# mkdir /cgroups
# mount -t cgroup -ocpuacct none /cgroups
With the above step, the initial or the parent accounting group
becomes visible at /cgroups. At bootup, this group includes all the
tasks in the system. /cgroups/tasks lists the tasks in this cgroup.
/cgroups/cpuacct.usage gives the CPU time (in nanoseconds) obtained by
this group which is essentially the CPU time obtained by all the tasks
in the system.
New accounting groups can be created under the parent group /cgroups.
# cd /cgroups
# mkdir g1
# echo $$ > g1/tasks
The above steps create a new group g1 and move the current shell
process (bash) into it. CPU time consumed by this bash and its children
can be obtained from g1/cpuacct.usage and the same is accumulated in
/cgroups/cpuacct.usage also.
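
For example, after the shell in g1 has done some work, the per-group and the
cumulative counts can be read back (the numbers below are illustrative):

# cat /cgroups/g1/cpuacct.usage
1863231537
# cat /cgroups/cpuacct.usage
7903247639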
@@ -82,7 +82,7 @@ of ftrace. Here is a list of some of the key files:

		tracer is not adding more data, they will display
		the same information every time they are read.

-  iter_ctrl: This file lets the user control the amount of data
+  trace_options: This file lets the user control the amount of data
		that is displayed in one of the above output
		files.
@@ -94,10 +94,10 @@ of ftrace. Here is a list of some of the key files:

		only be recorded if the latency is greater than
		the value in this file. (in microseconds)

-  trace_entries: This sets or displays the number of bytes each CPU
+  buffer_size_kb: This sets or displays the number of kilobytes each CPU
		buffer can hold. The tracer buffers are the same size
		for each CPU. The displayed number is the size of the
		CPU buffer and not total size of all buffers. The
		trace buffers are allocated in pages (blocks of memory
		that the kernel uses for allocation, usually 4 KB in size).
		If the last page allocated has room for more bytes
@@ -127,6 +127,8 @@ of ftrace. Here is a list of some of the key files:

		be traced. If a function exists in both set_ftrace_filter
		and set_ftrace_notrace, the function will _not_ be traced.

  set_ftrace_pid: Have the function tracer only trace a single thread.

  available_filter_functions: This lists the functions that ftrace
		has processed and can trace. These are the function
		names that you can pass to "set_ftrace_filter" or
@@ -316,23 +318,23 @@ The above is mostly meaningful for kernel developers.

The rest is the same as the 'trace' file.

-iter_ctrl
----------
+trace_options
+-------------

-The iter_ctrl file is used to control what gets printed in the trace
+The trace_options file is used to control what gets printed in the trace
output. To see what is available, simply cat the file:

-  cat /debug/tracing/iter_ctrl
+  cat /debug/tracing/trace_options
  print-parent nosym-offset nosym-addr noverbose noraw nohex nobin \
-  noblock nostacktrace nosched-tree
+  noblock nostacktrace nosched-tree nouserstacktrace nosym-userobj

To disable one of the options, echo in the option prepended with "no".

-  echo noprint-parent > /debug/tracing/iter_ctrl
+  echo noprint-parent > /debug/tracing/trace_options

To enable an option, leave off the "no".

-  echo sym-offset > /debug/tracing/iter_ctrl
+  echo sym-offset > /debug/tracing/trace_options

Here are the available options:
@@ -378,6 +380,20 @@ Here are the available options:

  When a trace is recorded, so is the stack of functions.
  This allows for back traces of trace sites.

  userstacktrace - This option changes the trace. It records a
		stacktrace of the current userspace thread.

  sym-userobj - when user stacktraces are enabled, look up which object the
		address belongs to, and print a relative address. This is
		especially useful when ASLR is on; otherwise you don't get
		a chance to resolve the address to object/file/line after
		the app is no longer running.

		The lookup is performed when you read trace, trace_pipe, or
		latency_trace. Example:

		a.out-1623  [000] 40874.465068: /root/a.out[+0x480] <-/root/a.out[+0x494] <- /root/a.out[+0x4a8] <- /lib/libc-2.7.so[+0x1e1a6]

  sched-tree - TBD (any users??)
@@ -1059,6 +1075,83 @@ For simple one time traces, the above is sufficient. For anything else,

a search through /proc/mounts may be needed to find where the debugfs
file-system is mounted.
Single thread tracing
---------------------
By writing into /debug/tracing/set_ftrace_pid you can trace a
single thread. For example:
# cat /debug/tracing/set_ftrace_pid
no pid
# echo 3111 > /debug/tracing/set_ftrace_pid
# cat /debug/tracing/set_ftrace_pid
3111
# echo function > /debug/tracing/current_tracer
# cat /debug/tracing/trace | head
# tracer: function
#
#           TASK-PID    CPU#    TIMESTAMP  FUNCTION
#              | |       |          |         |
yum-updatesd-3111 [003] 1637.254676: finish_task_switch <-thread_return
yum-updatesd-3111 [003] 1637.254681: hrtimer_cancel <-schedule_hrtimeout_range
yum-updatesd-3111 [003] 1637.254682: hrtimer_try_to_cancel <-hrtimer_cancel
yum-updatesd-3111 [003] 1637.254683: lock_hrtimer_base <-hrtimer_try_to_cancel
yum-updatesd-3111 [003] 1637.254685: fget_light <-do_sys_poll
yum-updatesd-3111 [003] 1637.254686: pipe_poll <-do_sys_poll
# echo -1 > /debug/tracing/set_ftrace_pid
# cat /debug/tracing/trace |head
# tracer: function
#
#           TASK-PID    CPU#    TIMESTAMP  FUNCTION
#              | |       |          |         |
##### CPU 3 buffer started ####
yum-updatesd-3111 [003] 1701.957688: free_poll_entry <-poll_freewait
yum-updatesd-3111 [003] 1701.957689: remove_wait_queue <-free_poll_entry
yum-updatesd-3111 [003] 1701.957691: fput <-free_poll_entry
yum-updatesd-3111 [003] 1701.957692: audit_syscall_exit <-sysret_audit
yum-updatesd-3111 [003] 1701.957693: path_put <-audit_syscall_exit
If you want to trace a function while it is executing, you can use
something like this simple program:
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

int main (int argc, char **argv)
{
	if (argc < 2)
		exit(-1);

	if (fork() > 0) {
		int fd, ffd;
		char line[64];
		int s;

		/* stop tracing while we attach our own pid */
		ffd = open("/debug/tracing/current_tracer", O_WRONLY);
		if (ffd < 0)
			exit(-1);
		write(ffd, "nop", 3);

		/* restrict the function tracer to this process */
		fd = open("/debug/tracing/set_ftrace_pid", O_WRONLY);
		if (fd < 0)
			exit(-1);
		s = sprintf(line, "%d\n", getpid());
		write(fd, line, s);

		/* start the function tracer and become the traced command */
		write(ffd, "function", 8);
		close(fd);
		close(ffd);

		execvp(argv[1], argv+1);
	}

	return 0;
}
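
Compiled as, say, ftrace-me (a hypothetical name) and run as "./ftrace-me ls",
the parent writes its own pid into set_ftrace_pid and then execs the given
command, so only that command is traced. As elsewhere in this document,
debugfs is assumed to be mounted at /debug.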
dynamic ftrace
--------------
@@ -1158,7 +1251,11 @@ These are the only wild cards which are supported.

  <match>*<match> will not work.

-# echo hrtimer_* > /debug/tracing/set_ftrace_filter
+Note: It is better to use quotes to enclose the wild cards, otherwise
+  the shell may expand the parameters into names of files in the local
+  directory.
+
+# echo 'hrtimer_*' > /debug/tracing/set_ftrace_filter

Produces:
@@ -1213,7 +1310,7 @@ Again, now we want to append.

# echo sys_nanosleep > /debug/tracing/set_ftrace_filter
# cat /debug/tracing/set_ftrace_filter
sys_nanosleep
-# echo hrtimer_* >> /debug/tracing/set_ftrace_filter
+# echo 'hrtimer_*' >> /debug/tracing/set_ftrace_filter
# cat /debug/tracing/set_ftrace_filter
hrtimer_run_queues
hrtimer_run_pending
@@ -1299,41 +1396,29 @@ trace entries
-------------

Having too much or not enough data can be troublesome in diagnosing
-an issue in the kernel. The file trace_entries is used to modify
+an issue in the kernel. The file buffer_size_kb is used to modify
the size of the internal trace buffers. The number listed
is the number of entries that can be recorded per CPU. To know
the full size, multiply the number of possible CPUs with the
number of entries.

-# cat /debug/tracing/trace_entries
-65620
+# cat /debug/tracing/buffer_size_kb
+1408 (units kilobytes)

Note, to modify this, you must have tracing completely disabled. To do that,
echo "nop" into the current_tracer. If the current_tracer is not set
to "nop", an EINVAL error will be returned.

# echo nop > /debug/tracing/current_tracer
-# echo 100000 > /debug/tracing/trace_entries
-# cat /debug/tracing/trace_entries
-100045
+# echo 10000 > /debug/tracing/buffer_size_kb
+# cat /debug/tracing/buffer_size_kb
+10000 (units kilobytes)

-Notice that we echoed in 100,000 but the size is 100,045. The entries
-are held in individual pages. It allocates the number of pages it takes
-to fulfill the request. If more entries may fit on the last page
-then they will be added.
-
-# echo 1 > /debug/tracing/trace_entries
-# cat /debug/tracing/trace_entries
-85
-
-This shows us that 85 entries can fit in a single page.

The number of pages which will be allocated is limited to a percentage
of available memory. Allocating too much will produce an error.

-# echo 1000000000000 > /debug/tracing/trace_entries
+# echo 1000000000000 > /debug/tracing/buffer_size_kb
-bash: echo: write error: Cannot allocate memory
-# cat /debug/tracing/trace_entries
+# cat /debug/tracing/buffer_size_kb
85
@@ -750,6 +750,14 @@ and is between 256 and 4096 characters. It is defined in the file

			parameter will force ia64_sal_cache_flush to call
			ia64_pal_cache_flush instead of SAL_CACHE_FLUSH.
ftrace=[tracer]
[ftrace] will set and start the specified tracer
as early as possible in order to facilitate early
boot debugging.
ftrace_dump_on_oops
[ftrace] will dump the trace buffers on oops.
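
	For example, adding the following to the boot command line
	starts the function tracer right away and dumps its buffers on
	an oops (the tracer name is illustrative; any registered tracer
	can be given):

		ftrace=function ftrace_dump_on_oops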
	gamecon.map[2|3]=
			[HW,JOY] Multisystem joystick and NES/SNES/PSX pad
			support via parallel port (up to 5 devices per port)
......
@@ -71,35 +71,50 @@ Look at the current lock statistics:

# less /proc/lock_stat
01 lock_stat version 0.3
02 -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
03                               class name    con-bounces    contentions   waittime-min   waittime-max waittime-total    acq-bounces   acquisitions   holdtime-min   holdtime-max holdtime-total
04 -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
05
06                          &mm->mmap_sem-W:           233            538 18446744073708       22924.27      607243.51           1342          45806           1.71        8595.89      1180582.34
07                          &mm->mmap_sem-R:           205            587 18446744073708       28403.36      731975.00           1940         412426           0.58      187825.45      6307502.88
08                          ---------------
09                            &mm->mmap_sem            487          [<ffffffff8053491f>] do_page_fault+0x466/0x928
10                            &mm->mmap_sem            179          [<ffffffff802a6200>] sys_mprotect+0xcd/0x21d
11                            &mm->mmap_sem            279          [<ffffffff80210a57>] sys_mmap+0x75/0xce
12                            &mm->mmap_sem             76          [<ffffffff802a490b>] sys_munmap+0x32/0x59
13                          ---------------
14                            &mm->mmap_sem            270          [<ffffffff80210a57>] sys_mmap+0x75/0xce
15                            &mm->mmap_sem            431          [<ffffffff8053491f>] do_page_fault+0x466/0x928
16                            &mm->mmap_sem            138          [<ffffffff802a490b>] sys_munmap+0x32/0x59
17                            &mm->mmap_sem            145          [<ffffffff802a6200>] sys_mprotect+0xcd/0x21d
18
19 ...............................................................................................................................................................................................
20
21                              dcache_lock:           621            623           0.52         118.26        1053.02           6745          91930           0.29         316.29      118423.41
22                              -----------
23                              dcache_lock            179          [<ffffffff80378274>] _atomic_dec_and_lock+0x34/0x54
24                              dcache_lock            113          [<ffffffff802cc17b>] d_alloc+0x19a/0x1eb
25                              dcache_lock             99          [<ffffffff802ca0dc>] d_rehash+0x1b/0x44
26                              dcache_lock            104          [<ffffffff802cbca0>] d_instantiate+0x36/0x8a
27                              -----------
28                              dcache_lock            192          [<ffffffff80378274>] _atomic_dec_and_lock+0x34/0x54
29                              dcache_lock             98          [<ffffffff802ca0dc>] d_rehash+0x1b/0x44
30                              dcache_lock             72          [<ffffffff802cc17b>] d_alloc+0x19a/0x1eb
31                              dcache_lock            112          [<ffffffff802cbca0>] d_instantiate+0x36/0x8a
This excerpt shows the first two lock class statistics. Line 01 shows the
output version - each time the format changes this will be updated. Lines 02-04
show the header with column descriptions. Lines 05-18 and 20-31 show the actual
statistics. These statistics come in two parts; the actual stats separated by a
short separator (lines 08, 13) from the contention points.

The first lock (05-18) is a read/write lock, and shows two lines above the
short separator. The contention points don't match the column descriptors;
they have two: contentions and [<IP>] symbol. The second set of contention
points are the points we're contending with.

The integer part of the time values is in us (microseconds).
View the top contending locks:
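
For example (a sketch; every per-class stat line contains a ':' and the file
is ordered by contention count, so head shows the worst offenders):

# grep : /proc/lock_stat | head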
......
@@ -51,11 +51,16 @@ to call) for the specific marker through marker_probe_register() and can be

activated by calling marker_arm(). Marker deactivation can be done by calling
marker_disarm() as many times as marker_arm() has been called. Removing a probe
is done through marker_probe_unregister(); it will disarm the probe.
marker_synchronize_unregister() must be called between probe unregistration and
the first occurrence of
- the end of the module exit function,
  to make sure there is no caller left using the probe;
- the freeing of any resource used by the probes,
  to make sure the probes won't be accessing invalid data.
This, and the fact that preemption is disabled around the probe call, make sure
that probe removal and module unload are safe. See the "Probe example" section
below for a sample probe module.
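
A minimal module-exit sketch of that ordering (identifiers are illustrative,
not from this patch):

	static void __exit probe_fini(void)
	{
		marker_probe_unregister("subsys_eventname",
					probe_subsys_eventname, NULL);
		/* wait until no marker site can still be calling the probe... */
		marker_synchronize_unregister();
		/* ...and only then free data the probe was using */
		kfree(probe_private_data);
	}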
The marker mechanism supports inserting multiple instances of the same marker.
Markers can be put in inline functions, inlined static functions, and
@@ -70,6 +75,20 @@ a printk warning which identifies the inconsistency:

"Format mismatch for probe probe_name (format), marker (format)"
Another way to use markers is to simply define the marker without generating any
function call to actually call into the marker. This is useful in combination
with tracepoint probes in a scheme like this:

void probe_tracepoint_name(unsigned int arg1, struct task_struct *tsk);

DEFINE_MARKER_TP(marker_eventname, tracepoint_name, probe_tracepoint_name,
	"arg1 %u pid %d");

notrace void probe_tracepoint_name(unsigned int arg1, struct task_struct *tsk)
{
	struct marker *marker = &GET_MARKER(marker_eventname);
	/* write data to trace buffers ... */
}
* Probe / marker example
......
@@ -8,7 +8,7 @@ Context switch

By default, the switch_to arch function is called with the runqueue
locked. This is usually not a problem unless switch_to may need to
take the runqueue lock. This is usually due to a wake up operation in
-the context switch. See include/asm-ia64/system.h for an example.
+the context switch. See arch/ia64/include/asm/system.h for an example.

To request the scheduler call switch_to with the runqueue unlocked,
you must `#define __ARCH_WANT_UNLOCKED_CTXSW` in a header file
@@ -23,7 +23,7 @@ disabled. Interrupts may be enabled over the call if it is likely to

introduce a significant interrupt latency by adding the line
`#define __ARCH_WANT_INTERRUPTS_ON_CTXSW` in the same place as for
unlocked context switches. This define also implies
-`__ARCH_WANT_UNLOCKED_CTXSW`. See include/asm-arm/system.h for an
+`__ARCH_WANT_UNLOCKED_CTXSW`. See arch/arm/include/asm/system.h for an
example.
......
@@ -3,28 +3,30 @@

			     Mathieu Desnoyers

This document introduces Linux Kernel Tracepoints and their use. It
provides examples of how to insert tracepoints in the kernel and
connect probe functions to them and provides some examples of probe
functions.

* Purpose of tracepoints

A tracepoint placed in code provides a hook to call a function (probe)
that you can provide at runtime. A tracepoint can be "on" (a probe is
connected to it) or "off" (no probe is attached). When a tracepoint is
"off" it has no effect, except for adding a tiny time penalty
(checking a condition for a branch) and space penalty (adding a few
bytes for the function call at the end of the instrumented function
and adding a data structure in a separate section). When a tracepoint
is "on", the function you provide is called each time the tracepoint
is executed, in the execution context of the caller. When the function
provided ends its execution, it returns to the caller (continuing from
the tracepoint site).

You can put tracepoints at important locations in the code. They are
lightweight hooks that can pass an arbitrary number of parameters,
whose prototypes are described in a tracepoint declaration placed in a
header file.

They can be used for tracing and performance accounting.
@@ -42,14 +44,16 @@ In include/trace/subsys.h :

#include <linux/tracepoint.h>

-DEFINE_TRACE(subsys_eventname,
-	TPPTOTO(int firstarg, struct task_struct *p),
+DECLARE_TRACE(subsys_eventname,
+	TPPROTO(int firstarg, struct task_struct *p),
	TPARGS(firstarg, p));

In subsys/file.c (where the tracing statement must be added):

#include <trace/subsys.h>

DEFINE_TRACE(subsys_eventname);

void somefct(void)
{
	...
@@ -61,31 +65,41 @@ Where :

- subsys_eventname is an identifier unique to your event
- subsys is the name of your subsystem.
- eventname is the name of the event to trace.
- TPPROTO(int firstarg, struct task_struct *p) is the prototype of the
  function called by this tracepoint.
- TPARGS(firstarg, p) are the parameters names, same as found in the
  prototype.

Connecting a function (probe) to a tracepoint is done by providing a
probe (function to call) for the specific tracepoint through
register_trace_subsys_eventname(). Removing a probe is done through
unregister_trace_subsys_eventname(); it will remove the probe.
tracepoint_synchronize_unregister() must be called before the end of
the module exit function to make sure there is no caller left using
the probe. This, and the fact that preemption is disabled around the
probe call, make sure that probe removal and module unload are safe.
See the "Probe example" section below for a sample probe module.

The tracepoint mechanism supports inserting multiple instances of the
same tracepoint, but a single definition must be made of a given
tracepoint name over all the kernel to make sure no type conflict will
occur. Name mangling of the tracepoints is done using the prototypes
to make sure typing is correct. Verification of probe type correctness
is done at the registration site by the compiler. Tracepoints can be
put in inline functions, inlined static functions, and unrolled loops
as well as regular functions.

The naming scheme "subsys_event" is suggested here as a convention
intended to limit collisions. Tracepoint names are global to the
kernel: they are considered as being the same whether they are in the
core kernel image or in modules.

If the tracepoint has to be used in kernel modules, an
EXPORT_TRACEPOINT_SYMBOL_GPL() or EXPORT_TRACEPOINT_SYMBOL() can be
used to export the defined tracepoints.
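
A minimal probe module matching the declaration above might look like the
following sketch (not from this patch; the probe body and names are
illustrative):

	#include <linux/module.h>
	#include <linux/sched.h>
	#include <trace/subsys.h>

	/* runs in the caller's context each time the tracepoint fires */
	static void probe_subsys_eventname(int firstarg, struct task_struct *p)
	{
		printk(KERN_INFO "subsys_eventname: %d from pid %d\n",
		       firstarg, p->pid);
	}

	static int __init tp_sample_init(void)
	{
		return register_trace_subsys_eventname(probe_subsys_eventname);
	}

	static void __exit tp_sample_exit(void)
	{
		unregister_trace_subsys_eventname(probe_subsys_eventname);
		/* wait until no caller can still be inside the probe */
		tracepoint_synchronize_unregister();
	}

	module_init(tp_sample_init);
	module_exit(tp_sample_exit);
	MODULE_LICENSE("GPL");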
* Probe / tracepoint example
......
@@ -99,7 +99,7 @@ config GENERIC_IOMAP
	bool
	default y

-config SCHED_NO_NO_OMIT_FRAME_POINTER
+config SCHED_OMIT_FRAME_POINTER
	bool
	default y
......
@@ -55,7 +55,6 @@

void build_cpu_to_node_map(void);

#define SD_CPU_INIT (struct sched_domain) {		\
-	.span			= CPU_MASK_NONE,	\
	.parent			= NULL,			\
	.child			= NULL,			\
	.groups			= NULL,			\
@@ -80,7 +79,6 @@ void build_cpu_to_node_map(void);

/* sched_domains SD_NODE_INIT for IA64 NUMA machines */
#define SD_NODE_INIT (struct sched_domain) {		\
-	.span			= CPU_MASK_NONE,	\
	.parent			= NULL,			\
	.child			= NULL,			\
	.groups			= NULL,			\
......
@@ -274,7 +274,7 @@ config GENERIC_CALIBRATE_DELAY
	bool
	default y

-config SCHED_NO_NO_OMIT_FRAME_POINTER
+config SCHED_OMIT_FRAME_POINTER
	bool
	default y
......
@@ -653,7 +653,7 @@ config GENERIC_CMOS_UPDATE
	bool
	default y

-config SCHED_NO_NO_OMIT_FRAME_POINTER
+config SCHED_OMIT_FRAME_POINTER
	bool
	default y
......
@@ -37,7 +37,6 @@ extern unsigned char __node_distances[MAX_COMPACT_NODES][MAX_COMPACT_NODES];

/* sched_domains SD_NODE_INIT for SGI IP27 machines */
#define SD_NODE_INIT (struct sched_domain) {		\
-	.span			= CPU_MASK_NONE,	\
	.parent			= NULL,			\
	.child			= NULL,			\
	.groups			= NULL,			\
......
@@ -141,7 +141,7 @@ config GENERIC_NVRAM
	bool
	default y if PPC32

-config SCHED_NO_NO_OMIT_FRAME_POINTER
+config SCHED_OMIT_FRAME_POINTER
	bool
	default y
......
@@ -7,7 +7,19 @@

#ifndef __ASSEMBLY__
extern void _mcount(void);
-#endif

+#ifdef CONFIG_DYNAMIC_FTRACE
+static inline unsigned long ftrace_call_adjust(unsigned long addr)
+{
+	/* relocation of mcount call site is the same as the address */
+	return addr;
+}
+
+struct dyn_arch_ftrace {
+	struct module *mod;
+};
+#endif /* CONFIG_DYNAMIC_FTRACE */
+
+#endif /* __ASSEMBLY__ */

#endif
......
@@ -34,11 +34,19 @@ struct mod_arch_specific {

#ifdef __powerpc64__
	unsigned int stubs_section;	/* Index of stubs section in module */
	unsigned int toc_section;	/* What section is the TOC? */
-#else
+#ifdef CONFIG_DYNAMIC_FTRACE
+	unsigned long toc;
+	unsigned long tramp;
+#endif
+
+#else /* powerpc64 */
	/* Indices of PLT sections within module. */
	unsigned int core_plt_section;
	unsigned int init_plt_section;
+#ifdef CONFIG_DYNAMIC_FTRACE
+	unsigned long tramp;
+#endif
+#endif /* powerpc64 */

	/* List of BUG addresses, source line numbers and filenames */
	struct list_head bug_list;
@@ -68,6 +76,12 @@ struct mod_arch_specific {

# endif /* MODULE */
#endif

+#ifdef CONFIG_DYNAMIC_FTRACE
+# ifdef MODULE
+asm(".section .ftrace.tramp,\"ax\",@nobits; .align 3; .previous");
+# endif /* MODULE */
+#endif

struct exception_table_entry;
void sort_ex_table(struct exception_table_entry *start,
......
@@ -48,7 +48,6 @@ static inline int pcibus_to_node(struct pci_bus *bus)

/* sched_domains SD_NODE_INIT for PPC64 machines */
#define SD_NODE_INIT (struct sched_domain) {		\
-	.span			= CPU_MASK_NONE,	\
	.parent			= NULL,			\
	.child			= NULL,			\
	.groups			= NULL,			\
......
@@ -17,6 +17,7 @@ ifdef CONFIG_FUNCTION_TRACER

CFLAGS_REMOVE_cputable.o = -pg -mno-sched-epilog
CFLAGS_REMOVE_prom_init.o = -pg -mno-sched-epilog
CFLAGS_REMOVE_btext.o = -pg -mno-sched-epilog
+CFLAGS_REMOVE_prom.o = -pg -mno-sched-epilog

ifdef CONFIG_DYNAMIC_FTRACE
# dynamic ftrace setup.
......
@@ -1162,39 +1162,17 @@ machine_check_in_rtas:

#ifdef CONFIG_DYNAMIC_FTRACE
_GLOBAL(mcount)
_GLOBAL(_mcount)
-	stwu	r1,-48(r1)
-	stw	r3, 12(r1)
-	stw	r4, 16(r1)
-	stw	r5, 20(r1)
-	stw	r6, 24(r1)
-	mflr	r3
-	stw	r7, 28(r1)
-	mfcr	r5
-	stw	r8, 32(r1)
-	stw	r9, 36(r1)
-	stw	r10,40(r1)
-	stw	r3, 44(r1)
-	stw	r5, 8(r1)
-	subi	r3, r3, MCOUNT_INSN_SIZE
-	.globl mcount_call
-mcount_call:
-	bl	ftrace_stub
-	nop
-	lwz	r6, 8(r1)
-	lwz	r0, 44(r1)
-	lwz	r3, 12(r1)
+	/*
+	 * It is required that _mcount on PPC32 must preserve the
+	 * link register. But we have r0 to play with. We use r0
+	 * to push the return address back to the caller of mcount
+	 * into the ctr register, restore the link register and
+	 * then jump back using the ctr register.
+	 */
+	mflr	r0
	mtctr	r0
-	lwz	r4, 16(r1)
-	mtcr	r6
-	lwz	r5, 20(r1)
-	lwz	r6, 24(r1)
-	lwz	r0, 52(r1)
-	lwz	r7, 28(r1)
-	lwz	r8, 32(r1)
+	lwz	r0, 4(r1)
	mtlr	r0
-	lwz	r9, 36(r1)
-	lwz	r10,40(r1)
-	addi	r1, r1, 48
	bctr

_GLOBAL(ftrace_caller)
......
@@ -894,18 +894,6 @@ _GLOBAL(enter_prom)

#ifdef CONFIG_DYNAMIC_FTRACE
_GLOBAL(mcount)
_GLOBAL(_mcount)
-	/* Taken from output of objdump from lib64/glibc */
-	mflr	r3
-	stdu	r1, -112(r1)
-	std	r3, 128(r1)
-	subi	r3, r3, MCOUNT_INSN_SIZE
-	.globl mcount_call
-mcount_call:
-	bl	ftrace_stub
-	nop
-	ld	r0, 128(r1)
-	mtlr	r0
-	addi	r1, r1, 112
	blr

_GLOBAL(ftrace_caller)
......
This diff is collapsed.
@@ -69,10 +69,15 @@ void cpu_idle(void)

		smp_mb();
		local_irq_disable();

+		/* Don't trace irqs off for idle */
+		stop_critical_timings();
+
		/* check again after disabling irqs */
		if (!need_resched() && !cpu_should_die())
			ppc_md.power_save();

+		start_critical_timings();
+
		local_irq_enable();
		set_thread_flag(TIF_POLLING_NRFLAG);
......
@@ -22,6 +22,7 @@

#include <linux/fs.h>
#include <linux/string.h>
#include <linux/kernel.h>
+#include <linux/ftrace.h>
#include <linux/cache.h>
#include <linux/bug.h>
#include <linux/sort.h>
@@ -53,6 +54,9 @@ static unsigned int count_relocs(const Elf32_Rela *rela, unsigned int num)

			r_addend = rela[i].r_addend;
		}
	}

+#ifdef CONFIG_DYNAMIC_FTRACE
+	_count_relocs++;	/* add one for ftrace_caller */
+#endif

	return _count_relocs;
}
@@ -306,5 +310,11 @@ int apply_relocate_add(Elf32_Shdr *sechdrs,

			return -ENOEXEC;
		}
	}

+#ifdef CONFIG_DYNAMIC_FTRACE
+	module->arch.tramp =
+		do_plt_call(module->module_core,
+			    (unsigned long)ftrace_caller,
+			    sechdrs, module);
+#endif

	return 0;
}
@@ -20,6 +20,7 @@

#include <linux/moduleloader.h>
#include <linux/err.h>
#include <linux/vmalloc.h>
+#include <linux/ftrace.h>
#include <linux/bug.h>
#include <asm/module.h>
#include <asm/firmware.h>
@@ -163,6 +164,11 @@ static unsigned long get_stubs_size(const Elf64_Ehdr *hdr,

		}
	}

+#ifdef CONFIG_DYNAMIC_FTRACE
+	/* make the trampoline to the ftrace_caller */
+	relocs++;
+#endif

	DEBUGP("Looks like a total of %lu stubs, max\n", relocs);
	return relocs * sizeof(struct ppc64_stub_entry);
}
@@ -441,5 +447,12 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,

		}
	}

+#ifdef CONFIG_DYNAMIC_FTRACE
+	me->arch.toc = my_r2(sechdrs, me);
+	me->arch.tramp = stub_for_addr(sechdrs,
+				       (unsigned long)ftrace_caller,
+				       me);
+#endif

	return 0;
}
@@ -6,6 +6,9 @@ ifeq ($(CONFIG_PPC64),y)

EXTRA_CFLAGS		+= -mno-minimal-toc
endif

+CFLAGS_REMOVE_code-patching.o = -pg
+CFLAGS_REMOVE_feature-fixups.o = -pg

obj-y			:= string.o alloc.o \
			   checksum_$(CONFIG_WORD_SIZE).o
obj-$(CONFIG_PPC32)	+= div64.o copy_32.o crtsavres.o
......
@@ -212,7 +212,7 @@ static void update_cpu_core_map(void)

		cpu_core_map[cpu] = cpu_coregroup_map(cpu);
}

-void arch_update_cpu_topology(void)
+int arch_update_cpu_topology(void)
{
	struct tl_info *info = tl_info;
	struct sys_device *sysdev;

@@ -221,7 +221,7 @@ void arch_update_cpu_topology(void)

	if (!machine_has_topology) {
		update_cpu_core_map();
		topology_update_polarization_simple();
-		return;
+		return 0;
	}
	stsi(info, 15, 1, 2);
	tl_to_cores(info);

@@ -230,6 +230,7 @@ void arch_update_cpu_topology(void)

		sysdev = get_cpu_sysdev(cpu);
		kobject_uevent(&sysdev->kobj, KOBJ_CHANGE);
	}
+	return 1;
}

static void topology_work_fn(struct work_struct *work)
......
@@ -5,7 +5,6 @@

/* sched_domains SD_NODE_INIT for sh machines */
#define SD_NODE_INIT (struct sched_domain) {		\
-	.span			= CPU_MASK_NONE,	\
	.parent			= NULL,			\
	.child			= NULL,			\
	.groups			= NULL,			\
......
@@ -11,21 +11,21 @@ extern int get_signals(void);

extern void block_signals(void);
extern void unblock_signals(void);

-#define local_save_flags(flags) do { typecheck(unsigned long, flags); \
+#define raw_local_save_flags(flags) do { typecheck(unsigned long, flags); \
				     (flags) = get_signals(); } while(0)
-#define local_irq_restore(flags) do { typecheck(unsigned long, flags); \
+#define raw_local_irq_restore(flags) do { typecheck(unsigned long, flags); \
				      set_signals(flags); } while(0)

-#define local_irq_save(flags) do { local_save_flags(flags); \
-				   local_irq_disable(); } while(0)
+#define raw_local_irq_save(flags) do { raw_local_save_flags(flags); \
+				       raw_local_irq_disable(); } while(0)

-#define local_irq_enable() unblock_signals()
-#define local_irq_disable() block_signals()
+#define raw_local_irq_enable() unblock_signals()
+#define raw_local_irq_disable() block_signals()

#define irqs_disabled()                 \
({                                      \
	unsigned long flags;            \
-	local_save_flags(flags);        \
+	raw_local_save_flags(flags);    \
	(flags == 0);                   \
})
......
@@ -29,11 +29,14 @@ config X86

	select HAVE_FTRACE_MCOUNT_RECORD
	select HAVE_DYNAMIC_FTRACE
	select HAVE_FUNCTION_TRACER
+	select HAVE_FUNCTION_GRAPH_TRACER
+	select HAVE_FUNCTION_TRACE_MCOUNT_TEST
	select HAVE_KVM if ((X86_32 && !X86_VOYAGER && !X86_VISWS && !X86_NUMAQ) || X86_64)
	select HAVE_ARCH_KGDB if !X86_VOYAGER
	select HAVE_ARCH_TRACEHOOK
	select HAVE_GENERIC_DMA_COHERENT if X86_32
	select HAVE_EFFICIENT_UNALIGNED_ACCESS
+	select USER_STACKTRACE_SUPPORT

config ARCH_DEFCONFIG
	string
@@ -238,6 +241,16 @@ config X86_HAS_BOOT_CPU_ID

	def_bool y
	depends on X86_VOYAGER

+config SPARSE_IRQ
+	bool "Support sparse irq numbering"
+	depends on (PCI_MSI || HT_IRQ) && SMP
+	default y
+	help
+	  This enables support for sparse irqs, especially for MSI/MSI-X.
+	  You may need it if you have lots of cards that support MSI-X
+	  installed.
+
+	  If you don't know what to do here, say Y.

config X86_FIND_SMP_CONFIG
	def_bool y
	depends on X86_MPPARSE || X86_VOYAGER
@@ -367,10 +380,10 @@ config X86_RDC321X

	  as R-8610-(G).
	  If you don't have one of these chips, you should say N here.

-config SCHED_NO_NO_OMIT_FRAME_POINTER
+config SCHED_OMIT_FRAME_POINTER
	def_bool y
	prompt "Single-depth WCHAN output"
-	depends on X86_32
+	depends on X86
	help
	  Calculate simpler /proc/<PID>/wchan values. If this option
	  is disabled then wchan values will recurse back to the
@@ -465,10 +478,6 @@ config X86_CYCLONE_TIMER

	def_bool y
	depends on X86_GENERICARCH

-config ES7000_CLUSTERED_APIC
-	def_bool y
-	depends on SMP && X86_ES7000 && MPENTIUMIII

source "arch/x86/Kconfig.cpu"

config HPET_TIMER
@@ -1632,13 +1641,6 @@ config APM_ALLOW_INTS

	  many of the newer IBM Thinkpads. If you experience hangs when you
	  suspend, try setting this to Y. Otherwise, say N.

-config APM_REAL_MODE_POWER_OFF
-	bool "Use real mode APM BIOS call to power off"
-	help
-	  Use real mode APM BIOS calls to switch off the computer. This is
-	  a work-around for a number of buggy BIOSes. Switch this option on if
-	  your computer crashes instead of powering off properly.

endif # APM

source "arch/x86/kernel/cpu/cpufreq/Kconfig"
......
@@ -515,6 +515,7 @@ config CPU_SUP_UMC_32

config X86_DS
	def_bool X86_PTRACE_BTS
	depends on X86_DEBUGCTLMSR
+	select HAVE_HW_BRANCH_TRACER

config X86_PTRACE_BTS
	bool "Branch Trace Store"
......
@@ -186,14 +186,10 @@ config IOMMU_LEAK

	  Add a simple leak tracer to the IOMMU code. This is useful when you
	  are debugging a buggy device driver that leaks IOMMU mappings.

-config MMIOTRACE_HOOKS
-	bool

config MMIOTRACE
	bool "Memory mapped IO tracing"
	depends on DEBUG_KERNEL && PCI
	select TRACING
-	select MMIOTRACE_HOOKS
	help
	  Mmiotrace traces Memory Mapped I/O access and is meant for
	  debugging and reverse engineering. It is called from the ioremap
......
@@ -193,6 +193,7 @@ extern u8 setup_APIC_eilvt_ibs(u8 vector, u8 msg_type, u8 mask);

static inline void lapic_shutdown(void) { }
#define local_apic_timer_c2_ok		1
static inline void init_apic_mappings(void) { }
+static inline void disable_local_APIC(void) { }

#endif /* !CONFIG_X86_LOCAL_APIC */
......
@@ -24,8 +24,6 @@ static inline cpumask_t target_cpus(void)

#define INT_DELIVERY_MODE	(dest_Fixed)
#define INT_DEST_MODE		(0)	/* phys delivery to target proc */
#define NO_BALANCE_IRQ		(0)
-#define WAKE_SECONDARY_VIA_INIT

static inline unsigned long check_apicid_used(physid_mask_t bitmap, int apicid)
{
......
@@ -7,13 +7,12 @@

 *
 * It manages:
 * - per-thread and per-cpu allocation of BTS and PEBS
- * - buffer memory allocation (optional)
- * - buffer overflow handling
+ * - buffer overflow handling (to be done)
 * - buffer access
 *
 * It assumes:
- * - get_task_struct on all parameter tasks
- * - current is allowed to trace parameter tasks
+ * - get_task_struct on all traced tasks
+ * - current is allowed to trace tasks
 *
 *
 * Copyright (C) 2007-2008 Intel Corporation.
@@ -26,11 +25,18 @@

#include <linux/types.h>
#include <linux/init.h>
+#include <linux/err.h>

#ifdef CONFIG_X86_DS

struct task_struct;
+struct ds_tracer;
+struct bts_tracer;
+struct pebs_tracer;
+
+typedef void (*bts_ovfl_callback_t)(struct bts_tracer *);
+typedef void (*pebs_ovfl_callback_t)(struct pebs_tracer *);

/*
 * Request BTS or PEBS
@@ -38,60 +44,62 @@ struct task_struct;

 * Due to alignment constraints, the actual buffer may be slightly
 * smaller than the requested or provided buffer.
 *
- * Returns 0 on success; -Eerrno otherwise
+ * Returns a pointer to a tracer structure on success, or
+ * ERR_PTR(errcode) on failure.
+ *
+ * The interrupt threshold is independent from the overflow callback
+ * to allow users to use their own overflow interrupt handling mechanism.
 *
 * task: the task to request recording for;
 *       NULL for per-cpu recording on the current cpu
 * base: the base pointer for the (non-pageable) buffer;
- *       NULL if buffer allocation requested
- * size: the size of the requested or provided buffer
+ * size: the size of the provided buffer in bytes
 * ovfl: pointer to a function to be called on buffer overflow;
 *       NULL if cyclic buffer requested
+ * th: the interrupt threshold in records from the end of the buffer;
+ *     -1 if no interrupt threshold is requested.
 */
-typedef void (*ds_ovfl_callback_t)(struct task_struct *);
-extern int ds_request_bts(struct task_struct *task, void *base, size_t size,
-			  ds_ovfl_callback_t ovfl);
-extern int ds_request_pebs(struct task_struct *task, void *base, size_t size,
-			   ds_ovfl_callback_t ovfl);
+extern struct bts_tracer *ds_request_bts(struct task_struct *task,
+					 void *base, size_t size,
+					 bts_ovfl_callback_t ovfl, size_t th);
+extern struct pebs_tracer *ds_request_pebs(struct task_struct *task,
+					   void *base, size_t size,
+					   pebs_ovfl_callback_t ovfl,
+					   size_t th);

/*
 * Release BTS or PEBS resources
 *
- * Frees buffers allocated on ds_request.
- *
 * Returns 0 on success; -Eerrno otherwise
 *
- * task: the task to release resources for;
- *       NULL to release resources for the current cpu
+ * tracer: the tracer handle returned from ds_request_~()
 */
-extern int ds_release_bts(struct task_struct *task);
-extern int ds_release_pebs(struct task_struct *task);
+extern int ds_release_bts(struct bts_tracer *tracer);
+extern int ds_release_pebs(struct pebs_tracer *tracer);
/*
- * Return the (array) index of the write pointer.
+ * Get the (array) index of the write pointer.
 * (assuming an array of BTS/PEBS records)
 *
- * Returns -Eerrno on error
+ * Returns 0 on success; -Eerrno on error
 *
- * task: the task to access;
- *       NULL to access the current cpu
- * pos (out): if not NULL, will hold the result
+ * tracer: the tracer handle returned from ds_request_~()
+ * pos (out): will hold the result
 */
-extern int ds_get_bts_index(struct task_struct *task, size_t *pos);
-extern int ds_get_pebs_index(struct task_struct *task, size_t *pos);
+extern int ds_get_bts_index(struct bts_tracer *tracer, size_t *pos);
+extern int ds_get_pebs_index(struct pebs_tracer *tracer, size_t *pos);

/*
- * Return the (array) index one record beyond the end of the array.
+ * Get the (array) index one record beyond the end of the array.
 * (assuming an array of BTS/PEBS records)
 *
- * Returns -Eerrno on error
+ * Returns 0 on success; -Eerrno on error
 *
- * task: the task to access;
- *       NULL to access the current cpu
- * pos (out): if not NULL, will hold the result
+ * tracer: the tracer handle returned from ds_request_~()
+ * pos (out): will hold the result
 */
-extern int ds_get_bts_end(struct task_struct *task, size_t *pos);
-extern int ds_get_pebs_end(struct task_struct *task, size_t *pos);
+extern int ds_get_bts_end(struct bts_tracer *tracer, size_t *pos);
+extern int ds_get_pebs_end(struct pebs_tracer *tracer, size_t *pos);
/*
 * Provide a pointer to the BTS/PEBS record at parameter index.

@@ -102,14 +110,13 @@ extern int ds_get_pebs_end(struct task_struct *task, size_t *pos);

 *
 * Returns the size of a single record on success; -Eerrno on error
 *
- * task: the task to access;
- *       NULL to access the current cpu
+ * tracer: the tracer handle returned from ds_request_~()
 * index: the index of the requested record
 * record (out): pointer to the requested record
 */
-extern int ds_access_bts(struct task_struct *task,
-			 size_t index, const void **record);
-extern int ds_access_pebs(struct task_struct *task,
-			  size_t index, const void **record);
+extern int ds_access_bts(struct bts_tracer *tracer,
+			 size_t index, const void **record);
+extern int ds_access_pebs(struct pebs_tracer *tracer,
+			  size_t index, const void **record);

/*

@@ -129,38 +136,24 @@ extern int ds_access_pebs(struct task_struct *task,

 *
 * Returns the number of bytes written or -Eerrno.
 *
- * task: the task to access;
- *       NULL to access the current cpu
+ * tracer: the tracer handle returned from ds_request_~()
 * buffer: the buffer to write
 * size: the size of the buffer
 */
-extern int ds_write_bts(struct task_struct *task,
-			const void *buffer, size_t size);
-extern int ds_write_pebs(struct task_struct *task,
-			 const void *buffer, size_t size);
+extern int ds_write_bts(struct bts_tracer *tracer,
+			const void *buffer, size_t size);
+extern int ds_write_pebs(struct pebs_tracer *tracer,
+			 const void *buffer, size_t size);

-/*
- * Same as ds_write_bts/pebs, but omit ownership checks.
- *
- * This is needed to have some other task than the owner of the
- * BTS/PEBS buffer or the parameter task itself write into the
- * respective buffer.
- */
-extern int ds_unchecked_write_bts(struct task_struct *task,
-				  const void *buffer, size_t size);
-extern int ds_unchecked_write_pebs(struct task_struct *task,
-				   const void *buffer, size_t size);
/*
 * Reset the write pointer of the BTS/PEBS buffer.
 *
 * Returns 0 on success; -Eerrno on error
 *
- * task: the task to access;
- *       NULL to access the current cpu
+ * tracer: the tracer handle returned from ds_request_~()
 */
-extern int ds_reset_bts(struct task_struct *task);
-extern int ds_reset_pebs(struct task_struct *task);
+extern int ds_reset_bts(struct bts_tracer *tracer);
+extern int ds_reset_pebs(struct pebs_tracer *tracer);

/*
 * Clear the BTS/PEBS buffer and reset the write pointer.

@@ -168,33 +161,30 @@ extern int ds_reset_pebs(struct task_struct *task);

 *
 * Returns 0 on success; -Eerrno on error
 *
- * task: the task to access;
- *       NULL to access the current cpu
+ * tracer: the tracer handle returned from ds_request_~()
 */
-extern int ds_clear_bts(struct task_struct *task);
-extern int ds_clear_pebs(struct task_struct *task);
+extern int ds_clear_bts(struct bts_tracer *tracer);
+extern int ds_clear_pebs(struct pebs_tracer *tracer);

/*
 * Provide the PEBS counter reset value.
 *
 * Returns 0 on success; -Eerrno on error
 *
- * task: the task to access;
- *       NULL to access the current cpu
+ * tracer: the tracer handle returned from ds_request_pebs()
 * value (out): the counter reset value
 */
-extern int ds_get_pebs_reset(struct task_struct *task, u64 *value);
+extern int ds_get_pebs_reset(struct pebs_tracer *tracer, u64 *value);

/*
 * Set the PEBS counter reset value.
 *
 * Returns 0 on success; -Eerrno on error
 *
- * task: the task to access;
- *       NULL to access the current cpu
+ * tracer: the tracer handle returned from ds_request_pebs()
 * value: the new counter reset value
 */
-extern int ds_set_pebs_reset(struct task_struct *task, u64 value);
+extern int ds_set_pebs_reset(struct pebs_tracer *tracer, u64 value);
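
Put together, the handle-based API above is used along these lines (a sketch
under the declarations in this header, not code from the patch; the buffer
size and error flow are illustrative):

	/* Request a cyclic BTS buffer for the current task, query the
	 * write index, and release the tracer again (CONFIG_X86_DS). */
	static char bts_buf[4096];

	static int bts_sketch(void)
	{
		struct bts_tracer *tracer;
		size_t index;
		int err;

		tracer = ds_request_bts(current, bts_buf, sizeof(bts_buf),
					NULL /* no ovfl callback: cyclic */,
					(size_t)-1 /* no int. threshold */);
		if (IS_ERR(tracer))
			return PTR_ERR(tracer);

		err = ds_get_bts_index(tracer, &index);

		ds_release_bts(tracer);
		return err;
	}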
/* /*
* Initialization * Initialization
...@@ -207,17 +197,13 @@ extern void __cpuinit ds_init_intel(struct cpuinfo_x86 *); ...@@ -207,17 +197,13 @@ extern void __cpuinit ds_init_intel(struct cpuinfo_x86 *);
/* /*
* The DS context - part of struct thread_struct. * The DS context - part of struct thread_struct.
*/ */
#define MAX_SIZEOF_DS (12 * 8)
struct ds_context { struct ds_context {
/* pointer to the DS configuration; goes into MSR_IA32_DS_AREA */ /* pointer to the DS configuration; goes into MSR_IA32_DS_AREA */
unsigned char *ds; unsigned char ds[MAX_SIZEOF_DS];
/* the owner of the BTS and PEBS configuration, respectively */ /* the owner of the BTS and PEBS configuration, respectively */
struct task_struct *owner[2]; struct ds_tracer *owner[2];
/* buffer overflow notification function for BTS and PEBS */
ds_ovfl_callback_t callback[2];
/* the original buffer address */
void *buffer[2];
/* the number of allocated pages for on-request allocated buffers */
unsigned int pages[2];
/* use count */ /* use count */
unsigned long count; unsigned long count;
/* a pointer to the context location inside the thread_struct /* a pointer to the context location inside the thread_struct
......
...@@ -8,7 +8,9 @@ enum reboot_type { ...@@ -8,7 +8,9 @@ enum reboot_type {
BOOT_BIOS = 'b', BOOT_BIOS = 'b',
#endif #endif
BOOT_ACPI = 'a', BOOT_ACPI = 'a',
BOOT_EFI = 'e' BOOT_EFI = 'e',
BOOT_CF9 = 'p',
BOOT_CF9_COND = 'q',
}; };
extern enum reboot_type reboot_type; extern enum reboot_type reboot_type;
......
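The two new enumerators select reboot through the PCI reset control register at I/O port 0xCF9: 'p' (BOOT_CF9) uses it unconditionally, 'q' (BOOT_CF9_COND) only when the port looks usable. A sketch of the conventional CF9 hard-reset sequence this refers to (the patch's actual handler lives in arch/x86/kernel/reboot.c and is not shown here):

#include <asm/io.h>
#include <linux/delay.h>

/* Conventional 0xCF9 hard reset; a sketch, not the patch's code. */
static void cf9_hard_reset(void)
{
	u8 cf9 = inb(0xcf9) & ~6;	/* preserve all but the reset bits */

	outb(cf9 | 2, 0xcf9);		/* request a hard reset */
	udelay(50);
	outb(cf9 | 6, 0xcf9);		/* assert system reset */
	udelay(50);
}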
...@@ -9,31 +9,27 @@ static inline int apic_id_registered(void) ...@@ -9,31 +9,27 @@ static inline int apic_id_registered(void)
return (1); return (1);
} }
static inline cpumask_t target_cpus(void) static inline cpumask_t target_cpus_cluster(void)
{ {
#if defined CONFIG_ES7000_CLUSTERED_APIC
return CPU_MASK_ALL; return CPU_MASK_ALL;
#else }
static inline cpumask_t target_cpus(void)
{
return cpumask_of_cpu(smp_processor_id()); return cpumask_of_cpu(smp_processor_id());
#endif
} }
#if defined CONFIG_ES7000_CLUSTERED_APIC #define APIC_DFR_VALUE_CLUSTER (APIC_DFR_CLUSTER)
#define APIC_DFR_VALUE (APIC_DFR_CLUSTER) #define INT_DELIVERY_MODE_CLUSTER (dest_LowestPrio)
#define INT_DELIVERY_MODE (dest_LowestPrio) #define INT_DEST_MODE_CLUSTER (1) /* logical delivery broadcast to all procs */
#define INT_DEST_MODE (1) /* logical delivery broadcast to all procs */ #define NO_BALANCE_IRQ_CLUSTER (1)
#define NO_BALANCE_IRQ (1)
#undef WAKE_SECONDARY_VIA_INIT
#define WAKE_SECONDARY_VIA_MIP
#else
#define APIC_DFR_VALUE (APIC_DFR_FLAT) #define APIC_DFR_VALUE (APIC_DFR_FLAT)
#define INT_DELIVERY_MODE (dest_Fixed) #define INT_DELIVERY_MODE (dest_Fixed)
#define INT_DEST_MODE (0) /* phys delivery to target procs */ #define INT_DEST_MODE (0) /* phys delivery to target procs */
#define NO_BALANCE_IRQ (0) #define NO_BALANCE_IRQ (0)
#undef APIC_DEST_LOGICAL #undef APIC_DEST_LOGICAL
#define APIC_DEST_LOGICAL 0x0 #define APIC_DEST_LOGICAL 0x0
#define WAKE_SECONDARY_VIA_INIT
#endif
static inline unsigned long check_apicid_used(physid_mask_t bitmap, int apicid) static inline unsigned long check_apicid_used(physid_mask_t bitmap, int apicid)
{ {
...@@ -60,6 +56,16 @@ static inline unsigned long calculate_ldr(int cpu) ...@@ -60,6 +56,16 @@ static inline unsigned long calculate_ldr(int cpu)
* an APIC. See e.g. "AP-388 82489DX User's Manual" (Intel * an APIC. See e.g. "AP-388 82489DX User's Manual" (Intel
* document number 292116). So here it goes... * document number 292116). So here it goes...
*/ */
static inline void init_apic_ldr_cluster(void)
{
unsigned long val;
int cpu = smp_processor_id();
apic_write(APIC_DFR, APIC_DFR_VALUE_CLUSTER);
val = calculate_ldr(cpu);
apic_write(APIC_LDR, val);
}
static inline void init_apic_ldr(void) static inline void init_apic_ldr(void)
{ {
unsigned long val; unsigned long val;
...@@ -70,10 +76,6 @@ static inline void init_apic_ldr(void) ...@@ -70,10 +76,6 @@ static inline void init_apic_ldr(void)
apic_write(APIC_LDR, val); apic_write(APIC_LDR, val);
} }
#ifndef CONFIG_X86_GENERICARCH
extern void enable_apic_mode(void);
#endif
extern int apic_version [MAX_APICS]; extern int apic_version [MAX_APICS];
static inline void setup_apic_routing(void) static inline void setup_apic_routing(void)
{ {
...@@ -144,7 +146,7 @@ static inline int check_phys_apicid_present(int cpu_physical_apicid) ...@@ -144,7 +146,7 @@ static inline int check_phys_apicid_present(int cpu_physical_apicid)
return (1); return (1);
} }
static inline unsigned int cpu_mask_to_apicid(cpumask_t cpumask) static inline unsigned int cpu_mask_to_apicid_cluster(cpumask_t cpumask)
{ {
int num_bits_set; int num_bits_set;
int cpus_found = 0; int cpus_found = 0;
...@@ -154,11 +156,7 @@ static inline unsigned int cpu_mask_to_apicid(cpumask_t cpumask) ...@@ -154,11 +156,7 @@ static inline unsigned int cpu_mask_to_apicid(cpumask_t cpumask)
num_bits_set = cpus_weight(cpumask); num_bits_set = cpus_weight(cpumask);
/* Return id to all */ /* Return id to all */
if (num_bits_set == NR_CPUS) if (num_bits_set == NR_CPUS)
#if defined CONFIG_ES7000_CLUSTERED_APIC
return 0xFF; return 0xFF;
#else
return cpu_to_logical_apicid(0);
#endif
/* /*
* The cpus in the mask must all be on the apic cluster. If they are not * The cpus in the mask must all be on the apic cluster. If they are not
* on the same apicid cluster, return the default value of TARGET_CPUS. * on the same apicid cluster, return the default value of TARGET_CPUS.
...@@ -171,11 +169,40 @@ static inline unsigned int cpu_mask_to_apicid(cpumask_t cpumask) ...@@ -171,11 +169,40 @@ static inline unsigned int cpu_mask_to_apicid(cpumask_t cpumask)
if (apicid_cluster(apicid) != if (apicid_cluster(apicid) !=
apicid_cluster(new_apicid)){ apicid_cluster(new_apicid)){
printk ("%s: Not a valid mask!\n", __func__); printk ("%s: Not a valid mask!\n", __func__);
#if defined CONFIG_ES7000_CLUSTERED_APIC
return 0xFF; return 0xFF;
#else }
apicid = new_apicid;
cpus_found++;
}
cpu++;
}
return apicid;
}
static inline unsigned int cpu_mask_to_apicid(cpumask_t cpumask)
{
int num_bits_set;
int cpus_found = 0;
int cpu;
int apicid;
num_bits_set = cpus_weight(cpumask);
/* Return id to all */
if (num_bits_set == NR_CPUS)
return cpu_to_logical_apicid(0);
/*
* The cpus in the mask must all be on the apic cluster. If they are not
* on the same apicid cluster, return the default value of TARGET_CPUS.
*/
cpu = first_cpu(cpumask);
apicid = cpu_to_logical_apicid(cpu);
while (cpus_found < num_bits_set) {
if (cpu_isset(cpu, cpumask)) {
int new_apicid = cpu_to_logical_apicid(cpu);
if (apicid_cluster(apicid) !=
apicid_cluster(new_apicid)){
printk ("%s: Not a valid mask!\n", __func__);
return cpu_to_logical_apicid(0); return cpu_to_logical_apicid(0);
#endif
} }
apicid = new_apicid; apicid = new_apicid;
cpus_found++; cpus_found++;
......
#ifndef __ASM_ES7000_WAKECPU_H #ifndef __ASM_ES7000_WAKECPU_H
#define __ASM_ES7000_WAKECPU_H #define __ASM_ES7000_WAKECPU_H
/* #define TRAMPOLINE_PHYS_LOW 0x467
* This file copes with machines that wakeup secondary CPUs by the #define TRAMPOLINE_PHYS_HIGH 0x469
* INIT, INIT, STARTUP sequence.
*/
#ifdef CONFIG_ES7000_CLUSTERED_APIC
#define WAKE_SECONDARY_VIA_MIP
#else
#define WAKE_SECONDARY_VIA_INIT
#endif
#ifdef WAKE_SECONDARY_VIA_MIP
extern int es7000_start_cpu(int cpu, unsigned long eip);
static inline int
wakeup_secondary_cpu(int phys_apicid, unsigned long start_eip)
{
int boot_error = 0;
boot_error = es7000_start_cpu(phys_apicid, start_eip);
return boot_error;
}
#endif
#define TRAMPOLINE_LOW phys_to_virt(0x467)
#define TRAMPOLINE_HIGH phys_to_virt(0x469)
#define boot_cpu_apicid boot_cpu_physical_apicid
static inline void wait_for_init_deassert(atomic_t *deassert) static inline void wait_for_init_deassert(atomic_t *deassert)
{ {
#ifdef WAKE_SECONDARY_VIA_INIT #ifndef CONFIG_ES7000_CLUSTERED_APIC
while (!atomic_read(deassert)) while (!atomic_read(deassert))
cpu_relax(); cpu_relax();
#endif #endif
...@@ -50,9 +26,12 @@ static inline void restore_NMI_vector(unsigned short *high, unsigned short *low) ...@@ -50,9 +26,12 @@ static inline void restore_NMI_vector(unsigned short *high, unsigned short *low)
{ {
} }
#define inquire_remote_apic(apicid) do { \ extern void __inquire_remote_apic(int apicid);
if (apic_verbosity >= APIC_DEBUG) \
__inquire_remote_apic(apicid); \ static inline void inquire_remote_apic(int apicid)
} while (0) {
if (apic_verbosity >= APIC_DEBUG)
__inquire_remote_apic(apicid);
}
#endif /* __ASM_MACH_WAKECPU_H */ #endif /* __ASM_MACH_WAKECPU_H */
...@@ -17,8 +17,40 @@ static inline unsigned long ftrace_call_adjust(unsigned long addr) ...@@ -17,8 +17,40 @@ static inline unsigned long ftrace_call_adjust(unsigned long addr)
*/ */
return addr - 1; return addr - 1;
} }
#endif
#ifdef CONFIG_DYNAMIC_FTRACE
struct dyn_arch_ftrace {
/* No extra data needed for x86 */
};
#endif /* CONFIG_DYNAMIC_FTRACE */
#endif /* __ASSEMBLY__ */
#endif /* CONFIG_FUNCTION_TRACER */ #endif /* CONFIG_FUNCTION_TRACER */
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
#ifndef __ASSEMBLY__
/*
* Stack of return addresses for functions
* of a thread.
* Used in struct thread_info
*/
struct ftrace_ret_stack {
unsigned long ret;
unsigned long func;
unsigned long long calltime;
};
/*
* Primary handler of a function return.
* It relies on ftrace_return_to_handler.
* Defined in entry_32.S
*/
extern void return_to_handler(void);
#endif /* __ASSEMBLY__ */
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */
#endif /* _ASM_X86_FTRACE_H */ #endif /* _ASM_X86_FTRACE_H */
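struct ftrace_ret_stack and return_to_handler are the two halves of the function-graph return hook: on function entry the tracer saves the real return address in the task's ret_stack and redirects the return into the trampoline. A conceptual sketch of the entry side, assuming the prepare_ftrace_return() hook and the per-task ret_stack/curr_ret_stack fields added elsewhere in this series:

/* Conceptual sketch only; everything except struct ftrace_ret_stack and
 * return_to_handler is assumed from other patches in the series. */
static void sketch_hook_return(unsigned long *parent, unsigned long self_addr)
{
	struct ftrace_ret_stack *slot;
	int index = ++current->curr_ret_stack;		/* assumed field */

	slot = &current->ret_stack[index];		/* assumed array */
	slot->ret = *parent;		/* remember the real caller */
	slot->func = self_addr;
	slot->calltime = cpu_clock(raw_smp_processor_id());

	*parent = (unsigned long)return_to_handler;	/* divert the return */
}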
...@@ -2,6 +2,7 @@ ...@@ -2,6 +2,7 @@
#define _ASM_X86_GENAPIC_32_H #define _ASM_X86_GENAPIC_32_H
#include <asm/mpspec.h> #include <asm/mpspec.h>
#include <asm/atomic.h>
/* /*
* Generic APIC driver interface. * Generic APIC driver interface.
...@@ -65,6 +66,14 @@ struct genapic { ...@@ -65,6 +66,14 @@ struct genapic {
void (*send_IPI_allbutself)(int vector); void (*send_IPI_allbutself)(int vector);
void (*send_IPI_all)(int vector); void (*send_IPI_all)(int vector);
#endif #endif
int (*wakeup_cpu)(int apicid, unsigned long start_eip);
int trampoline_phys_low;
int trampoline_phys_high;
void (*wait_for_init_deassert)(atomic_t *deassert);
void (*smp_callin_clear_local_apic)(void);
void (*store_NMI_vector)(unsigned short *high, unsigned short *low);
void (*restore_NMI_vector)(unsigned short *high, unsigned short *low);
void (*inquire_remote_apic)(int apicid);
}; };
#define APICFUNC(x) .x = x, #define APICFUNC(x) .x = x,
...@@ -105,16 +114,24 @@ struct genapic { ...@@ -105,16 +114,24 @@ struct genapic {
APICFUNC(get_apic_id) \ APICFUNC(get_apic_id) \
.apic_id_mask = APIC_ID_MASK, \ .apic_id_mask = APIC_ID_MASK, \
APICFUNC(cpu_mask_to_apicid) \ APICFUNC(cpu_mask_to_apicid) \
APICFUNC(vector_allocation_domain) \ APICFUNC(vector_allocation_domain) \
APICFUNC(acpi_madt_oem_check) \ APICFUNC(acpi_madt_oem_check) \
IPIFUNC(send_IPI_mask) \ IPIFUNC(send_IPI_mask) \
IPIFUNC(send_IPI_allbutself) \ IPIFUNC(send_IPI_allbutself) \
IPIFUNC(send_IPI_all) \ IPIFUNC(send_IPI_all) \
APICFUNC(enable_apic_mode) \ APICFUNC(enable_apic_mode) \
APICFUNC(phys_pkg_id) \ APICFUNC(phys_pkg_id) \
.trampoline_phys_low = TRAMPOLINE_PHYS_LOW, \
.trampoline_phys_high = TRAMPOLINE_PHYS_HIGH, \
APICFUNC(wait_for_init_deassert) \
APICFUNC(smp_callin_clear_local_apic) \
APICFUNC(store_NMI_vector) \
APICFUNC(restore_NMI_vector) \
APICFUNC(inquire_remote_apic) \
} }
extern struct genapic *genapic; extern struct genapic *genapic;
extern void es7000_update_genapic_to_cluster(void);
enum uv_system_type {UV_NONE, UV_LEGACY_APIC, UV_X2APIC, UV_NON_UNIQUE_APIC}; enum uv_system_type {UV_NONE, UV_LEGACY_APIC, UV_X2APIC, UV_NON_UNIQUE_APIC};
#define get_uv_system_type() UV_NONE #define get_uv_system_type() UV_NONE
......
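Hooking wakeup_cpu and the wakeup-related callbacks through struct genapic lets the generic architecture swap CPU-bringup behaviour at runtime; es7000_update_genapic_to_cluster() is the first user. A sketch of such an override, reusing the _cluster variants defined in the ES7000 hunk earlier (illustrative, not necessarily the patch's exact body):

/* Illustrative override; consistent with the _cluster helpers above. */
void es7000_update_genapic_to_cluster(void)
{
	genapic->target_cpus = target_cpus_cluster;
	genapic->int_delivery_mode = INT_DELIVERY_MODE_CLUSTER;
	genapic->int_dest_mode = INT_DEST_MODE_CLUSTER;
	genapic->no_balance_irq = NO_BALANCE_IRQ_CLUSTER;

	genapic->init_apic_ldr = init_apic_ldr_cluster;
	genapic->cpu_mask_to_apicid = cpu_mask_to_apicid_cluster;
}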
...@@ -32,6 +32,8 @@ struct genapic { ...@@ -32,6 +32,8 @@ struct genapic {
unsigned int (*get_apic_id)(unsigned long x); unsigned int (*get_apic_id)(unsigned long x);
unsigned long (*set_apic_id)(unsigned int id); unsigned long (*set_apic_id)(unsigned int id);
unsigned long apic_id_mask; unsigned long apic_id_mask;
/* wakeup_secondary_cpu */
int (*wakeup_cpu)(int apicid, unsigned long start_eip);
}; };
extern struct genapic *genapic; extern struct genapic *genapic;
......
...@@ -188,17 +188,14 @@ extern void restore_IO_APIC_setup(void); ...@@ -188,17 +188,14 @@ extern void restore_IO_APIC_setup(void);
extern void reinit_intr_remapped_IO_APIC(int); extern void reinit_intr_remapped_IO_APIC(int);
#endif #endif
extern int probe_nr_irqs(void); extern void probe_nr_irqs_gsi(void);
#else /* !CONFIG_X86_IO_APIC */ #else /* !CONFIG_X86_IO_APIC */
#define io_apic_assign_pci_irqs 0 #define io_apic_assign_pci_irqs 0
static const int timer_through_8259 = 0; static const int timer_through_8259 = 0;
static inline void ioapic_init_mappings(void) { } static inline void ioapic_init_mappings(void) { }
static inline int probe_nr_irqs(void) static inline void probe_nr_irqs_gsi(void) { }
{
return NR_IRQS;
}
#endif #endif
#endif /* _ASM_X86_IO_APIC_H */ #endif /* _ASM_X86_IO_APIC_H */
...@@ -101,12 +101,23 @@ ...@@ -101,12 +101,23 @@
#define LAST_VM86_IRQ 15 #define LAST_VM86_IRQ 15
#define invalid_vm86_irq(irq) ((irq) < 3 || (irq) > 15) #define invalid_vm86_irq(irq) ((irq) < 3 || (irq) > 15)
#define NR_IRQS_LEGACY 16
#if defined(CONFIG_X86_IO_APIC) && !defined(CONFIG_X86_VOYAGER) #if defined(CONFIG_X86_IO_APIC) && !defined(CONFIG_X86_VOYAGER)
#ifndef CONFIG_SPARSE_IRQ
# if NR_CPUS < MAX_IO_APICS # if NR_CPUS < MAX_IO_APICS
# define NR_IRQS (NR_VECTORS + (32 * NR_CPUS)) # define NR_IRQS (NR_VECTORS + (32 * NR_CPUS))
# else # else
# define NR_IRQS (NR_VECTORS + (32 * MAX_IO_APICS)) # define NR_IRQS (NR_VECTORS + (32 * MAX_IO_APICS))
# endif # endif
#else
# if (8 * NR_CPUS) > (32 * MAX_IO_APICS)
# define NR_IRQS (NR_VECTORS + (8 * NR_CPUS))
# else
# define NR_IRQS (NR_VECTORS + (32 * MAX_IO_APICS))
# endif
#endif
#elif defined(CONFIG_X86_VOYAGER) #elif defined(CONFIG_X86_VOYAGER)
......
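With CONFIG_SPARSE_IRQ the static table only has to cover the worst-case seed, so NR_IRQS becomes max(8 * NR_CPUS, 32 * MAX_IO_APICS) on top of NR_VECTORS, instead of scaling with min(NR_CPUS, MAX_IO_APICS) as before. A worked example, assuming the typical values NR_VECTORS = 256 and MAX_IO_APICS = 128:

	NR_CPUS = 8:    8 * 8    = 64    <= 32 * 128 = 4096  ->  NR_IRQS = 256 + 4096
	NR_CPUS = 4096: 8 * 4096 = 32768 >  32 * 128 = 4096  ->  NR_IRQS = 256 + 32768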
...@@ -32,11 +32,13 @@ static inline cpumask_t target_cpus(void) ...@@ -32,11 +32,13 @@ static inline cpumask_t target_cpus(void)
#define vector_allocation_domain (genapic->vector_allocation_domain) #define vector_allocation_domain (genapic->vector_allocation_domain)
#define read_apic_id() (GET_APIC_ID(apic_read(APIC_ID))) #define read_apic_id() (GET_APIC_ID(apic_read(APIC_ID)))
#define send_IPI_self (genapic->send_IPI_self) #define send_IPI_self (genapic->send_IPI_self)
#define wakeup_secondary_cpu (genapic->wakeup_cpu)
extern void setup_apic_routing(void); extern void setup_apic_routing(void);
#else #else
#define INT_DELIVERY_MODE dest_LowestPrio #define INT_DELIVERY_MODE dest_LowestPrio
#define INT_DEST_MODE 1 /* logical delivery broadcast to all procs */ #define INT_DEST_MODE 1 /* logical delivery broadcast to all procs */
#define TARGET_CPUS (target_cpus()) #define TARGET_CPUS (target_cpus())
#define wakeup_secondary_cpu wakeup_secondary_cpu_via_init
/* /*
* Set up the logical destination ID. * Set up the logical destination ID.
* *
......
#ifndef _ASM_X86_MACH_DEFAULT_MACH_WAKECPU_H #ifndef _ASM_X86_MACH_DEFAULT_MACH_WAKECPU_H
#define _ASM_X86_MACH_DEFAULT_MACH_WAKECPU_H #define _ASM_X86_MACH_DEFAULT_MACH_WAKECPU_H
/* #define TRAMPOLINE_PHYS_LOW (0x467)
* This file copes with machines that wakeup secondary CPUs by the #define TRAMPOLINE_PHYS_HIGH (0x469)
* INIT, INIT, STARTUP sequence.
*/
#define WAKE_SECONDARY_VIA_INIT
#define TRAMPOLINE_LOW phys_to_virt(0x467)
#define TRAMPOLINE_HIGH phys_to_virt(0x469)
#define boot_cpu_apicid boot_cpu_physical_apicid
static inline void wait_for_init_deassert(atomic_t *deassert) static inline void wait_for_init_deassert(atomic_t *deassert)
{ {
...@@ -33,9 +24,12 @@ static inline void restore_NMI_vector(unsigned short *high, unsigned short *low) ...@@ -33,9 +24,12 @@ static inline void restore_NMI_vector(unsigned short *high, unsigned short *low)
{ {
} }
#define inquire_remote_apic(apicid) do { \ extern void __inquire_remote_apic(int apicid);
if (apic_verbosity >= APIC_DEBUG) \
__inquire_remote_apic(apicid); \ static inline void inquire_remote_apic(int apicid)
} while (0) {
if (apic_verbosity >= APIC_DEBUG)
__inquire_remote_apic(apicid);
}
#endif /* _ASM_X86_MACH_DEFAULT_MACH_WAKECPU_H */ #endif /* _ASM_X86_MACH_DEFAULT_MACH_WAKECPU_H */
...@@ -13,9 +13,11 @@ static inline void smpboot_setup_warm_reset_vector(unsigned long start_eip) ...@@ -13,9 +13,11 @@ static inline void smpboot_setup_warm_reset_vector(unsigned long start_eip)
CMOS_WRITE(0xa, 0xf); CMOS_WRITE(0xa, 0xf);
local_flush_tlb(); local_flush_tlb();
pr_debug("1.\n"); pr_debug("1.\n");
*((volatile unsigned short *) TRAMPOLINE_HIGH) = start_eip >> 4; *((volatile unsigned short *)phys_to_virt(TRAMPOLINE_PHYS_HIGH)) =
start_eip >> 4;
pr_debug("2.\n"); pr_debug("2.\n");
*((volatile unsigned short *) TRAMPOLINE_LOW) = start_eip & 0xf; *((volatile unsigned short *)phys_to_virt(TRAMPOLINE_PHYS_LOW)) =
start_eip & 0xf;
pr_debug("3.\n"); pr_debug("3.\n");
} }
...@@ -32,7 +34,7 @@ static inline void smpboot_restore_warm_reset_vector(void) ...@@ -32,7 +34,7 @@ static inline void smpboot_restore_warm_reset_vector(void)
*/ */
CMOS_WRITE(0, 0xf); CMOS_WRITE(0, 0xf);
*((volatile long *) phys_to_virt(0x467)) = 0; *((volatile long *)phys_to_virt(TRAMPOLINE_PHYS_LOW)) = 0;
} }
static inline void __init smpboot_setup_io_apic(void) static inline void __init smpboot_setup_io_apic(void)
......
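The rewrite above spells out what the warm-reset vector actually is: a real-mode far pointer stored at physical 0x467 (offset) and 0x469 (segment), so the trampoline address is split as segment = start_eip >> 4 and offset = start_eip & 0xf. For example, a trampoline at physical 0x9f000 is stored as 9F00:0000, since 0x9f000 >> 4 = 0x9f00 and 0x9f000 & 0xf = 0.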
...@@ -27,6 +27,7 @@ ...@@ -27,6 +27,7 @@
#define vector_allocation_domain (genapic->vector_allocation_domain) #define vector_allocation_domain (genapic->vector_allocation_domain)
#define enable_apic_mode (genapic->enable_apic_mode) #define enable_apic_mode (genapic->enable_apic_mode)
#define phys_pkg_id (genapic->phys_pkg_id) #define phys_pkg_id (genapic->phys_pkg_id)
#define wakeup_secondary_cpu (genapic->wakeup_cpu)
extern void generic_bigsmp_probe(void); extern void generic_bigsmp_probe(void);
......
#ifndef _ASM_X86_MACH_GENERIC_MACH_WAKECPU_H
#define _ASM_X86_MACH_GENERIC_MACH_WAKECPU_H
#define TRAMPOLINE_PHYS_LOW (genapic->trampoline_phys_low)
#define TRAMPOLINE_PHYS_HIGH (genapic->trampoline_phys_high)
#define wait_for_init_deassert (genapic->wait_for_init_deassert)
#define smp_callin_clear_local_apic (genapic->smp_callin_clear_local_apic)
#define store_NMI_vector (genapic->store_NMI_vector)
#define restore_NMI_vector (genapic->restore_NMI_vector)
#define inquire_remote_apic (genapic->inquire_remote_apic)
#endif /* _ASM_X86_MACH_GENERIC_MACH_APIC_H */
...@@ -3,12 +3,8 @@ ...@@ -3,12 +3,8 @@
/* This file copes with machines that wake up secondary CPUs by NMIs */ /* This file copes with machines that wake up secondary CPUs by NMIs */
#define WAKE_SECONDARY_VIA_NMI #define TRAMPOLINE_PHYS_LOW (0x8)
#define TRAMPOLINE_PHYS_HIGH (0xa)
#define TRAMPOLINE_LOW phys_to_virt(0x8)
#define TRAMPOLINE_HIGH phys_to_virt(0xa)
#define boot_cpu_apicid boot_cpu_logical_apicid
/* We don't do anything here because we use NMIs to boot instead */ /* We don't do anything here because we use NMIs to boot instead */
static inline void wait_for_init_deassert(atomic_t *deassert) static inline void wait_for_init_deassert(atomic_t *deassert)
...@@ -27,17 +23,23 @@ static inline void smp_callin_clear_local_apic(void) ...@@ -27,17 +23,23 @@ static inline void smp_callin_clear_local_apic(void)
static inline void store_NMI_vector(unsigned short *high, unsigned short *low) static inline void store_NMI_vector(unsigned short *high, unsigned short *low)
{ {
printk("Storing NMI vector\n"); printk("Storing NMI vector\n");
*high = *((volatile unsigned short *) TRAMPOLINE_HIGH); *high =
*low = *((volatile unsigned short *) TRAMPOLINE_LOW); *((volatile unsigned short *)phys_to_virt(TRAMPOLINE_PHYS_HIGH));
*low =
*((volatile unsigned short *)phys_to_virt(TRAMPOLINE_PHYS_LOW));
} }
static inline void restore_NMI_vector(unsigned short *high, unsigned short *low) static inline void restore_NMI_vector(unsigned short *high, unsigned short *low)
{ {
printk("Restoring NMI vector\n"); printk("Restoring NMI vector\n");
*((volatile unsigned short *) TRAMPOLINE_HIGH) = *high; *((volatile unsigned short *)phys_to_virt(TRAMPOLINE_PHYS_HIGH)) =
*((volatile unsigned short *) TRAMPOLINE_LOW) = *low; *high;
*((volatile unsigned short *)phys_to_virt(TRAMPOLINE_PHYS_LOW)) =
*low;
} }
#define inquire_remote_apic(apicid) {} static inline void inquire_remote_apic(int apicid)
{
}
#endif /* __ASM_NUMAQ_WAKECPU_H */ #endif /* __ASM_NUMAQ_WAKECPU_H */
...@@ -16,6 +16,8 @@ static inline void visws_early_detect(void) { } ...@@ -16,6 +16,8 @@ static inline void visws_early_detect(void) { }
static inline int is_visws_box(void) { return 0; } static inline int is_visws_box(void) { return 0; }
#endif #endif
extern int wakeup_secondary_cpu_via_nmi(int apicid, unsigned long start_eip);
extern int wakeup_secondary_cpu_via_init(int apicid, unsigned long start_eip);
/* /*
* Any setup quirks to be performed? * Any setup quirks to be performed?
*/ */
...@@ -39,6 +41,7 @@ struct x86_quirks { ...@@ -39,6 +41,7 @@ struct x86_quirks {
void (*smp_read_mpc_oem)(struct mp_config_oemtable *oemtable, void (*smp_read_mpc_oem)(struct mp_config_oemtable *oemtable,
unsigned short oemsize); unsigned short oemsize);
int (*setup_ioapic_ids)(void); int (*setup_ioapic_ids)(void);
int (*update_genapic)(void);
}; };
extern struct x86_quirks *x86_quirks; extern struct x86_quirks *x86_quirks;
......
...@@ -314,6 +314,8 @@ extern void free_init_pages(char *what, unsigned long begin, unsigned long end); ...@@ -314,6 +314,8 @@ extern void free_init_pages(char *what, unsigned long begin, unsigned long end);
void default_idle(void); void default_idle(void);
void stop_this_cpu(void *dummy);
/* /*
* Force strict CPU ordering. * Force strict CPU ordering.
* And yes, this is required on UP too when we're talking * And yes, this is required on UP too when we're talking
......
...@@ -20,6 +20,8 @@ ...@@ -20,6 +20,8 @@
struct task_struct; struct task_struct;
struct exec_domain; struct exec_domain;
#include <asm/processor.h> #include <asm/processor.h>
#include <asm/ftrace.h>
#include <asm/atomic.h>
struct thread_info { struct thread_info {
struct task_struct *task; /* main task structure */ struct task_struct *task; /* main task structure */
......
...@@ -157,6 +157,7 @@ extern int __get_user_bad(void); ...@@ -157,6 +157,7 @@ extern int __get_user_bad(void);
int __ret_gu; \ int __ret_gu; \
unsigned long __val_gu; \ unsigned long __val_gu; \
__chk_user_ptr(ptr); \ __chk_user_ptr(ptr); \
might_fault(); \
switch (sizeof(*(ptr))) { \ switch (sizeof(*(ptr))) { \
case 1: \ case 1: \
__get_user_x(1, __ret_gu, __val_gu, ptr); \ __get_user_x(1, __ret_gu, __val_gu, ptr); \
...@@ -241,6 +242,7 @@ extern void __put_user_8(void); ...@@ -241,6 +242,7 @@ extern void __put_user_8(void);
int __ret_pu; \ int __ret_pu; \
__typeof__(*(ptr)) __pu_val; \ __typeof__(*(ptr)) __pu_val; \
__chk_user_ptr(ptr); \ __chk_user_ptr(ptr); \
might_fault(); \
__pu_val = x; \ __pu_val = x; \
switch (sizeof(*(ptr))) { \ switch (sizeof(*(ptr))) { \
case 1: \ case 1: \
......
...@@ -82,8 +82,8 @@ __copy_to_user_inatomic(void __user *to, const void *from, unsigned long n) ...@@ -82,8 +82,8 @@ __copy_to_user_inatomic(void __user *to, const void *from, unsigned long n)
static __always_inline unsigned long __must_check static __always_inline unsigned long __must_check
__copy_to_user(void __user *to, const void *from, unsigned long n) __copy_to_user(void __user *to, const void *from, unsigned long n)
{ {
might_sleep(); might_fault();
return __copy_to_user_inatomic(to, from, n); return __copy_to_user_inatomic(to, from, n);
} }
static __always_inline unsigned long static __always_inline unsigned long
...@@ -137,7 +137,7 @@ __copy_from_user_inatomic(void *to, const void __user *from, unsigned long n) ...@@ -137,7 +137,7 @@ __copy_from_user_inatomic(void *to, const void __user *from, unsigned long n)
static __always_inline unsigned long static __always_inline unsigned long
__copy_from_user(void *to, const void __user *from, unsigned long n) __copy_from_user(void *to, const void __user *from, unsigned long n)
{ {
might_sleep(); might_fault();
if (__builtin_constant_p(n)) { if (__builtin_constant_p(n)) {
unsigned long ret; unsigned long ret;
...@@ -159,7 +159,7 @@ __copy_from_user(void *to, const void __user *from, unsigned long n) ...@@ -159,7 +159,7 @@ __copy_from_user(void *to, const void __user *from, unsigned long n)
static __always_inline unsigned long __copy_from_user_nocache(void *to, static __always_inline unsigned long __copy_from_user_nocache(void *to,
const void __user *from, unsigned long n) const void __user *from, unsigned long n)
{ {
might_sleep(); might_fault();
if (__builtin_constant_p(n)) { if (__builtin_constant_p(n)) {
unsigned long ret; unsigned long ret;
......
...@@ -29,6 +29,8 @@ static __always_inline __must_check ...@@ -29,6 +29,8 @@ static __always_inline __must_check
int __copy_from_user(void *dst, const void __user *src, unsigned size) int __copy_from_user(void *dst, const void __user *src, unsigned size)
{ {
int ret = 0; int ret = 0;
might_fault();
if (!__builtin_constant_p(size)) if (!__builtin_constant_p(size))
return copy_user_generic(dst, (__force void *)src, size); return copy_user_generic(dst, (__force void *)src, size);
switch (size) { switch (size) {
...@@ -71,6 +73,8 @@ static __always_inline __must_check ...@@ -71,6 +73,8 @@ static __always_inline __must_check
int __copy_to_user(void __user *dst, const void *src, unsigned size) int __copy_to_user(void __user *dst, const void *src, unsigned size)
{ {
int ret = 0; int ret = 0;
might_fault();
if (!__builtin_constant_p(size)) if (!__builtin_constant_p(size))
return copy_user_generic((__force void *)dst, src, size); return copy_user_generic((__force void *)dst, src, size);
switch (size) { switch (size) {
...@@ -113,6 +117,8 @@ static __always_inline __must_check ...@@ -113,6 +117,8 @@ static __always_inline __must_check
int __copy_in_user(void __user *dst, const void __user *src, unsigned size) int __copy_in_user(void __user *dst, const void __user *src, unsigned size)
{ {
int ret = 0; int ret = 0;
might_fault();
if (!__builtin_constant_p(size)) if (!__builtin_constant_p(size))
return copy_user_generic((__force void *)dst, return copy_user_generic((__force void *)dst,
(__force void *)src, size); (__force void *)src, size);
......
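These hunks swap might_sleep() for might_fault() at every spot that may touch user memory: the access can still sleep, but under lockdep it can additionally be checked against mmap_sem. A sketch of what the annotation expands to, assuming its definition from the same patch series (it is not part of this hunk, so treat the details as approximate):

/* Approximate sketch of might_fault(); the real definition lives in
 * linux/kernel.h and mm/memory.c in this series. */
#ifdef CONFIG_PROVE_LOCKING
void might_fault(void)
{
	might_sleep();			/* a fault may schedule */
	if (current->mm)		/* kernel threads have no mm */
		might_lock_read(&current->mm->mmap_sem);
}
#else
# define might_fault() might_sleep()
#endif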
...@@ -25,7 +25,7 @@ CFLAGS_tsc.o := $(nostackp) ...@@ -25,7 +25,7 @@ CFLAGS_tsc.o := $(nostackp)
obj-y := process_$(BITS).o signal_$(BITS).o entry_$(BITS).o obj-y := process_$(BITS).o signal_$(BITS).o entry_$(BITS).o
obj-y += traps.o irq.o irq_$(BITS).o dumpstack_$(BITS).o obj-y += traps.o irq.o irq_$(BITS).o dumpstack_$(BITS).o
obj-y += time_$(BITS).o ioport.o ldt.o obj-y += time_$(BITS).o ioport.o ldt.o dumpstack.o
obj-y += setup.o i8259.o irqinit_$(BITS).o setup_percpu.o obj-y += setup.o i8259.o irqinit_$(BITS).o setup_percpu.o
obj-$(CONFIG_X86_VISWS) += visws_quirks.o obj-$(CONFIG_X86_VISWS) += visws_quirks.o
obj-$(CONFIG_X86_32) += probe_roms_32.o obj-$(CONFIG_X86_32) += probe_roms_32.o
...@@ -65,6 +65,7 @@ obj-$(CONFIG_X86_LOCAL_APIC) += apic.o nmi.o ...@@ -65,6 +65,7 @@ obj-$(CONFIG_X86_LOCAL_APIC) += apic.o nmi.o
obj-$(CONFIG_X86_IO_APIC) += io_apic.o obj-$(CONFIG_X86_IO_APIC) += io_apic.o
obj-$(CONFIG_X86_REBOOTFIXUPS) += reboot_fixups_32.o obj-$(CONFIG_X86_REBOOTFIXUPS) += reboot_fixups_32.o
obj-$(CONFIG_DYNAMIC_FTRACE) += ftrace.o obj-$(CONFIG_DYNAMIC_FTRACE) += ftrace.o
obj-$(CONFIG_FUNCTION_GRAPH_TRACER) += ftrace.o
obj-$(CONFIG_KEXEC) += machine_kexec_$(BITS).o obj-$(CONFIG_KEXEC) += machine_kexec_$(BITS).o
obj-$(CONFIG_KEXEC) += relocate_kernel_$(BITS).o crash.o obj-$(CONFIG_KEXEC) += relocate_kernel_$(BITS).o crash.o
obj-$(CONFIG_CRASH_DUMP) += crash_dump_$(BITS).o obj-$(CONFIG_CRASH_DUMP) += crash_dump_$(BITS).o
......
...@@ -1360,6 +1360,17 @@ static void __init acpi_process_madt(void) ...@@ -1360,6 +1360,17 @@ static void __init acpi_process_madt(void)
disable_acpi(); disable_acpi();
} }
} }
/*
* ACPI supports both logical (e.g. Hyper-Threading) and physical
* processors, whereas MPS only supports physical.
*/
if (acpi_lapic && acpi_ioapic)
printk(KERN_INFO "Using ACPI (MADT) for SMP configuration "
"information\n");
else if (acpi_lapic)
printk(KERN_INFO "Using ACPI for processor (LAPIC) "
"configuration information\n");
#endif #endif
return; return;
} }
......
...@@ -391,11 +391,7 @@ static int power_off; ...@@ -391,11 +391,7 @@ static int power_off;
#else #else
static int power_off = 1; static int power_off = 1;
#endif #endif
#ifdef CONFIG_APM_REAL_MODE_POWER_OFF
static int realmode_power_off = 1;
#else
static int realmode_power_off; static int realmode_power_off;
#endif
#ifdef CONFIG_APM_ALLOW_INTS #ifdef CONFIG_APM_ALLOW_INTS
static int allow_ints = 1; static int allow_ints = 1;
#else #else
......
...@@ -33,6 +33,7 @@ ...@@ -33,6 +33,7 @@
#include <linux/cpufreq.h> #include <linux/cpufreq.h>
#include <linux/compiler.h> #include <linux/compiler.h>
#include <linux/dmi.h> #include <linux/dmi.h>
#include <linux/ftrace.h>
#include <linux/acpi.h> #include <linux/acpi.h>
#include <acpi/processor.h> #include <acpi/processor.h>
...@@ -391,6 +392,7 @@ static int acpi_cpufreq_target(struct cpufreq_policy *policy, ...@@ -391,6 +392,7 @@ static int acpi_cpufreq_target(struct cpufreq_policy *policy,
unsigned int next_perf_state = 0; /* Index into perf table */ unsigned int next_perf_state = 0; /* Index into perf table */
unsigned int i; unsigned int i;
int result = 0; int result = 0;
struct power_trace it;
dprintk("acpi_cpufreq_target %d (%d)\n", target_freq, policy->cpu); dprintk("acpi_cpufreq_target %d (%d)\n", target_freq, policy->cpu);
...@@ -427,6 +429,8 @@ static int acpi_cpufreq_target(struct cpufreq_policy *policy, ...@@ -427,6 +429,8 @@ static int acpi_cpufreq_target(struct cpufreq_policy *policy,
} }
} }
trace_power_mark(&it, POWER_PSTATE, next_perf_state);
switch (data->cpu_feature) { switch (data->cpu_feature) {
case SYSTEM_INTEL_MSR_CAPABLE: case SYSTEM_INTEL_MSR_CAPABLE:
cmd.type = SYSTEM_INTEL_MSR_CAPABLE; cmd.type = SYSTEM_INTEL_MSR_CAPABLE;
......
...@@ -307,12 +307,11 @@ static void __cpuinit init_intel(struct cpuinfo_x86 *c) ...@@ -307,12 +307,11 @@ static void __cpuinit init_intel(struct cpuinfo_x86 *c)
set_cpu_cap(c, X86_FEATURE_P4); set_cpu_cap(c, X86_FEATURE_P4);
if (c->x86 == 6) if (c->x86 == 6)
set_cpu_cap(c, X86_FEATURE_P3); set_cpu_cap(c, X86_FEATURE_P3);
#endif
if (cpu_has_bts) if (cpu_has_bts)
ptrace_bts_init_intel(c); ptrace_bts_init_intel(c);
#endif
detect_extended_topology(c); detect_extended_topology(c);
if (!cpu_has(c, X86_FEATURE_XTOPOLOGY)) { if (!cpu_has(c, X86_FEATURE_XTOPOLOGY)) {
/* /*
......
/*
* Copyright (C) 1991, 1992 Linus Torvalds
* Copyright (C) 2000, 2001, 2002 Andi Kleen, SuSE Labs
*/
#include <linux/kallsyms.h>
#include <linux/kprobes.h>
#include <linux/uaccess.h>
#include <linux/utsname.h>
#include <linux/hardirq.h>
#include <linux/kdebug.h>
#include <linux/module.h>
#include <linux/ptrace.h>
#include <linux/kexec.h>
#include <linux/bug.h>
#include <linux/nmi.h>
#include <linux/sysfs.h>
#include <asm/stacktrace.h>
#include "dumpstack.h"
int panic_on_unrecovered_nmi;
unsigned int code_bytes = 64;
int kstack_depth_to_print = 3 * STACKSLOTS_PER_LINE;
static int die_counter;
void printk_address(unsigned long address, int reliable)
{
printk(" [<%p>] %s%pS\n", (void *) address,
reliable ? "" : "? ", (void *) address);
}
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
static void
print_ftrace_graph_addr(unsigned long addr, void *data,
const struct stacktrace_ops *ops,
struct thread_info *tinfo, int *graph)
{
struct task_struct *task = tinfo->task;
unsigned long ret_addr;
int index = task->curr_ret_stack;
/* only addresses diverted to the graph-tracer trampoline need fixing up */
if (addr != (unsigned long)return_to_handler)
return;
if (!task->ret_stack || index < *graph)
return;
/*
 * substitute the real return address saved on function entry; *graph
 * counts how many trampoline slots this trace has already consumed
 */
index -= *graph;
ret_addr = task->ret_stack[index].ret;
ops->address(data, ret_addr, 1);
(*graph)++;
}
#else
static inline void
print_ftrace_graph_addr(unsigned long addr, void *data,
const struct stacktrace_ops *ops,
struct thread_info *tinfo, int *graph)
{ }
#endif
/*
* x86-64 can have up to three kernel stacks:
* process stack
* interrupt stack
* severe exception (double fault, nmi, stack fault, debug, mce) hardware stack
*/
static inline int valid_stack_ptr(struct thread_info *tinfo,
void *p, unsigned int size, void *end)
{
void *t = tinfo;
if (end) {
/* exception/IRQ stack: p must lie in the THREAD_SIZE window ending at 'end' */
if (p < end && p >= (end-THREAD_SIZE))
return 1;
else
return 0;
}
/* process stack: p must stay inside this thread's stack, with room for a 'size'-byte read */
return p > t && p < t + THREAD_SIZE - size;
}
unsigned long
print_context_stack(struct thread_info *tinfo,
unsigned long *stack, unsigned long bp,
const struct stacktrace_ops *ops, void *data,
unsigned long *end, int *graph)
{
struct stack_frame *frame = (struct stack_frame *)bp;
while (valid_stack_ptr(tinfo, stack, sizeof(*stack), end)) {
unsigned long addr;
addr = *stack;
if (__kernel_text_address(addr)) {
if ((unsigned long) stack == bp + sizeof(long)) {
ops->address(data, addr, 1);
frame = frame->next_frame;
bp = (unsigned long) frame;
} else {
ops->address(data, addr, bp == 0);
}
print_ftrace_graph_addr(addr, data, ops, tinfo, graph);
}
stack++;
}
return bp;
}
static void
print_trace_warning_symbol(void *data, char *msg, unsigned long symbol)
{
printk(data);
print_symbol(msg, symbol);
printk("\n");
}
static void print_trace_warning(void *data, char *msg)
{
printk("%s%s\n", (char *)data, msg);
}
static int print_trace_stack(void *data, char *name)
{
printk("%s <%s> ", (char *)data, name);
return 0;
}
/*
* Print one address/symbol entry per line.
*/
static void print_trace_address(void *data, unsigned long addr, int reliable)
{
touch_nmi_watchdog();
printk(data);
printk_address(addr, reliable);
}
static const struct stacktrace_ops print_trace_ops = {
.warning = print_trace_warning,
.warning_symbol = print_trace_warning_symbol,
.stack = print_trace_stack,
.address = print_trace_address,
};
void
show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs,
unsigned long *stack, unsigned long bp, char *log_lvl)
{
printk("%sCall Trace:\n", log_lvl);
dump_trace(task, regs, stack, bp, &print_trace_ops, log_lvl);
}
void show_trace(struct task_struct *task, struct pt_regs *regs,
unsigned long *stack, unsigned long bp)
{
show_trace_log_lvl(task, regs, stack, bp, "");
}
void show_stack(struct task_struct *task, unsigned long *sp)
{
show_stack_log_lvl(task, NULL, sp, 0, "");
}
/*
* The architecture-independent dump_stack generator
*/
void dump_stack(void)
{
unsigned long bp = 0;
unsigned long stack;
#ifdef CONFIG_FRAME_POINTER
if (!bp)
get_bp(bp);
#endif
printk("Pid: %d, comm: %.20s %s %s %.*s\n",
current->pid, current->comm, print_tainted(),
init_utsname()->release,
(int)strcspn(init_utsname()->version, " "),
init_utsname()->version);
show_trace(NULL, NULL, &stack, bp);
}
EXPORT_SYMBOL(dump_stack);
static raw_spinlock_t die_lock = __RAW_SPIN_LOCK_UNLOCKED;
static int die_owner = -1;
static unsigned int die_nest_count;
unsigned __kprobes long oops_begin(void)
{
int cpu;
unsigned long flags;
oops_enter();
/* racy, but better than risking deadlock. */
raw_local_irq_save(flags);
cpu = smp_processor_id();
if (!__raw_spin_trylock(&die_lock)) {
if (cpu == die_owner)
/* nested oops. should stop eventually */;
else
__raw_spin_lock(&die_lock);
}
die_nest_count++;
die_owner = cpu;
console_verbose();
bust_spinlocks(1);
return flags;
}
void __kprobes oops_end(unsigned long flags, struct pt_regs *regs, int signr)
{
if (regs && kexec_should_crash(current))
crash_kexec(regs);
bust_spinlocks(0);
die_owner = -1;
add_taint(TAINT_DIE);
die_nest_count--;
if (!die_nest_count)
/* Nest count reaches zero, release the lock. */
__raw_spin_unlock(&die_lock);
raw_local_irq_restore(flags);
oops_exit();
if (!signr)
return;
if (in_interrupt())
panic("Fatal exception in interrupt");
if (panic_on_oops)
panic("Fatal exception");
do_exit(signr);
}
int __kprobes __die(const char *str, struct pt_regs *regs, long err)
{
#ifdef CONFIG_X86_32
unsigned short ss;
unsigned long sp;
#endif
printk(KERN_EMERG "%s: %04lx [#%d] ", str, err & 0xffff, ++die_counter);
#ifdef CONFIG_PREEMPT
printk("PREEMPT ");
#endif
#ifdef CONFIG_SMP
printk("SMP ");
#endif
#ifdef CONFIG_DEBUG_PAGEALLOC
printk("DEBUG_PAGEALLOC");
#endif
printk("\n");
sysfs_printk_last_file();
if (notify_die(DIE_OOPS, str, regs, err,
current->thread.trap_no, SIGSEGV) == NOTIFY_STOP)
return 1;
show_registers(regs);
#ifdef CONFIG_X86_32
sp = (unsigned long) (&regs->sp);
savesegment(ss, ss);
if (user_mode(regs)) {
sp = regs->sp;
ss = regs->ss & 0xffff;
}
printk(KERN_EMERG "EIP: [<%08lx>] ", regs->ip);
print_symbol("%s", regs->ip);
printk(" SS:ESP %04x:%08lx\n", ss, sp);
#else
/* Executive summary in case the oops scrolled away */
printk(KERN_ALERT "RIP ");
printk_address(regs->ip, 1);
printk(" RSP <%016lx>\n", regs->sp);
#endif
return 0;
}
/*
* This is reached when something in the kernel has done something bad
* and is about to be terminated:
*/
void die(const char *str, struct pt_regs *regs, long err)
{
unsigned long flags = oops_begin();
int sig = SIGSEGV;
if (!user_mode_vm(regs))
report_bug(regs->ip, regs);
if (__die(str, regs, err))
sig = 0;
oops_end(flags, regs, sig);
}
void notrace __kprobes
die_nmi(char *str, struct pt_regs *regs, int do_panic)
{
unsigned long flags;
if (notify_die(DIE_NMIWATCHDOG, str, regs, 0, 2, SIGINT) == NOTIFY_STOP)
return;
/*
* We are in trouble anyway, let's at least try
* to get a message out.
*/
flags = oops_begin();
printk(KERN_EMERG "%s", str);
printk(" on CPU%d, ip %08lx, registers:\n",
smp_processor_id(), regs->ip);
show_registers(regs);
oops_end(flags, regs, 0);
if (do_panic || panic_on_oops)
panic("Non maskable interrupt");
nmi_exit();
local_irq_enable();
do_exit(SIGBUS);
}
static int __init oops_setup(char *s)
{
if (!s)
return -EINVAL;
if (!strcmp(s, "panic"))
panic_on_oops = 1;
return 0;
}
early_param("oops", oops_setup);
static int __init kstack_setup(char *s)
{
if (!s)
return -EINVAL;
kstack_depth_to_print = simple_strtoul(s, NULL, 0);
return 0;
}
early_param("kstack", kstack_setup);
static int __init code_bytes_setup(char *s)
{
code_bytes = simple_strtoul(s, NULL, 0);
if (code_bytes > 8192)
code_bytes = 8192;
return 1;
}
__setup("code_bytes=", code_bytes_setup);
/*
* Copyright (C) 1991, 1992 Linus Torvalds
* Copyright (C) 2000, 2001, 2002 Andi Kleen, SuSE Labs
*/
#ifndef DUMPSTACK_H
#define DUMPSTACK_H
#ifdef CONFIG_X86_32
#define STACKSLOTS_PER_LINE 8
#define get_bp(bp) asm("movl %%ebp, %0" : "=r" (bp) :)
#else
#define STACKSLOTS_PER_LINE 4
#define get_bp(bp) asm("movq %%rbp, %0" : "=r" (bp) :)
#endif
extern unsigned long
print_context_stack(struct thread_info *tinfo,
unsigned long *stack, unsigned long bp,
const struct stacktrace_ops *ops, void *data,
unsigned long *end, int *graph);
extern void
show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs,
unsigned long *stack, unsigned long bp, char *log_lvl);
extern void
show_stack_log_lvl(struct task_struct *task, struct pt_regs *regs,
unsigned long *sp, unsigned long bp, char *log_lvl);
extern unsigned int code_bytes;
extern int kstack_depth_to_print;
/* The form of the top of the frame on the stack */
struct stack_frame {
struct stack_frame *next_frame;
unsigned long return_address;
};
#endif
...@@ -17,69 +17,14 @@ ...@@ -17,69 +17,14 @@
#include <asm/stacktrace.h> #include <asm/stacktrace.h>
#define STACKSLOTS_PER_LINE 8 #include "dumpstack.h"
#define get_bp(bp) asm("movl %%ebp, %0" : "=r" (bp) :)
int panic_on_unrecovered_nmi;
int kstack_depth_to_print = 3 * STACKSLOTS_PER_LINE;
static unsigned int code_bytes = 64;
static int die_counter;
void printk_address(unsigned long address, int reliable)
{
printk(" [<%p>] %s%pS\n", (void *) address,
reliable ? "" : "? ", (void *) address);
}
static inline int valid_stack_ptr(struct thread_info *tinfo,
void *p, unsigned int size, void *end)
{
void *t = tinfo;
if (end) {
if (p < end && p >= (end-THREAD_SIZE))
return 1;
else
return 0;
}
return p > t && p < t + THREAD_SIZE - size;
}
/* The form of the top of the frame on the stack */
struct stack_frame {
struct stack_frame *next_frame;
unsigned long return_address;
};
static inline unsigned long
print_context_stack(struct thread_info *tinfo,
unsigned long *stack, unsigned long bp,
const struct stacktrace_ops *ops, void *data,
unsigned long *end)
{
struct stack_frame *frame = (struct stack_frame *)bp;
while (valid_stack_ptr(tinfo, stack, sizeof(*stack), end)) {
unsigned long addr;
addr = *stack;
if (__kernel_text_address(addr)) {
if ((unsigned long) stack == bp + sizeof(long)) {
ops->address(data, addr, 1);
frame = frame->next_frame;
bp = (unsigned long) frame;
} else {
ops->address(data, addr, bp == 0);
}
}
stack++;
}
return bp;
}
void dump_trace(struct task_struct *task, struct pt_regs *regs, void dump_trace(struct task_struct *task, struct pt_regs *regs,
unsigned long *stack, unsigned long bp, unsigned long *stack, unsigned long bp,
const struct stacktrace_ops *ops, void *data) const struct stacktrace_ops *ops, void *data)
{ {
int graph = 0;
if (!task) if (!task)
task = current; task = current;
...@@ -107,7 +52,8 @@ void dump_trace(struct task_struct *task, struct pt_regs *regs, ...@@ -107,7 +52,8 @@ void dump_trace(struct task_struct *task, struct pt_regs *regs,
context = (struct thread_info *) context = (struct thread_info *)
((unsigned long)stack & (~(THREAD_SIZE - 1))); ((unsigned long)stack & (~(THREAD_SIZE - 1)));
bp = print_context_stack(context, stack, bp, ops, data, NULL); bp = print_context_stack(context, stack, bp, ops,
data, NULL, &graph);
stack = (unsigned long *)context->previous_esp; stack = (unsigned long *)context->previous_esp;
if (!stack) if (!stack)
...@@ -119,57 +65,7 @@ void dump_trace(struct task_struct *task, struct pt_regs *regs, ...@@ -119,57 +65,7 @@ void dump_trace(struct task_struct *task, struct pt_regs *regs,
} }
EXPORT_SYMBOL(dump_trace); EXPORT_SYMBOL(dump_trace);
static void void
print_trace_warning_symbol(void *data, char *msg, unsigned long symbol)
{
printk(data);
print_symbol(msg, symbol);
printk("\n");
}
static void print_trace_warning(void *data, char *msg)
{
printk("%s%s\n", (char *)data, msg);
}
static int print_trace_stack(void *data, char *name)
{
printk("%s <%s> ", (char *)data, name);
return 0;
}
/*
* Print one address/symbol entry per line.
*/
static void print_trace_address(void *data, unsigned long addr, int reliable)
{
touch_nmi_watchdog();
printk(data);
printk_address(addr, reliable);
}
static const struct stacktrace_ops print_trace_ops = {
.warning = print_trace_warning,
.warning_symbol = print_trace_warning_symbol,
.stack = print_trace_stack,
.address = print_trace_address,
};
static void
show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs,
unsigned long *stack, unsigned long bp, char *log_lvl)
{
printk("%sCall Trace:\n", log_lvl);
dump_trace(task, regs, stack, bp, &print_trace_ops, log_lvl);
}
void show_trace(struct task_struct *task, struct pt_regs *regs,
unsigned long *stack, unsigned long bp)
{
show_trace_log_lvl(task, regs, stack, bp, "");
}
static void
show_stack_log_lvl(struct task_struct *task, struct pt_regs *regs, show_stack_log_lvl(struct task_struct *task, struct pt_regs *regs,
unsigned long *sp, unsigned long bp, char *log_lvl) unsigned long *sp, unsigned long bp, char *log_lvl)
{ {
...@@ -196,33 +92,6 @@ show_stack_log_lvl(struct task_struct *task, struct pt_regs *regs, ...@@ -196,33 +92,6 @@ show_stack_log_lvl(struct task_struct *task, struct pt_regs *regs,
show_trace_log_lvl(task, regs, sp, bp, log_lvl); show_trace_log_lvl(task, regs, sp, bp, log_lvl);
} }
void show_stack(struct task_struct *task, unsigned long *sp)
{
show_stack_log_lvl(task, NULL, sp, 0, "");
}
/*
* The architecture-independent dump_stack generator
*/
void dump_stack(void)
{
unsigned long bp = 0;
unsigned long stack;
#ifdef CONFIG_FRAME_POINTER
if (!bp)
get_bp(bp);
#endif
printk("Pid: %d, comm: %.20s %s %s %.*s\n",
current->pid, current->comm, print_tainted(),
init_utsname()->release,
(int)strcspn(init_utsname()->version, " "),
init_utsname()->version);
show_trace(NULL, NULL, &stack, bp);
}
EXPORT_SYMBOL(dump_stack);
void show_registers(struct pt_regs *regs) void show_registers(struct pt_regs *regs)
{ {
...@@ -283,167 +152,3 @@ int is_valid_bugaddr(unsigned long ip) ...@@ -283,167 +152,3 @@ int is_valid_bugaddr(unsigned long ip)
return ud2 == 0x0b0f; return ud2 == 0x0b0f;
} }
static raw_spinlock_t die_lock = __RAW_SPIN_LOCK_UNLOCKED;
static int die_owner = -1;
static unsigned int die_nest_count;
unsigned __kprobes long oops_begin(void)
{
unsigned long flags;
oops_enter();
if (die_owner != raw_smp_processor_id()) {
console_verbose();
raw_local_irq_save(flags);
__raw_spin_lock(&die_lock);
die_owner = smp_processor_id();
die_nest_count = 0;
bust_spinlocks(1);
} else {
raw_local_irq_save(flags);
}
die_nest_count++;
return flags;
}
void __kprobes oops_end(unsigned long flags, struct pt_regs *regs, int signr)
{
bust_spinlocks(0);
die_owner = -1;
add_taint(TAINT_DIE);
__raw_spin_unlock(&die_lock);
raw_local_irq_restore(flags);
if (!regs)
return;
if (kexec_should_crash(current))
crash_kexec(regs);
if (in_interrupt())
panic("Fatal exception in interrupt");
if (panic_on_oops)
panic("Fatal exception");
oops_exit();
do_exit(signr);
}
int __kprobes __die(const char *str, struct pt_regs *regs, long err)
{
unsigned short ss;
unsigned long sp;
printk(KERN_EMERG "%s: %04lx [#%d] ", str, err & 0xffff, ++die_counter);
#ifdef CONFIG_PREEMPT
printk("PREEMPT ");
#endif
#ifdef CONFIG_SMP
printk("SMP ");
#endif
#ifdef CONFIG_DEBUG_PAGEALLOC
printk("DEBUG_PAGEALLOC");
#endif
printk("\n");
sysfs_printk_last_file();
if (notify_die(DIE_OOPS, str, regs, err,
current->thread.trap_no, SIGSEGV) == NOTIFY_STOP)
return 1;
show_registers(regs);
/* Executive summary in case the oops scrolled away */
sp = (unsigned long) (&regs->sp);
savesegment(ss, ss);
if (user_mode(regs)) {
sp = regs->sp;
ss = regs->ss & 0xffff;
}
printk(KERN_EMERG "EIP: [<%08lx>] ", regs->ip);
print_symbol("%s", regs->ip);
printk(" SS:ESP %04x:%08lx\n", ss, sp);
return 0;
}
/*
* This is reached when something in the kernel has done something bad
* and is about to be terminated:
*/
void die(const char *str, struct pt_regs *regs, long err)
{
unsigned long flags = oops_begin();
if (die_nest_count < 3) {
report_bug(regs->ip, regs);
if (__die(str, regs, err))
regs = NULL;
} else {
printk(KERN_EMERG "Recursive die() failure, output suppressed\n");
}
oops_end(flags, regs, SIGSEGV);
}
static DEFINE_SPINLOCK(nmi_print_lock);
void notrace __kprobes
die_nmi(char *str, struct pt_regs *regs, int do_panic)
{
if (notify_die(DIE_NMIWATCHDOG, str, regs, 0, 2, SIGINT) == NOTIFY_STOP)
return;
spin_lock(&nmi_print_lock);
/*
* We are in trouble anyway, let's at least try
* to get a message out:
*/
bust_spinlocks(1);
printk(KERN_EMERG "%s", str);
printk(" on CPU%d, ip %08lx, registers:\n",
smp_processor_id(), regs->ip);
show_registers(regs);
if (do_panic)
panic("Non maskable interrupt");
console_silent();
spin_unlock(&nmi_print_lock);
/*
* If we are in the kernel we are probably nested up pretty badly
* and might as well get out now while we still can:
*/
if (!user_mode_vm(regs)) {
current->thread.trap_no = 2;
crash_kexec(regs);
}
bust_spinlocks(0);
do_exit(SIGSEGV);
}
static int __init oops_setup(char *s)
{
if (!s)
return -EINVAL;
if (!strcmp(s, "panic"))
panic_on_oops = 1;
return 0;
}
early_param("oops", oops_setup);
static int __init kstack_setup(char *s)
{
if (!s)
return -EINVAL;
kstack_depth_to_print = simple_strtoul(s, NULL, 0);
return 0;
}
early_param("kstack", kstack_setup);
static int __init code_bytes_setup(char *s)
{
code_bytes = simple_strtoul(s, NULL, 0);
if (code_bytes > 8192)
code_bytes = 8192;
return 1;
}
__setup("code_bytes=", code_bytes_setup);
...@@ -17,19 +17,7 @@ ...@@ -17,19 +17,7 @@
#include <asm/stacktrace.h> #include <asm/stacktrace.h>
#define STACKSLOTS_PER_LINE 4 #include "dumpstack.h"
#define get_bp(bp) asm("movq %%rbp, %0" : "=r" (bp) :)
int panic_on_unrecovered_nmi;
int kstack_depth_to_print = 3 * STACKSLOTS_PER_LINE;
static unsigned int code_bytes = 64;
static int die_counter;
void printk_address(unsigned long address, int reliable)
{
printk(" [<%p>] %s%pS\n", (void *) address,
reliable ? "" : "? ", (void *) address);
}
static unsigned long *in_exception_stack(unsigned cpu, unsigned long stack, static unsigned long *in_exception_stack(unsigned cpu, unsigned long stack,
unsigned *usedp, char **idp) unsigned *usedp, char **idp)
...@@ -113,51 +101,6 @@ static unsigned long *in_exception_stack(unsigned cpu, unsigned long stack, ...@@ -113,51 +101,6 @@ static unsigned long *in_exception_stack(unsigned cpu, unsigned long stack,
* severe exception (double fault, nmi, stack fault, debug, mce) hardware stack * severe exception (double fault, nmi, stack fault, debug, mce) hardware stack
*/ */
static inline int valid_stack_ptr(struct thread_info *tinfo,
void *p, unsigned int size, void *end)
{
void *t = tinfo;
if (end) {
if (p < end && p >= (end-THREAD_SIZE))
return 1;
else
return 0;
}
return p > t && p < t + THREAD_SIZE - size;
}
/* The form of the top of the frame on the stack */
struct stack_frame {
struct stack_frame *next_frame;
unsigned long return_address;
};
static inline unsigned long
print_context_stack(struct thread_info *tinfo,
unsigned long *stack, unsigned long bp,
const struct stacktrace_ops *ops, void *data,
unsigned long *end)
{
struct stack_frame *frame = (struct stack_frame *)bp;
while (valid_stack_ptr(tinfo, stack, sizeof(*stack), end)) {
unsigned long addr;
addr = *stack;
if (__kernel_text_address(addr)) {
if ((unsigned long) stack == bp + sizeof(long)) {
ops->address(data, addr, 1);
frame = frame->next_frame;
bp = (unsigned long) frame;
} else {
ops->address(data, addr, bp == 0);
}
}
stack++;
}
return bp;
}
void dump_trace(struct task_struct *task, struct pt_regs *regs, void dump_trace(struct task_struct *task, struct pt_regs *regs,
unsigned long *stack, unsigned long bp, unsigned long *stack, unsigned long bp,
const struct stacktrace_ops *ops, void *data) const struct stacktrace_ops *ops, void *data)
...@@ -166,6 +109,7 @@ void dump_trace(struct task_struct *task, struct pt_regs *regs, ...@@ -166,6 +109,7 @@ void dump_trace(struct task_struct *task, struct pt_regs *regs,
unsigned long *irqstack_end = (unsigned long *)cpu_pda(cpu)->irqstackptr; unsigned long *irqstack_end = (unsigned long *)cpu_pda(cpu)->irqstackptr;
unsigned used = 0; unsigned used = 0;
struct thread_info *tinfo; struct thread_info *tinfo;
int graph = 0;
if (!task) if (!task)
task = current; task = current;
...@@ -206,7 +150,7 @@ void dump_trace(struct task_struct *task, struct pt_regs *regs, ...@@ -206,7 +150,7 @@ void dump_trace(struct task_struct *task, struct pt_regs *regs,
break; break;
bp = print_context_stack(tinfo, stack, bp, ops, bp = print_context_stack(tinfo, stack, bp, ops,
data, estack_end); data, estack_end, &graph);
ops->stack(data, "<EOE>"); ops->stack(data, "<EOE>");
/* /*
* We link to the next stack via the * We link to the next stack via the
...@@ -225,7 +169,7 @@ void dump_trace(struct task_struct *task, struct pt_regs *regs, ...@@ -225,7 +169,7 @@ void dump_trace(struct task_struct *task, struct pt_regs *regs,
if (ops->stack(data, "IRQ") < 0) if (ops->stack(data, "IRQ") < 0)
break; break;
bp = print_context_stack(tinfo, stack, bp, bp = print_context_stack(tinfo, stack, bp,
ops, data, irqstack_end); ops, data, irqstack_end, &graph);
/* /*
* We link to the next stack (which would be * We link to the next stack (which would be
* the process stack normally) the last * the process stack normally) the last
...@@ -243,62 +187,12 @@ void dump_trace(struct task_struct *task, struct pt_regs *regs, ...@@ -243,62 +187,12 @@ void dump_trace(struct task_struct *task, struct pt_regs *regs,
/* /*
* This handles the process stack: * This handles the process stack:
*/ */
bp = print_context_stack(tinfo, stack, bp, ops, data, NULL); bp = print_context_stack(tinfo, stack, bp, ops, data, NULL, &graph);
put_cpu(); put_cpu();
} }
EXPORT_SYMBOL(dump_trace); EXPORT_SYMBOL(dump_trace);
static void void
print_trace_warning_symbol(void *data, char *msg, unsigned long symbol)
{
printk(data);
print_symbol(msg, symbol);
printk("\n");
}
static void print_trace_warning(void *data, char *msg)
{
printk("%s%s\n", (char *)data, msg);
}
static int print_trace_stack(void *data, char *name)
{
printk("%s <%s> ", (char *)data, name);
return 0;
}
/*
* Print one address/symbol entry per line.
*/
static void print_trace_address(void *data, unsigned long addr, int reliable)
{
touch_nmi_watchdog();
printk(data);
printk_address(addr, reliable);
}
static const struct stacktrace_ops print_trace_ops = {
.warning = print_trace_warning,
.warning_symbol = print_trace_warning_symbol,
.stack = print_trace_stack,
.address = print_trace_address,
};
static void
show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs,
unsigned long *stack, unsigned long bp, char *log_lvl)
{
printk("%sCall Trace:\n", log_lvl);
dump_trace(task, regs, stack, bp, &print_trace_ops, log_lvl);
}
void show_trace(struct task_struct *task, struct pt_regs *regs,
unsigned long *stack, unsigned long bp)
{
show_trace_log_lvl(task, regs, stack, bp, "");
}
static void
show_stack_log_lvl(struct task_struct *task, struct pt_regs *regs, show_stack_log_lvl(struct task_struct *task, struct pt_regs *regs,
unsigned long *sp, unsigned long bp, char *log_lvl) unsigned long *sp, unsigned long bp, char *log_lvl)
{ {
...@@ -342,33 +236,6 @@ show_stack_log_lvl(struct task_struct *task, struct pt_regs *regs, ...@@ -342,33 +236,6 @@ show_stack_log_lvl(struct task_struct *task, struct pt_regs *regs,
show_trace_log_lvl(task, regs, sp, bp, log_lvl); show_trace_log_lvl(task, regs, sp, bp, log_lvl);
} }
void show_stack(struct task_struct *task, unsigned long *sp)
{
show_stack_log_lvl(task, NULL, sp, 0, "");
}
/*
* The architecture-independent dump_stack generator
*/
void dump_stack(void)
{
unsigned long bp = 0;
unsigned long stack;
#ifdef CONFIG_FRAME_POINTER
if (!bp)
get_bp(bp);
#endif
printk("Pid: %d, comm: %.20s %s %s %.*s\n",
current->pid, current->comm, print_tainted(),
init_utsname()->release,
(int)strcspn(init_utsname()->version, " "),
init_utsname()->version);
show_trace(NULL, NULL, &stack, bp);
}
EXPORT_SYMBOL(dump_stack);
void show_registers(struct pt_regs *regs) void show_registers(struct pt_regs *regs)
{ {
int i; int i;
...@@ -429,147 +296,3 @@ int is_valid_bugaddr(unsigned long ip) ...@@ -429,147 +296,3 @@ int is_valid_bugaddr(unsigned long ip)
return ud2 == 0x0b0f; return ud2 == 0x0b0f;
} }
static raw_spinlock_t die_lock = __RAW_SPIN_LOCK_UNLOCKED;
static int die_owner = -1;
static unsigned int die_nest_count;
unsigned __kprobes long oops_begin(void)
{
int cpu;
unsigned long flags;
oops_enter();
/* racy, but better than risking deadlock. */
raw_local_irq_save(flags);
cpu = smp_processor_id();
if (!__raw_spin_trylock(&die_lock)) {
if (cpu == die_owner)
/* nested oops. should stop eventually */;
else
__raw_spin_lock(&die_lock);
}
die_nest_count++;
die_owner = cpu;
console_verbose();
bust_spinlocks(1);
return flags;
}
void __kprobes oops_end(unsigned long flags, struct pt_regs *regs, int signr)
{
die_owner = -1;
bust_spinlocks(0);
die_nest_count--;
if (!die_nest_count)
/* Nest count reaches zero, release the lock. */
__raw_spin_unlock(&die_lock);
raw_local_irq_restore(flags);
if (!regs) {
oops_exit();
return;
}
if (in_interrupt())
panic("Fatal exception in interrupt");
if (panic_on_oops)
panic("Fatal exception");
oops_exit();
do_exit(signr);
}
int __kprobes __die(const char *str, struct pt_regs *regs, long err)
{
printk(KERN_EMERG "%s: %04lx [#%d] ", str, err & 0xffff, ++die_counter);
#ifdef CONFIG_PREEMPT
printk("PREEMPT ");
#endif
#ifdef CONFIG_SMP
printk("SMP ");
#endif
#ifdef CONFIG_DEBUG_PAGEALLOC
printk("DEBUG_PAGEALLOC");
#endif
printk("\n");
sysfs_printk_last_file();
if (notify_die(DIE_OOPS, str, regs, err,
current->thread.trap_no, SIGSEGV) == NOTIFY_STOP)
return 1;
show_registers(regs);
add_taint(TAINT_DIE);
/* Executive summary in case the oops scrolled away */
printk(KERN_ALERT "RIP ");
printk_address(regs->ip, 1);
printk(" RSP <%016lx>\n", regs->sp);
if (kexec_should_crash(current))
crash_kexec(regs);
return 0;
}
void die(const char *str, struct pt_regs *regs, long err)
{
unsigned long flags = oops_begin();
if (!user_mode(regs))
report_bug(regs->ip, regs);
if (__die(str, regs, err))
regs = NULL;
oops_end(flags, regs, SIGSEGV);
}
notrace __kprobes void
die_nmi(char *str, struct pt_regs *regs, int do_panic)
{
unsigned long flags;
if (notify_die(DIE_NMIWATCHDOG, str, regs, 0, 2, SIGINT) == NOTIFY_STOP)
return;
flags = oops_begin();
/*
* We are in trouble anyway, let's at least try
* to get a message out.
*/
printk(KERN_EMERG "%s", str);
printk(" on CPU%d, ip %08lx, registers:\n",
smp_processor_id(), regs->ip);
show_registers(regs);
if (kexec_should_crash(current))
crash_kexec(regs);
if (do_panic || panic_on_oops)
panic("Non maskable interrupt");
oops_end(flags, NULL, SIGBUS);
nmi_exit();
local_irq_enable();
do_exit(SIGBUS);
}
static int __init oops_setup(char *s)
{
if (!s)
return -EINVAL;
if (!strcmp(s, "panic"))
panic_on_oops = 1;
return 0;
}
early_param("oops", oops_setup);
static int __init kstack_setup(char *s)
{
if (!s)
return -EINVAL;
kstack_depth_to_print = simple_strtoul(s, NULL, 0);
return 0;
}
early_param("kstack", kstack_setup);
static int __init code_bytes_setup(char *s)
{
code_bytes = simple_strtoul(s, NULL, 0);
if (code_bytes > 8192)
code_bytes = 8192;
return 1;
}
__setup("code_bytes=", code_bytes_setup);
...@@ -1157,6 +1157,9 @@ ENTRY(mcount)
END(mcount)
ENTRY(ftrace_caller)
cmpl $0, function_trace_stop
jne ftrace_stub
pushl %eax
pushl %ecx
pushl %edx
...@@ -1171,6 +1174,11 @@ ftrace_call:
popl %edx
popl %ecx
popl %eax
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
.globl ftrace_graph_call
ftrace_graph_call:
jmp ftrace_stub
#endif
.globl ftrace_stub
ftrace_stub:
...@@ -1180,8 +1188,18 @@ END(ftrace_caller)
#else /* ! CONFIG_DYNAMIC_FTRACE */
ENTRY(mcount)
cmpl $0, function_trace_stop
jne ftrace_stub
cmpl $ftrace_stub, ftrace_trace_function
jnz trace
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
cmpl $ftrace_stub, ftrace_graph_return
jnz ftrace_graph_caller
cmpl $ftrace_graph_entry_stub, ftrace_graph_entry
jnz ftrace_graph_caller
#endif
.globl ftrace_stub
ftrace_stub:
ret
...@@ -1200,12 +1218,43 @@ trace:
popl %edx
popl %ecx
popl %eax
jmp ftrace_stub
END(mcount)
#endif /* CONFIG_DYNAMIC_FTRACE */
#endif /* CONFIG_FUNCTION_TRACER */
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
ENTRY(ftrace_graph_caller)
cmpl $0, function_trace_stop
jne ftrace_stub
pushl %eax
pushl %ecx
pushl %edx
movl 0xc(%esp), %edx
lea 0x4(%ebp), %eax
subl $MCOUNT_INSN_SIZE, %edx
call prepare_ftrace_return
popl %edx
popl %ecx
popl %eax
ret
END(ftrace_graph_caller)
.globl return_to_handler
return_to_handler:
pushl $0
pushl %eax
pushl %ecx
pushl %edx
call ftrace_return_to_handler
movl %eax, 0xc(%esp)
popl %edx
popl %ecx
popl %eax
ret
#endif
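For context on the block above: ftrace_graph_caller passes prepare_ftrace_return() the address of the parent return slot and the callsite ip (minus MCOUNT_INSN_SIZE), and prepare_ftrace_return() diverts that return address to return_to_handler, which later calls ftrace_return_to_handler() to recover the real one. A rough C sketch of the diversion, where push_return_trace() is a hypothetical stand-in for the per-task shadow-stack push done by the real code in arch/x86/kernel/ftrace.c:

#include <linux/errno.h>

extern void return_to_handler(void);    /* the asm stub above */

/* Hypothetical helper standing in for the shadow-stack push. */
static int push_return_trace(unsigned long ret, unsigned long func);

void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr)
{
        unsigned long old = *parent;    /* the real return address */

        if (push_return_trace(old, self_addr) == -EBUSY)
                return;                 /* shadow stack full: do not divert */

        *parent = (unsigned long)&return_to_handler;
}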
.section .rodata,"a"
#include "syscall_table_32.S"
...
...@@ -68,6 +68,8 @@ ENTRY(mcount)
END(mcount)
ENTRY(ftrace_caller)
cmpl $0, function_trace_stop
jne ftrace_stub
/* taken from glibc */
subq $0x38, %rsp
...@@ -96,6 +98,12 @@ ftrace_call:
movq (%rsp), %rax
addq $0x38, %rsp
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
.globl ftrace_graph_call
ftrace_graph_call:
jmp ftrace_stub
#endif
.globl ftrace_stub
ftrace_stub:
retq
...@@ -103,8 +111,20 @@ END(ftrace_caller)
#else /* ! CONFIG_DYNAMIC_FTRACE */
ENTRY(mcount)
cmpl $0, function_trace_stop
jne ftrace_stub
cmpq $ftrace_stub, ftrace_trace_function
jnz trace
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
cmpq $ftrace_stub, ftrace_graph_return
jnz ftrace_graph_caller
cmpq $ftrace_graph_entry_stub, ftrace_graph_entry
jnz ftrace_graph_caller
#endif
.globl ftrace_stub
ftrace_stub:
retq
...@@ -140,6 +160,69 @@ END(mcount)
#endif /* CONFIG_DYNAMIC_FTRACE */
#endif /* CONFIG_FUNCTION_TRACER */
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
ENTRY(ftrace_graph_caller)
cmpl $0, function_trace_stop
jne ftrace_stub
subq $0x38, %rsp
movq %rax, (%rsp)
movq %rcx, 8(%rsp)
movq %rdx, 16(%rsp)
movq %rsi, 24(%rsp)
movq %rdi, 32(%rsp)
movq %r8, 40(%rsp)
movq %r9, 48(%rsp)
leaq 8(%rbp), %rdi
movq 0x38(%rsp), %rsi
subq $MCOUNT_INSN_SIZE, %rsi
call prepare_ftrace_return
movq 48(%rsp), %r9
movq 40(%rsp), %r8
movq 32(%rsp), %rdi
movq 24(%rsp), %rsi
movq 16(%rsp), %rdx
movq 8(%rsp), %rcx
movq (%rsp), %rax
addq $0x38, %rsp
retq
END(ftrace_graph_caller)
.globl return_to_handler
return_to_handler:
subq $80, %rsp
movq %rax, (%rsp)
movq %rcx, 8(%rsp)
movq %rdx, 16(%rsp)
movq %rsi, 24(%rsp)
movq %rdi, 32(%rsp)
movq %r8, 40(%rsp)
movq %r9, 48(%rsp)
movq %r10, 56(%rsp)
movq %r11, 64(%rsp)
call ftrace_return_to_handler
movq %rax, 72(%rsp)
movq 64(%rsp), %r11
movq 56(%rsp), %r10
movq 48(%rsp), %r9
movq 40(%rsp), %r8
movq 32(%rsp), %rdi
movq 24(%rsp), %rsi
movq 16(%rsp), %rdx
movq 8(%rsp), %rcx
movq (%rsp), %rax
addq $72, %rsp
retq
#endif
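The non-dynamic mcount above branches to ftrace_graph_caller whenever ftrace_graph_return or ftrace_graph_entry points away from its stub, i.e. whenever a graph tracer is registered. A hedged sketch of hooking in, assuming the register_ftrace_graph() API of this era takes the return handler first:

#include <linux/ftrace.h>

static int sample_graph_entry(struct ftrace_graph_ent *trace)
{
        return 1;       /* nonzero: do trace this function */
}

static void sample_graph_return(struct ftrace_graph_ret *trace)
{
        /* trace->func plus entry/exit timestamps arrive here */
}

/* register_ftrace_graph(sample_graph_return, sample_graph_entry);
 * ... later: unregister_ftrace_graph();
 */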
#ifndef CONFIG_PREEMPT
#define retint_kernel retint_restore_args
#endif
...
...@@ -38,8 +38,11 @@
#include <asm/io.h>
#include <asm/nmi.h>
#include <asm/smp.h>
#include <asm/atomic.h>
#include <asm/apicdef.h>
#include <mach_mpparse.h>
#include <asm/genapic.h>
#include <asm/setup.h>
/*
* ES7000 chipsets
...@@ -161,6 +164,43 @@ es7000_rename_gsi(int ioapic, int gsi)
return gsi;
}
static int wakeup_secondary_cpu_via_mip(int cpu, unsigned long eip)
{
unsigned long vect = 0, psaival = 0;
if (psai == NULL)
return -1;
vect = ((unsigned long)__pa(eip)/0x1000) << 16;
psaival = (0x1000000 | vect | cpu);
while (*psai & 0x1000000)
;
*psai = psaival;
return 0;
}
static void noop_wait_for_deassert(atomic_t *deassert_not_used)
{
}
static int __init es7000_update_genapic(void)
{
genapic->wakeup_cpu = wakeup_secondary_cpu_via_mip;
/* MPENTIUMIII */
if (boot_cpu_data.x86 == 6 &&
(boot_cpu_data.x86_model >= 7 || boot_cpu_data.x86_model <= 11)) {
es7000_update_genapic_to_cluster();
genapic->wait_for_init_deassert = noop_wait_for_deassert;
genapic->wakeup_cpu = wakeup_secondary_cpu_via_mip;
}
return 0;
}
void __init
setup_unisys(void)
{
...@@ -176,6 +216,8 @@ setup_unisys(void)
else
es7000_plat = ES7000_CLASSIC;
ioapic_renumber_irq = es7000_rename_gsi;
x86_quirks->update_genapic = es7000_update_genapic;
}
/*
...@@ -317,26 +359,6 @@ es7000_mip_write(struct mip_reg *mip_reg)
return status;
}
int
es7000_start_cpu(int cpu, unsigned long eip)
{
unsigned long vect = 0, psaival = 0;
if (psai == NULL)
return -1;
vect = ((unsigned long)__pa(eip)/0x1000) << 16;
psaival = (0x1000000 | vect | cpu);
while (*psai & 0x1000000)
;
*psai = psaival;
return 0;
}
void __init
es7000_sw_apic(void)
{
...
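Net effect of the es7000 hunks: instead of exporting es7000_start_cpu() directly, the platform now installs an update_genapic() quirk at setup_unisys() time and patches genapic->wakeup_cpu from it. A minimal sketch of the pattern (the platform name is hypothetical; the hook and field names are the ones used above):

static int __init myplat_update_genapic(void)
{
        /* swap in the platform-specific secondary-CPU wakeup */
        genapic->wakeup_cpu = wakeup_secondary_cpu_via_mip;
        return 0;
}

static void __init myplat_setup(void)
{
        x86_quirks->update_genapic = myplat_update_genapic;
}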
...@@ -21,6 +21,7 @@
#include <asm/smp.h>
#include <asm/ipi.h>
#include <asm/genapic.h>
#include <asm/setup.h>
extern struct genapic apic_flat;
extern struct genapic apic_physflat;
...@@ -53,6 +54,9 @@ void __init setup_apic_routing(void)
genapic = &apic_physflat;
printk(KERN_INFO "Setting APIC routing to %s\n", genapic->name);
}
if (x86_quirks->update_genapic)
x86_quirks->update_genapic();
}
/* Same for both flat and physical. */
...
...@@ -118,6 +118,9 @@ int show_interrupts(struct seq_file *p, void *v)
}
desc = irq_to_desc(i);
if (!desc)
return 0;
spin_lock_irqsave(&desc->lock, flags);
#ifndef CONFIG_SMP
any_count = kstat_irqs(i);
...
...@@ -242,6 +242,8 @@ void fixup_irqs(cpumask_t map)
for_each_irq_desc(irq, desc) {
cpumask_t mask;
if (!desc)
continue;
if (irq == 2)
continue;
...
...@@ -94,6 +94,8 @@ void fixup_irqs(cpumask_t map)
int break_affinity = 0;
int set_affinity = 1;
if (!desc)
continue;
if (irq == 2)
continue;
...
...@@ -68,8 +68,7 @@ void __init init_ISA_irqs (void)
/*
* 16 old-style INTA-cycle interrupts:
*/
-for (i = 0; i < 16; i++) {
-/* first time call this irq_desc */
+for (i = 0; i < NR_IRQS_LEGACY; i++) {
struct irq_desc *desc = irq_to_desc(i);
desc->status = IRQ_DISABLED;
...
...@@ -142,8 +142,7 @@ void __init init_ISA_irqs(void)
init_bsp_APIC();
init_8259A(0);
-for (i = 0; i < 16; i++) {
-/* first time call this irq_desc */
+for (i = 0; i < NR_IRQS_LEGACY; i++) {
struct irq_desc *desc = irq_to_desc(i);
desc->status = IRQ_DISABLED;
...
...@@ -586,26 +586,23 @@ static void __init __get_smp_config(unsigned int early)
{
struct intel_mp_floating *mpf = mpf_found;
-if (x86_quirks->mach_get_smp_config) {
-if (x86_quirks->mach_get_smp_config(early))
-return;
-}
+if (!mpf)
+return;
if (acpi_lapic && early)
return;
/*
-* ACPI supports both logical (e.g. Hyper-Threading) and physical
-* processors, where MPS only supports physical.
+* MPS doesn't support hyperthreading, aka only have
+* thread 0 apic id in MPS table
*/
-if (acpi_lapic && acpi_ioapic) {
-printk(KERN_INFO "Using ACPI (MADT) for SMP configuration "
-"information\n");
+if (acpi_lapic && acpi_ioapic)
return;
-} else if (acpi_lapic)
-printk(KERN_INFO "Using ACPI for processor (LAPIC) "
-"configuration information\n");
-if (!mpf)
-return;
+if (x86_quirks->mach_get_smp_config) {
+if (x86_quirks->mach_get_smp_config(early))
+return;
+}
printk(KERN_INFO "Intel MultiProcessor Specification v1.%d\n",
mpf->mpf_specification);
...
...@@ -31,7 +31,7 @@
#include <asm/numaq.h>
#include <asm/topology.h>
#include <asm/processor.h>
-#include <asm/mpspec.h>
+#include <asm/genapic.h>
#include <asm/e820.h>
#include <asm/setup.h>
...@@ -235,6 +235,13 @@ static int __init numaq_setup_ioapic_ids(void)
return 1;
}
static int __init numaq_update_genapic(void)
{
genapic->wakeup_cpu = wakeup_secondary_cpu_via_nmi;
return 0;
}
static struct x86_quirks numaq_x86_quirks __initdata = {
.arch_pre_time_init = numaq_pre_time_init,
.arch_time_init = NULL,
...@@ -250,6 +257,7 @@ static struct x86_quirks numaq_x86_quirks __initdata = {
.mpc_oem_pci_bus = mpc_oem_pci_bus,
.smp_read_mpc_oem = smp_read_mpc_oem,
.setup_ioapic_ids = numaq_setup_ioapic_ids,
.update_genapic = numaq_update_genapic,
};
void numaq_mps_oem_check(struct mp_config_table *mpc, char *oem,
...
...@@ -7,7 +7,9 @@
#include <linux/module.h>
#include <linux/pm.h>
#include <linux/clockchips.h>
#include <linux/ftrace.h>
#include <asm/system.h>
#include <asm/apic.h>
unsigned long idle_halt;
EXPORT_SYMBOL(idle_halt);
...@@ -100,6 +102,9 @@ static inline int hlt_use_halt(void)
void default_idle(void)
{
if (hlt_use_halt()) {
struct power_trace it;
trace_power_start(&it, POWER_CSTATE, 1);
current_thread_info()->status &= ~TS_POLLING;
/*
* TS_POLLING-cleared state must be visible before we
...@@ -112,6 +117,7 @@ void default_idle(void)
else
local_irq_enable();
current_thread_info()->status |= TS_POLLING;
trace_power_end(&it);
} else {
local_irq_enable();
/* loop is done by the caller */
...@@ -122,6 +128,21 @@ void default_idle(void)
EXPORT_SYMBOL(default_idle);
#endif
void stop_this_cpu(void *dummy)
{
local_irq_disable();
/*
* Remove this CPU:
*/
cpu_clear(smp_processor_id(), cpu_online_map);
disable_local_APIC();
for (;;) {
if (hlt_works(smp_processor_id()))
halt();
}
}
static void do_nothing(void *unused)
{
}
...@@ -154,24 +175,31 @@ EXPORT_SYMBOL_GPL(cpu_idle_wait);
*/
void mwait_idle_with_hints(unsigned long ax, unsigned long cx)
{
struct power_trace it;
trace_power_start(&it, POWER_CSTATE, (ax>>4)+1);
if (!need_resched()) {
__monitor((void *)&current_thread_info()->flags, 0, 0);
smp_mb();
if (!need_resched())
__mwait(ax, cx);
}
trace_power_end(&it);
}
/* Default MONITOR/MWAIT with no hints, used for default C1 state */
static void mwait_idle(void)
{
struct power_trace it;
if (!need_resched()) {
trace_power_start(&it, POWER_CSTATE, 1);
__monitor((void *)&current_thread_info()->flags, 0, 0);
smp_mb();
if (!need_resched())
__sti_mwait(0, 0);
else
local_irq_enable();
trace_power_end(&it);
} else
local_irq_enable();
}
...@@ -183,9 +211,13 @@ static void mwait_idle(void)
*/
static void poll_idle(void)
{
struct power_trace it;
trace_power_start(&it, POWER_CSTATE, 0);
local_irq_enable();
while (!need_resched())
cpu_relax();
trace_power_end(&it);
}
/*
...
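The process.c hunks above bracket every idle entry (hlt, mwait, poll) with trace_power_start()/trace_power_end() so the new power tracer can time C-state residency. Assuming the same hooks, instrumenting another C1-style idle routine would look like this (the routine name is hypothetical):

#include <linux/ftrace.h>       /* struct power_trace and the hooks used above */

static void sample_c1_idle(void)
{
        struct power_trace it;

        trace_power_start(&it, POWER_CSTATE, 1);        /* entering C1 */
        safe_halt();                                    /* sti; hlt */
        trace_power_end(&it);
}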
...@@ -38,6 +38,7 @@
#include <linux/percpu.h>
#include <linux/prctl.h>
#include <linux/dmi.h>
#include <linux/ftrace.h>
#include <asm/uaccess.h>
#include <asm/pgtable.h>
...@@ -548,7 +549,8 @@ __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p,
* the task-switch, and shows up in ret_from_fork in entry.S,
* for example.
*/
-struct task_struct * __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
+__notrace_funcgraph struct task_struct *
+__switch_to(struct task_struct *prev_p, struct task_struct *next_p)
{
struct thread_struct *prev = &prev_p->thread,
*next = &next_p->thread;
...
...@@ -39,6 +39,7 @@
#include <linux/prctl.h>
#include <linux/uaccess.h>
#include <linux/io.h>
#include <linux/ftrace.h>
#include <asm/pgtable.h>
#include <asm/system.h>
...@@ -551,8 +552,9 @@ static inline void __switch_to_xtra(struct task_struct *prev_p,
* - could test fs/gs bitsliced
*
* Kprobes not supported here. Set the probe on schedule instead.
* Function graph tracer not supported too.
*/
-struct task_struct *
+__notrace_funcgraph struct task_struct *
__switch_to(struct task_struct *prev_p, struct task_struct *next_p)
{
struct thread_struct *prev = &prev_p->thread;
...
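__switch_to() gets this annotation because the graph tracer's return-address bookkeeping lives on the current task's shadow stack; diverting a return inside the context-switch path would charge the exit to the wrong task. The annotation itself is defined in include/linux/ftrace.h in this merge window; reproduced from memory, so treat as a sketch:

#ifdef CONFIG_FUNCTION_GRAPH_TRACER
/* graph tracer enabled: keep mcount out of this function entirely */
#define __notrace_funcgraph     notrace
#else
#define __notrace_funcgraph
#endif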
...@@ -36,7 +36,10 @@ int reboot_force;
static int reboot_cpu = -1;
#endif
-/* reboot=b[ios] | s[mp] | t[riple] | k[bd] | e[fi] [, [w]arm | [c]old]
+/* This is set by the PCI code if either type 1 or type 2 PCI is detected */
+bool port_cf9_safe = false;
+
+/* reboot=b[ios] | s[mp] | t[riple] | k[bd] | e[fi] [, [w]arm | [c]old] | p[ci]
warm Don't set the cold reboot flag
cold Set the cold reboot flag
bios Reboot by jumping through the BIOS (only for X86_32)
...@@ -45,6 +48,7 @@ static int reboot_cpu = -1;
kbd Use the keyboard controller. cold reset (default)
acpi Use the RESET_REG in the FADT
efi Use efi reset_system runtime service
pci Use the so-called "PCI reset register", CF9
force Avoid anything that could hang.
*/
static int __init reboot_setup(char *str)
...@@ -79,6 +83,7 @@ static int __init reboot_setup(char *str)
case 'k':
case 't':
case 'e':
case 'p':
reboot_type = *str;
break;
...@@ -404,12 +409,27 @@ static void native_machine_emergency_restart(void)
reboot_type = BOOT_KBD;
break;
case BOOT_EFI:
if (efi_enabled)
-efi.reset_system(reboot_mode ? EFI_RESET_WARM : EFI_RESET_COLD,
+efi.reset_system(reboot_mode ?
+EFI_RESET_WARM :
+EFI_RESET_COLD,
EFI_SUCCESS, 0, NULL);
reboot_type = BOOT_KBD;
break;
case BOOT_CF9:
port_cf9_safe = true;
/* fall through */
case BOOT_CF9_COND:
if (port_cf9_safe) {
u8 cf9 = inb(0xcf9) & ~6;
outb(cf9|2, 0xcf9); /* Request hard reset */
udelay(50);
outb(cf9|6, 0xcf9); /* Actually do the reset */
udelay(50);
}
reboot_type = BOOT_KBD;
break;
}
...@@ -470,6 +490,11 @@ static void native_machine_restart(char *__unused)
static void native_machine_halt(void)
{
/* stop other cpus and apics */
machine_shutdown();
/* stop this cpu */
stop_this_cpu(NULL);
}
static void native_machine_power_off(void)
...
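On the reboot side: booting with reboot=p selects the new 'p' case above, and the BOOT_CF9 path pulses the chipset's reset control register at I/O port 0xcf9. A commented sketch of that sequence (the wrapper name is hypothetical, and the bit meanings are per common Intel chipset datasheets, not restated in the diff):

#include <linux/types.h>
#include <linux/delay.h>
#include <asm/io.h>

static void cf9_hard_reset(void)
{
        u8 cf9 = inb(0xcf9) & ~6;       /* keep other bits, clear bits 1..2 */

        outb(cf9 | 2, 0xcf9);           /* bit 1: request a hard reset */
        udelay(50);
        outb(cf9 | 6, 0xcf9);           /* bit 2: actually trigger the reset */
        udelay(50);
}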
...@@ -140,19 +140,6 @@ void native_send_call_func_ipi(cpumask_t mask)
send_IPI_mask(mask, CALL_FUNCTION_VECTOR);
}
static void stop_this_cpu(void *dummy)
{
local_irq_disable();
/*
* Remove this CPU:
*/
cpu_clear(smp_processor_id(), cpu_online_map);
disable_local_APIC();
if (hlt_works(smp_processor_id()))
for (;;) halt();
for (;;);
}
/*
* this function calls the 'stop' function on all other CPUs in the system.
*/
...
...@@ -17,6 +17,7 @@
#include <asm/bigsmp/apic.h>
#include <asm/bigsmp/ipi.h>
#include <asm/mach-default/mach_mpparse.h>
#include <asm/mach-default/mach_wakecpu.h>
static int dmi_bigsmp; /* can be set by dmi scanners */
...