1. 21 Dec, 2011 4 commits
    • ftrace: Remove usage of "freed" records · 32082309
      Steven Rostedt authored
      Records that are added to the function trace table are
      permanently there, except for modules. By separating out the
      modules to their own pages that can be freed in one shot
      we can remove the "freed" flag and simplify some of the record
      management.
      
      Another benefit of doing this is that we can also move the records
      around, e.g. sort them.
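
      A minimal sketch of the resulting layout (the names mirror
      kernel/trace/ftrace.c, but treat the details as illustrative):

        struct dyn_ftrace {
                unsigned long   ip;     /* address of the mcount call site */
                unsigned long   flags;  /* no "freed" flag needed anymore  */
        };

        struct ftrace_page {
                struct ftrace_page      *next;
                struct dyn_ftrace       *records;
                int                     index;  /* records used on this page */
                int                     size;   /* records this page can hold */
        };

        /* A module's records sit on their own group of pages, so when the
         * module is unloaded the whole group is unlinked from the list and
         * freed in one shot -- no per-record "freed" bookkeeping needed. */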
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ftrace: Allow archs to modify code without stop machine · c88fd863
      Steven Rostedt authored
      The stop machine method to modify all functions in the kernel
      (some 20,000 of them) is the safest way to do so across all archs.
      But some archs may not need this big hammer approach to modify code
      on SMP machines, and can simply update the code they need.
      
      Add a weak function, arch_ftrace_update_code(), that performs the
      stop_machine() update by default; this lets any arch override the
      method.
      
      If the arch needs to check the system and then decide if it can
      avoid stop machine, it can still call ftrace_run_stop_machine() to
      use the old method.
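
      A sketch of the default and of a possible override (the weak default
      follows this change; the override and its helper functions are
      hypothetical):

        /* Default: keep using the safe big hammer. */
        void __weak arch_ftrace_update_code(int command)
        {
                ftrace_run_stop_machine(command);
        }

        /* An arch that can patch call sites safely on a live SMP system
         * may override the weak symbol, falling back only when needed: */
        void arch_ftrace_update_code(int command)
        {
                if (!arch_can_patch_directly()) {       /* hypothetical check */
                        ftrace_run_stop_machine(command);
                        return;
                }
                arch_patch_mcount_sites(command);       /* hypothetical patcher */
        }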
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ftrace: Do not function trace inlined functions · 45959ee7
      Steven Rostedt authored
      When gcc inlines a function, it does not emit the mcount prologue
      for it, which in turn means that inlined functions are not traced
      by the function tracer. But if CONFIG_OPTIMIZE_INLINING is set, then
      gcc is allowed not to inline a function that is marked inline.
      
      Depending on the options and the compiler, a function may or may
      not be traced by the function tracer, depending on whether gcc
      decides to inline it or not. This has caused several problems in
      the past because gcc is not always consistent about what it decides
      to inline between different gcc versions.
      
      Some places should not be traced (like paravirt native_* functions)
      and these are mostly marked as inline. When gcc decides not to
      inline the function, and if that function should not be traced, then
      the ftrace function tracer will suddenly break when it use to work
      fine. This becomes even harder to debug when different versions of
      gcc will not inline that function, making the same kernel and config
      work for some gcc versions and not work for others.
      
      Marking all functions declared inline as notrace removes the
      ambiguity that gcc adds when it comes to tracing such functions.
      All gcc versions will be consistent about which functions are
      traced, and the fragility of code that works with one compiler
      version but not another is removed.
      
      Note, only the inline macro when CONFIG_OPTIMIZE_INLINING is set needs
      to have notrace added, as the attribute __always_inline will force
      the function to be inlined and then not traced.
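
      A hedged sketch of what this means for the inline macro (the real
      guards in include/linux/compiler-gcc.h are more involved):

        #ifdef CONFIG_OPTIMIZE_INLINING
        /* gcc may emit this function out of line; never trace it then. */
        # define inline inline notrace
        #else
        /* Forced inlining: no mcount call is ever emitted, so no notrace
         * annotation is required here. */
        # define inline inline __attribute__((always_inline))
        #endif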
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ftrace: Fix unregister ftrace_ops accounting · 30fb6aa7
      Jiri Olsa authored
      Multiple users of the function tracer can register their functions
      with the ftrace_ops structure. The accounting within ftrace will
      update the counter on each function record that is being traced.
      When the ftrace_ops filtering adds or removes functions, the
      function records will be updated accordingly if the ftrace_ops is
      still registered.
      
      When an ftrace_ops is removed, the counters of the function records
      that it traces are decremented. When a counter reaches zero, the
      function it represents is modified to stop calling the
      mcount code.
      
      When changes are made, the code is updated via stop_machine() with
      a command passed to the function to tell it what to do. There is an
      ENABLE and DISABLE command that tells the called function to enable
      or disable the functions. But ENABLE is really a misnomer, as it
      should just update the records: records that have been enabled
      and now have a count of zero should be disabled.
      
      The DISABLE command is used to disable all functions regardless of
      their counter values. This is the big off switch and is not the
      complement of the ENABLE command.
      
      To make matters worse, when an ftrace_ops is unregistered while
      another ftrace_ops is still registered, neither the DISABLE nor the
      ENABLE command is set when calling into the stop_machine() function,
      and the records will not be updated to match their counters. A command
      is passed to that function that will update the mcount code to call
      the registered callback directly if it is the only one left. This
      means that the callback of the still-registered ftrace_ops will be
      called by all functions that have been set for it, as well as by the
      functions that were set for the ftrace_ops that was just unregistered.
      
      Here's a way to trigger this bug. Compile the kernel with
      CONFIG_FUNCTION_PROFILER set and with CONFIG_FUNCTION_GRAPH not set:
      
       CONFIG_FUNCTION_PROFILER=y
       # CONFIG_FUNCTION_GRAPH is not set
      
      This will force the function profiler to use the function tracer instead
      of the function graph tracer.
      
        # cd /sys/kernel/debug/tracing
        # echo schedule > set_ftrace_filter
        # echo function > current_tracer
        # cat set_ftrace_filter
       schedule
        # cat trace
       # tracer: nop
       #
       # entries-in-buffer/entries-written: 692/68108025   #P:4
       #
       #                              _-----=> irqs-off
       #                             / _----=> need-resched
       #                            | / _---=> hardirq/softirq
       #                            || / _--=> preempt-depth
       #                            ||| /     delay
       #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
       #              | |       |   ||||       |         |
            kworker/0:2-909   [000] ....   531.235574: schedule <-worker_thread
                 <idle>-0     [001] .N..   531.235575: schedule <-cpu_idle
            kworker/0:2-909   [000] ....   531.235597: schedule <-worker_thread
                   sshd-2563  [001] ....   531.235647: schedule <-schedule_hrtimeout_range_clock
      
        # echo 1 > function_profile_enabled
        # echo 0 > function_profile_enabled
        # cat set_ftrace_filter
       schedule
        # cat trace
       # tracer: function
       #
       # entries-in-buffer/entries-written: 159701/118821262   #P:4
       #
       #                              _-----=> irqs-off
       #                             / _----=> need-resched
       #                            | / _---=> hardirq/softirq
       #                            || / _--=> preempt-depth
       #                            ||| /     delay
       #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
       #              | |       |   ||||       |         |
                 <idle>-0     [002] ...1   604.870655: local_touch_nmi <-cpu_idle
                 <idle>-0     [002] d..1   604.870655: enter_idle <-cpu_idle
                 <idle>-0     [002] d..1   604.870656: atomic_notifier_call_chain <-enter_idle
                 <idle>-0     [002] d..1   604.870656: __atomic_notifier_call_chain <-atomic_notifier_call_chain
      
      The same problem could have happened with the trace_probe_ops,
      but they are modified via the set_ftrace_filter file, which does the
      update when the file is closed.
      
      The simple solution is to change ENABLE to UPDATE and call it every
      time an ftrace_ops is unregistered.
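
      Sketched in kernel terms (the names follow kernel/trace/ftrace.c,
      but treat the details as illustrative):

        static void ftrace_shutdown(struct ftrace_ops *ops, int command)
        {
                __unregister_ftrace_function(ops);  /* drops per-record counters */
                ftrace_start_up--;

                if (ftrace_start_up)
                        /* Other users remain: reconcile every record with
                         * its (possibly now zero) counter. */
                        command |= FTRACE_UPDATE_CALLS;
                else
                        /* Last user gone: the big off switch. */
                        command |= FTRACE_DISABLE_CALLS;

                ftrace_run_update_code(command);
        }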
      
      Link: http://lkml.kernel.org/r/1323105776-26961-3-git-send-email-jolsa@redhat.com
      
      Cc: stable@vger.kernel.org # 3.0+
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  2. 02 Dec, 2011 2 commits
    • perf test: Soft errors shouldn't stop the "Validate PERF_RECORD_" test · f71c49e5
      Arnaldo Carvalho de Melo authored
      For errors that don't preclude further checks, aka "soft" errors,
      just continue testing for other errors.
      
      Better coverage in verbose mode.
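
      The pattern, sketched with hypothetical names:

        int errs = 0;    /* count soft errors instead of returning early */

        if (sample.cpu != expected_cpu) {
                pr_debug("expected cpu %d, got %d\n", expected_cpu, sample.cpu);
                ++errs;          /* soft: remaining fields can still be checked */
        }
        if (sample.tid != expected_tid) {
                pr_debug("expected tid %d, got %d\n", expected_tid, sample.tid);
                ++errs;          /* soft as well */
        }
        if (ring_buffer_unreadable)
                return -1;       /* hard: nothing further can be validated */

        return errs ? -1 : 0;    /* soft errors still fail the test at the end */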
      Suggested-by: David Ahern <dsahern@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-jafcokbj26m845dsgm2hx6az@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf test: Validate PERF_RECORD_ events and perf_sample fields · 3e7c439a
      Arnaldo Carvalho de Melo authored
      This new test will validate these new routines extracted from 'perf
      record':
      
       - perf_evlist__config_attrs
       - perf_evlist__prepare_workload
       - perf_evlist__start_workload
      
      In addition to several other perf_evlist methods.
      
      It consists of starting a simple workload, setting up just one event
      to monitor ("cycles"), and requesting that several PERF_SAMPLE_ fields
      be present in all events.
      
      It will then check that the expected PERF_RECORD_ events are produced
      and sanity check all their fields.
      
      Some checks performed:
      
      . PERF_SAMPLE_TIME monotonically increases (sketched below, after
        this list).

      . PERF_SAMPLE_CPU is the one requested with sched_setaffinity.

      . PERF_SAMPLE_TID and PERF_SAMPLE_PID match the task we forked
        in perf_evlist__prepare_workload, whose pid is stored in
        evlist->workload.pid.

      . For the events where these fields are also present in their
        pre-sample_id_all fields (e.g. event->mmap.pid), that they are
        what is expected too.
      
      . That we get a bunch of mmaps:
      
        PATH/libcSUFFIX
        PATH/ldSUFFIX
        [vdso]
        PATH/sleep
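
      One of the checks above, PERF_SAMPLE_TIME monotonicity, could look
      roughly like this (variable names are hypothetical):

        u64 prev_time = 0;

        /* for every event read from the mmap ring buffer: */
        if (sample.time < prev_time) {
                pr_debug("%" PRIu64 " goes backwards (prev %" PRIu64 ")\n",
                         sample.time, prev_time);
                ++errs;          /* another soft error */
        }
        prev_time = sample.time;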
      
      Example:
      
        [root@emilia ~]# taskset -c 3,4 perf test -v1 perf_sample
         6: Validate PERF_RECORD_* events & perf_sample fields:
        --- start ---
        7159480799825 3 PERF_RECORD_SAMPLE
        7159480805584 3 PERF_RECORD_SAMPLE
        7159480807814 3 PERF_RECORD_SAMPLE
        7159480810430 3 PERF_RECORD_SAMPLE
        7159480861511 3 PERF_RECORD_MMAP 8086/8086: [0x7fffffffd000(0x2000) @ 0x7fffffffd000]: //anon
        7159481052516 3 PERF_RECORD_COMM: sleep:8086
        7159481070188 3 PERF_RECORD_MMAP 8086/8086: [0x400000(0x6000) @ 0]: /bin/sleep
        7159481077104 3 PERF_RECORD_MMAP 8086/8086: [0x3d06400000(0x221000) @ 0]: /lib64/ld-2.12.so
        7159481092912 3 PERF_RECORD_MMAP 8086/8086: [0x7fff1adff000(0x1000) @ 0x7fff1adff000]: [vdso]
        7159481196779 3 PERF_RECORD_MMAP 8086/8086: [0x3d06800000(0x37f000) @ 0]: /lib64/libc-2.12.so
        7160481558435 3 PERF_RECORD_EXIT(8086:8086):(8086:8086)
        ---- end ----
        Validate PERF_RECORD_* events & perf_sample fields: Ok
        [root@emilia ~]#
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-svag18v2z4idas0dyz3umjpq@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>