1. 08 Feb, 2013 7 commits
    • uprobes: Introduce uprobe->register_rwsem · e591c8d7
      Oleg Nesterov authored
      Introduce uprobe->register_rwsem. It is taken for writing around
      __uprobe_register/unregister.
      
      Change handler_chain() to use this sem rather than consumer_rwsem.
      
      The main reason for this change is the nasty mmap_sem/consumer_rwsem
      dependency. filter_chain() needs to protect uprobe->consumers just as
      handler_chain() does, but the two cannot use the same lock.
      filter_chain() can be called under ->mmap_sem (currently this is
      always true), but we want to allow ->handler() to play with the
      probed task's memory, and this needs ->mmap_sem.
      
      Alternatively we could use srcu, but synchronize_srcu() is very
      slow and ->register_rwsem allows us to do more. In particular, we
      can teach handler_chain() to do remove_breakpoint() if this bp is
      "nacked" by all consumers, because we know that we can't race with
      a new consumer calling uprobe_register().
      
      See also the next patches. uprobes_mutex[] is almost ready to die.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
    • uprobes: _register() should always do register_for_each_vma(true) · 9a98e03c
      Oleg Nesterov authored
      To support filtering, uprobe_register() should do
      register_for_each_vma(true) every time a new consumer arrives,
      because we need to install the previously nacked breakpoints.
      
      Note:
      	- uprobes_mutex[] should die, what it actually protects is
      	  alloc_uprobe().
      
      	- UPROBE_RUN_HANDLER should die too, obviously it can't work
      	  unless uprobe has a single consumer. The consumer should
      	  serialize with _register/_unregister itself. Or this flag
      	  should live in uprobe_consumer->state.
      
      	- Perhaps we can do some optimizations later. For example, if
      	  filter_chain() never returns false uprobe can record this
      	  fact and avoid the unnecessary register_for_each_vma().
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
    • uprobes: _unregister() should always do register_for_each_vma(false) · 04aab9b2
      Oleg Nesterov authored
      uprobe_unregister() removes the breakpoints only if the last consumer
      goes away. To support filtering it should do this every time, because
      we want to remove the breakpoints which nobody else wants to keep.
      
      Note: given that filter_chain() is not actually implemented, this patch
      itself doesn't change the behaviour yet; register_for_each_vma(false)
      is a heavy "nop" unless there are no more consumers.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
    • uprobes: Introduce filter_chain() · 63633cbf
      Oleg Nesterov authored
      Add the new helper filter_chain(). Currently it is only a placeholder;
      the comment explains what it should do. We will change it later to
      consult every consumer to decide whether we need to install the swbp.
      Until then it works as if any consumer returns true, which matches the
      current behavior.
      
      Change install_breakpoint() to call filter_chain() instead of checking
      uprobe->consumers != NULL. We obviously need this, and this equally
      closes the race with _unregister().
      
      Change remove_breakpoint() to call this helper too. Currently this is
      pointless because remove_breakpoint() is only called when the last
      consumer goes away, but we will change this.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
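
      The intended later behaviour of filter_chain(), consulting every
      consumer, might look roughly like the sketch below. It is a
      guess at the shape only: struct fc_consumer, its filter callback
      signature, and the helpers accept()/reject() are hypothetical
      and are not taken from the kernel source.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical consumer with an optional per-consumer filter. */
struct fc_consumer {
    /* NULL means "no filter": this consumer always wants the probe. */
    bool (*filter)(const struct fc_consumer *self);
    struct fc_consumer *next;
};

/* Example filters used only for illustration. */
bool accept(const struct fc_consumer *self) { (void)self; return true; }
bool reject(const struct fc_consumer *self) { (void)self; return false; }

/*
 * Return true if at least one consumer wants the breakpoint.
 * An empty list yields false, the same answer the older
 * "consumers != NULL" check would give.
 */
bool filter_chain(const struct fc_consumer *head)
{
    for (const struct fc_consumer *c = head; c; c = c->next) {
        if (!c->filter || c->filter(c))
            return true;
    }
    return false;
}
```

      With every filter left as NULL this reduces to the placeholder
      semantics: the breakpoint is wanted whenever any consumer exists.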
    • uprobes: Kill uprobe_consumer->filter() · fe20d71f
      Oleg Nesterov authored
      uprobe_consumer->filter() is pointless in its current form, kill it.
      
      We will add it back, but with a different signature/semantics. Perhaps
      we will even re-introduce the callsite in handler_chain(), but not to
      just skip uc->handler().
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
    • uprobes: Kill the pointless inode/uc checks in register/unregister · f0744af7
      Oleg Nesterov authored
      register/unregister verify that inode/uc != NULL. For what?
      This really looks like "hide the potential problem"; the caller
      should pass valid data.
      
      register() also checks uc->next == NULL, probably to prevent a
      double-register, but the caller can do other stupid/wrong things.
      If we do this check, then we should document that uc->next should
      be cleared before register() and add BUG_ON().
      
      Also add the small comment about the i_size_read() check.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
    • uprobes: Move __set_bit(UPROBE_SKIP_SSTEP) into alloc_uprobe() · bbc33d05
      Oleg Nesterov authored
      Cosmetic. __set_bit(UPROBE_SKIP_SSTEP) is part of initialization;
      it is not clear why it is set in insert_uprobe().
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
  2. 06 Feb, 2013 24 commits
  3. 03 Feb, 2013 1 commit
  4. 01 Feb, 2013 2 commits
    • tracing: Init current_trace to nop_trace and remove NULL checks · d840f718
      Steven Rostedt (Red Hat) authored
      On early boot up, when the ftrace ring buffer is initialized, the
      static variable current_trace is initialized to &nop_trace.
      Before this initialization, current_trace is NULL; afterwards it
      never becomes NULL again, because it is only ever reassigned to an
      ftrace tracer.
      
      Several places check whether current_trace is NULL before using
      it, and this check is frivolous: at the point in time when the
      checks are made, the only way current_trace could be NULL is if
      ftrace failed its allocations at boot up, and the paths to these
      locations would probably not be reachable in that case.
      
      By initializing current_trace to &nop_trace where it is declared,
      current_trace will never be NULL, and we can remove all these
      checks of current_trace being NULL which never needed to be
      checked in the first place.
      
      Cc: Dan Carpenter <dan.carpenter@oracle.com>
      Cc: Hiraku Toyooka <hiraku.toyooka.gu@hitachi.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
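
      The idea of pointing current_trace at a do-nothing tracer from
      its definition can be illustrated with a small standalone C
      sketch. The struct tracer layout and the set_tracer() helper
      below are simplified assumptions for illustration; they are not
      the ftrace definitions.

```c
#include <stddef.h>

/* Simplified, hypothetical tracer descriptor. */
struct tracer {
    const char *name;
    void (*reset)(void);
};

void nop_reset(void) { }

static struct tracer nop_trace = { "nop", nop_reset };

/* Initialized where it is declared, so it can never be observed as NULL. */
static struct tracer *current_trace = &nop_trace;

/* Switch tracers; callers never need to test current_trace for NULL. */
int set_tracer(struct tracer *t)
{
    if (!t)
        return -1;            /* refuse to reintroduce a NULL tracer */
    current_trace->reset();   /* safe: always points at a valid tracer */
    current_trace = t;
    return 0;
}
```

      Every use of current_trace can then dereference it directly,
      which is what makes the removed NULL checks redundant.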
    • Merge tag 'perf-core-for-mingo' of... · 9c4c5fd9
      Ingo Molnar authored
      Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
      . Make some POWER7 events available in sysfs, equivalent to
        what was done on x86, from Sukadev Bhattiprolu.
      
      . Add event group view, from Namhyung Kim:
      
        To use it, 'perf record' should group events when recording. 'perf
        report' then parses the saved group relation from the file header and
        prints the events together if the --group option is provided.  You can
        use the 'perf evlist' command to see event group information:
      
          $ perf record -e '{ref-cycles,cycles}' noploop 1
          [ perf record: Woken up 2 times to write data ]
          [ perf record: Captured and wrote 0.385 MB perf.data (~16807 samples) ]
      
          $ perf evlist --group
          {ref-cycles,cycles}
      
        With this example, the default perf report will show you each event
        separately, like this:
      
          $ perf report
          ...
          # group: {ref-cycles,cycles}
          # ========
          # Samples: 3K of event 'ref-cycles'
          # Event count (approx.): 3153797218
          #
          # Overhead  Command      Shared Object                      Symbol
          # ........  .......  .................  ..........................
              99.84%  noploop  noploop            [.] main
               0.07%  noploop  ld-2.15.so         [.] strcmp
               0.03%  noploop  [kernel.kallsyms]  [k] timerqueue_del
               0.03%  noploop  [kernel.kallsyms]  [k] sched_clock_cpu
               0.02%  noploop  [kernel.kallsyms]  [k] account_user_time
               0.01%  noploop  [kernel.kallsyms]  [k] __alloc_pages_nodemask
               0.00%  noploop  [kernel.kallsyms]  [k] native_write_msr_safe
      
          # Samples: 3K of event 'cycles'
          # Event count (approx.): 3722310525
          #
          # Overhead  Command      Shared Object                     Symbol
          # ........  .......  .................  .........................
              99.76%  noploop  noploop            [.] main
               0.11%  noploop  [kernel.kallsyms]  [k] _raw_spin_lock
               0.06%  noploop  [kernel.kallsyms]  [k] find_get_page
               0.03%  noploop  [kernel.kallsyms]  [k] sched_clock_cpu
               0.02%  noploop  [kernel.kallsyms]  [k] rcu_check_callbacks
               0.02%  noploop  [kernel.kallsyms]  [k] __current_kernel_time
               0.00%  noploop  [kernel.kallsyms]  [k] native_write_msr_safe
      
        In this case the event group information is shown at the end of the
        header area.  You can use the --group option to enable event group view:
      
          $ perf report --group
          ...
          # group: {ref-cycles,cycles}
          # ========
          # Samples: 7K of event 'anon group { ref-cycles, cycles }'
          # Event count (approx.): 6876107743
          #
          #         Overhead  Command      Shared Object                      Symbol
          # ................  .......  .................  ..........................
              99.84%  99.76%  noploop  noploop            [.] main
               0.07%   0.00%  noploop  ld-2.15.so         [.] strcmp
               0.03%   0.00%  noploop  [kernel.kallsyms]  [k] timerqueue_del
               0.03%   0.03%  noploop  [kernel.kallsyms]  [k] sched_clock_cpu
               0.02%   0.00%  noploop  [kernel.kallsyms]  [k] account_user_time
               0.01%   0.00%  noploop  [kernel.kallsyms]  [k] __alloc_pages_nodemask
               0.00%   0.00%  noploop  [kernel.kallsyms]  [k] native_write_msr_safe
               0.00%   0.11%  noploop  [kernel.kallsyms]  [k] _raw_spin_lock
               0.00%   0.06%  noploop  [kernel.kallsyms]  [k] find_get_page
               0.00%   0.02%  noploop  [kernel.kallsyms]  [k] rcu_check_callbacks
               0.00%   0.02%  noploop  [kernel.kallsyms]  [k] __current_kernel_time
      
        As you can see, the Overhead column now contains both ref-cycles and
        cycles, and the header line also shows group information: 'anon group {
        ref-cycles, cycles }'.  The output is sorted by the period of the group
        leader first.
      
        If the perf.data file doesn't contain group information, the --group
        option does nothing.  If you want to enable event group view by
        default, you can set it in the ~/.perfconfig file:
      
          $ cat ~/.perfconfig
          [report]
          group = true
      
        It can be overridden on the command line if you want:
      
          $ perf report --no-group
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  5. 31 Jan, 2013 6 commits