1. 24 Sep, 2012 9 commits
  2. 21 Sep, 2012 3 commits
    • Xiao Guangrong's avatar
      perf kvm: Events analysis tool · bcf6edcd
      Xiao Guangrong authored
      Add 'perf kvm stat' support to analyze kvm vmexit/mmio/ioport smartly
      
      Usage:
      - kvm stat
        run a command and gather performance counter statistics, it is the alias of
        perf stat
      
      - trace kvm events:
        perf kvm stat record, or, if other tracepoints are interesting as well, we
        can append the events like this:
        perf kvm stat record -e timer:* -a
      
        If many guests are running, we can track the specified guest by using -p or
        --pid, -a is used to track events generated by all guests.
      
      - show the result:
        perf kvm stat report
      
      The output example is following:
      13005
      13059
      
      total 2 guests are running on the host
      
      Then, track the guest whose pid is 13059:
      ^C[ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 0.253 MB perf.data.guest (~11065 samples) ]
      
      See the vmexit events:
      
      Analyze events for all VCPUs:
      
                   VM-EXIT    Samples  Samples%     Time%         Avg time
      
               APIC_ACCESS        460    70.55%     0.01%     22.44us ( +-   1.75% )
                       HLT         93    14.26%    99.98% 832077.26us ( +-  10.42% )
        EXTERNAL_INTERRUPT         64     9.82%     0.00%     35.35us ( +-  14.21% )
         PENDING_INTERRUPT         24     3.68%     0.00%      9.29us ( +-  31.39% )
                 CR_ACCESS          7     1.07%     0.00%      8.12us ( +-   5.76% )
            IO_INSTRUCTION          3     0.46%     0.00%     18.00us ( +-  11.79% )
             EXCEPTION_NMI          1     0.15%     0.00%      5.83us ( +-   -nan% )
      
      Total Samples:652, Total events handled time:77396109.80us.
      
      See the mmio events:
      
      Analyze events for all VCPUs:
      
               MMIO Access    Samples  Samples%     Time%         Avg time
      
              0xfee00380:W        387    84.31%    79.28%      8.29us ( +-   3.32% )
              0xfee00300:W         24     5.23%     9.96%     16.79us ( +-   1.97% )
              0xfee00300:R         24     5.23%     7.83%     13.20us ( +-   3.00% )
              0xfee00310:W         24     5.23%     2.93%      4.94us ( +-   3.84% )
      
      Total Samples:459, Total events handled time:4044.59us.
      
      See the ioport event:
      
      Analyze events for all VCPUs:
      
            IO Port Access    Samples  Samples%     Time%         Avg time
      
               0xc050:POUT          3   100.00%   100.00%     13.75us ( +-  10.83% )
      
      Total Samples:3, Total events handled time:41.26us.
      
      And, --vcpu is used to track the specified vcpu and --key is used to sort the
      result:
      
      Analyze events for VCPU 0:
      
                   VM-EXIT    Samples  Samples%     Time%         Avg time
      
                       HLT         27    13.85%    99.97% 405790.24us ( +-  12.70% )
        EXTERNAL_INTERRUPT         13     6.67%     0.00%     27.94us ( +-  22.26% )
               APIC_ACCESS        146    74.87%     0.03%     21.69us ( +-   2.91% )
            IO_INSTRUCTION          2     1.03%     0.00%     17.77us ( +-  20.56% )
                 CR_ACCESS          2     1.03%     0.00%      8.55us ( +-   6.47% )
         PENDING_INTERRUPT          5     2.56%     0.00%      6.27us ( +-   3.94% )
      
      Total Samples:195, Total events handled time:10959950.90us.
      Signed-off-by: default avatarDong Hao <haodong@linux.vnet.ibm.com>
      Signed-off-by: default avatarRunzhen Wang <runzhen@linux.vnet.ibm.com>
      [ Dong Hao <haodong@linux.vnet.ibm.com>
        Runzhen Wang <runzhen@linux.vnet.ibm.com>:
           - rebase it on current acme's tree
           - fix the compiling-error on i386 ]
      Signed-off-by: default avatarXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
      Acked-by: default avatarDavid Ahern <dsahern@gmail.com>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: kvm@vger.kernel.org
      Cc: Runzhen Wang <runzhen@linux.vnet.ibm.com>
      Link: http://lkml.kernel.org/r/1347870675-31495-4-git-send-email-haodong@linux.vnet.ibm.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      bcf6edcd
    • Xiao Guangrong's avatar
      KVM: x86: Export svm/vmx exit code and vector code to userspace · 26bf264e
      Xiao Guangrong authored
      Exporting KVM exit information to userspace to be consumed by perf.
      Signed-off-by: default avatarDong Hao <haodong@linux.vnet.ibm.com>
      [ Dong Hao <haodong@linux.vnet.ibm.com>: rebase it on acme's git tree ]
      Signed-off-by: default avatarXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
      Acked-by: default avatarMarcelo Tosatti <mtosatti@redhat.com>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: kvm@vger.kernel.org
      Cc: Runzhen Wang <runzhen@linux.vnet.ibm.com>
      Link: http://lkml.kernel.org/r/1347870675-31495-2-git-send-email-haodong@linux.vnet.ibm.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      26bf264e
    • Eric Sandeen's avatar
      perf tools: Fix parallel build · e6048fb8
      Eric Sandeen authored
      Parallel builds of perf were failing for me on a 32p box, with:
      
          * new build flags or prefix
      util/pmu.l:7:23: error: pmu-bison.h: No such file or directory
      
      ...
      
      make: *** [util/pmu-flex.o] Error 1
      make: *** Waiting for unfinished jobs....
      
      This can pretty quickly be seen by adding a sleep in front of the bison
      calls in tools/perf/Makefile and running make -j4 on a smaller box i.e.:
      
      	sleep 10; $(QUIET_BISON)$(BISON) -v util/pmu.y -d -o $(OUTPUT)util/pmu-bison.c
      
      Adding the following dependencies fixes it for me.
      Signed-off-by: default avatarEric Sandeen <sandeen@redhat.com>
      Reviewed-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/505BD190.40707@redhat.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      e6048fb8
  3. 20 Sep, 2012 3 commits
  4. 19 Sep, 2012 3 commits
  5. 17 Sep, 2012 12 commits
  6. 15 Sep, 2012 10 commits
    • Oleg Nesterov's avatar
      uprobes: Make arch_uprobe_task->saved_trap_nr "unsigned int" · baedbf02
      Oleg Nesterov authored
      Make arch_uprobe_task->saved_trap_nr "unsigned int" and move it down
      after ->saved_scratch_register, this changes sizeof() from 24 to 16.
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      Acked-by: default avatarSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      baedbf02
    • Oleg Nesterov's avatar
      uprobes/x86: Fix arch_uprobe_disable_step() && UTASK_SSTEP_TRAPPED interaction · d6a00b35
      Oleg Nesterov authored
      arch_uprobe_disable_step() should also take UTASK_SSTEP_TRAPPED into
      account. In this case the probed insn was not executed, we need to
      clear X86_EFLAGS_TF if it was set by us and that is all.
      
      Again, this code will look more clean when we move it into
      arch_uprobe_post_xol() and arch_uprobe_abort_xol().
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      Acked-by: default avatarSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      d6a00b35
    • Oleg Nesterov's avatar
      uprobes/x86: Xol should send SIGTRAP if X86_EFLAGS_TF was set · 3a4664aa
      Oleg Nesterov authored
      arch_uprobe_disable_step() correctly preserves X86_EFLAGS_TF and
      returns to user-mode. But this means the application gets SIGTRAP
      only after the next insn.
      
      This means that UPROBE_CLEAR_TF logic is not really right. _enable
      should only record the state of X86_EFLAGS_TF, and _disable should
      check it separately from UPROBE_FIX_SETF.
      
      Remove arch_uprobe_task->restore_flags, add ->saved_tf instead, and
      change enable/disable accordingly. This assumes that the probed insn
      was not trapped, see the next patch.
      
      arch_uprobe_skip_sstep() logic has the same problem, change it to
      check X86_EFLAGS_TF and send SIGTRAP as well. We will cleanup this
      all after we fold enable/disable_step into pre/post_hol hooks.
      
      Note: send_sig(SIGTRAP) is not actually right, we need send_sigtrap().
      But this needs more changes, handle_swbp() does the same and this is
      equally wrong.
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      Acked-by: default avatarSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      3a4664aa
    • Oleg Nesterov's avatar
      uprobes/x86: Do not (ab)use TIF_SINGLESTEP/user_*_single_step() for single-stepping · 9bd1190a
      Oleg Nesterov authored
      user_enable/disable_single_step() was designed for ptrace, it assumes
      a single user and does unnecessary and wrong things for uprobes. For
      example:
      
      	- arch_uprobe_enable_step() can't trust TIF_SINGLESTEP, an
      	  application itself can set X86_EFLAGS_TF which must be
      	  preserved after arch_uprobe_disable_step().
      
      	- we do not want to set TIF_SINGLESTEP/TIF_FORCED_TF in
      	  arch_uprobe_enable_step(), this only makes sense for ptrace.
      
      	- otoh we leak TIF_SINGLESTEP if arch_uprobe_disable_step()
      	  doesn't do user_disable_single_step(), the application will
      	  be killed after the next syscall.
      
      	- arch_uprobe_enable_step() does access_process_vm() we do
      	  not need/want.
      
      Change arch_uprobe_enable/disable_step() to set/clear X86_EFLAGS_TF
      directly, this is much simpler and more correct. However, we need to
      clear TIF_BLOCKSTEP/DEBUGCTLMSR_BTF before executing the probed insn,
      add set_task_blockstep(false).
      
      Note: with or without this patch, there is another (hopefully minor)
      problem. A probed "pushf" insn can see the wrong X86_EFLAGS_TF set by
      uprobes. Perhaps we should change _disable to update the stack, or
      teach arch_uprobe_skip_sstep() to emulate this insn.
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      Acked-by: default avatarSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      9bd1190a
    • Oleg Nesterov's avatar
      ptrace/x86: Partly fix set_task_blockstep()->update_debugctlmsr() logic · 95cf00fa
      Oleg Nesterov authored
      Afaics the usage of update_debugctlmsr() and TIF_BLOCKSTEP in
      step.c was always very wrong.
      
      1. update_debugctlmsr() was simply unneeded. The child sleeps
         TASK_TRACED, __switch_to_xtra(next_p => child) should notice
         TIF_BLOCKSTEP and set/clear DEBUGCTLMSR_BTF after resume if
         needed.
      
      2. It is wrong. The state of DEBUGCTLMSR_BTF bit in CPU register
         should always match the state of current's TIF_BLOCKSTEP bit.
      
      3. Even get_debugctlmsr() + update_debugctlmsr() itself does not
         look right. Irq can change other bits in MSR_IA32_DEBUGCTLMSR
         register or the caller can be preempted in between.
      
      4. It is not safe to play with TIF_BLOCKSTEP if task != current.
         DEBUGCTLMSR_BTF and TIF_BLOCKSTEP should always match each
         other if the task is running. The tracee is stopped but it
         can be SIGKILL'ed right before set/clear_tsk_thread_flag().
      
      However, now that uprobes uses user_enable_single_step(current)
      we can't simply remove update_debugctlmsr(). So this patch adds
      the additional "task == current" check and disables irqs to avoid
      the race with interrupts/preemption.
      
      Unfortunately this patch doesn't solve the last problem, we need
      another fix. Probably we should teach ptrace_stop() to set/clear
      single/block stepping after resume.
      
      And afaics there is yet another problem: perf can play with
      MSR_IA32_DEBUGCTLMSR from nmi, this obviously means that even
      __switch_to_xtra() has problems.
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      95cf00fa
    • Oleg Nesterov's avatar
      ptrace/x86: Introduce set_task_blockstep() helper · 848e8f5f
      Oleg Nesterov authored
      No functional changes, preparation for the next fix and for uprobes
      single-step fixes.
      
      Move the code playing with TIF_BLOCKSTEP/DEBUGCTLMSR_BTF into the
      new helper, set_task_blockstep().
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      Acked-by: default avatarSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      848e8f5f
    • Sebastian Andrzej Siewior's avatar
      uprobes/x86: Implement x86 specific arch_uprobe_*_step · bdc1e472
      Sebastian Andrzej Siewior authored
      The arch specific implementation behaves like user_enable_single_step()
      except that it does not disable single stepping if it was already
      enabled by ptrace. This allows the debugger to single step over an
      uprobe. The state of block stepping is not restored. It makes only sense
      together with TF and if that was enabled then the debugger is notified.
      
      Note: this is still not correct. For example, TIF_SINGLESTEP check
      is not right, the application itself can set X86_EFLAGS_TF. And otoh
      we leak TIF_SINGLESTEP (set by enable) if the probed insn is "popf".
      See the next patches, we need the changes in arch/x86/kernel/step.c
      first.
      Signed-off-by: default avatarSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      Acked-by: default avatarSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      bdc1e472
    • Sebastian Andrzej Siewior's avatar
      uprobes: Introduce arch_uprobe_enable/disable_step() · 9d778782
      Sebastian Andrzej Siewior authored
      As Oleg pointed out in [0] uprobe should not use the ptrace interface
      for enabling/disabling single stepping.
      
      [0] http://lkml.kernel.org/r/20120730141638.GA5306@redhat.com
      
      Add the new "__weak arch" helpers which simply call user_*_single_step()
      as a preparation. This is only needed to not break the powerpc port, we
      will fold this logic into arch_uprobe_pre/post_xol() hooks later.
      
      We should also change handle_singlestep(), _disable_step(&uprobe->arch)
      should be called before put_uprobe().
      Signed-off-by: default avatarSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      Acked-by: default avatarSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      9d778782
    • Oleg Nesterov's avatar
      uprobes: Teach find_active_uprobe() to clear MMF_HAS_UPROBES · 499a4f3e
      Oleg Nesterov authored
      The wrong MMF_HAS_UPROBES doesn't really hurt, just it triggers
      the "slow" and unnecessary handle_swbp() path if the task hits
      the non-uprobe breakpoint.
      
      So this patch changes find_active_uprobe() to check every valid
      vma and clear MMF_HAS_UPROBES if no uprobes were found. This is
      adds the slow O(n) path, but it is only called in unlikely case
      when the task hits the normal breakpoint first time after
      uprobe_unregister().
      
      Note the "not strictly accurate" comment in mmf_recalc_uprobes().
      We can fix this, we only need to teach vma_has_uprobes() to return
      a bit more more info, but I am not sure this worth the trouble.
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      Acked-by: default avatarSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      499a4f3e
    • Oleg Nesterov's avatar
      uprobes: Introduce MMF_RECALC_UPROBES · 9f68f672
      Oleg Nesterov authored
      Add the new MMF_RECALC_UPROBES flag, it means that MMF_HAS_UPROBES
      can be false positive after remove_breakpoint() or uprobe_munmap().
      It is also set by uprobe_dup_mmap(), this is not optimal but simple.
      We could add the new hook, uprobe_dup_vma(), to set MMF_HAS_UPROBES
      only if the new mm actually has uprobes, but I don't think this
      makes sense.
      
      The next patch will use this flag to clear MMF_HAS_UPROBES.
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      Acked-by: default avatarSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      9f68f672