1. 06 Nov, 2013 5 commits
    • perf: Optimize perf_output_begin() -- lost_event case · d20a973f
      Peter Zijlstra authored
      Avoid touching the lost_event and sample_data cachelines twice. It's
      not like we end up doing less work, but it might help to keep all
      accesses to these cachelines in one place.
      
      Due to code shuffle, this loses 4 bytes on x86_64-defconfig.
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Michael Ellerman <michael@ellerman.id.au>
      Cc: Michael Neuling <mikey@neuling.org>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: james.hogan@imgtec.com
      Cc: Vince Weaver <vince@deater.net>
      Cc: Victor Kaplansky <VICTORK@il.ibm.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Anton Blanchard <anton@samba.org>
      Link: http://lkml.kernel.org/n/tip-zfxnc58qxj0eawdoj31hhupv@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf: Optimize perf_output_begin() · 85f59edf
      Peter Zijlstra authored
      There's no point in re-doing the memory barrier when we fail the
      cmpxchg(). Also, placing it after the space reservation loop makes it
      clearer that it only separates the userpage->tail read from the data
      stores.
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Michael Ellerman <michael@ellerman.id.au>
      Cc: Michael Neuling <mikey@neuling.org>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: james.hogan@imgtec.com
      Cc: Vince Weaver <vince@deater.net>
      Cc: Victor Kaplansky <VICTORK@il.ibm.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Anton Blanchard <anton@samba.org>
      Link: http://lkml.kernel.org/n/tip-c19u6egfldyx86tpyc3zgkw9@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
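The barrier placement described in 85f59edf can be illustrated with a small user-space sketch. This is not the kernel code: it uses C11 atomics in place of the kernel's cmpxchg() and memory barrier, and `struct sketch_ring`, `sketch_ring_init()` and `sketch_reserve()` are hypothetical names, as is the free-running head/tail layout. The point it shows is only that a failed compare-exchange retries without a fence, and a single fence after the loop orders the tail read before the data stores that follow.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical ring descriptor; names and layout are illustrative only. */
struct sketch_ring {
    atomic_ulong head;   /* free-running producer cursor */
    atomic_ulong tail;   /* free-running consumer cursor (stands in for userpage->tail) */
    unsigned long size;  /* capacity in bytes, must be a power of two */
};

static void sketch_ring_init(struct sketch_ring *rb, unsigned long size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    atomic_init(&rb->head, 0);
    atomic_init(&rb->tail, 0);
    rb->size = size;
}

/* Reserve len bytes; on success store the start offset in *offset. */
static bool sketch_reserve(struct sketch_ring *rb, unsigned long len,
                           unsigned long *offset)
{
    unsigned long head = atomic_load_explicit(&rb->head, memory_order_relaxed);
    unsigned long tail, new_head;

    do {
        tail = atomic_load_explicit(&rb->tail, memory_order_relaxed);
        if (rb->size - (head - tail) < len)
            return false;            /* not enough free space */
        new_head = head + len;
        /* A failed compare-exchange reloads head and retries: no fence here. */
    } while (!atomic_compare_exchange_weak_explicit(&rb->head, &head, new_head,
                                                    memory_order_relaxed,
                                                    memory_order_relaxed));

    /*
     * One full fence after the reservation loop: it separates the tail
     * read above from the data stores the caller performs next.
     */
    atomic_thread_fence(memory_order_seq_cst);

    *offset = head & (rb->size - 1);
    return true;
}
```

A caller would invoke `sketch_reserve()` and, on success, write its record starting at the returned offset; the consumer side (advancing `tail`) is outside the scope of this sketch.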
    • perf: Add unlikely() to the ring-buffer code · c72b42a3
      Peter Zijlstra authored
      Add unlikely() annotations to 'slow' paths:
      
      When you have a sampling event but no output buffer, you have bigger
      issues -- also, the bail is still faster than actually doing the work.
      
      When you have a sampling event but a control-page-only buffer, you have
      bigger issues -- again, the bail is still faster than actually doing
      the work.
      
      Optimize for the case where you're not losing events -- again, not
      doing the work is still faster, but make sure that when you have to
      actually do work it's as fast as possible.
      
      The typical watermark is 1/2 the buffer size, so most events will not
      take this path.
      
      Shrinks perf_output_begin() by 16 bytes on x86_64-defconfig.
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Michael Ellerman <michael@ellerman.id.au>
      Cc: Michael Neuling <mikey@neuling.org>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: james.hogan@imgtec.com
      Cc: Vince Weaver <vince@deater.net>
      Cc: Victor Kaplansky <VICTORK@il.ibm.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Anton Blanchard <anton@samba.org>
      Link: http://lkml.kernel.org/n/tip-wlg3jew3qnutm8opd0hyeuwn@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
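As a rough illustration of the annotations described in c72b42a3, the sketch below marks the bail-out and slow paths of a made-up output routine with unlikely(). The `unlikely()` macro is written the way the kernel defines it for GCC and Clang, on top of `__builtin_expect`; everything else (`struct sketch_buffer`, `sketch_output_begin()`, the status codes, and the simplified watermark accounting) is hypothetical and only mirrors the cases listed in the commit message. The hint changes code layout, not behaviour.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__GNUC__) || defined(__clang__)
#define unlikely(x) __builtin_expect(!!(x), 0)   /* branch is expected to be false */
#else
#define unlikely(x) (x)                          /* no hint available */
#endif

/* Hypothetical status codes for this sketch. */
enum sketch_status {
    SKETCH_OK = 0,
    SKETCH_NO_BUFFER,      /* sampling event without an output buffer */
    SKETCH_NO_DATA_PAGES,  /* buffer has a control page only */
    SKETCH_LOST            /* not enough room, event dropped */
};

/* Hypothetical buffer state; not the kernel's ring-buffer structure. */
struct sketch_buffer {
    unsigned long nr_pages;      /* data pages; 0 means control page only */
    unsigned long free_bytes;    /* room left for new records */
    unsigned long lost;          /* number of dropped events */
    unsigned long since_wakeup;  /* bytes written since the last wakeup */
    unsigned long watermark;     /* wake the consumer after this many bytes */
};

static enum sketch_status sketch_output_begin(struct sketch_buffer *rb,
                                              unsigned long size, int *wakeup)
{
    assert(wakeup != NULL);
    *wakeup = 0;

    if (unlikely(!rb))                    /* no output buffer: bail */
        return SKETCH_NO_BUFFER;
    if (unlikely(!rb->nr_pages))          /* control page only: bail */
        return SKETCH_NO_DATA_PAGES;
    if (unlikely(rb->free_bytes < size)) {
        rb->lost++;                       /* losing events is the slow case */
        return SKETCH_LOST;
    }

    rb->free_bytes -= size;
    rb->since_wakeup += size;
    if (unlikely(rb->since_wakeup >= rb->watermark)) {
        rb->since_wakeup = 0;             /* most records do not reach the watermark */
        *wakeup = 1;
    }
    return SKETCH_OK;
}
```

The fall-through path (space available, watermark not reached) is the one the compiler is asked to keep straight-line, which matches the commit's goal of making the real work as fast as possible.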
    • perf: Simplify the ring-buffer code · 26c86da8
      Peter Zijlstra authored
      By using CIRC_SPACE() we can obviate the need for perf_output_space().
      
      Shrinks the size of perf_output_begin() by 17 bytes on
      x86_64-defconfig.
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Michael Ellerman <michael@ellerman.id.au>
      Cc: Michael Neuling <mikey@neuling.org>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: james.hogan@imgtec.com
      Cc: Vince Weaver <vince@deater.net>
      Cc: Victor Kaplansky <VICTORK@il.ibm.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Anton Blanchard <anton@samba.org>
      Link: http://lkml.kernel.org/n/tip-vtb0xb0llebmsdlfn1v5vtfj@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
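For readers unfamiliar with CIRC_SPACE(), the sketch below reproduces the macro pair from the kernel's include/linux/circ_buf.h so that it compiles in user space, and wraps it in a hypothetical helper, `sketch_has_room()`, that is not part of the commit. It assumes a power-of-two buffer size, which the masking in the macros requires, and shows that one slot is always kept free so that a full buffer can be told apart from an empty one.

```c
#include <assert.h>
#include <stdbool.h>

/* Macro pair as in include/linux/circ_buf.h; size must be a power of two. */
#define CIRC_CNT(head, tail, size)   (((head) - (tail)) & ((size) - 1))
#define CIRC_SPACE(head, tail, size) CIRC_CNT((tail), ((head) + 1), (size))

/* Hypothetical helper: is there room for len more bytes? */
static bool sketch_has_room(unsigned long head, unsigned long tail,
                            unsigned long size, unsigned long len)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    return CIRC_SPACE(head, tail, size) >= len;
}
```

With an 8-byte buffer, head at 5 and tail at 2, three bytes are in use and CIRC_SPACE() reports four free, so `sketch_has_room(5, 2, 8, 4)` is true while a request for five bytes is refused.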
    • Merge tag 'perf-core-for-mingo' of... · 83bf9702
      Ingo Molnar authored
      Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
        * Check maximum frequency rate for record/top, emitting better error
          messages, from Jiri Olsa.
      
        * Disable live kvm command if timerfd is not supported, from David Ahern.
      
        * Add usage to 'perf list', from David Ahern.
      
        * Fix detection of non-core features, from David Ahern.
      
        * Consolidate __hists__add_*entry(), cleanup from Namhyung Kim.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  2. 05 Nov, 2013 9 commits
  3. 04 Nov, 2013 26 commits