1. 18 Dec, 2010 5 commits
    • irq_work: Use per cpu atomics instead of regular atomics · 20b87691
      Christoph Lameter authored
      The irq work queue is a per cpu object and it is sufficient for
      synchronization if per cpu atomics are used. Doing so simplifies
      the code and reduces the overhead of the code.
      
      Before:
      
      christoph@linux-2.6$ size kernel/irq_work.o
         text	   data	    bss	    dec	    hex	filename
          451	      8	      1	    460	    1cc	kernel/irq_work.o
      
      After:
      
      christoph@linux-2.6$ size kernel/irq_work.o 
         text	   data	    bss	    dec	    hex	filename
          438	      8	      1	    447	    1bf	kernel/irq_work.o
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Christoph Lameter <cl@linux.com>
    • Merge branch 'this_cpu_ops' into for-2.6.38 · 05c2d088
      Tejun Heo authored
    • cpuops: Use cmpxchg for xchg to avoid lock semantics · 8270137a
      Christoph Lameter authored
      Use cmpxchg instead of xchg to realize this_cpu_xchg.
      
      xchg will cause LOCK overhead since LOCK is always implied but cmpxchg
      will not.
      
      Baselines:
      
      xchg()		= 18 cycles (no segment prefix, LOCK semantics)
      __this_cpu_xchg = 1 cycle
      
      (simulated using this_cpu_read/write, two prefixes. Looks like the
      cpu can use loop optimization to get rid of most of the overhead)
      
      Cycles before:
      
      this_cpu_xchg	 = 37 cycles (segment prefix and LOCK (implied by xchg))
      
      After:
      
      this_cpu_xchg	= 11 cycles (using cmpxchg without lock semantics)

      Signed-off-by: Christoph Lameter <cl@linux.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
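
The idea of building an exchange out of compare-and-exchange can be sketched in portable C11. This is only an illustration of the retry loop, not the kernel macro: the helper name `xchg_via_cmpxchg` is hypothetical, and C11 atomics on x86 still emit a LOCK-prefixed cmpxchg, whereas the commit relies on a segment-prefixed cmpxchg without LOCK acting on the current cpu's data.

```c
#include <stdatomic.h>

/*
 * Hypothetical helper (not a kernel API): exchange implemented as a
 * compare-and-swap retry loop. Returns the value that was replaced.
 */
static long xchg_via_cmpxchg(_Atomic long *slot, long new_value)
{
    /* Start from a snapshot of the current value. */
    long old = atomic_load_explicit(slot, memory_order_relaxed);

    /*
     * Try to swap in new_value. On failure the call stores the value it
     * actually found into `old`, so the loop simply retries with it.
     */
    while (!atomic_compare_exchange_weak_explicit(slot, &old, new_value,
                                                  memory_order_relaxed,
                                                  memory_order_relaxed))
        ;

    return old;
}
```

Calling `xchg_via_cmpxchg(&slot, 9)` on a slot holding 5 returns 5 and leaves 9 in the slot.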
    • x86: this_cpu_cmpxchg and this_cpu_xchg operations · 7296e08a
      Christoph Lameter authored
      Provide support as far as the hardware capabilities of the x86 cpus
      allow.
      
      Define CONFIG_CMPXCHG_LOCAL in Kconfig.cpu to allow core code to test for
      fast cpuops implementations.
      
      V1->V2:
      	- Take out the definition for this_cpu_cmpxchg_8 and move it into
      	  a separate patch.
      
      tj: - Reordered ops to better follow this_cpu_* organization.
          - Renamed macro temp variables similar to their existing
            neighbours.
      Signed-off-by: Christoph Lameter <cl@linux.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • percpu: Generic this_cpu_cmpxchg() and this_cpu_xchg support · 2b712442
      Christoph Lameter authored
      Generic code to provide new per cpu atomic features
      
      	this_cpu_cmpxchg
      	this_cpu_xchg
      
      The fallback uses functions that disable and re-enable interrupts
      to ensure correct per cpu atomicity.
      
      Fallback to regular cmpxchg and xchg is not possible since per cpu atomic
      semantics include the guarantee that the current cpu's per cpu data is
      accessed atomically. Regular cmpxchg and xchg require the address of the
      per cpu data to be determined first, and that address calculation cannot
      be included atomically in an xchg or cmpxchg without a segment override.
      
      tj: - Relocated new ops to conform better to the general organization.
          - This patch contains a trivial comment fix.
      Signed-off-by: Christoph Lameter <cl@linux.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
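
The shape of the interrupt-based fallback described above can be sketched in plain C. Everything here is an assumption made for illustration: `sketch_irq_disable` and `sketch_irq_enable` are stand-ins that only record state (userspace cannot mask interrupts), and `sketch_cmpxchg` / `sketch_xchg` are hypothetical names, not the generic macros from the patch.

```c
#include <stdbool.h>

/* Stand-ins for local interrupt disable/enable; they only track a flag. */
static bool sketch_irqs_off;
static void sketch_irq_disable(void) { sketch_irqs_off = true; }
static void sketch_irq_enable(void)  { sketch_irqs_off = false; }

/*
 * Compare-and-exchange as a plain read, compare and write, made atomic
 * with respect to the local cpu by bracketing it with the stand-ins.
 * Returns the value found in the slot.
 */
static long sketch_cmpxchg(long *slot, long expected, long new_value)
{
    long old;

    sketch_irq_disable();
    old = *slot;
    if (old == expected)
        *slot = new_value;
    sketch_irq_enable();

    return old;
}

/* Exchange done the same way: read the old value, then store the new one. */
static long sketch_xchg(long *slot, long new_value)
{
    long old;

    sketch_irq_disable();
    old = *slot;
    *slot = new_value;
    sketch_irq_enable();

    return old;
}
```

In the kernel the slot would be the current cpu's instance of a per cpu variable, and keeping the address lookup inside the protected region is what a plain cmpxchg or xchg on a precomputed address cannot provide.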
  2. 17 Dec, 2010 20 commits
  3. 16 Dec, 2010 15 commits