1. 15 Aug, 2008 2 commits
    • Mathieu Desnoyers's avatar
      x86: revert replace LOCK_PREFIX in futex.h · 1f49a2c2
      Mathieu Desnoyers authored
      Since we now use DS prefixes instead of NOP to remove LOCK prefixes,
      there are no longer any issues with instruction boundaries moving around.
      
      Depends on :
      
      x86 alternatives : fix LOCK_PREFIX race with preemptible kernel and CPU hotplug
      
      On Thu, 14 Aug 2008, Mathieu Desnoyers wrote:
      >
      > Changing the 0x90 (single-byte nop) currently used into a 0x3E DS segment
      > override prefix should fix this issue. Since the default of the atomic
      > instructions is to use the DS segment anyway, it should not affect the
      > behavior.
      
      Ok, so I think this is an _excellent_ patch, but I'd like to also then use
      LOCK_PREFIX in include/asm-x86/futex.h.
      
      See commit 9d55b992.
      
                      Linus
      
      Applies to 2.6.27-rc2 (and -rc3 unless hell broke loose in futex.h between rc2
      and rc3).
      Signed-off-by: default avatarMathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      CC: Linus Torvalds <torvalds@linux-foundation.org>
      CC: H. Peter Anvin <hpa@zytor.com>
      CC: Jeremy Fitzhardinge <jeremy@goop.org>
      CC: Roland McGrath <roland@redhat.com>
      CC: Ingo Molnar <mingo@elte.hu>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      CC: Steven Rostedt <srostedt@redhat.com>
      CC: Thomas Gleixner <tglx@linutronix.de>
      CC: Peter Zijlstra <peterz@infradead.org>
      CC: Andrew Morton <akpm@linux-foundation.org>
      CC: David Miller <davem@davemloft.net>
      CC: Ulrich Drepper <drepper@redhat.com>
      CC: Rusty Russell <rusty@rustcorp.com.au>
      CC: Gregory Haskins <ghaskins@novell.com>
      CC: Arnaldo Carvalho de Melo <acme@redhat.com>
      CC: "Luis Claudio R. Goncalves" <lclaudio@uudg.org>
      CC: Clark Williams <williams@redhat.com>
      CC: Christoph Lameter <cl@linux-foundation.org>
      CC: Andi Kleen <andi@firstfloor.org>
      CC: Harvey Harrison <harvey.harrison@gmail.com>
      Signed-off-by: default avatarH. Peter Anvin <hpa@zytor.com>
      1f49a2c2
    • Mathieu Desnoyers's avatar
      x86: alternatives : fix LOCK_PREFIX race with preemptible kernel and CPU hotplug · f88f07e0
      Mathieu Desnoyers authored
      If a kernel thread is preempted in single-cpu mode right after the NOP (nop
      about to be turned into a lock prefix), then we CPU hotplug a CPU, and then the
      thread is scheduled back again, a SMP-unsafe atomic operation will be used on
      shared SMP variables, leading to corruption. No corruption would happen in the
      reverse case : going from SMP to UP is ok because we split a bit instruction
      into tiny pieces, which does not present this condition.
      
      Changing the 0x90 (single-byte nop) currently used into a 0x3E DS segment
      override prefix should fix this issue. Since the default of the atomic
      instructions is to use the DS segment anyway, it should not affect the
      behavior.
      
      The exception to this are references that use ESP/RSP and EBP/RBP as
      the base register (they will use the SS segment), however, in Linux
      (a) DS == SS at all times, and (b) we do not distinguish between
      segment violations reported as #SS as opposed to #GP, so there is no
      need to disassemble the instruction to figure out the suitable segment.
      
      This patch assumes that the 0x3E prefix will leave atomic operations as-is (thus
      assuming they normally touch data in the DS segment). Since there seem to be no
      obvious ill-use of other segment override prefixes for atomic operations, it
      should be safe. It can be verified with a quick
      
      grep -r LOCK_PREFIX include/asm-x86/
      grep -A 1 -r LOCK_PREFIX arch/x86/
      
      Taken from
      
      This source :
      AMD64 Architecture Programmer's Manual Volume 3: General-Purpose and System
      Instructions
      States
      "Instructions that Reference a Non-Stack Segment—If an instruction encoding
      references any base register other than rBP or rSP, or if an instruction
      contains an immediate offset, the default segment is the data segment (DS).
      These instructions can use the segment-override prefix to select one of the
      non-default segments, as shown in Table 1-5."
      
      Therefore, forcing the DS segment on the atomic operations, which already use
      the DS segment, should not change.
      
      This source :
      http://wiki.osdev.org/X86_Instruction_Encoding
      States
      "In 64-bit the CS, SS, DS and ES segment overrides are ignored."
      
      Confirmed by "AMD 64-Bit Technology" A.7
      http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/x86-64_overview.pdf
      
      "In 64-bit mode, the DS, ES, SS and CS segment-override prefixes have no effect.
      These four prefixes are no longer treated as segment-override prefixes in the
      context of multipleprefix rules. Instead, they are treated as null prefixes."
      
      This patch applies to 2.6.27-rc2, but would also have to be applied to earlier
      kernels (2.6.26, 2.6.25, ...).
      
      Performance impact of the fix : tests done on "xaddq" and "xaddl" shows it
      actually improves performances on Intel Xeon, AMD64, Pentium M. It does not
      change the performance on Pentium II, Pentium 3 and Pentium 4.
      
      Xeon E5405 2.0GHz :
      NR_TESTS                                    10000000
      test empty cycles :                        162207948
      test test 1-byte nop xadd cycles :         170755422
      test test DS override prefix xadd cycles : 170000118 *
      test test LOCK xadd cycles :               472012134
      
      AMD64 2.0GHz :
      NR_TESTS                                    10000000
      test empty cycles :                        146674549
      test test 1-byte nop xadd cycles :         150273860
      test test DS override prefix xadd cycles : 149982382 *
      test test LOCK xadd cycles :               270000690
      
      Pentium 4 3.0GHz
      NR_TESTS                                    10000000
      test empty cycles :                        290001195
      test test 1-byte nop xadd cycles :         310000560
      test test DS override prefix xadd cycles : 310000575 *
      test test LOCK xadd cycles :              1050103740
      
      Pentium M 2.0GHz
      NR_TESTS 10000000
      test empty cycles :                        180000523
      test test 1-byte nop xadd cycles :         320000345
      test test DS override prefix xadd cycles : 310000374 *
      test test LOCK xadd cycles :               480000357
      
      Pentium 3 550MHz
      NR_TESTS                                    10000000
      test empty cycles :                        510000231
      test test 1-byte nop xadd cycles :         620000128
      test test DS override prefix xadd cycles : 620000110 *
      test test LOCK xadd cycles :               800000088
      
      Pentium II 350MHz
      NR_TESTS                                    10000000
      test empty cycles :                        200833494
      test test 1-byte nop xadd cycles :         340000130
      test test DS override prefix xadd cycles : 340000126 *
      test test LOCK xadd cycles :               530000078
      
      Speed test modules can be found at
      http://ltt.polymtl.ca/svn/trunk/tests/kernel/test-prefix-speed-32.c
      http://ltt.polymtl.ca/svn/trunk/tests/kernel/test-prefix-speed.c
      
      Macro-benchmarks
      
      2.0GHz E5405 Core 2 dual Quad-Core Xeon
      
      Summary
      
      * replace smp lock prefixes with DS segment selector prefixes
                        no lock prefix (s)   with lock prefix (s)    Speedup
      make -j1 kernel/      33.94 +/- 0.07         34.91 +/- 0.27      2.8 %
      hackbench 50           2.99 +/- 0.01          3.74 +/- 0.01     25.1 %
      
      * replace smp lock prefixes with 0x90 nops
                        no lock prefix (s)   with lock prefix (s)    Speedup
      make -j1 kernel/      34.16 +/- 0.32         34.91 +/- 0.27      2.2 %
      hackbench 50           3.00 +/- 0.01          3.74 +/- 0.01     24.7 %
      
      Detail :
      
      1 CPU, replace smp lock prefixes with DS segment selector prefixes
      
      make -j1 kernel/
      
      real	0m34.067s
      user	0m30.630s
      sys	0m2.980s
      
      real	0m33.867s
      user	0m30.582s
      sys	0m3.024s
      
      real	0m33.939s
      user	0m30.738s
      sys	0m2.876s
      
      real	0m33.913s
      user	0m30.806s
      sys	0m2.808s
      
      avg : 33.94s
      std. dev. : 0.07s
      
      hackbench 50
      
      Time: 2.978
      Time: 2.982
      Time: 3.010
      Time: 2.984
      Time: 2.982
      
      avg : 2.99
      std. dev. : 0.01
      
      1 CPU, noreplace-smp
      
      make -j1 kernel/
      
      real	0m35.326s
      user	0m30.630s
      sys	0m3.260s
      
      real	0m34.325s
      user	0m30.802s
      sys	0m3.084s
      
      real	0m35.568s
      user	0m30.722s
      sys	0m3.168s
      
      real	0m34.435s
      user	0m30.886s
      sys	0m2.996s
      
      avg.: 34.91s
      std. dev. : 0.27s
      
      hackbench 50
      
      Time: 3.733
      Time: 3.750
      Time: 3.761
      Time: 3.737
      Time: 3.741
      
      avg : 3.74
      std. dev. : 0.01
      
      1 CPU, replace smp lock prefixes with 0x90 nops
      
      make -j1 kernel/
      
      real	0m34.139s
      user	0m30.782s
      sys	0m2.820s
      
      real	0m34.010s
      user	0m30.630s
      sys	0m2.976s
      
      real	0m34.777s
      user	0m30.658s
      sys	0m2.916s
      
      real	0m33.924s
      user	0m30.634s
      sys	0m2.924s
      
      real	0m33.962s
      user	0m30.774s
      sys	0m2.800s
      
      real	0m34.141s
      user	0m30.770s
      sys	0m2.828s
      
      avg : 34.16
      std. dev. : 0.32
      
      hackbench 50
      
      Time: 2.999
      Time: 2.994
      Time: 3.004
      Time: 2.991
      Time: 2.988
      
      avg : 3.00
      std. dev. : 0.01
      
      I did more runs (20 runs of each) to compare the nop case to the DS
      prefix case. Results in seconds. They actually does not seems to show a
      significant difference.
      
      NOP
      
      34.155
      33.955
      34.012
      35.299
      35.679
      34.141
      33.995
      35.016
      34.254
      33.957
      33.957
      34.008
      35.013
      34.494
      33.893
      34.295
      34.314
      34.854
      33.991
      34.132
      
      DS
      
      34.080
      34.304
      34.374
      35.095
      34.291
      34.135
      33.940
      34.208
      35.276
      34.288
      33.861
      33.898
      34.610
      34.709
      33.851
      34.256
      35.161
      34.283
      33.865
      35.078
      
      Used http://www.graphpad.com/quickcalcs/ttest1.cfm?Format=C to do the
      T-test (yeah, I'm lazy) :
      
       Group      Group One (DS prefix)       Group Two (nops)
       Mean                    34.37815               34.37070
       SD                       0.46108                0.51905
       SEM                      0.10310                0.11606
       N                             20                     20
      
      P value and statistical significance:
        The two-tailed P value equals 0.9620
        By conventional criteria, this difference is considered to be not statistically significant.
      
      Confidence interval:
        The mean of Group One minus Group Two equals 0.00745
        95% confidence interval of this difference: From -0.30682 to 0.32172
      
      Intermediate values used in calculations:
        t = 0.0480
        df = 38
        standard error of difference = 0.155
      
      So, unless these calculus are completely bogus, the difference between the nop
      and the DS case seems not to be statistically significant.
      Signed-off-by: default avatarMathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Acked-by: default avatarH. Peter Anvin <hpa@zytor.com>
      CC: Linus Torvalds <torvalds@linux-foundation.org>
      CC: Jeremy Fitzhardinge <jeremy@goop.org>
      CC: Roland McGrath <roland@redhat.com>
      CC: Ingo Molnar <mingo@elte.hu>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      CC: Steven Rostedt <srostedt@redhat.com>
      CC: Thomas Gleixner <tglx@linutronix.de>
      CC: Peter Zijlstra <peterz@infradead.org>
      CC: Andrew Morton <akpm@linux-foundation.org>
      CC: David Miller <davem@davemloft.net>
      CC: Ulrich Drepper <drepper@redhat.com>
      CC: Rusty Russell <rusty@rustcorp.com.au>
      CC: Gregory Haskins <ghaskins@novell.com>
      CC: Arnaldo Carvalho de Melo <acme@redhat.com>
      CC: "Luis Claudio R. Goncalves" <lclaudio@uudg.org>
      CC: Clark Williams <williams@redhat.com>
      CC: Christoph Lameter <cl@linux-foundation.org>
      CC: Andi Kleen <andi@firstfloor.org>
      CC: Harvey Harrison <harvey.harrison@gmail.com>
      Signed-off-by: default avatarH. Peter Anvin <hpa@zytor.com>
      f88f07e0
  2. 14 Aug, 2008 38 commits