    [PATCH] faster signal handling on x86 · 29f1caa9
    Zachary Amsden authored
    Optimize away the unconditional write to the debug registers on the
    signal delivery path.  This is already done on x86_64.
    
    We only need to write to dr7 if there is a breakpoint to re-enable, and
    MOVDR is a serializing instruction, which is expensive.  Getting rid of
    it makes the signal delivery path about 33% faster (at least on Xeon - I
    didn't test other CPUs, so your gain may vary).
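
    As an illustration only, the conditional write can be modeled in user
    space roughly as below.  This is a sketch, not the patched kernel code:
    struct thread_model, write_dr7_model() and restore_breakpoints() are
    hypothetical names, the write to dr7 is replaced by a counter, and
    unlikely() is defined here with the GCC/Clang __builtin_expect builtin.

```c
#include <assert.h>
#include <stddef.h>

/* Branch hint: tell the compiler the condition is rarely true (GCC/Clang). */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical stand-in for a thread's saved debug registers. */
struct thread_model {
    unsigned long debugreg[8];
};

/* Counts simulated dr7 writes; a real write needs a privileged MOV to DR7. */
static unsigned long dr7_writes;

static void write_dr7_model(unsigned long value)
{
    (void)value;
    dr7_writes++;
}

/*
 * Re-enable breakpoints on the way back to user space, but only when the
 * saved dr7 is non-zero, i.e. when there is a breakpoint to re-enable.
 */
static void restore_breakpoints(const struct thread_model *t)
{
    assert(t != NULL);
    if (unlikely(t->debugreg[7]))
        write_dr7_model(t->debugreg[7]);
}
```

    With a zeroed debugreg[7] the call performs no write at all, which is
    the common case for a task that is not being debugged.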
    
    [ Editor's note: it's likely only that slow on Netburst.  Serializing is
      not that expensive, but it is likely that writing to %db7 invalidates
      the trace cache, which explains why it's so slow on Xeon - it's not
      just the op itself, it has to re-populate the cache all the time.
    					--- Linus ]
    
    
    Measured delta TSC for three paths on a 2.4GHz Xeon.
    
    1) With unconditional write to dr7 :  800-1000 cycles
    2) With conditional write to dr7   :  84-112 cycles
    3) With unlikely write to dr7      :  84 cycles
    
    Performance test using divzero microbenchmark (3 million divide by zeros):
    
    With unconditional write:
       7.445 real / 6.136 system
       7.529 real / 6.482 system
       7.541 real / 5.974 system
       7.546 real / 6.217 system
       7.445 real / 6.167 system
    
    With unlikely write:
       5.779 real / 4.518 system
       5.783 real / 4.591 system
       5.552 real / 4.569 system
       5.790 real / 4.528 system
       5.554 real / 4.382 system
    
    That's about a 33% speedup - more than I expected; apparently getting rid
    of the serializing instruction makes the do_signal path much faster.
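
    For reference, the speedup can be recomputed from the ten runs listed
    above.  The helper names mean() and speedup() are made up for this
    sketch; speedup is taken as the ratio of mean times (old / new), which
    is one reasonable reading of "33% speedup".

```c
#include <assert.h>
#include <stddef.h>

/* Timings copied from the runs above (unconditional vs. unlikely write). */
static const double real_old[] = { 7.445, 7.529, 7.541, 7.546, 7.445 };
static const double real_new[] = { 5.779, 5.783, 5.552, 5.790, 5.554 };
static const double sys_old[]  = { 6.136, 6.482, 5.974, 6.217, 6.167 };
static const double sys_new[]  = { 4.518, 4.591, 4.569, 4.528, 4.382 };

/* Arithmetic mean of n samples. */
static double mean(const double *v, size_t n)
{
    double sum = 0.0;
    assert(n > 0);
    for (size_t i = 0; i < n; i++)
        sum += v[i];
    return sum / (double)n;
}

/* Ratio of mean times: 1.33 would mean the new path is 33% faster. */
static double speedup(const double *before, const double *after, size_t n)
{
    return mean(before, n) / mean(after, n);
}
```

    Calling speedup(real_old, real_new, 5) gives a ratio of roughly 1.32
    and speedup(sys_old, sys_new, 5) roughly 1.37, consistent with the
    "about 33%" figure.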
    
    Zachary Amsden (zach@vmware.com)
signal.c 16.4 KB