[PATCH] faster signal handling on x86
Optimize away the unconditional write to debug registers on signal delivery path. This is already done on x86_64.

We only need to write to dr7 if there is a breakpoint to re-enable, and MOVDR is a serializing instruction, which is expensive. Getting rid of it gets a 33% faster signal delivery path (at least on Xeon - I didn't test other CPUs, so your gain may vary).

[ Editors note: it's likely only that slow on Netburst. Serializing is not that expensive, but it is likely that writing to %db7 invalidates the trace cache, which explains why it's so slow on Xeon - it's not just the op itself, it has to re-populate the cache all the time. --- Linus ]

Measured delta TSC for three paths on a 2.4GHz Xeon.

 1) With unconditional write to dr7 : 800-1000 cycles
 2) With conditional write to dr7   : 84-112 cycles
 3) With unlikely write to dr7      : 84 cycles

Performance test using divzero microbenchmark (3 million divide by zeros):

With unconditional write:
 7.445 real / 6.136 system
 7.529 real / 6.482 system
 7.541 real / 5.974 system
 7.546 real / 6.217 system
 7.445 real / 6.167 system

With unlikely write:
 5.779 real / 4.518 system
 5.783 real / 4.591 system
 5.552 real / 4.569 system
 5.790 real / 4.528 system
 5.554 real / 4.382 system

That's about a 33% speedup - more than I expected; apparently getting rid of the serializing instruction makes the do_signal path much faster.

Zachary Amsden (zach@vmware.com)
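
For illustration only, the "unlikely write" variant measured above can be sketched in user-space C. This is not the patch itself: `struct thread_debug_state`, `write_dr7()`, and `restore_breakpoints()` are hypothetical names, the real dr7 write is replaced by a counter so the example can run unprivileged, and `unlikely()` is defined the way the kernel commonly defines it on GCC/Clang.

```c
#include <stdint.h>

/* Branch hint: tell the compiler the condition is rarely true (GCC/Clang). */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical stand-in for the per-thread saved debug registers. */
struct thread_debug_state {
    uint32_t debugreg[8];
};

/* Simulated dr7 and a counter of how often it was written. */
static uint32_t simulated_dr7;
static unsigned long dr7_writes;

/* Stand-in for the expensive register write; here it only records the call. */
static void write_dr7(uint32_t value)
{
    simulated_dr7 = value;
    dr7_writes++;
}

/*
 * Re-enable breakpoints only when the thread actually has some set.
 * A saved dr7 of zero means nothing to restore, so the write is skipped.
 */
static void restore_breakpoints(const struct thread_debug_state *t)
{
    if (unlikely(t->debugreg[7] != 0))
        write_dr7(t->debugreg[7]);
}
```

Calling `restore_breakpoints()` on a thread whose saved dr7 is zero leaves `dr7_writes` unchanged. A thread with a non-zero saved dr7 still gets its breakpoints restored, which is the behavior the conditional write is meant to preserve.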