1. 05 Jan, 2018 3 commits
    • benchstat: skip empty rows when just dumping data · 2e294072
      Kirill Smelkov authored
      When comparing 2 files, benchstat skips rows that are not present in
      both benchmarks. However, when benchstat'ing just 1 file, or 3 or
      more, benchstat currently dumps every row for every unit, even if the
      row's results for that unit are completely empty.

      When several units are in use, this produces a lot of noisy empty
      benchmark lines in the output. For example, for the following benchmark
      
      ```
      BenchmarkTwoHourMarathon 1 7200000000000 ns/op 14400000000000 user-ns/op 5 ns/GC 12 quick-bytes
      BenchmarkTwoHourMarathon 1 7200000000000 ns/op 14500000000000 user-ns/op 5 ns/GC 16 quick-bytes
      BenchmarkTwoHourMarathon 1 7200000000000 ns/op 14600000000000 user-ns/op 5 ns/GC 12 quick-bytes
      BenchmarkTwoHourMarathon 1 7200000000000 ns/op 14200000000000 user-ns/op 5 ns/GC 16 quick-bytes
      BenchmarkTwoHourMarathon 1 7200000000000 ns/op 14300000000000 user-ns/op 5 ns/GC 12 quick-bytes
      
      BenchmarkTimens 1 31 ns/op 21 user-ns/op
      BenchmarkTimeμs 1 32 μs/op 22 user-μs/op	# mu = U+03bc
      BenchmarkTimeµs 1 33 µs/op 23 user-µs/op	# mu = U+b5
      BenchmarkTimeus 1 34 us/op 24 user-us/op
      BenchmarkTimems 1 35 ms/op 25 user-ms/op
      BenchmarkTimes  1 36  s/op 26 user-s/op
      
      BenchmarkTimeObjectns 1 41 ns/object
      BenchmarkTimeObjectµs 1 42 µs/object
      BenchmarkTimeObjectus 1 43 us/object
      BenchmarkTimeObjectms 1 44 ms/object
      BenchmarkTimeObjects  1 45 s/object
      ```
      
      current output is
      
      ```
      name             time/op
      TwoHourMarathon    7200s ± 0%
      Timens            31.0ns ± 0%
      Timeμs            32.0µs ± 0%
      Timeµs            33.0µs ± 0%
      Timeus            34.0µs ± 0%
      Timems            35.0ms ± 0%
      Times              36.0s ± 0%
      TimeObjectns
      TimeObjectµs
      TimeObjectus
      TimeObjectms
      TimeObjects
      
      name             user-time/op
      TwoHourMarathon   14400s ± 1%
      Timens            21.0ns ± 0%
      Timeμs            22.0µs ± 0%
      Timeµs            23.0µs ± 0%
      Timeus            24.0µs ± 0%
      Timems            25.0ms ± 0%
      Times              26.0s ± 0%
      TimeObjectns
      TimeObjectµs
      TimeObjectus
      TimeObjectms
      TimeObjects
      
      name             time/GC
      TwoHourMarathon   5.00ns ± 0%
      Timens
      Timeμs
      Timeµs
      Timeus
      Timems
      Times
      TimeObjectns
      TimeObjectµs
      TimeObjectus
      TimeObjectms
      TimeObjects
      
      name             quick-bytes
      TwoHourMarathon    13.6B ±18%
      Timens
      Timeμs
      Timeµs
      Timeus
      Timems
      Times
      TimeObjectns
      TimeObjectµs
      TimeObjectus
      TimeObjectms
      TimeObjects
      
      name             time/object
      TwoHourMarathon
      Timens
      Timeμs
      Timeµs
      Timeus
      Timems
      Times
      TimeObjectns      41.0ns ± 0%
      TimeObjectµs      42.0µs ± 0%
      TimeObjectus      43.0µs ± 0%
      TimeObjectms      44.0ms ± 0%
      TimeObjects        45.0s ± 0%
      ```
      
      and it becomes
      
      ```
      name             time/op
      TwoHourMarathon    7200s ± 0%
      Timens            31.0ns ± 0%
      Timeμs            32.0µs ± 0%
      Timeµs            33.0µs ± 0%
      Timeus            34.0µs ± 0%
      Timems            35.0ms ± 0%
      Times              36.0s ± 0%
      
      name             user-time/op
      TwoHourMarathon   14400s ± 1%
      Timens            21.0ns ± 0%
      Timeμs            22.0µs ± 0%
      Timeµs            23.0µs ± 0%
      Timeus            24.0µs ± 0%
      Timems            25.0ms ± 0%
      Times              26.0s ± 0%
      
      name             time/GC
      TwoHourMarathon   5.00ns ± 0%
      
      name             quick-bytes
      TwoHourMarathon    13.6B ±18%
      
      name             time/object
      TimeObjectns      41.0ns ± 0%
      TimeObjectµs      42.0µs ± 0%
      TimeObjectus      43.0µs ± 0%
      TimeObjectms      44.0ms ± 0%
      TimeObjects        45.0s ± 0%
      ```
      
      after this patch.
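
      The skip logic can be sketched roughly as follows. This is a minimal,
      hypothetical model - the `row` type, `nonEmpty` and `filterRows` names
      are illustrative, not benchstat's actual code: a row is dropped when it
      carries no measurement at all for the unit currently being printed.

      ```go
      package main

      import "fmt"

      // row is a hypothetical model of one line in a metric table:
      // a benchmark name plus one sample slice per input file.
      type row struct {
      	name   string
      	values [][]float64 // values[i] holds samples from file i
      }

      // nonEmpty reports whether the row has at least one measurement
      // for the unit being printed.
      func nonEmpty(r row) bool {
      	for _, v := range r.values {
      		if len(v) > 0 {
      			return true
      		}
      	}
      	return false
      }

      // filterRows drops rows whose results for the unit are completely
      // empty - what this patch does when dumping 1 file or 3 or more.
      func filterRows(rows []row) []row {
      	var out []row
      	for _, r := range rows {
      		if nonEmpty(r) {
      			out = append(out, r)
      		}
      	}
      	return out
      }

      func main() {
      	rows := []row{
      		{"TwoHourMarathon", [][]float64{{7200e9}}},
      		{"TimeObjectns", [][]float64{{}}}, // no data for this unit
      	}
      	for _, r := range filterRows(rows) {
      		fmt.Println(r.name) // only TwoHourMarathon survives
      	}
      }
      ```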
      
      Change-Id: Ic260e39913af7b2c3aa1ff850aed65002d553d6c
    • benchstat: detect unit is "time" not only for "ns/op" and "ns/GC" · 99f0dffb
      Kirill Smelkov authored
      For example in my testing I use "µs/object"
      
      	https://lab.nexedi.com/kirr/neo/blob/8bac3dba/t/time/time-soct17-z6001.txt#L392
      
      and in general anything with a time unit in the numerator represents
      just time.
      
      For the following benchmark data
      
      ```
      BenchmarkTimeObjectns 1 41 ns/object
      BenchmarkTimeObjectµs 1 42 µs/object
      BenchmarkTimeObjectus 1 43 us/object
      BenchmarkTimeObjectms 1 44 ms/object
      BenchmarkTimeObjects  1 45 s/object
      ```
      
      output before patch:
      
      ```
      name             ns/object
      TimeObjectns        41.0 ± 0%
      TimeObjectµs       42.0k ± 0%
      TimeObjectus       43.0k ± 0%
      TimeObjectms       44.0M ± 0%
      TimeObjects        45.0G ± 0%
      ```
      
      output after patch:
      
      ```
      name             time/object
      TimeObjectns      41.0ns ± 0%
      TimeObjectµs      42.0µs ± 0%
      TimeObjectus      43.0µs ± 0%
      TimeObjectms      44.0ms ± 0%
      TimeObjects        45.0s ± 0%
      ```
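
      The detection can be sketched like this - a hypothetical `isTimeUnit`
      helper, assuming the unit's numerator (the part before "/") decides
      whether the measurement is a time:

      ```go
      package main

      import (
      	"fmt"
      	"strings"
      )

      // timeUnits lists recognized time numerators, including both Unicode
      // micro signs (µ U+00B5 and μ U+03BC) and the ASCII "us" spelling.
      var timeUnits = map[string]bool{
      	"ns": true, "us": true, "µs": true, "μs": true,
      	"ms": true, "s": true,
      }

      // isTimeUnit reports whether a benchmark unit such as "µs/object"
      // measures time, by checking whether its numerator is a time unit.
      func isTimeUnit(unit string) bool {
      	num := unit
      	if i := strings.Index(unit, "/"); i >= 0 {
      		num = unit[:i]
      	}
      	return timeUnits[num]
      }

      func main() {
      	for _, u := range []string{"ns/op", "µs/object", "s/object", "B/op"} {
      		fmt.Printf("%-10s time? %v\n", u, isTimeUnit(u))
      	}
      }
      ```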
      
      Change-Id: I55f2331334828cf6632f675344d98846e7cdaf41
    • benchstat: normalize time units to ns · 59f3a5d3
      Kirill Smelkov authored
      A bit of context: as part of benchmarking my server, I first benchmark
      the underlying disk and the network RTT latency around the serving
      machine, to get an idea of the machine's and network's capabilities.
      
      Those benchmarks are done with standard tools like ioping, ping and
      lat_tcp (from lmbench), and their output is then amended with lines in
      the Go benchmark format via simple sed scripts:
      
      https://lab.nexedi.com/kirr/neo/blob/8bac3dba/go/neo/t/neotest#L857
      https://lab.nexedi.com/kirr/neo/blob/8bac3dba/go/neo/t/neotest#L1023
      https://lab.nexedi.com/kirr/neo/blob/8bac3dba/go/neo/t/neotest#L1063
      
      For example original ioping output
      
      	--- . (ext4 /dev/sda1) ioping statistics ---
      	18.0 k requests completed in 2.97 s, 70.2 MiB read, 6.06 k iops, 23.7 MiB/s
      	generated 18.0 k requests in 3.00 s, 70.2 MiB, 5.99 k iops, 23.4 MiB/s
      	min/avg/max/mdev = 123.2 us / 165.0 us / 225.8 us / 35.1 us
      
      is amended this way:
      
      	--- . (ext4 /dev/sda1) ioping statistics ---
      	18.0 k requests completed in 2.97 s, 70.2 MiB read, 6.06 k iops, 23.7 MiB/s
      	generated 18.0 k requests in 3.00 s, 70.2 MiB, 5.99 k iops, 23.4 MiB/s
      	min/avg/max/mdev = 123.2 us / 165.0 us / 225.8 us / 35.1 us
      	Benchmarkz6001/disk/randread/direct/4K-min 1 123.2 us/op
      	Benchmarkz6001/disk/randread/direct/4K-avg 1 165.0 us/op
      
      ( https://lab.nexedi.com/kirr/neo/blob/8bac3dba/t/time/time-soct17-z6001-z600.txt#L105 )
      
      Emitting lines in the standard Go benchmark format makes it possible to
      analyze the whole benchmark run - software + environment - with tools
      like benchstat.
      
      However there is a problem: all those tools use different units. For
      example ping uses "ms":
      
      https://lab.nexedi.com/kirr/neo/blob/8bac3dba/t/time/time-soct17-z6001-z600.txt#L414
      
      while ioping chooses the unit dynamically depending on the value - "ns"
      for values under 1 microsecond, "us" for values under 1 millisecond,
      "ms" for values under 1 second, and so on - see e.g. here:
      
      https://lab.nexedi.com/kirr/neo/blob/8bac3dba/t/time/time-soct17-z6001-z600.txt#L262
      
      And without unit normalization, benchstat treats all those "ns/op",
      "us/op" and "ms/op" as different units and dumps the corresponding
      measurements in separate groups - for example, even different ioping
      runs can land in different sections:
      
      	name                                               µs/op
      	z6001/sha1/py/1024B                                2.27 ± 1%
      	z6001/sha1/go/1024B                                2.35 ± 0%
      	z6001/sha1/py/4096B                                7.83 ± 0%
      	z6001/sha1/go/4096B                                9.37 ± 0%
      	z600/sha1/py/1024B                                 2.25 ± 1%
      	z600/sha1/go/1024B                                 2.36 ± 0%
      	z600/sha1/py/4096B                                 7.82 ± 0%
      	z600/sha1/go/4096B                                 9.38 ± 0%
      	z6001-z600/tcprtt(c-c)/1B                          50.0 ± 0%
      	z6001-z600/tcprtt(c-go)/1B                         56.4 ± 3%
      	z600-z6001/tcprtt(c-c)/1B                          50.0 ± 0%
      	z600-z6001/tcprtt(c-go)/1B                         56.0 ± 3%
      	z6001-z600/tcprtt(c-c)/1400B                        130 ± 0%
      	z6001-z600/tcprtt(c-go)/1400B                       135 ± 2%
      	z600-z6001/tcprtt(c-c)/1400B                        130 ± 0%
      	z600-z6001/tcprtt(c-go)/1400B                       135 ± 1%
      	z6001-z600/tcprtt(c-c)/1500B                        145 ± 0%
      	z6001-z600/tcprtt(c-go)/1500B                       156 ±10%
      	z600-z6001/tcprtt(c-c)/1500B                        145 ± 0%
      	z600-z6001/tcprtt(c-go)/1500B                       150 ± 0%
      	z6001-z600/tcprtt(c-c)/4096B                        185 ± 0%
      	z6001-z600/tcprtt(c-go)/4096B                       194 ± 6%
      	z600-z6001/tcprtt(c-c)/4096B                        185 ± 0%
      	z600-z6001/tcprtt(c-go)/4096B                       192 ± 1%
      
      	name                                               us/op
      	z6001/disk/randread/direct/4K-min                   123 ± 1%
      	z6001/disk/randread/direct/4K-avg                   164 ± 1%
      	z6001/disk/randread/pagecache/4K-avg               1.22 ± 1%
      
      	name                                               time/op
      	z6001/disk/randread/pagecache/4K-min              411ns ± 1%
      	z6001/disk/randread/pagecache/4K-avg              949ns ± 0%
      
      	name                                               ms/op
      	z6001-z600/pingrtt/16B-min                         0.03 ± 0%
      	z6001-z600/pingrtt/16B-avg                         0.04 ± 0%
      	z600-z6001/pingrtt/16B-min                         0.03 ± 0%
      	z600-z6001/pingrtt/16B-avg                         0.04 ± 0%
      	z6001-z600/pingrtt/1452B-min                       0.11 ± 0%
      	z6001-z600/pingrtt/1452B-avg                       0.12 ± 0%
      	z600-z6001/pingrtt/1452B-min                       0.11 ± 0%
      	z600-z6001/pingrtt/1452B-avg                       0.12 ± 0%
      
      which makes the output harder to understand; moreover, in diff mode
      benchstat can hide a line if, e.g., a value was close to 1 µs and on
      the second run ioping decided to use a different unit for it.
      
      It would be possible to normalize time units to "ns/op" in every
      converter. However, instead of repeating the same and similar
      conversions in each converter, we can implement this logic in only one
      place - benchstat, where the benchmark data is consumed - and teach
      only it to understand the various units. This way benchstat
      automatically understands all the timing units used in practice in the
      wild - ns, us, μs, ms, etc. - and producers no longer need to do unit
      conversions themselves.
      
      Another motivation for teaching benchstat centrally about various units
      is that raw benchmark output is frequently read by humans (at least I
      sometimes do this, in addition to looking at the statistics), and for
      humans it is easier to read time in units properly selected for a
      particular test - e.g. microseconds or milliseconds - while always
      emitting time in nanoseconds makes the data harder for a person to
      parse.
      
      This patch teaches benchstat to recognize the various timing units and
      normalize them all internally to nanoseconds.
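
      A minimal sketch of such normalization, with `normalizeTime` as a
      hypothetical helper name (the real patch does this inside benchstat's
      parsing), assuming the numerator of the unit decides the scale:

      ```go
      package main

      import (
      	"fmt"
      	"strings"
      )

      // nsPerUnit maps recognized time numerators to their value in
      // nanoseconds, covering both micro-sign spellings and ASCII "us".
      var nsPerUnit = map[string]float64{
      	"ns": 1, "us": 1e3, "µs": 1e3, "μs": 1e3,
      	"ms": 1e6, "s": 1e9,
      }

      // normalizeTime rewrites a value with a time unit like "us/op" or
      // "ms/op" into nanoseconds with a "ns/..." unit, so that all timing
      // measurements land in the same group. Non-time units pass through.
      func normalizeTime(value float64, unit string) (float64, string) {
      	num, denom := unit, ""
      	if i := strings.Index(unit, "/"); i >= 0 {
      		num, denom = unit[:i], unit[i:]
      	}
      	if scale, ok := nsPerUnit[num]; ok {
      		return value * scale, "ns" + denom
      	}
      	return value, unit // not a time unit: leave unchanged
      }

      func main() {
      	v, u := normalizeTime(165.0, "us/op")
      	fmt.Printf("%v %s\n", v, u) // prints: 165000 ns/op
      }
      ```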
      
      Change-Id: I1941af58e73df6737b6488d5429d0bf87a81c1ee
  7. 27 May, 2017 2 commits
    • benchstat: don't print sign for 0% when there is no change · ae5bfa61
      Josh Bleecher Snyder authored
      When all samples are identical, instead of printing
      +0.00%, print just 0.00%.
      
      This is not just more accurate.
      When benchmarking the compiler, one item of interest
      is the impact on object file and executable sizes.
      These can change by tiny amounts, below two significant figures.
      However, the existence of any change is sometimes a cue
      to the reader to investigate further.
      This obviously should not be relied on,
      but it is nevertheless a useful signal.
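
      The formatting change can be sketched like this, with `formatDelta` as
      a hypothetical helper (benchstat's actual formatting code differs):

      ```go
      package main

      import "fmt"

      // formatDelta renders a percentage change, printing a leading "+"
      // only when the change is nonzero - identical samples yield "0.00%"
      // rather than "+0.00%".
      func formatDelta(pct float64) string {
      	if pct == 0 {
      		return fmt.Sprintf("%.2f%%", pct)
      	}
      	return fmt.Sprintf("%+.2f%%", pct) // "+" flag adds the sign
      }

      func main() {
      	fmt.Println(formatDelta(0))      // prints: 0.00%
      	fmt.Println(formatDelta(1.5))    // prints: +1.50%
      	fmt.Println(formatDelta(-11.49)) // prints: -11.49%
      }
      ```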
      
      Change-Id: I1e5157a52fc47c6871ca87b0e9cff223f4fd43d9
      Reviewed-on: https://go-review.googlesource.com/44190
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • benchstat: handle user-ns/op units · 0da03b58
      Josh Bleecher Snyder authored
      compilebench emits both ns/op and user-ns/op measurements.
      This CL improves benchstat's handling of such measurements.
      Migrated from https://github.com/rsc/benchstat/pull/5.
      
      Sample output, before:
      
      name       old user-ns/op  new user-ns/op  delta
      Template        279M ± 7%       273M ± 4%     ~     (p=0.329 n=6+5)
      Unicode         148M ±11%       131M ± 7%  -11.49%  (p=0.030 n=6+5)
      GoTypes         830M ± 2%       835M ± 3%     ~     (p=0.792 n=6+5)
      SSA            9.60G ± 3%      9.64G ± 2%     ~     (p=0.792 n=6+5)
      Flate           158M ± 3%       155M ± 1%     ~     (p=0.329 n=6+5)
      GoParser        204M ± 6%       202M ± 8%     ~     (p=0.792 n=6+5)
      Reflect         511M ± 9%       531M ± 6%     ~     (p=0.177 n=6+5)
      Tar             151M ± 3%       151M ± 5%     ~     (p=0.931 n=6+5)
      XML             271M ± 3%       268M ± 3%     ~     (p=0.429 n=6+5)
      
      Sample output, after:
      
      name       old user-time/op  new user-time/op  delta
      Template         279ms ± 7%        273ms ± 4%     ~     (p=0.329 n=6+5)
      Unicode          148ms ±11%        131ms ± 7%  -11.49%  (p=0.030 n=6+5)
      GoTypes          830ms ± 2%        835ms ± 3%     ~     (p=0.792 n=6+5)
      SSA              9.60s ± 3%        9.64s ± 2%     ~     (p=0.792 n=6+5)
      Flate            158ms ± 3%        155ms ± 1%     ~     (p=0.329 n=6+5)
      GoParser         204ms ± 6%        202ms ± 8%     ~     (p=0.792 n=6+5)
      Reflect          511ms ± 9%        531ms ± 6%     ~     (p=0.177 n=6+5)
      Tar              151ms ± 3%        151ms ± 5%     ~     (p=0.931 n=6+5)
      XML              271ms ± 3%        268ms ± 3%     ~     (p=0.429 n=6+5)
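
      A rough sketch of the unit-to-metric mapping, with `metricOf` as a
      hypothetical helper name: any "<prefix>-ns/op" unit, such as
      compilebench's "user-ns/op", is treated as a time measurement alongside
      plain "ns/op".

      ```go
      package main

      import (
      	"fmt"
      	"strings"
      )

      // metricOf maps a benchmark unit to a human-readable metric label,
      // recognizing prefixed nanosecond units like "user-ns/op" as times.
      func metricOf(unit string) string {
      	if unit == "ns/op" {
      		return "time/op"
      	}
      	if strings.HasSuffix(unit, "-ns/op") {
      		return strings.TrimSuffix(unit, "-ns/op") + "-time/op"
      	}
      	return unit // other units are left as-is
      }

      func main() {
      	fmt.Println(metricOf("user-ns/op")) // prints: user-time/op
      	fmt.Println(metricOf("ns/op"))      // prints: time/op
      	fmt.Println(metricOf("B/op"))       // prints: B/op
      }
      ```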
      
      Change-Id: I716b4e89a28adc72e9135b8c580434f748a9c0a9
      Reviewed-on: https://go-review.googlesource.com/39794
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Quentin Smith <quentin@golang.org>