Benchmarker is a class to assist with "micro-benchmarking". The goal is to discern how long it takes to run a snippet of code (function, lambda, etc). The code will be run in some number of trials, each consisting of many iterations, yielding statistics about the run time of the code.
The number of trials is user-selectable, with a reasonable default of 10 trials. The number of iterations per trial may be set explicitly, but by default it is computed automatically to a reasonable value based on how long the code takes to run. For most use cases, it's fire and forget.
Generally, the most and least expensive trials will be discarded (all sorts of things can happen to give you a few spurious results), and the remaining trials will be used to compute the average, standard deviation, range, and median value, in ns per iteration as well as millions of executions per second. The default behavior is simply to echo the relevant statistics to the console.
The basic use is illustrated by this example, in which we try to assess the difference in speed between acos() and fast_acos():
    Benchmarker bench;
    float val = 0.5f;
    clobber (val);   // Scrub compiler's knowledge of the value
    bench ("acos", [&](){ DoNotOptimize(std::acos(val)); });
    bench ("fast_acos", [&](){   // alternate indentation style
        DoNotOptimize(OIIO::fast_acos(val));
    });
Which produces output like this:

    acos      : 4.3 ns, 230.5 M/s (10x2097152, sdev=0.4ns rng=31.2%, med=4.6)
    fast_acos : 3.4 ns, 291.2 M/s (10x2097152, sdev=0.4ns rng=33.0%, med=3.4)
Some important details:
After declaring the Benchmarker, a number of options can be set: the number of trials to run, the iterations per trial (0 means automatic detection), verbosity, and whether (or how many) outliers to exclude. You can chain them together if you want: bench.iterations(10000).trials(10);
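As a concrete sketch, a run might be configured like this. The iterations() and trials() setters are the ones shown above; the exclude_outliers() and verbose() setters are assumptions used here for illustration, so consult benchmark.h for the exact option names and signatures:

    Benchmarker bench;
    bench.trials(10)             // run 10 trials
         .iterations(0)          // 0 = choose the iteration count automatically
         .exclude_outliers(1)    // assumed setter: drop the best and worst trials
         .verbose(1);            // assumed setter: control console output
    float val = 0.5f;
    clobber(val);                // scrub compiler's knowledge of the value
    bench("acos", [&](){ DoNotOptimize(std::acos(val)); });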
It can be VERY hard to get valid benchmarks without the compiler messing up your results. Some tips:
- Code that is too fast will not be reliable. Anything that appears to take less than 1 ns will print "unreliable" instead of full stats, on the assumption that it has probably been inadvertently optimized away.
- Use the DoNotOptimize() call on any final results computed by your benchmarked code, or else the compiler is likely to remove the code that leads to any values it thinks will never be used.
- Beware of the compiler constant-folding operations in your code: do not pass constants unless you want to benchmark its performance on known constants, and it is probably smart to pass all variables accessed by your benchmarked code to clobber() before running the benchmark, so that the compiler cannot assume their values (see the sketch after this list).
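To make the last two tips concrete, here is a minimal sketch of guarding a benchmark against constant folding and dead-code elimination. It relies on the clobber() and DoNotOptimize() helpers from the earlier example; the sqrt workload and the input value 2.0f are arbitrary choices for illustration:

    float x = 2.0f;
    clobber(x);          // the compiler can no longer treat x as the constant 2.0f
    bench("sqrt", [&](){
        // DoNotOptimize keeps the result "live" so the sqrt is not removed
        DoNotOptimize(std::sqrt(x));
    });

Without clobber(), the compiler may fold std::sqrt(2.0f) to a compile-time constant; without DoNotOptimize(), it may delete the call entirely because the result is never used.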
Definition at line 126 of file benchmark.h.