Recently my friend and benchee co-maintainer Tobi had an idea to use benchee to run benchmarks with random data. This is an interesting idea, since you can get what is probably a more accurate picture of how your function behaves with real-world weirdness. It’s essentially the idea of using different inputs for a function, but turned up to 11.
One thing we might do with this, potentially finding performance edge cases in your code, will require some updates to Benchee (and won’t come until we ship 1.0), but there is another thing you can do right now if you want.
Let’s say that we want to benchmark integer addition. We can think of a few different combinations of integers to add, and use them as inputs, like this:
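A setup for that might look something like the following sketch. The specific input pairs here are my own illustration, not measured values from the original benchmark:

```elixir
Benchee.run(
  %{
    # The benchmarking function receives the current input as its argument
    "Integer addition" => fn {x, y} -> x + y end
  },
  inputs: %{
    "Small" => {1, 2},
    "Medium" => {1_000, 2_000},
    "Large" => {1_000_000_000, 2_000_000_000}
  }
)
```

With the `inputs` option, benchee runs the full benchmark once per named input and reports the results for each one separately.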
And we might see results like this:
OK, so we know that adding two integers is a VERY fast operation, and it appears that adding two large integers is for some reason a little faster than adding two small integers (N.B. because of the super high deviation, I wouldn’t put too much stock in this. It could be true, but it might also be other things messing with the results).
But we’ve used a pretty limited set of inputs here. I think we could get REALLY interesting results if we started using randomly generated data in our benchmarks. To do this, we can use the stream_data library, which can generate random data for us. We’ll just use any random integer, since all integers can be added together.
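stream_data generators are lazy, effectively infinite enumerables of random values, so pulling a couple of integers off one is just a matter of `Enum.take/2`:

```elixir
# StreamData.integer/0 returns a generator of random integers;
# taking from it produces fresh random values each time
StreamData.integer() |> Enum.take(2)
```

Each call yields a new random pair, which is exactly what we want for feeding the benchmark.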
Then there’s one important thing we need to do to make sure our benchmark is accurate, and that’s to make sure we’re pulling values off the stream outside of our actual benchmark function. To do this, we can use the before_each hook that benchee provides, which allows you to execute a function before each benchmark is run; the result of that function gets passed to the function being benchmarked, and the time spent in the before_each function is not counted towards the measured runtime.
So, then we can set our benchmark up like this:
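A minimal sketch of that setup, generating a fresh random pair in the before_each hook so the generation time stays out of the measurement:

```elixir
Benchee.run(
  %{
    # Receives the list returned by before_each as its argument
    "Integer addition" => fn [x, y] -> x + y end
  },
  # Runs before every measured invocation; its return value is
  # passed to the benchmarking function, and its runtime is not
  # counted towards the measured time
  before_each: fn _input -> StreamData.integer() |> Enum.take(2) end
)
```

Because before_each runs before every single measured invocation, each measurement sees a different random pair of integers.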
And now we can see that our results are pretty darn similar to the ones that we had when we specifically gave 3 inputs. The deviation is still super high for such a fast operation, but we can clearly see that the average, median and 99th percentile are all significantly higher than our previous results:
So, which pairs of numbers were the fastest and slowest? That we’ll have to answer at a later time once we add that feature to benchee, but for now we can use this technique to get a better picture of a function’s real-world performance. Of course you need to make sure the random data that you’re generating is representative of the real-world usage of your function, but for many functions this might end up being a more accurate way of getting a picture of your function’s performance!