From c2191f84ebfabf26f4ede963efbdf00b7857093f Mon Sep 17 00:00:00 2001
From: Eduard Urbach
Date: Mon, 24 Feb 2025 08:18:01 +0100
Subject: [PATCH] Added a new post

---
 posts/realistic-benchmarks.md | 53 +++++++++++++++++++++++++++++++++++
 public/app.html               |  1 +
 2 files changed, 54 insertions(+)
 create mode 100644 posts/realistic-benchmarks.md

diff --git a/posts/realistic-benchmarks.md b/posts/realistic-benchmarks.md
new file mode 100644
index 0000000..d824865
--- /dev/null
+++ b/posts/realistic-benchmarks.md
@@ -0,0 +1,53 @@
+---
+title: Realistic Benchmarks
+tags: article software benchmarks
+created: 2025-02-24T05:49:06Z
+published: true
+---
+
+When people talk about realistic software benchmarks, it's important to realize that absolute numbers do not matter to anyone.
+Nobody cares if our algorithm took 2 nanoseconds or 3 nanoseconds to compute.
+This depends on the power and configuration of our machine and doesn't convey any useful meaning.
+
+What we really care about is the relative relationships between different algorithms.
+
+By itself, a number like 2 nanoseconds is insignificant and not useful to anyone.
+However, once we add another competing algorithm to the mix, it becomes interesting.
+When algorithm A takes 2 ns to compute and algorithm B takes 4 ns to compute,
+we can see a relationship between A and B: A is twice as fast as B.
+
+```
+time(A) = 0.5 * time(B)
+```
+
+The goal of a realistic benchmark is not to reproduce the timing of 2 ns and 4 ns.
+
+The goal of a realistic benchmark is to approximate the `time(A) = 0.5 * time(B)` relationship as well as possible.
+
+Sometimes, a different machine changes not only the absolute numbers but also the relative relations between the algorithms.
+With this realization, we can no longer view the `0.5` relation between A and B as a fixed number.
+Instead, it is machine-dependent and we should generalize it as a scalar `s`.
+
+```
+time(A) = s * time(B)
+```
+
+Now we get a better definition of what a realistic benchmark is:
+
+> Two benchmarks can be compared in quality by looking at how well they approximate `s` on the same machine.
+
+But how do we actually get closer to the real value of `s`?
+
+One of the most significant factors is the sample size.
+In order to get a more realistic outcome, we need a lot of samples.
+Ideally we want an infinite number of test samples, because the more we have, the closer we get to the real value of `s`.
+
+This means that in the case of a web server, we want as much load as possible from our stress testing tool,
+because more samples will bring us closer to the real relative performance relationships of the algorithms tested.
+
+Some people have the misconception that a realistic benchmark should produce realistic absolute numbers.
+Realism in absolute numbers, especially when it comes at the cost of a worse approximation of `s`, is not useful to anyone.
+
+As a benchmark developer, please do not confuse these two types of realism.
+Realistic absolute numbers are just some volatile random numbers on somebody's machine.
+Realistic relative relations in the form of a good approximation of `s` are much more stable across different machines, and this is what people truly care about.
diff --git a/public/app.html b/public/app.html
index e09a964..11ac6d3 100644
--- a/public/app.html
+++ b/public/app.html
@@ -10,6 +10,7 @@