Updated article
parent 8b556dcca6
commit 037b2bc42e
@@ -11,8 +11,8 @@ This depends on the power and configuration of our machine and doesn't convey an
 
 What we really care about is the relative relationships between different algorithms.
 
-By itself, a number like 2 nanoseconds is non-significant and not useful.
-However, once we add another competing algorithm to the mix, it becomes interesting.
+By itself, a number like 2 nanoseconds is not very helpful.
+However, once we add another competing algorithm to the mix, we can compare it.
 When algorithm A takes 2 ns to compute and algorithm B takes 4 ns to compute,
 we can see a relationship between A and B and that is the fact that A is twice as fast as B.
 
@@ -24,7 +24,8 @@ The goal of a realistic benchmark is not to reproduce the timing of 2 ns and 4 n
 
 The goal of a realistic benchmark is to approximate the `A = 0.5 * B` relationship as well as possible.
 
-Sometimes, a different machine will lead to not only a change in absolute numbers, but also a change in relative relations between the algorithms.
+Sometimes, a different machine will lead to not only a change in absolute numbers,
+but also a change in relative relations between the algorithms.
 With this realization we can no longer view the `0.5` relation between A and B as a fixed number,
 instead it is machine dependent and we should generalize it as a scalar `s`.
 
@@ -46,9 +47,27 @@ This means that in the case of a web server, we want as much load as possible fr
 because more samples will bring us closer to the real relative performance relationships of the algorithms tested.
 The less stress we put on the server, the worse our results become.
 
-Some people have the misconception that a realistic benchmark should produce realistic absolute numbers.
-Realism in absolute numbers, especially when it comes at the cost of a worse approximation of `s`, is not useful to anyone.
+## The woods
 
-As a benchmark developer, please do not confuse these 2 types of realism.
-Realistic absolute numbers are just some volatile random numbers on somebody's machine.
-Realistic relative relations in the form of a good approximation of `s` are much more stable across different machines and this is what people truly care about.
+Just imagine you're out in the woods with a crossbow.
+Now a bear jumps out of nowhere and you are fighting for your life.
+You bought that crossbow based on a benchmark published by an outlet that focuses on crossbows for beginners.
+Because it's focused on beginners, they have a hard-capped reloading speed of 1 bolt per 5 seconds.
+Even though there are crossbows that can reload in 3 seconds, the benchmark would not reflect this,
+because they are not targeted at experts and assume that everybody is clumsy anyway.
+
+The benchmark showed two crossbows, A and B, but it concluded that both let you reload 1 bolt every 5 seconds,
+even though B can actually be reloaded in 3 seconds. But that result was not published.
+So you ended up buying crossbow A, leading to your death against the bear.
+The inaccuracy of the benchmark led to your downfall.
+
+In a life or death situation, you want the fastest crossbow out there.
+You want to see the differences in reload speed when they're performed at the highest level.
+Because when shit hits the fan, you want the sharpest tool in the shed.
+
+## Conclusion
+
+As a public benchmark developer, please do not confuse these two types of realism.
+Getting more "realistic" numbers by throttling your benchmark just ends up hurting people because you focused on absolute numbers.
+A realistic approximation of `s` performed under maximum load is a much more stable
+result across different machines and this is what people truly care about.
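To make the `A = s * B` relation from the article concrete, here is a minimal Python sketch, not taken from the commit itself. The function names (`relative_cost`, `capped`) and all timing values are hypothetical and chosen only to mirror the article's 2 ns / 4 ns and 3 s / 5 s examples. It estimates `s` from raw samples and shows how capping the measurements can erase the difference between two candidates.

```python
import statistics


def relative_cost(samples_a, samples_b):
    """Estimate s in A = s * B from two lists of timing samples."""
    return statistics.median(samples_a) / statistics.median(samples_b)


def capped(samples, floor):
    """Model a throttled benchmark that never reports less than `floor`."""
    return [max(t, floor) for t in samples]


# Hypothetical timings in nanoseconds on two different machines.
machine_1 = {"A": [2.0, 2.1, 1.9], "B": [4.0, 4.2, 3.8]}
machine_2 = {"A": [1.0, 1.1, 0.9], "B": [2.0, 2.1, 1.9]}

# The absolute numbers differ per machine, the relation s does not (here).
s_1 = relative_cost(machine_1["A"], machine_1["B"])
s_2 = relative_cost(machine_2["A"], machine_2["B"])
print("s on machine 1:", s_1)
print("s on machine 2:", s_2)

# Hypothetical crossbow reload times in seconds: B is faster than A.
reload_a = [5.0, 5.0, 5.0]
reload_b = [3.0, 3.0, 3.0]

# Measured at full speed, the relation between B and A is visible.
s_open = relative_cost(reload_b, reload_a)

# With a 5 second cap, both crossbows look identical.
s_capped = relative_cost(capped(reload_b, 5.0), capped(reload_a, 5.0))
print("s without cap:", s_open)
print("s with 5 s cap:", s_capped)
```

The median is used here only as one reasonable way to summarize noisy samples; the article does not prescribe a particular statistic.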