Improve benchmark usability #18

Open
opened 2025-12-28 18:11:55 +00:00 by sami · 1 comment
Owner

Originally created by @fyrchik on GitHub (Mar 2, 2021).

Often we need to compare different patches of neo-go, or even add a C#/mixed setup on top of that.
There are some problems:

  1. Running the benchmark on a local machine can interfere with other running programs, which affects the results.
  2. Also because of system load, multiple benchmark runs give more meaningful results. The mean/variation of block times/TPS within a single run is a different thing from the same metrics taken across multiple runs: the former relates to how the node processes transactions, while the latter helps draw more valid conclusions. On my machine, ±100 TPS (with a mean of 1600) across multiple runs happens constantly, and a 10% variation is something we should take into account.

Here is how we can make the benchmark more flexible:

  1. Allow providing revisions to compare, something like `REVISIONS=master,patch1,patch2 make start.GoFourNodes10wrk`. The exact interface is up for discussion. Note that C# and mixed setups should also be supported.
  2. Emit plots for the results from (1). We already have scripts for this; they can be extended.

I think doing all of this in a single command would also make results more reproducible and easier to share.
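The proposed `REVISIONS` interface could be driven by a small wrapper along these lines. This is only a sketch of the idea: the target name `start.GoFourNodes10wrk` comes from the issue, but the checkout/build step is commented out and purely illustrative, not the actual neo-bench tooling.

```shell
#!/bin/sh
# Sketch: iterate over a comma-separated REVISIONS list and run the
# same benchmark target for each revision. Defaults to "master".
REVISIONS="${REVISIONS:-master}"

OLD_IFS="$IFS"
IFS=','
for rev in $REVISIONS; do
    echo "benchmarking revision: $rev"
    # A real implementation would check out/build the revision and
    # invoke the existing Make target, e.g.:
    #   git -C ../neo-go checkout "$rev" && make start.GoFourNodes10wrk
    # collecting each run's results under a per-revision directory.
done
IFS="$OLD_IFS"
```

Invoked as `REVISIONS=master,patch1,patch2 ./bench-compare.sh`, it would produce one result set per revision, which the plotting scripts from (2) could then consume.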

Author
Owner

@roman-khimov commented on GitHub (Mar 2, 2021):

Batching and averaging out multiple runs can be useful, and plots can just take the best result for a particular combination.

Refs. #4, though.
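The aggregation suggested above (average across runs for reporting variance, best result for plots) can be sketched with plain `awk`. The TPS values below are placeholders, not real benchmark results.

```shell
#!/bin/sh
# Sketch: given per-run TPS numbers, compute the mean (for judging
# run-to-run variation) and the best result (for the plots).
tps_runs="1500 1600 1700"

# Mean of all fields on the line.
mean=$(echo "$tps_runs" | awk '{ s = 0; for (i = 1; i <= NF; i++) s += $i; printf "%d", s / NF }')

# Maximum field on the line.
best=$(echo "$tps_runs" | awk '{ m = $1; for (i = 2; i <= NF; i++) if ($i > m) m = $i; printf "%d", m }')

echo "mean TPS: $mean, best TPS: $best"
```

In a real pipeline the `tps_runs` line would be collected from the per-run result files rather than hardcoded.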


Reference
nspcc-dev/neo-bench#18