Wednesday, September 12

It happens occasionally that I need to validate an optimization by benchmarking, and the benchmark results are a bit noisy. Usually the situation does not ask for a heavy-duty statistical analysis, and I am too lazy for that anyway. However, sometimes having at least the fundamental statistics of a sample would be handy.

Obviously, calculating the standard deviation of a sample is, by itself, not a problem in the 21st century. There is LibreOffice, there is R, there are a bazillion statistical tools out there. They do feel a bit heavyweight for the task, though, and the friction involved in starting them up and entering the data in the right format means that often I just do not bother. Google brings up some dynamic webpages for the task, a much more lightweight solution, but most of these are ugly and ad-riddled.

What's a programmer to do? Code up another weekend app, of course. And it is the way I like it: there is a textarea where it is easy to paste and edit inline snippets of text, say, shell script output in full; non-numeric data are ignored; statistics are computed on the fly (no stupid "Submit" buttons); and you can save the snippet on the server and get a short link (useful for footnotes pointing to data sources).
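For illustration, here is a minimal Python sketch of the "paste anything, ignore the non-numeric bits" idea. It is not the code behind the app; the function names `extract_numbers` and `summarize` are hypothetical, and splitting on whitespace and commas is an assumption about what counts as a separate value.

```python
import math
import statistics


def extract_numbers(text):
    """Return every whitespace- or comma-separated token that parses as a finite float."""
    numbers = []
    for token in text.replace(",", " ").split():
        try:
            value = float(token)
        except ValueError:
            continue  # non-numeric tokens (labels, units, ...) are skipped
        if math.isfinite(value):  # drop "nan" and "inf", which float() accepts
            numbers.append(value)
    return numbers


def summarize(text):
    """Compute basic sample statistics, or None if the text holds no numbers."""
    data = extract_numbers(text)
    if not data:
        return None
    return {
        "n": len(data),
        "mean": statistics.fmean(data),
        # The sample standard deviation needs at least two data points.
        "stdev": statistics.stdev(data) if len(data) > 1 else None,
        "min": min(data),
        "max": max(data),
        "median": statistics.median(data),
    }


if __name__ == "__main__":
    pasted = "run A: 2.0 s\nrun B: 4.0 s\nrun C: 6.0 s"
    print(summarize(pasted))
```

Because the sketch works token by token, a bare number used as a label (as in "run 1") would be counted as data, so a real tool needs a smarter rule there.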

The service probably will not be expanded much further. After all, if those basic statistics are not enough, you are probably better off firing up a real statistics tool. However, I do plan to add another page for the most common task that I run into, namely, comparing the means of two populations (did my optimization have an effect?). In other words, calculating the p-value under the null hypothesis that the means of the two populations are the same. Stay tuned.
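To make the "did my optimization have an effect?" question concrete, here is one way such a p-value can be estimated with nothing but the Python standard library: a two-sided permutation test on the difference of the sample means. This is a swapped-in technique chosen for illustration, not necessarily what the planned page will use (a t-test is the more usual choice); the function name, the `rounds` and `seed` parameters, and the timings in the usage example are all made up.

```python
import random
import statistics


def permutation_p_value(a, b, rounds=10000, seed=0):
    """Estimate a two-sided p-value for 'both samples have the same mean'.

    The measurements are pooled and repeatedly reshuffled into two groups of
    the original sizes. The p-value is the share of reshuffles whose
    difference in means is at least as large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    observed = abs(statistics.fmean(a) - statistics.fmean(b))
    pooled = list(a) + list(b)
    split = len(a)
    hits = 0
    for _ in range(rounds):
        rng.shuffle(pooled)
        diff = abs(statistics.fmean(pooled[:split]) - statistics.fmean(pooled[split:]))
        if diff >= observed:
            hits += 1
    # The +1 terms count the observed arrangement itself, so p is never 0.
    return (hits + 1) / (rounds + 1)


if __name__ == "__main__":
    # Hypothetical benchmark timings in seconds, before and after a change.
    before = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 10.1, 9.7]
    after = [9.1, 9.3, 8.9, 9.2, 9.0, 9.4, 9.1, 8.8]
    print(permutation_p_value(before, after))
```

A small p-value says that a difference this large would rarely arise from reshuffling alone; it does not say how big the speedup is.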