Assume we’re frequently sampling a particular value and want to keep statistics on the

Question

Asked: May 11, 2026 (2026-05-11T14:23:08+00:00)

Assume we’re frequently sampling a particular value and want to keep statistics on the samples. The simplest approach is to store every sample so we can calculate whatever stats we want, but this requires unbounded storage. Using a constant amount of storage, we can keep track of some stats like minimum and maximum values. What else can we track using only constant storage? I am thinking of percentiles, standard deviation, and any other useful statistics.

That’s the theoretical question. In my actual situation, the samples are simply millisecond timings: profiling information for a long-running application. There will be millions of samples but not much more than a billion or so. So what stats can be kept for the samples using no more than, say, 10 variables?

1 Answer

score 0 · Answer 1 · 2026-05-11T14:23:08+00:00

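Minimum, maximum, average, total count, and variance are all easy and useful. That’s five values. In practice you store the running sum rather than the average, and divide the sum by the count whenever you need the average. Likewise, you store a running sum of squares, from which the variance can be derived.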

So, in your loop:

    maxVal = max(x, maxVal);
    minVal = min(x, minVal);
    count += 1;
    sum += x;
    secondorder += x*x;

Later, you can report any of these stats. The mean and standard deviation can be computed at any time (once count is at least 1):

    mean = sum/count;
    std = sqrt(secondorder/count - mean*mean);
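One caveat with the sum-of-squares formula: when the mean is large compared to the spread, subtracting two nearly equal numbers can lose precision, and the result can even come out slightly negative. A common alternative is Welford's online update, which is a different technique from the one above but uses the same amount of storage. The following is a minimal Python sketch of it; the class and method names are made up for illustration, and it computes the population variance so that it matches the formula above.

```python
import math


class RunningStats:
    """Constant-storage min, max, mean, and variance via Welford's update.

    Illustrative sketch: names are hypothetical, not from the original answer.
    """

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean
        self.min_val = math.inf
        self.max_val = -math.inf

    def add(self, x):
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        # Uses the deviation from the old mean times the deviation from the new mean.
        self.m2 += delta * (x - self.mean)
        self.min_val = min(self.min_val, x)
        self.max_val = max(self.max_val, x)

    def variance(self):
        # Population variance (divide by count), as in secondorder/count - mean*mean.
        return self.m2 / self.count if self.count else 0.0

    def std(self):
        return math.sqrt(self.variance())
```

Usage would look like calling stats.add(elapsed_ms) once per timing and reading stats.mean and stats.std() whenever a report is needed. That is five stored values in total, well inside a 10-variable budget.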

Median and percentile estimation are more difficult, but possible. The usual trick is to set up a fixed set of histogram bins and increment a bin's count whenever a sample falls inside it. You can then estimate the median and other percentiles from the distribution of those bin counts. This is only an approximation to the true distribution, but it is often good enough. To find the exact median, you must store all samples.
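The histogram idea can be sketched roughly as follows in Python. This is an illustration under assumptions, not a definitive implementation: the class name, the equal-width bins, the clamping of out-of-range samples into the edge bins, and the linear interpolation inside a bin are all choices made here. The range (lo, hi) has to be picked in advance, and every bin counts against the storage budget, so a handful of bins gives only a coarse estimate.

```python
class FixedHistogram:
    """Fixed-size histogram for approximate percentiles (hypothetical sketch)."""

    def __init__(self, lo, hi, bins):
        self.lo = lo
        self.hi = hi
        self.width = (hi - lo) / bins
        self.counts = [0] * bins
        self.total = 0

    def add(self, x):
        i = int((x - self.lo) / self.width)
        # Assumption: samples outside [lo, hi) are clamped into the edge bins.
        i = min(max(i, 0), len(self.counts) - 1)
        self.counts[i] += 1
        self.total += 1

    def percentile(self, p):
        """Estimate the p-th percentile (0..100), interpolating within a bin."""
        if self.total == 0:
            raise ValueError("no samples recorded")
        target = self.total * p / 100.0
        seen = 0
        for i, c in enumerate(self.counts):
            if c and seen + c >= target:
                # Assume samples are spread evenly across the bin.
                frac = (target - seen) / c
                return self.lo + (i + frac) * self.width
            seen += c
        return self.hi
```

For millisecond timings, something like FixedHistogram(0, 1000, 8) would track eight bin counts; percentile(50) then gives an estimate of the median whose error is bounded by the bin width.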
