I am currently working on reimplementing some algorithm written in Java in Python. One


Asked: May 18, 2026 (2026-05-18T23:29:14+00:00)


I am reimplementing in Python an algorithm that was originally written in Java. One step is to calculate the standard deviation of a list of values. The original implementation uses DescriptiveStatistics.getStandardDeviation from the Apache Math 1.1 library for this; I use the standard deviation function (std) from numpy 1.5. The problem is that they give (very) different results for the same input. The sample I have is this:

[0.113967640255, 0.223095775796, 0.283134228235, 0.416793887842]

I get the following results:

numpy           : 0.10932134388775223
Apache Math 1.1 : 0.12620366805397404
Wolfram Alpha   : 0.12620366805397404

I checked with Wolfram Alpha to get a third opinion. I do not think that such a difference can be explained by precision alone. Does anyone have any idea why this is happening, and what I could do about it?

Edit: Calculating it manually in Python gives the same result as numpy:

>>> from math import sqrt
>>> v = [0.113967640255, 0.223095775796, 0.283134228235, 0.416793887842]
>>> mu = sum(v) / 4
>>> sqrt(sum([(x - mu)**2 for x in v]) / 4)
0.10932134388775223

Also, in case I am simply not using it right, this is how I call numpy:

>>> from numpy import std
>>> std([0.113967640255, 0.223095775796, 0.283134228235, 0.416793887842])
0.10932134388775223


1 Answer

Editorial Team · Answer 1 · 2026-05-18T23:29:15+00:00

Apache and Wolfram divide by N-1 rather than N. This is a degrees-of-freedom adjustment, needed because μ is itself estimated from the sample. Dividing by N-1 gives an unbiased estimate of the population variance; its square root is the usual sample standard deviation. You can change NumPy's behavior with the ddof option of std.
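As a minimal sketch of that option, the snippet below calls numpy's std on the sample from the question twice, once with the default ddof=0 and once with ddof=1. The variable names population_std and sample_std are only illustrative. The first value should match the numpy figure in the question, and the second should land close to the N-1 figures reported for Apache and Wolfram Alpha.

```python
import numpy as np

v = [0.113967640255, 0.223095775796, 0.283134228235, 0.416793887842]

# Default ddof=0: the summed squared deviations are divided by N.
population_std = np.std(v)

# ddof=1: the divisor becomes N - 1 instead.
sample_std = np.std(v, ddof=1)

print(population_std)
print(sample_std)
```

If the Python port is meant to reproduce the Java results, passing ddof=1 at every call site is probably the smallest change.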

This is described in the NumPy documentation:

The average squared deviation is normally calculated as x.sum() / N, where N = len(x). If, however, ddof is specified, the divisor N - ddof is used instead. In standard statistical practice, ddof=1 provides an unbiased estimator of the variance of the infinite population. ddof=0 provides a maximum likelihood estimate of the variance for normally distributed variables. The standard deviation computed in this function is the square root of the estimated variance, so even with ddof=1, it will not be an unbiased estimate of the standard deviation per se.
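The divisor rule quoted above can also be written out by hand. The following is a rough pure-Python sketch, not numpy's actual implementation: std_with_ddof is a hypothetical helper introduced here for illustration, and the comparison against the standard library's statistics.pstdev (divide by N) and statistics.stdev (divide by N-1) is just a sanity check.

```python
from math import sqrt
import statistics


def std_with_ddof(values, ddof=0):
    """Hypothetical helper: square root of the summed squared deviations
    from the mean, divided by N - ddof."""
    n = len(values)
    mu = sum(values) / n
    squared_deviations = [(x - mu) ** 2 for x in values]
    return sqrt(sum(squared_deviations) / (n - ddof))


v = [0.113967640255, 0.223095775796, 0.283134228235, 0.416793887842]

divide_by_n = std_with_ddof(v, ddof=0)          # same divisor as numpy's default
divide_by_n_minus_1 = std_with_ddof(v, ddof=1)  # same divisor as ddof=1

# The standard library exposes both conventions under separate names.
print(divide_by_n, statistics.pstdev(v))
print(divide_by_n_minus_1, statistics.stdev(v))
```

With ddof=0 this is the same calculation as the manual one in the question, so the only thing that changes between the two results is the divisor.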
