I want to find the geometric average of some data, and performance matters.
Which one should I pick?
- Keep multiplying into a single variable and take the Nth root at the end of the calculation:
  X = MUL(x[i])^(1/N)
  Thus, O(N) x multiplication complexity + O(1) x Nth root.
- Use logarithms:
  X = e^{ 1/N * SUM(log(x[i])) }
  Thus, O(N) x logarithm complexity + O(1) x division by N + O(1) x exponential.
- A specialized algorithm for the geometric average. Please tell me if there is one.
I thought I would benchmark this to get a comparison. Here is my attempt.
The comparison was tricky because the list of numbers has to be large enough for the timing to be meaningful, so N is large. In my test, N = 50,000,000 elements.
However, multiplying together lots of numbers greater than 1 overflows the double that stores the product. Multiplying together lots of numbers less than 1 gives a product so small that it underflows to zero, and the Nth root of zero is zero.
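To make the overflow and underflow concrete, here is a minimal sketch in Python (not the C# used for the benchmark; Python floats are the same 64-bit doubles). The function name and the input lists are hypothetical, chosen only to trigger each failure mode.

```python
import math

# Hypothetical inputs, picked only to trigger each failure mode.
big = [10.0] * 400    # true geometric mean is 10
small = [0.1] * 400   # true geometric mean is 0.1


def naive_geometric_mean(values):
    """Multiply everything into one double, then take the Nth root."""
    product = 1.0
    for v in values:
        product *= v  # silently becomes inf or 0.0 once out of range
    return product ** (1.0 / len(values))


overflowed = naive_geometric_mean(big)     # product passes ~1.8e308, so it is inf
underflowed = naive_geometric_mean(small)  # product shrinks past ~5e-324, so it is 0.0

print(overflowed, underflowed)
```

Neither call raises an error, so a wrong result can go unnoticed unless you check for infinity or zero yourself.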
Just a couple more things: make sure none of your elements are zero, and note that the log approach doesn’t work for negative elements.
(The multiply would work without overflow if C# had a BigDecimal class with an Nth root function.)
Anyway, in my code each element is between 1 and 1.00001.
On the other hand, the log approach had no problems with overflow or underflow.
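As an illustration of the log approach, here is a minimal Python sketch (again, not the original C# benchmark code). The function name is hypothetical, and rejecting zero and negative inputs with an exception is my own assumption about how to handle the caveats above. `math.fsum` is used only to limit rounding error in a long sum; a plain running total would also work.

```python
import math


def log_geometric_mean(values):
    """Geometric mean computed as exp(mean(log(x))).

    Assumption: zero and negative inputs are rejected, since log() is
    undefined for them.
    """
    if not values:
        raise ValueError("need at least one value")
    if any(v <= 0 for v in values):
        raise ValueError("the log approach needs strictly positive values")
    log_sum = math.fsum(math.log(v) for v in values)
    return math.exp(log_sum / len(values))


# Inputs whose raw product would overflow or underflow a double.
big = [10.0] * 400
small = [0.1] * 400

print(log_geometric_mean(big))    # close to 10
print(log_geometric_mean(small))  # close to 0.1
```

The sum of logs grows only linearly with N, so it stays well inside the range of a double where the raw product would not.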
Here’s the code:
On my computer, the output shows that both geometric means are the same, but that:
So, the multiply appears to be faster.
But multiplication is very prone to overflow and underflow, so I would recommend the log approach unless you can guarantee that the product won’t overflow and won’t get too close to zero.