Please look at this picture:

Is it possible to find the per-column sums for all columns faster than O(n^2)?
At first I thought it could be done in n * log(n) if we regroup the summation like this (sum 2 rows at a time, then the next 2 rows, then the next 2 rows, and so on, and then combine the partial results the same way):

But then I counted the number of pluses and it came out to be equal in both cases – 7 = 7 from both pictures.
So is it possible to compute such a sum in n * log(n) time, or have I fooled myself (I know there are FHT- or FFT-like transforms, so that might be the case)?
No, our input size is O(n^2), so our algorithm cannot be faster than that (because we are using all the input values). This is assuming that n is the number of rows, that the matrix is square (giving n^2 elements), and that there is no special relation between the elements.
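
To make the counting argument concrete, here is a minimal Python sketch that sums the columns both ways and counts the additions. The function names and the 8 x 8 test matrix are made up for illustration; they are not from the question. Under these assumptions, regrouping changes the order of the additions but not their number: summing n values in a column always takes n - 1 additions, so n columns take n * (n - 1) in total.

```python
def column_sums_sequential(matrix):
    """Sum each column top to bottom; return (sums, number_of_additions)."""
    n_cols = len(matrix[0])
    sums = list(matrix[0])
    additions = 0
    for row in matrix[1:]:
        for j in range(n_cols):
            sums[j] += row[j]
            additions += 1
    return sums, additions


def column_sums_pairwise(matrix):
    """Add rows in pairs, then pairs of pairs, and so on (tree-shaped grouping)."""
    rows = [list(r) for r in matrix]
    additions = 0
    while len(rows) > 1:
        merged = []
        for i in range(0, len(rows) - 1, 2):
            merged.append([a + b for a, b in zip(rows[i], rows[i + 1])])
            additions += len(rows[i])  # one addition per column for this pair
        if len(rows) % 2 == 1:
            merged.append(rows[-1])  # an odd leftover row carries over unchanged
        rows = merged
    return rows[0], additions


# Hypothetical 8 x 8 input, chosen only so the counts are easy to check by hand.
matrix = [[r * 8 + c for c in range(8)] for r in range(8)]
seq_sums, seq_adds = column_sums_sequential(matrix)
pair_sums, pair_adds = column_sums_pairwise(matrix)
print(seq_sums == pair_sums, seq_adds, pair_adds)
```

The pairwise version has log(n) levels, but each level still touches every column of every surviving row, so the total work stays proportional to n^2. The tree grouping can help with parallelism or floating-point rounding, but not with the operation count.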