Is there a more efficient way to achieve this:
Given an array A of size n and two positive integers a and b, find the sum floor(abs(A[i]-A[j])*a/b) taken over all pairs (i, j) where 0 <= i < j < n.
int A[n];
int a, b; // assigned some positive integer values
...
int total = 0;
for (int i = 0; i < n; i++) {
    for (int j = i+1; j < n; j++) {
        total += abs(A[i]-A[j])*a/b; // want integer division here
    }
}
To optimize this a little, I sorted the array (O(n log n)) so that the abs call is no longer needed. I also cached the value A[i] before the inner for loop, so the inner loop only reads A sequentially. I considered precomputing a/b and storing it in a float, but the extra casting just makes it slower (especially since I want to take the floor of the result).
I couldn’t come up with a solution that was better than O(n^2).
Yes, there is a more efficient algorithm. It can be done in O(n*log n). I don’t expect there to be an asymptotically faster way, but I’m far from having any idea of a proof.
Algorithm
First sort the array in O(n*log n) time.
Now, let us look at the terms
    floor((A[j]-A[i])*a/b)
for 0 <= i < j < n. For each 0 <= k < n, write A[k]*a = q[k]*b + r[k] with 0 <= r[k] < b.
For A[k] >= 0, we have q[k] = (A[k]*a)/b and r[k] = (A[k]*a)%b with integer division; for A[k] < 0, we have q[k] = (A[k]*a)/b - 1 and r[k] = b + (A[k]*a)%b, unless b divides A[k]*a, in which case we have q[k] = (A[k]*a)/b and r[k] = 0.
Now we rewrite the terms:
    floor((A[j]-A[i])*a/b) = floor(q[j] - q[i] + (r[j]-r[i])/b)
                           = q[j] - q[i] + floor((r[j]-r[i])/b)
Each q[k] appears k times with positive sign (for i = 0, 1, ..., k-1) and n-1-k times with negative sign (for j = k+1, k+2, ..., n-1), so its total contribution to the sum is
    (k - (n-1-k))*q[k] = (2*k+1-n)*q[k]
The remainders still have to be accounted for. Now, since 0 <= r[k] < b, we have
    -b < r[j] - r[i] < b
and floor((r[j]-r[i])/b) is 0 when r[j] >= r[i] and -1 when r[j] < r[i]. So
    ∑_{0<=i<j<n} floor((A[j]-A[i])*a/b) = ∑_{0<=k<n} (2*k+1-n)*q[k] - inversions(r)
where an inversion is a pair (i,j) of indices with 0 <= i < j < n and r[j] < r[i].
Calculating the q[k] and r[k] and summing the (2*k+1-n)*q[k] is done in O(n) time.
It remains to efficiently count the inversions of the
r[k] array.
For each index 0 <= k < n, let c(k) be the number of i < k such that r[k] < r[i], i.e. the number of inversions in which k appears as the larger index.
Then obviously the number of inversions is ∑ c(k).
On the other hand, c(k) is the number of elements that are moved behind r[k] in a stable sort (stability is important here).
Counting these moves, and hence the inversions of an array, is easy to do while merge-sorting it.
Thus the inversions can be counted in O(n*log n) too, giving an overall complexity of O(n*log n).
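To make the steps concrete, here is a minimal C++ sketch of the approach described above: sort, split each A[k]*a into a floored quotient and a non-negative remainder, sum (2*k+1-n)*q[k], and subtract the number of inversions among the remainders. The function names pair_floor_sum and count_inversions are made up for this illustration, and the sketch assumes that A[k]*a and the final total fit into a 64-bit signed integer; it is not the benchmarked implementation referred to elsewhere in this answer.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Merge-sorts r[lo, hi) in place and returns the number of pairs (i, j)
// in that range with i < j and r[j] < r[i]. (Hypothetical helper name.)
static std::int64_t count_inversions(std::vector<std::int64_t>& r,
                                     std::vector<std::int64_t>& tmp,
                                     std::size_t lo, std::size_t hi) {
    if (hi - lo < 2) return 0;
    const std::size_t mid = lo + (hi - lo) / 2;
    std::int64_t inv = count_inversions(r, tmp, lo, mid)
                     + count_inversions(r, tmp, mid, hi);
    std::size_t i = lo, j = mid, k = lo;
    while (i < mid && j < hi) {
        if (r[j] < r[i]) {
            // r[j] moves ahead of every element still waiting in the left half.
            inv += static_cast<std::int64_t>(mid - i);
            tmp[k++] = r[j++];
        } else {
            // Equal elements keep their order (stable), so ties are not counted.
            tmp[k++] = r[i++];
        }
    }
    while (i < mid) tmp[k++] = r[i++];
    while (j < hi) tmp[k++] = r[j++];
    std::copy(tmp.begin() + lo, tmp.begin() + hi, r.begin() + lo);
    return inv;
}

// Sum of floor(|A[i] - A[j]| * a / b) over all pairs i < j, for a, b > 0.
// Assumes A[k] * a does not overflow std::int64_t. (Hypothetical name.)
std::int64_t pair_floor_sum(std::vector<std::int64_t> A,
                            std::int64_t a, std::int64_t b) {
    std::sort(A.begin(), A.end());
    const std::int64_t n = static_cast<std::int64_t>(A.size());
    std::vector<std::int64_t> r(A.size());
    std::int64_t total = 0;
    for (std::int64_t k = 0; k < n; ++k) {
        const std::int64_t p = A[k] * a;
        std::int64_t q = p / b;    // C++ division truncates toward zero
        std::int64_t rem = p % b;  // has the sign of p, or is zero
        if (rem < 0) {             // adjust to a floored quotient, 0 <= rem < b
            rem += b;
            --q;
        }
        r[k] = rem;
        total += (2 * k + 1 - n) * q;
    }
    std::vector<std::int64_t> tmp(r.size());
    return total - count_inversions(r, tmp, 0, r.size());
}
```

As a quick check by hand, take A = {1, 4, 6}, a = 2, b = 3: the quotients are 0, 2, 4 and the remainders are 2, 2, 0, so the weighted quotient sum is -2*0 + 0*2 + 2*4 = 8, the remainders contain 2 inversions, and 8 - 2 = 6 matches the direct sum 2 + 3 + 1. The merge step only counts strict inversions, which is why equal remainders must not be reordered.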
Code
A sample implementation with a simple unscientific benchmark (but the difference between the naive quadratic algorithm and the above is so large that an unscientific benchmark is conclusive enough).