Is there an algorithm available to optimize the performance of the following?
    for (i = 0; i < LIMIT; i++) {
        for (j = 0; j < LIMIT; j++) {
            // do something with i and j
        }
    }
- Both i and j start at 0
- Both loops end on the same condition
- Both i and j are incremented at the same rate
Can this be done in 1 loop somehow?
It is possible to write this using one loop, but I would strongly suggest not doing so. The double for-loop is a well-established idiom that programmers know how to read, and if you collapse the two loops into one you sacrifice readability. Moreover, it’s unclear if this will actually make the code run any faster, since the compiler is already very good at optimizing loops. Collapsing the two loops into one requires some extra math at each step that is almost certainly slower than the two loops independently.
That said, if you do want to write this as a single loop, one idea is to think about the iteration space, the set of pairs (i, j) that you iterate over. Writing N for LIMIT, that space is an N-by-N grid with one row for each value of i and one column for each value of j.
The idea is to visit all of these pairs in the order (0, 0), (0, 1), ..., (0, N-1), (1, 0), (1, 1), ..., (1, N-1), ..., (N-1, 0), (N-1, 1), ..., (N-1, N-1). To do this, note that every time we increment i, we skip over N elements, while when we increment j we skip over just one element. Consequently, iteration (i, j) of the loop maps to position i * N + j in the linearized loop ordering. This means that on iteration i * N + j, we want to visit (i, j). We can recover i and j from the index using some simple arithmetic: if k is the current loop counter, we want to visit (k / N, k % N), where / is integer division. Thus the whole thing can be written as a single loop in which k runs from 0 up to (but not including) N * N.
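As a rough illustration of that index arithmetic, here is a minimal sketch in C. The names pair_from_index and print_all_pairs are made up for this example, the printf call is only a stand-in for "do something with i and j", and the counter is assumed to be a long long so that the product does not overflow a 32-bit int.

```c
#include <stdio.h>

/* Recover (i, j) from the single counter k, assuming 0 <= k < n * n and n > 0. */
void pair_from_index(long long k, int n, int *i, int *j) {
    *i = (int)(k / n);  /* row: how many full runs of n came before k */
    *j = (int)(k % n);  /* column: offset within the current run */
}

/* One loop that visits the same pairs, in the same order, as the nested loops. */
void print_all_pairs(int n) {
    /* Widened to long long so the product fits when int is 32 bits. */
    long long total = (long long)n * n;
    for (long long k = 0; k < total; k++) {
        int i, j;
        pair_from_index(k, n, &i, &j);
        printf("(%d, %d)\n", i, j);  /* stand-in for "do something with i and j" */
    }
}
```

Calling print_all_pairs(LIMIT) walks the grid row by row, just as the nested loops do. Note that each step now costs a division and a remainder.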
However, you have to be careful with this because N * N might not fit into an integer and thus could overflow. In that case, you would want to fall back on the double for-loop. Moreover, the introduction of the extra divisions and moduli will make this code run (potentially) much slower than the double for-loop. Finally, this code is much harder to read than the original code, and you’d need to be sure to provide aggressive comments describing what it is that you’re doing here. Again, I strongly advise you not to do this at all unless you have a very strong reason to suspect that there is a problem with the standard double for-loop.
(Interestingly, the trick used here can also be used to represent a multidimensional array using a single-dimensional array. The logic is identical: you have a two-dimensional structure that you want to represent with a one-dimensional structure.)
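To make the array remark and the overflow caveat concrete, here is a small sketch in C. The function names checked_cell_count and flat_index are hypothetical, and the layout assumed is row-major (all of row 0 first, then row 1, and so on), which matches the i * N + j mapping above.

```c
#include <stddef.h>
#include <stdint.h>

/* Stores rows * cols in *out and returns 1 if the product fits in size_t;
   returns 0 (leaving *out untouched) if it would overflow. */
int checked_cell_count(size_t rows, size_t cols, size_t *out) {
    if (cols != 0 && rows > SIZE_MAX / cols) {
        return 0;
    }
    *out = rows * cols;
    return 1;
}

/* Offset of element (i, j) in a one-dimensional array that stores a grid
   with `cols` columns in row-major order. */
size_t flat_index(size_t i, size_t j, size_t cols) {
    return i * cols + j;
}
```

With these, a rows-by-cols grid can live in one buffer of checked_cell_count elements, and grid[flat_index(i, j, cols)] plays the role of grid[i][j]. If the size check fails, that is the case where you would fall back on the double for-loop.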
Hope this helps!