I have the following nested loop computation: int aY=a*Y,aX=a*X; for(int i=0; i<aY; i+=a) {

Question

Asked: June 17, 2026 (2026-06-17T06:22:11+00:00)

I have the following nested loop computation:

int aY=a*Y,aX=a*X;
for(int i=0; i<aY; i+=a)
{
    for(int j=0; j<aX; j+=a)
    {
        xInd=i-j+offX;
        yInd=i+j+offY;
        if ((xInd>=0) && (xInd<X) &&
            (yInd>=0) && (yInd<Y))
        {
            z=yInd*X+xInd;
            //use z
        }
    }
}

I want to lose the dependency on i, j, xInd and yInd as much as possible. In other words, I want to “traverse” all of the values z receives while running through the loop, but without involving the helper variables i, j, xInd and yInd, or at least with a minimal number of computations involved (most importantly, with no multiplications). How can I do that? Other hints on possible ways to make the loop more efficient would be welcome. Thanks!

1 Answer

Editorial Team · Answer 1 · 2026-06-17T06:22:12+00:00

If we read the question as asking how to minimize the number of iterations of the loop, we can take the following approach.

The constraints:

(xInd>=0) && (xInd<X)
(yInd>=0) && (yInd<Y)

allow us to tighten the bounds of the for loop. Expanding xInd and yInd gives (note that the upper bounds are strict):

0 <= i - j + offX < X
0 <= i + j + offY < Y

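For a fixed i, these inequalities (together with 0 <= j < aX and j being a multiple of a) bound j, so the inner loop can be rewritten as follows, assuming a > 0:

for(int i=0; i<aY; i+=a) {
    // lower: first multiple of a with j>=0, xInd<X, yInd>=0; upper: last j with j<aX, xInd>=0, yInd<Y
    int lower = ((max(0, max(i + offX - X + 1, -i - offY)) + a - 1) / a) * a;
    int upper = min(aX - 1, min(i + offX, Y - i - offY - 1));
    for(int j=lower; j<=upper; j+=a) {
        // every (i, j) reached here passes the original range check
    }
}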
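One way to gain confidence in bounds like these is to compare them against the original loop on a few inputs. The sketch below is only an illustration: the function names visit_checked and visit_tight are made up for this example, a > 0 is assumed, and the bounds use the strict upper limits xInd < X and yInd < Y from the question.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Reference: the loop from the question, with the range check kept.
std::vector<int> visit_checked(int a, int X, int Y, int offX, int offY) {
    assert(a > 0);  // assumption: the step must be positive for the loop to end
    std::vector<int> out;
    const int aY = a * Y, aX = a * X;
    for (int i = 0; i < aY; i += a) {
        for (int j = 0; j < aX; j += a) {
            const int xInd = i - j + offX;
            const int yInd = i + j + offY;
            if (xInd >= 0 && xInd < X && yInd >= 0 && yInd < Y)
                out.push_back(yInd * X + xInd);
        }
    }
    return out;
}

// Tightened version: the range check is folded into the bounds of j.
std::vector<int> visit_tight(int a, int X, int Y, int offX, int offY) {
    assert(a > 0);
    std::vector<int> out;
    const int aY = a * Y, aX = a * X;
    for (int i = 0; i < aY; i += a) {
        // j >= 0, xInd < X  =>  j >= i + offX - X + 1, yInd >= 0  =>  j >= -i - offY
        const int lo = std::max({0, i + offX - X + 1, -i - offY});
        const int lower = ((lo + a - 1) / a) * a;  // round up to a multiple of a
        // j < aX, xInd >= 0  =>  j <= i + offX, yInd < Y  =>  j <= Y - i - offY - 1
        const int upper = std::min({aX - 1, i + offX, Y - i - offY - 1});
        for (int j = lower; j <= upper; j += a) {
            const int xInd = i - j + offX;
            const int yInd = i + j + offY;
            out.push_back(yInd * X + xInd);  // no range check needed here
        }
    }
    return out;
}
```

Both functions visit j in increasing order, so the two vectors should match element for element, not just as sets.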

If you know more about the possible values of offX, offY, a, X and Y, further reductions may be possible.

Note that in reality you probably wouldn’t want to blindly apply this type of optimisation without profiling first: hand-tightened bounds may prevent the compiler from doing the transformation itself (for example, with GCC’s Graphite loop optimisations).

Use as index

If the value z=yInd*X+xInd is being used to index memory, a bigger win comes from making the memory accesses sequential, which gives good cache behaviour.

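Currently each inner iteration decreases xInd by a and increases yInd by a, so consecutive values of z are a*(X-1) elements apart. Strided access like this can result in poor cache performance.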
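The constant step in z also bears on the original request to avoid multiplications: within one row (fixed i), each z can be obtained from the previous one by an addition. The following is a rough sketch of that idea, not a tuned implementation. The name visit_strided is hypothetical, a > 0 is assumed, and it still uses one multiplication per row to find the starting z.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: yields the same z values, in the same order, as the
// loop in the question, but steps z by a constant stride inside each row.
std::vector<int> visit_strided(int a, int X, int Y, int offX, int offY) {
    assert(a > 0);  // assumption: positive step, as in the question's loop
    std::vector<int> out;
    const int aY = a * Y, aX = a * X;
    const int stride = a * (X - 1);  // change in z when j grows by a
    for (int i = 0; i < aY; i += a) {
        // Bounds on j derived from 0 <= xInd < X and 0 <= yInd < Y.
        const int lo = std::max({0, i + offX - X + 1, -i - offY});
        const int lower = ((lo + a - 1) / a) * a;  // round up to a multiple of a
        const int upper = std::min({aX - 1, i + offX, Y - i - offY - 1});
        if (lower > upper) continue;  // this row has no valid j
        // One multiplication per row; every later z is an addition.
        int z = (i + lower + offY) * X + (i - lower + offX);
        for (int j = lower; j <= upper; j += a, z += stride)
            out.push_back(z);
    }
    return out;
}
```

For a=1, X=3, Y=3 and zero offsets this visits 0, 4, 6, 8, which matches stepping through the original loop by hand.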

A solution to this issue would be to first compute and store all the indices, then do all the memory operations in a second pass using the sorted indices.

int indices[Y * X]; // at most X*Y (i, j) pairs pass the check
int index = 0;
for(...){
    for(...){
        ...
        indices[index++] = z;
    }
}
// sort indices[0..index) into ascending order
for(int idx = 0; idx < index; idx++){
    z = indices[idx];
    //do stuff with z
}
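A runnable version of this two-pass idea might look like the sketch below. It is written in C++ so that std::sort and std::vector can be used; the names collect_sorted_indices and sum_visited are invented for the example, and summing is only a stand-in for whatever "do stuff with z" means in the real code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Pass 1 (hypothetical helper): collect every z the question's loop
// produces, then sort the values into ascending memory order.
std::vector<int> collect_sorted_indices(int a, int X, int Y, int offX, int offY) {
    assert(a > 0);  // assumption: positive step
    std::vector<int> indices;
    const int aY = a * Y, aX = a * X;
    for (int i = 0; i < aY; i += a) {
        for (int j = 0; j < aX; j += a) {
            const int xInd = i - j + offX;
            const int yInd = i + j + offY;
            if (xInd >= 0 && xInd < X && yInd >= 0 && yInd < Y)
                indices.push_back(yInd * X + xInd);
        }
    }
    std::sort(indices.begin(), indices.end());
    return indices;
}

// Pass 2 (hypothetical helper): touch the data in ascending index order.
long long sum_visited(const std::vector<int>& data, const std::vector<int>& indices) {
    long long total = 0;
    for (int z : indices) {
        assert(z >= 0 && static_cast<std::size_t>(z) < data.size());
        total += data[z];
    }
    return total;
}
```

Sorting changes the order in which the elements are processed, so this only applies when the work done per z does not depend on that order. The sort itself also has a cost, so it is most likely to pay off when the same index list is reused across many passes; profiling would be needed to confirm.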
