We are working on a video processing application using EmguCV and recently had to do some pixel level operation. I initially wrote the loops to go across all the pixels in the image as follows:
for (int j = 0; j < Img.Width; j++ )
{
for (int i = 0; i < Img.Height; i++)
{
// Pixel operation code
}
}
The time to execute the loops was pretty bad. Then I posted on the EmguCV forum and got a suggestion to switch the loops like this:
for (int j = Img.Width; j-- > 0; )
{
for (int i = Img.Height; i-- > 0; )
{
// Pixel operation code
}
}
I was very surprised to find that the code executed in half the time!
The only thing I can think of is the comparison that takes place in the loops each time accesses a property, which it no longer has to. Is this the reason for the speed up? Or is there something else? I was thrilled to see this improvement. And would love it if someone could clarify the reason for this.
The difference isn’t the cost of branching, it’s the fact that you are fetching an object property
Img.WidthandImg.Heightin the inner loop. The optimizer has no way of knowing that these are constants for purposes of that loop.You should get the same performance speedup by doing this.
Edit:
As Joshua Suggests, putting Width in the inner loop will have you walking through the memory sequentially, which will be better cache coherency, and might be faster. (depends on how big your bitmap is).