I just stumbled upon this blog post about cache algorithms.
The author shows two code samples that loop through a rectangle and compute something (my guess is the computing code is just a placeholder).
In one of the examples, he scans the rectangle vertically, and in the other horizontally. He then says the second is faster, and that every programmer should know why. Now I must not be a programmer, because to me they look exactly the same.
Can anyone explain why the latter is faster?
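Cache locality (spatial locality, to be precise; cache coherence is a different thing, about keeping multiple caches consistent). Assuming the rectangle is stored row by row, scanning horizontally touches addresses that sit next to each other in memory. The CPU loads memory a whole cache line at a time, so each line it fetches serves several consecutive accesses, and you get fewer cache misses and a faster loop. Scanning vertically jumps a full row ahead on every step, so each access is likely to land on a different cache line. For a rectangle small enough to fit entirely in cache, this won’t matter.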
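To make the "closer together in memory" part concrete, here is a minimal sketch that lists the memory offsets each scan order visits. It assumes a row-major layout (element (x, y) at offset y * width + x), which the blog post's code probably uses but which I haven't confirmed. The function names are made up for illustration. It does not measure timing, it only shows the access pattern.

```python
# Assumption: the rectangle is stored row-major in one flat buffer,
# so element (x, y) lives at offset y * width + x.

def offsets_horizontal(width, height):
    """Offsets visited when the inner loop walks along a row (x varies fastest)."""
    return [y * width + x for y in range(height) for x in range(width)]

def offsets_vertical(width, height):
    """Offsets visited when the inner loop walks down a column (y varies fastest)."""
    return [y * width + x for x in range(width) for y in range(height)]

def strides(offsets):
    """Distance, in elements, between each pair of consecutive accesses."""
    return [b - a for a, b in zip(offsets, offsets[1:])]

if __name__ == "__main__":
    w, h = 4, 3
    print(offsets_horizontal(w, h))  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
    print(offsets_vertical(w, h))    # [0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11]
    print(strides(offsets_vertical(w, h)))
```

The horizontal scan always moves one element forward, so neighbouring accesses share cache lines. The vertical scan moves `width` elements at a time, and once a row is wider than a cache line, nearly every access needs a different line.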