I have a char array of length 8 billion. Would breaking it into smaller arrays increase performance by improving caching? Basically, I will iterate over the array and do some comparisons. If not, what is the best way to work with an array of that length?
I am reading a file in binary form into an array, and will be performing binary comparisons on different parts of the file.
8 GB of data is far larger than any CPU cache and may not even fit in physical RAM, so one way or the other you either have to manage the memory in smaller pieces yourself, or the OS will do it for you by swapping pages of virtual memory to and from disk.
There is, however, an alternative: mmap. Essentially, it lets you map a file into your virtual address space. The OS then takes over the task of loading the necessary pages into RAM, and your access to the file becomes nothing more than simple memory addressing.
Read more about mmap at http://en.wikipedia.org/wiki/Mmap
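As a rough illustration, here is a minimal sketch of comparing two parts of a file through a read-only mapping. It assumes a POSIX system and a 64-bit process, so that an 8 GB file fits in the address space. The helper name `regions_equal` and its parameters are made up for this example and are not part of any library.

```c
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: compares `len` bytes starting at offsets `a` and `b`
 * of the file at `path`.
 * Returns 1 if the regions are equal, 0 if they differ, -1 on error. */
int regions_equal(const char *path, size_t a, size_t b, size_t len)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return -1;
    }
    size_t size = (size_t)st.st_size;

    /* Reject regions that would run past the end of the file. */
    if (len > size || a > size - len || b > size - len) {
        close(fd);
        return -1;
    }

    /* Map the whole file read-only. Nothing is read from disk yet;
     * pages are loaded on demand when they are first touched. */
    const unsigned char *data = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping remains valid after the descriptor is closed */
    if (data == MAP_FAILED)
        return -1;

    /* From here on the file behaves like an ordinary in-memory array. */
    int equal = memcmp(data + a, data + b, len) == 0;

    munmap((void *)data, size);
    return equal;
}
```

For example, `regions_equal("data.bin", 0, 4096, 512)` would compare the first 512 bytes of a file named `data.bin` with the 512 bytes starting at offset 4096. In a real program you would probably map the file once and keep the mapping for all comparisons rather than remapping on every call.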