I was just wondering how exactly memory works such that a language standard (such as C++’s ISO/ANSI standard) can guarantee that any data structure (even an array) will be contiguous.
I don’t even know how to write a data structure using contiguous memory, but could you please give me a short example of how designers can do this?
For instance, assuming std::vector from C++ allocates all of its memory at runtime, how would it know that the memory slots ahead of the currently allocated memory are not in use (and thus free for use by the vector)? Do vectors just look quite far ahead and hope the user doesn’t push_back so many objects that they can no longer be stored in a contiguous memory block? Or does the operating system move memory around at will to keep this from becoming a problem (no idea how this would work)?
Your question seems to be about how memory allocation works, seen from a beginner's point of view. Let me try to explain what is going on in a very simplified manner. As an example, think of a C++ program that adds a lot of elements to a std::vector.
When the program starts, the C++ runtime asks the operating system for some memory. This region is called the heap, and it is used whenever the program needs dynamic memory. Initially the heap is mostly unused, but calls to new and malloc carve blocks of memory out of it. Internally, the heap keeps some bookkeeping information to track which areas are used and which are free.
Exactly how std::vector behaves internally depends on the implementation, but in general it allocates a single buffer on the heap for the elements of the vector. This buffer is big enough to hold all the elements, and most of the time it has some free space at the end. Picture a buffer that stores 5 elements and has room for 8, located at address 1000 on the heap.
The std::vector keeps track of the number of elements in the vector (5, reported by size()), the number of elements the buffer can hold (8, reported by capacity()) and the location of the buffer (1000).
After push_back is called to add a new element to the vector, the buffer stores 6 elements and still has room for 8. That can be done two more times until all the space in the buffer has been used.
But what happens if push_back is called once more? The vector has to increase the size of the buffer. If the area right after the buffer happened to be unused, the heap could in principle grow the buffer in place. However, most of the time that memory has already been given to some other object, which is something the heap keeps track of. So for the vector to be able to grow, it has to allocate a completely new buffer with an increased size. Many implementations simply double the size of the buffer. In this example, the new buffer stores 9 elements, has room for 16 elements and is allocated at address 2000 on the heap.
The contents of the old buffer are copied (or moved) to the new buffer, and the old buffer is released. This operation can be costly if the buffer is big.
In case you wonder, the heap itself may also grow while the program is running, just as individual blocks allocated on the heap may grow. This increases the memory consumption of the program. As more and more elements are added to the vector, the heap has to keep growing until the operating system refuses to give it more memory. When that happens, the program fails with an out-of-memory condition.
To sum up: std::vector will preallocate a buffer to allow the vector to grow, but if the vector grows beyond the size of the buffer it will allocate a new buffer and copy the entire contents of the vector to this new buffer.