Assume I have used ptr = malloc(old_size); to allocate a memory block of old_size bytes. Only the first header_size bytes are meaningful. I’m going to increase the size to new_size.
new_size is greater than old_size and old_size is greater than header_size.
before:
/- - - - - - - old_size - - - - - - - \
+===============+---------------------+
\-header_size-/
after:
/- - - - - - - - - - - - - - - new_size - - - - - - - - - - - - - - - - - - -\
+===============+------------------------------------------------------------+
\-header_size-/
I don’t care what is stored after ptr + header_size because I’ll read new data into that region.
method 1: go straight to new_size
ptr = realloc(ptr, new_size);
method 2: shrink to header_size and grow to new_size
ptr = realloc(ptr, header_size);
ptr = realloc(ptr, new_size);
method 3: allocate a new memory block and copy the first header_size bytes
void *newptr = malloc(new_size);
memcpy(newptr, ptr, header_size);
free(ptr);
ptr = newptr;
Which is faster?
It almost certainly depends on the values of old_size, new_size and header_size, and also it depends on the implementation. You’d have to pick some values and measure.

(1) is probably best in the case where header_size == old_size-1 && old_size == new_size-1, since it gives you the best chance of the single realloc being basically a no-op. (2) should be only very slightly slower in that case (2 almost-no-ops being marginally slower than 1).

(3) is probably best in the case where header_size == 1 && old_size == 1024*1024 && new_size == 2048*1024, because the realloc would have to move the allocation, but you avoid copying 1 MB of data you don’t care about. (2) should be only very slightly slower in that case.

(2) is probably best when header_size is much smaller than old_size, and new_size is in a range where it’s reasonably likely that the realloc will relocate, but also reasonably likely that it won’t. Then you can’t predict which of (1) and (3) it is that will be very slightly faster than (2).

In analyzing (2), I have assumed that realloc downwards is approximately free and returns the same pointer. This is not guaranteed. I can think of two things that can mess you up:
Either of those could make (2) significantly more expensive than (1). So it’s an implementation detail whether or not (2) is a good way of hedging your bets between the advantages of (1) (sometimes avoids copying anything) and the advantages of (3) (sometimes avoids copying too much).
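Whether realloc downwards really keeps the block in place is something you can probe on your own implementation. The following is only a sketch under a few assumptions: probe_shrink_then_grow and struct realloc_probe are made-up names, header_size is assumed to be nonzero, and a single run tells you about one allocator in one heap state, not about realloc in general.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Result of one probe; all names here are illustrative, not a library API. */
struct realloc_probe {
    int ok;            /* 1 if every allocation succeeded */
    int shrink_moved;  /* 1 if realloc(ptr, header_size) returned a new address */
    int grow_moved;    /* 1 if the following realloc(ptr, new_size) moved the block */
    int header_intact; /* 1 if the first header_size bytes survived both calls */
};

/* Runs method (2) once and records whether either realloc relocated the block.
   Assumes 0 < header_size < old_size < new_size. */
static struct realloc_probe probe_shrink_then_grow(size_t header_size,
                                                   size_t old_size,
                                                   size_t new_size)
{
    struct realloc_probe r = {0, 0, 0, 0};
    unsigned char *ptr = malloc(old_size);
    if (ptr == NULL)
        return r;
    memset(ptr, 0xAB, header_size);      /* stand-in for the meaningful header */

    /* Save the address as an integer first: the old pointer value must not
       be inspected after a successful realloc. */
    uintptr_t before = (uintptr_t)ptr;
    unsigned char *tmp = realloc(ptr, header_size);
    if (tmp == NULL) { free(ptr); return r; }
    ptr = tmp;
    r.shrink_moved = ((uintptr_t)ptr != before);

    before = (uintptr_t)ptr;
    tmp = realloc(ptr, new_size);
    if (tmp == NULL) { free(ptr); return r; }
    ptr = tmp;
    r.grow_moved = ((uintptr_t)ptr != before);

    /* realloc preserves contents up to the smaller of the old and new sizes. */
    r.header_intact = 1;
    for (size_t i = 0; i < header_size; i++)
        if (ptr[i] != 0xAB)
            r.header_intact = 0;

    free(ptr);
    r.ok = 1;
    return r;
}
```

Calling it with, say, probe_shrink_then_grow(16, 1024, 4096) and printing shrink_moved and grow_moved gives a first hint about whether the assumption behind (2) holds for those sizes on your platform.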
Btw, this kind of idle speculation about performance is more useful for tentatively explaining observations you have already made than for tentatively predicting the observations we would make in the unlikely event that we actually cared enough about performance to test it.
Furthermore, I suspect that for large allocations, the implementation might be able to do even a relocating realloc without copying anything, by re-mapping the memory to a new address. In which case they would all be fast. I haven’t looked into whether implementations actually do that, though.
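Since the only reliable answer is to measure, here is a rough timing harness for the three methods. It is a sketch, not a rigorous benchmark: the helper names (grow_direct, grow_shrink_first, grow_copy, time_method) are invented for illustration, clock() measures CPU time at coarse resolution, and the malloc and memset that set up each round are included in the timing for all three methods alike. It assumes 0 < header_size < old_size < new_size.

```c
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Each helper takes ownership of ptr and returns the resized block,
   or NULL on failure (in which case the original block has been freed). */

/* Method 1: go straight to new_size. */
static void *grow_direct(void *ptr, size_t header_size, size_t new_size)
{
    (void)header_size;                    /* unused; kept for a uniform signature */
    void *tmp = realloc(ptr, new_size);
    if (tmp == NULL)
        free(ptr);
    return tmp;
}

/* Method 2: shrink to header_size, then grow to new_size. */
static void *grow_shrink_first(void *ptr, size_t header_size, size_t new_size)
{
    void *tmp = realloc(ptr, header_size);
    if (tmp == NULL) { free(ptr); return NULL; }
    ptr = tmp;
    tmp = realloc(ptr, new_size);
    if (tmp == NULL)
        free(ptr);
    return tmp;
}

/* Method 3: allocate a new block and copy only the header. */
static void *grow_copy(void *ptr, size_t header_size, size_t new_size)
{
    void *newptr = malloc(new_size);
    if (newptr != NULL)
        memcpy(newptr, ptr, header_size);
    free(ptr);
    return newptr;
}

typedef void *(*grow_fn)(void *, size_t, size_t);

/* Returns CPU seconds spent on iters rounds, or -1.0 on allocation failure. */
static double time_method(grow_fn grow, size_t header_size, size_t old_size,
                          size_t new_size, int iters)
{
    clock_t start = clock();
    for (int i = 0; i < iters; i++) {
        unsigned char *ptr = malloc(old_size);
        if (ptr == NULL)
            return -1.0;
        memset(ptr, 0, old_size);         /* touch the block so it is really backed */
        ptr = grow(ptr, header_size, new_size);
        if (ptr == NULL)
            return -1.0;
        ptr[new_size - 1] = 1;            /* touch the tail of the grown block */
        free(ptr);
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling time_method once per helper with the same sizes, for example time_method(grow_copy, 1, 1024*1024, 2048*1024, 1000), and comparing the three results gives a first impression. Repeating the runs and varying the sizes matters, because the outcome depends heavily on the allocator and on the state of the heap.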