I just experimented with the size of Python data structures in memory. I wrote the following snippet:
import sys
lst1=[]
lst1.append(1)
lst2=[1]
print(sys.getsizeof(lst1), sys.getsizeof(lst2))
I got the following outputs on the following configurations:
- Windows 7 64bit, Python3.1: 52 40 (so lst1 has 52 bytes and lst2 has 40 bytes)
- Ubuntu 11.4 32bit with Python3.2: output is 48 32
- Ubuntu 11.4 32bit Python2.7: 48 36
Can anyone explain to me why the two sizes differ although both are lists containing a 1?
In the Python documentation for the getsizeof function I found the following:
…adds an additional garbage collector overhead if the object is managed by the garbage collector.
Could this be the case in my little example?
Here’s a fuller interactive session that will help me explain what’s going on (Python 2.6 on Windows XP 32-bit, but it doesn’t matter really):
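A minimal script along the same lines, measuring an empty list, a list literal, and a list built with append. The variable names are illustrative, and the numbers it prints depend on the Python version and platform, so none are shown here:

```python
import sys

# Three ways to end up with a list of zero or one element.
empty = []
literal = [1]
appended = []
appended.append(1)

# sys.getsizeof reports the list object plus its element array,
# not the objects the list refers to.
sizes = {
    "empty": sys.getsizeof(empty),
    "literal": sys.getsizeof(literal),
    "appended": sys.getsizeof(appended),
}

for name, size in sizes.items():
    print(name, size)
```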
Note that the empty list is a bit smaller than the one with [1] in it. When an element is appended, however, it grows much larger.
The reason for this is the implementation details in Objects/listobject.c, in the source of CPython.
Empty list
When an empty list [] is created, no space for elements is allocated – this can be seen in PyList_New. 36 bytes is the amount of space required for the list data structure itself on a 32-bit machine.
List with one element
When a list with a single element [1] is created, space for one element is allocated in addition to the memory required by the list data structure itself. Again, this can be found in PyList_New. Given size as argument, it computes the number of bytes needed for size pointers, and then allocates that many bytes for the elements.
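A rough Python model of that computation, not the C code itself. The function name item_array_bytes is made up for illustration, and the pointer size is taken from the interpreter running the script:

```python
import struct

# Size of a C pointer on the running interpreter (4 on 32-bit, 8 on 64-bit).
POINTER_SIZE = struct.calcsize("P")


def item_array_bytes(size):
    """Bytes requested for the element array of a new list of `size` items.

    Hypothetical helper: it mirrors the idea described in the text, where
    an empty list gets no element array and otherwise one pointer per item
    is allocated.
    """
    if size <= 0:
        return 0
    return size * POINTER_SIZE


for n in (0, 1, 2):
    print(n, item_array_bytes(n))
```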
So we see that with size = 1, space for one pointer is allocated. 4 bytes (on my 32-bit box).
Appending to an empty list
When calling append on an empty list, here’s what happens:
- PyList_Append calls app1
- app1 asks for the list’s size (and gets 0 as an answer)
- app1 then calls list_resize with size+1 (1 in our case)
- list_resize has an interesting allocation strategy, summarized in a comment in its source: it over-allocates, so that a run of appends does not need a new allocation every time.
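A rough Python model of this over-allocation, assuming the growth rule used by list_resize in CPython 2.x and early 3.x (newer releases use a different formula, so check Objects/listobject.c for your version). The function names are illustrative:

```python
def over_allocated(newsize):
    """Slots allocated when a list must grow to hold `newsize` items.

    Assumed rule: newsize plus one eighth of newsize, plus a small
    constant (3 for short lists, 6 otherwise).
    """
    if newsize == 0:
        return 0
    return newsize + (newsize >> 3) + (3 if newsize < 9 else 6)


def growth_pattern(appends):
    """Allocated capacities seen while appending `appends` items one by one."""
    allocated = 0
    pattern = [allocated]
    for length in range(1, appends + 1):
        if length > allocated:  # no free slot left, so a resize happens
            allocated = over_allocated(length)
            pattern.append(allocated)
    return pattern


print(over_allocated(1))   # 4: the first append reserves four slots
print(growth_pattern(80))  # [0, 4, 8, 16, 25, 35, 46, 58, 72, 88]
```

Under this rule, the first append to an empty list asks for room for 1 element and gets room for 4.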
Let’s do some math
Let’s see how the numbers I quoted in the session in the beginning of my article are reached.
So 36 bytes is the size required by the list data structure itself on 32-bit. With a single element, space is allocated for one pointer, so that’s 4 extra bytes – total 40 bytes. OK so far.
When app1 is called on an empty list, it calls list_resize with size=1. According to the over-allocation algorithm of list_resize, the next largest available size after 1 is 4, so space for 4 pointers will be allocated. 4 * 4 = 16 bytes, and 36 + 16 = 52.
Indeed, everything makes sense 🙂
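The same arithmetic as a few lines of Python, using the 32-bit figures from the text (36 bytes for the list structure, 4 bytes per pointer). The constant and function names are illustrative:

```python
HEADER_BYTES = 36   # list data structure itself on the 32-bit build discussed
POINTER_BYTES = 4   # one element slot on a 32-bit machine


def list_bytes(allocated_slots):
    """Total size of a list with the given number of allocated slots."""
    return HEADER_BYTES + allocated_slots * POINTER_BYTES


print(list_bytes(0))  # 36: empty list, no element array
print(list_bytes(1))  # 40: [1], exactly one slot
print(list_bytes(4))  # 52: after append, four slots reserved
```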