I’m using the small module below to dump objects to disk, and I’m seeing monotonically increasing memory usage. It happens with unicode strings but not with integers. Is there something I’m doing wrong?
When I do:
>>> from utils.diskfifo import DiskFifo
>>> df=DiskFifo()
>>> for i in xrange(1000000000):
...     df.append(i)
memory consumption stays stable,
but when I do:
>>> while True:
...     a = {'key': u'value', 'key2': u'value2'}
...     df.append(a)
Memory usage goes through the roof. Any hints? The module is below.
import tempfile
import cPickle
class DiskFifo:
    def __init__(self):
        self.fd = tempfile.TemporaryFile()
        self.wpos = 0
        self.rpos = 0
        self.pickler = cPickle.Pickler(self.fd)
        self.unpickler = cPickle.Unpickler(self.fd)
        self.size = 0
    def __len__(self):
        return self.size
    def extend(self, sequence):
        map(self.append, sequence)
    def append(self, x):
        self.fd.seek(self.wpos)
        self.pickler.dump(x)
        self.wpos = self.fd.tell()
        self.size = self.size + 1
    def next(self):
        try:
            self.fd.seek(self.rpos)
            x = self.unpickler.load()
            self.rpos = self.fd.tell()
            return x
        except EOFError:
            raise StopIteration
    def __iter__(self):
        self.rpos = 0
        return self
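The Pickler object stores every object it has seen in its memo, so it doesn’t have to pickle the same thing twice. The memo keeps a reference to each of those objects, so it grows with every new object you append, such as the dicts in your loop; integers are not memoized, which is why the first loop stays flat. You want to skip this (so references to your objects aren’t stored in your pickler object) and clear the memo before dumping, by calling self.pickler.clear_memo() in append just before self.pickler.dump(x).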
Source: http://docs.python.org/library/pickle.html#pickle.Pickler.clear_memo
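As a rough sketch of that fix, here is the class from the question with clear_memo() called before each dump. It is written to run on Python 3 as well, so it imports pickle rather than cPickle. One part is my own assumption rather than something the docs prescribe: records are read back with pickle.load instead of a long-lived Unpickler, so that every record is decoded with a fresh memo to match the reset on the write side.

```python
import pickle  # on Python 2, cPickle provides the same Pickler interface
import tempfile


class DiskFifo(object):
    """Disk-backed FIFO, same layout as the class in the question."""

    def __init__(self):
        self.fd = tempfile.TemporaryFile()
        self.wpos = 0
        self.rpos = 0
        self.pickler = pickle.Pickler(self.fd)
        self.size = 0

    def __len__(self):
        return self.size

    def extend(self, sequence):
        for x in sequence:
            self.append(x)

    def append(self, x):
        self.fd.seek(self.wpos)
        # Drop the references collected during the previous dump so the
        # memo cannot grow and each record is a self-contained pickle.
        self.pickler.clear_memo()
        self.pickler.dump(x)
        self.wpos = self.fd.tell()
        self.size += 1

    def __iter__(self):
        self.rpos = 0
        return self

    def __next__(self):
        try:
            self.fd.seek(self.rpos)
            # Assumption: decode each record on its own, with a fresh memo.
            x = pickle.load(self.fd)
            self.rpos = self.fd.tell()
            return x
        except EOFError:
            raise StopIteration

    next = __next__  # Python 2 iterator protocol
```

With this version the memo only ever holds the objects from the most recent append, so its size stays bounded no matter how many records are written.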
Edit:
You can actually watch the size of the memo go up as you pickle your objects by printing the size of self.pickler.memo from inside append each time it is called.
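To see the same effect without touching the class, here is a small standalone sketch. The helper names memo_size and memo_growth are made up for this illustration; it dumps a series of items through a single Pickler and records the memo size after each dump. It goes through memo.copy() because the memo is a plain dict in cPickle but a proxy object in Python 3's pickle.

```python
import io
import pickle  # on Python 2, cPickle.Pickler exposes a memo attribute too


def memo_size(pickler):
    # copy() returns a plain dict for both the dict and the proxy form.
    return len(pickler.memo.copy())


def memo_growth(make_item, count):
    """Dump `count` items through one Pickler; return memo size after each."""
    pickler = pickle.Pickler(io.BytesIO())
    sizes = []
    for i in range(count):
        pickler.dump(make_item(i))
        sizes.append(memo_size(pickler))
    return sizes


# Integers are never memoized, so the memo stays empty.
int_sizes = memo_growth(lambda i: i, 5)

# Every new dict is remembered, so the memo keeps growing.
dict_sizes = memo_growth(lambda i: {'key': u'value', 'key2': u'value2'}, 5)

print(int_sizes)
print(dict_sizes)
```

The first list stays flat while the second climbs with every dump, which matches the difference between the two loops in the question.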