I have a long-running background process that parses a few hundred thousand lines of a CSV. I noticed that the process has a memory leak that occasionally causes the task to hit its soft memory limit and terminate. I have narrowed the leak down to the following chunk of code:
class BaseModel(db.Model):
    _keyNamespace = 'MyApp.Models'

    @classmethod
    def get_by_item_id(cls, id):
        key = "%s_%d" % (cls._keyNamespace, id)
        item = CacheStrategy.get(key)
        if not item:
            query = cls.gql("WHERE Id = :1", id)
            item = query.get()
            del query
        return item
I’ve cut this down to the bare bones, but it still causes Query objects to remain in memory. A sample GC reference dump is included at the end of this post, showing the Query and Query_Filter counts increasing by 200 after every 200-order batch step. If I get rid of the query call, this of course goes away.
My question is, WHY is this leaking Query references and how do I get it to honour the del and drop the query reference?
I’ve tried making this an instance method (no difference). Reference count trace below:
INFO 2011-10-17 16:29:39,158 orderparser.py:151] Putting a 200 unit batch of orders, 0.335000 seconds from start
DEBUG 2011-10-17 16:29:40,315 memleaker.py:20] Top Mem Leaks
DEBUG 2011-10-17 16:29:40,334 memleaker.py:22] 356306 Property
DEBUG 2011-10-17 16:29:40,334 memleaker.py:22] 356305 PropertyValue
DEBUG 2011-10-17 16:29:40,334 memleaker.py:22] 74410 Path
DEBUG 2011-10-17 16:29:40,334 memleaker.py:22] 74408 Path_Element
DEBUG 2011-10-17 16:29:40,334 memleaker.py:22] 45127 PropertyValue_ReferenceValue
DEBUG 2011-10-17 16:29:40,334 memleaker.py:22] 45127 PropertyValue_ReferenceValuePathElement
DEBUG 2011-10-17 16:29:40,334 memleaker.py:22] 43822 Reference
DEBUG 2011-10-17 16:29:40,335 memleaker.py:22] 30595 EntityProto
DEBUG 2011-10-17 16:29:40,335 memleaker.py:22] 320 ProtocolMessage
DEBUG 2011-10-17 16:29:40,335 memleaker.py:22] 217 Query
DEBUG 2011-10-17 16:29:40,335 memleaker.py:22] 209 Query_Filter
DEBUG 2011-10-17 16:29:40,335 memleaker.py:22] 55 NOT_PROVIDED
DEBUG 2011-10-17 16:29:40,335 memleaker.py:22] 34 Index_Property
DEBUG 2011-10-17 16:29:40,335 memleaker.py:22] 28 ExtendableProtocolMessage
DEBUG 2011-10-17 16:29:40,336 memleaker.py:22] 18 CompositeIndex
INFO 2011-10-17 16:29:40,644 orderparser.py:151] Putting a 200 unit batch of orders, 1.821000 seconds from start
DEBUG 2011-10-17 16:29:41,930 memleaker.py:20] Top Mem Leaks
DEBUG 2011-10-17 16:29:41,948 memleaker.py:22] 356506 Property
DEBUG 2011-10-17 16:29:41,948 memleaker.py:22] 356505 PropertyValue
DEBUG 2011-10-17 16:29:41,948 memleaker.py:22] 74410 Path
DEBUG 2011-10-17 16:29:41,948 memleaker.py:22] 74408 Path_Element
DEBUG 2011-10-17 16:29:41,948 memleaker.py:22] 45127 PropertyValue_ReferenceValue
DEBUG 2011-10-17 16:29:41,948 memleaker.py:22] 45127 PropertyValue_ReferenceValuePathElement
DEBUG 2011-10-17 16:29:41,948 memleaker.py:22] 43822 Reference
DEBUG 2011-10-17 16:29:41,951 memleaker.py:22] 30595 EntityProto
DEBUG 2011-10-17 16:29:41,951 memleaker.py:22] 417 Query
DEBUG 2011-10-17 16:29:41,951 memleaker.py:22] 409 Query_Filter
DEBUG 2011-10-17 16:29:41,951 memleaker.py:22] 320 ProtocolMessage
DEBUG 2011-10-17 16:29:41,951 memleaker.py:22] 55 NOT_PROVIDED
DEBUG 2011-10-17 16:29:41,951 memleaker.py:22] 34 Index_Property
DEBUG 2011-10-17 16:29:41,951 memleaker.py:22] 28 ExtendableProtocolMessage
DEBUG 2011-10-17 16:29:41,953 memleaker.py:22] 18 CompositeIndex
INFO 2011-10-17 16:29:42,276 orderparser.py:151] Putting a 200 unit batch of orders, 3.450000 seconds from start
DEBUG 2011-10-17 16:29:43,565 memleaker.py:20] Top Mem Leaks
DEBUG 2011-10-17 16:29:43,585 memleaker.py:22] 356706 Property
DEBUG 2011-10-17 16:29:43,585 memleaker.py:22] 356705 PropertyValue
DEBUG 2011-10-17 16:29:43,585 memleaker.py:22] 74410 Path
DEBUG 2011-10-17 16:29:43,585 memleaker.py:22] 74408 Path_Element
DEBUG 2011-10-17 16:29:43,585 memleaker.py:22] 45127 PropertyValue_ReferenceValue
DEBUG 2011-10-17 16:29:43,585 memleaker.py:22] 45127 PropertyValue_ReferenceValuePathElement
DEBUG 2011-10-17 16:29:43,585 memleaker.py:22] 43822 Reference
DEBUG 2011-10-17 16:29:43,586 memleaker.py:22] 30595 EntityProto
DEBUG 2011-10-17 16:29:43,586 memleaker.py:22] 617 Query
DEBUG 2011-10-17 16:29:43,586 memleaker.py:22] 609 Query_Filter
DEBUG 2011-10-17 16:29:43,586 memleaker.py:22] 320 ProtocolMessage
DEBUG 2011-10-17 16:29:43,586 memleaker.py:22] 55 NOT_PROVIDED
DEBUG 2011-10-17 16:29:43,586 memleaker.py:22] 34 Index_Property
DEBUG 2011-10-17 16:29:43,586 memleaker.py:22] 28 ExtendableProtocolMessage
DEBUG 2011-10-17 16:29:43,588 memleaker.py:22] 18 CompositeIndex
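For anyone wanting to produce a similar dump, per-type counts like the ones above can be gathered from the garbage collector. The following is only a minimal sketch using the standard gc module; it is not the memleaker.py used for the trace, and the type_counts helper and the stand-in Query class are hypothetical names chosen for illustration.

```python
import gc
from collections import Counter


def type_counts():
    """Count gc-tracked objects by type name (hypothetical helper)."""
    gc.collect()  # clear unreachable cycles so only live objects are counted
    return Counter(type(obj).__name__ for obj in gc.get_objects())


class Query(object):
    """Stand-in for the datastore Query class, for illustration only."""


before = type_counts()["Query"]
held = [Query() for _ in range(200)]  # simulate 200 queries that stay referenced
after = type_counts()["Query"]

print(after - before)  # 200

# The largest entries, similar in shape to the "Top Mem Leaks" listing
for name, count in type_counts().most_common(5):
    print("%d %s" % (count, name))
```

Comparing two snapshots taken one batch apart shows which types grow, which is the same signal as the Query and Query_Filter lines in the trace.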
I’m unable to reproduce this using your refcount code and a trivial snippet below (on shell.appspot.com or a fresh app):
It seems likely that something in your environment is holding references to the queries that have been executed. Are you using appstats or another development or debugging tool? Can you create a minimum reproduction case that exhibits the behaviour you observed?
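To illustrate the point about lingering references: del only removes the local name, so an object stays alive for as long as anything else still points at it. The sketch below is not App Engine code. The Query class is a stand-in, and the recorded_queries list is a hypothetical placeholder for whatever tool might be keeping executed queries around. It also shows gc.get_referrers, which can help identify the holder.

```python
import gc


class Query(object):
    """Stand-in for a datastore query object (hypothetical)."""


# Hypothetical recorder, standing in for a debugging or profiling hook
# that keeps a reference to every query it sees.
recorded_queries = []


def run_query():
    query = Query()
    recorded_queries.append(query)  # a second reference, outside this function
    del query  # removes only the local name; the list still holds the object


def live_queries():
    """Return all Query instances the garbage collector still tracks."""
    gc.collect()
    return [o for o in gc.get_objects() if type(o) is Query]


for _ in range(200):
    run_query()

leaked = live_queries()
print(len(leaked))  # 200

# Ask the garbage collector which containers still point at a survivor.
holders = gc.get_referrers(leaked[0])
print(any(h is recorded_queries for h in holders))  # True

# Once the holder lets go, the objects are freed.
del recorded_queries[:]
del leaked, holders
print(len(live_queries()))  # 0
```

Running gc.get_referrers on one of the surviving Query objects in the real application should point at whichever list, dict, or frame is keeping it alive.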