I’ve been trying for a while to track down a memory problem with CherryPy. Every web call to a function uses memory that, when I check it with this command:
ps -u djotjog -o pid,rss,command | awk '{print $0}{sum+=$2} END {print "Total", sum/1024, "MB"}'
seems to remain used up permanently. I’ve tried to ‘del’ every object in the function before exiting, with no effect. I was wondering whether my class instance, which stores a lot of data, might be the problem. I use something like:
class Data:
    pass
ref_data = Data()
... do stuff... make a stories_dict ...
ref_data.stories = stories_dict #dictionary 'id':'story' pairs
del stories_dict
In the end, I see 350MB still used each time I run the web-call, and after it reaches 500MB, it seems to spawn another cherrypy instance!
PID RSS COMMAND
10492 960 ps -u globamh1 -o pid,rss,command
10493 784 awk {print $0}{sum+=$2} END {print "Total", sum/1024, "MB"}
29833 1708 -bash
Total 3.37109 MB
LATER…
PID RSS COMMAND
12811 1164 /bin/sh cherryd.fcgi
12817 293788 /home4/globamh1/python-2.7.2/bin/python2.7 /home4/globamh1/.local/bin/cherryd -P modules -c cherryd.conf -f -i app
13195 984 ps -u globamh1 -o pid,rss,command
13196 16 awk {print $0}{sum+=$2} END {print "Total", sum/1024, "MB"}
29833 1708 -bash
Total 308 MB
Later still…
PID RSS COMMAND
4053 5216 /home/globamh1/python-2.7.2/bin/python /home/globamh1/python-2.7.2/ngo_prompter_2.py
4091 988 ps -u globamh1 -o pid,rss,command
4092 784 awk {print $0}{sum+=$2} END {print "Total", sum/1024, "MB"}
12817 1111616 /home4/globamh1/python-2.7.2/bin/python2.7 /home4/globamh1/.local/bin/cherryd -P modules -c cherryd.conf -f -i app
29833 1716 -bash
32413 1168 /bin/sh cherryd.fcgi
32414 576792 /home4/globamh1/python-2.7.2/bin/python2.7 /home4/globamh1/.local/bin/cherryd -P modules -c cherryd.conf -f -i app
Total 1658.48 MB
So to wrap this up into some specific questions:
- how quickly should python’s garbage collector work?
- does cherrypy or apache do something weird to keep data persistent?
- how can I trust cherrypy to respond to multiple requests if it is using so much memory? I already see that it ignores some requests.
- is this a server configuration problem?
Is THIS the same problem?
Memory not released by python cherrypy application on linux
And if yes, how do I configure that solution on a shared hosting site?
From the sample code you show, there’s very little to be collected. In particular, this line:
ref_data.stories = stories_dict
makes ref_data.stories and stories_dict both refer to the same large dataset. Even if you delete stories_dict, ref_data.stories is still a reference to that same dict, so it will not be garbage-collected until ref_data.stories is deleted (or ref_data itself is deleted).
Until then, the only thing freed when stories_dict is deleted is that name's pointer to the dictionary (just a few bytes, probably).
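The aliasing described above can be checked with a small sketch. It assumes CPython, and it uses a hypothetical `Stories` subclass of dict only because plain dicts cannot be targets of weak references; the weak reference lets us observe whether the data is still alive without keeping it alive ourselves.

```python
import gc
import weakref


class Data(object):
    pass


class Stories(dict):
    """Hypothetical dict subclass; plain dicts do not support weak references."""


ref_data = Data()
stories_dict = Stories({'id1': 'story one', 'id2': 'story two'})

# A weak reference observes the object without keeping it alive.
probe = weakref.ref(stories_dict)

ref_data.stories = stories_dict   # second reference to the same dict
del stories_dict                  # removes only the local name

# The dict is still reachable through ref_data.stories.
alive_after_first_del = probe() is not None

del ref_data.stories              # drops the last reference
gc.collect()                      # not strictly needed in CPython; harmless

alive_after_second_del = probe() is not None
```

Under these assumptions, `alive_after_first_del` is true and `alive_after_second_del` is false: the data only goes away once the attribute on `ref_data` is gone as well.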
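I’m not sure there are guarantees, but in my experience CPython frees an object as soon as its last reference goes away, whether through del or because the function exits. Only objects caught in reference cycles have to wait for the periodic cyclic collector.
As for CherryPy or Apache doing something weird to keep data persistent: I suspect not. Do you not see this behavior if you run the same routines directly from the interpreter?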
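One way to try this outside CherryPy is to run the routine in a plain Python process and sample the resident set size around it. The following is only a sketch: `build_stories` is a hypothetical stand-in for the real routine, and `current_rss_kb` assumes a Linux-style /proc filesystem (it returns None elsewhere).

```python
import gc


def current_rss_kb():
    """Return this process's resident set size in kB, or None without /proc."""
    try:
        with open('/proc/self/status') as status:
            for line in status:
                if line.startswith('VmRSS:'):
                    return int(line.split()[1])
    except IOError:
        pass
    return None


def build_stories(n):
    # Hypothetical stand-in for the routine that builds stories_dict.
    return dict((i, 'story %d' % i) for i in range(n))


before = current_rss_kb()
stories = build_stories(100000)
during = current_rss_kb()

del stories      # drop the only reference
gc.collect()     # also sweep any reference cycles
after = current_rss_kb()

print('RSS kB before/during/after: %s / %s / %s' % (before, during, after))
```

Even when the objects really are freed, `after` may stay above `before`, because the allocator does not necessarily hand freed memory back to the operating system. If the numbers look the same here as under CherryPy, the server is probably not what is holding on to the data.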
If your application fundamentally needs more memory than the system has available when requests are handled in parallel, you will need some way to serialize the memory-heavy work across requests. Another option is to configure Apache/CherryPy to serve only one request at a time. I believe this is part of the WSGI configuration (how many processes/threads to allocate). If you limit the number of processes/threads to 1, CherryPy will serve only one request at a time.
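As a rough sketch of the first option, the memory-heavy part of a handler can be wrapped in a lock so that only one request at a time holds the big dataset. `load_stories` and `handle_request` are hypothetical names, not CherryPy API, and a `threading.Lock` only coordinates threads inside one process; it does nothing across separate cherryd processes.

```python
import threading

# One lock shared by every request thread in this process.
heavy_lock = threading.Lock()


def load_stories():
    # Hypothetical placeholder for the expensive, memory-hungry load.
    return {'id1': 'story one', 'id2': 'story two'}


def handle_request():
    # Only one thread at a time may hold the large dataset.
    with heavy_lock:
        stories = load_stories()
        summary = '%d stories' % len(stories)
    # The lock is released here; 'stories' is freed when the function returns.
    return summary
```

In a real application the body of `handle_request` would live inside the exposed CherryPy method. Other requests simply wait for the lock, so this trades throughput for a bounded memory footprint.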