I’m running a capped collection (MongoDB) with tailable cursors. It runs fine until, suddenly, after 20-300 seconds, it jumps to 100% CPU and mongostat shows that getmore comes to a complete halt.
I ran cProfile on the Python script and found this:
1 0.000 0.000 0.000 0.000 {method 'lstrip' of 'str' objects}
77 0.000 0.000 0.000 0.000 {method 'match' of '_sre.SRE_Pattern' objects}
11 0.000 0.000 0.000 0.000 {method 'partition' of 'str' objects}
34 242.726 7.139 242.726 7.139 {method 'poll' of 'select.epoll' objects}
12 0.000 0.000 0.000 0.000 {method 'pop' of 'dict' objects}
So the epoll poll call definitely stands out here, while everything else looks normal, and this is likely what’s causing the hang and the CPU havoc.
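For anyone reproducing this kind of listing, here is a minimal sketch of collecting a profile and sorting it by total time (the second column above). The functions `blocking_wait`, `busy_work`, `workload` and `profile_report` are hypothetical names made up for the illustration, and `time.sleep` only stands in for a blocking call such as the epoll poll. One thing to keep in mind when reading the numbers: cProfile measures wall-clock time by default, so a call that sits waiting for events accumulates time even when it is not burning CPU.

```python
import cProfile
import io
import pstats
import time


def blocking_wait(seconds=0.2):
    # Hypothetical stand-in for a blocking call such as epoll.poll().
    time.sleep(seconds)


def busy_work(n=1000):
    # A small amount of real computation, for contrast.
    return sum(i * i for i in range(n))


def workload():
    busy_work()
    blocking_wait()


def profile_report(func, limit=5):
    # Profile func() and return the top entries sorted by total time.
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("tottime").print_stats(limit)
    return out.getvalue()


report = profile_report(workload)
print(report)
```

In this sketch the sleeping call should dominate the report even though it does no work, which is the same pattern as the poll line in the listing above.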
But what does it mean? (Hints here perhaps?) What’s going on and how can I fix it?
This is the code that most likely triggers the epoll:
    while WSHandler.cursor.alive:
        try:
            doc = WSHandler.cursor.next()
which runs in a separate thread (started with threading.Thread()).
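The snippet above is cut off after the try, so for reference here is a minimal sketch of what a complete tailing loop commonly looks like. It assumes the cursor raises StopIteration when no document is available yet while staying alive, which is how PyMongo's tailable cursors are usually handled. `FakeCursor`, `tail`, `handle` and `idle_sleep` are hypothetical names for the illustration, not part of the original script; the fake cursor only exists so the loop can run without a database.

```python
import time


class FakeCursor:
    """Hypothetical stand-in for a tailable cursor (exposes alive and next())."""

    def __init__(self, items):
        # None in the list means "no document available right now".
        self._items = list(items)
        self.alive = True

    def next(self):
        if not self._items:
            self.alive = False  # cursor is exhausted and dead
            raise StopIteration
        item = self._items.pop(0)
        if item is None:
            raise StopIteration  # nothing new yet, cursor stays alive
        return item


def tail(cursor, handle, idle_sleep=0.01):
    # Read documents while the cursor is alive; back off briefly when
    # nothing is available instead of spinning in a tight loop.
    while cursor.alive:
        try:
            doc = cursor.next()
        except StopIteration:
            time.sleep(idle_sleep)
            continue
        handle(doc)


received = []
tail(FakeCursor([{"n": 1}, None, {"n": 2}]), received.append, idle_sleep=0)
print(received)
```

The short sleep in the except branch is a design choice: without it, an alive cursor with no new data would make the loop spin continuously.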
(I’m using Tornado WebSocket, three MongoDB scripts for inserting into the db, and one script for tailing the cursor. The cProfile output is from the tailing script.)
This was caused by running three IOLoops with Tornado in three different scripts. This messes up the epoll which IOLoop uses. Solved by putting all three scripts in one IOLoop. And from running on an AWS micro instance!
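To illustrate the "one loop for everything" idea, here is a minimal sketch. Note that it uses the standard library's asyncio as a stand-in rather than Tornado's IOLoop, so it shows the shape of the fix and not the actual code. The `inserter` and `tailer` coroutines and the in-memory queue are hypothetical placeholders for the three insert scripts and the tailing script.

```python
import asyncio


async def inserter(name, queue, count):
    # Hypothetical stand-in for one of the three insert scripts.
    for i in range(count):
        await queue.put((name, i))
        await asyncio.sleep(0)  # yield control back to the shared loop


async def tailer(queue, expected):
    # Hypothetical stand-in for the tailing script.
    seen = []
    while len(seen) < expected:
        seen.append(await queue.get())
    return seen


async def main():
    queue = asyncio.Queue()
    names = ["a", "b", "c"]
    producers = [inserter(name, queue, 2) for name in names]
    # All four tasks are scheduled on the same event loop.
    results = await asyncio.gather(tailer(queue, 6), *producers)
    return results[0]


seen = asyncio.run(main())
print(len(seen))
```

With Tornado the equivalent would be to register all handlers and callbacks on a single IOLoop and start it once, instead of starting a separate loop per script.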