This is the strangest thing!
I have a multi-threaded client application written in Python. I’m using threading to concurrently download and process pages. I would use the cURL multi-handle except that the bottleneck is definitely the processor (not the bandwidth) in this application so it is more efficient to use a thread pool.
I have a 64b i7 rocking 16GB RAM. Beefy. I launch 80 threads while listening to Pandora and trolling Stackoverflow and BAM! The parent process sometimes ends with the message
Killed
Other times a single page (which is it’s own process in Chrome) will die. Other times the whole browser crashes.
If you want to see a bit of code here is the gist of it:
Here is the parent process:
def start( ):
while True:
for url in to_download:
queue.put( ( url, uri_id ) )
to_download = [ ]
if queue.qsize( ) < BATCH_SIZE:
to_download = get_more_urls( BATCH_SIZE )
if threading.activeCount( ) < NUM_THREADS:
for thread in threads:
if not thread.isAlive( ):
print "Respawning..."
thread.join( )
threads.remove( thread )
t = ClientThread( queue )
t.start( )
threads.append( t )
time.sleep( 0.5 )
And here is the gist of the ClientThread:
class ClientThread( threading.Thread ):
def __init__( self, queue ):
threading.Thread.__init__( self )
self.queue = queue
def run( self ):
while True:
try:
self.url, self.url_id = self.queue.get( )
except:
raise SystemExit
html = StringIO.StringIO( )
curl = pycurl.Curl( )
curl.setopt( pycurl.URL, self.url )
curl.setopt( pycurl.NOSIGNAL, True )
curl.setopt( pycurl.WRITEFUNCTION, html.write )
curl.close( )
try:
curl.perform( )
except pycurl.error, error:
errno, errstr = error
print errstr
curl.close( )
EDIT: Oh, right…forgot to ask the question…should be obvious: Why do my processes get killed? Is it happening at the OS level? Kernel level? Is this due to a limitation on the number of open TCP connections I can have? Is it a limit on the number of threads I can run at once? The output of cat /proc/sys/kernel/threads-max is 257841. So…I don’t think it’s that….
I think I’ve got it…OK…I have no swap space at all on my drive. Is there a way to create some swap space now? I’m running Fedora 16. There WAS swap…then I enabled all my RAM and it disappeared magically. Tailing /var/log/messages I found this error:
Mar 26 19:54:03 gazelle kernel: [700140.851877] [15961] 500 15961 12455 7292 1 0 0 postgres
Mar 26 19:54:03 gazelle kernel: [700140.851880] Out of memory: Kill process 15258 (chrome) score 5 or sacrifice child
Mar 26 19:54:03 gazelle kernel: [700140.851883] Killed process 15258 (chrome) total-vm:214744kB, anon-rss:70660kB, file-rss:18956kB
Mar 26 19:54:05 gazelle dbus: [system] Activating service name='org.fedoraproject.Setroubleshootd' (using servicehelper)
You’ve triggered the kernel’s Out Of Memory (OOM) handler; it selects which processes to kill in a complicated fashion that tries hard to kill as few processes as possible to make the most impact. Chrome apparently makes the most inviting process to kill under the criteria the kernel uses.
You can see a summary of the criteria in the
proc(5)manpage under the/proc/[pid]/oom_scorefile:You can adjust the
oom_scorefile for your Python program if you want it to be the one that is killed.Probably the better approach is adding more swap to your system to try to push off the time when the OOM-killer is invoked. Granted, having more swap doesn’t necessarily mean that your system will never run out of memory — and you might not care for the way it handles if there is a lot of swap traffic — but it can at least get you past tight memory problems.
If you’ve already allocated all the space available for swap partitions, you can add swap files. Because they go through the filesystem, there is more overhead for swap files than swap partitions, but you can add them after the drive is partitioned, making it an easy short-term solution. You use the
dd(1)command to allocate the file (do not useseekto make a sparse file) and then usemkswap(8)to format the file for swap use, then useswapon(8)to turn on that specific file. (I think you can even add swap files tofstab(5)to make them automatically available at next reboot, too, but I’ve never tried and don’t know the syntax.)