I recently inherited a Python project, and I’m maintaining it now. Part of the code makes a few hundred thousand requests to a website and saves the results to a database. The code reuses the same httplib.HTTPConnection object for each request and just loops over a
conn.request("GET",someString,'',headers)
response = conn.getresponse()
section. A few days ago in my logs I saw that one of the requests threw the exception:
[Errno 104] Connection reset by peer
followed by every other conn.request() failing. My first inclination was to just build a new connection for each request, but the performance impact of that was profound and horrible. So my question is, how do I fix this, especially since I’m not 100% sure how I can even really test this.
If I just call conn.connect() after an exception, will it correctly reconnect?
I’m looking for advice on how to fix it and possibly how I could test it.
Thanks for your time.
I think you first need to decide which failure mode you want to handle. For instance, did the connection reset because of a temporary resource problem on the server, so that a quick reconnect will fix it? Or is the server down or rebooting, in which case you should abort your process?
Presuming the first case, I think you are thinking along the right lines. Try something like this (note, this is not working code – it’s just an example of the logic):
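Here is a minimal sketch of that reconnect logic, written for Python 3, where httplib is named http.client. The function name fetch_with_reconnect is hypothetical, and the choice to retry exactly once is an assumption. It calls conn.close() before conn.connect(), because close() drops the dead socket and resets the connection object's internal state, which a bare connect() does not do.

```python
import http.client  # this module is called httplib in Python 2


def fetch_with_reconnect(conn, path, headers):
    """GET `path` on `conn`; if the connection drops, reconnect once and retry.

    Hypothetical helper: the name and the single retry are illustrative choices.
    """
    for attempt in range(2):
        try:
            conn.request("GET", path, headers=headers)
            response = conn.getresponse()
            # Read the whole body so the connection can be reused afterwards.
            return response.status, response.read()
        except (OSError, http.client.HTTPException):
            # "Connection reset by peer" arrives as an OSError subclass.
            # close() discards the broken socket and resets the object's state.
            conn.close()
            if attempt == 1:
                raise  # second failure: let the caller decide what to do
            conn.connect()
```

It would be called inside the existing loop as status, body = fetch_with_reconnect(conn, someString, headers).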
You should probably also add logic to pause between repeated connect attempts and to give up after a certain number of tries (which is basically the second scenario above).
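A rough sketch of the pause-and-give-up idea, again for Python 3's http.client. The name fetch_with_retries, the default of five tries, and the doubling delay are all assumptions to tune for the server in question. After close(), the next request() reopens the socket on its own, so no explicit connect() is needed here.

```python
import http.client
import time


def fetch_with_retries(conn, path, headers, max_tries=5, base_delay=1.0,
                       sleep=time.sleep):
    """GET `path`, retrying with a growing pause; raise after `max_tries` failures.

    Hypothetical helper: the limits and the backoff schedule are illustrative.
    """
    last_error = None
    for attempt in range(max_tries):
        try:
            conn.request("GET", path, headers=headers)
            response = conn.getresponse()
            return response.status, response.read()
        except (OSError, http.client.HTTPException) as exc:
            last_error = exc
            conn.close()  # the next request() reconnects automatically
            if attempt < max_tries - 1:
                # Wait 1x, 2x, 4x, ... base_delay before the next attempt.
                sleep(base_delay * 2 ** attempt)
    # Persistent failure: treat the server as down and stop.
    raise RuntimeError("giving up after %d tries" % max_tries) from last_error
```

The sleep parameter only exists so the pauses can be swapped out in a test; in normal use leave it at its default.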
In order to test this, try using tcpkill to cause the TCP connection to reset:
http://www.gnutoolbox.com/tcpkill-command/
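If tcpkill is not an option, a throwaway local server that resets the first connection is another way to exercise the error path. The sketch below is an assumption-heavy stand-in, not a replacement for testing against the real site: start_flaky_server and demo are hypothetical names, and the SO_LINGER packing shown is the Linux/macOS layout (Windows uses a different struct).

```python
import http.client
import socket
import struct
import threading


def start_flaky_server(resets=1):
    """Listen on a free localhost port; reset the first `resets` connections,
    then answer every later connection with a tiny 200 response."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(5)

    def serve():
        remaining = resets
        while True:
            try:
                client, _ = listener.accept()
            except OSError:
                return  # listener was closed
            with client:
                client.recv(65536)  # consume the request
                if remaining > 0:
                    remaining -= 1
                    # Linger on with a zero timeout makes close() send a TCP RST.
                    client.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                                      struct.pack("ii", 1, 0))
                else:
                    client.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n"
                                   b"Connection: close\r\n\r\nok")

    threading.Thread(target=serve, daemon=True).start()
    return listener


def demo():
    """Make two requests on one HTTPConnection and record what happened."""
    server = start_flaky_server(resets=1)
    port = server.getsockname()[1]
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    outcomes = []
    for _ in range(2):
        try:
            conn.request("GET", "/")
            outcomes.append(conn.getresponse().read())
        except (OSError, http.client.HTTPException) as exc:
            outcomes.append(type(exc).__name__)
            conn.close()  # the next request() reconnects automatically
    conn.close()
    server.close()
    return outcomes
```

Calling demo() should record an exception name for the first request and the response body for the second, which shows that the same connection object recovers once it has been closed.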