I am using the Yahoo API. I have implemented a random sleep method and, in addition, added hard sleeps, but I still can't figure out how to wait and try again if I don't get a response on the first attempt.
For example, the code below fails for some users, totally at random. After it fails, I open the same URL in my browser and it works like a charm. So my questions are: why does this happen, and how can I resolve it? Or can I improve this code to make another request after a hard sleep (only if that's a good approach)?
Some more information that I forgot to add: I changed the code to print the HTTP status code:
print urlobject.getcode()
and it returns 200, but no JSON. As some have suggested, this might be throttling.
Note: I have removed my appid (key) from the URL.
# return the json question for given question id
def returnJSONQuestion(questionId):
    randomSleep()
    url = 'http://answers.yahooapis.com/AnswersService/V1/getQuestion?appid=APPIDREMOVED8&question_id={0}&output=json'
    format_url = url.format(questionId)
    try:
        request = urllib2.Request(format_url)
        urlobject = urllib2.urlopen(request)
        time.sleep(10)
        jsondata = json.loads(urlobject.read().decode("utf-8"))
        print jsondata
    except urllib2.HTTPError, e:
        print e.code
        logging.exception("Exception")
    except urllib2.URLError, e:
        print e.reason
        logging.exception("Exception")
    except(json.decoder.JSONDecodeError,ValueError):
        print 'Question ID ' + questionId + ' Decode JSON has failed'
        logging.info("This qid didn't work " + questionId)
    return jsondata
Alrighty, first up, a few points that do not directly answer your question, but may be helpful:
1) I’m pretty sure there’s never any need to wait between calling urllib2.urlopen and reading the returned addinfourl object. The examples at http://docs.python.org/library/urllib2.html#examples do not feature any such sleep.
2) The line
    jsondata = json.loads(urlobject.read().decode("utf-8"))
can be simplified to just
    jsondata = json.load(urlobject)
which is simpler and more readable.
Basically, .load takes a file-like object as an argument, whereas .loads takes a string. You may have thought that it was necessary to read() the data first in order to decode it from utf-8, but this is in fact no problem, because .load assumes by default that the object it is reading is ascii or utf-8 encoded (see http://docs.python.org/library/json.html#json.load).
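To make the difference concrete, here is a small sketch that runs without any network access. The sample JSON text is made up, and io.BytesIO is only a stand-in for the response object that urlopen returns; it also assumes Python 3.6 or later, where json.load accepts a file-like object that yields bytes.

```python
import io
import json

# Made-up payload, standing in for what the API might send back.
text = '{"question_id": "123", "answers": 2}'

# json.loads parses a string that is already in memory.
from_string = json.loads(text)

# json.load reads from any object with a .read() method. io.BytesIO plays
# the role of the urlopen response here, so no request is actually made.
fake_response = io.BytesIO(text.encode('utf-8'))
from_file = json.load(fake_response)

print(from_string == from_file)
```

Both calls produce the same dictionary; the only difference is whether you hand over the text or the object to read it from.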
3) It may not matter for your present purposes, but I’d regard your exception handling here as bad. If anything goes wrong during the “try:” block, then the variable jsondata will not have been assigned. Then when we try to return it after the end of the try/except blocks, a NameError will be raised due to trying to use the unassigned variable. That means that if some other function in your application calls returnJSONQuestion and an exception occurs, then it will be a NameError, and not the original exception, that the outer function sees, and any tracebacks the outer function generates will not point to the spot where the real problem occurred. This could easily cause confusion when trying to figure out what has gone wrong. It would be better, therefore, if all your ‘except’ blocks here finished with ‘raise’.
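A stripped-down illustration of the problem, using int() in place of the web request so it can be run anywhere. Both function names are hypothetical and exist only for this sketch.

```python
def parse_number_swallow(text):
    """Mimics the original pattern: report the error, then carry on."""
    try:
        value = int(text)
    except ValueError:
        print('could not parse', repr(text))
    # If int() raised, value was never assigned, so this line fails with
    # UnboundLocalError (a subclass of NameError) and hides the real cause.
    return value


def parse_number_reraise(text):
    """Same reporting, but the original exception reaches the caller."""
    try:
        value = int(text)
    except ValueError:
        print('could not parse', repr(text))
        raise  # bare raise re-raises the exception being handled
    return value
```

Calling parse_number_swallow('abc') surfaces a NameError pointing at the return line, while parse_number_reraise('abc') surfaces the ValueError from int(), which is the one you actually want to see in a traceback.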
4) In Python, it’s a good idea to put comments saying what a function does as docstrings (see http://www.python.org/dev/peps/pep-0257/#what-is-a-docstring) instead of as comments above the function.
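For instance, the comment above your function could become a docstring like this. The snake_case name is just my own spelling of your function, and the body is left out since only the docstring matters here.

```python
def return_json_question(question_id):
    """Return the decoded JSON for the given question ID."""
    # Body omitted in this sketch; only the docstring is being illustrated.
    raise NotImplementedError
```

Unlike a comment, the docstring is available at runtime through help(return_json_question) and return_json_question.__doc__.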
Anyway, to actually answer your question…
You can get a seemingly random URLError when trying to open a URL for all kinds of reasons. Maybe there was a bug on the server during the handling of your request; maybe there was a connection problem and some data dropped; maybe the server was down for a few seconds while one of its admins changed a setting or pushed an update; maybe something else entirely. I’ve noticed after doing a little web development that some servers are much more reliable than others, but I figure that for most real-world purposes, you probably don’t need to worry about why. The simplest thing to do is just to retry the request until you succeed.
With all the above in mind, the code below will probably do what you need:
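A minimal sketch of such a retry loop follows. It is written with Python 3 names (urllib.request and urllib.error in place of urllib2), and the max_attempts, delay, opener and sleep parameters are assumptions added for illustration so that the loop can be bounded and exercised without hitting the real service.

```python
import json
import time
import urllib.error
import urllib.request

# URL template taken from the question; the appid value is a placeholder.
URL_TEMPLATE = ('http://answers.yahooapis.com/AnswersService/V1/getQuestion'
                '?appid=APPIDREMOVED8&question_id={0}&output=json')


def return_json_question(question_id, max_attempts=5, delay=10,
                         opener=urllib.request.urlopen, sleep=time.sleep):
    """Return the decoded JSON for a question ID, retrying on failure."""
    url = URL_TEMPLATE.format(question_id)
    for attempt in range(1, max_attempts + 1):
        try:
            response = opener(url)
            return json.loads(response.read().decode('utf-8'))
        except (urllib.error.URLError, ValueError):
            # URLError also covers HTTPError. ValueError covers an empty or
            # malformed body (json.JSONDecodeError is a ValueError subclass),
            # which matches the "200 but no JSON" case from the question.
            if attempt == max_attempts:
                raise  # out of attempts: let the caller see the real error
            sleep(delay)
```

Capping the number of attempts is a design choice on my part: retrying forever is simpler, but a cap stops the program from hanging if the service is really down or your key has been blocked.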
Hope this helps! If you’re going to be making a lot of different web requests in your program, you’ll probably want to abstract out this ‘retry request on exception’ logic into a function somewhere so that you don’t need to have the boilerplate retry logic mixed in with other stuff. 🙂
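One way that abstraction might look, as a rough sketch. The retry helper and all of its parameters are hypothetical names of my own, not something from urllib2 or the standard library.

```python
import time


def retry(func, exceptions=(Exception,), max_attempts=5, delay=1.0,
          sleep=time.sleep):
    """Call func() until it succeeds or max_attempts calls have failed.

    Only the exception types listed in `exceptions` trigger a retry;
    anything else propagates immediately.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except exceptions:
            if attempt == max_attempts:
                raise  # give up and surface the last error unchanged
            sleep(delay)
```

With something like this in place, each request shrinks to a single call such as retry(lambda: json.load(urllib2.urlopen(url)), exceptions=(urllib2.URLError, ValueError), delay=10), and the retry policy lives in one spot.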