I’m trying to get content from external feeds on my Django web site with Universal Feed Parser. I want to have some user error handling, e.g. if the user supplies a URL that is not a feed. When I tested how feedparser responds to faulty input, I was surprised to see that it does not raise any exceptions at all. For example, on HTML content it tries to parse some information from the HTML code, and on non-existent domains it returns a mostly empty dictionary:
{'bozo': 1,
'bozo_exception': URLError(gaierror(-2, 'Name or service not known'),),
'encoding': 'utf-8',
'entries': [],
'feed': {},
'version': None}
Other kinds of faulty input manifest themselves in the status_code or the namespaces values in the returned dictionary.
So, what’s the best approach to have sane error checking without resorting to an endless cascade of if .. elif .. elif ...?
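According to the Bozo Detection section of the feedparser documentation, the parser does not raise when something goes wrong: it sets the “bozo bit” (bozo) on the result and stores the exception in bozo_exception. (In my opinion, it’s not a very good practice to catch all exceptions and return them in another form, but that’s just the way it works since “applications may just warn about non-well-formed feeds”.)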
So, after trying to parse a feed at any URL, you may check for the “bozo bit” and re-raise the corresponding exception:
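A minimal sketch of that check, assuming the result behaves like the dictionary shown in the question (feedparser's result object supports `.get`). The helper names `raise_if_bozo` and `fetch_feed` are made up for illustration and are not part of feedparser:

```python
def raise_if_bozo(parsed):
    """Re-raise the exception feedparser recorded, if the bozo bit is set."""
    if parsed.get("bozo"):
        exc = parsed.get("bozo_exception")
        if isinstance(exc, BaseException):
            raise exc
        # Bozo bit set but no exception object stored: raise a generic error.
        raise ValueError("feed is not well-formed")
    return parsed


def fetch_feed(url):
    """Parse a feed URL and surface feedparser's stored error as a real exception."""
    # Imported here so raise_if_bozo can be used and tested on plain dicts.
    import feedparser

    return raise_if_bozo(feedparser.parse(url))
```

In a Django view you could then wrap `fetch_feed(url)` in an ordinary `try`/`except` and turn the exception into a form error. Keep in mind that the bozo bit may also be set for problems you might prefer to tolerate, so re-raising unconditionally is the strict option.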
You may handle the exception according to its type and message, or by making assertions on other attributes of the object returned by feedparser.parse (such as: f.feed must be non-empty, f.status must equal 200, f.entries must be non-empty, f.version must be a valid feed format version, etc.), whatever seems most reasonable to your application.
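One way to run such assertions without an if .. elif cascade is to keep them in a table of (check, message) pairs. This is only a sketch: `FeedError`, `CHECKS` and `validate_feed` are hypothetical names, and which checks you include depends on your application:

```python
class FeedError(Exception):
    """Hypothetical application-level error for unusable feeds."""


# Each entry pairs a predicate on the parsed result with a user-facing message.
CHECKS = [
    # "status" is only present when the feed was fetched over HTTP.
    (lambda f: f.get("status") == 200, "HTTP status is not 200"),
    # "version" is empty or None when no known feed format was detected.
    (lambda f: bool(f.get("version")), "unrecognized feed format"),
    (lambda f: bool(f.get("feed")), "feed metadata is empty"),
    (lambda f: bool(f.get("entries")), "feed has no entries"),
]


def validate_feed(parsed):
    """Collect every failed check and raise a single FeedError listing them."""
    problems = [message for check, message in CHECKS if not check(parsed)]
    if problems:
        raise FeedError("; ".join(problems))
    return parsed
```

Calling `validate_feed(feedparser.parse(url))` then either returns the parsed result or raises one exception whose message lists everything that looked wrong, which is convenient to show to the user who supplied the URL.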