I tried using ActiveResource to parse a web service whose response was more like an HTML document, and I kept getting a 404 error.
Do I need to use an XML parser for this task instead of ActiveResource?
My guess is that ActiveResource is only useful if you are consuming data from another Rails app and the XML data maps easily onto a Rails model. If, on the other hand, the web service returns more free-form XML, such as an HTML document or an RSS feed, you would want to use a parser like Hpricot or Nokogiri. Is this correct?
How do you know when to use an XML parser and when to use ActiveResource?
Update: ActiveResource is also not an XML parser. It is a REST consumer that lets you interact with a remote resource much as you would with an ActiveRecord model. It does use an XML parser under the hood (I'm assuming through ActiveSupport's XmlMini, described below).
ActiveResource has some strict requirements about the structure of the XML content and works best when interacting with the REST API of another Rails application. It is not intended to do generic screen scraping of an HTML page. For that use Nokogiri directly.
ActiveSupport isn’t an XML parser; it is a miscellaneous collection of useful Ruby methods and classes. However, it does offer a wrapper around several different XML parsers, giving you a consistent interface.
You can see which XML parser is being used, and switch to a different one, through ActiveSupport's XmlMini in script/console.
However, selecting the Nokogiri backend there will still use the XML parser in Nokogiri, which assumes strict, valid markup. Most HTML pages do not meet this requirement, so it is better to use Nokogiri’s HTML parser directly instead of going through ActiveSupport.