This has become a real pain in my backside.
The URL I’m trying to parse is http://torrentz.eu/feed_verifiedP?q=ubuntu
Here’s a shortened version of the XML:
<?xml version="1.0"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
<title>Torrentz - ubuntu</title>
<link>http://torrentz.eu/verified?q=ubuntu</link>
<description>ubuntu search</description>
<language>en-us</language>
<atom:link href="http://torrentz.eu/feed_verifiedP?q=ubuntu" rel="self" type="application/rss+xml" />
<item>
<title>ubuntu 11 10 desktop i386 iso</title>
<link>http://torrentz.eu/8ac3731ad4b039c05393b5404afa6e7397810b41</link>
<guid>http://torrentz.eu/8ac3731ad4b039c05393b5404afa6e7397810b41</guid>
<pubDate>Thu, 13 Oct 2011 15:02:06 +0000</pubDate>
<category>apps linux applications os software</category>
<description>Size: 695 MB Seeds: 4,613 Peers: 161 Hash: 8ac3731ad4b039c05393b5404afa6e7397810b41</description>
</item>
</channel>
</rss>
My code:
SAXParserFactory spf = SAXParserFactory.newInstance();
SAXParser sp = spf.newSAXParser();
XMLReader xr = sp.getXMLReader();
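// Get torrents (url is assumed to be a java.net.URL pointing at the feed)
XMLTorrentsRSSHandler torrentsHandler = new XMLTorrentsRSSHandler();
xr.setContentHandler(torrentsHandler);
InputStream in = url.openStream();
try {
    xr.parse(new InputSource(in));
} finally {
    in.close(); // release the connection even if parsing fails
}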
XMLTorrentsRSSParsedDataSet parsedTorrentsDataSet = torrentsHandler.getParsedData();
I keep getting this exception:
org.apache.harmony.xml.ExpatParser$ParseException: At line 1, column 53: mismatched tag
Why the flip does it torment me like this!?
EDIT: This method was working fine until today. Perhaps the website changed, but where is this flippin’ mismatched tag?
Why do you have Harmony on your build path? Your code works fine with the built-in SAXParser in Oracle’s JDK 7u3. If there isn’t a reason to use the Harmony implementation, you should revert to the standard one.
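As a rough way to check the sample itself, here is a minimal sketch that feeds a trimmed copy of the XML from the question to the JDK's built-in SAX parser. The names `RssTitleSketch`, `TitleCollector`, and `parseItemTitles` are hypothetical. Since `XMLTorrentsRSSHandler` was not posted, the handler here is only a stand-in that collects item titles, and the XML is read from a string rather than from the live URL.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class RssTitleSketch {
    // Trimmed copy of the feed from the question (one item, fewer fields).
    static final String SAMPLE =
        "<?xml version=\"1.0\"?>"
        + "<rss version=\"2.0\" xmlns:atom=\"http://www.w3.org/2005/Atom\">"
        + "<channel><title>Torrentz - ubuntu</title>"
        + "<atom:link href=\"http://torrentz.eu/feed_verifiedP?q=ubuntu\""
        + " rel=\"self\" type=\"application/rss+xml\" />"
        + "<item><title>ubuntu 11 10 desktop i386 iso</title>"
        + "<category>apps linux applications os software</category></item>"
        + "</channel></rss>";

    // Hypothetical stand-in for XMLTorrentsRSSHandler: collects <item><title> text.
    static class TitleCollector extends DefaultHandler {
        final List<String> titles = new ArrayList<>();
        private final StringBuilder text = new StringBuilder();
        private boolean inItem;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if (qName.equals("item")) {
                inItem = true;
            }
            text.setLength(0); // start collecting text for the new element
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length); // may be called several times per element
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (qName.equals("item")) {
                inItem = false;
            } else if (inItem && qName.equals("title")) {
                titles.add(text.toString().trim());
            }
        }
    }

    // Parses the given XML text and returns the titles found inside <item> elements.
    static List<String> parseItemTitles(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        TitleCollector handler = new TitleCollector();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.titles;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parseItemTitles(SAMPLE)); // prints [ubuntu 11 10 desktop i386 iso]
    }
}
```

If this parses cleanly while the live request still fails, the difference is more likely in what the server actually returns than in the sample shown in the question.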
Testcase form: