My question is quite simple:
Is there a way to parse HTML in Java into a DOM Document if the HTML content contains unclosed tags like this img tag?
<p><img src="..."></p>
This is the code snippet that throws a SAXException while parsing these elements:
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
InputStream is = new ByteArrayInputStream( htmlcontent.getBytes());
Document dom = db.parse(is);
is.close();
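I don’t think so, at least not with DocumentBuilder: it is an XML parser and expects well-formed input, so an img tag that is never closed makes it throw a SAXException. jsoup can parse that kind of HTML, though. Its document model is not the W3C DOM API, but it’s quite similar.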
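To illustrate why the parse fails, here is a minimal sketch using only the JDK's own XML parser. It feeds the markup from the question to a DocumentBuilder twice, once as posted and once with the img tag self-closed. The class name StrictParseDemo and the helper parses are made up for this example, and UTF-8 is an assumed encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of any library.
public class StrictParseDemo {

    // Returns true if the JDK XML parser accepts the markup,
    // false if it rejects it with a SAXException.
    static boolean parses(String markup) {
        // UTF-8 is an assumption; use whatever encoding the content really has.
        try (InputStream is = new ByteArrayInputStream(markup.getBytes(StandardCharsets.UTF_8))) {
            DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            db.setErrorHandler(new DefaultHandler());
            db.parse(is);
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Well-formed: the img element is self-closed.
        System.out.println(parses("<p><img src=\"...\"/></p>"));
        // Not well-formed: the img element is never closed.
        System.out.println(parses("<p><img src=\"...\"></p>"));
    }
}
```

So if the content is under your control and can be written as well-formed XHTML, the original snippet works unchanged. For arbitrary real-world HTML, a lenient HTML parser such as jsoup is the more practical route.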