I’m trying to get the text between the tags <div>Text Here</div>:
<div id="tt" class="info">
Text Here
</div>
Output: Text Here
How can I achieve this using regex in Java? Thanks.
EDIT:
I’m using HtmlUnit:
currentPage.getElementById("tt").asXml();
currentPage.getElementById("tt").asText(); // returns ""
With regular expressions, you can use the following:
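A minimal sketch of the regex approach, assuming the div is identified by id="tt" with double-quoted attributes and contains no nested div. The class name DivTextRegex and the method extract are illustrative names, not part of any library:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DivTextRegex {
    // DOTALL lets '.' match the newlines around the text inside the div.
    // The lazy (.*?) stops at the first closing </div>, so nested divs are not handled.
    private static final Pattern DIV_TT = Pattern.compile(
            "<div[^>]*\\bid\\s*=\\s*\"tt\"[^>]*>(.*?)</div>",
            Pattern.DOTALL);

    // Returns the trimmed text inside the div, or null if no such div is found.
    static String extract(String html) {
        Matcher m = DIV_TT.matcher(html);
        return m.find() ? m.group(1).trim() : null;
    }

    public static void main(String[] args) {
        String html = "<div id=\"tt\" class=\"info\">\nText Here\n</div>";
        System.out.println(extract(html));
    }
}
```

With HtmlUnit, the input string could be the result of the asXml() call from the question. The pattern is fragile: single-quoted attributes or a nested div would break it.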
However, a better solution would be to parse the HTML into XHTML (using JTidy, for example) and then extract the required text using XPath (//div[@id = 'tt']/text()). Something along these lines:
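A sketch of the XPath step only, using the JDK's built-in javax.xml APIs. It assumes the markup has already been cleaned into well-formed XHTML without a default namespace (the JTidy conversion is not shown here), and the names DivTextXPath and extract are illustrative:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

public class DivTextXPath {
    // Parses well-formed XHTML and returns the trimmed text of the div with id "tt".
    // Returns an empty string if the div is not found.
    static String extract(String xhtml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xhtml.getBytes(StandardCharsets.UTF_8)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Evaluating as a string yields the first matching text node.
            return xpath.evaluate("//div[@id = 'tt']/text()", doc).trim();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse or query the document", e);
        }
    }

    public static void main(String[] args) {
        String xhtml = "<html><body><div id=\"tt\" class=\"info\">\nText Here\n</div></body></html>";
        System.out.println(extract(xhtml));
    }
}
```

If the cleaned document declares the XHTML namespace, the unprefixed //div would not match and the expression would need a namespace-aware variant.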