I’ve got a problem finding empty HTML elements in a multiline HTML file. My regexp is this:
Pattern pattern = Pattern.compile("<([a-zA-Z][a-zA-Z0-9]*)[^>]*?>[\\s]*?</\\1>");
Matcher matcher = pattern.matcher(htmlOut);
while (matcher.find())
{
    htmlOut = matcher.replaceAll("");
    matcher = pattern.matcher(htmlOut);
}
The problem is it doesn’t match any of the empty tags.
FYI: The same regexp <([a-zA-Z][a-zA-Z0-9]*)[^>]*?>[\s]*?</\1> works in Sublime Text!
Any approach?
The pattern is OK, but you’re using it wrong.
Call replaceAll() on the string rather than on the matcher object. Also, there is no need to iterate over the matches: a single replaceAll() call is enough.
You don't need the lazy quantifiers either, though removing them wouldn't affect the match results.
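A minimal sketch of the single-call approach described above, assuming htmlOut is an ordinary String. The class name EmptyTagRemover, the helper removeEmptyElements, and the sample input are hypothetical, made up for illustration. The pattern is the one from the question with the lazy quantifiers dropped.

```java
class EmptyTagRemover {
    // Hypothetical helper: removes elements whose content is empty or
    // whitespace-only, using one String.replaceAll() call.
    // \\1 is a backreference to the tag name captured by group 1.
    static String removeEmptyElements(String html) {
        return html.replaceAll("<([a-zA-Z][a-zA-Z0-9]*)[^>]*>\\s*</\\1>", "");
    }

    public static void main(String[] args) {
        // Made-up multiline input: the <p> holds only whitespace, the <span> holds text.
        String htmlOut = "<div>\n<p class=\"x\">\n  </p>\n<span>text</span>\n</div>";
        htmlOut = removeEmptyElements(htmlOut);
        System.out.println(htmlOut);
    }
}
```

Because \s also matches line breaks, the whitespace-only <p> spanning two lines is removed without any extra flags. Note that a single pass only removes elements that are empty at the time of the call; an element that becomes empty once its empty child is gone would need another pass.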