I am using this pattern to remove all HTML tags (Java code):
String html="text <a href=#>link</a> <b>b</b> pic<img src=#>";
html=html.replaceAll("\\<.*?\\>", "");
System.out.println(html);
Now I want to keep the <a ...> tag (with its closing </a>) and the <img ...> tag.
I want the result to be:
text <a href=#>link</a> b pic<img src=#>
How can I do this?
I don't want to use an HTML parser,
because I need this regex pattern to filter a lot of HTML fragments,
so I want a regex-based solution.
You could do this using a negative lookahead:
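The pattern originally posted here is not reproduced, so what follows is only a minimal sketch of the negative-lookahead idea, not necessarily the exact expression that was linked. The class name TagFilter and the method stripExceptLinksAndImages are hypothetical names chosen for illustration. The lookahead (?!/?(?:a|img)\b) refuses to match when the tag name is a, /a or img, so those tags are left in place and every other tag is removed.

```java
import java.util.regex.Pattern;

class TagFilter {
    // Hypothetical helper, not taken from the original post.
    // Matches "<", then requires that what follows is NOT the tag name
    // "a", "/a" or "img" (the \b stops "abbr" from counting as "a"),
    // then consumes everything up to the next ">".
    private static final Pattern UNWANTED_TAG =
            Pattern.compile("<(?!/?(?:a|img)\\b)[^>]*>", Pattern.CASE_INSENSITIVE);

    static String stripExceptLinksAndImages(String html) {
        return UNWANTED_TAG.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        String html = "text <a href=#>link</a> <b>b</b> pic<img src=#>";
        System.out.println(stripExceptLinksAndImages(html));
        // prints: text <a href=#>link</a> b pic<img src=#>
    }
}
```

This sketch assumes that no attribute value contains a literal ">" and that the fragments contain no comments or script blocks; input like that is where a regex approach starts to break down.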
Rubular
However, this approach has a number of problems, and I would recommend using an HTML parser instead if you want a robust solution.
For more information see this question: