OK, so I have been searching for hours for a solution to my problem, but nothing seems to come up.
Here’s my code snippet, followed by the problem:
Pattern forKeys = Pattern.compile("^<feature>\\s*<name>Deviation</name>.*?</feature>", Pattern.DOTALL|Pattern.MULTILINE);
Matcher n = forKeys.matcher("");
String aLine = null;
while ((aLine = in.readLine()) != null) {
    n.reset(aLine);
    String result = n.replaceAll("");
    out.write(result);
    out.newLine();
}
Let’s just assume the undeclared variables are already declared.
My point is that my regex (and maybe the matcher as well) is not working properly.
I want to erase every part of the form “<feature><name>Deviation</name>*any characters here*</feature>” from the following lines:
<feature>
<name>Deviation</name>
<more words here>
</feature>
<feature>
<name>Average</name>
</feature>
<feature>
<name>Deviation</name>
sample words
</feature>
I think my problem is the use of repetition operators (how to match across line breaks, tabs, etc.), but I can’t seem to find the correct expression.
Any ideas? Thanks in advance.
Parsing HTML or XML with regex is evil and error-prone.
Use an XML parser and things will work much better.
Here’s a solution for your problem using Dom4J:
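As a rough sketch of the parser-based approach, the following uses the JDK's built-in DOM and XPath classes in place of Dom4J, so it runs without extra dependencies. It rests on a few assumptions that are not in the question: the input is well-formed XML, the <feature> elements sit inside a single root element (called <features> here purely for illustration), and the class and method names are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original answer.
class DeviationFeatureRemover {

    // Parses the XML, removes every <feature> whose <name> child reads
    // "Deviation", and returns the remaining document as a string.
    static String removeDeviationFeatures(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // The XPath result is a snapshot, so nodes can be removed while iterating.
        NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate("//feature[normalize-space(name)='Deviation']",
                          doc, XPathConstants.NODESET);

        for (int i = 0; i < hits.getLength(); i++) {
            Node feature = hits.item(i);
            feature.getParentNode().removeChild(feature);
        }

        // Serialize the modified tree back to text without an XML declaration.
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Assumed wrapper element <features>; the question shows bare siblings.
        String xml = "<features>"
                + "<feature><name>Deviation</name>sample words</feature>"
                + "<feature><name>Average</name></feature>"
                + "</features>";
        System.out.println(removeDeviationFeatures(xml));
    }
}
```

Because the parser works on the element tree, line breaks, tabs and indentation inside a <feature> no longer matter. If the real file has no single root element, or contains text such as "<more words here>" that is not valid markup, it would need to be repaired or wrapped before a parser will accept it.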
Apart from that you are also making a mistake (see my comments):
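Your regex might or might not work if you read the entire file into a String, but it can’t work if you apply it to individual lines: readLine() hands the matcher one line at a time, so a pattern that has to span from <feature> to </feature> across several lines never sees the whole block.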
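To illustrate the point about lines versus the whole input, here is a minimal sketch that joins all lines into one String before matching. It reuses the pattern from the question and adds an optional trailing line break (\R?, available since Java 8) so removed blocks do not leave empty lines behind; that addition, the class name and the method name are assumptions made for the example.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical demo class, not part of the original answer.
class WholeInputRegexDemo {

    // Pattern from the question, plus \R? to swallow the line break
    // that follows a removed block (an addition for this sketch).
    static final Pattern FOR_KEYS = Pattern.compile(
            "^<feature>\\s*<name>Deviation</name>.*?</feature>\\R?",
            Pattern.DOTALL | Pattern.MULTILINE);

    // Reads everything from the reader, then applies the regex once
    // to the complete text instead of once per line.
    static String stripDeviation(BufferedReader in) {
        String whole = in.lines().collect(Collectors.joining("\n"));
        return FOR_KEYS.matcher(whole).replaceAll("");
    }

    public static void main(String[] args) {
        String sample = String.join("\n",
                "<feature>",
                "<name>Deviation</name>",
                "<more words here>",
                "</feature>",
                "<feature>",
                "<name>Average</name>",
                "</feature>",
                "<feature>",
                "<name>Deviation</name>",
                "sample words",
                "</feature>");
        System.out.print(stripDeviation(new BufferedReader(new StringReader(sample))));
    }
}
```

Note that the ^ anchor means a <feature> tag with leading indentation would not match, and the lazy .*? stops at the first </feature> it finds, so nested <feature> elements would be cut in the wrong place. Those limits are part of why the parser route is the safer one.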