Sorry guys, I've googled and still can't get my code to work. Not exactly a whiz with Java (yet, but give me time 🙂 ). I have an XML document that I am reading with a DOM parser to extract the class attributes, and now I need to exclude some of those attributes using a regex. For instance, my output so far is:
[[#text: ns1:Spare3]]
[[#text: ns1:Spare4]]
[[#text: ns1:Spare5]]
[[#text: ns1:Street]]
[[#text: ns1:Anything]]
[[#text: ns1:TearLineDateUpdated]]
[[#text: ns1:SourceReportTearline]]
[[#text: ns1:AnyFilter]]
[[#text: ns1:UpdatedByTelecom]]
[[#text: ns1:UpdatedByName]]
and I need to exclude those lines that contain the word Spare or start with TearLine (not case sensitive), plus a few others.
My code snippet (that I wrote to test with) says:
Pattern p = Pattern.compile(".*?\\Spare\\(.*?\\)",
Pattern.CASE_INSENSITIVE|Pattern.DOTALL | Pattern.MULTILINE);
Matcher m = p.matcher((nl.item(i)).toString());
if (m.matches())
{
System.out.println("["+nl.item(i)+"]" + "matched");
}
else
{
System.out.println("["+nl.item(i)+"]" + "not matched");
}
How do I exclude any lines that contain the word Spare and any lines that start with TearLine (TearLine can occur elsewhere in the word, and that's OK)?
Are those the actual strings you’re trying to match? That is, the DOM parser produced those strings, and now you’re applying the regex to them? If so, you want something like this:
output:
Notes:
I used find() instead of matches() so my regex only has to match the part that interests me, not the whole string.
Some of the other responders used ^TearLine because you said that word had to appear at the beginning of the line, but if my guess is right, you really want to match it right after the ns1: prefix. On the other hand, .*spare allows spare to appear anywhere, not just at the beginning (.*?spare works, too).
Similarly, Ωmega used "\\bSpare\\b" on the assumption that you were interested only in the complete word Spare. I left out the word boundaries (\b) because you seem to want to match things like Spare3 or (I’m guessing) FooSpare.
I don’t know why you added \\(.*?\\) to your regex, since there were no parentheses in your sample strings.
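For reference, the notes above can be sketched roughly as follows. This is not the code originally posted in this answer; it is a minimal reconstruction that assumes the strings look exactly like the sample output in the question. The class name SpareFilter, the method isExcluded, and the exact pattern are illustrative assumptions.

```java
import java.util.regex.Pattern;

class SpareFilter {
    // Assumed pattern (not from the original post): right after the "ns1:"
    // prefix, either "TearLine" immediately, or "spare" anywhere later on.
    private static final Pattern EXCLUDE =
            Pattern.compile("ns1:(?:TearLine|.*spare)", Pattern.CASE_INSENSITIVE);

    // find() only needs part of the string to match, unlike matches(),
    // which requires the pattern to cover the whole string.
    static boolean isExcluded(String text) {
        return EXCLUDE.matcher(text).find();
    }

    public static void main(String[] args) {
        // Sample strings copied from the question's output.
        String[] samples = {
            "[[#text: ns1:Spare3]]",
            "[[#text: ns1:Street]]",
            "[[#text: ns1:TearLineDateUpdated]]",
            "[[#text: ns1:SourceReportTearline]]",
            "[[#text: ns1:AnyFilter]]"
        };
        for (String s : samples) {
            System.out.println(s + (isExcluded(s) ? " excluded" : " kept"));
        }
    }
}
```

With this pattern the Spare3 and TearLineDateUpdated entries are excluded, while SourceReportTearline is kept because Tearline does not come directly after the ns1: prefix. If the prefix can vary, the pattern would need adjusting.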