I’m not understanding why my regex pattern doesn’t seem to work. Here is an example:
String token = "23030G40KT";
Pattern p = Pattern.compile("(\\d{3}|VRB)|(\\d{2,3})|(G\\d{2,3})?|(KT|MPS|KMH)");
Matcher m = p.matcher(token);
while (m.find()) {
    System.out.println(m.group());
}
That prints out:
230
30
G40
(With two following blank lines that aren’t showing here)
I’d like to print:
230
30
G40
KT
with no blank lines. What do I need to change?
The reason your original regex doesn’t work is described very well in other answers, such as @Reimus’s. However, I want to help you simplify it further. Your regex looks complicated but is actually very simple if you break it down.
Let’s talk about what your original regex does:
\\d{3}: three digits
|: or
VRB: “VRB”
|: or
\\d{2,3}: 2 or 3 digits
|: or
(G\\d{2,3})?: optionally, “G” followed by 2 or 3 digits
|: or
(KT|MPS|KMH): “KT” or “MPS” or “KMH”
So basically you just have a bunch of things or’d together. Some of them are redundant (such as “3 digits” and “2 or 3 digits”). Combine them and you get fewer cases, with no grouping needed.
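To see the breakdown in action, here is a small sketch that runs the original pattern and then the same pattern with only the trailing `?` removed from the `G` group. The class name `WindTokenDemo` and the helper `findAll` are made up for this illustration, and the second pattern is just the minimal change, not the simplified regex this answer is heading towards. The assumption being illustrated is that an optional alternative can match the empty string, so the alternatives to its right are never tried.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WindTokenDemo {
    // Hypothetical helper: collects every match that find() reports,
    // including empty matches.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String token = "23030G40KT";

        // The pattern from the question: the third alternative is optional,
        // so it can succeed with an empty match before (KT|MPS|KMH) is tried.
        String original = "(\\d{3}|VRB)|(\\d{2,3})|(G\\d{2,3})?|(KT|MPS|KMH)";

        // Same pattern with the '?' dropped: every alternative must now
        // consume at least two characters.
        String withoutOptional = "(\\d{3}|VRB)|(\\d{2,3})|(G\\d{2,3})|(KT|MPS|KMH)";

        System.out.println(findAll(original, token));
        System.out.println(findAll(withoutOptional, token));
    }
}
```

The second call prints `[230, 30, G40, KT]`, while the first list ends in empty strings where `KT` should be.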
You can achieve the same results with this simpler regex: