I am trying to write a program to parse Java Garbage Collection logs. I just created a grammar that matches a minor collection. Once I have identified a pattern I would like to parse it into individual tokens. My question is, is there any elegant way to do this with my previously defined grammar?
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexTestHarness {
    // Matches one minor (ParNew) collection line; literal dots are escaped as \\.
    private final static String REGEX_SMALL_COLLECTION = "\\d+\\.\\d+: \\[GC \\d+\\.\\d+: \\[ParNew: \\d+K\\-\\>0K\\(\\d+K\\), \\d+\\.\\d+ secs\\] \\d+K\\-\\>\\d+K\\(\\d+K\\), \\d+\\.\\d+ secs\\]";

    public static void main(String[] args) {
        Pattern pattern = Pattern.compile(REGEX_SMALL_COLLECTION);
        Matcher matcher = pattern.matcher("54.770: [GC 54.770: [ParNew: 5232768K->0K(5237824K), 1.1304192 secs] 5238622K->380448K(10480704K), 1.1306410 secs]");
        while (matcher.find()) {
            System.out.println(matcher.group(0)); // the whole matched line
            System.out.println(matcher.start());  // index where the match begins
            System.out.println(matcher.end());    // index just past the match
        }
    }
}
You need to add groups to your regex.
private final static String REGEX_SMALL_COLLECTION = "(\\d+\\.\\d+): \\[GC (\\d+\\.\\d+): \\[ParNew: \\d+K\\-\\>0K\\(\\d+K\\), \\d+\\.\\d+ secs\\] \\d+K\\-\\>\\d+K\\(\\d+K\\), \\d+\\.\\d+ secs\\]";
Then access the groups to get the values. In the example above, I added parentheses around the first two items you want; this tells the regex engine to capture the matching substrings. You will need to add more. As you are currently doing, use Matcher.group() to get each group. Note that group 0 is always the entire match. The rest are numbered from 1 up, in order of their opening parentheses.
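To make the group numbering concrete, here is a minimal sketch that puts a capture group around every numeric value in the sample line and reads them back with Matcher.group(int). The class name GcLineParser, the parse helper, and the field labels in the comments are my own illustrative choices, not anything from the question. The sketch also captures the post-collection ParNew size with \\d+ instead of the literal 0, which is an assumption about what you want to parse.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named only for this illustration.
class GcLineParser {
    // Groups, numbered by the position of their opening parenthesis:
    //  1 timestamp          2 GC timestamp
    //  3 ParNew before (K)  4 ParNew after (K)   5 ParNew capacity (K)
    //  6 ParNew seconds
    //  7 heap before (K)    8 heap after (K)     9 heap capacity (K)
    // 10 total seconds
    private static final Pattern MINOR_GC = Pattern.compile(
        "(\\d+\\.\\d+): \\[GC (\\d+\\.\\d+): "
        + "\\[ParNew: (\\d+)K->(\\d+)K\\((\\d+)K\\), (\\d+\\.\\d+) secs\\] "
        + "(\\d+)K->(\\d+)K\\((\\d+)K\\), (\\d+\\.\\d+) secs\\]");

    // Returns groups 1..groupCount() as strings, or null if the line does not match.
    static String[] parse(String line) {
        Matcher m = MINOR_GC.matcher(line);
        if (!m.find()) {
            return null;
        }
        String[] fields = new String[m.groupCount()];
        for (int i = 1; i <= m.groupCount(); i++) {
            fields[i - 1] = m.group(i); // group 0 (the whole match) is skipped
        }
        return fields;
    }

    public static void main(String[] args) {
        String line = "54.770: [GC 54.770: [ParNew: 5232768K->0K(5237824K), 1.1304192 secs] "
            + "5238622K->380448K(10480704K), 1.1306410 secs]";
        String[] fields = parse(line);
        if (fields == null) {
            System.out.println("no match");
            return;
        }
        for (int i = 0; i < fields.length; i++) {
            System.out.println("group " + (i + 1) + " = " + fields[i]);
        }
    }
}
```

From there you can convert each string with Long.parseLong or Double.parseDouble as needed. If the numbering becomes hard to keep track of, Java 7 and later also support named groups, written (?<name>...) in the pattern and read with Matcher.group("name").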