I’m decent with regexes but stumped here. I’m having trouble with group 2, below. However, I think this should be fairly easy for a regex guru…
Problem
I’m trying to match zero or more instances of a set of keywords, in any order.
[Update: For future reference]
The simplest solution (derived from black panda’s response) is:
((keyword1 | keyword2 | keyword3 )*)
Note: the space after each word is essential!
In my case, this translated into:
((static |final )*)
This is the bare-bones, simplest answer. The better, more performant approach is in black panda’s response, below. It allows for any amount of whitespace and is faster for an RE engine to process.
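A minimal sketch of how the bare-bones pattern behaves, assuming Python's `re` module (the question does not name an engine, and the helper name `leading_modifiers` is made up for illustration). It also shows why the single literal space is a limitation: a double space ends the run early.

```python
import re

# The bare-bones pattern from the update: an outer capturing group wraps the
# repeated alternation, so the whole run of keywords lands in group 1.
SIMPLE_GROUP = re.compile(r"((static |final )*)")

def leading_modifiers(text):
    """Return the run of 'static ' / 'final ' keywords at the start of text."""
    return SIMPLE_GROUP.match(text).group(1)

print(repr(leading_modifiers("static final int ONE = 1;")))  # 'static final '
print(repr(leading_modifiers("int FIVE = 5;")))              # ''
# Two spaces between the keywords: only the first keyword is picked up.
print(repr(leading_modifiers("static  final int X = 0;")))   # 'static '
```

Note that the captured text keeps its trailing space, so it may need trimming before use.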
Input
I need to split the following input into very specific groups.
Note: the list markers are not part of the input. That is, each input line starts with the letter p.
- public static final int ONE = 1;
- public final static int TWO = 2;
- public final int THREE = 3;
- public static int FOUR = 4;
- private int FIVE = 5;
Groups
I need to break the input into match groups such that
group 1 = public or private or protected
group 2 = 0 or more instances of “static” or “final” <-- the group I’m struggling with
group 3 = data type
group 4 = variable name
group 5 = value
Group 2 Details
Given the input above, group 2 would be as follows:
- static final
- final static
- final
- static
- [empty string]
Failed Solutions
This is the regex I came up with, and it doesn’t work for group 2:
^.*(public|private|protected)\s+(static\s+|final\s+)*\s+([^ ]+)\s+([^ ]+)\s*(;|=)(.*)$
For group 2, I’ve tried:
- (static\s+|final\s+)*
- (static|final)*\s+
- (static |final )*
- (static\ |final\ )*
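For reference, a quick check of the full regex above, sketched in Python's `re` module (an assumption, since no engine is named), points at two separate problems. First, the `\s+` that follows group 2 demands whitespace that the keyword alternative has already consumed, so single-spaced input does not match at all. Second, when the input does match, a repeated capturing group keeps only its last iteration.

```python
import re

# The full regex from the question, unchanged.
FAILED = re.compile(
    r"^.*(public|private|protected)\s+(static\s+|final\s+)*\s+"
    r"([^ ]+)\s+([^ ]+)\s*(;|=)(.*)$"
)

single_spaced = "public static final int ONE = 1;"
double_spaced = "public static final  int ONE = 1;"  # two spaces before "int"

# Problem 1: with single spaces, the extra \s+ after group 2 has nothing left
# to match, so the whole expression fails.
no_match = FAILED.match(single_spaced)
print(no_match)  # None

# Problem 2: with a spare space the regex matches, but group 2 holds only the
# last keyword that the repeated group matched.
m = FAILED.match(double_spaced)
print(repr(m.group(2)))  # 'final '
```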
Summary
What regular expression for “group 2” matches zero or more instances of the words “static” or “final”? A proper solution would be expandable to match any subset of a word list such as [static, final, transient, volatile].
Can you grab everything in between, and make sure groups 3 and greater exist?
group 2 =
((?:(?:static|final|transient|volatile)\s+)*)
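A rough sketch of that group 2 dropped into a complete pattern, written for Python's `re` module. Only the group 2 expression comes from this answer; the surrounding groups, the `FIELD` pattern, and the `parse_field` helper are assumptions made up for illustration, and they assume every line has the form `access [modifiers] type name = value;`.

```python
import re

# Group 2 from the answer: non-capturing groups do the repetition, and one
# outer capturing group collects the whole run of keywords.
MODIFIERS = r"((?:(?:static|final|transient|volatile)\s+)*)"

# Hypothetical surrounding pattern (not from the original post).
FIELD = re.compile(
    r"^\s*(public|private|protected)\s+"
    + MODIFIERS
    + r"(\S+)\s+(\S+)\s*=\s*(.*?);\s*$"
)

def parse_field(line):
    """Split a field declaration into the five groups, or return None."""
    m = FIELD.match(line)
    if m is None:
        return None
    access, modifiers, data_type, name, value = m.groups()
    # split() drops the trailing whitespace and yields [] for an empty run.
    return access, modifiers.split(), data_type, name, value

print(parse_field("public static final int ONE = 1;"))
# ('public', ['static', 'final'], 'int', 'ONE', '1')
print(parse_field("private int FIVE = 5;"))
# ('private', [], 'int', 'FIVE', '5')
```

Because each keyword is followed by `\s+` inside the group, no extra `\s+` is needed between group 2 and the data type, and any amount of whitespace between keywords is accepted.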