I use Java 7.
I would like to extract the language and the country from a String that represents a bundle or properties file name. The file name doesn’t contain the extension.
For example:
- bundle -> empty string or null
- bundle_en -> en
- bundle_en_US -> en_US
- complicated_bundle_name_en_US -> en_US
I tried this but it doesn’t give me the expected result.
    private static void testPattern(String bundleName) {
        final Pattern pattern = Pattern.compile(".+(_[a-z]{2,3}(_[A-Z]{2,3}){0,1}){0,1}");
        final Matcher matcher = pattern.matcher(bundleName);
        if (matcher.matches()) {
            for (int i = 0; i < matcher.groupCount(); ++i) {
                System.out.println("Group " + i + " = " + matcher.group(i));
            }
        } else {
            System.out.println("Nothing");
        }
    }
For “bundle_en_US” it shows:
Group 0 = bundle_en_US
Group 1 = null
Can you help me correct my regex, or does something like this already exist in the Java core?
Thanks.
The problem, IMO, is that the .+ at the beginning already matches the whole sequence, which leaves nothing for the optional groups to match. Use a reluctant quantifier (.+?) instead.
Edit: The finest solution, in my view, uses non-capturing groups: it captures solely the language and country code and throws out the “_”. Additionally, you should change your for condition to i <= matcher.groupCount(), otherwise you miss the last group. Using the last version and the altered for condition you get:
Input: bundle
Output:
Input: bundle_en
Output:
Input: bundle_en_US
Output:
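As a minimal sketch of the approach described above (reluctant prefix plus non-capturing groups), something like the following should work on Java 7. The exact pattern, the class name BundleLocaleSketch and the helper extractLocale are my own assumptions for illustration, not necessarily the version originally posted.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BundleLocaleSketch {

    // Assumed pattern: a reluctant prefix (.+?), then an optional "_lang"
    // followed by an optional "_COUNTRY". The (?:...) groups are non-capturing,
    // so group 1 is the language and group 2 the country, without underscores.
    private static final Pattern PATTERN =
            Pattern.compile(".+?(?:_([a-z]{2,3})(?:_([A-Z]{2,3}))?)?");

    // Hypothetical helper: returns "en", "en_US", or "" when no suffix is found.
    public static String extractLocale(String bundleName) {
        Matcher matcher = PATTERN.matcher(bundleName);
        if (!matcher.matches() || matcher.group(1) == null) {
            return "";
        }
        String language = matcher.group(1);
        String country = matcher.group(2);
        return country == null ? language : language + "_" + country;
    }

    public static void main(String[] args) {
        String[] names = {
            "bundle", "bundle_en", "bundle_en_US", "complicated_bundle_name_en_US"
        };
        for (String name : names) {
            System.out.println(name + " -> \"" + extractLocale(name) + "\"");
        }
        // Prints:
        // bundle -> ""
        // bundle_en -> "en"
        // bundle_en_US -> "en_US"
        // complicated_bundle_name_en_US -> "en_US"
    }
}
```

One caveat with any pattern of this shape: a base name whose last segment is two or three lowercase letters (for example my_app) is indistinguishable from a language suffix, so it would be reported as language "app".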