I’m searching for state abbreviations in a string. Here’s an example input string:
String inputStr = 'Albany, NY + Chicago, IL and IN, NY, OH and WI';
The pattern that I’m using to match state abbreviations is:
String patternStr = '(^|\\W|\\G)[a-zA-Z]{2}($|\\W)';
I’m looping through the matches and stripping out the non-alpha characters during the loop, but I know that I should be able to do that in one pass. Here’s the current approach:
Pattern myPattern = Pattern.compile(patternStr);
Matcher myMatcher = myPattern.matcher(inputStr);
// Second pattern, used only to strip the non-alpha characters from each raw match
Pattern alphasOnly = Pattern.compile('[a-zA-Z]+');
String[] states = new String[]{};
while (myMatcher.find()) {
    // The raw match still includes the surrounding non-word characters, e.g. ' NY '
    String rawMatch = inputStr.substring(myMatcher.start(), myMatcher.end());
    Matcher alphaMatcher = alphasOnly.matcher(rawMatch);
    while (alphaMatcher.find()) {
        states.add(rawMatch.substring(alphaMatcher.start(), alphaMatcher.end()));
    }
}
System.debug(states);
|DEBUG|(NY, IL, IN, NY, OH, WI)
This works, but it’s verbose and probably inefficient. What’s the one-pass way to get this done in Java/Apex?
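You need to use Matcher.group(). Put the two letters in a capturing group and read them back with group(1) instead of taking a substring of the whole match. Try this: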
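A minimal Java sketch of that idea. The class name StateAbbreviations and the extract helper are made up for illustration, and the \b word boundaries are a simplification swapped in for the (^|\W|\G) and ($|\W) alternations from the question; they should behave the same on this input. Apex has the same Pattern and Matcher calls, so the loop should carry over with single-quoted strings and System.debug in place of println.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StateAbbreviations {
    // Group 1 captures only the two letters. The \b anchors assert a word
    // boundary without consuming any characters (assumption: this replaces
    // the (^|\W|\G) / ($|\W) alternations used in the question).
    private static final Pattern STATE = Pattern.compile("\\b([a-zA-Z]{2})\\b");

    // Hypothetical helper: returns every two-letter word found in the input.
    static List<String> extract(String input) {
        List<String> states = new ArrayList<>();
        Matcher m = STATE.matcher(input);
        while (m.find()) {
            states.add(m.group(1)); // the captured letters, no second pass needed
        }
        return states;
    }

    public static void main(String[] args) {
        String inputStr = "Albany, NY + Chicago, IL and IN, NY, OH and WI";
        System.out.println(String.join(" ", extract(inputStr)));
    }
}
```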
Output: NY IL IN NY OH WI
In a real system, you’d want to verify each match against a list of valid state abbreviations; otherwise any two-letter word, such as “at” or “my”, would be picked up as well.
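One way that check could look, as a rough sketch rather than a finished implementation. The class name ValidatedStateAbbreviations and the extractValid helper are hypothetical. The set holds the 50 state codes only; DC and the territories are left out and would need adding if you care about them. The lookup is deliberately case-sensitive, which is an assumption on my part: it keeps lowercase words like "in", "or" and "me" from being read as Indiana, Oregon and Maine.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ValidatedStateAbbreviations {
    private static final Pattern TWO_LETTERS = Pattern.compile("\\b([a-zA-Z]{2})\\b");

    // The 50 US state postal codes (DC and territories intentionally omitted).
    private static final Set<String> VALID = new HashSet<>(Arrays.asList(
        "AL", "AK", "AZ", "AR", "CA", "CO", "CT", "DE", "FL", "GA",
        "HI", "ID", "IL", "IN", "IA", "KS", "KY", "LA", "ME", "MD",
        "MA", "MI", "MN", "MS", "MO", "MT", "NE", "NV", "NH", "NJ",
        "NM", "NY", "NC", "ND", "OH", "OK", "OR", "PA", "RI", "SC",
        "SD", "TN", "TX", "UT", "VT", "VA", "WA", "WV", "WI", "WY"));

    // Hypothetical helper: keeps only matches that are known state codes.
    static List<String> extractValid(String input) {
        List<String> states = new ArrayList<>();
        Matcher m = TWO_LETTERS.matcher(input);
        while (m.find()) {
            String candidate = m.group(1);
            if (VALID.contains(candidate)) { // case-sensitive on purpose
                states.add(candidate);
            }
        }
        return states;
    }

    public static void main(String[] args) {
        // "me", "in", "or", "at" and "my" all match the pattern but are filtered out.
        System.out.println(extractValid("Meet me in Albany, NY or at my place in OH"));
        // prints [NY, OH]
    }
}
```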