I need to be able to split an input String by commas, semi-colons or white-space (or a mix of the three). I would also like to treat multiple consecutive delimiters in the input as a single delimiter. Here’s what I have so far:
String regex = "[,;\\s]+";
return input.split(regex);
This works, except when the input string starts with one of the delimiter characters, in which case the first element of the result array is an empty String. I do not want my result to contain empty Strings, so an input like “,,,,ZERO; , ;;ONE ,TWO;,” should return just a three-element array containing the capitalized Strings.
Is there a better way to do this than stripping out any leading characters that match my reg-ex prior to invoking String.split?
Thanks in advance!
If by “better” you mean higher performance, then you might want to try creating a regular expression that matches what you want to keep (the tokens rather than the delimiters), calling Matcher.find in a loop, and pulling out the matches as you find them. This saves modifying the string first. But measure it for yourself to see which is faster for your data.
If by “better” you mean simpler, then no, I don’t think there is a simpler way than the one you suggested: removing the leading separators before applying the split.
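Here is a rough sketch of both approaches, in case it helps to compare them. The class name DelimiterSplit and the method names splitWithFind and splitAfterStrip are made up for this example; the token pattern "[^,;\\s]+" is simply the complement of the delimiter class from the question. Treat it as a starting point, not a tuned implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
class DelimiterSplit {
    // A token is a run of characters that are NOT commas, semicolons or whitespace.
    private static final Pattern TOKEN = Pattern.compile("[^,;\\s]+");
    // A run of delimiters anchored at the very start of the input.
    private static final Pattern LEADING = Pattern.compile("^[,;\\s]+");

    // Approach 1: match the tokens themselves with Matcher.find in a loop.
    static String[] splitWithFind(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens.toArray(new String[0]);
    }

    // Approach 2: strip leading delimiters, then split as in the question.
    static String[] splitAfterStrip(String input) {
        String trimmed = LEADING.matcher(input).replaceFirst("");
        // "".split(...) returns a one-element array holding "", so guard against it.
        if (trimmed.isEmpty()) {
            return new String[0];
        }
        // String.split already drops trailing empty strings.
        return trimmed.split("[,;\\s]+");
    }

    public static void main(String[] args) {
        String input = ",,,,ZERO; , ;;ONE ,TWO;,";
        System.out.println(Arrays.toString(splitWithFind(input)));   // prints [ZERO, ONE, TWO]
        System.out.println(Arrays.toString(splitAfterStrip(input))); // prints [ZERO, ONE, TWO]
    }
}
```

Both methods return an empty array when the input is empty or contains only delimiters. Which one is faster will depend on your data, so it is worth timing them on realistic input.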