I want to use split and regular expressions together to separate special codes in a line.
This is my line:
14S15T3C16W17A0-20m0-7T
Now I want to separate out each item; the items could be, e.g., 14S, 15T, 7T, etc.
Each item consists of a run of digits of arbitrary length followed by a single letter:
E.g.: 125125125125125X or 11T.
There is also one exception, 0-: these remain as they are and must be separated out too.
I have made a regular expression myself:
Dim digits() As String = Regex.Split(line, "([0-9][A-Z]|0-)")
But the problem is that it only takes 1 digit of the combination, for example, if the line is 11T2B13D, it will separate it like this: 1, 1T, 2B, 1, 3D
How can I solve this problem?
Since a single letter or a dash - (for the case of 0-) ends each token, the line can be split using Regex.Split with this regex:
(?<=[a-zA-Z-])
(?<=pattern) is a zero-width (text not consumed) positive look-behind; it matches if the text before the current position matches the pattern inside.
The regex above just checks that the character before the current position is a letter (upper or lower case, a-zA-Z) or a dash -, and splits at the current position.
Alternatively, you can do this with Regex.Matches and a regex that matches one whole token at a time. Since the number can be arbitrarily long, you need the one-or-more quantifier + after the digit class. The rest should be clear, since it is very close to what you have tried.
Both methods should have the same effect for valid input (according to your specification). However, when the input is invalid, the Regex.Split approach will produce invalid tokens, while the Regex.Matches approach produces only valid tokens (it skips invalid characters or sequences).
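As a rough illustration of the two approaches described above, here is a minimal sketch run against the sample line from the question. Both patterns are assumptions reconstructed from the description, not necessarily the exact ones intended: (?<=[a-zA-Z-]) for the split, and 0-|[0-9]+[a-zA-Z] for the matches. The module name TokenDemo is made up for the example. Note that Regex.Split also returns an empty string after the last token (the look-behind matches at the end of the line too), so the sketch filters empties out.

```vbnet
Imports System
Imports System.Linq
Imports System.Text.RegularExpressions

Module TokenDemo
    Sub Main()
        Dim line As String = "14S15T3C16W17A0-20m0-7T"

        ' Approach 1: split at every position preceded by a letter or a dash.
        ' The look-behind is zero-width, so no characters are consumed.
        Dim rawParts() As String = Regex.Split(line, "(?<=[a-zA-Z-])")
        ' Drop the empty string produced by the split at the end of the line.
        Dim splitTokens() As String = rawParts.Where(Function(s) s.Length > 0).ToArray()

        ' Approach 2 (assumed pattern): match "0-" or one-or-more digits
        ' followed by exactly one letter.
        Dim found As MatchCollection = Regex.Matches(line, "0-|[0-9]+[a-zA-Z]")
        Dim matchTokens() As String = found.Cast(Of Match)().Select(Function(m) m.Value).ToArray()

        ' Both lines print: 14S, 15T, 3C, 16W, 17A, 0-, 20m, 0-, 7T
        Console.WriteLine(String.Join(", ", splitTokens))
        Console.WriteLine(String.Join(", ", matchTokens))
    End Sub
End Module
```

If the input may contain stray characters, the Regex.Matches variant is probably the safer choice, since anything that does not form a complete token is simply skipped rather than ending up inside one.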