Suppose I have a text file with some data I want to retrieve lost in a sea of regular written language.
Each piece of data I want to retrieve is a tuple of 3 numbers between 0 and 99 (that I will call N1 to N3), which can be formatted in 4 different ways:
N1-N2-N3
N1N2N3
N1.N2.N3
N1/N2/N3
Using regular expressions, is it possible to describe something like this:
A separator is any one item from this list: [ '-', '', '.', '/' ]
My expression is like: N1{separator}N2{same_separator_as_the_first_one}N3
?
It seems like the only way to express that is:
My expression is like: ({N1}-{N2}-{N3}) OR ({N1}{N2}{N3}) OR ({N1}.{N2}.{N3}) OR ({N1}/{N2}/{N3})
…which quickly becomes unreadable…
Is it possible to achieve the first kind of expression with regular expressions? Is there something available which is not regex that allows this kind of expressiveness?
The real question is:
Given the available formats, what is the best way to write a function
which gets a string and returns N1 to N3 along with the used separator
character (and throws an exception when the string does not match any
format)?
This depends slightly on the flavor of regex, but in a typical language, I would write:
Then group 2 is the separator, which a backreference (\2) requires to appear again between the second and third numbers, and groups 1, 3, and 4 are the three numbers.
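As a minimal sketch of that idea in Python: the pattern below is one that fits the group numbering described above, assuming each number is one or two digits and that the whole input string must be the tuple. The names TRIPLE and parse_triple are made up for illustration.

```python
import re

# Group 1: N1, group 2: separator (possibly empty), group 3: N2, group 4: N3.
# The backreference \2 forces the second separator to equal the first one.
TRIPLE = re.compile(r"(\d{1,2})([-./]?)(\d{1,2})\2(\d{1,2})")


def parse_triple(text):
    """Return (n1, n2, n3, separator) or raise ValueError if text does not match."""
    match = TRIPLE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a recognized format: {text!r}")
    n1, sep, n2, n3 = match.group(1, 2, 3, 4)
    return int(n1), int(n2), int(n3), sep


if __name__ == "__main__":
    for sample in ["12-34-56", "1.2.3", "123456", "7/8/9"]:
        print(sample, parse_triple(sample))
```

Note that the empty-separator format is inherently ambiguous when fewer than six digits are given (for example, 1234 can be split more than one way), so this sketch simply returns whichever split the regex engine finds first. To pull tuples out of running text instead of validating a whole string, TRIPLE.finditer could be used in place of fullmatch, probably with extra guards such as word boundaries.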