I have a string that’s of the following scheme:
VersionNumber.VersionString-VersionNumber.VersionString
The following example strings should be converted into arrays of information like this:
1. 1.x-2.x => (1, 'x', 2, 'x')
2. 1.2-3.4 => (1, 2, 3, 4)
3. 1.2-3.4-beta5 => (1, 2, 3, '4-beta5')
4. 1.2-beta3-3.4 => (1, '2-beta3', 3, 4)
5. 1.2-beta3-4.5-beta6 => (1, '2-beta3', 4, '5-beta6')
The logic for the parse is:
- First element is everything before the first period.
- Second element is everything up to a hyphen immediately before a number.
- Third element always starts with a number and is everything up to the next period.
- Fourth element is everything after the period.
Notes:
- Second element is an arbitrary string, but will never have a hyphen that immediately precedes a number (e.g. 2-3 is not valid, but 2-beta4 is).
- Third element always starts with a number, and begins right after a hyphen.
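Read literally, the four rules amount to a step-by-step split. The following is only a rough Python sketch of that reading; the function name parse_version_pair is made up for illustration, and all four parts are returned as strings rather than converted to numbers.

```python
import re

def parse_version_pair(text):
    """Split 'Num.Str-Num.Str' by applying the four rules in order."""
    # Rule 1: the first element is everything before the first period.
    first, _, rest = text.partition(".")
    # Rule 2: the second element runs up to the first hyphen that is
    # immediately followed by a digit.
    boundary = re.search(r"-\d", rest)
    if boundary is None:
        raise ValueError(f"no hyphen followed by a digit in {text!r}")
    second = rest[:boundary.start()]
    # Rule 3: the third element starts right after that hyphen and runs
    # up to the next period. Rule 4: the fourth element is the remainder.
    third, _, fourth = rest[boundary.start() + 1:].partition(".")
    return first, second, third, fourth

print(parse_version_pair("1.2-beta3-4.5-beta6"))
```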
I’ve been able to parse the first three cases using the following expression:
(.+?).(.+?)-(.+?).(.*)
But I’m not sure how to modify it to handle cases 4 and 5 (when the second element contains a hyphen). The two approaches I thought of were:
- Modify the second group to match everything before a hyphen immediately preceding a digit.
- Modify the second group to match everything until it hits a second hyphen only if the first hyphen immediately precedes a non-digit character.
Presumably the first approach is the correct/simplest way to do it, but I’m struggling to come up with the correct regexp to express it.
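For what it's worth, the first approach could probably be expressed with a lookahead, (?=\d), placed right after the hyphen. The sketch below is in Python and rests on two assumptions: the dots in the expression above are meant as literal periods (so they are escaped here), and the whole string must match. The names ORIGINAL, LOOKAHEAD and split_with are illustrative only.

```python
import re

CASES = [
    "1.x-2.x",
    "1.2-3.4",
    "1.2-3.4-beta5",
    "1.2-beta3-3.4",
    "1.2-beta3-4.5-beta6",
]

# The attempted expression, with the periods escaped (assumption).
ORIGINAL = re.compile(r"(.+?)\.(.+?)-(.+?)\.(.*)")
# Same expression, but the separating hyphen must be followed by a digit.
LOOKAHEAD = re.compile(r"(.+?)\.(.+?)-(?=\d)(.+?)\.(.*)")

def split_with(pattern, text):
    """Return the four captured groups, or None if the string does not match."""
    match = pattern.fullmatch(text)
    return match.groups() if match else None

for case in CASES:
    print(case, split_with(ORIGINAL, case), split_with(LOOKAHEAD, case))
```

With the lookahead in place, the lazy second group keeps growing past any hyphen that is followed by a non-digit, which is what cases 4 and 5 need.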
Can VersionString ever contain a dot? If not, this should work:
The [^.]+ initially matches everything up to the next dot, but then backtracks a little bit. If VersionString can contain a dot, you can use this:
Matching digits explicitly in the VersionNumber part serves to enforce your “digit preceded by a hyphen” rule.
(Actually, (.+?) works just as well; I used (\S+?) because I was testing the regex plucking the version strings out of the full text of your message.)
EDIT: Per the comments below, here’s the final version:
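To illustrate the two ideas described in this answer, here is a hedged Python sketch. The patterns are illustrative guesses that fit the description (a greedy [^.]+ that backtracks, and explicit digits for the VersionNumber parts); they are not necessarily the exact expressions the answer used. NO_DOT, WITH_DOT and parse are hypothetical names.

```python
import re

# Assumes VersionString never contains a dot: [^.]+ first grabs everything up
# to the next dot, then backtracks until "-digits." can match after it.
NO_DOT = re.compile(r"(\d+)\.([^.]+)-(\d+)\.(.*)")

# Allows dots inside VersionString: a lazy group grows until it reaches a
# hyphen followed by digits and a dot. The explicit \d+ enforces the
# "digit preceded by a hyphen" rule.
WITH_DOT = re.compile(r"(\d+)\.(.+?)-(\d+)\.(.*)")

def parse(pattern, text):
    """Return the four captured groups as strings, or None on no match."""
    match = pattern.fullmatch(text)
    return match.groups() if match else None

print(parse(NO_DOT, "1.2-beta3-4.5-beta6"))
print(parse(WITH_DOT, "1.2.1-beta3-4.5"))
```

The second pattern also handles all five example strings from the question; the first one simply fails to match when the second element contains a dot.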