I am trying to parse some text files into a database and there is a string that includes 2 pieces of information in it.
There are a few options for what the string can look like.
It can either look like a single word, e.g. "Word", or it can have that first word, followed by a dash, followed by any number of other words, e.g. "Word - Second".
The key, though, is that IF the string ends in a number, like "Word - Second 4", or in two numbers separated by a slash, like "Word - Second 2/3", then those numbers need to be put into a different variable.
I do NOT know enough about regex to do this one. Help? (with explanations?)
I think you might be looking for something like this:

^([a-zA-Z]+(?: *- *[a-zA-Z]+(?: +[a-zA-Z]+)*)?)(?: +(\d+(?:\/\d+)?))?$
Explanation:

^                 Start of line
(                 First capturing group (for the words)
  [a-zA-Z]+       A word
  (?:...)?        (Omitted for clarity)
)                 Close first group
(?:               Start non-capturing group
  \s+             Some whitespace
  (               Second capturing group (for the numbers)
    \d+           A number
    (?:\/\d+)?    Optionally a slash followed by another number
  )               Close capturing group
)?                Close optional non-capturing group
$                 End of line

I omitted an explanation of this part above: (?: *- *[a-zA-Z]+(?: +[a-zA-Z]+)*)?. It matches a dash, with optional spaces on either side, followed by one or more space-separated words.

I also wrote \s in the explanation instead of a literal space because the space is invisible. But \s matches any whitespace, including new lines. You may prefer to match only spaces.
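
As a rough sketch of how this could be used, here is a Python version (Python is my assumption, since the question doesn't name a language). The full pattern is assumed to be the pieces explained above joined together, with the omitted group restored and literal spaces in place of \s. The helper name split_label is hypothetical.

```python
import re

# Assumed full pattern: the explained pieces joined together, with the
# omitted "dash plus more words" group restored and literal spaces, not \s.
PATTERN = re.compile(
    r"^([a-zA-Z]+(?: *- *[a-zA-Z]+(?: +[a-zA-Z]+)*)?)"  # group 1: the words
    r"(?: +(\d+(?:/\d+)?))?$"  # group 2: optional trailing number or number/number
)


def split_label(text):
    """Hypothetical helper: return (words, numbers), or None if nothing matches.

    numbers is None when the string does not end in a number.
    """
    match = PATTERN.match(text)
    if match is None:
        return None
    return match.group(1), match.group(2)


for sample in ["Word", "Word - Second", "Word - Second 4", "Word - Second 2/3"]:
    print(repr(sample), "->", split_label(sample))
```

The slash doesn't need escaping in Python, so it is written as a plain / here. If you need the two numbers separately, you could split group 2 on "/" afterwards, or give each number its own capturing group.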