My problem is as follows:
I have an array of strings that contain dates and other data. My date will have one of several formats:
- dd/mm/yyyy
- dd/mm/yy
- mm/yy
- d/m/yy
- yyyy
- yy
Is there a way to search a string for numbers that fit one of these patterns?
In addition, it would be nice if I could check that the dd is between 1 and 31 inclusive, and so on, but it would not be so bad if I had to do that afterwards.
Each of these corresponds to a regex.
Here are regexes for each:
    \b(?:0[1-9]|[12]\d|3[01])/(?:0[1-9]|1[012])/\d{4}\b
    \b(?:0[1-9]|[12]\d|3[01])/(?:0[1-9]|1[012])/\d{2}\b
    \b(?:0[1-9]|1[012])/\d\d\b
    \b[1-9]/[1-9]/\d\d\b
    \b\d{4}\b
    \b\d\d\b
Of course, you can combine these together in different ways. You can even make one super regex.
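As a rough illustration of the "super regex" idea, here is a minimal JavaScript sketch that joins the six patterns into one alternation, most specific first, and collects every match. The names `DAY`, `MONTH`, `DATE_PATTERN`, and `findDates` are made up for this example, and the day part is written as `(?:0[1-9]|[12]\d|3[01])` so that 10 and 20 are accepted.

```javascript
// Building blocks shared by the dd/mm/... alternatives (illustrative names).
const DAY = "(?:0[1-9]|[12]\\d|3[01])";
const MONTH = "(?:0[1-9]|1[012])";

// One alternation wrapped in word boundaries, longest formats first.
const DATE_PATTERN = new RegExp(
  "\\b(?:" +
    [
      DAY + "/" + MONTH + "/\\d{4}", // dd/mm/yyyy
      DAY + "/" + MONTH + "/\\d{2}", // dd/mm/yy
      MONTH + "/\\d\\d",             // mm/yy
      "[1-9]/[1-9]/\\d\\d",          // d/m/yy
      "\\d{4}",                      // yyyy
      "\\d\\d",                      // yy
    ].join("|") +
    ")\\b",
  "g"
);

// Returns every substring that fits one of the six formats.
function findDates(text) {
  return text.match(DATE_PATTERN) || [];
}

console.log(findDates("Paid 25/12/2012, due 03/14 or 5/3/12; born 1985."));
// → [ '25/12/2012', '03/14', '5/3/12', '1985' ]
```

Note that pieces of an invalid date can still match as bare numbers: in 32/12/2012 the full-date alternatives fail, but 32, 12, and 2012 each match on their own. In C# the same pattern should work with `Regex.Matches`, written as a verbatim string so the backslashes are not doubled.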
The last regex, the bare two-digit one, is rather interesting, though. I can imagine a case where you might have a plain old number in your text, like 42, that might not actually correspond to a year. Still, I guess you can postprocess that.
Happy regexing.
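One possible shape for that postprocessing, sketched in JavaScript under assumptions of my own: bare numbers are only accepted when they are four digits inside a window of plausible years (`MIN_YEAR` and `MAX_YEAR` are arbitrary values chosen for illustration), and a full date is checked against the calendar by round-tripping it through `Date`. The helper names `isPlausibleYear` and `isRealDate` are hypothetical.

```javascript
// Assumed window of plausible years; adjust to the data at hand.
const MIN_YEAR = 1900;
const MAX_YEAR = 2099;

// True for a four-digit string whose value falls inside the window.
function isPlausibleYear(text) {
  if (!/^\d{4}$/.test(text)) return false;
  const year = Number(text);
  return year >= MIN_YEAR && year <= MAX_YEAR;
}

// True when day/month/year name a real calendar date.
// Expects a full four-digit year: Date maps years 0..99 to 1900..1999,
// so two-digit years must be expanded by the caller first.
function isRealDate(day, month, year) {
  const d = new Date(year, month - 1, day);
  return (
    d.getFullYear() === year &&
    d.getMonth() === month - 1 &&
    d.getDate() === day
  );
}

console.log(isPlausibleYear("42"));   // → false
console.log(isRealDate(29, 2, 2012)); // → true
console.log(isRealDate(31, 4, 2012)); // → false
```

The round trip works because `Date` rolls an overflowing day into the next month, so an impossible date no longer reads back as the values that went in.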
ADDENDUM
To answer some questions in the comments:
- Yes, it works at the beginning and the end of the string, because \b is a word boundary. It matches at every transition between word characters (letters, digits, and underscores) and non-word characters, and also at the start or end of the string when a word character sits there.
- To see tests, see here: http://jsfiddle.net/wRufK/. Yes, I know this is in JavaScript and not C#, but jsfiddle is a very convenient way to show code in action. There are differences, though: in C# we use Regex.Match, and the JavaScript regex has extra backslashes to escape the inner forward slashes.
- indexOf might be overkill depending on the application. If you want to find all matches, see http://msdn.microsoft.com/en-us/library/twcw2f1c.aspx for info on repeated matching. You can also modify the regexes for capturing.
- Since your dates can be in any of the forms above, and probably others, a single regex might be preferable. A very flexible date finder is here: http://www.regular-expressions.info/dates.html. You might want to consider it instead of hard-coding an exact set of formats.
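To make the points about string boundaries, repeated matching, and capturing concrete, here is a small JavaScript sketch of the dd/mm/yyyy pattern with a capture group for each part. `FULL_DATE` and `extractFullDates` are names invented for this example. In C# the rough equivalent would be `Regex.Matches` together with `Match.Index` and `Match.Groups`.

```javascript
// dd/mm/yyyy with one capture group per part; note the escaped slashes
// that a JavaScript regex literal needs.
const FULL_DATE = /\b(0[1-9]|[12]\d|3[01])\/(0[1-9]|1[012])\/(\d{4})\b/g;

// Returns the position and numeric parts of every dd/mm/yyyy match.
function extractFullDates(text) {
  const results = [];
  for (const m of text.matchAll(FULL_DATE)) {
    results.push({
      index: m.index,       // where the match starts in the input
      day: Number(m[1]),
      month: Number(m[2]),
      year: Number(m[3]),
    });
  }
  return results;
}

// Dates at the very start and very end of the string are both found.
console.log(extractFullDates("01/02/2003 to 31/12/2010"));
// → [ { index: 0, day: 1, month: 2, year: 2003 },
//     { index: 14, day: 31, month: 12, year: 2010 } ]
```

Since each match carries its own index, a separate indexOf call is not needed to locate the dates.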