I am trying to match a 24-hour time with a regular expression using egrep.
Here is my test file, test.txt:
32:23:31
24:30:31
23:70:31
23:61:31
23:10:70
23:10:61
22:17:16
01:17:15
24:15:22
0:17:16
00:17:17
24:30:31
Here is my regular expression:
egrep '(2[0-3]|1[0-9]|0[0-9]|[^0-9][0-9]):([0-5][0-9]|[0-9]):([0-5][0-9]|[0-9])' test.txt
Resulting matches:
23:10:70
23:10:61
22:17:16
01:17:15
00:17:17
Any idea why it is matching 23:10:70 and 23:10:61?
It's actually matching 23:10:7 and 23:10:6. Since you are not using the end-of-line metacharacter $ at the end of the pattern, the line may contain anything after the matched text, so the trailing 0 and 1 are simply ignored.
In other words, you should only allow the single-digit [0-9] alternative for the seconds if the matched digit is the last character on the line, that is, if it is followed by $.
Another option is to force the last number to be 0-padded if it is less than 10, i.e., instead of [0-9] use 0[0-9]. This will match 23:10:07, but not 23:10:7. It's the same approach you already use for the hours part.
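To make this concrete, here is a small sketch of both suggestions run against the data from the question. A few assumptions on my part: it uses grep -E, which is the equivalent of egrep; the -o flag (print only the matched text) is an extension found in GNU and BSD grep rather than something POSIX requires; and the temporary file and variable names are just stand-ins for test.txt.

```shell
# Stand-in for test.txt from the question (temporary file, hypothetical name).
file=$(mktemp)
printf '%s\n' 32:23:31 24:30:31 23:70:31 23:61:31 23:10:70 23:10:61 \
    22:17:16 01:17:15 24:15:22 0:17:16 00:17:17 24:30:31 > "$file"

# The original pattern from the question.
orig='(2[0-3]|1[0-9]|0[0-9]|[^0-9][0-9]):([0-5][0-9]|[0-9]):([0-5][0-9]|[0-9])'

# -o prints only the matched part of each line, which shows that
# 23:10:70 is matched as 23:10:7 and 23:10:61 as 23:10:6.
matched=$(grep -Eo "$orig" "$file")
echo "$matched"

# Option 1: anchor the pattern at the end of the line with $.
anchored=$(grep -E "${orig}\$" "$file")
echo "$anchored"

# Option 2: require a 0-padded seconds value (0[0-9] instead of [0-9]).
padded_re='(2[0-3]|1[0-9]|0[0-9]|[^0-9][0-9]):([0-5][0-9]|[0-9]):([0-5][0-9]|0[0-9])'
padded=$(grep -E "$padded_re" "$file")
echo "$padded"

rm -f "$file"
```

With this data, both options print only 22:17:16, 01:17:15 and 00:17:17. Note that neither change anchors the start of the pattern, so a line with extra characters before the hours could still match.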