Is it possible to find the index of the first character in a string that causes a mismatch with a regular expression? Can this be done using only regular expression matching operations, or must something more complex be employed?
For instance, in JavaScript, I may have the regular expression /^\d{3}\s\d{2}$/, which matches a string of 3 digits followed by a whitespace character and another 2 digits. I apply this regular expression to the string "123a45". Doing this (e.g., "123a45".match(/^\d{3}\s\d{2}$/)) returns null since the string does not match. How can I get the first character that causes this mismatch (in this case "a", the character at index 3)?
One use case would be to point the user directly to the character that makes their input invalid according to the regular expression used to validate it.
You would need to break the regex pattern down into all the patterns that describe its possible partial matches, and order that list from the longest match to the shortest one (or none). Try the patterns in that order. Once you get a match, the length of the (partial) match gives you the position of the character that causes the mismatch. The substring of length one starting at that position is exactly the character behind the mismatch (if there is one). If there is no mismatch, you get an empty (sub)string.
http://jsfiddle.net/ETWWS/
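A minimal sketch of the idea, not necessarily what the fiddle does. It assumes the pattern can be written as a list of tokens that each match exactly one character, so the partial patterns are simply the prefixes of that list; `TOKENS` and `firstMismatchIndex` are hypothetical names chosen for illustration. Patterns with alternation, greedy quantifiers or backreferences would need a more careful breakdown.

```javascript
// Assumption: every token matches exactly one character, so
// /^\d{3}\s\d{2}$/ is spelled out as the six tokens below.
const TOKENS = ['\\d', '\\d', '\\d', '\\s', '\\d', '\\d'];

// Hypothetical helper: returns the index of the first offending character.
function firstMismatchIndex(tokens, input) {
  // Try the prefixes of the pattern from the longest to the shortest.
  for (let n = tokens.length; n > 0; n--) {
    const prefix = new RegExp('^' + tokens.slice(0, n).join(''));
    const m = prefix.exec(input);
    if (m) {
      // The length of the partial match is the position of the mismatch.
      return m[0].length;
    }
  }
  return 0; // not even the first token matched
}

const input = '123a45';
const index = firstMismatchIndex(TOKENS, input);
console.log(index, input.charAt(index)); // 3 a
```

`input.charAt(index)` is the empty string when the whole input matches, and also when the input ends too early (for example "12"), so compare `index` with the input length and the token count if you need to tell those cases apart.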