I would like a regular expression that matches a word anchored at the beginning of the text. The exact word should match, but so should any prefix of at least a certain minimum length, provided that every character typed beyond that minimum also matches the word.
For example, if I am trying to match “San Francisco,” but am willing to accept the first five characters as sufficient to identify it uniquely in the domain:
- Match: San Francisco
- Match: San F
- Match: San Fra
- Match: San Franciscoblahblah
- Fail: Boston
- Fail: San Diego
- Fail: San Fransisko
- Fail: San Frano
This almost works, but incorrectly matches the last two:
^San Fr?a?n?c?i?s?c?o?
I’m using .NET regular expressions, but a solution in any language will do.
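The issue you’re having is one of grouping. Each ? in your pattern applies to a single letter on its own, so a later letter can match even when an earlier one was skipped. Nest the optional letters instead, so that each one is tried only if the letter before it matched:
^San F(r(a(n(c(i(s(c(o)?)?)?)?)?)?)?)?
The parentheses make ‘a’ allowed only after a preceding ‘r’, and so forth. This will still match on ‘San Frano’ and ‘San Fransisko’, but the matched text will only be ‘San Fran’, similar to your ‘San Franciscoblahblah’ case, where only ‘San Francisco’ is matched.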
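Here is a minimal sketch of the nested grouping in Python, whose syntax for these constructs is the same as .NET's. The helper name prefix_pattern and its strict flag are hypothetical, introduced only for illustration. The strict variant is an assumption about what the question wants, not part of the answer above: it appends $ and lets arbitrary text follow only the complete word, so the two inputs marked Fail are rejected outright instead of being partially matched.

```python
import re


def prefix_pattern(word: str, min_len: int, strict: bool = False) -> str:
    """Build a regex that matches `word` or any prefix of at least `min_len` characters.

    Hypothetical helper. With strict=True the whole input must be consumed,
    and extra text is allowed only after the complete word.
    """
    head, tail = word[:min_len], word[min_len:]
    # Start from the innermost piece: in strict mode, ".*" sits inside the
    # last group so trailing text is accepted only after the full word.
    inner = ".*" if strict else ""
    # Wrap from the last letter outward so each letter depends on the one before it.
    for ch in reversed(tail):
        inner = "(?:" + re.escape(ch) + inner + ")?"
    return "^" + re.escape(head) + inner + ("$" if strict else "")


NESTED = re.compile(prefix_pattern("San Francisco", 5))
STRICT = re.compile(prefix_pattern("San Francisco", 5, strict=True))

# Nested groups alone: the match stops at the first wrong letter.
print(NESTED.match("San Frano").group(0))      # San Fran
print(NESTED.match("San Fransisko").group(0))  # San Fran

# Strict variant: a wrong letter after a partial prefix rejects the input.
for text in ["San Francisco", "San F", "San Fra", "San Franciscoblahblah",
             "Boston", "San Diego", "San Fransisko", "San Frano"]:
    print(text, "->", "Match" if STRICT.match(text) else "Fail")
```

The non-capturing (?:...) groups behave like the plain parentheses in the answer; they just avoid creating eight unused capture groups. If you only need to know how much of the input was accepted, the non-strict pattern together with the length of the match is enough.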