I’m trying to write a regex to parse a (seemingly very simple) piece of text like this:
some stuff
First name: John
Last name: Smith
more stuff
I want to capture the first and last name, so I tried a regex like this:
(?<=First name:\s*)(?<FirstName>\w+)(?<=\s*Last name:\s*)(?<LastName>\w+)
This fails to find a match. Each part (first name and last name) works on its own, but they don’t work together. Also, the following works:
(?<=John\s*Last name:\s*)(?<LastName>\w+)
but when I move “John” out of the look-behind…
John(?<=\s*Last name:\s*)(?<LastName>\w+)
… it doesn’t match!
What am I doing wrong here?
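Look-behind assertions are zero-width: they check the text before the current position but don’t consume any characters. The first look-behind succeeds right after “First name: ”, so the FirstName group captures “John” and the position in the target string is then immediately after “John”. The next part of the regex is another look-behind, so the engine checks whether the text immediately preceding that position matches “Last name:” followed by optional whitespace. It is actually preceded by “First name: John”, so the assertion fails, and because nothing in the pattern consumes the line break or “Last name:”, the engine can never advance to “Smith”. The same reasoning explains your other two patterns: with “John” inside the look-behind, the assertion is tested at the position just before “Smith”, where the preceding text really is “John”, a line break and “Last name: ”; with “John” outside it, the assertion is tested immediately after “John”, where “Last name:” does not come before the position at all. To capture both names in one pattern, match the text between them as ordinary pattern elements instead of asserting it with a look-behind.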
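As a rough illustration of the explanation above, here is a minimal sketch in Python rather than the .NET-style syntax used in the question. Two things in it are assumptions made for the sketch: Python's `re` module only accepts fixed-width look-behinds, so the `\s*` inside each look-behind is replaced by a single literal space, and named groups are written `(?P<Name>...)`. The variable names (`lookbehind_only`, `consuming`) are made up for the example.

```python
import re

text = "some stuff\nFirst name: John\nLast name: Smith\nmore stuff"

# Same shape as the pattern in the question: two look-behinds, nothing
# consumed between the two captures. (Fixed-width stand-in for \s*.)
lookbehind_only = re.compile(
    r"(?<=First name: )(?P<FirstName>\w+)(?<=Last name: )(?P<LastName>\w+)"
)

# Alternative: consume the labels and whitespace as ordinary pattern
# elements, so the engine can advance past "John" and the line break.
consuming = re.compile(
    r"First name:\s*(?P<FirstName>\w+)\s*Last name:\s*(?P<LastName>\w+)"
)

# The second look-behind is tested right after "John", where the
# preceding text is not "Last name: ", so no match is found.
failed = lookbehind_only.search(text)

match = consuming.search(text)
names = match.groupdict() if match else None

print(failed)
print(names)
```

The consuming pattern should carry over to .NET by writing the groups as `(?<FirstName>\w+)` and `(?<LastName>\w+)`; the match then includes the labels, but the two named groups still contain only the names.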