I’ve been learning Regex/grep from the BBEdit manual, and it’s been smooth sailing except for this part (it’s near the end, and unlike previous sections doesn’t offer an explanation).
The two big parts I’m having difficulty with are the [^ ] part and the first part ^(.*)
Firstly, is [^ ] saying anything but a space? How does the pattern then catch the X. Potter, with a space after the X.?
Secondly, the manual talked about non-greedy quantifiers, used so that it wouldn’t match the longest pattern by default and accidentally match your full phrase. How does ^(.*) not match the full line and make it \1? Beginning of line, zero or more occurrences of anything but a carriage return? How does that not catch Junior X. Potter as one pattern? I thought we’d have to use a non-greedy quantifier here, but it seems not.
And lastly, what exactly do spaces do in a regular expression pattern? Do they simply represent themselves, with no need for an escape like \space? (I thought you needed \t for that.)
Rearranging Name Lists
You can use grep patterns to transform a list of names in first name first form to last name first order (for a later sorting, for instance). Assume that the names are in the form:
Junior X. Potter
Jill Safai
Dylan Schuyler Goode
Walter Wang
If you use this search pattern:
^(.*) ([^ ]+)$
And this replacement string:
\2, \1
The transformed list becomes:
Potter, Junior X.
Safai, Jill
Goode, Dylan Schuyler
Wang, Walter
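If you want to try the manual's example outside BBEdit, here is a minimal sketch in Python. It assumes Python's re module treats this particular pattern and replacement the same way BBEdit's grep does (both accept \1 and \2 in the replacement); the variable names are just for illustration.

```python
import re

# The sample names from the manual, one per line.
names = """Junior X. Potter
Jill Safai
Dylan Schuyler Goode
Walter Wang"""

# Same search pattern as in the manual; the space in the middle is literal.
pattern = re.compile(r"^(.*) ([^ ]+)$")

# Apply the replacement "\2, \1" to each line separately.
rearranged = [pattern.sub(r"\2, \1", line) for line in names.splitlines()]

for line in rearranged:
    print(line)
# Potter, Junior X.
# Safai, Jill
# Goode, Dylan Schuyler
# Wang, Walter
```

Processing line by line keeps ^ and $ anchored to each name without needing a multi-line flag.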
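^(.*) means match anything from the beginning of the line up to a space. But not just any space: ([^ ]+)$ requires that space to be followed by one or more non-space characters and then the end of the line, so only the last space on the line qualifies. Those non-space characters at the end of the line become the second matching group. The space after “X.” is matched by the . inside (.*), which matches any character, spaces included.
As for greediness: (.*) is greedy and does start by grabbing the whole line, but then the rest of the pattern has nothing left to match, so the engine backs off one character at a time until the space and ([^ ]+)$ can match. The pattern as a whole still matches the full line; \1 is only the part before the last space. A non-greedy quantifier isn’t needed here because [^ ]+$ pins down where the last word starts.
A human would process this in reverse: find the group of non-space characters at the end of the line, “Potter”, and you’ve found the \2 match. Ah-ha, there is the preceding space, and anything before that is the \1 match, “Junior X.”.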
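To see why the greedy and non-greedy versions end up the same here, this rough Python sketch compares them. It assumes Python's re engine backtracks like BBEdit's for these patterns; the looser pattern with (.+) is not from the manual and is only there for contrast.

```python
import re

line = "Dylan Schuyler Goode"

# Pattern from the manual (greedy) and a non-greedy variant of it.
greedy = re.match(r"^(.*) ([^ ]+)$", line)
lazy = re.match(r"^(.*?) ([^ ]+)$", line)

print(greedy.group(0))  # Dylan Schuyler Goode  (the whole line is matched)
print(greedy.groups())  # ('Dylan Schuyler', 'Goode')
print(lazy.groups())    # ('Dylan Schuyler', 'Goode')

# Contrast: if the second group may contain spaces, greediness matters.
loose_greedy = re.match(r"^(.*) (.+)$", line)
loose_lazy = re.match(r"^(.*?) (.+)$", line)

print(loose_greedy.groups())  # ('Dylan Schuyler', 'Goode')
print(loose_lazy.groups())    # ('Dylan', 'Schuyler Goode')
```

Because [^ ]+$ can only ever be the last word, there is exactly one way to split the line, so the choice of quantifier makes no difference with the manual's pattern.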
Edit: a space represents itself; it doesn’t need to be escaped (\t is a tab, not a space). So be careful not to insert spaces to prettify your regex: each one actually means something.
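A quick way to convince yourself that spaces are literal, again sketched in Python on the assumption that it behaves like BBEdit for these simple patterns (the "padded" pattern is a made-up example with one extra space):

```python
import re

line = "Jill Safai"

# One literal space between the groups, as in the manual.
tight = re.search(r"^(.*) ([^ ]+)$", line)

# An extra space added "for readability" now demands two spaces in the text.
padded = re.search(r"^(.*)  ([^ ]+)$", line)

print(tight is not None)  # True
print(padded is None)     # True: there is no double space in the line

# \t matches a tab character, not a space, and vice versa.
print(re.search(r"\t", "a\tb") is not None)  # True
print(re.search(r" ", "a\tb") is None)       # True
```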