I’m using Perl and need to split strings of author names that are delimited by commas as well as a final “and”. Each name consists of a first name and a last name, like this:
$string1 = "Joe Smith, Jason Jones, Jane Doe and Jack Jones";
$string2 = "Joe Smith, Jason Jones, Jane Doe, and Jack Jones";
$string3 = "Jane Doe and Joe Smith";
# Next line doesn't work because there is no comma between last two names
@data = split(/,/, $string1);
I would just like to split the full names into elements of an array, like what split() would do, so that the @data array would contain, for example:
$data[0]: "Joe Smith"
$data[1]: "Jason Jones"
$data[2]: "Jane Doe"
$data[3]: "Jack Jones"
However, the problem is that there is no comma between the last two names in the lists. Any help would be appreciated.
You could use a simple alternation in your regular expression for split, with one branch for the comma and another for the word “and”, for example \s*,\s*|\s+and\s+.
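A minimal sketch of such an alternation, assuming a separator is either a comma with optional whitespace around it or the word "and" with whitespace on both sides. The helper name split_authors is hypothetical and only there for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: split a list of names on a comma (optional
# whitespace around it) or on the word "and" surrounded by whitespace.
sub split_authors {
    my ($string) = @_;
    return split /\s*,\s*|\s+and\s+/, $string;
}

my @data = split_authors("Joe Smith, Jason Jones, Jane Doe and Jack Jones");
print "$_\n" for @data;
# Joe Smith
# Jason Jones
# Jane Doe
# Jack Jones
```

Because the "and" branch requires whitespace on both sides, a name that merely contains those letters (such as "Sandy") is not split.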
If you also have to deal with the Oxford comma (i.e. “this, that, and the other thing”), then you could add a third branch, \s*,\s*and\s+, at the front of the alternation.
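A sketch of the Oxford-comma variant, run against the three strings from the question. It assumes the same separators as before plus a comma followed by "and"; the helper name split_authors_oxford is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: the ", and" branch is listed first so that it
# gets the first chance to match at a comma.
sub split_authors_oxford {
    my ($string) = @_;
    return split /\s*,\s*and\s+|\s*,\s*|\s+and\s+/, $string;
}

for my $string (
    "Joe Smith, Jason Jones, Jane Doe and Jack Jones",
    "Joe Smith, Jason Jones, Jane Doe, and Jack Jones",
    "Jane Doe and Joe Smith",
) {
    print join(" | ", split_authors_oxford($string)), "\n";
}
# Joe Smith | Jason Jones | Jane Doe | Jack Jones
# Joe Smith | Jason Jones | Jane Doe | Jack Jones
# Jane Doe | Joe Smith
```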
Thanks to stackoverflowuser2010 for noting this case.
You’ll want the \s*,\s*and\s+ branch at the beginning to keep the other branches of the alternation from splitting on the comma or “and” first. This order appears to be guaranteed as well, since perlre documents that alternatives are tried from left to right.
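To see why the branch order matters, here is a small sketch that splits the same Oxford-comma string with the ", and" branch first and then with it last. The variable names are illustrative only.

```perl
use strict;
use warnings;

my $string = "Joe Smith, Jason Jones, Jane Doe, and Jack Jones";

# ", and" branch first: the whole ", and " separator is consumed.
my @branch_first = split /\s*,\s*and\s+|\s*,\s*|\s+and\s+/, $string;

# ", and" branch last: the plain comma branch matches first at the
# final comma, so the last field keeps a leading "and ".
my @branch_last = split /\s*,\s*|\s+and\s+|\s*,\s*and\s+/, $string;

print "$branch_first[-1]\n";   # Jack Jones
print "$branch_last[-1]\n";    # and Jack Jones
```

In the second case the comma branch also swallows the space after the comma, so the remaining "and" has no whitespace in front of it and the \s+and\s+ branch cannot match.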