I have a CSV file whose rows I want to split on commas using Stata. However, I only want to split on a comma if it is not directly followed by whitespace.
For example, the input looks like this:
ID Name
1 Bob
2 Robert,Joe
3 Mike, Jr.
4 Alfred, Sr.
5 Andy,Michael,Bo
I want the output to look like:
ID Name
1 Bob
2 Robert
2 Joe
3 Mike Jr.
4 Alfred Sr.
5 Andy
5 Michael
5 Bo
That is, a new row is created when there is no whitespace after the comma, but not when whitespace directly follows the comma.
Would greatly appreciate any clarification you can provide!
This regex should work:
/,\S/
When I checked the documentation I couldn't see exactly which regex syntax is supported, so if the above doesn't work, try:
/,[^\s]/
As a last resort (though I hope one of the non-whitespace checks works), you could split on a comma followed by an alphanumeric character:
/,[a-zA-Z0-9]/
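One caveat with the patterns above: each of them also matches the character after the comma, so splitting on the match itself drops the first letter of the next name. A lookahead, /,(?=\S)/, matches only the comma. The sketch below checks this on the sample data from the question. It is written in Python rather than Stata, purely to illustrate how the pattern behaves, and the names rows, SPLIT_ON and split_rows are made up for the example. I believe Stata's Unicode regex functions such as ustrregexra() (Stata 14 and later) accept the same lookahead syntax, but I haven't verified that here, so treat it as an assumption.

```python
import re

# Sample data copied from the question.
rows = [
    (1, "Bob"),
    (2, "Robert,Joe"),
    (3, "Mike, Jr."),
    (4, "Alfred, Sr."),
    (5, "Andy,Michael,Bo"),
]

# A comma followed by a non-whitespace character. The lookahead (?=\S)
# checks that character without consuming it, so only the comma is removed.
SPLIT_ON = re.compile(r",(?=\S)")


def split_rows(rows):
    """Return one (id, name) pair per name, splitting only on 'tight' commas."""
    out = []
    for row_id, name in rows:
        for part in SPLIT_ON.split(name):
            # A comma followed by a space was not split on; remove it so the
            # result matches the "Mike Jr." form in the desired output.
            out.append((row_id, part.replace(",", "")))
    return out


result = split_rows(rows)
for row_id, name in result:
    print(row_id, name)

# For comparison: without the lookahead, the "J" is consumed by the match.
naive = re.split(r",\S", "Robert,Joe")
print(naive)  # ['Robert', 'oe']
```

Removing the leftover comma inside each part is only there to reproduce the "Mike Jr." form shown in the desired output; skip that step if you would rather keep "Mike, Jr.".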