I have a “CSV” in which some of the data fields contain the comma delimiter, as in the second row of the following sample data.
"1","stuff","and","things"
"2","black,white","more","stuff"
I can’t change the source data, and I don’t know how to use str.split() without also splitting inside the value “black,white”.
Ways I’ve approached my problem:
- I looked at partition() and don’t see how that would benefit me.
- I’m sure a regex would capture data properly but I’m not sure how to tie one into splitting.
- Since every row in the source will always have the same number of fields, I thought setting maxsplit might help, but I talked myself out of it: it would still split within “black,white”, and I would end up losing the last value (which would be “stuff” in this case).
Certainly this is easy to overcome so I’m looking forward to learning something new!
Your help is greatly appreciated.
Commas outside the strings are always followed by double quotes. Just split on `,"` instead of just `,` (or even `","`). Of course, edit for efficiency.
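A minimal sketch of the split-on-delimiter idea, assuming every field is wrapped in double quotes and no field contains the sequence `","` or an escaped quote. The helper name `split_quoted_row` is made up for illustration. It strips the row's outer quotes first, so that splitting on `","` leaves clean values:

```python
def split_quoted_row(line):
    """Split a row whose fields are ALL wrapped in double quotes (assumption)."""
    # Drop any trailing newline so the closing quote is the last character.
    trimmed = line.rstrip("\r\n")
    # Remove the quote that opens the first field and the one that closes the last.
    if len(trimmed) >= 2 and trimmed[0] == '"' and trimmed[-1] == '"':
        trimmed = trimmed[1:-1]
    # Between fields the delimiter is now the three-character sequence ",".
    return trimmed.split('","')


rows = [
    '"1","stuff","and","things"',
    '"2","black,white","more","stuff"',
]
parsed = [split_quoted_row(row) for row in rows]
```

With the sample data, the second row should come back as four fields with “black,white” intact. If the source ever mixes quoted and unquoted fields, this approach no longer holds.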
YevgenYampolskiy’s suggestion of `shlex` is also an alternative.
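One way the `shlex` route might look, as a rough sketch rather than a definitive recipe: treat the comma as the only "whitespace" character and let the lexer handle the quoting. The function name `split_with_shlex` is hypothetical. Note that `shlex` is written for shell-like syntax, so backslashes are treated as escapes in POSIX mode and consecutive unquoted commas collapse into one separator.

```python
import shlex


def split_with_shlex(line):
    """Split a comma-delimited row, keeping commas inside double quotes."""
    # Strip the newline first: once "," is the only separator,
    # a trailing "\n" would otherwise end up inside the last field.
    lexer = shlex.shlex(line.rstrip("\r\n"), posix=True)
    lexer.whitespace = ","         # commas separate tokens instead of spaces
    lexer.whitespace_split = True  # split ONLY on the separator characters
    lexer.commenters = ""          # do not treat "#" as the start of a comment
    return list(lexer)


fields = split_with_shlex('"2","black,white","more","stuff"')
```

In POSIX mode the surrounding quotes are removed from each token, so no separate stripping step is needed.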