I am trying to use the TOKENIZE function in Pig on a comma-separated document. I would like to split on the commas, but NOT on whitespace. For example, I would like a list of
(car, toy car, bunny) to become ((car), (toy car), (bunny)), not ((car), (toy), (car), (bunny)).
Is there a way to do this?
Have you had a look at STRSPLIT for splitting on just the comma?
(It works on a CHARARRAY, like TOKENIZE.)
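A minimal sketch of what that might look like, assuming a hypothetical input file 'docs.txt' with one comma-separated record per line (the file name and the aliases docs, parts, and items are made up for illustration):

```pig
-- Hypothetical input: each line looks like "car, toy car, bunny"
docs = LOAD 'docs.txt' USING TextLoader() AS (line:chararray);

-- STRSPLIT takes a regular expression; ',' splits on commas only,
-- so "toy car" stays together as one field
parts = FOREACH docs GENERATE STRSPLIT(line, ',') AS items;

DUMP parts;
```

Note that STRSPLIT returns a tuple rather than the bag of tuples that TOKENIZE returns, and any space after a comma stays attached to the field that follows it, so some trimming or reshaping may still be needed depending on what you do next.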