I have a data set that looks like this:
Code Product
1 A|B
2 A|B|C
3 A|B|C|D|E
When I split the Product column using the colsplit function, values get duplicated. The output of colsplit looks like this:
Code Product.1 Product.2 Product.3 Product.4 Product.5
1 A B A B A
2 A B C A B
3 A B C D E
This happens because one of the cells had five elements. Is there any way to avoid this duplication?
Thanks and regards
Jayaram
Update (21 Oct 2013)
The concepts below have been rolled into a family of functions called concat.split.* in my “splitstackshape” package. A very straightforward solution is to call concat.split.multiple on the Product column with "|" as the separator and the "long" argument.

Remove the "long" argument if you want the wide format, but your comments indicated that ultimately you wanted a long format for your output.

Original answer (17 Dec 2012)
You can do this with strsplit and sapply.

Or… use rbind.fill from the “plyr” package, after making each of your rows into a single-column data.frame.

Or… inspired by @DWin’s great answer here, re-read the second column as a data.frame in itself.
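As a rough illustration of the two base-R ideas above (strsplit with sapply, and re-reading the column as its own data.frame), here is a minimal sketch. It is not necessarily the code originally posted with this answer. The object names (mydf, parts, padded, wide, long, reread, wide2) are assumptions made for the example, and short rows are padded with NA rather than recycled.

```r
# Sample data from the question; the name `mydf` is an assumption.
mydf <- data.frame(Code = 1:3,
                   Product = c("A|B", "A|B|C", "A|B|C|D|E"),
                   stringsAsFactors = FALSE)

# 1. strsplit + sapply: split on a literal "|" (fixed = TRUE avoids regex
#    alternation), then pad each piece with NA up to the longest length.
parts <- strsplit(mydf$Product, "|", fixed = TRUE)
n <- max(lengths(parts))
padded <- t(sapply(parts, function(x) c(x, rep(NA_character_, n - length(x)))))
colnames(padded) <- paste0("Product.", seq_len(n))
wide <- data.frame(Code = mydf$Code, padded, stringsAsFactors = FALSE)

# Long format: one row per Code/Product pair.
long <- data.frame(Code = rep(mydf$Code, lengths(parts)),
                   Product = unlist(parts),
                   stringsAsFactors = FALSE)

# 2. Re-read the column as its own data.frame; fill = TRUE pads short rows
#    instead of recycling values.
reread <- read.table(text = mydf$Product, sep = "|", fill = TRUE,
                     header = FALSE, colClasses = "character",
                     na.strings = "")
wide2 <- cbind(mydf["Code"], reread)

print(wide)
print(long)
print(wide2)
```

One caveat with the read.table route: it works out the number of columns from the first five lines of input, so on a larger data set where the longest row appears later you would probably need to pass col.names explicitly.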