I have a data frame that contains long character strings, each associated with a ‘Sample’:
Sample Data
1 000000000000000000000000000N01000000000000N0N000000000N00N0000NN00N0N000000100000N00N0N0000000NNNN011111111111111111111111111111110000000000000000000N000000N0000000000N
2 000000000000000000000000000N01000000000000N0N000000000N00N0000NN00N0N000000100000N00N0N0000000NNNN011111111111111111111111111111110000000000000000000N000000N0000000000N
I would like to code an easy way to break this string into 5 pieces in the following format:
Sample X
CCT6 - Characters 1-33
GAT1 - Characters 34-68
IMD3 - Characters 69-99
PDR3 - Characters 100-130
RIM15 - Characters 131-168
Giving an output that looks like this for each sample:
Sample 1
CCT6 - 000000000000000000000000000N01000
GAT1 - 000000000N0N000000000N00N0000NN00N0
IMD3 - N000000100000N00N0N0000000NNNN0
PDR3 - 1111111111111111111111111111111
RIM15 - 0000000000000000000N000000N0000000000N
I’ve been able to use the substr function to break the long string into individual pieces, but I’d like to be able to automate it so I can get all 5 pieces in one output. Ideally this output would also be a data frame.
This is what ?read.fwf is for.

First, some data that looks like your question:
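A minimal sketch of such example data follows. The object names `s` and `dat` are assumptions chosen for illustration; the string itself is copied from the question, and both samples share it, as in the question's table.

```r
# Long string copied from the question; both samples carry the same value.
s <- "000000000000000000000000000N01000000000000N0N000000000N00N0000NN00N0N000000100000N00N0N0000000NNNN011111111111111111111111111111110000000000000000000N000000N0000000000N"

# stringsAsFactors = FALSE keeps Data as character on R versions before 4.0.
dat <- data.frame(Sample = 1:2,
                  Data = c(s, s),
                  stringsAsFactors = FALSE)

str(dat)
```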
Now use read.fwf, specifying the widths of each field and their names, and that all columns should be of mode character. We wrap the text column of the example data in textConnection so that we can treat it like a connection understood generally by the read.* and other functions.

Now loop over the rows and print out each one as per your example:
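The read.fwf call and the printing loop described above might look like the following sketch. The widths are derived from the character ranges given in the question (1-33, 34-68, 69-99, 100-130, 131-168). The names `s`, `dat`, `genes`, `widths`, `con`, `fwf`, and `out` are assumptions for illustration, and the example data is rebuilt here so the block runs on its own.

```r
# Example data, as in the question (both samples share one string).
s <- "000000000000000000000000000N01000000000000N0N000000000N00N0000NN00N0N000000100000N00N0N0000000NNNN011111111111111111111111111111110000000000000000000N000000N0000000000N"
dat <- data.frame(Sample = 1:2, Data = c(s, s), stringsAsFactors = FALSE)

# Field names and widths, taken from the ranges listed in the question.
genes  <- c("CCT6", "GAT1", "IMD3", "PDR3", "RIM15")
widths <- c(33, 35, 31, 31, 38)

# Treat the character column as a connection and split it into
# fixed-width fields. colClasses = "character" keeps leading zeros.
con <- textConnection(dat$Data)
fwf <- read.fwf(con, widths = widths, col.names = genes,
                colClasses = "character")
close(con)  # read.fwf does not close a connection that was already open

# Keep the sample identifier alongside the pieces, as a data frame.
out <- data.frame(Sample = dat$Sample, fwf, stringsAsFactors = FALSE)

# Print each sample in the layout requested in the question.
for (i in seq_len(nrow(out))) {
  cat("Sample ", out$Sample[i], "\n", sep = "")
  for (g in genes) {
    cat(g, " - ", out[i, g], "\n", sep = "")
  }
}
```

Here `out` is an ordinary data frame with one row per sample and one character column per gene, so the pieces can be used directly without the printing loop.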
This gives, for each sample, output in the layout shown in the question.