I have a dataframe that looks like this
DF:
V1 V2 V3 V4 V5 V6 V7 V8
0 ss66369915 0 0 G A A A
0 ss66112992 0 0 A A A A
0 ss66369329 0 0 A A A A
0 ss66368644 0 0 A A A A
0 ss66368284 0 0 A A G A
0 ss66126380 0 0 A G A G
0 ss66407282 0 0 A A A A
0 ss66405035 0 0 A A A A
0 ss66405148 0 0 G G A G
0 ss66405271 0 0 G G G G
The data in columns V5 through V8 are biallelic genotypes, so I would like to merge every two columns together into one.
For example, it would look like:
V1 V2 V3 V4 V5_V6 V7 V8
0 ss66369915 0 0 GA A A
0 ss66112992 0 0 AA A A
0 ss66369329 0 0 AA A A
0 ss66368644 0 0 AA A A
0 ss66368284 0 0 AA G A
0 ss66126380 0 0 AG A G
0 ss66407282 0 0 AA A A
0 ss66405035 0 0 AA A A
0 ss66405148 0 0 GG A G
0 ss66405271 0 0 GG G G
I was able to do this using:
DF$V5_V6=paste(DF$V5, DF$V6, sep="")
or
within(DF, V5_V6 <- paste(V5, V6, sep=''))
However, my actual dataframe consists of 4776 columns, and I would have to merge every two columns from column 5 to column 4776.
I was wondering how I could achieve this without doing it manually. I tried to use a for loop, with no success. I am very new to R.
Thank you!
Maybe you can show the for loop you tried?
Here’s one approach using a loop that should do what you want, if I understand what you want. Specifically, this for loop will paste together the values of columns 5 & 6, 7 & 8, 9 & 10, etc. We use the names() function to extract the relevant column names and paste them together. We use [ to index into the object newdat that is created.
Results in:
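A minimal sketch of a loop along these lines, assuming the data frame is called DF as in the question, that columns 1 to 4 are kept unchanged, and that the genotype columns start at column 5 and come in complete pairs. Only newdat is a name taken from the description above; first_cols, i, and newname are illustrative, and the sample data is just the first three rows from the question:

```r
# First three rows of the example data from the question
DF <- data.frame(
  V1 = c(0, 0, 0),
  V2 = c("ss66369915", "ss66112992", "ss66369329"),
  V3 = c(0, 0, 0),
  V4 = c(0, 0, 0),
  V5 = c("G", "A", "A"),
  V6 = c("A", "A", "A"),
  V7 = c("A", "A", "A"),
  V8 = c("A", "A", "A"),
  stringsAsFactors = FALSE
)

# Assumption: columns 1-4 are identifiers and are carried over as they are
newdat <- DF[, 1:4]

# Assumption: an even number of genotype columns from column 5 onward,
# so every column in first_cols has a partner at position i + 1
first_cols <- seq(5, ncol(DF), by = 2)

for (i in first_cols) {
  # Build the new column name from the two original names, e.g. "V5_V6"
  newname <- paste(names(DF)[i], names(DF)[i + 1], sep = "_")
  # Paste the two alleles together and add them to newdat under that name
  newdat[newname] <- paste(DF[[i]], DF[[i + 1]], sep = "")
}

print(newdat)
```

With the full data, the same loop should work unchanged, because ncol(DF) sets the upper bound of the sequence. If the number of genotype columns were odd, the last iteration would index past the end of DF, so it is worth checking that before running it.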