I have two data.frames (only a subset is shown here because they are too large):
DF1:
"G1" "G2"
IL3RA ABCC1
SRSF9 ADAM19
IL22RA2 BIK
UROD ALG3
SLC35C2 GGH
OR12D3 SEC31A
OSBPL3 HIST1H2BK
DF2:
"S1" "S2" "S3"
IL3RA 0 0
SRSF9 1 1
A1CF 0 0
A1CF1 1 1
GGH 2 0
HIST1H2BK 0 0
AAK1 0 0
I would like the following output:
"G1" "S2" "S3" "G2" "S2" "S3"
IL3RA 0 0 GGH 2 0
SRSF9 1 1 HIST1H2BK 0 0
I applied a function that was suggested to me for a similar situation:
lapply(DF1, function(x) DF2[na.omit(match(DF2[[1]], x)), ])
Surprisingly, it doesn't work in this case, and I don't know why. I reproduced exactly the approach from the post titled "lop %in% over the columns of a data.frame" on my new data, without success. Since DF1 and DF2 are very large, I also ran it on the cluster to get more memory, in case available memory was the problem, but the result was the same. The output it gives is the following:
"S1" "S2" "S3"
IL3RA 0 0
SRSF9 1 1
"S1" "S2" "S3"
GGH 2 0
AAK1 0 0
What can I do to solve this?
This should do it.
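A minimal sketch of one way to get the requested output, rebuilt from the sample data in the question. The names hits1, hits2 and result are illustrative assumptions, and the sketch assumes both columns of DF1 match the same number of rows in DF2, as they do in the sample. The key point is that the values being looked up go first in match() and the table being searched goes second:

```r
# Sample data from the question
DF1 <- data.frame(
  G1 = c("IL3RA", "SRSF9", "IL22RA2", "UROD", "SLC35C2", "OR12D3", "OSBPL3"),
  G2 = c("ABCC1", "ADAM19", "BIK", "ALG3", "GGH", "SEC31A", "HIST1H2BK"),
  stringsAsFactors = FALSE
)
DF2 <- data.frame(
  S1 = c("IL3RA", "SRSF9", "A1CF", "A1CF1", "GGH", "HIST1H2BK", "AAK1"),
  S2 = c(0, 1, 0, 1, 2, 0, 0),
  S3 = c(0, 1, 0, 1, 0, 0, 0),
  stringsAsFactors = FALSE
)

# Row positions in DF2 of the genes listed in each column of DF1
hits1 <- na.omit(match(DF1$G1, DF2$S1))
hits2 <- na.omit(match(DF1$G2, DF2$S1))

# Side-by-side result; cbind() needs both blocks to have the same number of rows
result <- cbind(
  G1 = DF2$S1[hits1], DF2[hits1, c("S2", "S3")],
  G2 = DF2$S1[hits2], DF2[hits2, c("S2", "S3")]
)
rownames(result) <- NULL
print(result)
```

Rows come back in the order the genes appear in DF1. If DF2 can contain the same gene more than once, match() returns only the first occurrence; DF2[DF2$S1 %in% DF1$G1, ] would keep all of them, in DF2 order.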
Edit: Solution using lapply
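A hedged sketch of an lapply-based variant, assuming the sample data from the question (the names matched and wide are illustrative). The call in the question, match(DF2[[1]], x), returns positions within the DF1 column x, and those positions are then used to index the rows of DF2. For G2 that gives 5 and 7, so rows 5 and 7 of DF2 are returned, which is why AAK1 shows up instead of HIST1H2BK. Swapping the arguments, match(x, DF2[[1]]), returns positions within DF2:

```r
# Sample data from the question
DF1 <- data.frame(
  G1 = c("IL3RA", "SRSF9", "IL22RA2", "UROD", "SLC35C2", "OR12D3", "OSBPL3"),
  G2 = c("ABCC1", "ADAM19", "BIK", "ALG3", "GGH", "SEC31A", "HIST1H2BK"),
  stringsAsFactors = FALSE
)
DF2 <- data.frame(
  S1 = c("IL3RA", "SRSF9", "A1CF", "A1CF1", "GGH", "HIST1H2BK", "AAK1"),
  S2 = c(0, 1, 0, 1, 2, 0, 0),
  S3 = c(0, 1, 0, 1, 0, 0, 0),
  stringsAsFactors = FALSE
)

# For each column of DF1, look its genes up in the first column of DF2
matched <- lapply(DF1, function(x) DF2[na.omit(match(x, DF2[[1]])), ])

# Rename the gene column after the DF1 column it came from, then bind the
# pieces side by side (assumes every piece has the same number of rows)
pieces <- Map(function(d, nm) {
  names(d)[1] <- nm
  rownames(d) <- NULL
  d
}, matched, names(matched))
wide <- do.call(cbind, unname(pieces))
print(wide)
```

The unname() call keeps cbind() from prefixing the column names with the list names, so the header stays G1, S2, S3, G2, S2, S3 as in the requested output.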