> imqd = read.csv("csv/quest/IMQ.csv")
> demod = read.csv("csv/DEMO.csv")
> mcqd = read.csv("csv/quest/MCQ.csv")
>
> length(demod)
[1] 145
> length(demod[[1]])
[1] 9965
> length(mcqd)
[1] 168
> length(mcqd[[1]])
[1] 9493
> length(imqd)
[1] 5
> length(imqd[[1]])
[1] 9965
>
> mydata = merge(imqd, demod)
> length(mydata)
[1] 148
> length(mydata[[1]])
[1] 9965
So far, so good. But if I try to merge mcqd with anything, I lose most of my rows, even though the data looks good to me.
> intersect(intersect(names(imqd), names(mcqd)), names(demod))
[1] "X" "seqn"
> finaldata = merge(mydata, mcqd)
> length(finaldata)
[1] 314
> length(finaldata[[1]])
[1] 18
Why are there only 18 rows now?
If you want to play along at home, you can get the csv files here.
merge is trying to return only those rows which match on each of the common columns. Looking at MCQ.csv, we see that the 20th row starts off:
merge will not use this row. The two common columns, X and seqn (the first and second columns of each file), do not both match in every file. The primary key to merge on is clearly seqn, so we can simply pass by = "seqn" to merge.
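To make the difference concrete, here is a minimal sketch on two toy data frames. The frames a and b and their values are made up for illustration and are not taken from the real CSV files; only the column names X and seqn mirror the question. X is assumed here to be a row counter that drifts out of step with seqn as soon as one file is missing a record.

```r
# Toy stand-ins for the real files (invented values, not the actual data).
# X is a plain row counter; seqn is the real key.
a <- data.frame(X = 1:4, seqn = c(101, 102, 103, 104), age = c(30, 41, 52, 63))
b <- data.frame(X = 1:3, seqn = c(101, 103, 104), mcq = c("y", "n", "y"))

# Default behaviour: merge joins on every shared column name (X and seqn),
# so a row survives only if both values agree.
both <- merge(a, b)
nrow(both)

# Join on the key only. The clashing X columns are kept as X.x and X.y.
keyed <- merge(a, b, by = "seqn")
nrow(keyed)
names(keyed)
```

With these toy frames the default merge keeps a single row, while the keyed merge keeps all three rows whose seqn appears in both frames. If the X.x and X.y columns are unwanted, one option is to drop X from each data frame before merging.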