I have two csv files:
csv1 <- data.frame(y=c("classA", "classB", "classA", "classB", "classA", "classC"),
                   DBID=c("d1", "d1", "d2", "d3", "d3", "d3"))
       y DBID
1 classA   d1
2 classB   d1
3 classA   d2
4 classB   d3
5 classA   d3
6 classC   d3
csv2 <- data.frame(tm=c("t1","t1","t2"),
                   y=c("classA","classC","classB"))
  tm      y
1 t1 classA
2 t1 classC
3 t2 classB
I want to build a table by matching column y across the two files, i.e.
t1 has classA and classC in csv2, so every DBID classified as classA or classC in csv1 (d1, d2 and d3 for classA, plus d3 for classC) is listed in the resulting data frame, with t1 in the first column and the DBID in the second column.
t2 has classB in csv2, so every DBID classified as classB in csv1 (d1 and d3) is listed in the resulting data frame, with t2 in the first column and the DBID in the second column.
The resulting data frame should look like this:
tm  DBID  endcol
t1  d1    1
t1  d2    1
t1  d3    1
t1  d3    1
t2  d1    1
t2  d3    1
How can I do this in R?
Maybe merge? You can add the column of all ones yourself.
By default, merge joins the two data frames on the columns with identical names, which is why I didn't have to pass any other arguments. If other column names also match, you'll need to specify the by argument explicitly to get the behavior you want.
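A minimal sketch of the merge approach, assuming the csv1 and csv2 data frames from the question. The variable name res, the explicit by = "y", and the sorting step are illustrative additions; a plain merge(csv2, csv1) would give the same rows here, because y is the only shared column name.

```r
csv1 <- data.frame(y = c("classA", "classB", "classA", "classB", "classA", "classC"),
                   DBID = c("d1", "d1", "d2", "d3", "d3", "d3"))
csv2 <- data.frame(tm = c("t1", "t1", "t2"),
                   y = c("classA", "classC", "classB"))

# Inner join on y: one output row per matching (csv2 row, csv1 row) pair.
res <- merge(csv2, csv1, by = "y")

# Keep only the requested columns and sort by tm, then DBID, for readability.
res <- res[order(res$tm, res$DBID), c("tm", "DBID")]

# Add the constant column of ones and reset the row names.
res$endcol <- 1
rownames(res) <- NULL
res
```

If the key columns had different names in the two data frames, merge also accepts by.x and by.y to name them separately.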