I have two dataframes, looking sort of like:
Source_name <- c("name1", "name2", "name3", "name4", "name5")
Target_name <- c("name10", "name11", "name12", "name13", "name14")
values <- c("asd", "213", "kahsd", "a9u", "oau92")
values2 <- c("asdd", "oau892", "kahsd", "213", "213")
dat <- cbind(Source_name, values)
daf <- cbind(Target_name, values2)
dat
     Source_name values
[1,] "name1"     "asd"
[2,] "name2"     "213"
[3,] "name3"     "kahsd"
[4,] "name4"     "a9u"
[5,] "name5"     "oau92"
daf
     Target_name values2
[1,] "name10"    "asdd"
[2,] "name11"    "oau892"
[3,] "name12"    "kahsd"
[4,] "name13"    "213"
[5,] "name14"    "213"
Each value only occurs once in dat, but may occur more than once in daf (or not at all). I would like to record those values in dat that occur at most once in daf, as per the desired_output data.frame.
unique_values <- c( "asd", "kahsd", "a9u", "oau92")
Source_name <- c( "name1", "name3", "name4", "name5")
Target_name <- c( "NA", "name12", "NA", "NA")
desired_output <- data.frame(cbind(unique_values, Source_name, Target_name))
desired_output
  unique_values Source_name Target_name
1           asd       name1
3         kahsd       name3      name12
2           a9u       name4
4         oau92       name5
I imagine there's an easy way to do this using apply or something, but I'm stumped.
You could merge your two data.frames:
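A minimal sketch of that merge, assuming the inputs are first rebuilt as data frames (`cbind()` on its own gives character matrices) and that a left join on the value columns is what is wanted. The name `merged` is only for illustration:

```r
# Recreate the question's data as data frames rather than matrices.
dat <- data.frame(Source_name = c("name1", "name2", "name3", "name4", "name5"),
                  values = c("asd", "213", "kahsd", "a9u", "oau92"),
                  stringsAsFactors = FALSE)
daf <- data.frame(Target_name = c("name10", "name11", "name12", "name13", "name14"),
                  values2 = c("asdd", "oau892", "kahsd", "213", "213"),
                  stringsAsFactors = FALSE)

# Left join: all.x = TRUE keeps every row of dat and fills Target_name
# with NA where daf has no matching value.
merged <- merge(dat, daf, by.x = "values", by.y = "values2", all.x = TRUE)
merged
```

A value that matches several rows of `daf` (here "213") appears once per match, so `merged` has more rows than `dat`.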
Then remove rows with values that show up twice or more:
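One way to sketch that filtering step, assuming the merged result is stored in a hypothetical `merged` data frame as built below: collect every value that `duplicated()` flags, then drop all rows carrying such a value.

```r
# Rebuild the data and the left join so this snippet runs on its own.
dat <- data.frame(Source_name = c("name1", "name2", "name3", "name4", "name5"),
                  values = c("asd", "213", "kahsd", "a9u", "oau92"),
                  stringsAsFactors = FALSE)
daf <- data.frame(Target_name = c("name10", "name11", "name12", "name13", "name14"),
                  values2 = c("asdd", "oau892", "kahsd", "213", "213"),
                  stringsAsFactors = FALSE)
merged <- merge(dat, daf, by.x = "values", by.y = "values2", all.x = TRUE)

# duplicated() marks the second and later occurrences of a value;
# %in% then removes every row with such a value, including the first one.
dup_vals <- merged$values[duplicated(merged$values)]
result <- merged[!merged$values %in% dup_vals, ]
result
```

With the example data this leaves four rows, and only "kahsd" has a non-missing `Target_name`.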
Or as @hadley pointed out in the comments (thanks!):
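The code from that comment is not shown here, so the following is only a guess at a more compact alternative and not necessarily what was suggested: count how often each value occurs with `ave()` and keep the rows whose count is 1. The names `merged`, `n` and `result` are illustrative.

```r
# Rebuild the data and the left join so this snippet runs on its own.
dat <- data.frame(Source_name = c("name1", "name2", "name3", "name4", "name5"),
                  values = c("asd", "213", "kahsd", "a9u", "oau92"),
                  stringsAsFactors = FALSE)
daf <- data.frame(Target_name = c("name10", "name11", "name12", "name13", "name14"),
                  values2 = c("asdd", "oau892", "kahsd", "213", "213"),
                  stringsAsFactors = FALSE)
merged <- merge(dat, daf, by.x = "values", by.y = "values2", all.x = TRUE)

# ave() applies length() within each group of equal values, giving
# a per-row count of how many times that row's value appears.
n <- ave(seq_along(merged$values), merged$values, FUN = length)
result <- merged[n == 1, ]
result
```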