I have found myself doing a “conditional left join” several times in R. To illustrate, suppose you have two data frames such as:
> df
a b
1 1 0
2 2 0
> other.df
a b
1 2 3
The goal is to end up with this data frame:
> final.df
a b
1 1 0
2 2 3
The code I’ve written so far:
c <- merge(df, other.df, by=c("a"), all.x = TRUE)
c[is.na(c$b.y),]$b.y <- 0
d<-subset(c, select=c("a","b.y"))
colnames(d)[2]<-"b"
to finally arrive at the result I wanted.
Doing this in effectively four lines makes the code very opaque.
Is there any better, less cumbersome way to do this?
Here are two ways. In both cases the first line does a left merge returning the required columns. In the case of merge we then have to set the names. The final line in both cases replaces NAs with 0.
merge
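A minimal sketch of the merge approach described above, assuming df and other.df are the data frames from the question (rebuilt here so the snippet runs on its own). The name res1 is only illustrative.

```r
# Data frames from the question, rebuilt so the example is self-contained
df <- data.frame(a = c(1, 2), b = c(0, 0))
other.df <- data.frame(a = 2, b = 3)

# Left merge on "a"; the result has columns a, b.x, b.y, so drop b.x (column 2)
res1 <- merge(df, other.df, by = "a", all.x = TRUE)[-2]
# Restore the original column names (b.y becomes b)
names(res1) <- names(df)
# Rows with no match in other.df have NA in b; replace those with 0
res1[is.na(res1)] <- 0
```

Dropping b.x by position keeps this to three lines, but it relies on b being the only non-key column shared by the two data frames.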
sqldf
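A minimal sketch of the sqldf approach, assuming the sqldf package is installed and that the dotted data frame name other.df can be referenced by quoting it with backticks in the SQL. The name res2 and the alias o are illustrative.

```r
library(sqldf)

# Data frames from the question, rebuilt so the example is self-contained
df <- data.frame(a = c(1, 2), b = c(0, 0))
other.df <- data.frame(a = 2, b = 3)

# Left join on "a", keeping a and the b column from other.df (aliased as o);
# the backticks are needed because the table name contains a dot
res2 <- sqldf("select a, o.b from df left join `other.df` o using(a)")
# Rows with no match in other.df have NA in b; replace those with 0
res2[is.na(res2)] <- 0
```

Because the select list names only the required columns, no renaming step is needed here.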