These are the definitions of df1 and df2:
df1 <- data.frame(x = 1:3, y = letters[1:3])
df2 <- data.frame(x = rep(c(1, 2, 3), each = 3))
I want to assign the value of column y in df1 to column y in df2, where the value in column x of df1 is equal to the value in column x of df2. As shown above df1 and df2 are of unequal length.
for (i in 1:length(df2$x)) {
  df2$y[i] <- df1$y[which(df1$x == df2$x[i])]
}
I am not looking for shortcuts to do this (no built-in functions, please). I want to learn it the right way.
Is my logic correct?
If it is, why is this not working?
Any guidance will be highly appreciated.
Taking what you call “shortcuts” is actually the right way to do things in R. That said, looping through manually is sometimes a good exercise. In your “production code”, i.e. code that you want to count on, use the built-in functions when they are applicable.
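For comparison, here is a minimal sketch of what the built-in route could look like for this lookup, using base R's match(). The stringsAsFactors = FALSE argument is spelled out here as an assumption so the result is the same on any R version.

```r
# Same data as in the question; y is kept as plain character strings
df1 <- data.frame(x = 1:3, y = letters[1:3], stringsAsFactors = FALSE)
df2 <- data.frame(x = rep(c(1, 2, 3), each = 3))

# match() gives, for every value in df2$x, the position of that value in df1$x;
# indexing df1$y with those positions does the whole lookup without a loop
df2$y <- df1$y[match(df2$x, df1$x)]
print(df2)
```

merge(df2, df1, by = "x") is another base R option when you want a joined data frame rather than a single new column.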
You’re just missing one option for your data.frame. Everything else is fine. The problem is that by default (in R versions before 4.0.0), character vectors are stored as factors in a data.frame, and when you try to replace a value with a value from a factor vector, it gets replaced with the underlying numeric index of that level. The fix is to pass stringsAsFactors = FALSE when you create df1. See ?data.frame for more info on the stringsAsFactors option.
Since you seem interested in learning, here’s a way you might have gone about debugging. Suppose your original commands are in a file called temp.R and you have just run it. Then i is left over after the for loop. Let’s use it so that your following commands with i in them will work. You can reassign a value to i to see what your command would give for other values. Now let’s start breaking your code down to see where the problem is. First, look at df2$x[i]. Looks good so far. 3 is what we expect it to be, right? Next, print the right-hand side of the assignment, df1$y[which(df1$x == df2$x[i])].
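The console output for these first checks is not preserved here, so the following is only a sketch of how they might be reproduced. It assumes the data from the question, with stringsAsFactors = TRUE written out explicitly so the old default behaviour shows up on current R versions too; the name rhs is a hypothetical helper.

```r
# Recreate the question's data with the old default made explicit
df1 <- data.frame(x = 1:3, y = letters[1:3], stringsAsFactors = TRUE)
df2 <- data.frame(x = rep(c(1, 2, 3), each = 3))

# After for (i in 1:length(df2$x)) has finished, i holds the last index
i <- nrow(df2)

# Step 1: the value being looked up
print(df2$x[i])

# Step 2: the right-hand side of the assignment
rhs <- df1$y[which(df1$x == df2$x[i])]
print(rhs)          # the printout includes a "Levels:" line
print(class(rhs))   # "factor"
```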
Here is where you need to recognize “oh, this is a factor!”. Whenever you see “Levels” the “factor” lightbulb should be lighting up in your head.
Let’s see the current value of df2$y[i] before we try the replacement, just to be sure the rest of your code didn’t accidentally modify it.
Looks good. We know what happens after the replacement, so clearly something goes wrong with the assignment. Let’s try this just to see what happens:
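The experiment that originally followed is not preserved, so here is a minimal sketch of one that fits the description. It isolates the assignment on a single value; cell and rhs are hypothetical names, and stringsAsFactors = TRUE is set explicitly as an assumption to reproduce the old default.

```r
df1 <- data.frame(x = 1:3, y = letters[1:3], stringsAsFactors = TRUE)

rhs <- df1$y[3]    # a factor value whose label is "c"
cell <- NA         # stand-in for an empty, non-factor target
print(cell)        # value before the assignment

cell[1] <- rhs     # same kind of assignment as inside the loop
print(cell)        # the integer level code 3, not the label "c"

# Converting the factor first recovers the label
print(as.character(rhs))
```

The target is not a factor, so only the integer code stored inside the factor is copied across and the label is dropped.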
Clearly something is wrong. Thus we’ve narrowed the problem down to here. Now we need to go back up to find out why we’re replacing with a factor. Hopefully that will lead you to the data.frame help.
Things like this are annoying in R, but you just have to have faith that there are reasons for behavior like this, and once you learn more coding in R and more of R’s philosophy, you won’t have as many surprises such as this. Good luck!
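For reference, a sketch of the corrected code with the stringsAsFactors option set. The loop is the one from the question; seq_along() is swapped in for 1:length() as a small stylistic choice, not part of the fix.

```r
# stringsAsFactors = FALSE keeps y as a character vector instead of a factor
df1 <- data.frame(x = 1:3, y = letters[1:3], stringsAsFactors = FALSE)
df2 <- data.frame(x = rep(c(1, 2, 3), each = 3))

for (i in seq_along(df2$x)) {
  # look up the row of df1 whose x matches, and copy its y label
  df2$y[i] <- df1$y[which(df1$x == df2$x[i])]
}
print(df2)
```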