Given two data frames:
C1<-c(3,4,4,4,5)
C2<-c(3,7,3,4,5)
C3<-c(5,6,3,7,4)
DF<-data.frame(C1=C1,C2=C2,C3=C3)
DF
C1 C2 C3
1 3 3 5
2 4 7 6
3 4 3 3
4 4 4 7
5 5 5 4
and
V1<-c(3,2,2,4,5)
V2<-c(3,7,3,5,2)
V3<-c(5,2,5,7,5)
V4<-c(1,1,2,3,4)
V5<-c(1,2,6,7,5)
DF2<-data.frame(V1=V1,V2=V2,V3=V3,V4=V4,V5=V5)
DF2
V1 V2 V3 V4 V5
1 3 3 5 1 1
2 2 7 2 1 2
3 2 3 5 2 6
4 4 5 7 3 7
5 5 2 5 4 5
Looking at each equivalent row in both data frames, there is a relationship between the value in C3 and the number of columns I want to drop in that same row in DF2.
The relationship between the value in C3 and the # of columns in DF2 to drop looks like this
If C3≥7 drop V5
If C3=6.0:6.9 drop V4 and up (so basically V5,V4)
If C3=5.0:5.9 drop V3 and up (so basically V5,V4,V3)
If C3=4.0:4.9 drop V2 and up (so basically V5,V4,V3,V2)
If C3≤3.9 drop entire row
For this example, based on the values of C3, I would want DF2 to look like this
V1 V2 V3 V4 V5
1 3 3
2 2 7 2
4 4 5 7 3
5 5
I’ve tried write a simple script to do this (I’m pretty new so I like to keep things simple so I can see what’s going on) but I’m throwing errors left and right so I’d appreciate some advice on how to proceed
I like Koshke’s answer, but if your rules for setting to
NAdon’t have a nice mathematical property to them or you need to define your rules arbitrarily, this approach should give you that flexibility. First, define a function that returns the columns to drop based on your rules:Next, create a list for the column indices to drop:
Finally, loop through each row setting the corresponding columns to NA: