I have a data set, raw.data.2010, that needs several rounds of subsetting by animal species, and I need to name each subset accordingly after every filtering step. I wrote the simple code below:
#Creating reproducible data######
set.seed(1)  # fix the random seed so sample() gives the same values on every run
site=rep(c("Q", "R", "S", "T"), each=500)
grid=sample(1:2, size=2000, replace=TRUE)
spp=rep(c("A", "B", "C", "D", "E"), each=400)
fate=sample(1:5, size=2000, replace=TRUE)
sex=rep(c("M","F"), each=1000)
weight=sample(85:140, size=2000, replace=TRUE)
# all columns have 2000 values; data.frame() keeps their types (cbind() would not)
raw.data=data.frame(site, grid, spp, fate, sex, weight)
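### main code #####
spp=c("A", "B", "C", "D", "E")
for (i in spp){
  # build the variable names for this species, e.g. "A.raw" and "selectA"
  name=paste(i, "raw", sep=".", collapse="")
  filter=paste("select",i, sep="", collapse="")
  # logical filter: TRUE for the rows that belong to species i
  assign(filter, raw.data$spp==i)
  # subset the raw data with that filter and store the result under `name`
  assign(name, raw.data[get(filter),])
}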
I checked the filters and they work without problems, but the last line does not: every subset I call returns NA. What is wrong? Thank you.
EDIT: Hi, thank you all for your advice. I edited my code so that it is reproducible. Basically, I would like to first filter raw.data by spp, and then keep adding more filters to group the data by site, grid, fate, etc. I need to be able to access each filtered data set individually so I can manipulate it later, e.g. calculate weight and other measurements for different sex or age groups. I want to be able to call A.raw and A.Q.data later.
I would like to analyze my data at different levels (e.g. population level, individual level, site/grid level) and be able to pool or split them as needed; that is the purpose of this code. I hope my explanation isn't confusing.
You will probably save yourself a lot of work and grief in the long run if you move away from using global variables with `assign` and `get` and instead work with lists (and remember to subset using `[[` instead of `$`).
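As a rough illustration of the list-based approach, here is a minimal sketch that assumes base R's `split()` is an acceptable way to build the lists. The names `by.spp`, `by.spp.site`, `mean.weight` and `wt.by.sex` are made up for this example, and the small data frame is only a stand-in with the same column names as in the question.

```r
# Stand-in for raw.data: same column names as the question, made-up values
set.seed(42)
raw.data <- data.frame(
  site   = rep(c("Q", "R", "S", "T"), each = 500),
  spp    = rep(c("A", "B", "C", "D", "E"), times = 400),
  sex    = sample(c("M", "F"), size = 2000, replace = TRUE),
  weight = sample(85:140, size = 2000, replace = TRUE),
  stringsAsFactors = FALSE
)

# One data frame per species, kept in a named list
# (by.spp[["A"]] plays the role of the global variable A.raw)
by.spp <- split(raw.data, raw.data$spp)

# Second level of filtering: split each species by site, giving a list of lists
# (by.spp.site[["A"]][["Q"]] plays the role of A.Q.data)
by.spp.site <- lapply(by.spp, function(d) split(d, d$site))

# The same calculation can be applied to every subset without building names
mean.weight <- sapply(by.spp, function(d) mean(d$weight))
wt.by.sex   <- lapply(by.spp, function(d) tapply(d$weight, d$sex, mean))
```

Because the subsets live in one list, a species can be picked programmatically, e.g. `for (i in names(by.spp)) summary(by.spp[[i]]$weight)`, and pooling everything again is a single call such as `do.call(rbind, by.spp)`.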