While writing convenience functions for subset(), I ran into a strange situation where using equivalent logical statements returns different subsets. So, for example:
dat = data.frame(ttl.stims = c(4,4,8,8), change = c('big', 'small'))
dat
ttl.stims = 4
#logical statements are equivalent
dat$ttl.stims == 4
dat$ttl.stims == ttl.stims
#subset evaluates differently
subset(dat, dat$ttl.stims == 4)
subset(dat, dat$ttl.stims == ttl.stims)
I’ve been working around this by doing:
index = dat$ttl.stims == ttl.stims
subset(dat, index)
But I’m so curious about why the first two subsets don’t produce identical results! Ideas? Thoughts? Pontifications?
Inside the call to subset, the symbol ttl.stims gets interpreted in the environment of dat first, so it becomes (after interpretation) dat$ttl.stims. The condition is then dat$ttl.stims == dat$ttl.stims, which is TRUE for every row. I predict that the second call to subset returns the entire data frame.
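To make that concrete, here is a minimal sketch that rebuilds the data frame from the question and compares the calls. The names all_rows, target, by_rename and by_bracket are made up for illustration and are not from the original post; the two workarounds shown are just two common options, not the only ones.

```r
# Rebuild the example data from the question.
dat <- data.frame(ttl.stims = c(4, 4, 8, 8), change = c("big", "small"))
ttl.stims <- 4

# Inside subset(), ttl.stims is looked up in dat before the calling
# environment, so this compares the column with itself and keeps every row.
all_rows <- subset(dat, dat$ttl.stims == ttl.stims)
nrow(all_rows)

# Workaround 1 (hypothetical name): store the value under a name that is
# not a column of dat, so the lookup falls through to the calling environment.
target <- 4
by_rename <- subset(dat, ttl.stims == target)
nrow(by_rename)

# Workaround 2: index with [ , which evaluates the condition normally
# in the calling environment, where ttl.stims is the scalar 4.
by_bracket <- dat[dat$ttl.stims == ttl.stims, ]
nrow(by_bracket)
```

The index workaround in the question works for the same reason as the second option: the comparison is evaluated outside subset(), before the column name can shadow the variable.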