I have a script that runs a series of quality control checks, and it tripped on a dataset in which no samples (rows) needed to be removed. Instead of leaving the data unchanged, the script returned a dataframe with zero rows. With example data, why does this work:
data(iris)
##get rid of those pesky factors
iris$Species <- NULL
med <- which(iris[, 1] < 4.9)
medtemp <- iris[-med, ]
dim(medtemp)
[1] 134 4
but this returns a dataframe of zero rows:
small <- which(iris[, 1] < 4.0)
smalltemp <- iris[-small, ]
dim(smalltemp)
[1] 0 4
As does this:
x <- 0
zerotemp <- iris[-x, ]
dim(zerotemp)
[1] 0 4
It seems that the smalltemp dataframe should be the same size as iris since there are no rows to remove at all. Why is this?
Copied verbatim from Patrick Burns’s R Inferno p. 41 (I hope this constitutes “fair use” — if someone objects I’ll remove it)
negative nothing is something
The command above returns all of the values in x2 not equal to 3.
The hope is that the above command returns all of x2 since no elements are equal to 5. Reality will dash that hope. Instead it returns a vector of length zero.
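To make the behavior concrete, here is a minimal sketch. It assumes `x2` is the vector `1:4`, which is my assumption and is not shown in the quoted text. The key point is that negating a zero-length index gives another zero-length index, so R is asked to select nothing rather than to drop nothing.

```r
x2 <- 1:4                          # assumed example vector

some <- which(x2 == 3)             # one match, at position 3
kept_some <- x2[-some]             # drops position 3, leaving 1 2 4

none <- which(x2 == 5)             # no match: integer(0)
same_index <- identical(-none, none)  # TRUE, negating "nothing" is still "nothing"
kept_none <- x2[-none]             # selects nothing, a vector of length zero
```

The same thing happens in the question: `which(iris[, 1] < 4.0)` is `integer(0)`, so `iris[-small, ]` is `iris[integer(0), ]`, which has zero rows.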
There is a subtle difference between the two following statements:
Subtle difference in the input, but no subtlety in the difference in the output.
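One pair of statements that shows this kind of contrast is an empty subscript versus a zero-length subscript. This is a sketch with a hypothetical vector `x`, and the pairing is my assumption about what is being compared:

```r
x <- c(10, 20, 30)                 # hypothetical example vector

all_of_x  <- x[]                   # empty subscript: every element is returned
none_of_x <- x[numeric(0)]         # zero-length subscript: nothing is returned
```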
There are at least three possible solutions for the original problem.
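One of them, sketched here under the assumption that `x2` is `1:4`, is to check the length of the index before negating it and to fall back to the whole object when there is nothing to remove:

```r
x2 <- 1:4                          # assumed example vector
out <- which(x2 == 5)              # integer(0), nothing to remove
kept <- if (length(out)) x2[-out] else x2

# The same guard applied to the data from the question
small <- which(iris[, 1] < 4.0)    # no Sepal.Length below 4.0
smalltemp <- if (length(small)) iris[-small, ] else iris
```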
Another solution is to use logical subscripts:
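A minimal sketch of the logical-subscript approach, again assuming `x2` is `1:4` and reusing the question's `iris` data:

```r
x2 <- 1:4                          # assumed example vector
kept <- x2[!(x2 %in% 5)]           # keep everything that is not 5

# The same idea for the rows in the question
keep_rows <- !(iris[, 1] < 4.0)    # TRUE for every row that should stay
smalltemp <- iris[keep_rows, ]
medtemp   <- iris[!(iris[, 1] < 4.9), ]
```

Note that a logical subscript treats `NA` differently from `which()`: an `NA` in the condition produces a row of `NA`s, whereas `which()` silently skips it. That matters only if the column can contain missing values.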
Or you can, in a sense, work backwards:
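A sketch of the backwards approach, assuming `x2` is `1:4`: start from the full set of positions and take away the ones to drop with `setdiff()`, so that an empty drop list leaves every position in place.

```r
x2 <- 1:4                          # assumed example vector
kept <- x2[setdiff(seq_along(x2), which(x2 == 5))]

# The same idea for the rows in the question
small <- which(iris[, 1] < 4.0)    # integer(0)
smalltemp <- iris[setdiff(seq_len(nrow(iris)), small), ]
```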