I have read in a CSV and would like to find “empty” rows and columns, applying something like
isempty = function(x) all(is.na(x) | x == 0 | x == "")
to all columns. The first column is of mode character, all others are numeric.
However, when I do emptycols = apply(mydf, 2, isempty) the logical vector that is returned is all FALSE.
When I try emptycols = apply(mydf[ , -1], 2, isempty) it works perfectly, returning a logical vector which is TRUE for all “empty” columns.
I am aware that I could just use sapply, which works fine, but I still wonder: what causes this behaviour? How can the first (character) column affect the application of my function to all the other columns?
@Backlin was right. If you change isempty like this:
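A minimal sketch of such a change, assuming the goal is simply to see what kind of vector the function receives. The name isempty_verbose is made up for illustration:

```r
# Hypothetical diagnostic variant of isempty: it reports the class of the
# vector it receives, then runs the same test as the original function.
isempty_verbose <- function(x) {
  cat("received a", class(x)[1], "vector\n")
  all(is.na(x) | x == 0 | x == "")
}

res_num <- isempty_verbose(c(0, NA, 0))       # numeric input
res_chr <- isempty_verbose(c("a", "b", "c"))  # character input
```

Passing this to apply instead of isempty prints one line per column, which makes any type conversion visible.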
The following results show what happens:
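As a sketch of how to reproduce this, here is a toy data frame (invented for illustration, not the asker's CSV) run through apply with and without its character column. Note that ?as.matrix documents that, when a non-numeric column is present, the other columns are passed through format(), so a zero can come out as a padded string rather than a number.

```r
# Toy data (made up for illustration): one character column,
# one "empty" numeric column (x) and one non-empty numeric column (y).
mydf <- data.frame(id = c("a", "b", "c"),
                   x  = c(0, NA, 0),
                   y  = c(1, 2, 3),
                   stringsAsFactors = FALSE)

isempty <- function(x) all(is.na(x) | x == 0 | x == "")

# apply() coerces its first argument with as.matrix() before looping.
m_all <- as.matrix(mydf)        # character column included
m_num <- as.matrix(mydf[, -1])  # numeric columns only

typeof(m_all)  # "character"
typeof(m_num)  # "double"
m_all[, "x"]   # inspect how the zeros were formatted as strings

with_id    <- apply(mydf, 2, isempty)        # works on character vectors
without_id <- apply(mydf[, -1], 2, isempty)  # works on numeric vectors
```

Comparing with_id and without_id shows the difference described in the question.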
Quoting @Backlin: “the first column causes apply to turn your data frame into a character matrix, in which "0" would not match 0. However, when you [,-1] it gets turned into a numeric matrix and it works fine.”
sapply behaves better, because it hands each column to the function as its own vector:
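A short sketch, again using an invented toy data frame rather than the original CSV:

```r
# Toy data (made up for illustration).
mydf <- data.frame(id = c("a", "b", "c"),
                   x  = c(0, NA, 0),
                   y  = c(1, 2, 3),
                   stringsAsFactors = FALSE)

isempty <- function(x) all(is.na(x) | x == 0 | x == "")

# A data frame is a list of columns, and sapply() iterates over that list,
# so no matrix conversion happens and each column keeps its own type.
emptycols <- sapply(mydf, isempty)

# Keep only the columns that are not "empty".
cleaned <- mydf[, !emptycols, drop = FALSE]
```

Here emptycols flags only x, and cleaned retains id and y.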