I’m trying to apply a multivariate function across a data.frame with ddply, in order to detect multivariate outliers per group. I expect to obtain a vector or a new column containing 1 (inliers) and 0 (outliers), using the wfinal01 value returned by the sign1 function of the mvoutlier package. The following code is an example of what I have tried so far, without success:
library(plyr)
library(mvoutlier)
data(coffee)
myFunc<- function(X) sign1(unclass(X), qcrit=0.975)$wfinal01
ddply(coffee, .(sort), transform, outliers=myFunc(c(Metpyr, `5-Met`, furfu)))
The following error message is returned.
Error in apply(x, 2, mad) : dim(X) must have a positive length
Your problem is that `c` creates a numeric vector, whereas you want to pass a matrix containing three columns. You can use `cbind` to do this. Vectors have only one dimension (and no `dim` attribute), while `apply(x, 2, mad)` requires a matrix or an array with at least 2 dimensions (hence the error).

Edit: reference by columns
I think referencing columns by number is dangerous; however, it is possible if you use `data.table`. `data.table` will also be faster and more efficient than `ddply`. You could just as easily (and more explicitly) pass `c('Metpyr', '5-Met', 'furfu')` as the argument to `.SDcols`.
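
As a rough sketch of both suggestions in this answer: the first part uses only base R with made-up numbers to show why `c` triggers the error and `cbind` does not. The second and third parts show how the `ddply` call and a `data.table` call with named `.SDcols` might look. This assumes `coffee` has the columns used in the question, and `myFunc` is rewritten here with `as.matrix` (my substitution for `unclass`) so that it also accepts the `.SD` data.table. The package-dependent parts only run if the packages are installed.

```r
# Part 1 (base R only): c() flattens, cbind() keeps one column per variable.
a <- c(1.2, 3.4, 2.2, 5.1)   # made-up illustration values
b <- c(0.5, 0.7, 0.1, 0.9)

v <- c(a, b)       # one long vector of length 8, no dim attribute
m <- cbind(a, b)   # 4 x 2 matrix

is.null(dim(v))    # TRUE, so apply(v, 2, mad) stops with the error above
dim(m)             # 4 2
col_mads <- apply(m, 2, mad)   # works: one MAD per column

# Part 2: the ddply call from the question, with c() replaced by cbind().
if (requireNamespace("plyr", quietly = TRUE) &&
    requireNamespace("mvoutlier", quietly = TRUE)) {
  library(plyr)
  library(mvoutlier)
  data(coffee)

  # as.matrix() is an assumption: it handles both matrices and data.tables
  myFunc <- function(X) sign1(as.matrix(X), qcrit = 0.975)$wfinal01

  res_ddply <- ddply(coffee, .(sort), transform,
                     outliers = myFunc(cbind(Metpyr, `5-Met`, furfu)))

  # Part 3: data.table version, selecting columns by name via .SDcols.
  if (requireNamespace("data.table", quietly = TRUE)) {
    library(data.table)
    DT <- as.data.table(coffee)
    DT[, outliers := myFunc(.SD), by = "sort",
       .SDcols = c("Metpyr", "5-Met", "furfu")]
  }
}
```

Note that inside `.SDcols` the names are ordinary strings, so `'5-Met'` is quoted rather than wrapped in backticks; backticks are only needed when the column is referred to as a bare symbol, as in the `ddply` call.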