I have a data frame with 1488 observations and 400 variables. I am trying to take the log of every value in the table and then remove the outliers with the rm.outlier command from the outliers package. The only problem is that I get this error:
Error in data.frame(V1 = c(-0.886056647693163, -0.677780705266081, -1.15490195998574, : arguments imply differing number of rows: 1487, 1480, 1481, 1475, 1479, 1478, 1483, 1485, 1484, 1477, 1482, 1469
This is my code:
datalog <- matrix(0,nrow(data),ncol(data))
datalog[,] <- apply(data,2,log10)
datalog[datalog==-Inf] <- 0
datalog <- as.data.frame(datalog, stringsAsFactors=F)
testNoOutliers <- rm.outlier(datalog, fill = FALSE,
median = FALSE, opposite = FALSE)
My data:
https://skydrive.live.com/redir?resid=CEC7696F3B5BFBC6!341&authkey=!APiwy6qasD3-yGo
Thanks for any help
You get this error because a different number of outliers is removed from each column (the error message lists the resulting lengths: 1487, 1480, 1481, ...), so the columns can no longer be put together in one data frame.
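A small sketch of the mechanism, in base R so it runs without the outliers package. The helper drop_extreme is hypothetical and only mimics what I understand rm.outlier to do with fill = FALSE (drop every value equal to the one farthest from the mean); the toy data is made up.

```r
# Hypothetical stand-in for rm.outlier(x, fill = FALSE) on a single column:
# drop every value whose distance from the column mean is the largest.
drop_extreme <- function(x) {
  dev <- abs(x - mean(x))
  x[dev != max(dev)]
}

# Toy data: V2 has its extreme value twice, so it loses two entries.
toy <- data.frame(V1 = c(1, 2, 3, 4, 100),
                  V2 = c(1, 2, 3, 100, 100))

trimmed <- lapply(toy, drop_extreme)
col_lengths <- sapply(trimmed, length)  # V1 keeps 4 values, V2 keeps 3

# Putting unequal-length columns back into a data frame fails.
msg <- tryCatch(do.call(data.frame, trimmed),
                error = function(e) conditionMessage(e))
print(col_lengths)
print(msg)
```

Tied extreme values are one way the columns end up with different lengths; anything that shortens the columns unevenly causes the same failure.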
If you want to replace outliers with NA, one solution would be
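The code that originally followed is not reproduced here, so this is only a minimal sketch of the idea, not the answerer's exact solution. It assumes an outlier is the value farthest from the column mean (which is my understanding of the rule the outliers package uses); flag_extreme is a hypothetical helper and the data is a toy stand-in for datalog.

```r
# Hypothetical helper: TRUE for the value(s) farthest from the column mean.
flag_extreme <- function(x) {
  dev <- abs(x - mean(x, na.rm = TRUE))
  !is.na(dev) & dev == max(dev, na.rm = TRUE)
}

# Toy stand-in for the log-transformed table.
datalog <- data.frame(V1 = c(1, 2, 3, 100),
                      V2 = c(-50, 1, 2, 3),
                      V3 = c(5, 5, 5, 9))

# Overwrite flagged values with NA; every column keeps its length,
# so the result is still a valid data frame.
with_na <- as.data.frame(lapply(datalog, function(x) {
  x[flag_extreme(x)] <- NA
  x
}))
print(with_na)
```

Because each column keeps all of its rows, the differing-number-of-rows error cannot occur.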
To remove entire rows containing outlier values, you could add one additional line to @agstudy's solution
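The referenced line is not shown here, so the following is a self-contained sketch under the same assumption (an outlier is the value farthest from its column mean). flag_extreme is a hypothetical helper and the data is made up.

```r
# Hypothetical helper: TRUE for the value(s) farthest from the column mean.
flag_extreme <- function(x) {
  dev <- abs(x - mean(x, na.rm = TRUE))
  !is.na(dev) & dev == max(dev, na.rm = TRUE)
}

# Toy stand-in for the log-transformed table.
datalog <- data.frame(V1 = c(1, 2, 3, 100, 2),
                      V2 = c(-50, 1, 2, 3, 2))

# Logical matrix with one column of flags per variable.
flags <- sapply(datalog, flag_extreme)

# Keep only the rows in which no column was flagged.
no_outlier_rows <- datalog[rowSums(flags) == 0, , drop = FALSE]
print(no_outlier_rows)
```

With 400 variables, each contributing at least one flagged row, this can discard a large share of the 1488 observations, so it is worth checking nrow(no_outlier_rows) before going further.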