I have looked at a set of data and decided it would be good to remove outliers, with an outlier defined as a value more than 2 standard deviations away from the mean.
If I have a set of data, say 500 rows with 15 different attributes, how can I remove all the rows which have 1 or more attributes more than 2 standard deviations away from that attribute's mean?
Is there an easy way to do this using R?
Thanks,
There are probably lots of ways, and probably add-on packages, to deal with this. I’d suggest you try this first:
Here’s a way you could do what you’re asking for using the `scale` function, which can standardize vectors.
So, in answer to your question: yes, there is an easy way, in that the code to do this could be boiled down to 1 line.
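The code from the original answer is not shown here, so what follows is only a minimal sketch of the `scale` approach, not necessarily what the author posted. It assumes all 15 attributes are numeric and contain no missing values; the data frame `dat` and the names `z`, `keep`, `cleaned` and `cleaned_one_line` are made up for illustration.

```r
# Hypothetical example data: 500 rows, 15 numeric attributes
set.seed(1)
dat <- as.data.frame(matrix(rnorm(500 * 15), nrow = 500, ncol = 15))

# scale() centers each column on its mean and divides by its standard
# deviation, so every entry becomes a z-score for its own column
z <- scale(dat)

# A row is kept only if every attribute lies within 2 SD of its column mean
keep <- apply(abs(z) <= 2, 1, all)
cleaned <- dat[keep, , drop = FALSE]

# The same filter condensed to one line: count the attributes per row that
# are more than 2 SD out, and keep the rows where that count is zero
cleaned_one_line <- dat[rowSums(abs(scale(dat)) > 2) == 0, , drop = FALSE]
```

If the data contained `NA` values or non-numeric columns, they would need to be handled before calling `scale`.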
And I’m guessing there’s a package that may do this and more. The `sos` package is terrific (IMHO) for finding functions to do what you want.