I have some data in CSV format that I want to use for predictive modeling. I read the data into R and apply some simple preprocessing (omitting NAs etc.). Before I train an SVM classifier, I want to scale the data using the scale(x) function. The problem is that my label column is part of the dataset. How can I tell R to ignore that column? Or what is best practice here? The data looks like this:
label, X1, X2, X3, ..., Xn
Y, 34, 74, 29, ..., 47
N, 88, 46, 95, ..., 33
N, 58, 78, 25, ..., 68
Y, 33, 56, 61, ..., 13
If I try:
x <- scale(trouble[,-c(1)])
trouble <- x
summary(trouble)
rm(x);
The first column is dropped and gone for good, because trouble is overwritten with the scaled result, which no longer contains the label column.
You can do partial assignment:
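A minimal sketch of that idea, assuming the data frame is called trouble as in the question, the label sits in the first column, and all remaining columns are numeric. The sample values are the first three feature columns from the question's example rows. Only the feature columns are assigned to, so the label column is left in place:

```r
# Illustrative data frame: label in column 1, numeric features after it
trouble <- data.frame(
  label = c("Y", "N", "N", "Y"),
  X1 = c(34, 88, 58, 33),
  X2 = c(74, 46, 78, 56),
  X3 = c(29, 95, 25, 61)
)

# Scale every column except the first and write the result back
# into those same columns; the label column is not touched
trouble[, -1] <- scale(trouble[, -1])

summary(trouble)
```

If the label is not in the first column, selecting the feature columns by name or with sapply(trouble, is.numeric) should work the same way.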