So… I have a large data set with a variable that has many categories. I want to create new variables that each group some of those categories into one.
I could do that with a conditional statement, but given the number of categories it would take me forever to go one line at a time. Also, while my original variable is numeric, the values themselves are random, so I can't use logical or range statements.
How do I create this conditional variable based on many particular values?
I tried the following, but without success. Below is an example of the different categories I want to group into one.
classes <- c(549,162,210,222,44,96,62,208,525,202,149,442,427,
564,423,106,422,546,205,560,127,536,34,261,568,
366,524,401,548,95,156,8,528, 430,527,556,203,554,523,
501,530,55,252,585,19,540,71,204,502,504, 196,436,48,
102,526,201,521,23,558,552,118,416,117,216,510,494,
516,544,518)
So this seemed pretty intuitive to me, but it doesn't work.
df$chem<- cbind(ifelse(df$class == classes ,1,0))
Needless to say I'm a beginner, and this is probably not so hard to do, but I've been looking for a solution to this particular problem and I can't seem to find it. What am I missing? Thanks!
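You are looking for %in%, not ==. The == operator compares df$class with classes element by element (recycling the shorter vector), whereas %in% tests whether each value of df$class appears anywhere in classes.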
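As a minimal sketch of that fix, the ifelse() call from the question can be kept with only the comparison swapped. The small df and the shortened classes vector here are made-up toy data for illustration, not the asker's real data.

```r
# Hypothetical toy data standing in for the real df and classes
df <- data.frame(class = c(549, 7, 162, 999, 44))
classes <- c(549, 162, 210, 222, 44)

# %in% is TRUE for every value of df$class found anywhere in classes,
# so each row is checked against the whole set rather than position by position
df$chem <- ifelse(df$class %in% classes, 1, 0)

df$chem
# 1 0 1 0 1
```

The cbind() wrapper from the original attempt is not needed, since ifelse() already returns a plain vector of the right length.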
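Alternatively, you can rely on the logical-to-numeric conversion: %in% already returns TRUE/FALSE, and converting that to numeric gives 1/0 directly.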
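A sketch of the conversion approach, again on made-up toy data (the df and classes values below are assumptions for illustration only):

```r
# Hypothetical toy data standing in for the real df and classes
df <- data.frame(class = c(549, 7, 162, 999, 44))
classes <- c(549, 162, 210, 222, 44)

# as.numeric() turns TRUE into 1 and FALSE into 0, so no ifelse() is needed
df$chem <- as.numeric(df$class %in% classes)

df$chem
# 1 0 1 0 1
```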
If you want individual dummy variables for all the categories in df$class, then you can use the class.ind function in the package nnet (which is shipped as a recommended package).
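A short sketch of class.ind() on toy data; the df below is hypothetical and only meant to show the shape of the result, which is a 0/1 matrix with one column per distinct value of the variable.

```r
library(nnet)

# Hypothetical toy data; note that the value 7 appears twice
df <- data.frame(class = c(549, 7, 162, 7, 44))

# class.ind() builds an indicator matrix: one row per observation,
# one column per distinct class, with a single 1 in each row
dummies <- class.ind(df$class)

dim(dummies)       # 5 rows, 4 columns
colnames(dummies)  # "7" "44" "162" "549"
```

The columns can then be attached to the data frame with cbind(df, dummies) if they are needed alongside the original variable.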