So…I have a large data set with a variable that has many categories. I

Question

Asked: June 10, 2026 (2026-06-10T17:01:58+00:00)

So…I have a large data set with a variable that has many categories. I want to create new variables that group some of those categories into one.

I could do that with a conditional statement, but given the number of categories it would take me forever to go one line at a time. Also, while my original variable is numeric, the values themselves are arbitrary, so I can't use logical or range statements.

How do I create this conditional variable based on many particular values?

I tried the following, but without success. Below is an example of the different categories I want to group into one.

classes <- c(549,162,210,222,44,96,62,208,525,202,149,442,427,
      564,423,106,422,546,205,560,127,536,34,261,568,
      366,524,401,548,95,156,8,528, 430,527,556,203,554,523,
      501,530,55,252,585,19,540,71,204,502,504, 196,436,48,
      102,526,201,521,23,558,552,118,416,117,216,510,494,
      516,544,518)

So this seemed pretty intuitive to me, but it doesn't work.

df$chem<- cbind(ifelse(df$class == classes ,1,0))

Needless to say I'm a beginner, and this is probably not so hard to do, but I've been looking for a solution to this particular problem and I can't seem to find it. What am I missing? Thanks!

1 Answer

Editorial Team · Answer 1 · 2026-06-10T17:02:00+00:00

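You are looking for %in%, not ==. The == operator compares the two vectors element by element (recycling the shorter one), whereas %in% tests whether each value of df$class appears anywhere in classes.

For example: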

df$chem <- ifelse(df$class %in% classes, 1, 0)

Or convert the logical result to numeric directly:

df$chem <-  as.numeric(df$class %in% classes)
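
To see why == fails where %in% works, here is a minimal sketch on made-up data. The names df, class, classes and chem follow the question, but the values are invented for illustration and are not the asker's data.

```r
# Toy data: the values below are invented, only the names mirror the question.
df <- data.frame(class = c(549, 7, 162, 999, 44, 549))
classes <- c(549, 162, 44)

# `==` compares position by position and recycles `classes` along df$class,
# so a value only "matches" if it happens to line up with the same value.
eq_result <- df$class == classes

# `%in%` checks, for each element of df$class, whether it occurs anywhere in `classes`.
in_result <- df$class %in% classes

# Convert the logical membership test into a 0/1 indicator column.
df$chem <- as.numeric(in_result)
print(df)
```

With == only the first row is flagged, while %in% flags every row whose class is in the list. When the two lengths are not multiples of each other, == also emits a recycling warning, which is a useful hint that the comparison is not doing what was intended.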

If you want an individual dummy variable for every category in df$class, you can use the class.ind function from the nnet package (which ships with R as a recommended package):

library(nnet)

class_ind <- class.ind(df$class)
# optional: combine the indicator columns with the original data frame
df_ind <- cbind(df, class_ind)
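
As a rough illustration of what class.ind returns, here is a small sketch on invented data. It assumes nnet is installed, and the toy values are not from the question.

```r
library(nnet)

# Toy data: three rows, two distinct categories (values are invented).
df <- data.frame(class = c(8, 19, 8))

# class.ind() returns a 0/1 matrix with one column per distinct category;
# the column names are the category labels.
ind <- class.ind(df$class)

# Append the indicator columns to the original data frame.
df_ind <- cbind(df, ind)
print(df_ind)
```

Each row has exactly one 1, in the column for its own category. Note that this gives one column per category, which is different from the single grouped indicator built with %in% above.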
