I’m trying to do some data manipulation in R. I have 2 data frames,

Question

Asked: June 6, 2026

I’m trying to do some data manipulation in R. I have 2 data frames: one is training data, the other is testing data. All the data is categorical and stored as factor variables.

There are some NAs in the data and I’m trying to convert them to “-1”. When I do it for the training data, things go fine, but not for the test data.

Something changes the values during a loop I run but I can’t figure out what.

Here’s the before:

> class(catTrain1[,"Cat_111"])
[1] "factor"
> class(catTest1[,"Cat_111"])
[1] "factor"

> table(catTrain1[,"Cat_111"])

  1   2 
726  25 
> table(catTest1[,"Cat_111"])

  0   1   2 
  1 503  15

Here’s the loop:

> for(i in 1:ncol(catTrain1)){
+ catTrain1[,i] <- as.factor(as.character(ifelse(is.na(catTrain1[,i]), "-1", catTrain1[,i])))
+ }
> for(i in 1:ncol(catTest1)){
+ catTest1[,i]  <- as.factor(as.character(ifelse(is.na(catTest1[,i]), "-1", catTest1[,i])))
+ }

Here’s the after:

> table(catTrain1[,"Cat_111"])

  1   2 
726  25 
> table(catTest1[,"Cat_111"])

  1   2   3 
  1 503  15

I’ve seen the shift up by one with character -> numeric conversions but I can’t figure out why this is happening, especially for just one of the dataframes / loops.

Any suggestions?

1 Answer

Editorial Team · Answer 1 · 2026-06-06T15:00:47+00:00

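The column names in your first set of calls to table are the levels of the factor. In the second set of calls, the column names are the level indexes, i.e. the integer codes R stores internally for each level. ifelse drops the factor's attributes, so it returns those integer codes rather than the level labels. The test column has levels 0, 1, 2, which are stored as codes 1, 2, 3, hence the shift; the training column has levels 1, 2, which happen to match their codes 1, 2, so nothing appears to change there. In your loops, move the as.character inward so it wraps the final catTest1[,i] and catTrain1[,i] inside ifelse.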
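As a minimal sketch of that change: the data frame below is a hypothetical stand-in for catTest1 (the real data is not shown in the question), with a single column whose levels are 0, 1, 2 plus one NA. It first reproduces the shift with the original pattern, then applies the loop with as.character moved inside ifelse.

```r
# Hypothetical stand-in for catTest1; the real data is not shown in the question.
catTest1 <- data.frame(Cat_111 = factor(c("0", "1", "1", "2", NA)))

col <- catTest1[, "Cat_111"]

# Original pattern: ifelse() hands back the factor's integer codes,
# so the labels "0", "1", "2" come out as "1", "2", "3".
shifted <- as.character(ifelse(is.na(col), "-1", col))

# Fix: convert to character before ifelse() sees the factor,
# so the level labels survive and only the NAs become "-1".
for (i in seq_along(catTest1)) {
  catTest1[, i] <- as.factor(ifelse(is.na(catTest1[, i]), "-1",
                                    as.character(catTest1[, i])))
}

fixed <- as.character(catTest1[, "Cat_111"])
```

Here shifted holds "1" "2" "2" "3" "-1", while fixed holds "0" "1" "1" "2" "-1". The same loop body should work unchanged for catTrain1.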
