I’m having a problem with a data.frame. To make it very simple I start with
test<-data.frame(char=character(10), numr=numeric(10))
test$char[1]<-"ery"
The result is
Warning message:
In `[<-.factor`(`*tmp*`, 1, value = c(NA, 1L, 1L, 1L, 1L, 1L, 1L, :
invalid factor level, NAs generated
If I do mode(test$char) I get [1] "numeric"
If I do mode(test$numr) I get [1] "character"
I can also do test$numr[1]<-"fjfj" without an error and the data is stored in that particular place.
If, instead of setting up the data.frame with character(10), I define every column as numeric, then, as in the previous example, R lets me change a numeric column to character simply by storing a string in one of its elements, even though the column was originally defined as numeric.
Why does R treat character differently than I expect as in my example?
I’m a little suspicious of your results posted above.
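One way to check what the columns actually are is sketched below. It assumes the question was run with stringsAsFactors = TRUE, which was the default before R 4.0.0; the argument is passed explicitly here so the snippet behaves the same on newer versions.

```r
# Assumption: stringsAsFactors = TRUE reproduces the behaviour in the question
# (it was the default before R 4.0.0).
test <- data.frame(char = character(10), numr = numeric(10),
                   stringsAsFactors = TRUE)

str(test)          # prints the class of each column
class(test$char)   # "factor"
mode(test$char)    # "numeric": a factor is integer codes plus a levels attribute
levels(test$char)  # "": the only level is the empty string
class(test$numr)   # "numeric"
```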
The warning comes from `[<-.factor`, which tells me that char is a factor; numr is numeric, and both are stored as numeric (factors have an additional attribute that maps the numeric level codes to labels). You’re getting a warning because you’re trying to set a value in char that isn’t included in the list of levels (which includes only the blank string ""). As @GSee says in the comments, you probably wanted stringsAsFactors=FALSE.

You can set options(stringsAsFactors=FALSE) to make this your global default behaviour. There is a tradeoff here between convenience for yourself and confusion the next time you forget that you have this option set globally, ask a question on StackOverflow, and have everyone wonder why you’re getting different answers …

Finally, as you mentioned above, if char starts out as numeric, R will silently coerce it to a character string when you try to set an element to a character value. I think this is actually pretty bad design, but it’s too deeply built into R’s behaviour to change now …
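A minimal sketch of both points, using a hypothetical data frame named test2 (not from the question). stringsAsFactors = FALSE is passed explicitly; from R 4.0.0 onward it is already the default.

```r
# test2 is an illustrative name; the columns mirror the question's example.
test2 <- data.frame(char = character(10), numr = numeric(10),
                    stringsAsFactors = FALSE)

test2$char[1] <- "ery"   # no warning: char is a plain character vector
class(test2$char)        # "character"

# Silent coercion: storing a string in a numeric column converts the
# whole column to character, and R gives no warning.
test2$numr[1] <- "fjfj"
class(test2$numr)        # "character"
test2$numr[2]            # "0": the remaining zeros are now strings
```

Passing the argument per call rather than through options() keeps the behaviour visible in the code itself, which avoids the confusion described above.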