Note: The title may be misleading. If you understand my problem and think of something more descriptive – please change it.
I’ve got a strange situation where the responses from a survey are all character, rather than numeric. It seems that R really doesn’t like this. Let’s say I asked a question:
Q. In what area do you work?
East
West
Central
North
South
None of the above
But respondents were only from East, West and Central.
dat <- rep(c("East", "West", "Central"),100)
Now, for presentation purposes, it’s important that I include North, South and None of the above, even if they have no responses. However, factoring those elements in is challenging.
Let’s try:
fac1 <- factor(dat, labels=c("East","West","Central","North","South","None of the above"))
Error in factor(dat, labels = c("East", "West", "Central", "North", "South", :
invalid labels; length 6 should be 1 or 3
Basically, what I’d like to do is factor this data including the categories nobody selected, so that when I type something like summary(fac1) it shows them as having 0 responses.
There has to be an easier way to do this!
Almost there. You need to use the levels argument instead of labels.
The difference between levels and labels is this:
levels defines the factor levels in your data.
labels allows you to rename the factor levels in one go.
For example:
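Here is a minimal sketch of that difference, reusing dat from the question. The variable names areas and fac2 and the short codes passed to labels are made up for illustration; only factor(), levels, labels and summary() are from base R.

```r
# Sample responses, as in the question
dat <- rep(c("East", "West", "Central"), 100)

# Every possible answer, including the ones nobody chose
areas <- c("East", "West", "Central", "North", "South", "None of the above")

# levels: declares which values the factor may take,
# so unused categories are kept and counted as 0
fac1 <- factor(dat, levels = areas)
summary(fac1)

# levels + labels: same six levels, renamed in the same order
# (these short codes are hypothetical, chosen only for the example)
fac2 <- factor(dat, levels = areas,
               labels = c("E", "W", "C", "N", "S", "None"))
summary(fac2)
```

With levels set, summary(fac1) should report 100 each for East, West and Central and 0 for the other three. The original call failed because labels has to match the number of values actually present in the data (3 here), whereas levels can list more values than occur.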