I have a data frame that I’ve discretized using RWeka. RWeka’s discretization creates bin labels wrapped in single quotes. They don’t cause any problems, but a factor level such as 'All' looks ugly in plots.
Here’s the discretized data frame:
structure(list(outlook = structure(c(1L, 1L, 2L, 3L, 3L, 3L,
2L, 1L, 1L, 3L, 1L, 2L, 2L, 3L), .Label = c("sunny", "overcast",
"rainy"), class = "factor"), temperature = structure(c(1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "'All'", class = "factor"),
humidity = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L), .Label = "'All'", class = "factor"),
windy = c(FALSE, TRUE, FALSE, FALSE, FALSE, TRUE, TRUE, FALSE,
FALSE, FALSE, TRUE, TRUE, FALSE, TRUE), play = structure(c(2L,
2L, 1L, 1L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 2L), .Label = c("yes",
"no"), class = "factor")), .Names = c("outlook", "temperature",
"humidity", "windy", "play"), row.names = c(NA, -14L), class = "data.frame")
How can I remove the single quotes from the data and recreate the factors?
This should do it:
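A minimal sketch for a single column, assuming the data frame is stored in a variable called `df` (the question does not name it; a small stand-in is built here so the snippet runs on its own). It strips the quotes from the factor's levels rather than from every value, which leaves the column a factor:

```r
# Hypothetical stand-in for the discretized data frame; `df` is an assumed name
df <- data.frame(
  outlook     = factor(c("sunny", "overcast", "rainy")),
  temperature = factor(rep("'All'", 3)),
  humidity    = factor(rep("'All'", 3))
)

# Replace each level label with a copy that has the single quotes removed.
# fixed = TRUE treats "'" as a literal string, not a regular expression.
levels(df$temperature) <- gsub("'", "", levels(df$temperature), fixed = TRUE)

levels(df$temperature)
# [1] "All"
```

Rebuilding the column with `factor(gsub("'", "", as.character(df$temperature), fixed = TRUE))` should give the same result here; editing the levels just avoids touching each row.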
If you need to do the same over several columns, this might be more efficient.
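One possible way to handle several columns at once, again assuming the data frame is called `df` (an assumed name, with a small stand-in built here). The sketch picks out the factor columns and cleans the levels of each, leaving non-factor columns such as `windy` untouched:

```r
# Hypothetical stand-in for the discretized data frame; `df` is an assumed name
df <- data.frame(
  outlook     = factor(c("sunny", "overcast", "rainy")),
  temperature = factor(rep("'All'", 3)),
  humidity    = factor(rep("'All'", 3)),
  windy       = c(FALSE, TRUE, FALSE)
)

# Logical vector marking which columns are factors
is_fac <- vapply(df, is.factor, logical(1))

# For each factor column, strip the single quotes from its level labels
df[is_fac] <- lapply(df[is_fac], function(f) {
  levels(f) <- gsub("'", "", levels(f), fixed = TRUE)
  f
})

levels(df$temperature)
# [1] "All"
levels(df$humidity)
# [1] "All"
```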