I am using RTextTools to train and classify data which comes from a MySQL table. I have a field called id that identifies each document in the database. However, after using the following code, the id field is no longer present.
matrix <- create_matrix(cbind(data$text, data$id),
                        language="english", removeNumbers=TRUE,
                        removeSparseTerms=.998)
corpus <- create_corpus(matrix,
                        as.numeric(data$valid),
                        trainSize=1:750, testSize=751:1000,
                        virgin=FALSE)
SVM <- train_model(corpus, "SVM")
SVM_CLASSIFY <- classify_model(corpus, SVM)
As stated above, the data$id seems to be lost during the process. Any idea how I can keep the ID linked to the data?
You can use the cbind command to add the ID column back to the output. For example:
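A minimal sketch of that idea, assuming classify_model returns one row per test document in the same order as testSize. The data frame contents, the test_rows variable, and the stand-in SVM_CLASSIFY values below are hypothetical placeholders so the snippet runs without RTextTools or the MySQL table; with the real pipeline only the final cbind line should be needed. It is probably also worth passing only data$text to create_matrix, since an id bound into the text is likely stripped by removeNumbers=TRUE.

```r
# Stand-in for the question's `data` frame (hypothetical values).
data <- data.frame(
  id    = 1001:1010,
  text  = paste("document", letters[1:10]),
  valid = rep(c(0, 1), 5)
)

# Plays the role of testSize = 751:1000 in the question.
test_rows <- 6:10

# Stand-in for the output of classify_model(): assumed to hold one row
# per test document, in the same order as the test rows.
SVM_CLASSIFY <- data.frame(
  SVM_LABEL = factor(c(1, 0, 1, 1, 0)),
  SVM_PROB  = c(0.91, 0.66, 0.73, 0.88, 0.59)
)

# Bind the IDs of the test rows back onto the predictions.
results <- cbind(id = data$id[test_rows], SVM_CLASSIFY)
print(results)
```

With the question's own objects, the equivalent line would be results <- cbind(id = data$id[751:1000], SVM_CLASSIFY), which relies on the same row-order assumption.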