I have written code that creates a model and saves it, and it works fine. My understanding is that, by default, the data is split into 10 folds. Instead, I want the data to be split into two sets (training and testing) when I create the model. In the Weka UI I can do this with the "Percentage split" radio button, and I want to know how to do the same thing in code, with 80% of the data used for training and 20% for testing. Here is my code:
FilteredClassifier model = new FilteredClassifier();
model.setFilter(new StringToWordVector());
model.setClassifier(new NaiveBayesMultinomial());
try {
    model.buildClassifier(trainingSet);
} catch (Exception e1) { // TODO Auto-generated catch block
    e1.printStackTrace();
}
ObjectOutputStream oos = new ObjectOutputStream(
        new FileOutputStream(
                "/Users/me/models/MyModel.model"));
oos.writeObject(model);
oos.flush();
oos.close();
Here, trainingSet is an already populated Instances object. Can someone help me with this?
Thanks in advance!
In the UI class ClassifierPanel's method startClassifier(), I found the code that implements the percentage split. So, after randomizing your dataset, I suggest you split your trainingSet in the same way, then use Classifier#buildClassifier(Instances data) to train the classifier with 80% of your set's instances.

UPDATE: thanks to @ChengkunWu's answer, I added the randomizing step above.
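
The code snippets from this answer did not survive, so here is a minimal sketch of the shuffle-then-cut logic it describes, not the original code. To keep it runnable without Weka on the classpath, it works on a plain `List`; `PercentageSplit`, `trainSize`, and `split` are hypothetical names made up for illustration. The comments note the Weka calls that would play the same role (`Instances#randomize(Random)` and the `Instances(Instances, int, int)` copy constructor), under the assumption that `data` is the populated `Instances` object from the question.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of Weka: shows the arithmetic of a percentage split.
final class PercentageSplit {

    /** Number of rows in the training part, rounded to the nearest integer. */
    static int trainSize(int total, double trainPercent) {
        if (total < 0 || trainPercent < 0 || trainPercent > 100) {
            throw new IllegalArgumentException("total must be >= 0 and percent in [0, 100]");
        }
        return (int) Math.round(total * trainPercent / 100.0);
    }

    /** Returns two new lists, [train, test]; the input list is left unmodified. */
    static <T> List<List<T>> split(List<T> rows, double trainPercent, long seed) {
        // Shuffle a copy first so the cut does not depend on the original order.
        // Weka counterpart: data.randomize(new Random(seed));
        List<T> shuffled = new ArrayList<>(rows);
        Collections.shuffle(shuffled, new Random(seed));

        int cut = trainSize(shuffled.size(), trainPercent);

        // Weka counterpart:
        //   Instances train = new Instances(data, 0, cut);
        //   Instances test  = new Instances(data, cut, data.numInstances() - cut);
        List<T> train = new ArrayList<>(shuffled.subList(0, cut));
        List<T> test = new ArrayList<>(shuffled.subList(cut, shuffled.size()));

        List<List<T>> parts = new ArrayList<>();
        parts.add(train);
        parts.add(test);
        return parts;
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            rows.add(i);
        }
        List<List<Integer>> parts = split(rows, 80.0, 42L);
        System.out.println("train=" + parts.get(0).size() + " test=" + parts.get(1).size());
    }
}
```

With Weka, the training part would then be passed to `model.buildClassifier(train)` in place of the full `trainingSet`, and the held-out 20% could be scored with Weka's `Evaluation` class (for example `evaluateModel(model, test)`). Fixing the seed keeps the split reproducible between runs.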