I’m trying to classify some data using SVM classification as implemented in the Weka library. My classification code looks like this:
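import java.io.BufferedReader;
import java.io.FileReader;
import java.util.Random;
import weka.classifiers.Evaluation;
import weka.core.Instances;
import weka.filters.Filter;
import weka.filters.unsupervised.attribute.NumericToNominal;

// Load the ARFF file; the class is the first attribute.
BufferedReader reader = new BufferedReader(new FileReader(arffDataFile));
Instances data = new Instances(reader);
reader.close();
data.setClassIndex(0);
// Convert the first attribute (the class) from numeric to nominal.
NumericToNominal filter = new NumericToNominal();
String[] options = new String[2];
options[0] = "-R";
options[1] = "1";
filter.setOptions(options);
filter.setInputFormat(data);
Instances newData = Filter.useFilter(data, filter);
newData.setClassIndex(0);
weka.classifiers.functions.LibSVM svm = new weka.classifiers.functions.LibSVM();
// Not needed for the cross-validation below, which trains its own copies of
// the classifier; it only gives a model trained on the full data set.
svm.buildClassifier(newData);
// folds (the number of cross-validation folds) is an int defined elsewhere.
Evaluation eval = new Evaluation(newData);
eval.crossValidateModel(svm, newData, folds, new Random(1));
System.out.println(eval.toSummaryString("\nResults\n======\n", false));
System.out.println();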
The ARFF data file consists of 2973 instances, and each instance has 27 attributes.
My question is: how can I find out the weights of the instance attributes? I need to investigate which attributes are the most useful for classification.
I’m a beginner in the field of machine learning, so simple language and sample code would be appreciated.
Thanks in advance for any help.
Weka supports attribute selection, which gives you a way to rank attributes by how useful they are. The relevant classes are in the weka.attributeSelection package, where you can combine any of several attribute evaluators with a search method. My personal preference for my own task is InfoGainAttributeEval as the attribute evaluator with Ranker as the search method. Which combination works best depends on your task.
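To give a feel for what an information-gain ranking measures, here is a minimal plain-Java sketch. It is not Weka's implementation: the class and method names (InfoGainSketch, entropy, infoGain) are made up for illustration, and it only handles nominal values. The score is the entropy of the class minus the entropy of the class once you know the attribute's value, so an attribute that separates the classes well scores high and one that says nothing about the class scores near zero.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a hand-rolled information gain for nominal attributes.
class InfoGainSketch {

    // Shannon entropy (in bits) of an array of class labels.
    public static double entropy(String[] labels) {
        Map<String, Integer> counts = new HashMap<>();
        for (String label : labels) {
            counts.merge(label, 1, Integer::sum);
        }
        double h = 0.0;
        for (int count : counts.values()) {
            double p = (double) count / labels.length;
            h -= p * (Math.log(p) / Math.log(2));
        }
        return h;
    }

    // Information gain of one attribute: H(class) - H(class | attribute).
    public static double infoGain(String[] attributeValues, String[] labels) {
        // Group the class labels by the attribute value they occur with.
        Map<String, List<String>> groups = new HashMap<>();
        for (int i = 0; i < labels.length; i++) {
            groups.computeIfAbsent(attributeValues[i], k -> new ArrayList<>())
                  .add(labels[i]);
        }
        // Weighted average of the entropy inside each group.
        double conditional = 0.0;
        for (List<String> group : groups.values()) {
            double weight = (double) group.size() / labels.length;
            conditional += weight * entropy(group.toArray(new String[0]));
        }
        return entropy(labels) - conditional;
    }

    public static void main(String[] args) {
        String[] labels = {"yes", "yes", "no", "no"};
        String[] informative = {"a", "a", "b", "b"};   // separates the classes
        String[] uninformative = {"a", "b", "a", "b"}; // unrelated to the class
        System.out.println("informative:   " + infoGain(informative, labels));
        System.out.println("uninformative: " + infoGain(uninformative, labels));
    }
}
```

In the toy data, the first attribute should get a clearly higher score than the second, which is the kind of ordering a ranking search then sorts by.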
Since you are interacting with Weka from code, see the documentation for how to use the attribute evaluators and search methods through the Java API. Personally, I use the GUI.
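For the Java API route, something along these lines should work. Treat it as a sketch, not tested against your data: it assumes a Weka 3.7 or later jar on the classpath, the class name RankAttributes and the helper rank are made up for illustration, and the loading and NumericToNominal steps simply mirror the code in the question (InfoGainAttributeEval needs a nominal class attribute).

```java
import java.io.BufferedReader;
import java.io.FileReader;

import weka.attributeSelection.AttributeSelection;
import weka.attributeSelection.InfoGainAttributeEval;
import weka.attributeSelection.Ranker;
import weka.core.Instances;
import weka.filters.Filter;
import weka.filters.unsupervised.attribute.NumericToNominal;

class RankAttributes {

    // Ranks the non-class attributes of data by information gain.
    // Each row of the result is {attribute index, score}, best first.
    // The class index must already be set and the class must be nominal.
    static double[][] rank(Instances data) throws Exception {
        AttributeSelection selector = new AttributeSelection();
        selector.setEvaluator(new InfoGainAttributeEval());
        selector.setSearch(new Ranker());
        selector.SelectAttributes(data); // the capital S is Weka's own spelling
        return selector.rankedAttributes();
    }

    public static void main(String[] args) throws Exception {
        // args[0]: path to the ARFF file (an assumption of this sketch).
        Instances data;
        try (BufferedReader reader = new BufferedReader(new FileReader(args[0]))) {
            data = new Instances(reader);
        }
        data.setClassIndex(0);

        // Same conversion as in the question: make attribute 1 (the class) nominal.
        NumericToNominal filter = new NumericToNominal();
        filter.setOptions(new String[] {"-R", "1"});
        filter.setInputFormat(data);
        Instances newData = Filter.useFilter(data, filter);
        newData.setClassIndex(0);

        // Print attribute names with their scores, most useful first.
        for (double[] row : rank(newData)) {
            String name = newData.attribute((int) row[0]).name();
            System.out.println(name + "\t" + row[1]);
        }
    }
}
```

Note that these scores describe how informative each attribute is about the class on its own; they are not weights taken from the trained SVM.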