We are two students who want to use a one-class SVM to detect summary-worthy sentences in text documents. We have already implemented sentence similarity functions, which we used for another algorithm. We would now like to use the same functions as kernels for a one-class SVM in libsvm for Java.
We are using the PRECOMPUTED enum for the kernel_type field in our svm_parameter (param). In the x field of our svm_problem (prob) we have the kernel matrix in the form:
0:i 1:K(xi,x1) ... L:K(xi,xL)
where K(x,y) is the kernel value for the similarity of x and y, L is the number of sentences to compare and i is the current row index (0 to L).
The training call (svm.svm_train(prob, param)) sometimes seems to get “stuck” in what looks like an infinite loop.
Have we misunderstood how to use the PRECOMPUTED enum, or does the problem lie elsewhere?
We solved this problem.
It turns out that the “series numbers” in the first column need to go from `1` to `L`, not `0` to `L-1`, which was our initial numbering. We found this out by inspecting the source in `svm.java`, which returns a precomputed kernel value as `x[i][(int)(x[j][0].value)].value`.
The reason for starting the numbering at 1 instead of 0 is that the first column of a row is used as the column index when returning the value `K(i,j)`.
Example
Consider this Java matrix:
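As a minimal sketch, a hypothetical 4x4 precomputed kernel could be laid out as below. The kernel values are invented for illustration, and plain `double` entries stand in for libsvm's `svm_node` objects, so `x[j][0]` here plays the role of `x[j][0].value`. The class and method names are assumptions, not part of libsvm.

```java
final class PrecomputedLookupDemo {
    // Column 0 holds the 1-based serial number of the row;
    // columns 1..4 hold K(x_row, x_1) .. K(x_row, x_4). Values are made up.
    static final double[][] X = {
        {1, 1.0, 0.2, 0.3, 0.4},
        {2, 0.2, 1.0, 0.5, 0.6},
        {3, 0.3, 0.5, 1.0, 0.7},
        {4, 0.4, 0.6, 0.7, 1.0},
    };

    // Mirrors the expression x[i][(int)(x[j][0].value)].value on plain doubles:
    // the serial number of row j selects the column of row i.
    static double lookup(double[][] x, int i, int j) {
        int column = (int) x[j][0];
        return x[i][column];
    }

    // Same matrix, but with the serial numbers shifted to 0..L-1
    // (the numbering that caused the problem described above).
    static double[][] withZeroBasedSerials(double[][] x) {
        double[][] copy = new double[x.length][];
        for (int r = 0; r < x.length; r++) {
            copy[r] = x[r].clone();
            copy[r][0] = x[r][0] - 1;
        }
        return copy;
    }

    public static void main(String[] args) {
        // 1-based serials: row 3 has serial 4, so column 4 of row 1 is read.
        System.out.println(lookup(X, 1, 3)); // prints 0.6
        // 0-based serials: row 3 has serial 3, so the neighbouring column is read.
        System.out.println(lookup(withZeroBasedSerials(X), 1, 3)); // prints 0.5
    }
}
```

With the 0-based variant, a lookup against row 0 reads column 0 and returns a serial number instead of a kernel value, so the solver would be working from a corrupted kernel matrix.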
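Now, libsvm needs the kernel value `K(i,j)` for, say, `i=1` and `j=3`. The expression `x[i][(int)(x[j][0].value)].value` breaks down to `x[1][(int)(x[3][0].value)].value`: the serial number stored in column 0 of row 3 is cast to an int and used as the column index into row 1. With numbering that starts at 1, that column holds the kernel value we want; with numbering that starts at 0, the lookup lands one column too far to the left.
This was a bit messy to realize at first, but changing the indexing solved our problem. Hopefully this helps someone else with a similar problem.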
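One way to guard against this mistake is to add the serial column in a single place and check it before training. The following is only a sketch under assumptions: it works on a plain `double[][]` rather than on libsvm types, and the class and method names are hypothetical.

```java
final class SerialColumnBuilder {
    // Turns an L x L kernel matrix into L rows of length L + 1,
    // where column 0 carries the 1-based serial number of the row.
    static double[][] withSerialColumn(double[][] kernel) {
        int n = kernel.length;
        double[][] rows = new double[n][n + 1];
        for (int i = 0; i < n; i++) {
            if (kernel[i].length != n) {
                throw new IllegalArgumentException("kernel matrix must be square");
            }
            rows[i][0] = i + 1; // serial numbers run from 1 to L
            System.arraycopy(kernel[i], 0, rows[i], 1, n);
        }
        return rows;
    }

    // Returns true only if every row has L + 1 entries and serial number i + 1.
    static boolean hasOneBasedSerials(double[][] rows) {
        for (int i = 0; i < rows.length; i++) {
            if (rows[i].length != rows.length + 1 || rows[i][0] != i + 1) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        double[][] kernel = {{1.0, 0.5}, {0.5, 1.0}}; // invented values
        double[][] rows = withSerialColumn(kernel);
        System.out.println(hasOneBasedSerials(rows)); // prints true
    }
}
```

Each entry of such a row would then be copied into an `svm_node`, with `index` set to the column position and `value` set to the entry, before assigning the rows to the `x` field of the `svm_problem`.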