I need to use WordNet in a Java-based app.
I want to:
- search synsets
- find similarity/relatedness between synsets
My app uses RDF graphs, and I know there are SPARQL endpoints that expose WordNet, but I guess it’s better to have a local copy of the dataset, as it’s not too big.
I’ve found the following jars:
- General library – JAWS http://lyle.smu.edu/~tspell/jaws/index.html
- General library – JWNL http://sourceforge.net/projects/jwordnet
- Similarity library (Perl) – WordNet::Similarity http://wn-similarity.sourceforge.net/
- Java version of WordNet::Similarity http://www.cogs.susx.ac.uk/users/drh21/ (beta)
What would you recommend for my app?
Is it possible to use a Perl library from a Java app via some bindings?
Thanks!
Mulone
I use JAWS for normal WordNet stuff because it’s easy to use. For similarity metrics, though, I use the library located here. For it to work, you’ll also need to download this folder, which contains pre-processed WordNet and corpus data. The code can be used like this, assuming you placed that folder inside another folder called “lib” in your project folder:
This will print something like the following, showing the similarity score between each possible combination of synsets represented by the words to be compared:
There are also methods that allow you to specify which sense of either or both words to use:
res(String word1, int senseNum1, String word2, partOfSpeech), etc. Unfortunately, the source documentation is not Javadoc, so you’ll need to inspect it manually. The source can be downloaded here. The available algorithms are:
Also, it requires you to have the JAR file for MIT’s JWI.
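To make the "every possible combination of synsets" idea concrete, here is a minimal, self-contained sketch of the pairwise loop. It does not call the similarity library at all: `SenseSimilarity`, `scoreAllPairs`, the sense counts, the toy scoring formula, and the `word#pos#sense` labels are all made up for illustration, so treat it as a picture of the idea and not as the library's API or its real output format.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class SensePairScores {

    /** Hypothetical stand-in for a similarity measure; NOT a real WordNet metric. */
    public interface SenseSimilarity {
        double score(String word1, int sense1, String word2, int sense2);
    }

    /**
     * Scores every (sense of word1, sense of word2) combination.
     * Sense numbers are 1-based; the sense counts are supplied by the caller
     * (in a real app they would come from a WordNet lookup).
     */
    public static List<String> scoreAllPairs(String word1, int senseCount1,
                                             String word2, int senseCount2,
                                             String pos, SenseSimilarity measure) {
        List<String> lines = new ArrayList<>();
        for (int s1 = 1; s1 <= senseCount1; s1++) {
            for (int s2 = 1; s2 <= senseCount2; s2++) {
                double score = measure.score(word1, s1, word2, s2);
                // Illustrative label format: word#pos#sense, tab-separated from the score.
                lines.add(String.format(Locale.ROOT, "%s#%s#%d\t%s#%s#%d\t%.3f",
                        word1, pos, s1, word2, pos, s2, score));
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // Toy measure: 1 / (1 + |sense1 - sense2|). A placeholder, nothing more.
        SenseSimilarity toy = (w1, s1, w2, s2) -> 1.0 / (1 + Math.abs(s1 - s2));
        for (String line : scoreAllPairs("dog", 2, "cat", 3, "n", toy)) {
            System.out.println(line);
        }
    }
}
```

With 2 senses for the first word and 3 for the second, this produces 6 lines, one per sense pair. In practice you would replace the toy lambda with a call into whichever similarity library you end up using, and take the sense counts from your WordNet lookup.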