I’ve found the class WordnetSynonymParser in org.apache.lucene.analysis.synonym, but there are no examples of its usage either in the API docs or on Google. Does anyone have experience with it?
Thank you!
EDIT: I know that there used to be the class SynExpand, but with version 3.6 it disappeared…
This is what I tried:
try {
    // WordNet synonyms in Prolog format
    FileReader rulesReader = new FileReader("wn/wn_s.pl");
    // dedup = true, expand = true
    WordnetSynonymParser parser = new WordnetSynonymParser(true, true, analyzer);
    parser.add(rulesReader);
    synonymMap = parser.build();
} catch (Exception e) {
    e.printStackTrace();
    System.exit(1);
}
But I get the following error:
java.text.ParseException: Invalid synonym rule at line 109
    at org.apache.lucene.analysis.synonym.WordnetSynonymParser.add(WordnetSynonymParser.java:75)
    at pirServer.QueryClassifier.<init>(QueryClassifier.java:77)
    at pirServer.PIRServer.main(PIRServer.java:32)
Caused by: java.lang.IllegalArgumentException: term: course of action analyzed to a token with posinc != 1
    at org.apache.lucene.analysis.synonym.SynonymMap$Builder.analyze(SynonymMap.java:131)
    at org.apache.lucene.analysis.synonym.WordnetSynonymParser.parseSynonym(WordnetSynonymParser.java:92)
    at org.apache.lucene.analysis.synonym.WordnetSynonymParser.add(WordnetSynonymParser.java:67)
    ... 2 more
I am working on something similar and have just read the documentation, so a relevant caution from the SynonymFilter docs is fresh in my mind:
“This token stream cannot properly handle position increments != 1, ie, you should place this filter before filtering out stop words”
http://lucene.apache.org/core/3_6_0/api/all/org/apache/lucene/analysis/synonym/SynonymFilter.html
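It’s possible that the analyzer you’re passing to WordnetSynonymParser (which you don’t describe in your post) removes stop words, as most of them do. If so, “of” would be dropped from “course of action”, the token “action” would end up with a position increment of 2, and the parser would reject the rule with: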
java.lang.IllegalArgumentException: term: course of action analyzed to a token with posinc != 1
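To make the mechanism concrete, here is a minimal plain-Java sketch of how removing stop words changes position increments. It does not use Lucene at all: the class PosIncDemo, its methods, and the three-word stop list are made up for illustration, and the whitespace tokenization is only a rough stand-in for what a real analyzer does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; this is not Lucene code.
class PosIncDemo {

    // Splits on whitespace, drops stop words, and returns the position
    // increment each surviving token would carry: 1 plus the number of
    // stop words skipped immediately before it.
    static List<Integer> positionIncrements(String text, Set<String> stopWords) {
        List<Integer> increments = new ArrayList<Integer>();
        int skipped = 0;
        for (String token : text.trim().toLowerCase().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // empty input yields a single empty token
            }
            if (stopWords.contains(token)) {
                skipped++;
            } else {
                increments.add(1 + skipped);
                skipped = 0;
            }
        }
        return increments;
    }

    // Mirrors the kind of check behind the error message: every token
    // of a synonym entry must have a position increment of exactly 1.
    static boolean allIncrementsAreOne(List<Integer> increments) {
        for (int inc : increments) {
            if (inc != 1) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Assumed stop list, just for the demo.
        Set<String> stop = new HashSet<String>(Arrays.asList("of", "the", "a"));
        Set<String> none = new HashSet<String>();

        System.out.println(positionIncrements("course of action", stop)); // [1, 2]
        System.out.println(positionIncrements("course of action", none)); // [1, 1, 1]
    }
}
```

If that is what is happening, one thing to try is building the parser with an analyzer that keeps stop words, for instance WhitespaceAnalyzer, or StandardAnalyzer constructed with an empty stop-word set. I have not tested this against your wn_s.pl file, so treat it as a suggestion rather than a confirmed fix.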