I want to POS-tag an English sentence and do some processing. I would like to use OpenNLP, and I have it installed.
When I execute this command:
I:\Workshop\Programming\nlp\opennlp-tools-1.5.0-bin\opennlp-tools-1.5.0>java -jar opennlp-tools-1.5.0.jar POSTagger models\en-pos-maxent.bin < Text.txt
it POS-tags the input in Text.txt and prints this output:
Loading POS Tagger model ... done (4.009s)
My_PRP$ name_NN is_VBZ Shabab_NNP i_FW am_VBP 22_CD years_NNS old._.
Average: 66.7 sent/s
Total: 1 sent
Runtime: 0.015s
I assume this means it is installed properly?
Now, how do I do this POS tagging from inside a Java application? I have added the OpenNLP tools, JWNL, and maxent JARs to the project, but how do I invoke the POS tagger?
Here’s some (old) sample code I threw together, with modernized code to follow:
The output is:
This is basically working from the POSTaggerTool class included as part of OpenNLP. sample.getTags() is a String array that holds the tag types themselves.

This requires direct file access to the trained model, which is really, really lame.
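
For reference, the tagging step described above can be sketched roughly as follows. To keep the sketch runnable without the OpenNLP jar, the tagger is passed in as a plain function. The PosTagSketch class, the tagLine method, and the stub tagger are hypothetical names used only for illustration. The comment at the top shows how a POSTaggerME from OpenNLP 1.5.x would plug in, assuming the model file sits at models/en-pos-maxent.bin as in the question.

```java
import java.util.Arrays;
import java.util.function.Function;

// Sketch only: the tagger is a function so this compiles without OpenNLP.
// With OpenNLP 1.5.x on the classpath, the wiring would look like this:
//
//   POSModel model = new POSModel(new FileInputStream("models/en-pos-maxent.bin"));
//   POSTaggerME tagger = new POSTaggerME(model);
//   String tagged = PosTagSketch.tagLine(line, tokens -> tagger.tag(tokens));
//
// WhitespaceTokenizer.INSTANCE.tokenize(line) could replace the split below.
class PosTagSketch {

    /** Splits a line on whitespace, tags the tokens, and joins them as token_TAG. */
    static String tagLine(String line, Function<String[], String[]> tagger) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return "";
        }
        String[] tokens = trimmed.split("\\s+");
        String[] tags = tagger.apply(tokens);
        if (tags.length != tokens.length) {
            throw new IllegalStateException("tagger returned " + tags.length
                    + " tags for " + tokens.length + " tokens");
        }
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < tokens.length; i++) {
            if (i > 0) {
                out.append(' ');
            }
            out.append(tokens[i]).append('_').append(tags[i]);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical stand-in tagger: labels every token "X".
        Function<String[], String[]> stub = tokens -> {
            String[] tags = new String[tokens.length];
            Arrays.fill(tags, "X");
            return tags;
        };
        System.out.println(tagLine("My name is Shabab", stub)); // prints: My_X name_X is_X Shabab_X
    }
}
```

With a real POSTaggerME in place of the stub, the same call should produce lines in the token_TAG shape shown in the question.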
An updated codebase for this is a little different (and probably more useful).
First, a Maven POM:
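
As a minimal sketch, the dependency section of such a POM might look like the fragment below. The version number is an assumption (any release published to Maven Central that your project targets should do), and a test framework dependency of your choice would also be needed, since the code is written as a test.

```xml
<dependencies>
  <!-- Version is an assumption; use the release your project targets. -->
  <dependency>
    <groupId>org.apache.opennlp</groupId>
    <artifactId>opennlp-tools</artifactId>
    <version>1.9.4</version>
  </dependency>
</dependencies>
```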
And here’s the code, written as a test, therefore located in ./src/test/java/org/javachannel/opennlp/example:

This code doesn’t actually test anything (it’s a smoke test, if anything), but it should serve as a starting point. Another (potentially) nice thing is that it downloads a model for you if you don’t have it downloaded already.
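
The "download the model if it is missing" idea can be sketched with the standard library alone. This is an illustration under assumptions, not the code from the test: ModelFetcher and ensureModel are hypothetical names, and the source URL and target path are supplied by the caller.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: fetches a model file once and reuses it afterwards.
class ModelFetcher {

    /** Returns target, downloading it from source first if it does not exist yet. */
    static Path ensureModel(URL source, Path target) throws IOException {
        if (Files.exists(target)) {
            return target; // already downloaded, nothing to do
        }
        Path parent = target.toAbsolutePath().getParent();
        Files.createDirectories(parent);
        // Download to a temporary file first so an interrupted transfer
        // never leaves a truncated model at the final location.
        Path tmp = Files.createTempFile(parent, "model", ".part");
        try {
            try (InputStream in = source.openStream()) {
                Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
            }
            Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            Files.deleteIfExists(tmp); // no-op after a successful move
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: ModelFetcher <model-url> <target-path>");
            return;
        }
        Path model = ensureModel(URI.create(args[0]).toURL(), Paths.get(args[1]));
        System.out.println("Model available at " + model.toAbsolutePath());
    }
}
```

For the English POS model, the URL argument would typically be the models-1.5 download location on the OpenNLP SourceForge site (an assumption; check where the model for your version is hosted), and the returned path can then be opened as the input stream for the model.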