Are there any common or recommended techniques for using the context of a word to improve the accuracy of part-of-speech tagging?
For example, take the sentence:
I played golf on a links.
The word “links” could be either singular (a golf course) or plural. I tried this sentence in several grammar checkers and they all correctly recognized the sentence as valid.
The problem is that they also accepted this sentence as valid:
I clicked on a links.
Is there a good way to use the context (clicked vs played golf) to infer the correct part-of-speech?
Thanks!
Determining whether “links” means a golf course or hyperlinks is a task called word-sense disambiguation (WSD).
Here is what Wikipedia’s article on Word-sense disambiguation says about the relation to part-of-speech tagging:
I am not aware of any work that uses WSD to inform POS tagging (using POS tags to inform WSD, on the other hand, is standard). It sounds like a good idea to me, even if the gain would be small because POS-tagging accuracy is already high. It could be implemented as an extra feature in a feature-based tagger such as Toutanova’s maximum-entropy tagger (the Stanford POS tagger).
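To make the "sense as a tagger feature" idea a bit more concrete, here is a minimal Python sketch. It is not how any existing tagger works: the cue-word lists, the function names (sense_feature, tagger_features) and the feature strings are all made up for illustration, and a real system would use a trained WSD component instead of hand-picked cue words.

```python
# Toy cue lists: hypothetical, hand-picked for this example only.
GOLF_CUES = {"golf", "played", "course", "tee", "fairway"}
WEB_CUES = {"clicked", "click", "page", "website", "url", "browser"}


def sense_feature(tokens, index, window=5):
    """Return a coarse sense label for tokens[index] from nearby words."""
    lo = max(0, index - window)
    neighbours = tokens[lo:index] + tokens[index + 1:index + 1 + window]
    context = {t.lower() for t in neighbours}
    golf = len(context & GOLF_CUES)
    web = len(context & WEB_CUES)
    if golf > web:
        return "sense=golf_course"
    if web > golf:
        return "sense=hyperlink"
    return "sense=unknown"


def tagger_features(tokens, index):
    """Build the feature set a feature-based tagger might see for one token."""
    word = tokens[index].lower()
    prev_word = tokens[index - 1].lower() if index > 0 else "<s>"
    feats = {"word=" + word, "prev=" + prev_word, "suffix=" + word[-1:]}
    # Only ambiguous words get the extra sense feature in this sketch.
    if word == "links":
        feats.add(sense_feature(tokens, index))
    return feats


golf_sentence = "I played golf on a links .".split()
web_sentence = "I clicked on a links .".split()
print(tagger_features(golf_sentence, 5))
print(tagger_features(web_sentence, 4))
```

With features like these, a trained model could in principle learn that "prev=a" together with "sense=golf_course" is compatible with a singular noun tag, while "prev=a" together with "sense=hyperlink" is not, which is the distinction the grammar checkers missed.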