How can I extract person names from the text?
I used the Stanford NER toolkit to extract names from text. It extracts person names, but it does not pick up words like ‘programmer’, ‘lecturer’, or ‘engineer’. Is there any way to extract these from the text as well?
Since “programmer”, “lecturer”, and “engineer” are not named entities, you may have to maintain a list of those words. I think you can obtain them from the derivational relationships in WordNet, which link pairs like “sing” (verb) and “singer” (noun), or “lecture” (verb) and “lecturer” (noun).
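To make the "maintain a list" idea concrete, here is a minimal sketch in plain Python. The seed list, the name find_occupations, and the naive plural handling are all assumptions made for illustration, not part of Stanford NER or WordNet; in practice the list would be filled from whatever lexicon you end up using.

```python
import re

# Hypothetical seed list; a real one could be built from WordNet or another lexicon.
OCCUPATIONS = {"programmer", "lecturer", "engineer", "software engineer"}


def find_occupations(text, occupations=OCCUPATIONS):
    """Return (start, end, matched_text) for each occupation term found in text."""
    # Try longer terms first so "software engineer" wins over "engineer".
    terms = sorted(occupations, key=len, reverse=True)
    alternatives = "|".join(re.escape(term) for term in terms)
    # Whole-word match, case-insensitive, with a naive optional plural "s".
    pattern = re.compile(r"\b(?:" + alternatives + r")s?\b", re.IGNORECASE)
    return [(m.start(), m.end(), m.group(0)) for m in pattern.finditer(text)]


if __name__ == "__main__":
    sample = "Alice is a Programmer and Bob is a software engineer."
    for start, end, word in find_occupations(sample):
        print(start, end, word)
```

A plain list like this only finds the words you put in it, so it works best as a complement to the NER output rather than a replacement for it.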
A supersense tagger can also be used as a kind of NER. I think it would tag the words you mentioned as “noun.person”, which is what you need. ArkRef (Java) is a coreference tool that uses one (through a bundled Java port of a supersense tagger), and there’s an online demo, so you can check whether your target words are tagged in square brackets.