I am learning part-of-speech (POS) tagging with transformation rules. The first step is to assign each word in a text its possible POS tags using a dictionary such as:
communicative JJ
communicator NN
communicators NNS
communion NN
communique NN
communiques NNS
communism NN
The second step is to apply transformation rules that change those tags. I have only a very small dictionary containing the word/tag pairs above. Where can I find a larger one? Transformation-based tagging is said to need a lot of rules; where can I find them?
Thank you in advance.
You can obtain the possible tags for each word from a tagged corpus, such as those available in NLTK. A corpus also gives you tag frequencies from which to estimate probabilities, if you want to do machine-learned tagging (Brill-style).
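As a rough sketch of that first step, the snippet below builds a word-to-tag frequency table from (word, tag) pairs and picks the most frequent tag for each word. The tiny `sample` list is made up for illustration; in practice you could feed in something like `nltk.corpus.treebank.tagged_words()` instead (after downloading the corpus). The function names and the `NN` fallback for unknown words are my own assumptions, not part of any library.

```python
from collections import Counter, defaultdict


def build_lexicon(tagged_words):
    """Count how often each tag occurs with each (lowercased) word."""
    counts = defaultdict(Counter)
    for word, tag in tagged_words:
        counts[word.lower()][tag] += 1
    return counts


def most_likely_tag(lexicon, word, default="NN"):
    """Return the most frequent tag for a word, or a default if unseen."""
    tags = lexicon.get(word.lower())
    if not tags:
        return default  # assumption: treat unknown words as nouns
    return tags.most_common(1)[0][0]


# Hypothetical toy corpus; replace with a real tagged corpus.
sample = [
    ("the", "DT"), ("race", "NN"),
    ("to", "TO"), ("race", "VB"),
    ("the", "DT"), ("race", "NN"),
]

lexicon = build_lexicon(sample)
initial = [(w, most_likely_tag(lexicon, w)) for w in ["to", "race", "communique"]]
print(initial)
```

The table keeps all observed tags per word, so it doubles as the "possible tags" dictionary and as the frequency source for the initial guess.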
In Brill’s approach, the rule templates are handcrafted; the machine learner then instantiates them on a tagged corpus and finds out which rules to apply and in what order. See, e.g., Brill’s PhD thesis for English rules.
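To make the idea of a transformation concrete, here is a minimal sketch of applying one rule of the form "change tag A to tag B when the previous tag is Z". The rule shown (NN becomes VB after TO) is a commonly cited example of this style; the `Rule` type and `apply_rule` function are hypothetical names for illustration, and a real tagger would use more kinds of context and an ordered list of many rules.

```python
from typing import List, NamedTuple, Tuple


class Rule(NamedTuple):
    from_tag: str  # tag to be replaced
    to_tag: str    # replacement tag
    prev_tag: str  # required tag of the preceding word


def apply_rule(tagged: List[Tuple[str, str]], rule: Rule) -> List[Tuple[str, str]]:
    """Apply one rule left to right; earlier changes are visible to later positions."""
    out = list(tagged)
    for i in range(1, len(out)):
        word, tag = out[i]
        if tag == rule.from_tag and out[i - 1][1] == rule.prev_tag:
            out[i] = (word, rule.to_tag)
    return out


# Initial tags, e.g. from a most-frequent-tag dictionary lookup.
sentence = [("expected", "VBN"), ("to", "TO"), ("race", "NN"), ("tomorrow", "NN")]

rule = Rule(from_tag="NN", to_tag="VB", prev_tag="TO")
retagged = apply_rule(sentence, rule)
print(retagged)
```

Only "race" changes here, because "tomorrow" is preceded by a word whose tag is no longer TO. Applying several rules means calling `apply_rule` once per rule, in order.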