I am working with text which is, unfortunately, given in ALL CAPS. The default

Question

Asked: May 27, 2026

I am working with text which is, unfortunately, given in ALL CAPS. The default nltk.pos_tag function does not do a very good job on this text (it thinks everything is a proper noun).

What is the best way to deal with this issue?

1 Answer

Editorial Team · Answer 1 · 2026-05-27T01:10:41+00:00

The best approach is to apply truecasing (restoring the correct capitalization) to your text before POS tagging.
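As a rough illustration of what truecasing means, here is a minimal frequency-based sketch. It assumes you have some normally cased, tokenized reference text from a similar domain; the function names `train_truecaser` and `truecase` are hypothetical, and a real truecaser would use far more data and context than this baseline does.

```python
from collections import Counter, defaultdict


def train_truecaser(cased_sentences):
    """Learn the most frequent surface form of each lowercased token.

    `cased_sentences` is a list of token lists in normal casing
    (an assumed reference corpus, not part of the original question).
    """
    forms = defaultdict(Counter)
    for sentence in cased_sentences:
        # Skip the first token: a sentence-initial capital says little
        # about how the word is normally cased.
        for token in sentence[1:]:
            forms[token.lower()][token] += 1
    return {lower: counts.most_common(1)[0][0] for lower, counts in forms.items()}


def truecase(tokens, model):
    """Restore casing for ALL-CAPS tokens; unknown words become lowercase."""
    restored = [model.get(tok.lower(), tok.lower()) for tok in tokens]
    if restored:
        # Capitalize the first token of the sentence.
        restored[0] = restored[0][:1].upper() + restored[0][1:]
    return restored


reference = [
    ["Yesterday", "Alice", "visited", "Paris"],
    ["We", "met", "Alice", "in", "Paris"],
]
model = train_truecaser(reference)
restored_tokens = truecase(["ALICE", "VISITED", "PARIS"], model)
```

The restored tokens can then be passed to the tagger in place of the ALL-CAPS input.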

If that is too much effort, you can lowercase your Python string x with x.lower(). That should at least avoid the problem of every token being tagged as a proper noun, although the tagger may then assign too few proper noun tags.
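The lowercasing workaround might look like the sketch below. The helper `tag_lowercased` is a hypothetical name; it takes the tagger as a parameter so that the tags computed on the lowercased tokens can be paired back with the original tokens.

```python
def tag_lowercased(tokens, tagger):
    """Tag the lowercased tokens, but return the original tokens with those tags.

    `tagger` is any callable that maps a list of tokens to a list of
    (token, tag) pairs, which is the shape nltk.pos_tag returns.
    """
    lowered = [tok.lower() for tok in tokens]
    tags = [tag for _, tag in tagger(lowered)]
    return list(zip(tokens, tags))


def stub_tagger(tokens):
    """Stand-in tagger for demonstration only: capitalized means proper noun."""
    return [(tok, "NNP" if tok[:1].isupper() else "NN") for tok in tokens]


tagged = tag_lowercased(["THE", "DOG", "BARKS"], stub_tagger)
```

With NLTK installed and its tagger model downloaded, you would pass nltk.pos_tag instead of the stub: tag_lowercased(tokens, nltk.pos_tag).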

You could also train a POS tagger on a tagged corpus that you have lowercased beforehand, but if you want the best results, truecasing is probably the way to go.
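To make the retraining idea concrete, here is a small self-contained sketch. It lowercases a tagged corpus and trains a most-frequent-tag baseline on it. The tiny corpus and the names `lowercase_tagged_sents` and `train_unigram_tagger` are made up for illustration; in practice you would feed the lowercased sentences to a trainable tagger such as nltk.UnigramTagger or a stronger model.

```python
from collections import Counter, defaultdict


def lowercase_tagged_sents(tagged_sents):
    """Lowercase every word in a corpus of (word, tag) sentences; tags are kept."""
    return [[(word.lower(), tag) for word, tag in sent] for sent in tagged_sents]


def train_unigram_tagger(tagged_sents, default_tag="NN"):
    """Return a tagger that assigns each word its most frequent training tag."""
    counts = defaultdict(Counter)
    for sent in tagged_sents:
        for word, tag in sent:
            counts[word][tag] += 1
    best = {word: c.most_common(1)[0][0] for word, c in counts.items()}

    def tag(tokens):
        # Look tokens up in lowercase, so ALL-CAPS input matches the training data.
        return [(tok, best.get(tok.lower(), default_tag)) for tok in tokens]

    return tag


# Toy corpus, purely illustrative.
corpus = [
    [("The", "DT"), ("dog", "NN"), ("barks", "VBZ")],
    [("London", "NNP"), ("sleeps", "VBZ")],
]
tagger = train_unigram_tagger(lowercase_tagged_sents(corpus))
result = tagger(["LONDON", "BARKS"])
```

Because the model never sees capitalization during training, it has to rely on word identity (and, in a richer tagger, context) to recognize proper nouns.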
