I want to create tags for my content automatically. There would be a fixed list of tags, and the bot should pick tags from that list. How can I do that? Do you know of a class for that? Any suggestions would be appreciated!
Thank you!
How good do you need the tags to be?
You could simply count word n-gram frequencies and use the most frequent ones as tags.
With some tweaking this can create perfectly valid tags to use with blog posts, for example.
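As a rough illustration of the n-gram counting idea combined with the fixed tag list from the question, here is a minimal Python sketch. The function names (`tokenize`, `ngram_counts`, `suggest_tags`), the tiny stop-word set, and the sample tag list are all made up for this example; treat it as a starting point, not a finished tagger.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; extend it for real content.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "it",
              "for", "on", "with", "this", "that"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def ngram_counts(text, n=1):
    """Count word n-grams, skipping those that start or end with a stop word."""
    words = tokenize(text)
    counts = Counter()
    for gram in zip(*(words[i:] for i in range(n))):
        if gram[0] in STOP_WORDS or gram[-1] in STOP_WORDS:
            continue
        counts[" ".join(gram)] += 1
    return counts


def suggest_tags(text, allowed_tags, max_n=2, limit=5):
    """Return the allowed tags that occur in the text, most frequent first."""
    allowed = {tag.lower() for tag in allowed_tags}
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(ngram_counts(text, n))
    matches = [(tag, c) for tag, c in counts.items() if tag in allowed]
    # Sort by descending frequency, then alphabetically for stable output.
    matches.sort(key=lambda pair: (-pair[1], pair[0]))
    return [tag for tag, _ in matches[:limit]]


if __name__ == "__main__":
    post = ("Machine learning is fun. Machine learning with Python makes "
            "data analysis easier. Python is popular.")
    print(suggest_tags(post, ["machine learning", "python", "cooking"]))
```

Exact string matching like this misses plurals and synonyms, so stemming or an alias table for each tag would probably be one of the first tweaks.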
If you’re looking for something more advanced, and you have a corpus of documents, you could use TF-IDF (term frequency-inverse document frequency). It surfaces the meaningful keywords in one document by weighting each term by how unlikely it is to appear in the other documents. It will give you good results provided your corpus is large enough.
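To make the TF-IDF idea concrete, here is a small pure-Python sketch. It uses one common variant (term frequency normalised by document length, IDF as `log(N / df)`); other weightings exist, and the function name `tfidf_keywords` and the three-sentence corpus are invented for illustration only.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def tfidf_keywords(doc_index, corpus, limit=5):
    """Rank the terms of corpus[doc_index] by TF-IDF, highest score first."""
    tokenized = [tokenize(doc) for doc in corpus]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for words in tokenized:
        df.update(set(words))

    words = tokenized[doc_index]
    tf = Counter(words)
    scores = {}
    for term, count in tf.items():
        idf = math.log(n_docs / df[term])  # 0 when the term is in every document
        scores[term] = (count / len(words)) * idf

    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    # Drop zero-score terms: they appear everywhere and say nothing specific.
    return [term for term, score in ranked[:limit] if score > 0]


if __name__ == "__main__":
    corpus = [
        "the cat sat on the mat",
        "the dog chased the cat",
        "the bird sang a song about the sea",
    ]
    print(tfidf_keywords(0, corpus, limit=3))
```

With a corpus this small the scores are not very informative, which is the point made above about needing enough documents. To stay within a fixed tag list, the ranked keywords could then be filtered against that list.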
A shortcut approach might be to identify a relevant section of the content (title? category? source?) and derive the tags from that alone instead of from the full text.
Yahoo also has a term extraction API, which you might find interesting.