Is there a Java library that, given a text (a title), returns the collection of important words in it?
EDITED: By important I mean the words that define the main idea of the sentence.
Thank you.
You might want to take a look at Apache Mahout.
You might also want to read up on the tf-idf model, which is often used in cases similar to the one you describe.
EDIT: more info on the tf-idf model:
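The tf-idf model basically says 2 things: (1) the more often a term occurs in a document, the more important it is to that document (term frequency, tf); (2) the more documents in the collection contain a term, the less informative that term is (inverse document frequency, idf).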
The tf-idf model uses these assumptions and gives each term a rating according to its tf and idf values.
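For reference, one common textbook way to write that rating is shown below. This is only one of several variants, not necessarily the exact weighting a given library uses. Here N is assumed to be the number of documents in the collection and df(t) the number of documents containing term t:

```latex
\mathrm{tfidf}(t, d) = \mathrm{tf}(t, d) \cdot \log\frac{N}{\mathrm{df}(t)}
```

A term that appears in every document gets log(N/N) = 0, so very common words such as "the" or "how" drop out of the ranking.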
To find the idf value, you might want to index your own collection, or use a search engine API and estimate how common each term is from the number of results. [Note that the number returned by the engine is not exact, but it can serve as a rough estimate.]
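If you index your own collection, the scoring can be sketched in plain Java without any library. This is a minimal illustration under a few assumptions, not a definitive implementation: the class name `TfIdfSketch` and its methods are made up for this example, the tokenizer is deliberately naive, the collection is assumed to contain the title being scored, and the idf is a smoothed variant, log((1 + N) / (1 + df)), so that unseen terms do not cause a division by zero.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration; not part of Mahout or any other library.
final class TfIdfSketch {

    // Naive tokenizer: lowercase, then split on anything that is not a letter or digit.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Scores every term of `title` against `collection`.
    // Assumption: `collection` is your own set of titles and includes `title` itself.
    static Map<String, Double> score(String title, List<String> collection) {
        // Document frequency: in how many documents does each term occur?
        Map<String, Integer> df = new HashMap<>();
        for (String doc : collection) {
            Set<String> unique = new HashSet<>(tokenize(doc));
            for (String term : unique) {
                df.merge(term, 1, Integer::sum);
            }
        }

        // Term frequency: how often does each term occur in the title?
        Map<String, Integer> tf = new HashMap<>();
        for (String term : tokenize(title)) {
            tf.merge(term, 1, Integer::sum);
        }

        int n = collection.size();
        Map<String, Double> scores = new HashMap<>();
        for (Map.Entry<String, Integer> e : tf.entrySet()) {
            int docFreq = df.getOrDefault(e.getKey(), 0);
            // Smoothed idf (an assumption of this sketch): zero for terms found in every document.
            double idf = Math.log((1.0 + n) / (1.0 + docFreq));
            scores.put(e.getKey(), e.getValue() * idf);
        }
        return scores;
    }

    public static void main(String[] args) {
        List<String> titles = List.of(
                "How to parse JSON in Java",
                "How to sort a list in Java",
                "How to read a file in Java");
        // Print each term of the first title with its score; higher means more distinctive.
        score(titles.get(0), titles).forEach((term, s) -> System.out.println(term + " " + s));
    }
}
```

Sorting the returned map by value in descending order and keeping the top few entries gives the "important" words of the title. If you estimate df from search engine result counts instead, only the `df` map changes; the rest of the scoring stays the same.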