I just went through the Solr wiki page for clustering, but I don't understand what the benefit of using clustering is. Can anyone tell me what clustering actually is and what its use is in indexing and searching?
Please reply
Clustering is a statistical technique for sorting data into groups of items ‘which belong together’.
In Solr specifically, this means that it will try to group the results for a certain query and label those groups.
This can give you additional information about the nature of the results returned.
Example: if you search for ‘Python’ on a very broad set of documents, the clustering component might create groups for ‘The Python programming language’, ‘Python the snake’, etc.
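To make that a bit more concrete, here is a rough sketch of how a client might ask Solr for clustered results and read the group labels back. It assumes the clustering component is already enabled on the request handler you call; the base URL, the core name `mycore`, the `select` handler, and the sample response are all made up for illustration, and the exact response layout may differ between Solr versions.

```python
from urllib.parse import urlencode


def build_clustering_url(base_url, core, query, rows=100, handler="select"):
    """Build a search URL that asks Solr to cluster the returned results.

    `core` and `handler` are assumptions: use whichever handler has the
    clustering component configured in your solrconfig.xml.
    """
    params = {"q": query, "rows": rows, "wt": "json", "clustering": "true"}
    return f"{base_url}/{core}/{handler}?{urlencode(params)}"


def summarize_clusters(response):
    """Return (label, number of documents) for each cluster in a response."""
    summary = []
    for cluster in response.get("clusters", []):
        label = ", ".join(cluster.get("labels", []))
        summary.append((label, len(cluster.get("docs", []))))
    return summary


# Hand-written sample in the shape of a clustered response (not real output).
sample_response = {
    "response": {"numFound": 5, "docs": []},
    "clusters": [
        {"labels": ["Programming Language"], "score": 12.3, "docs": ["d1", "d2", "d4"]},
        {"labels": ["Snake"], "score": 7.1, "docs": ["d3", "d5"]},
    ],
}

url = build_clustering_url("http://localhost:8983/solr", "mycore", "Python")
clusters = summarize_clusters(sample_response)
for label, size in clusters:
    print(f"{label}: {size} documents")
```

The point is only that clustering adds a list of labelled groups next to the normal result list; the ordinary search results are still returned as usual.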
Have a look at the Carrot2 demo site (Carrot2 is the clustering engine shipped with Solr):
http://search.carrot2.org/stable/search
Solr’s clustering component (Carrot2) clusters the documents using the text fields that Solr returns in the result list. (The fields used are configurable.)
It uses the terms in the text field to build the clusters and label them.
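As a toy illustration of building and labelling groups from terms, the sketch below groups a handful of made-up snippets by the most widely shared term in each one. This is not the algorithm Carrot2 uses; it is a deliberately simple stand-in, and the stopword list, the `min_docs` threshold, and the sample documents are all assumptions.

```python
import re
from collections import Counter, defaultdict

# Tiny illustrative stopword list (an assumption, not Solr's).
STOPWORDS = {"a", "an", "the", "is", "in", "of", "and", "for", "to", "with", "on"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def cluster_by_terms(docs, query, min_docs=2):
    """Group documents under the most widely shared term each one contains.

    Query terms are ignored because every hit contains them. A term can
    label a group only if it appears in at least `min_docs` documents.
    """
    query_terms = set(tokenize(query))
    doc_terms = {
        doc_id: set(tokenize(text)) - query_terms for doc_id, text in docs.items()
    }
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for terms in doc_terms.values() for term in terms)

    clusters = defaultdict(list)
    for doc_id, terms in doc_terms.items():
        candidates = [t for t in terms if df[t] >= min_docs]
        # Prefer the most frequent shared term; break ties alphabetically.
        label = min(candidates, key=lambda t: (-df[t], t)) if candidates else "(other)"
        clusters[label].append(doc_id)
    return dict(clusters)


docs = {
    1: "Python is a programming language",
    2: "Learn the Python programming language with examples",
    3: "The python is a large snake",
    4: "Python snake habitat in Asia",
    5: "Monty Python sketches",
}

groups = cluster_by_terms(docs, "Python")
for label, ids in groups.items():
    print(label, ids)
```

A real engine produces readable phrase labels and handles overlapping groups; the sketch only shows that both the grouping and the labels can come from the terms in the returned text.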
There is a very interesting presentation on the Carrot2 website:
http://project.carrot2.org/publications/carrot2-dresden-2007.pdf