I have two clusters, each an instance of a class with the following fields:
Cluster : class
DocumentList : List<Document>
centroidVector : Map<String,Double>
The problem is that when a query is submitted, it is parsed as a file, turned into a Document object, and added to documentIndex, so it is indexed along with the other documents. I did that because the query has to go through the same pipeline (tokenizing, stemming, etc.). Now I want to restrict the search to the specific cluster whose centroid vector is most similar to the query vector, i.e. a dot product of roughly 0.5 to 1. To do that I would have to take the dot product between the query vector and each cluster's centroid vector, but I don't know how to implement it, because the index is built in memory and is not yet stored in the database; I am still working on that.
Thank you
Clustering is not meant for searching (i.e. indexing etc.). It is an analysis step meant to find possible unknown structure within your data set, not to retrieve information faster.
You can sometimes exploit the cluster structure for faster search, but then you need an index that can make use of it.
If you want similarity search, just build an index right away. You can then try to improve that index by clustering the documents first.
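If you still want to route a query to its closest cluster, here is a minimal sketch of the comparison step. It assumes the query has already been tokenized and stemmed into a sparse Map<String,Double> with the same term weighting as the centroids. The ClusterSearch class, the trimmed-down Cluster stand-in, and the minSimilarity parameter are hypothetical names for illustration, not part of your code. It uses cosine similarity (the dot product divided by both vector lengths) so that the 0.5 to 1 range is meaningful even when the vectors are not normalized.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

final class ClusterSearch {

    /** Hypothetical, trimmed-down stand-in for the Cluster class in the question. */
    static final class Cluster {
        final String name;
        final Map<String, Double> centroidVector;

        Cluster(String name, Map<String, Double> centroidVector) {
            this.name = name;
            this.centroidVector = centroidVector;
        }
    }

    /** Sparse dot product: only terms present in both maps contribute. */
    static double dot(Map<String, Double> a, Map<String, Double> b) {
        // Iterate over the smaller map and look terms up in the larger one.
        Map<String, Double> small = a.size() <= b.size() ? a : b;
        Map<String, Double> large = (small == a) ? b : a;
        double sum = 0.0;
        for (Map.Entry<String, Double> e : small.entrySet()) {
            Double w = large.get(e.getKey());
            if (w != null) {
                sum += e.getValue() * w;
            }
        }
        return sum;
    }

    /** Cosine similarity; returns 0 if either vector has zero length. */
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double normA = Math.sqrt(dot(a, a));
        double normB = Math.sqrt(dot(b, b));
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot(a, b) / (normA * normB);
    }

    /**
     * Returns the cluster whose centroid is most similar to the query,
     * or an empty Optional if no centroid reaches minSimilarity.
     */
    static Optional<Cluster> bestCluster(Map<String, Double> query,
                                         List<Cluster> clusters,
                                         double minSimilarity) {
        Cluster best = null;
        double bestScore = minSimilarity;
        for (Cluster c : clusters) {
            double score = cosine(query, c.centroidVector);
            if (score >= bestScore) {
                best = c;
                bestScore = score;
            }
        }
        return Optional.ofNullable(best);
    }
}
```

Calling bestCluster(queryVector, clusters, 0.5) gives you the cluster to search, and you can then rank only the documents in that cluster's DocumentList against the query. Note that this works on the in-memory vectors directly, so the query does not have to be added to documentIndex at all.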