I have a system that analyzes logs. It already uses Lucene to index data. Now I want to add distributed search to it, so I am opting for Solr. My problem is this: I am querying for the search term “logoff”. I have three categories of indices: hot, warm and cold. The hot indices are the current ones, whereas warm and cold are indices from previous days.
Assume that for the query “logoff” I don’t get enough results from the hot section, so I have to go to the warm section and search through it. If I still don’t reach the desired count with warm, I will have to search through cold as well.
So is there any way I can do this optimally via Solr, or should I write my own wrapper over it based on the numFound parameter? In short, I want to prioritize the indices, and search a lower-priority index only if the higher-priority indices do not return sufficient results.
The easiest solution would be to keep all your data in a single index, with a field containing the type of data: hot, warm or cold. Then you can use edismax and give a different weight to each value of type. That way you would do everything with a single query, but you would always have all three types of data in your results, with hot on top, then warm and then cold. The other problem is switching hot to warm and warm to cold every day: this solution would require resubmitting all your documents, changing only the type field, and this doesn’t seem to be what you want.
Otherwise you could use three different indexes, each on a different Solr core, and switch on the client side from one core to the next. This way you have to make more than one query. Then you can have a look at the Solr CoreAdmin swap capability to switch hot to warm and warm to cold.
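Both options can be sketched roughly as follows. This is only an illustration under assumptions, not a tested Solr client: the field name type, the boost values, and the function names boosted_params, solr_select and tiered_search are made up for the example, and the core URLs are whatever your setup uses. The first function builds the parameters for the single-index edismax variant (using the bq boost query parameter); the last one queries the cores in priority order and stops as soon as enough documents have been collected.

```python
import json
import urllib.parse
import urllib.request


def boosted_params(term, rows=10):
    """Single-index variant: edismax plus a boost query on a
    hypothetical 'type' field. Boost values are arbitrary examples."""
    return {
        "defType": "edismax",
        "q": term,
        "bq": "type:hot^10 type:warm^5 type:cold^1",
        "rows": rows,
        "wt": "json",
    }


def solr_select(core_url, params):
    """Send a query to the core's /select handler and return the
    'response' part of the JSON (it contains numFound and docs)."""
    url = core_url.rstrip("/") + "/select?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)["response"]


def tiered_search(term, core_urls, wanted, select=solr_select):
    """Multi-core variant: query cores in priority order (hot, warm,
    cold) and fall through to the next one only while fewer than
    'wanted' documents have been collected."""
    docs = []
    for core_url in core_urls:
        remaining = wanted - len(docs)
        if remaining <= 0:
            break  # higher-priority cores already gave enough results
        result = select(core_url, {"q": term, "rows": remaining, "wt": "json"})
        docs.extend(result["docs"][:remaining])
    return docs
```

A call would look something like tiered_search("logoff", [hot_url, warm_url, cold_url], 50), where the three URLs point at your own cores. The select argument is there only so the HTTP call can be replaced, for instance in a test.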