I’ve just started trying to use Solr, and already I think that I’m attempting to use it backwards. Could someone let me know if what I’m trying to do is possible?
In normal use, one might specify a phrase and then search stored documents for instances of that phrase. However, I have a list of stored phrases and I’m trying to determine which of those phrases my query string contains.
For instance: suppose that I have phrases like these stored in Solr:
1:"fish fingers"
2:"apple pie"
If my search term is “I like fish fingers” then I want Solr to return the first record. If it’s “I like fish fingers and apple pie” then I want it to return both records. But if it’s “I like apple fingers and fish pie” then I want it to return no records.
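To pin down the behaviour I'm after, here is a minimal sketch in plain Python (not Solr). It assumes simple lowercase, whitespace-split tokens; the names `tokenize`, `contained_phrases` and `PHRASES` are made up for illustration. A stored phrase matches only if its words appear as a contiguous run in the query string.

```python
def tokenize(text):
    # Lowercase and split on whitespace; a real analyzer would do more.
    return text.lower().split()


def contained_phrases(query, phrases):
    """Return ids of stored phrases that occur as a contiguous word run in query."""
    q = tokenize(query)
    hits = []
    for pid, phrase in phrases.items():
        p = tokenize(phrase)
        n = len(p)
        # Slide a window of the phrase's length across the query tokens.
        if n and any(q[i:i + n] == p for i in range(len(q) - n + 1)):
            hits.append(pid)
    return hits


# The two example records from above.
PHRASES = {1: "fish fingers", 2: "apple pie"}
```

With these two records, the first example query should give back record 1, the second both records, and the third nothing, because "apple fingers" and "fish pie" are not stored phrases.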
(Of course, if the phrases were always two words long then it would be pretty simple to do this by constructing a disjunctive query over every two-word sequence in the query string. But the phrases can potentially be any length.)
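For the two-word case, building that disjunctive query could look roughly like this. It is only a sketch: the field name `phrase` is an assumption (use whatever field holds the stored phrases), `two_word_query` is a made-up helper, and it does no escaping of quotes or other special characters in the input.

```python
def two_word_query(text, field="phrase"):
    """Build an OR query over every adjacent word pair in text."""
    words = text.split()
    # Every pair of neighbouring words, in order of appearance.
    pairs = [" ".join(words[i:i + 2]) for i in range(len(words) - 1)]
    # Each pair becomes a quoted phrase clause on the (hypothetical) field.
    return " OR ".join('{}:"{}"'.format(field, p) for p in pairs)


query = two_word_query("I like fish fingers")
```

The number of clauses grows with the length of the query string, and covering phrases of every length this way would mean one clause per contiguous word run, which is why I'd rather have the analysis chain do it.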
Thanks for any help.
I decided to read through the documentation on each Filter and Tokenizer, which is where I came across this description of the PositionFilterFactory:
The configuration given on this page is almost exactly what I want. Unfortunately, since there doesn’t seem to be a filter that glues the terms split by the tokenizer back into a single token, I can’t do any stemming. But maybe I can knock up such a filter myself.
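What I have in mind for the gluing step, sketched in plain Python rather than as a real Solr filter: stem each word first, then rejoin every contiguous run of stems into a single token and compare those against the stored phrases treated the same way. Everything here is hypothetical, and `toy_stem` in particular is just a stand-in that strips a trailing "s", not a real stemmer.

```python
def toy_stem(word):
    # Placeholder stemmer (assumption): lowercase and strip one trailing "s".
    word = word.lower()
    return word[:-1] if word.endswith("s") and len(word) > 3 else word


def glued_shingles(text):
    """Stem each word, then emit every contiguous run rejoined as one token."""
    stems = [toy_stem(w) for w in text.split()]
    out = set()
    for n in range(1, len(stems) + 1):
        for i in range(len(stems) - n + 1):
            out.add(" ".join(stems[i:i + n]))
    return out


def match_stemmed(query, phrases):
    """Return ids of phrases whose stemmed, glued form appears in the query."""
    query_tokens = glued_shingles(query)
    return [pid for pid, phrase in phrases.items()
            if " ".join(toy_stem(w) for w in phrase.split()) in query_tokens]
```

With this, "I like fish finger" would still match the stored phrase "fish fingers", since both sides are reduced to the single token "fish finger" before comparison.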