Say, I have this code in my model:
class Facility < ActiveRecord::Base
  ...
  searchable do
    text :name
    text :facility_type do
    end
    ...
And this in the search controller:
@search = Facility.search do
  keywords(query) do
    boost_fields :name => 1.9,
                 :facility_type => 1.98
  end
  ...
And I have two Facility objects: the first has the type “cafe” but does not have the word “cafe” in its name; the second is called “cafe sun”, for example, but is actually of type “bar”.
I run the search with query=“cafe” and get both facilities in the response, but the score is 5.003391 for “cafe sun” and 1.250491 for the real “cafe”.
For the second try I set
boost_fields :name => 1.9, :facility_type => 3
The score for “cafe sun” doesn’t change, but the score for the real “cafe” rises somewhat, to 1.8946824.
So, since results are sorted by score, I would like to know how it is calculated.
Or am I choosing the wrong tokenizers or something? Here is what I have in schema.xml:
<fieldType name="text" class="solr.TextField" omitNorms="false">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StandardFilterFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.EdgeNGramFilterFactory"
minGramSize="3"
maxGramSize="30"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StandardFilterFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
Scoring results is the domain of the Lucene library, and the crux of its algorithm is described in detail here:
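For orientation, the classic Lucene "practical scoring function" has roughly the shape below. This is a sketch that assumes your Solr uses Lucene's default TF-IDF similarity; newer versions default to a different similarity, so treat it as illustrative and check the documentation for the version you run.

```latex
\[
\mathrm{score}(q,d) \;=\; \mathrm{coord}(q,d)\cdot\mathrm{queryNorm}(q)\cdot
\sum_{t \in q}\Big(\mathrm{tf}(t \in d)\cdot\mathrm{idf}(t)^{2}\cdot\mathrm{boost}(t)\cdot\mathrm{norm}(t,d)\Big)
\]
```

The values passed to boost_fields enter through the boost(t) factor, and norm(t,d) includes field length normalization, which is kept because the field type has omitNorms="false". A boost on facility_type can only affect documents that actually match in that field, which would be consistent with the score for “cafe sun” staying the same if it matches only in name.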
To inspect the raw scoring data, run a query against your Solr instance directly and append the debugQuery=on parameter.
For general relevancy optimizations in Solr, you can consult the SolrRelevancyFAQ. It also has one question specifically demonstrating the output of debugQuery.
All in all: you ask a very good question with a very deep answer. I may edit my response down the road to expand on the subject.
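As a minimal sketch of the direct query with debugQuery=on mentioned above, the snippet below only builds the URL so you can paste it into a browser or curl. Everything specific in it is an assumption about your setup: the helper name debug_query_url is hypothetical, the base URL assumes a local development Solr on port 8982, the dismax parser is assumed, and the field names assume Sunspot's usual _text suffix for text fields.

```ruby
require "uri"

# Hypothetical helper: builds a Solr select URL with debugQuery=on.
# The base URL, the dismax parser and the *_text field names are
# assumptions; adjust them to match your own Solr setup and schema.
def debug_query_url(query, base: "http://localhost:8982/solr/select")
  params = {
    "q"          => query,
    "defType"    => "dismax",
    "qf"         => "name_text^1.9 facility_type_text^1.98", # per-field boosts
    "fl"         => "* score",                               # include the score in results
    "debugQuery" => "on"                                     # ask Solr to explain scoring
  }
  uri = URI(base)
  uri.query = URI.encode_www_form(params)
  uri.to_s
end

puts debug_query_url("cafe")
```

The response should then contain a debug section with a per-document explanation of how each score was put together, which is the quickest way to see which field and which factor dominates for “cafe sun”.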