When running a search such as:
field_name:#
field_name:"#"
field_name:"\#"
where there is a record with the value of exactly “#”, Solr returns 0 rows.
The workaround we are currently using is a range query on the field,
such as:
field_name:[# TO #]
and this returns the correct documents.
Use case details:
We have a field that indexes a text field and calculates a “letter
group”. This keeps only the first significant character of a value
(number or letter), and if it is a number it simply stores “#”, as we
want all numbered items grouped together.
I’m aware that we could also fix this by using a specific number
instead of the hash character; however, I thought I’d raise this to see
if there is a wider issue. I’ve listed some specific details below.
Field definition:
<fieldType name="letterGrouping" class="solr.TextField" sortMissingLast="true" omitNorms="true">
  <analyzer>
    <tokenizer class="solr.PatternTokenizerFactory" pattern="^([a-zA-Z0-9]).*" group="1"/>
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.TrimFilterFactory" />
    <filter class="solr.PatternReplaceFilterFactory" pattern="([^a-z0-9])" replacement="" replace="all"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="([0-9])" replacement="#" replace="all" />
  </analyzer>
</fieldType>
Server information:
Solr Specification Version: 3.2.0
Solr Implementation Version: 3.2.0 1129474 - rmuir - 2011-05-30 23:07:15
Lucene Specification Version: 3.2.0
Lucene Implementation Version: 3.2.0 1129474 - 2011-05-30 23:08:57
The issue is that the fieldType is applied at both index time and query time.
I checked how the fieldType converts # at query time, and it appears to return an empty result: the tokenizer pattern only matches a value that starts with a letter or digit, so # produces no token.
By contrast, field_name:123 should return results, because 123 is converted to # and so matches the indexed value.
Alternatively, apply this analysis only at index time.
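
One way to restrict the letter-group analysis to index time is to declare separate index and query analyzers on the fieldType. The fragment below is an untested sketch: the index-side chain is copied from the definition above, while the query-side chain (KeywordTokenizerFactory followed by LowerCaseFilterFactory) is an assumption chosen for illustration and is not part of the original configuration.

```xml
<fieldType name="letterGrouping" class="solr.TextField" sortMissingLast="true" omitNorms="true">
  <!-- Index time: unchanged chain from the original definition -->
  <analyzer type="index">
    <tokenizer class="solr.PatternTokenizerFactory" pattern="^([a-zA-Z0-9]).*" group="1"/>
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.TrimFilterFactory" />
    <filter class="solr.PatternReplaceFilterFactory" pattern="([^a-z0-9])" replacement="" replace="all"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="([0-9])" replacement="#" replace="all" />
  </analyzer>
  <!-- Query time (assumed, illustrative): keep the query value as a single
       token and only lowercase it, so a literal # is not stripped -->
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory" />
  </analyzer>
</fieldType>
```

With a query analyzer like this, field_name:# should reach the index as the literal term #. The trade-off is that field_name:123 would no longer be converted to #, so clients would need to query for # directly to get the numeric group.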