So, I have a Solr instance which processes inputs and queries using StandardTokenizer (as


Asked: June 14, 2026


I have a Solr instance which processes inputs and queries using StandardTokenizer (as well as ClassicFilterFactory, LowerCaseFilterFactory and StopFilterFactory).

My index contains a number of files with underscore-separated names (e.g. some_indexed_file.jpg).

I’ve noticed that if I query for some_indexed_file.jpg, I get the file I’m looking for returned correctly.

However, if I instead search for some_indexed_file.jp* (with an asterisk, which I presume acts as a wildcard), I get no results, even though by my understanding it should return similar results.

Any idea what’s going on? I assume I’m misunderstanding something about the way Solr processes queries.

Edit: as requested, here are the schema XML configuration entries:

    <fieldType name="default" class="solr.TextField">
        <analyzer type="index">
            <tokenizer class="solr.StandardTokenizerFactory" />
            <filter class="solr.ClassicFilterFactory" />
            <filter class="solr.LowerCaseFilterFactory" />
            <filter class="solr.StopFilterFactory" />
        </analyzer>
        <analyzer type="query">
            <tokenizer class="solr.StandardTokenizerFactory" />
            <filter class="solr.ClassicFilterFactory" />
            <filter class="solr.LowerCaseFilterFactory" />
            <filter class="solr.StopFilterFactory" />
        </analyzer>
    </fieldType>

    <field name="filename" type="default" multiValued="true" omitNorms="false" termVectors="false"/>


1 Answer

Editorial Team · Answer 1 · 2026-06-14T11:24:28+00:00

A bit more research solved the problem.
The underlying issue is that Solr doesn’t apply text analysis to wildcard queries.

This meant that Solr was searching for a single term matching some_indexed_file.jp* exactly as typed. However, when the filename was indexed, it was tokenised into “some”, “indexed” and “file.jpg”, and none of those terms matches the pattern.
The query some_indexed_file.jpg, by contrast, was analysed and tokenised in the same way as the indexed filename, and therefore returned the right results.
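
To make the mismatch concrete, here is a rough Python sketch of the behaviour described above. It is only a toy model, not Solr code: the indexed terms are copied from this answer rather than produced by a real analyzer, and toy_analyze, plain_query_matches and wildcard_query_matches are hypothetical helpers invented for the illustration. Requiring every token of a plain query to match is also a simplification.

```python
from fnmatch import fnmatchcase

# Terms this answer says were stored for "some_indexed_file.jpg".
# Copied from the answer, not produced by a real Solr analyzer.
INDEXED_TERMS = ["some", "indexed", "file.jpg"]


def toy_analyze(text):
    """Hypothetical stand-in for the analyzer chain: lowercase, split on underscores."""
    return text.lower().split("_")


def plain_query_matches(query, indexed_terms):
    """A plain query is analyzed first, then each resulting token is looked up."""
    return all(token in indexed_terms for token in toy_analyze(query))


def wildcard_query_matches(pattern, indexed_terms):
    """A wildcard query skips analysis: the raw pattern is compared to each stored term."""
    return any(fnmatchcase(term, pattern) for term in indexed_terms)


print(plain_query_matches("some_indexed_file.jpg", INDEXED_TERMS))
print(wildcard_query_matches("some_indexed_file.jp*", INDEXED_TERMS))
print(wildcard_query_matches("file.jp*", INDEXED_TERMS))
```

In this model the plain query matches, while the wildcard pattern fails because no single stored term starts with some_indexed_file.jp. A pattern that fits inside one stored term, such as file.jp*, does match.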
