My application uses Lucene.NET to index various text files. Since each text file is

Question


Asked: May 13, 2026


My application uses Lucene.NET to index various text files. Since each text file is different in structure, the entire content of each file is stored in a single “content” field.

Some of the text files contain URLs, e.g.:

http://domain1.co.uk/blah
http://domain2.co.ru/blahblah

etc.

The code I use to index each file is:

Lucene.Net.Documents.Field fldContent = new Lucene.Net.Documents.Field("content", contents, Lucene.Net.Documents.Field.Store.YES, Lucene.Net.Documents.Field.Index.TOKENIZED, Lucene.Net.Documents.Field.TermVector.YES);

Where “contents” is the file contents.

When querying the file, Lucene returns results only when I search for the exact domain name (e.g. domain1.co.uk); nothing is returned for a partial domain name (e.g. domain1.co).
The code used to build the query is:

Lucene.Net.Index.Term searchTerm = new Lucene.Net.Index.Term("content", "domain1.co");
Lucene.Net.Search.Query query = new Lucene.Net.Search.TermQuery(searchTerm);

Do you have any idea why I must search using the exact domain name?


1 Answer

Editorial Team · Answer 1 · 2026-05-13T06:59:21+00:00


The StandardAnalyzer/Tokenizer is the culprit here. It does its best to make URLs searchable, but it appears to keep a hostname such as domain1.co.uk as a single token, and a TermQuery only matches whole terms, so a partial hostname like domain1.co will not match. The standard approach is to create a custom analyzer/tokenizer; for this I can point you to another SO question with a similar problem and solution.
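
To make the mismatch concrete, here is a minimal sketch in plain C# that does not call Lucene at all. It assumes (this is an assumption about the tokenizer's output, not something verified against your index) that each hostname ended up as one whole term. The `IndexedTerms` array and the `ExactMatch`/`PrefixMatch` helpers are hypothetical names used only for illustration.

```csharp
using System;
using System.Linq;

static class TermMatchSketch
{
    // Hypothetical stand-in for the terms in the "content" field.
    // Assumption: the analyzer kept each hostname as a single token
    // instead of splitting it on the dots.
    static readonly string[] IndexedTerms =
    {
        "http", "domain1.co.uk", "blah", "domain2.co.ru", "blahblah"
    };

    // Whole-term comparison, analogous to how a term query behaves:
    // the query text must equal an indexed term exactly.
    static bool ExactMatch(string queryTerm) =>
        IndexedTerms.Any(t => string.Equals(t, queryTerm, StringComparison.Ordinal));

    // Prefix comparison: true when any indexed term starts with the query text.
    static bool PrefixMatch(string prefix) =>
        IndexedTerms.Any(t => t.StartsWith(prefix, StringComparison.Ordinal));

    static void Main()
    {
        Console.WriteLine(ExactMatch("domain1.co.uk")); // True
        Console.WriteLine(ExactMatch("domain1.co"));    // False
        Console.WriteLine(PrefixMatch("domain1.co"));   // True
    }
}
```

If the hostname really is stored as a single term, then besides a custom analyzer it may be worth trying Lucene.Net.Search.PrefixQuery in place of TermQuery with the same Term; whether that fits depends on how the index was actually tokenized.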
