Suppose I have a document collection that I have indexed in Lucene. I submit


Asked: May 27, 2026 (2026-05-27T08:03:13+00:00)


Suppose I have a document collection that I have indexed in Lucene. I submit a query and get hits. Now I want to find where in a particular document the hits occur. I know that I can use the Lucene Highlighting classes to obtain relevant fragments. But how can I find out exactly where these fragments appear in the original contents?

A related question: how can I make sure the fragments found are actually very close to the original query? In my experiments with highlighting, I noticed that a multi-word query would often return fragments containing only some of the words. What if I want to make sure I get hits with all the words?

Thanks!


1 Answer

Editorial Team · Answer 1 · 2026-05-27T08:03:13+00:00

Not an actual answer, just a few links to a solution to a similar problem.

First of all, here you can see the actual results of the highlighting (note that "were" is highlighted even though "am" was in the query; stemming is an additional feature of this implementation):
http://hunglish.hu/search?huSentence=&enSentence=I%20am%20highlighted&size=20&page=2&doc.genre=-10

Here's the source. Look for these methods: highlightField and highlightBisen.
http://code.google.com/p/hunglish-webapp/source/browse/trunk/src/main/java/hu/mokk/hunglish/lucene/Searcher.java

Disclaimer: I wrote this a while ago, it is not very nice code, and it is buggy in special cases (there is an open issue relating to highlighting). Furthermore, it uses version 3.2.0 of the lucene-highlighter, which may not be the newest.

Anyway, I hope that looking at how it works helps you write a better one, or at least something that works as expected.
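To make the two parts of the question concrete, here is a minimal, library-free sketch in plain Java. It is not taken from the linked Searcher.java and it does not call Lucene at all. It assumes the highlighter returns the fragment text verbatim, with only a pre-tag and post-tag wrapped around matched terms (the `<B>` and `</B>` tags below are just an example; pass whatever your formatter emits), and that fragments are not joined by a separator or HTML-escaped. The class name `FragmentLocator` and its methods are hypothetical helpers invented for illustration. The word check is a literal, case-insensitive comparison, so it will not recognise stemmed matches such as "am" and "were".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Lucene or of the linked project.
class FragmentLocator {

    /** Removes the markup that the highlighter wrapped around matched terms. */
    static String stripTags(String fragment, String preTag, String postTag) {
        return fragment.replace(preTag, "").replace(postTag, "");
    }

    /** Returns the start offset of every occurrence of the plain fragment in the original text. */
    static List<Integer> findOffsets(String original, String plainFragment) {
        List<Integer> offsets = new ArrayList<>();
        if (plainFragment.isEmpty()) {
            return offsets;
        }
        int from = 0;
        while (true) {
            int idx = original.indexOf(plainFragment, from);
            if (idx < 0) {
                break;
            }
            offsets.add(idx);
            from = idx + 1; // keep scanning so repeated fragments are all reported
        }
        return offsets;
    }

    /** True if every query term occurs as a whole word in the fragment (case-insensitive, no stemming). */
    static boolean containsAllTerms(String plainFragment, List<String> queryTerms) {
        // Split on anything that is not a letter or a digit.
        String[] tokens = plainFragment.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+");
        Set<String> words = new HashSet<>(Arrays.asList(tokens));
        for (String term : queryTerms) {
            if (!words.contains(term.toLowerCase(Locale.ROOT))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String original = "I was there. We were highlighted in the text.";
        String fragment = "We <B>were</B> <B>highlighted</B> in the text";

        String plain = stripTags(fragment, "<B>", "</B>");
        System.out.println(findOffsets(original, plain));
        System.out.println(containsAllTerms(plain, Arrays.asList("were", "highlighted")));
        System.out.println(containsAllTerms(plain, Arrays.asList("am", "highlighted")));
    }
}
```

A string search like this is only a fallback: it reports every place the fragment text occurs, which may be more than one. If the field was indexed with term vector offsets, reading the offsets from the index is likely the more robust route. As for requiring all words, filtering fragments after the fact is one option; another is to make every term mandatory in the query itself, for example with the `+term` syntax of the Lucene query parser, so that only documents containing all the words are returned in the first place.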
