I know how to get relevant highlighted fragments, together with some surrounding text, using the Lucene Highlighter, namely:
Highlighter highlighter = new Highlighter(scorer);
String[] fragments = highlighter.getBestFragments(stream, fieldContents, fragmentNumber);
But can I instead get pointers to these fragments in the original contents? In other words, I need to know where these fragments start and, if possible, end.
If you use the GetBestTextFragments method instead, you will get back an array of TextFragments. These have the properties textStartPos and textEndPos. (They are marked internal in Lucene.NET, which will require you to make some code changes to get access to them. I’m not sure about Java Lucene.)
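If changing the library code is not an option, one possible fallback (a different technique from the one above, and only a sketch) is to strip the highlight tags from each returned fragment string and search for the remaining text in the original contents. This assumes the Highlighter uses its default <B>...</B> tags and no encoder that escapes the text; the class name FragmentLocator and the method locate are hypothetical names made up for this example. Note that it finds the first occurrence at or after fromIndex, so repeated passages may need extra care.

```java
import java.util.regex.Pattern;

public class FragmentLocator {
    // Assumption: the Highlighter was left on its default formatter, which wraps hits in <B>...</B>.
    private static final Pattern TAGS = Pattern.compile("</?B>");

    /**
     * Returns {start, end} (end exclusive) of the fragment inside contents,
     * or null if the de-tagged fragment text cannot be found.
     */
    public static int[] locate(String contents, String markedUpFragment, int fromIndex) {
        // Remove the highlight tags so the fragment matches the original text again.
        String plain = TAGS.matcher(markedUpFragment).replaceAll("");
        int start = contents.indexOf(plain, fromIndex);
        if (start < 0) {
            return null;
        }
        return new int[] { start, start + plain.length() };
    }

    public static void main(String[] args) {
        String contents = "Lucene is a search library. The highlighter marks matching terms.";
        String fragment = "The <B>highlighter</B> marks";
        int[] span = locate(contents, fragment, 0);
        // Prints the located slice of the original contents.
        System.out.println(contents.substring(span[0], span[1]));
    }
}
```

If the offsets stored in the TextFragment objects are reachable in your Lucene version, they are the more reliable source, since they do not depend on string matching.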