If I add a custom Attribute, for example part of speech, to TokenStream is

Question


Editorial Team

Asked: June 13, 2026


If I add a custom Attribute, for example part of speech, to a TokenStream, is it used in the indexing process?

Can I retrieve this attribute from the index? Is it stored for every token?


1 Answer

Editorial Team · Answer 1 · 2026-06-13T04:29:41+00:00

If I understand what you are looking for, I think you would need to create your own custom TokenStream (extending a standard TokenStream, I would think) to accomplish this, and decide how you want to store all this extra information and how to meaningfully retrieve it from the index.
I know of no way to accomplish something like that out of the box.

Off the top of my head, I’d think you’d need to write a new document for each token coming through your custom TokenStream. Then, when searching, use a highlighter, or some such, to find which terms a query is matching on, and query the index again to retrieve the metadata documents for those terms. This assumes that any token reused by this or another document will have the same metadata assigned to it. If that’s not the case, you’d have to work out a way to identify the documents you are looking for that isn’t sensitive to collisions.
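To make the collision concern concrete, here is a rough sketch in plain Java, with no Lucene calls at all. The class `TermMetadataStore` and its methods are made up for illustration; the map simply stands in for an index of one metadata document per distinct term. It shows how a second, different tag for the same term would be detected rather than silently overwritten.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical stand-in for an index holding one metadata document per distinct term. */
final class TermMetadataStore {
    private final Map<String, String> tagByTerm = new HashMap<>();

    /**
     * Records a tag for a term, keeping the first tag seen.
     * Returns false when the term already carries a different tag (a collision).
     */
    boolean put(String term, String tag) {
        String existing = tagByTerm.putIfAbsent(term, tag);
        return existing == null || existing.equals(tag);
    }

    /** Looks up the tag recorded for a term, if any. */
    Optional<String> lookup(String term) {
        return Optional.ofNullable(tagByTerm.get(term));
    }
}
```

With part of speech in particular, collisions are likely: "run" can be a verb in one sentence and a noun in another, so `put("run", "NOUN")` after `put("run", "VERB")` would return false. In that case the key would need to include something more than the term text, such as the source document and the token position.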

Or you could write another field on the same document, containing an ordered list of metadata for each token that parallels the structure of the data. Store both fields, use a highlighter again to find the searched-for result, and parse out the matching position in the list your TokenStream created.
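The parallel-field idea might look something like the sketch below, again in plain Java rather than against the Lucene API. `ParallelTagField`, `encode`, and `tagAt` are hypothetical names, and the sketch assumes exactly one tag per token, tags that contain no spaces, and zero-based token positions supplied by whatever located the match.

```java
import java.util.List;

/** Hypothetical helper for a stored field that holds one tag per token, in token order. */
final class ParallelTagField {
    /** Joins the tags into a single space-separated string suitable for a stored field. */
    static String encode(List<String> tags) {
        return String.join(" ", tags);
    }

    /** Returns the tag at a zero-based token position, or null if the position is out of range. */
    static String tagAt(String encoded, int position) {
        if (encoded.isEmpty()) {
            return null;
        }
        String[] tags = encoded.split(" ");
        return (position >= 0 && position < tags.length) ? tags[position] : null;
    }
}
```

For the text "the dog runs" tagged as DET, NOUN, VERB, the stored value would be `"DET NOUN VERB"`, and a match on the third token (position 2) would resolve to `VERB`. The weak point is keeping the two fields aligned: anything that drops or injects tokens, such as stop-word removal or synonym expansion, has to be mirrored in the tag list.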

Well, that’s a couple of thoughts anyway.
