Right now, my documents in lucene can have very very large values in one

Editorial Team

Asked: May 21, 2026

Right now, my documents in Lucene can have very large values in one field (anywhere from 0 to, say, hundreds of MB).

I am using Lucene 3.1.0 and create documents like this:

doc = new Document();
Field field = new Field(fieldname, VERYLARGEVALUE, store, tokenize, storevector);
doc.add(field);

Here VERYLARGEVALUE is a String held in memory. I am thinking of writing VERYLARGEVALUE to a file while it is being created (it is built incrementally by extracting text from a number of sources), and then using:

Field field = new Field(fieldname, reader, storevector); // Field(String name, Reader reader, Field.TermVector termVector)
doc.add(field);

Here reader reads from the file I wrote VERYLARGEVALUE to.

Will this decrease the memory requirement, or will VERYLARGEVALUE eventually be read into memory anyway?

1 Answer

Editorial Team · Answer 1 · 2026-05-21T14:05:22+00:00

Looking through the Lucene code, the Reader you pass into Field ultimately gets passed to the TokenStream that tokenizes your data (namely in DocInverterPerField). So your plan should save memory, since Lucene will stream directly from that reader while indexing. You’ll likely want to wrap the FileReader in a BufferedReader for better performance.
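
To make the file-spooling half of this concrete, here is a minimal sketch using only the JDK (Java 7+ for java.nio.file, which is newer than what Lucene 3.1 itself requires). The class name SpoolToReader and the helpers spool, open and countChars are made up for illustration, and the Lucene calls appear only as comments because they need the Lucene 3.1 jar on the classpath. Two things worth keeping in mind: a Reader-valued Field is tokenized and indexed but not stored, and the reader is only consumed when the document is added to the index, so it must stay open until addDocument returns.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

public class SpoolToReader {

    // Append each extracted chunk to a temp file, so the full text
    // never has to exist as one String in memory.
    static Path spool(Iterable<String> chunks) throws IOException {
        Path tmp = Files.createTempFile("lucene-field-", ".txt");
        try (Writer out = Files.newBufferedWriter(tmp, StandardCharsets.UTF_8)) {
            for (String chunk : chunks) {
                out.write(chunk);
            }
        }
        return tmp;
    }

    // Open a buffered reader over the spooled text with an explicit charset.
    static BufferedReader open(Path file) throws IOException {
        return Files.newBufferedReader(file, StandardCharsets.UTF_8);
    }

    // Stand-in for the indexer: drains the reader through a fixed-size
    // buffer and returns how many chars were seen.
    static long countChars(Reader reader) throws IOException {
        char[] buf = new char[8192];
        long total = 0;
        int n;
        while ((n = reader.read(buf)) != -1) {
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        List<String> chunks = Arrays.asList("text from source A. ", "text from source B.");
        Path tmp = spool(chunks);
        try (Reader reader = open(tmp)) {
            // With Lucene 3.1 available, the reader would be handed over here
            // instead of being drained by countChars:
            //   doc.add(new Field(fieldname, reader, storevector));
            //   indexWriter.addDocument(doc); // reader is consumed during this call
            System.out.println("streamed " + countChars(reader) + " chars");
        } finally {
            Files.delete(tmp);
        }
    }
}
```

Using Files.newBufferedReader rather than a bare FileReader gives you the buffering mentioned above and also pins the charset, which FileReader would otherwise take from the platform default.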
