I have a SOLR database that needs to have a new field containing a


Asked: May 26, 2026 (2026-05-26T11:03:13+00:00)


I have a SOLR database that needs a new field containing a list of strings that are kind of like tags, except that they are predefined and used for an internal purpose. The search results from this SOLR core will go across the public Internet to third-party website developers. Therefore I want to obfuscate the tags and make it impossible for someone to guess a tag that would reveal information about another customer.

I could easily accomplish this using GUIDs, but I wonder what the impact will be of having hundreds of thousands of records in RAM with a field containing an array of several GUIDs.

If the GUIDs were recorded as atoms, i.e. one copy of the GUID and many references to it, then this is a non-issue. But I cannot find out whether SOLR or Lucene use atoms in their in-RAM data structures. The disk storage is not an issue.

This is similar to dedup issues, but my research shows that people are mostly concerned with whole duplicate documents, not with individual fields.


1 Answer

Editorial Team · Answer 1 · 2026-05-26T11:03:13+00:00

There are two indexes:

Inverted index. Each GUID is stored once (actually less than once) no matter how many times it is used.
Normal index. Each GUID is stored once for every time it is used. You can use compression here if you like. ("Compression" can mean keeping a special table that translates numbers <-> tags, so that each tag is stored as a number; each tag then takes 1 byte, assuming fewer than 2^8 tags.)
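
To make the difference between the two indexes concrete, here is a minimal sketch in plain Python. It is an illustration of the idea only, not how Solr or Lucene lay out their data internally; the names TagTable and build_indexes are hypothetical, and the one-byte encoding assumes fewer than 2^8 distinct tags.

```python
import uuid
from collections import defaultdict


class TagTable:
    """Hypothetical lookup table translating GUID tags <-> small integer ids."""

    def __init__(self):
        self._id_by_tag = {}   # GUID string -> integer id
        self._tag_by_id = []   # integer id -> GUID string

    def encode(self, tag):
        # Reuse the existing id if the tag was seen before, else assign the next one.
        tag_id = self._id_by_tag.get(tag)
        if tag_id is None:
            tag_id = len(self._tag_by_id)
            self._id_by_tag[tag] = tag_id
            self._tag_by_id.append(tag)
        return tag_id

    def decode(self, tag_id):
        return self._tag_by_id[tag_id]


def build_indexes(docs, table):
    """Build a toy inverted index and a toy per-document ("normal") index."""
    inverted = defaultdict(list)  # one key per distinct GUID -> list of doc ids
    forward = {}                  # doc id -> one byte per tag on that document
    for doc_id, tags in docs.items():
        # bytes() raises ValueError for ids above 255, which enforces the
        # "fewer than 2^8 tags" assumption.
        forward[doc_id] = bytes(table.encode(t) for t in tags)
        for t in tags:
            inverted[t].append(doc_id)
    return inverted, forward


# Three predefined GUID tags shared across three documents.
tags = [str(uuid.uuid4()) for _ in range(3)]
docs = {1: [tags[0], tags[1]], 2: [tags[1], tags[2]], 3: [tags[0]]}

table = TagTable()
inverted, forward = build_indexes(docs, table)

# The inverted index holds each GUID once, however many documents use it.
print(len(inverted), "distinct GUIDs in the inverted index")
# The per-document side holds one byte per tag use instead of a 36-character string.
print({doc_id: len(encoded) for doc_id, encoded in forward.items()})
```

In this sketch the full GUID strings live only in the lookup table and as keys of the inverted index, so adding more documents grows the per-document data by one byte per tag rather than by one GUID per tag.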
