I am working to create a very big inverted index of terms. What method would you suggest?

Question

Editorial Team

Asked: June 13, 2026 (2026-06-13T12:15:55+00:00)

I am working on a very large inverted index of terms. Which of these two layouts would you suggest?

First

termId -> docId
  a        doc2[locations],doc5[locations],doc12[locations] 
  b        doc5[locations],doc7[locations],doc4[locations]

Second

termId -> docId
  a        doc2[locations]
  a        doc5[locations]
  a        doc12[locations]
  b        doc5[locations]
  b        doc7[locations] 
  b        doc4[locations]

P.S. Lucene is not an option.

1 Answer

Editorial Team · Answer 1 · 2026-06-13T12:15:56+00:00

The right table design depends on how you plan to use the data. If you will only ever use strings like "doc2[locations],doc5[locations],doc12[locations]" as is, without any further postprocessing, then your First design is fine.

But if, as your question tacitly suggests, you may at times want to treat doc2[locations], doc5[locations], etc. as separate entities, then you should definitely use your Second design.

Here are some use cases which show why the Second design is better:

If you use First and ask for all docs with termId = a, you get back a string like doc2[locations],doc5[locations],doc12[locations], which you then have to split.

If you use Second, you get each doc as a separate row. No splitting! The Second structure is more convenient.
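
To make the lookup use case concrete, here is a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server; the table names `first_design` and `second_design` and the column name `docIds` are made up for illustration and do not come from the question.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# First design: one row per term, all postings packed into one string.
cur.execute("CREATE TABLE first_design (termId TEXT PRIMARY KEY, docIds TEXT)")
cur.execute(
    "INSERT INTO first_design VALUES (?, ?)",
    ("a", "doc2[locations],doc5[locations],doc12[locations]"),
)

# Second design: one row per (term, document) pair.
cur.execute("CREATE TABLE second_design (termId TEXT, docId TEXT)")
cur.executemany(
    "INSERT INTO second_design VALUES (?, ?)",
    [("a", "doc2[locations]"), ("a", "doc5[locations]"), ("a", "doc12[locations]")],
)

# First design: fetch the packed string, then split it in application code.
packed = cur.execute(
    "SELECT docIds FROM first_design WHERE termId = ?", ("a",)
).fetchone()[0]
docs_first = packed.split(",")

# Second design: each document already arrives as its own row.
docs_second = [
    row[0]
    for row in cur.execute(
        "SELECT docId FROM second_design WHERE termId = ? ORDER BY rowid", ("a",)
    )
]

print(docs_first)
print(docs_second)
```

Both lists hold the same three entries, but the First design only gets there through the extra `split(",")` step. That step also works only because the `[locations]` placeholder contains no commas; if real location lists were comma-separated, the split itself would need a proper parser.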
Or, suppose at some point doc5[locations] changes and you need to update your table. If you use the First design, you'd have to use some relatively complicated MySQL string function to find and replace the substring in all rows that contain it. (Note that MySQL versions before 8.0 have no built-in regex substitution; REGEXP_REPLACE() was added in 8.0.)

If you use the Second design, updating is easy:
```
-- "inverted_index" is a placeholder name; TABLE itself is a reserved word in MySQL
UPDATE inverted_index SET docId = 'newdoc5[locations]' WHERE docId = 'doc5[locations]';
```
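
The difference in update effort can be sketched side by side. As before, this assumes Python's sqlite3 module in place of MySQL, and the names `first_design`, `second_design`, and `docIds` are illustrative rather than taken from the question.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# First design: packed strings, one row per term.
cur.execute("CREATE TABLE first_design (termId TEXT PRIMARY KEY, docIds TEXT)")
cur.executemany("INSERT INTO first_design VALUES (?, ?)", [
    ("a", "doc2[locations],doc5[locations],doc12[locations]"),
    ("b", "doc5[locations],doc7[locations],doc4[locations]"),
])

# Second design: one row per (term, document) pair.
cur.execute("CREATE TABLE second_design (termId TEXT, docId TEXT)")
cur.executemany("INSERT INTO second_design VALUES (?, ?)", [
    ("a", "doc2[locations]"), ("a", "doc5[locations]"), ("a", "doc12[locations]"),
    ("b", "doc5[locations]"), ("b", "doc7[locations]"), ("b", "doc4[locations]"),
])

old, new = "doc5[locations]", "newdoc5[locations]"

# Second design: a single statement, matched by exact equality.
cur.execute("UPDATE second_design SET docId = ? WHERE docId = ?", (new, old))
rows_changed = cur.rowcount

# First design: read every row, split it, patch the matching entry, write it back.
for term_id, packed in cur.execute("SELECT termId, docIds FROM first_design").fetchall():
    entries = [new if entry == old else entry for entry in packed.split(",")]
    cur.execute(
        "UPDATE first_design SET docIds = ? WHERE termId = ?",
        (",".join(entries), term_id),
    )

first_a = cur.execute(
    "SELECT docIds FROM first_design WHERE termId = 'a'"
).fetchone()[0]
print(rows_changed, first_a)
```

The First design loop compares whole entries after splitting rather than doing a plain substring replace, because a substring replace could also hit entries that merely contain the old text. With the Second design the database does the matching, and an index on `docId` would let it avoid scanning the whole table.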
