In a project I am working, the client has a an old and massive(terabyte

Question


Asked: May 22, 2026

In a project I am working on, the client has an old and massive (terabyte-range) RDBMS. Queries of all kinds are slow, and there is no time to fix or refactor the schema. I've identified the set of common queries that need to be optimized. This set falls into two groups: full-text queries and metadata queries.

My plan is to extract the data from their database and partition it across two different storage systems each optimized for a particular query set.

For full-text search, Solr is the engine that makes the most sense. Its sharding and replication features make it a great fit for that half of the problem.

For metadata queries, I am not sure which route to take. Currently, I'm thinking of using an RDBMS with an extremely denormalized schema that represents a particular subset of the data from the "authoritative" RDBMS. However, my client is concerned about the lack of built-in sharding and replication in such a subsystem, and about the difficulty of setting up those features compared with Solr, which already includes them. Metadata in this case takes the form of integers, dates, bools, bits, and strings (with a maximum size of 10 characters).

Is there a database storage system with built-in sharding and replication that would be particularly useful for querying this metadata? Maybe a NoSQL solution that provides a good query engine?

Illuminate please.

Additions/Responses:

Solr could be used for the metadata as well. However, the metadata is volatile, so I would have to commit to the indexes often, and that would degrade search performance quickly.


1 Answer

Editorial Team · Answer 1 · 2026-05-22T17:02:45+00:00


Use MongoDB for your metadata store:

Built-in sharding
Built-in replication
Failover & high availability
Simple query engine that should work for most common cases

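The downside is that MongoDB does not offer relational-style joins in ordinary queries (the aggregation $lookup stage provides only a limited left outer join). Denormalize your data carefully so that each common query can be answered from a single collection.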
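
As a rough illustration of the denormalization advice, the sketch below flattens rows from three hypothetical normalized tables (items, owners, regions) into one document per item, so that a metadata query needs no join. All table names, field names, and values are invented for the example and are not taken from the question. The `matches` helper is only a local stand-in for a small part of MongoDB's filter semantics (equality and `$gte`), so the sketch runs without a server; with pymongo, the same `query` dictionary would normally be passed to `collection.find(query)`. Dates are stored as ISO strings here to keep the example self-contained.

```python
from datetime import date

# Hypothetical normalized rows, as they might be extracted from the
# authoritative RDBMS. Every name and value here is an assumption.
items = [
    {"item_id": 1, "code": "A-100", "created": date(2020, 1, 15), "active": True, "owner_id": 7},
    {"item_id": 2, "code": "B-200", "created": date(2019, 6, 1), "active": False, "owner_id": 7},
]
owners = {7: {"owner_code": "ACME", "region_id": 3}}
regions = {3: {"region_code": "EU"}}


def denormalize(item, owners, regions):
    """Copy the owner and region fields into the item document itself."""
    owner = owners[item["owner_id"]]
    region = regions[owner["region_id"]]
    return {
        "_id": item["item_id"],
        "code": item["code"],
        "created": item["created"].isoformat(),  # ISO strings sort chronologically
        "active": item["active"],
        "owner_code": owner["owner_code"],
        "region_code": region["region_code"],
    }


def matches(doc, query):
    """Local stand-in for a tiny subset of MongoDB filters: equality and $gte."""
    for field, cond in query.items():
        value = doc.get(field)
        if isinstance(cond, dict):
            if "$gte" in cond and not (value is not None and value >= cond["$gte"]):
                return False
        elif value != cond:
            return False
    return True


docs = [denormalize(item, owners, regions) for item in items]

# A MongoDB-style filter that touches only fields of the flattened document.
query = {"active": True, "region_code": "EU", "created": {"$gte": "2020-01-01"}}
hits = [d["_id"] for d in docs if matches(d, query)]
```

The trade-off is the usual one for denormalized stores: a change to an owner or region has to be rewritten into every item document that embeds it, which matters if that part of the metadata is volatile.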
