I can see myself using Project Voldemort to cache results from a Traditional RDBMS

Asked: May 13, 2026

I can see myself using Project Voldemort to cache the results of a traditional RDBMS query. But in that case, it offers almost no advantage over other (Java) caching systems such as Ehcache, JCache, etc.

Where else could I use Project Voldemort or similar key-value stores? How are you using them in your business applications?


1 Answer

Editorial Team · Answer 1 · 2026-05-13T09:49:04+00:00

One approach to improving the speed of your database is to denormalize. Take this MySQL example:

CREATE TABLE `users` (
    `user_id` INT NOT NULL AUTO_INCREMENT,
    … -- Additional user data
    PRIMARY KEY (`user_id`)
);


CREATE TABLE `roles` (
    `role_id` INT NOT NULL AUTO_INCREMENT,
    `name` VARCHAR(64),
    PRIMARY KEY (`role_id`)
);


CREATE TABLE `users_roles` (
    `user_id` INT NOT NULL,
    `role_id` INT NOT NULL,
    PRIMARY KEY (`user_id`, `role_id`)
);

Neat, tidy, normalized. But if you want to get users and their roles, the query is complex:

SELECT u.*, r.*
  FROM `users` u
  LEFT JOIN `users_roles` ur ON u.`user_id` = ur.`user_id`
  LEFT JOIN `roles` r ON ur.`role_id` = r.`role_id`;

If you denormalized this, it might look something like:

CREATE TABLE `users` (
    `user_id` INT NOT NULL AUTO_INCREMENT,
    `role` VARCHAR(64),
    … -- Additional user data
    PRIMARY KEY (`user_id`)
);

And the equivalent query would be:

SELECT * FROM `users`;

This improves some of the performance characteristics of your queries:

Because the result you want is already in a table, you don’t have to perform read-side calculations. For example, counting the users with a given role in the normalized schema takes a GROUP BY and a COUNT on every read. In a denormalized design, you would instead keep those counts in a separate table devoted to roles and the number of users who hold each one, and update it on write.
The data you want is stored together, ideally contiguously on disk. Rather than many random seeks, a query needs one or a few sequential reads.
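
To make the first point concrete, here is a minimal sketch using Python's built-in sqlite3 module in place of MySQL. The `role_counts` table and the `add_user` helper are hypothetical names chosen for illustration; the idea is only that the count is maintained when a user is written, so reading it needs no GROUP BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        user_id INTEGER PRIMARY KEY,
        role    TEXT NOT NULL
    );
    -- Hypothetical summary table: one row per role, maintained at write time.
    CREATE TABLE role_counts (
        role       TEXT PRIMARY KEY,
        user_count INTEGER NOT NULL
    );
""")


def add_user(conn, role):
    """Insert a user and bump that role's counter in the same transaction."""
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT INTO users (role) VALUES (?)", (role,))
        updated = conn.execute(
            "UPDATE role_counts SET user_count = user_count + 1 WHERE role = ?",
            (role,),
        ).rowcount
        if updated == 0:
            # First user with this role: create the counter row.
            conn.execute(
                "INSERT INTO role_counts (role, user_count) VALUES (?, 1)",
                (role,),
            )


for role in ["admin", "editor", "editor", "viewer", "editor"]:
    add_user(conn, role)

# Read-time aggregation: groups the users table on every call.
grouped = dict(conn.execute("SELECT role, COUNT(*) FROM users GROUP BY role"))

# Write-time aggregation: a plain read of the precomputed rows.
precomputed = dict(conn.execute("SELECT role, user_count FROM role_counts"))
```

Both dictionaries hold the same counts; the difference is where the work happens. The second query stays cheap as `users` grows, at the cost of one extra write per insert.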

NoSQL databases are highly optimized for these cases, where you want to read a mostly static, sequential dataset. At that point, serving a query is just moving bytes from disk to the network: less work, less overhead, more speed. As simplistic as this sounds, it is possible to model your data and application around this pattern so that it feels natural.
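
A rough sketch of that read path, with a plain Python dict standing in for the key-value store. The `user:<id>` key format, the `name` field, and both helper functions are assumptions made for illustration, not the API of Voldemort or any other store.

```python
import json

# A plain dict stands in for the key-value store (an assumption for
# illustration; a real client would be a network call instead).
store = {}


def put_user_view(store, user_id, name, roles):
    """Write path: serialize the complete, query-ready record once."""
    record = {"user_id": user_id, "name": name, "roles": sorted(roles)}
    store[f"user:{user_id}"] = json.dumps(record).encode("utf-8")


def get_user_view(store, user_id):
    """Read path: one key lookup, returning the stored bytes as-is."""
    return store.get(f"user:{user_id}")


put_user_view(store, 42, "alice", {"editor", "admin"})
payload = get_user_view(store, 42)  # bytes, ready to send to the client
```

The read does no joins and no computation. Whatever shaping the result needs was done when the record was written.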

The trade-off for this performance is write load, disk space, and some app complexity. Denormalizing your data means more copies, which means more disk space and write load. Essentially, you have one dataset per query. Because you shift the burden of those computations to write-time instead of read-time, you really need some sort of asynchronous mechanism to do that, hence some app complexity.
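
One possible shape for that asynchronous mechanism, sketched with Python's standard queue and threading modules. The two in-memory views and the `grant_role` function are hypothetical; a real system would more likely use a message broker and write to the store itself.

```python
import queue
import threading
from collections import defaultdict

events = queue.Queue()

# Two denormalized views, one per query the application needs to answer.
roles_by_user = defaultdict(set)   # "which roles does this user have?"
users_by_role = defaultdict(set)   # "which users have this role?"


def worker():
    """Apply each write to every view, off the request path."""
    while True:
        event = events.get()
        if event is None:          # sentinel: stop the worker
            events.task_done()
            break
        user_id, role = event
        roles_by_user[user_id].add(role)
        users_by_role[role].add(user_id)
        events.task_done()


def grant_role(user_id, role):
    """Write path seen by the caller: enqueue and return immediately."""
    events.put((user_id, role))


thread = threading.Thread(target=worker, daemon=True)
thread.start()

grant_role(1, "admin")
grant_role(1, "editor")
grant_role(2, "editor")

events.join()      # wait until the worker has applied all queued writes
events.put(None)   # then ask it to shut down
thread.join()
```

Each logical write fans out to every view that depends on it, which is where the extra write load comes from. The views are also only eventually consistent: a read that arrives before the worker catches up sees the old data.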

And because you have to store more copies, you have to perform more writes. This is why you can’t practically replicate this kind of architecture with a SQL database – it’s extremely difficult to scale writes.

In my experience, the trade-off is well worth it for a large-scale application. If you’d like to read a bit more about a practical application of Cassandra, I wrote this piece a few months ago, and you might find it helpful.
