We have a few very large tables (3 tables, each 2 to 5 GB) in our MySQL DB. We run logistics applications that combine entities such as route, schedule, capacity, location, price rules, etc., and those huge tables contain “joined” data from these entities.
We need those tables because doing JOINs on the fly kills performance completely. We do have indexes ;), caching mechanisms, efficient prepared statements, and proper transaction management set up, but performance is still not sufficient (thousands of customers, hundreds of them VIP customers).
Our customers perform mostly read-only operations (about 99%), such as searching for connections, schedules, and pricing. The remaining 1-2% are UPDATE/INSERT operations, e.g. booking a journey, capacity, etc.
Our idea is to use a NoSQL DB (probably MongoDB) as a second database, where we would put all pregenerated read-only data into key-value or tree structures. We believe that performance will be much better. What are the caveats of this solution? Do you have personal experience with such a task?
We plan to build a quick prototype, but nobody here has real experience with NoSQL.
When your data model relies on a lot of JOINed data, MongoDB is probably not the right choice, because it is not built for joins (older versions have none at all, and newer ones offer only the limited $lookup aggregation stage). You aren’t saying much about your data model, but if you can convert it so that most data is embedded in other entities rather than stored in separate collections, then MongoDB could work for you. Thanks to sharding and replica sets it scales very well, especially for write access, since sharding spreads writes across servers.
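To make the embedding idea concrete, here is a minimal sketch in plain Python. It assumes a hypothetical model with routes, schedules, and price rules; all field names and sample values are invented for illustration and are not taken from your schema. It folds the rows that would normally be JOINed into one nested document per route, which is roughly the shape you would store in a document database.

```python
# Hypothetical normalized rows, as they might come out of relational tables.
routes = [{"route_id": 1, "origin": "PRG", "destination": "VIE"}]
schedules = [
    {"route_id": 1, "departure": "08:00", "capacity": 40},
    {"route_id": 1, "departure": "14:30", "capacity": 25},
]
price_rules = [{"route_id": 1, "fare_class": "standard", "price": 19.0}]


def build_route_documents(routes, schedules, price_rules):
    """Embed each route's schedules and price rules into one document per route."""
    documents = {}
    for route in routes:
        doc = dict(route)  # copy so the input rows stay untouched
        doc["schedules"] = []
        doc["price_rules"] = []
        documents[route["route_id"]] = doc
    for s in schedules:
        documents[s["route_id"]]["schedules"].append(
            {"departure": s["departure"], "capacity": s["capacity"]}
        )
    for rule in price_rules:
        documents[rule["route_id"]]["price_rules"].append(
            {"fare_class": rule["fare_class"], "price": rule["price"]}
        )
    return list(documents.values())


route_documents = build_route_documents(routes, schedules, price_rules)
```

A search for one route then reads a single document instead of joining three tables. The trade-off is that a change to a shared entity (say, a price rule used by many routes) has to be written into every document that embeds it.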
Or have you considered caching your three huge tables with Memcached? 3 x 5 GB = 15 GB, which is not much for a server to keep in memory.