I have a table for users and another table for followers. The followers table is a list of user_ids and follower_ids. Seems pretty straightforward.
I’ve been planning on using MySQL for production, and I feel like down the road this is really going to bite me in the a$$.
Should I switch to MongoDB? Is it too late?
I’ve never dealt with NoSQL-anything and I’m wondering how to get around the issue of joins. I wouldn’t mind putting in a little effort to fix this problem, except that I separated my users from their profiles. I am under the assumption that ActiveRecord uses joins in a statement such as @name = User.profile.full_name.
What you need to consider is separating, conceptually, the data storage technology from your data structure. MySQL can be scaled, scaled, and scaled some more if you know how to do it, as it’s an old and proven platform. While it probably doesn’t have the same shiny new appeal as something like NoSQL, it does have a very good track record, and that’s often what counts.
There are a number of ways to tune MySQL to perform more quickly. The built-in clustering and replication features mean you can often scale up to multiple instances very easily, and using a simpler, faster database engine like MyISAM can give you order-of-magnitude performance gains in some circumstances, though you give up transactions and row-level locking in exchange.
MongoDB is a very interesting experiment, but so far it hasn’t really earned its stripes. If it’s anything like other noble NoSQL projects like Cassandra it will still need years of work to be truly “web scale”.
In your particular case, let’s say you want to find a list of a user’s followers. You’re probably doing something like this:
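
A minimal sketch of that kind of query, assuming hypothetical `users` and `user_followers` tables (the table and column names here are guesses, not taken from your schema). It uses Python's built-in `sqlite3` module as a stand-in so it runs without a MySQL server; the SQL itself is plain enough that it should carry over to MySQL largely unchanged:

```python
import sqlite3

# Hypothetical schema: table and column names are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE user_followers (
        user_id INTEGER NOT NULL,
        follower_id INTEGER NOT NULL,
        PRIMARY KEY (user_id, follower_id)
    );
    INSERT INTO users (id, name) VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
    INSERT INTO user_followers (user_id, follower_id) VALUES (1, 2), (1, 3);
""")


def follower_names(conn, user_id):
    """Return the names of a user's followers, joining through to users."""
    rows = conn.execute(
        """
        SELECT users.name
        FROM user_followers
        JOIN users ON users.id = user_followers.follower_id
        WHERE user_followers.user_id = ?
        ORDER BY users.name
        """,
        (user_id,),
    ).fetchall()
    return [name for (name,) in rows]


print(follower_names(conn, 1))
```

Every lookup of a follower list has to touch both tables, which is the cost the join introduces.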
You’re right in presuming that the `JOIN` here will cause trouble down the road. What you’re overlooking is that you can easily remove the join using the same humble trick that is essential to making your application scale: denormalizing important information.
What if you copied the follower’s name into the `user_followers` table each time you add an entry to it? Now there are no joins.
The only catch is that when you start to denormalize things, you should implement a method to bring the copies back into sync from the master should something get messed up, and you must be careful to ensure that a change in the master value propagates to the copies as promptly as is required.
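
The denormalized version described above might look roughly like this. It is a sketch under assumptions: the `follower_name` column and the `add_follower` helper are hypothetical names, and `sqlite3` again stands in for MySQL so the example is runnable on its own:

```python
import sqlite3

# Hypothetical schema: follower_name is a denormalized copy of users.name.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE user_followers (
        user_id INTEGER NOT NULL,
        follower_id INTEGER NOT NULL,
        follower_name TEXT NOT NULL,
        PRIMARY KEY (user_id, follower_id)
    );
    INSERT INTO users (id, name) VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
""")


def add_follower(conn, user_id, follower_id):
    """Record a follow and copy the follower's current name at write time."""
    conn.execute(
        """
        INSERT INTO user_followers (user_id, follower_id, follower_name)
        SELECT ?, id, name FROM users WHERE id = ?
        """,
        (user_id, follower_id),
    )


def follower_names(conn, user_id):
    """Read follower names from a single table, with no join."""
    rows = conn.execute(
        "SELECT follower_name FROM user_followers "
        "WHERE user_id = ? ORDER BY follower_name",
        (user_id,),
    ).fetchall()
    return [name for (name,) in rows]


add_follower(conn, 1, 2)
add_follower(conn, 1, 3)
print(follower_names(conn, 1))
```

The write becomes slightly more expensive, since it reads from `users` once, but every later read of the follower list stays within one table.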
A simple mass update could be as easy as:
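
One way such a resync could be written, as a sketch only: the `follower_name` column and the `resync_follower_names` helper are hypothetical, and `sqlite3` stands in for MySQL here. The correlated subquery is meant to be portable; in MySQL you might prefer a multi-table `UPDATE` instead.

```python
import sqlite3

# Hypothetical schema: follower_name is a denormalized copy of users.name.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE user_followers (
        user_id INTEGER NOT NULL,
        follower_id INTEGER NOT NULL,
        follower_name TEXT NOT NULL,
        PRIMARY KEY (user_id, follower_id)
    );
    INSERT INTO users (id, name) VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO user_followers (user_id, follower_id, follower_name)
        VALUES (1, 2, 'bob');
""")


def resync_follower_names(conn):
    """Overwrite stale copies with the master value; return rows changed."""
    cur = conn.execute(
        """
        UPDATE user_followers
        SET follower_name = (
            SELECT name FROM users WHERE users.id = user_followers.follower_id
        )
        -- Only touch rows whose copy differs from the master. Rows whose
        -- follower no longer exists compare against NULL and are skipped.
        WHERE follower_name <> (
            SELECT name FROM users WHERE users.id = user_followers.follower_id
        )
        """
    )
    return cur.rowcount


# Simulate the master value changing without the copy being updated.
conn.execute("UPDATE users SET name = 'robert' WHERE id = 2")
changed = resync_follower_names(conn)
print(changed)
```

Running it a second time should change nothing, which makes it safe to schedule as a periodic repair job.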
If the users are unable to change their names, though, you wouldn’t even need to worry. Sometimes constraints can help you in this regard.