I’m currently building a web application that has multiple user types, where users engage in multiple activity types.
I need to design a table that associates “Likes” (upvotes, +1’s, whatever) between users and activities. I am not an expert in MySQL by any means, so I want to avoid heading down the wrong path with my database design, especially with something like this.
What I’m thinking is a table like the following:
CREATE TABLE likes (
    from_id int(11) NOT NULL,
    from_type VARCHAR(100) NOT NULL,
    to_id int(11) NULL,
    to_type VARCHAR(100) NULL,
    posted TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY(from_id, from_type, to_id, to_type),
    INDEX(from_id, from_type),
    INDEX(to_id, to_type)
) ENGINE=InnoDB;
What are the performance issues with this as the system scales? These “Likes” will be used in a news feed, so there will be a lot of reads and writes.
Is there a better approach that I simply haven’t thought of?
Thanks in advance!
Your from_type and to_type fields, as VARCHAR(100), are going to waste a lot of space (the type string is stored in every row and again in every index that contains it), and matching against them will not be nearly as efficient as matching against, say, an integer from_type_id and to_type_id. They will also make maintaining the index slower as you add more entries to the table.
If you need a human-readable type name, then make another table that has them listed, along with their numeric ids. You can make the id a foreign key to that table, although there will be some cost associated with maintaining the integrity of the foreign key.
Presumably there wouldn’t be too many types, at least as compared with the number of rows in the table, so any process that needed them could just read that table once, rather than asking the database to do the joins with every read.
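As a rough sketch of that layout (not a definitive schema), it could look something like the following. The table name entity_types, the TINYINT/INT UNSIGNED column types, and the example type names are all assumptions made for illustration. Secondary indexes are left out here, and every key column is NOT NULL because MySQL requires that of primary key columns.

```sql
-- Hypothetical lookup table: one row per user/activity type.
CREATE TABLE entity_types (
    id   TINYINT UNSIGNED NOT NULL,
    name VARCHAR(100)     NOT NULL,
    PRIMARY KEY (id),
    UNIQUE KEY (name)
) ENGINE=InnoDB;

-- Placeholder type names; substitute your own.
INSERT INTO entity_types (id, name) VALUES
    (1, 'user'),
    (2, 'photo'),
    (3, 'comment');

-- The likes table stores only the small integer ids.
CREATE TABLE likes (
    from_id      INT UNSIGNED     NOT NULL,
    from_type_id TINYINT UNSIGNED NOT NULL,
    to_id        INT UNSIGNED     NOT NULL,
    to_type_id   TINYINT UNSIGNED NOT NULL,
    posted       TIMESTAMP        NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (from_id, from_type_id, to_id, to_type_id),
    -- Optional: these constraints add a small cost to every insert.
    FOREIGN KEY (from_type_id) REFERENCES entity_types (id),
    FOREIGN KEY (to_type_id)   REFERENCES entity_types (id)
) ENGINE=InnoDB;
```

Your application can load entity_types once at startup and translate ids to names itself, so the hot read path never has to join against it.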
About indexes:
These are going to be what slows down your inserts. You should aim to have just enough to service all of your read queries, and make the ones that you do have do as much as possible.
Multi-column indexes are certainly useful, and they have the nice property that every leftmost prefix is essentially indexed as well. That is to say, if you have an index on (a, b, c), then that can also be used as an index on (a, b), or even just on a. (And MySQL will use the indexes this way.)
That means that, since you have an index on
PRIMARY KEY(from_id, from_type, to_id, to_type)
then you still need the index on
INDEX(to_id, to_type)
but you can eliminate the index on
INDEX(from_id, from_type)
Making the database maintain that index separately is just going to slow it down.
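If you want to try this on the table from the question, a minimal sketch might look like the following. It assumes MySQL named the unnamed index after its first column (from_id), which is the usual behaviour but worth confirming with SHOW INDEX before dropping anything. The values 42 and 'user' are placeholders.

```sql
-- List the indexes and their names before changing anything.
SHOW INDEX FROM likes;

-- Assumption: the redundant index on (from_id, from_type) is named from_id.
ALTER TABLE likes DROP INDEX from_id;

-- This lookup filters on the leftmost two columns of the primary key,
-- so the optimizer should still be able to use PRIMARY for it.
EXPLAIN
SELECT to_id, to_type
FROM likes
WHERE from_id = 42 AND from_type = 'user';
```

Check the key column of the EXPLAIN output to see which index the optimizer actually picked for your data.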
That also means that you might want to do some thinking up front about the order of columns in your index. It might make more sense to put the type before the id, since it’s more likely that you would query the database for all likes from a specific type, rather than all likes from a specific id (for each type).
If you structure your indexes like this:
PRIMARY KEY(to_type, to_id, from_type, from_id),
INDEX(from_type, from_id)
Then you can efficiently search for:
(from_type)
(from_type, from_id)
(to_type)
(to_type, to_id)
(to_type, to_id, from_type)
(to_type, to_id, from_type, from_id)
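To make that list concrete, here is one possible sketch of a table whose two indexes cover exactly those prefixes, with a few queries that could use them. It assumes the type columns have been switched to small integers, and the index name idx_likes_from and all literal values are made up for the example.

```sql
CREATE TABLE likes (
    from_type TINYINT UNSIGNED NOT NULL,
    from_id   INT UNSIGNED     NOT NULL,
    to_type   TINYINT UNSIGNED NOT NULL,
    to_id     INT UNSIGNED     NOT NULL,
    posted    TIMESTAMP        NOT NULL DEFAULT CURRENT_TIMESTAMP,
    -- Covers (to_type), (to_type, to_id), and the longer prefixes.
    PRIMARY KEY (to_type, to_id, from_type, from_id),
    -- Covers (from_type) and (from_type, from_id).
    INDEX idx_likes_from (from_type, from_id)
) ENGINE=InnoDB;

-- How many likes does one activity have? Prefix (to_type, to_id).
SELECT COUNT(*) FROM likes
WHERE to_type = 2 AND to_id = 1001;

-- Has this user already liked this activity? Full primary key.
SELECT 1 FROM likes
WHERE to_type = 2 AND to_id = 1001
  AND from_type = 1 AND from_id = 42;

-- Everything one user has liked. Index (from_type, from_id).
SELECT to_type, to_id, posted FROM likes
WHERE from_type = 1 AND from_id = 42;
```

The primary key also guarantees that the same user cannot like the same activity twice, so you do not need a separate uniqueness check.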