Can anyone instantly see a problem or bottleneck in the schema below? Reads are 90% of the operations, but I’d like to know if I’m shooting myself in the foot anywhere on the writes either.
The Proposition
Each object (row in another table) can be related to other objects. There can only ever be one record for each relation pair (pairs are directionally sensitive, so X to Y can co-exist with Y to X).
+-------+---------------------+------+-----+---------+----------------+
| Field | Type                | Null | Key | Default | Extra          |
+-------+---------------------+------+-----+---------+----------------+
| a_id  | bigint(20) unsigned | NO   | PRI | NULL    |                |
| b_id  | bigint(20) unsigned | NO   | PRI | NULL    |                |
+-------+---------------------+------+-----+---------+----------------+
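For reference, a table with this structure could be created roughly as follows. This is only a sketch: the table name relations is taken from the queries in the question, while the storage engine is an assumption, since the question does not state it.

```sql
-- Sketch only: the engine is an assumption, not stated in the question.
CREATE TABLE relations (
    a_id BIGINT UNSIGNED NOT NULL,
    b_id BIGINT UNSIGNED NOT NULL,
    -- Composite primary key: at most one row per directed (a_id, b_id) pair,
    -- and lookups by a_id alone can use the leftmost prefix of this key.
    PRIMARY KEY (a_id, b_id)
) ENGINE=InnoDB;
```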
A typical request will be to get all related objects (for a given object A):
SELECT * FROM objects INNER JOIN relations ON id = b_id WHERE a_id = A
Relations are managed using a simple checkbox array in the UI. To save relations, I’d calculate the difference between checked and unchecked boxes in the current set (objects will be paged in the UI), and then just insert/delete accordingly:
DELETE FROM relations WHERE a_id = A AND b_id IN(B,C)
# if these relations already exist, will fail silently
INSERT IGNORE INTO relations (a_id, b_id) VALUES (A,D), (A,E)
I may also need to query reverse relations, using just b_id for selects. Since b_id is not the leftmost column of the primary key index, will that index be used at all? And if not, will adding a separate index on it induce significant overhead for writes?
The advantage of adding a second index on b_id will outweigh any overhead for writes, since without that index MySQL will need to do a full table scan to filter by b_id.
Whether to index just b_id or (b_id, a_id) depends on the table engine: InnoDB stores the primary key in its secondary indexes, whereas MyISAM does not. On InnoDB, an index on b_id alone therefore already contains a_id and can cover the reverse lookup; on MyISAM you would need (b_id, a_id) to get the same effect.
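A minimal sketch of the two index options discussed in this answer. The index names idx_b_id and idx_b_a and the id value 42 are placeholders made up for illustration; only one of the two ALTER TABLE statements would be needed, depending on the engine.

```sql
-- InnoDB: secondary index entries already carry the primary key columns,
-- so an index on b_id alone also gives access to a_id.
ALTER TABLE relations ADD INDEX idx_b_id (b_id);

-- MyISAM alternative: include a_id explicitly so the index covers
-- the reverse lookup without touching the data file.
-- ALTER TABLE relations ADD INDEX idx_b_a (b_id, a_id);

-- Check which index the reverse lookup uses (42 is a placeholder id).
EXPLAIN SELECT a_id FROM relations WHERE b_id = 42;
```

If the key column of the EXPLAIN output names the new index, the reverse lookup is no longer scanning the whole table.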