I’m a relative newbie when it comes to database design. I know how to define the tables, but I’m starting to run into performance issues as my database grows. Can someone give me some guidance on how to use mysql indexes to improve query performance? My understanding of indexes is that they prevent the entire database table from being searched line by line, while still being able to return valid results.
I have typically just defined a primary key for each table that gives a unique id for each row. Is the idea behind defining an “index” that you do it for each field that will be used in the WHERE clause of a mysql statement? Not sure if that is too general of a statement.
As an example, lets say we have the following three tables:
products
products_id, products_name
categories
categories_id, categories_name
products_to_categories
products_id
categories_id
and we want to run the following query:
SELECT p.products_name, c.categories_name
FROM categories as c
JOIN products as p
JOIN products_to_categories as p2c
WHERE p.products_id=12345
Would we define both fields in the products_to_categories table as indexes and then the products_id and categories_id as primary keys within their parent table?
Any advice or guidance on the general approach to indexes would be much appreciated!
The best way to think about indexes is how you expect to query the data.
Let’s assume that products_id and categories_id are
PRIMARY KEYin your database, which means they are indexed automatically. If not, start with that.When I do multi-join tables, if you want to be paranoid, create two indexes to allow bi-directional accessibility of the IDs, e.g.
This takes a lot of space, but it will be really, really fast, and unless you query both directions (from products to categories, and then reverse), it’s probably overkill. Alternatively, by default, I do:
If you need some sort of constraint (many-to-one, one-to-many) then change your index types to UNIQUE etc.
In general, start with the latter definition, do your query, and run an EXPLAIN on it. If it shows anything which is more than 1 for the number of matched rows (except for the first table), then re-work the indexes.
Database indexing is really more a matter of testing and diagnostics than many think. I didn’t know how to do this for a while, until I actually had a problem. In short:
As one user commented below, EXPLAIN is a good starting point before running timing tests, but nothing beats actual timing tests in the wild.