I have two tables in MySQL, “apps” and “icons”, each with about 750K rows. In Hibernate I was modeling them like:
public class App {
    @Basic
    private String title;

    @OneToOne(mappedBy = "app")
    private Icon icon;
    // etc...
}

public class Icon {
    @Basic
    private String name;

    @OneToOne
    private App app;
    // etc...
}
When I added this relation I quickly ran into a performance problem: reading in a single App was taking more than 1 second. I examined the SQL that Hibernate was producing and found it was joining like this:
select
    apps.id as app_id,
    apps.title as app_title,
    icons.id as icon_id,
    icons.name as icon_name
from
    apps
left outer join
    icons
        on apps.id=icons.app_id
where
    apps.id="xyz";
I found that adding @Fetch(FetchMode.SELECT) to the annotation greatly sped up the performance, bringing it down to around 30ms for effectively the same result. Here’s the produced SQL with the @Fetch(FetchMode.SELECT) annotation:
select
    apps.id as app_id,
    apps.title as title
from
    apps
where
    apps.id="xyz";

select
    icons.id as icon_id,
    icons.name as icon_name
from
    icons
where
    icons.app_id="xyz";
Why is the left outer join so much slower? “Explain” on the joined query shows:
+----+-------------+-------+-------+---------------+---------+---------+-------+--------+-------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref   | rows   | Extra |
+----+-------------+-------+-------+---------------+---------+---------+-------+--------+-------+
|  1 | SIMPLE      | apps  | const | PRIMARY       | PRIMARY | 767     | const |      1 |       |
|  1 | SIMPLE      | icons | ALL   | NULL          | NULL    | NULL    | NULL  | 783556 |       |
+----+-------------+-------+-------+---------------+---------+---------+-------+--------+-------+
So it’s apparently visiting every row, vs a single row for the multiple-select query. Can’t the join use the index that I have on icons.app_id?
PS: yes, I used “RESET QUERY CACHE” in-between timing runs.
Update: moved to a bigint primary key, used that to join the tables instead of the VARCHAR, and performance of the join is now on par with the “multiple selects” method.
According to your explain, I believe the issue is with your schema at the database layer and not at the application layer.
The fact that the join to the icons table has no entry for possible_keys leads me to believe you are running a MyISAM storage engine or an InnoDB storage engine with no FK constraints. Also, a key length of 767 strikes me as unusual; I’ve only ever seen this value < 10.
If the engine is MyISAM: Add an index to the icons.app_id column. And consider using an InnoDB engine so you can establish FK constraints so that you do not end up with orphaned rows.
If the engine is InnoDB: Add a FK constraint to icons.app_id which references apps.id. By adding a FK constraint not only are you ensuring your data doesn’t become orphaned, you are also optimizing the joins between the tables because you are forced to create an index on both columns.
Either of the solutions mentioned above should greatly improve your performance. Let me know how it goes.
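To make the cost difference concrete, here is a toy illustration in plain Java (not MySQL itself). It counts how many rows are touched when looking up one app's icon by scanning every row, which is what type = ALL in the explain output means, versus going through an index. The class JoinCostSketch, the IconRow type and the generated ids are all made up for the example, and a HashMap only stands in for the B-tree index MySQL would actually use.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class JoinCostSketch {
    // Hypothetical stand-in for one row of the icons table.
    static final class IconRow {
        final String name;
        final String appId;
        IconRow(String name, String appId) {
            this.name = name;
            this.appId = appId;
        }
    }

    // Build n fake icon rows, one per app ("app0", "app1", ...).
    static List<IconRow> buildIcons(int n) {
        List<IconRow> rows = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            rows.add(new IconRow("icon" + i, "app" + i));
        }
        return rows;
    }

    // No usable index: every row has to be compared against the app id.
    static int rowsExaminedByScan(List<IconRow> icons, String appId) {
        int examined = 0;
        for (IconRow row : icons) {
            examined++;
            if (row.appId.equals(appId)) {
                // Match found, but without an index the scan still continues.
            }
        }
        return examined;
    }

    // Stand-in for an index on icons.app_id.
    static Map<String, IconRow> buildIndex(List<IconRow> icons) {
        Map<String, IconRow> index = new HashMap<>();
        for (IconRow row : icons) {
            index.put(row.appId, row);
        }
        return index;
    }

    // With the index, only the matching row (if any) is fetched.
    static int rowsExaminedByIndex(Map<String, IconRow> index, String appId) {
        return index.containsKey(appId) ? 1 : 0;
    }

    public static void main(String[] args) {
        List<IconRow> icons = buildIcons(750_000);
        System.out.println(rowsExaminedByScan(icons, "app42"));              // 750000
        System.out.println(rowsExaminedByIndex(buildIndex(icons), "app42")); // 1
    }
}
```

The absolute numbers mean nothing for your server; the point is only that the scan grows with the size of the icons table while the indexed lookup does not.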
You can read more info about some discussed topics using these links:
InnoDB Storage Engine
Foreign Key Constraints
— Update —
Here are some example alters for when you are ready to add the INT columns. Remember to do this on dev first and make sure it resolves the issue before pushing to production.
For the apps table:
For the icons table:
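As a rough sketch only, alters along these lines could be run from Java over JDBC. Everything named here is an assumption for illustration: the columns int_id and app_int_id, the key uk_apps_int_id, the constraint fk_icons_app and the class AddIntKeys are hypothetical, both tables are assumed to be InnoDB, and backfilling app_int_id from the old VARCHAR key is left out.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.List;

class AddIntKeys {
    // apps: add a numeric key next to the existing VARCHAR id.
    // An AUTO_INCREMENT column must be part of a key, hence the unique key.
    static final List<String> APPS_DDL = List.of(
        "ALTER TABLE apps"
            + " ADD COLUMN int_id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,"
            + " ADD UNIQUE KEY uk_apps_int_id (int_id)"
    );

    // icons: add the referencing column, then the FK constraint.
    // The column type must match apps.int_id exactly.
    static final List<String> ICONS_DDL = List.of(
        "ALTER TABLE icons"
            + " ADD COLUMN app_int_id BIGINT UNSIGNED NULL",
        "ALTER TABLE icons"
            + " ADD CONSTRAINT fk_icons_app FOREIGN KEY (app_int_id)"
            + " REFERENCES apps (int_id) ON DELETE CASCADE"
    );

    // Runs the statements in order on an already opened connection.
    static void apply(Connection conn) throws SQLException {
        try (Statement st = conn.createStatement()) {
            for (String sql : APPS_DDL) {
                st.executeUpdate(sql);
            }
            for (String sql : ICONS_DDL) {
                st.executeUpdate(sql);
            }
        }
    }
}
```

You would call AddIntKeys.apply(conn) with a connection to the dev database first, then check with explain that the join now lists a key for icons.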
These alters are demonstrative only, but should be enough to get you started in the right direction. You can read up on the MySQL docs I posted about Foreign Key Constraints for more information. Now, with the FK constraint set up between apps and icons, if any apps are deleted, any rows in icons matching apps.id will also be deleted, ensuring you don’t orphan any data. If you don’t want to delete the associated rows in the icons table, you can change ON DELETE CASCADE to ON DELETE SET NULL (the referencing column must then allow NULL), and they will be unlinked from the apps table, but still reside in the icons table.