I’m running a query which looks like this:
SELECT parent.field, child.field
FROM parent
JOIN child ON (child.id = parent.id
            OR child.id = parent.otherid)
This is really slow, however (about 100k records, plus JOINs to other tables in the real version). Despite having tried indexes on
parent.id (PRIMARY),
parent.otherid,
child.id (PRIMARY),
and a composite index of parent.id and parent.otherid
I cannot get MySQL to use any of those indexes when making this join.
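For reference, a minimal sketch of how the plan can be checked, using the simplified table and column names from the query above. In the EXPLAIN output, possible_keys lists the indexes MySQL considered, key shows the one it actually chose, and a type of ALL means a full table scan.

```sql
-- Prefix the query with EXPLAIN to see which index (if any) is used per table.
EXPLAIN
SELECT parent.field, child.field
FROM parent
JOIN child ON (child.id = parent.id
            OR child.id = parent.otherid);
```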
I read that MySQL can use only one index per join, but I can’t find anywhere whether it can use a composite index when a JOIN contains an OR condition.
Does anyone here know if it’s possible to make this query reference an index?
If so, how?
MY SOLUTION
(SO won’t let me answer my own question below atm)
After a bunch of tweaking I came up with a fairly decent solution which retains the ability to JOIN and aggregate other tables.
SELECT parent.field, child.field
FROM parent
JOIN (
    SELECT parent.id AS parentid,
           # Prevents the need to union
           IF(NOT ISNULL(parent.otherid) AND parent.otherid <> parent.id,
              parent.otherid,
              parent.id) AS getdataforid
    FROM parent
    WHERE (condition)
) AS foundrecords
    ON foundrecords.parentid = parent.id
# getdataforid is a column of the derived table, not of parent
JOIN child ON child.id = foundrecords.getdataforid
For speed, this requires a condition inside the subquery to reduce the number of records placed in the temporary table. I have tons of additional joins on the outer query, some joining to the child and some to the parent (with some aggregates), so this one worked best for me.
In many cases a union will be faster and more effective, but since I’m filtering on parent while wanting additional data from child (parent self-references), the union caused extra rows for me which I couldn’t consolidate.
It’s possible the same result can be found just by joining parent to itself and aliasing a where condition in the outer query, but this one works quite nicely for me.
Thanks to Jirka for the UNION ALL suggestion, it’s what prompted me to get here 🙂
Your query makes it theoretically possible that a single child has two distinct parents, which would make for quite nonstandard terminology. Let’s assume, however, that your data patterns make that impossible.
Then the following gives you the same result using separate indexes, one index per column.
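A minimal sketch of such a rewrite, assuming the UNION ALL approach mentioned in the question: each branch joins on a single equality, so the optimizer can pick one index per branch. The extra <> filter in the second branch is an assumption added here so that a parent whose otherid equals its id is not returned twice.

```sql
-- Branch 1: children matched through parent.id
SELECT parent.field, child.field
FROM parent
JOIN child ON child.id = parent.id
UNION ALL
-- Branch 2: children matched through parent.otherid
-- (rows with a NULL otherid never match, so they drop out on their own)
SELECT parent.field, child.field
FROM parent
JOIN child ON child.id = parent.otherid
WHERE parent.otherid <> parent.id;
```

Running EXPLAIN on each branch separately should show whether an index is now being picked up.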