I really need help getting this query right. I can’t share actual table and column names, but will try my best to lay out the problem simply.
Assume the following tables. The tables and keys CANNOT be changed. Period. I don’t care if you think it’s a bad design, this question isn’t a design question, it’s on SQL syntax.
- Table A – Primary key named id1
- Table B – Contains two foreign keys, TableA.id1 and Foo.id2 (ignore Foo, it doesn’t matter for this)
- Table C – Contains two foreign keys, TableA.id1 and Foo.id2, plus additional interesting columns.
Constraints:
- The SQL gets a set of id1s passed in as an argument.
- It must return a list of Table C rows.
- It must only return Table C rows where a Table B row exists with a matching TableA.id1 and Foo.id2 – There ARE rows in Table C that don’t match Table B
- A row MUST be returned for every id1 passed in, even if no Table C row exists.
At first I tried a Left Outer Join from Table A to Table B then an Inner Join to Table C. That violates the 4th rule above, as the Inner Join drops out those rows.
Next I tried two Left Outer joins. This is closer, but has the side effect of including rows that match the Table A join to Table B, but don’t have a corresponding Table C entry, which isn’t what I want.
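For reference, the two attempts described above would look roughly like the sketch below. This is reconstructed from the description only; the actual queries were not posted, so the join order and the USING columns are assumptions, and x, y, z stand in for the id1 values passed in.

```sql
-- Attempt 1 (assumed shape): LEFT OUTER JOIN to TableB, then INNER JOIN to TableC.
-- The INNER JOIN discards every TableA row that has no matching TableC row.
SELECT a.id1, c.*
FROM TableA a
LEFT OUTER JOIN TableB b USING (id1)
INNER JOIN TableC c USING (id1, id2)
WHERE a.id1 IN (x, y, z);

-- Attempt 2 (assumed shape): two LEFT OUTER JOINs.
-- Every TableA row survives, but each TableB row without a TableC match
-- also produces its own row with NULLs in the TableC columns.
SELECT a.id1, c.*
FROM TableA a
LEFT OUTER JOIN TableB b USING (id1)
LEFT OUTER JOIN TableC c USING (id1, id2)
WHERE a.id1 IN (x, y, z);
```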
So, here’s what I came up with.
    SELECT
        a.id1,
        c.*
    FROM
        TableB b
    INNER JOIN
        TableC c USING (id1, id2)
    RIGHT OUTER JOIN
        TableA a USING (id1)
    WHERE
        a.id1 IN (x, y, z)
I’m a bit wary of a Right Outer Join, as the documentation I’ve read says it can be replaced with a Left Outer, but it doesn’t appear so for this case. It also seems a bit rare, which is making other devs nervous, so I’m being cautious.
So, three questions in one.
- Is this correct?
- Did I use the Right Outer Join correctly?
- Is there a cleaner way to achieve the same thing?
EDIT: DB is MySQL
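You can rewrite it as a LEFT OUTER JOIN by using parentheses. In pseudo-SQL, change this:
    TableB INNER JOIN TableC RIGHT OUTER JOIN TableA
to this:
    TableA LEFT OUTER JOIN (TableB INNER JOIN TableC)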
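Applied to the query in the question, the parenthesized form might look like the sketch below. It assumes MySQL and reuses the question's placeholder table names; x, y, z stand in for the id1 values passed in.

```sql
-- Start from every requested TableA row, then attach the matched
-- TableB/TableC pairs where they exist. The parentheses make the
-- INNER JOIN run first, so TableC rows without a TableB match never appear.
SELECT
    a.id1,
    c.*
FROM
    TableA a
LEFT OUTER JOIN
    (TableB b INNER JOIN TableC c USING (id1, id2)) USING (id1)
WHERE
    a.id1 IN (x, y, z);
```

Joins without parentheses are evaluated left to right, so the RIGHT OUTER JOIN version already treats the TableB/TableC inner join as a single unit. The parentheses only make that grouping explicit, with TableA on the left.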