I have a table, say table1, which has 3 columns: column1, column2 and column3.
column1 and column2 are each a FOREIGN KEY referencing one of 2 other tables. However, the data in column3 can come from any of n tables.
For example, consider Facebook. To display activities, it might maintain a table with rows such as "user1 photoliked photo1" or "user1 statusliked status1". In this case column3 cannot be a FOREIGN KEY to one specific table.
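A minimal sketch of the layout described above, using Python's built-in sqlite3 module with an in-memory database. The names users and verbs (standing in for table2 and table3), the name columns, and the sample rows are assumptions for illustration; only table1, photos, statustable and the columns used in the queries come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, user_name TEXT);   -- hypothetical table2
    CREATE TABLE verbs (verb_id INTEGER PRIMARY KEY, verb_name TEXT);   -- hypothetical table3
    CREATE TABLE photos (photo_id INTEGER PRIMARY KEY, photo_name TEXT);
    CREATE TABLE statustable (status_id INTEGER PRIMARY KEY, status TEXT);
    CREATE TABLE table1 (
        column1 INTEGER NOT NULL REFERENCES users (user_id),
        column2 INTEGER NOT NULL REFERENCES verbs (verb_id),
        column3 INTEGER NOT NULL  -- no REFERENCES: the target table depends on column2
    );
    INSERT INTO users VALUES (1, 'user1');
    INSERT INTO verbs VALUES (1, 'photoliked'), (2, 'statusliked');
    INSERT INTO photos VALUES (1, 'photo1');
    INSERT INTO statustable VALUES (1, 'status1');
    -- "user1 photoliked photo1" and "user1 statusliked status1"
    INSERT INTO table1 VALUES (1, 1, 1), (1, 2, 1);
""")

rows = conn.execute(
    "SELECT column1, column2, column3 FROM table1 ORDER BY column2"
).fetchall()
print(rows)
```

Both rows carry the same value in column3, but it points at a photo in one row and at a status in the other, which is why a single foreign key cannot describe it.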
There are 2 ways of getting the actual data:
1st way:
SELECT t1.column1 AS user_id,
       t1.column2 AS verb_id,
       CASE
           -- photoliked and statusliked stand for the ids of those verbs
           WHEN t1.column2 = photoliked THEN
               (SELECT photo_name FROM photos WHERE photo_id = t1.column3) -- getting the desired data from the third column
           WHEN t1.column2 = statusliked THEN
               (SELECT status FROM statustable WHERE status_id = t1.column3)
           ELSE ''
       END AS performedon
FROM table1 t1
JOIN table2 t2 ON t2.user_id = t1.column1 -- joining the first column
JOIN table3 t3 ON t3.verb_id = t1.column2 -- joining the second column
2nd way:
SELECT t1.column1 AS user_id,
       t1.column2 AS verb_id,
       CASE
           -- photoliked and statusliked stand for the ids of those verbs
           WHEN t1.column2 = photoliked THEN p.photo_name
           WHEN t1.column2 = statusliked THEN s.status
           ELSE ''
       END AS performedon
FROM table1 t1
JOIN table2 t2 ON t2.user_id = t1.column1 -- joining the first column
JOIN table3 t3 ON t3.verb_id = t1.column2 -- joining the second column
LEFT JOIN photos p ON p.photo_id = t1.column3 -- joining column3 with each specific table
LEFT JOIN statustable s ON s.status_id = t1.column3
Question
Which of the 2 ways is better for retrieving the data, and which of the 2 queries is less expensive?
The second should be faster. The reason is that the first one contains what are called correlated subqueries: each subquery depends on values from the current record of the master query, so it needs to be run once for every matching record in the master query. In your case, a subquery cannot run until the value of verb_id has been determined for that record in the master query. That adds up to a lot of queries.
An EXPLAIN on the first query should indicate this issue. It is usually a red flag when you see a correlated subquery reported in an EXPLAIN.
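To try this out, here is a rough sketch that runs both forms of the query against a small in-memory SQLite database and prints the plan for each. SQLite is used only because it ships with Python. The table names users and verbs (for table2 and table3), the verb_name column and the sample rows are assumptions, not part of the question. Plan output and actual cost differ between database engines and versions, so treat this as a way to compare the two shapes, not as a benchmark.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, user_name TEXT);   -- hypothetical table2
    CREATE TABLE verbs (verb_id INTEGER PRIMARY KEY, verb_name TEXT);   -- hypothetical table3
    CREATE TABLE photos (photo_id INTEGER PRIMARY KEY, photo_name TEXT);
    CREATE TABLE statustable (status_id INTEGER PRIMARY KEY, status TEXT);
    CREATE TABLE table1 (column1 INTEGER, column2 INTEGER, column3 INTEGER);
    INSERT INTO users VALUES (1, 'user1');
    INSERT INTO verbs VALUES (1, 'photoliked'), (2, 'statusliked');
    INSERT INTO photos VALUES (10, 'photo1');
    INSERT INTO statustable VALUES (20, 'status1');
    INSERT INTO table1 VALUES (1, 1, 10), (1, 2, 20);
""")

# 1st way: one correlated scalar subquery per CASE branch.
SUBQUERY_SQL = """
SELECT t1.column1 AS user_id, t1.column2 AS verb_id,
       CASE WHEN v.verb_name = 'photoliked' THEN
                (SELECT photo_name FROM photos WHERE photo_id = t1.column3)
            WHEN v.verb_name = 'statusliked' THEN
                (SELECT status FROM statustable WHERE status_id = t1.column3)
            ELSE '' END AS performedon
FROM table1 t1
JOIN users u ON u.user_id = t1.column1
JOIN verbs v ON v.verb_id = t1.column2
ORDER BY t1.column1, t1.column2
"""

# 2nd way: LEFT JOIN every candidate table, then pick the column in CASE.
JOIN_SQL = """
SELECT t1.column1 AS user_id, t1.column2 AS verb_id,
       CASE WHEN v.verb_name = 'photoliked' THEN p.photo_name
            WHEN v.verb_name = 'statusliked' THEN s.status
            ELSE '' END AS performedon
FROM table1 t1
JOIN users u ON u.user_id = t1.column1
JOIN verbs v ON v.verb_id = t1.column2
LEFT JOIN photos p ON p.photo_id = t1.column3
LEFT JOIN statustable s ON s.status_id = t1.column3
ORDER BY t1.column1, t1.column2
"""

subquery_rows = conn.execute(SUBQUERY_SQL).fetchall()
join_rows = conn.execute(JOIN_SQL).fetchall()
print(subquery_rows == join_rows)

for label, sql in (("1st way", SUBQUERY_SQL), ("2nd way", JOIN_SQL)):
    print(label)
    # The last column of each plan row is the human-readable detail text.
    for plan_row in conn.execute("EXPLAIN QUERY PLAN " + sql):
        print("   ", plan_row[-1])
```

Both queries return the same rows here, so the choice between them comes down to the plan your own database produces for realistic data volumes.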