I know that many threads have been created here and elsewhere on the internet about this topic, but I still can’t pin down the difference between the two statements! By trial and error I can get all the results I need from my queries, but I really don’t have full control of the knife!
I consider myself a very good programmer and a very good SQL-ista, and I feel a little ashamed about this…
Here’s an example:
- I have a table with the pages of a website (“web_page”)
- a table with the categories (“category”).
- a category can contain one or more pages, but not vice versa
- a category may contain NO pages at all
- a page can be visible or not in the website
So if I want to show all the categories and their pages, both the categories with pages and those without, I have to do something like this:
FROM category
LEFT JOIN web_page ON ( web_page.category_id = category.category_id AND web_page.active = "Y" )
So if a category has no pages, I’ll see web_page_id NULL on the record of that category.
But if I do:
FROM category
LEFT JOIN web_page ON ( web_page.category_id = category.category_id )
...
WHERE web_page.active = "Y"...
I’ll select only the categories that have at least one active web_page… But WHY?
This was just an example… I’d like to understand this difference once and for all!
Thank you.
To make your query work as you intended, put the condition into the ON clause, as in your first query.

The reason this works is that (with most databases, but not all) the WHERE clause filters the rows after they are joined. If the join doesn’t find a web page row for a category (because the category has no web pages), then all the columns of web_page will be NULL, and a comparison of a value (like "Y") with NULL is never true (it evaluates to unknown), so those non-joining rows are filtered out.

However, by moving the condition into the ON clause, the condition is evaluated as the join is made, so that you only join rows that have active = "Y", but if there aren’t any such rows, you still get the category row with NULL in the web page columns.

This version of the query is really saying: “give me all categories and their active web pages (if any)”.
Note that I said “most databases”… but I would not rely on any engine to guess what you meant. MySQL, for example, also applies the WHERE clause after the join, so your second query drops the categories without active pages there too.
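
If it helps to see the two behaviours side by side, here is a minimal sketch that runs both queries against an in-memory SQLite database from Python. The table and column names follow the question, but the `name` column, the sample rows and the selected columns are assumptions made up for illustration, and other engines may format results differently.

```python
import sqlite3

# Hypothetical miniature schema, modelled on the table names in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (category_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE web_page (
        web_page_id INTEGER PRIMARY KEY,
        category_id INTEGER REFERENCES category(category_id),
        active      TEXT
    );
    -- category 1: one active and one inactive page
    -- category 2: only an inactive page
    -- category 3: no pages at all
    INSERT INTO category VALUES (1, 'news'), (2, 'archive'), (3, 'empty');
    INSERT INTO web_page VALUES (10, 1, 'Y'), (11, 1, 'N'), (12, 2, 'N');
""")

# Condition in ON: it only decides which pages are joined;
# every category row survives.
on_rows = conn.execute("""
    SELECT category.category_id, web_page.web_page_id
    FROM category
    LEFT JOIN web_page
           ON web_page.category_id = category.category_id
          AND web_page.active = 'Y'
    ORDER BY category.category_id
""").fetchall()

# Condition in WHERE: it runs after the join and discards the rows
# whose web_page columns are NULL (and the inactive pages).
where_rows = conn.execute("""
    SELECT category.category_id, web_page.web_page_id
    FROM category
    LEFT JOIN web_page
           ON web_page.category_id = category.category_id
    WHERE web_page.active = 'Y'
    ORDER BY category.category_id
""").fetchall()

# Comparing NULL with a value yields NULL (unknown), not true.
null_cmp = conn.execute("SELECT NULL = 'Y'").fetchone()[0]

print(on_rows)     # [(1, 10), (2, None), (3, None)]
print(where_rows)  # [(1, 10)]
print(null_cmp)    # None
```

With this sample data the ON version returns all three categories, with NULL (None in Python) where there is no active page, while the WHERE version returns only category 1.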