I’d like to know the most efficient SQL query for solving this problem:
Say we have a table with two columns, one storing entry ids (entry_id) and one storing category ids (cat_id):
entry_id cat_id
3 1
3 2
3 3
3 20
4 1
4 2
4 21
I’d like to count how many distinct entry_ids are in category 1, 2 OR 3 and are also in cat_id 20.
For example, categories 1, 2 and 3 might represent music genres (Country, Pop etc.), while category 20 might be recording formats (CD, Vinyl etc.). So another way of putting it verbally could be: “How many products are there that are on Vinyl and in either the Pop or Country category?”
I could achieve this with a nested loop in code (PHP) or possibly with a nested SQL subquery, but neither feels that efficient. I feel there must be an obvious answer to this staring me in the face…
EDIT TO ADD:
I would also like to do this without modifying the database design, as it’s a third party system.
FURTHER EXAMPLE TO CLARIFY:
Another real-world example of why I’d need this data:
Let’s say the category ids instead represent either:
- Accommodation Types (Camping = 20, Holiday Cottage = 21)
OR
- Continents and their sub-regions (i.e. Europe = 1, UK = 2, England = 3)
Let’s say someone has selected that they are interested in camping (cat_id = 20). Now we need to count how many camping products there are in Europe. A product might be tagged as Europe (parent), UK (child) AND England (grand-child), giving us an array of category ids 1, 2 and 3. So we now need to count how many distinct products are in any of those categories AND in the original accommodation category of 20 (camping).
So having selected Camping, the end result might look something like:
- Europe: 4 camping products
  - UK: 2 camping products
    - England: 1 camping product
    - Wales: 1 camping product
  - France: 2 camping products
etc.
Hope that helps…
I believe you want GROUP BY, COUNT() and EXISTS()
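A minimal sketch of how that might look, run against the sample rows from the question. The table name entry_categories is an assumption (the question only names the two columns), and Python's built-in sqlite3 is used here only so the query can be executed as-is; the SQL itself should carry over to other databases with little or no change.

```python
import sqlite3

# In-memory database loaded with the sample rows from the question.
# "entry_categories" is a hypothetical table name, not from the original post.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry_categories (entry_id INTEGER, cat_id INTEGER)")
conn.executemany(
    "INSERT INTO entry_categories VALUES (?, ?)",
    [(3, 1), (3, 2), (3, 3), (3, 20), (4, 1), (4, 2), (4, 21)],
)

# Per-category counts: entries in category 1, 2 or 3 that also have a row in 20.
per_category_sql = """
SELECT a.cat_id, COUNT(DISTINCT a.entry_id) AS entry_count
FROM entry_categories AS a
WHERE a.cat_id IN (1, 2, 3)
  AND EXISTS (
      SELECT 1
      FROM entry_categories AS b
      WHERE b.entry_id = a.entry_id
        AND b.cat_id = 20
  )
GROUP BY a.cat_id
ORDER BY a.cat_id
"""
per_category_exists = conn.execute(per_category_sql).fetchall()

# Single overall count: drop the GROUP BY and count distinct entries.
total_sql = """
SELECT COUNT(DISTINCT a.entry_id)
FROM entry_categories AS a
WHERE a.cat_id IN (1, 2, 3)
  AND EXISTS (
      SELECT 1
      FROM entry_categories AS b
      WHERE b.entry_id = a.entry_id
        AND b.cat_id = 20
  )
"""
total_exists = conn.execute(total_sql).fetchone()[0]

print(per_category_exists)  # [(1, 1), (2, 1), (3, 1)]
print(total_exists)         # 1
```

Only entry 3 has a row in category 20, so entry 4 is excluded even though it is in categories 1 and 2.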
V2 using join instead of EXISTS()
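A sketch of the join variant under the same assumptions (hypothetical table name entry_categories, sqlite3 used only to make it runnable). The table is joined to itself: alias a picks the rows in categories 1, 2 or 3, and alias b requires a matching row in category 20 for the same entry_id.

```python
import sqlite3

# Same sample data as in the question; the table name is a placeholder.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry_categories (entry_id INTEGER, cat_id INTEGER)")
conn.executemany(
    "INSERT INTO entry_categories VALUES (?, ?)",
    [(3, 1), (3, 2), (3, 3), (3, 20), (4, 1), (4, 2), (4, 21)],
)

# Overall count. DISTINCT matters here: an entry tagged with several of
# categories 1, 2, 3 yields one joined row per matching category.
total_join_sql = """
SELECT COUNT(DISTINCT a.entry_id)
FROM entry_categories AS a
JOIN entry_categories AS b
  ON b.entry_id = a.entry_id
 AND b.cat_id = 20
WHERE a.cat_id IN (1, 2, 3)
"""
total_join = conn.execute(total_join_sql).fetchone()[0]

# Per-category breakdown with the same join.
per_category_join_sql = """
SELECT a.cat_id, COUNT(DISTINCT a.entry_id) AS entry_count
FROM entry_categories AS a
JOIN entry_categories AS b
  ON b.entry_id = a.entry_id
 AND b.cat_id = 20
WHERE a.cat_id IN (1, 2, 3)
GROUP BY a.cat_id
ORDER BY a.cat_id
"""
per_category_join = conn.execute(per_category_join_sql).fetchall()

print(total_join)         # 1
print(per_category_join)  # [(1, 1), (2, 1), (3, 1)]
```

Which version is faster depends on the database and its indexes; an index covering (cat_id, entry_id) would likely help either one, but that is worth checking with the query planner on the real data.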