I’ve got a bunch of tables that I’m joining on a unique item id. Most of the WHERE clause conditions will be built programmatically from a user-submitted form (a search box), and multiple conditions will often be tested against the same table, in this case item tags.
My experience with SQL is minimal, but I understand the basics. I want to find the ids of active (status = 1) items that have been tagged with a tag of a certain type, with the values “cats” and “kittens”. Tags are stored as (id, product_id, tag_type_id, value), with id being the only column that requires a unique value. My first attempt was:
select
    distinct p2c.product_id
from '.TABLE_PRODUCT_TO_CATEGORY.' p2c
    inner join '.TABLE_PRODUCT.' p on p2c.product_id = p.id
    inner join '.TABLE_PRODUCT_TAG.' pt on p.id = pt.product_id
    inner join '.TABLE_TAG_TYPE.' tt on pt.tag_type_id = tt.id
where
    tt.id = '.PRODUCT_TAG_TYPE_FREE_TAG.'
    and p.status = 1
    and lower(pt.value) = "cats"
    and lower(pt.value) = "kittens"
But that returned nothing. I realised that the final AND condition was the problem, so I tried a self-join instead:
select
    distinct p2c.product_id
from '.TABLE_PRODUCT_TO_CATEGORY.' p2c
    inner join '.TABLE_PRODUCT.' p on p2c.product_id = p.id
    inner join '.TABLE_PRODUCT_TAG.' pt on p.id = pt.product_id
    inner join '.TABLE_PRODUCT_TAG.' pt2 on p.id = pt2.product_id
    inner join '.TABLE_TAG_TYPE.' tt on pt.tag_type_id = tt.id
where
    tt.id = '.PRODUCT_TAG_TYPE_FREE_TAG.'
    and p.status = 1
    and lower(pt.value) = "cats"
    and lower(pt2.value) = "kittens"
Now everything works as expected and the result set is correct. So what do I want to know? To reiterate, the results I’m after are the ids of active (status = 1) items that have been tagged with a tag of a certain type, with the values “cats” AND “kittens”.
- Are self-joins the best way of achieving these results?
- This query has the potential to be huge (I’ve omitted a category condition, of which there may be ~300), so does this self-join approach scale well? If not, is there an alternative?
- Will the self-join approach be the best way forward (assuming there is an alternative) if I allow users to specify complex tag searches, e.g. “cats” and (“kittens” or “dogs”) not “parrots”?
The problem with the initial query was this: both value conditions are tested against the same pt row.
There exists no tag whose value is both “cats” and “kittens”, so no records will be returned. Using an IN clause, as SQLMenace suggests, would fix that: you’re then saying, “give me back any active item that has been tagged ‘cats’ or ‘kittens’”.
But if you want any active item that has BOTH tags – then you need to do something like your second query. It’s not perfectly clear from your question if that’s what you’re after.
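To make the difference concrete, here is a minimal sketch using Python's built-in sqlite3 module with a simplified, made-up schema (the table names `product` and `product_tag`, the sample rows, and tag type 1 are assumptions for illustration, not your real tables). It runs the same-row AND, the IN version, the self-join, and one possible alternative to the self-join: filter with IN, group by product, and require as many distinct matching values as tags were asked for.

```python
import sqlite3

# Hypothetical, simplified schema and data; the real tables will differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, status INTEGER);
    CREATE TABLE product_tag (
        id INTEGER PRIMARY KEY,
        product_id INTEGER,
        tag_type_id INTEGER,
        value TEXT
    );
    INSERT INTO product VALUES (1, 1), (2, 1), (3, 0);
    INSERT INTO product_tag (product_id, tag_type_id, value) VALUES
        (1, 1, 'cats'), (1, 1, 'kittens'),   -- active, both tags
        (2, 1, 'cats'),                      -- active, one tag only
        (3, 1, 'cats'), (3, 1, 'kittens');   -- both tags, but inactive
""")


def ids(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]


# Both conditions on the SAME tag row: no single row can satisfy them.
same_row = ids("""
    SELECT DISTINCT p.id FROM product p
    JOIN product_tag pt ON pt.product_id = p.id
    WHERE p.status = 1 AND pt.tag_type_id = 1
      AND lower(pt.value) = 'cats' AND lower(pt.value) = 'kittens'
    ORDER BY p.id
""")

# IN: items tagged with EITHER value.
either = ids("""
    SELECT DISTINCT p.id FROM product p
    JOIN product_tag pt ON pt.product_id = p.id
    WHERE p.status = 1 AND pt.tag_type_id = 1
      AND lower(pt.value) IN ('cats', 'kittens')
    ORDER BY p.id
""")

# Self-join: one alias per required tag, so BOTH must be present.
both_self_join = ids("""
    SELECT DISTINCT p.id FROM product p
    JOIN product_tag pt  ON pt.product_id  = p.id AND pt.tag_type_id  = 1
    JOIN product_tag pt2 ON pt2.product_id = p.id AND pt2.tag_type_id = 1
    WHERE p.status = 1
      AND lower(pt.value) = 'cats' AND lower(pt2.value) = 'kittens'
    ORDER BY p.id
""")

# Alternative: filter with IN, then keep products matching 2 distinct values.
both_grouped = ids("""
    SELECT p.id FROM product p
    JOIN product_tag pt ON pt.product_id = p.id
    WHERE p.status = 1 AND pt.tag_type_id = 1
      AND lower(pt.value) IN ('cats', 'kittens')
    GROUP BY p.id
    HAVING COUNT(DISTINCT lower(pt.value)) = 2
    ORDER BY p.id
""")

print(same_row, either, both_self_join, both_grouped)
```

With the grouped form, adding a required tag means adding a value to the IN list and bumping the count, rather than adding another join. Which of the two performs better will depend on your database and indexes, so it is worth checking the query plan on real data.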
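For something like your Question #3, i.e. “cats” and (“kittens” or “dogs”) not “parrots”,
you would want pt1, pt2, and (in a subquery) pt3: for example, pt1 matching “cats”, pt2 matching “kittens” or “dogs”, and a NOT EXISTS subquery on pt3 ruling out “parrots”.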
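A rough sketch of that shape, again using Python's sqlite3 module and a made-up schema (the table names, sample rows, and the `FREE_TAG` constant are assumptions for illustration): pt1 carries “cats”, pt2 carries “kittens” or “dogs”, and pt3 sits in a NOT EXISTS subquery that excludes “parrots”.

```python
import sqlite3

FREE_TAG = 1  # hypothetical stand-in for PRODUCT_TAG_TYPE_FREE_TAG

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, status INTEGER);
    CREATE TABLE product_tag (
        id INTEGER PRIMARY KEY,
        product_id INTEGER,
        tag_type_id INTEGER,
        value TEXT
    );
    INSERT INTO product VALUES (1, 1), (2, 1), (3, 1), (4, 1), (5, 1), (6, 0);
    INSERT INTO product_tag (product_id, tag_type_id, value) VALUES
        (1, 1, 'cats'), (1, 1, 'kittens'),
        (2, 1, 'cats'), (2, 1, 'dogs'),
        (3, 1, 'cats'), (3, 1, 'kittens'), (3, 1, 'parrots'),
        (4, 1, 'cats'),
        (5, 1, 'kittens'), (5, 1, 'dogs'),
        (6, 1, 'cats'), (6, 1, 'kittens');   -- product 6 is inactive
""")

# "cats" AND ("kittens" OR "dogs") NOT "parrots"
sql = """
    SELECT DISTINCT p.id
    FROM product p
    JOIN product_tag pt1 ON pt1.product_id = p.id AND pt1.tag_type_id = :tag_type
    JOIN product_tag pt2 ON pt2.product_id = p.id AND pt2.tag_type_id = :tag_type
    WHERE p.status = 1
      AND lower(pt1.value) = 'cats'
      AND lower(pt2.value) IN ('kittens', 'dogs')
      AND NOT EXISTS (
          SELECT 1
          FROM product_tag pt3
          WHERE pt3.product_id = p.id
            AND pt3.tag_type_id = :tag_type
            AND lower(pt3.value) = 'parrots'
      )
    ORDER BY p.id
"""
matches = [row[0] for row in conn.execute(sql, {"tag_type": FREE_TAG})]
print(matches)
```

Each AND-ed term gets its own alias, an OR group collapses into one alias with an IN list, and each excluded tag becomes a NOT EXISTS subquery.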
The broadly general case could get quite messy…
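One way to keep the general case manageable, offered as a sketch rather than a recommendation, is to skip the per-tag joins and express the user's boolean expression in a HAVING clause, with one "does this product carry tag X" test per term. The schema, sample rows, and the `has` helper below are all hypothetical, and the example uses Python's sqlite3 module only so that it can be run as-is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, status INTEGER);
    CREATE TABLE product_tag (
        id INTEGER PRIMARY KEY,
        product_id INTEGER,
        tag_type_id INTEGER,
        value TEXT
    );
    INSERT INTO product VALUES (1, 1), (2, 1), (3, 1), (4, 1), (5, 1), (6, 0);
    INSERT INTO product_tag (product_id, tag_type_id, value) VALUES
        (1, 1, 'cats'), (1, 1, 'kittens'),
        (2, 1, 'cats'), (2, 1, 'dogs'),
        (3, 1, 'cats'), (3, 1, 'kittens'), (3, 1, 'parrots'),
        (4, 1, 'cats'),
        (5, 1, 'kittens'), (5, 1, 'dogs'),
        (6, 1, 'cats'), (6, 1, 'kittens');   -- product 6 is inactive
""")


def has(tag, params):
    """Hypothetical helper: HAVING fragment that is true when the group has `tag`.

    The tag is appended to `params` so it is bound as a query parameter
    instead of being concatenated into the SQL text.
    """
    params.append(tag.lower())
    return "(MAX(CASE WHEN lower(pt.value) = ? THEN 1 ELSE 0 END) = 1)"


# The tag type placeholder in WHERE comes first, so its value goes in first.
params = [1]

# "cats" AND ("kittens" OR "dogs") NOT "parrots"; the fragments are built
# left to right, so the parameter order matches the placeholder order.
parts = [has(t, params) for t in ("cats", "kittens", "dogs", "parrots")]
condition = "{} AND ({} OR {}) AND NOT {}".format(*parts)

sql = f"""
    SELECT p.id
    FROM product p
    JOIN product_tag pt ON pt.product_id = p.id
    WHERE p.status = 1 AND pt.tag_type_id = ?
    GROUP BY p.id
    HAVING {condition}
    ORDER BY p.id
"""
general = [row[0] for row in conn.execute(sql, params)]
print(general)
```

Because of the inner join, a product with no tags of the chosen type never forms a group, so a search made up only of exclusions would need a LEFT JOIN instead. Only the boolean structure is interpolated into the SQL string here; the user-supplied tag values stay in bound parameters.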