So I have a query that is trying to grab “related posts”.
Categories have a one-to-many relationship with posts. Tags have a many-to-many relationship. So my tables look roughly like this:
posts table:
id | category_id | ... | ...
tags table:
id | ... | ...
post_tag intermediate table:
id | post_id | tag_id | ... | ...
So if I have a single Post row already and want to grab its “related” posts, my logic is roughly this: I want to grab only posts that are in the same category, but order those posts by the number of tags that match the original post. So another post in the same category that has the exact same tags as the original post should be a very high match, whereas a post that only matches 3/4 of the tags will show up lower in the results.
Here is what I have so far:
SELECT *
FROM posts AS p
WHERE p.category_id=?
ORDER BY ( SELECT COUNT(id)
FROM post_tag AS i
WHERE i.tag_id IN( ? )
)
LIMIT 5
BINDINGS:
Initial post’s category ID;
Initial post’s tag IDs;
Clearly this is not going to actually order the results by the correct values in the sub-select. I am having trouble trying to think of how to join this to achieve the correct results.
Thanks in advance!
If I understood your question correctly, this is what you’re looking for:
BINDINGS: Initial posts.id;
You only have to specify the id of the current post in my version, so you don’t have to fetch the post’s tags beforehand and format them suitably for an IN clause.
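For anyone who wants to try the idea, here is a minimal sketch of a query along those lines, wrapped in Python's sqlite3 so it can be run as-is. It is not necessarily the exact query from this answer: the join shape, the `matching_tags` alias, the sample rows, and the helper names `build_demo_db` and `related_posts` are all assumptions for illustration. Only the table and column names come from the question. It joins posts to itself to find same-category candidates, then counts the tags each candidate shares with the current post.

```python
import sqlite3

# Assumed query shape: one placeholder (the current post's id).
# "cur" is the current post, "p" is a candidate in the same category.
RELATED_SQL = """
SELECT p.id, COUNT(shared.tag_id) AS matching_tags
FROM posts AS cur
JOIN posts AS p
  ON p.category_id = cur.category_id
 AND p.id <> cur.id
LEFT JOIN post_tag AS pt
  ON pt.post_id = p.id
LEFT JOIN post_tag AS shared
  ON shared.post_id = cur.id
 AND shared.tag_id = pt.tag_id
WHERE cur.id = ?
GROUP BY p.id
ORDER BY matching_tags DESC, p.id
LIMIT 5
"""


def build_demo_db():
    """Create an in-memory database with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE posts (id INTEGER PRIMARY KEY, category_id INTEGER);
        CREATE TABLE tags (id INTEGER PRIMARY KEY);
        CREATE TABLE post_tag (
            id INTEGER PRIMARY KEY,
            post_id INTEGER,
            tag_id INTEGER
        );
        INSERT INTO posts (id, category_id)
            VALUES (1, 1), (2, 1), (3, 1), (4, 2), (5, 1);
        INSERT INTO tags (id) VALUES (1), (2), (3), (4), (5);
        INSERT INTO post_tag (post_id, tag_id) VALUES
            (1, 1), (1, 2), (1, 3), (1, 4),   -- current post
            (2, 1), (2, 2), (2, 3), (2, 4),   -- same tags
            (3, 1), (3, 2), (3, 3), (3, 5),   -- three shared, one unrelated
            (4, 1), (4, 2), (4, 3), (4, 4);   -- same tags, other category
    """)
    return conn


def related_posts(conn, post_id):
    """Return (post id, shared tag count) pairs, best match first."""
    return conn.execute(RELATED_SQL, (post_id,)).fetchall()


if __name__ == "__main__":
    print(related_posts(build_demo_db(), 1))
    # [(2, 4), (3, 3), (5, 0)]
```

Post 4 has identical tags but sits in another category, so it is left out. Post 5 has no tags at all and still shows up with a count of 0 because of the LEFT JOINs. The sketch assumes each (post_id, tag_id) pair appears at most once in post_tag; duplicates would inflate the count.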
EDIT:
This should be a faster query, since it avoids joining posts twice. If you don’t like user variables, just replace every currentpostid with ? and bind post_id three times:
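Here is a hedged sketch of what the placeholder-only variant could look like, again run through Python's sqlite3 (SQLite has no user variables, so only the three-placeholder form is shown). The query shape, the `matching_tags` alias, the sample rows, and the helper names `build_demo_db` and `related_posts_subquery` are assumptions, not the original query. The current post's category and tags are looked up in subqueries, so posts is joined only once and the same post id is bound three times.

```python
import sqlite3

# Assumed query shape: the same post id is bound to all three placeholders.
RELATED_SQL = """
SELECT p.id, COUNT(pt.tag_id) AS matching_tags
FROM posts AS p
LEFT JOIN post_tag AS pt
  ON pt.post_id = p.id
 AND pt.tag_id IN (SELECT tag_id FROM post_tag WHERE post_id = ?)
WHERE p.category_id = (SELECT category_id FROM posts WHERE id = ?)
  AND p.id <> ?
GROUP BY p.id
ORDER BY matching_tags DESC, p.id
LIMIT 5
"""


def build_demo_db():
    """Create an in-memory database with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE posts (id INTEGER PRIMARY KEY, category_id INTEGER);
        CREATE TABLE tags (id INTEGER PRIMARY KEY);
        CREATE TABLE post_tag (
            id INTEGER PRIMARY KEY,
            post_id INTEGER,
            tag_id INTEGER
        );
        INSERT INTO posts (id, category_id)
            VALUES (1, 1), (2, 1), (3, 1), (4, 2), (5, 1);
        INSERT INTO tags (id) VALUES (1), (2), (3), (4), (5);
        INSERT INTO post_tag (post_id, tag_id) VALUES
            (1, 1), (1, 2), (1, 3), (1, 4),   -- current post
            (2, 1), (2, 2), (2, 3), (2, 4),   -- same tags
            (3, 1), (3, 2), (3, 3), (3, 5),   -- three shared, one unrelated
            (4, 1), (4, 2), (4, 3), (4, 4);   -- same tags, other category
    """)
    return conn


def related_posts_subquery(conn, post_id):
    """Bind post_id three times and return (post id, shared tag count)."""
    params = (post_id, post_id, post_id)
    return conn.execute(RELATED_SQL, params).fetchall()


if __name__ == "__main__":
    print(related_posts_subquery(build_demo_db(), 1))
    # [(2, 4), (3, 3), (5, 0)]
```

Whether this is actually faster than the self-join depends on your database and indexes, so it is worth checking the query plan on real data. An index on post_tag (post_id, tag_id) would be the first thing to try.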