I currently have two tables, one with documents and another with ratings:
doc_id | doc_groupid | doc_name | doc_time
and then
rating_id | rating_docid | rating_score
where rating_score is either -1 or 1.
What I need to do is have a single query that retrieves every column in the document table WHERE groupid = #, but also has columns which aggregate the ratings. I can retrieve a list of ratings using
SELECT rating_docid,
SUM(CASE WHEN rating_score = 1 THEN 1 ELSE 0 END) AS UpVotes,
SUM(CASE WHEN rating_score = -1 THEN 1 ELSE 0 END) AS DownVotes
FROM ratings -- ratings table; the name "ratings" is assumed here
GROUP BY rating_docid
Which gives me a list of documents (as long as they have been rated) and how many upvotes or downvotes they have. I can also obviously very easily get a list of documents with
SELECT * FROM documents WHERE doc_groupid = #
But I have no idea how to do this without a subquery (that is, using JOIN or LEFT JOIN); my understanding is that a subquery is too slow. Honestly, I have no idea how to do this with a subquery either.
So my question is:
- How can I do this with a speedy join?
- How can I do this with a subquery?
Thanks!
Use:
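A minimal sketch of a LEFT JOIN that does this, assuming the ratings table is named ratings (the question does not name it), that the score column is rating_score as in the schema, and that 1 stands in for the group id:

```sql
-- Assumption: the ratings table is called "ratings"; 1 is a placeholder group id.
SELECT d.doc_id,
       d.doc_groupid,
       d.doc_name,
       d.doc_time,
       -- For unrated documents r.rating_score is NULL, so both CASEs fall to ELSE 0
       SUM(CASE WHEN r.rating_score = 1 THEN 1 ELSE 0 END) AS UpVotes,
       SUM(CASE WHEN r.rating_score = -1 THEN 1 ELSE 0 END) AS DownVotes
  FROM documents d
  LEFT JOIN ratings r ON r.rating_docid = d.doc_id
 WHERE d.doc_groupid = 1
 GROUP BY d.doc_id, d.doc_groupid, d.doc_name, d.doc_time;
```

The LEFT JOIN keeps documents that have no ratings yet; an INNER JOIN would drop them from the result.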
The doc_time is odd to me, makes me think you can have duplicates but with different time values…
JOIN vs Subquery
JOINs (INNER and OUTER) are not subqueries. To make things more complicated, subqueries can mean:
a query in the SELECT clause (AKA sub-select):
a query in the WHERE or HAVING clauses:
a query in the JOIN (AKA derived table, inline view):
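The three forms listed above might look roughly like this against the tables from the question. These are sketches only: the table name ratings and the placeholder group id 1 are assumptions, not something the question states.

```sql
-- 1. Subquery in the SELECT clause (sub-select), correlated to each document row
SELECT d.*,
       (SELECT COUNT(*)
          FROM ratings r
         WHERE r.rating_docid = d.doc_id
           AND r.rating_score = 1) AS UpVotes
  FROM documents d
 WHERE d.doc_groupid = 1;

-- 2. Subquery in the WHERE clause: only documents that have at least one rating
SELECT d.*
  FROM documents d
 WHERE d.doc_groupid = 1
   AND d.doc_id IN (SELECT r.rating_docid FROM ratings r);

-- 3. Subquery in the JOIN (derived table / inline view)
--    Unrated documents get NULL in UpVotes and DownVotes here.
SELECT d.*, x.UpVotes, x.DownVotes
  FROM documents d
  LEFT JOIN (SELECT rating_docid,
                    SUM(CASE WHEN rating_score = 1 THEN 1 ELSE 0 END) AS UpVotes,
                    SUM(CASE WHEN rating_score = -1 THEN 1 ELSE 0 END) AS DownVotes
               FROM ratings
              GROUP BY rating_docid) x ON x.rating_docid = d.doc_id
 WHERE d.doc_groupid = 1;
```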
There’s no hard’n’fast rule about one being better than the other because it all depends on:
All that really matters is that the work is done in as few passes over a table as necessary, ideally one.