I have two tables orders, and line_items with following structure:
Orders (id = PK, indexes on user_id)
-------------------------------------
id user_id
== ======
1 1
2 2
3 1
4 3
5 1
LineItems (id = PK, indexes on order_id and product_id)
id order_id product_id quantity
== ====== ======== ======
1 1 1 1
2 1 2 2
3 2 1 4
4 2 3 6
5 3 1 1
6 4 1 1
7 5 1 1
I am trying to find the most efficient way to solve the following requirements:
-
Given a
userand aproductfind theLineItemsbelonging toOrderswhere given product is the only product ordered.
E.g: If user_id is 1 and product_id is 1, the query should return line items 5 and 7. -
Given a
userand aproductfind theOrderswhere given product is the only product ordered.
E.g: If user_id is 1 and product_id is 1, the query should return orders 3 and 5.
The Orders and LineItems table can have millions of rows.
I have a working solution that uses COUNT and HAVING. I am not certain that this is the most efficient solution.
Specifically, I am wondering if this can be addressed by using the technique outlined by Cletus in this answer.
Note:
I am using Orders and LineItems tables to describe the scenario. My actual table is quite different and it not related to order etc.
Edit 2
Is this query efficient than using GROUP BY and HAVING?
SELECT A.id
FROM LineItems A
JOIN Orders B ON B.id = A.order_id AND B.user_id = 1
LEFT OUTER JOIN LineItems C ON C.order_id = A.order_id AND
C.product_id != A.product_id
WHERE A.product_id = 1 AND C.id IS NULL
GROUP BY, HAVING, COUNT is the most efficient for this type of query. Basically it will scan the required data fully, but only within the user’s orders, but in that single pass will produce the result.
You can kill two birds with one stone, since for orders with a single line item, min(i.id) gives you the (only) LineItemID.
Indexes you NEED to have:
orders.user_id,lineitems.order_id