I have an STI-based model called Buyable, with two subclasses, Basket and Item. The attributes of concern here for Buyable are:
- shop_week_id
- location_id
- parent_id
There’s a parent-child relationship between Basket and Item. parent_id is always nil for a basket, but an item can belong to a basket by referencing the basket’s unique id. So a basket has_many items, and an item belongs_to a basket.
I need a method on the basket model that:
Returns true or false depending on whether there are any other baskets in the table with both the same number and the same types of items. Items are considered to be the same type when they share the same shop_week_id and location_id.
For example:
Given a basket (uid = 7) with 2 items:
item #1
- id = 3
- shop_week_id = 13
- location_id = 103
- parent_id = 7
item #2
- id = 4
- shop_week_id = 13
- location_id = 204
- parent_id = 7
Return true if there are any other baskets in the table that contain exactly 2 items, with one item having a shop_week_id = 13 and location_id = 103 and the other having a shop_week_id = 13 and location_id = 204. Otherwise return false.
How would you approach this problem? This goes without saying, but I am looking for a very efficient solution.
To clarify how I’m reading the somewhat vague description of the columns in the “buyable” table: the “Parent_ID” is the basket in question. The “Shop_Week_ID” limits which baskets are compared, so a basket from week 1 is not compared with baskets from week 2 or week 3. The “ID” column appears to be a sequential row ID in the table, not the identity of the item to be compared. The “Location_ID” appears to be the common “Item”. In the scenario, assuming a shopping cart, Location_ID = 103 could be “Computer” and Location_ID = 204 could be “Television” (just my interpretation of the data). If this is incorrect, minor adjustments may be needed, and it would help if the original poster showed a list of, say, a dozen rows of data to show the proper correlation.
Now, on to my query. I’m using STRAIGHT_JOIN so that the tables are joined in the order I’ve listed.
The first query, aliased “MainBasket”, is used exclusively to count how many items are in the basket in question, ONCE, so the count doesn’t need to be re-joined or re-queried for each possible matching basket. There is no “ON” clause because this query returns a single record, so joining it has no Cartesian impact on the row count, and I want this COUNT(*) value applied to EVERY record in the final result.
The NEXT query, aliased “SameWeekSimilar”, finds each DISTINCT OTHER basket that has at LEAST ONE “Location_ID” (Item) in common with the basket in question, within the same week. The baskets found this way could have fewer, the same number of, or more entries than the original basket. But if there are 100 baskets and only 18 have at least 1 entry that matches an item in the original basket, you’ve just significantly cut down the number of baskets for the final comparison.
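To illustrate what this pre-filter does, here is a minimal plain-Ruby sketch that works on in-memory rows rather than the real table. The `Row` struct, the sample data, and the `same_week_similar` method name are all hypothetical stand-ins for illustration, not part of the original query.

```ruby
# Hypothetical in-memory stand-in for item rows of the buyable table.
Row = Struct.new(:id, :shop_week_id, :location_id, :parent_id)

rows = [
  Row.new(3, 13, 103, 7),   # basket 7 is the basket in question
  Row.new(4, 13, 204, 7),
  Row.new(5, 13, 103, 8),   # basket 8 shares items in week 13
  Row.new(6, 13, 204, 8),
  Row.new(7, 13, 999, 9),   # basket 9 shares nothing
  Row.new(8, 14, 103, 10)   # basket 10 has location 103, but in week 14
]

# Returns the distinct ids of OTHER baskets having at least one item that
# matches an item of `basket_id` on both shop_week_id and location_id.
def same_week_similar(rows, basket_id)
  wanted = rows.select { |r| r.parent_id == basket_id }
               .map { |r| [r.shop_week_id, r.location_id] }

  rows.reject { |r| r.parent_id == basket_id }   # never compare to itself
      .select { |r| wanted.include?([r.shop_week_id, r.location_id]) }
      .map(&:parent_id)
      .uniq
end

p same_week_similar(rows, 7)   # => [8]
```

Baskets 9 and 10 drop out here, so only basket 8 goes on to the count comparison.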
Finally, there is a join back to the buyable table, driven by the SameWeekSimilar result, that picks up every item of each “other” basket that had a close match. It doesn’t match on specific items, only on the basket. The query behind SameWeekSimilar has already pre-qualified the same week and at least one item matching the original basket, and it specifically excludes the original basket so it isn’t compared to itself.
By grouping at the outer level on SameWeekSimilar.NextBasket, we get the count of actual items in each of those baskets. Since MainBasket is a simple Cartesian join of a single record, we just pick up the original basket’s count alongside it.
Last comes the HAVING clause. Since it is applied AFTER the “COUNT(*)”, we know how many items were in each “Other” basket and how many in the “Main” basket. So the HAVING clause keeps only those baskets where the two counts are the same.
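The grouping and HAVING steps can be mimicked in plain Ruby to see what survives the final filter. This is only an in-memory sketch: the `Row` struct, the sample rows, and the `baskets_with_same_count` method are hypothetical, and the candidate list is assumed to be whatever the SameWeekSimilar step produced.

```ruby
# Hypothetical in-memory stand-in for item rows of the buyable table.
Row = Struct.new(:id, :shop_week_id, :location_id, :parent_id)

rows = [
  Row.new(3,  13, 103, 7),    # basket 7: the basket in question, 2 items
  Row.new(4,  13, 204, 7),
  Row.new(5,  13, 103, 8),    # basket 8: 2 items
  Row.new(6,  13, 204, 8),
  Row.new(9,  13, 103, 11),   # basket 11: 3 items
  Row.new(10, 13, 204, 11),
  Row.new(11, 13, 305, 11)
]

# `candidates` plays the role of SameWeekSimilar: other baskets already known
# to share at least one item with `basket_id` in the same week.
def baskets_with_same_count(rows, basket_id, candidates)
  main_count = rows.count { |r| r.parent_id == basket_id }  # MainBasket count

  rows.select { |r| candidates.include?(r.parent_id) }      # join back to items
      .group_by(&:parent_id)                                 # GROUP BY NextBasket
      .select { |_basket, items| items.size == main_count }  # HAVING counts equal
      .keys
end

p baskets_with_same_count(rows, 7, [8, 11])   # => [8]
```

Dropping the `select` that plays the part of HAVING would return both 8 and 11, which is the same experiment as running the SQL without its HAVING clause.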
If you want to verify what I’m describing, run this against your table but DO NOT include the HAVING clause. You’ll see all the POSSIBLE baskets. Then re-add the HAVING clause and see which ones DO have the same count.
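One thing to check while testing: as described, the filters guarantee only that the other basket shares at least one item and has the same item count. A basket holding 103 and 999 would therefore pass against a basket holding 103 and 204. If you need an exact match, a stricter check is to compare the full sorted list of (shop_week_id, location_id) pairs. A minimal plain-Ruby sketch of that idea follows, with a hypothetical `Item` struct and hypothetical method names rather than anything from the original models:

```ruby
# Hypothetical in-memory stand-in for item rows of the buyable table.
Item = Struct.new(:shop_week_id, :location_id, :parent_id)

# Sorted list of [shop_week_id, location_id] pairs for one basket.
# Two baskets hold the same items exactly when their signatures are equal.
def signature(items, basket_id)
  items.select { |i| i.parent_id == basket_id }
       .map { |i| [i.shop_week_id, i.location_id] }
       .sort
end

# True if any other basket has exactly the same signature as `basket_id`.
def duplicate_basket?(items, basket_id)
  target = signature(items, basket_id)
  others = items.map(&:parent_id).uniq - [basket_id]
  others.any? { |other| signature(items, other) == target }
end

items = [
  Item.new(13, 103, 7), Item.new(13, 204, 7),   # basket in question
  Item.new(13, 103, 8), Item.new(13, 999, 8),   # same count, one shared item
  Item.new(13, 204, 9), Item.new(13, 103, 9)    # same items, different order
]

p duplicate_basket?(items, 7)   # => true, because of basket 9
p duplicate_basket?(items, 8)   # => false, no basket matches 103 + 999
```

This does the comparison in application memory, so on a large table it would still make sense to narrow the candidates in SQL first and run the exact comparison only on the survivors.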