Update: After playing around with this for a few hours, went with a multi-query solution and used a table that only contained parent attributes to determine which items needed updating.
Sorry for the poor title, I couldn’t think how to concisely describe this problem.
I have a set of items that should have a 1-to-1 relationship with an attribute.
I have a query to return those rows where the data is wrong and this relationship has been broken (1-to-many). I’m gathering these rows to fix them and restore this 1-to-1 relationship.
This is a theoretical simplification of my actual problem but I’ll post example table schema here as it was requested.
item table:
+------------+------------+-----------+
| item_id | name | attr_id |
+------------+------------+-----------+
| 1 | BMW 320d | 20 |
| 1 | BMW 320d | 21 |
| 2 | BMW 335i | 23 |
| 2 | BMW 335i | 34 |
+------------+------------+-----------+
attribute table:
+---------+-----------------+------------+
| attr_id | value | parent_id |
+---------+-----------------+------------+
| 20 | SE | 21 |
| 21 | M Sport | 0 |
| 23 | AC | 24 |
| 24 | Climate control | 0 |
....
| 34 | Leather seats | 0 |
+---------+-----------------+------------+
A simple query to return items with more than one attribute.
SELECT item_id, COUNT(DISTINCT(attr_id)) AS attributes
FROM item GROUP BY item_id HAVING attributes > 1
This gets me a result set like so:
+-----------+------------+
| item_id | attributes |
+-----------+------------+
| 1 | 2 |
| 2 | 2 |
| 3 | 2 |
-- etc. --
However, there’s an exception. The attribute table can hold a tree structure, via parent links in the table. For certain rows, parent_id can hold the ID of another attribute. There’s only one level to this tree. Example:
+---------+-----------------+------------+
| attr_id | value | parent_id |
+---------+-----------------+------------+
| 20 | SE | 21 |
| 21 | M Sport | 0 |
....
I do not want to retrieve items in my original query where, for a pair of associated attributes, they related like attributes 20 & 21.
I do want to retrieve items where:
- the attributes have no parent
- for two or more attributes they are not related (e.g. attributes 23 & 34)
Example result desired, just the item ID:
+------------+
| item_id |
+------------+
| 2 |
+------------+
How can I join against attributes from items and exclude these rows?
Do I use a temporary table or can I achieve this from a single query?
Thanks.
The following query will extract only unique pairs of item and attribute (or its parent if any), thus eliminating the duplicates (this is per your request that an attribute can have only one parent, and the parent has no parents).
So, using the above query as a subtable for your counting query will work (same approach I used with the A subtable above):
I have added 3 more rows to your example items table, to accommodate the case, where an item can be linked only to an attribute with parent, but not the parent itself (i.e. item 3 -> 23 and 3 -> 20), and 4 -> 23.
Running the above query lists only items 2 and 3 with 2 attributes each.