I apologize in advance for the length. The solution may well be trivial; I just wanted to be as informative as I could.
The Tables
I have two tables of note, items and products, in a one-to-many relationship. One item can have multiple products, which are variations in color and material. Brand is an external category table that doesn’t have much part to play in this select statement.
So an item is, for example, a specific shoe, e.g. a “park avenue” shoe.
A product is, for example, merlot burnished calfskin.
And the brand would just be Allen Edmonds.
Overall you get an Allen Edmonds park avenue shoe in merlot burnished calfskin.
Missing results in a “show almost everything” search
Someone decided to create a manual flag to associate the default color and material with a shoe, so that when you search, each type of shoe only shows up once, and when you click on it you can find its other colors and materials. That’s fine, but some shoes have no default material and color set. As an unfortunate result, items without at least one default product set don’t show up in the search.
Current Select Statement
Here is the current select, which filters out everything that doesn’t have a default manually set:
SELECT DISTINCT items.ItemId
, items.Name
, items.BrandCategoryId
, items.CatalogPage
, items.GenderId
, items.PriceRetail
, items.PriceSell
, items.PriceHold
, items.Descr
, items.FlagStatus as ItemFlagStatus
, products.ImagetnURL
, products.FlagDefault
, products.ProductId
, products.Code as ProductCode
, products.Name as ProductName
, brands.Name as BrandName
FROM items
, products
, brands
WHERE items.ItemId = products.ItemId
AND items.BrandCode = brands.Code
AND items.FlagStatus != 'U'
AND products.FlagStatus != 'U'
AND products.FlagDefault = 'Y';
Not my choice of code. I suspect that the “DISTINCT” part of that statement is a bad idea, but I’m not exactly clear on how to get rid of it.
The big problem I’m having right now, though, is that final line
AND products.FlagDefault = 'Y'
that filters out everything that doesn’t have at least one manual default set.
Edit: Here’s an explain for the query:
+----+-------------+----------+--------+-----------------------------------------------------------+---------+---------+-------------------------+-------+--------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+--------+-----------------------------------------------------------+---------+---------+-------------------------+-------+--------------------------------+
| 1 | SIMPLE | brands | ALL | NULL | NULL | NULL | NULL | 38 | Using temporary |
| 1 | SIMPLE | products | ALL | FlagStatus,FlagStatus_2,FlagStatus_3,flagstatusanddefault | NULL | NULL | NULL | 16329 | Using where; Using join buffer |
| 1 | SIMPLE | items | eq_ref | PRIMARY,BrandCode,FlagStatus,FlagStatus_2,FlagStatus_3 | PRIMARY | 4 | sherman.products.ItemId | 1 | Using where |
+----+-------------+----------+--------+-----------------------------------------------------------+---------+---------+-------------------------+-------+--------------------------------+
3 rows in set (0.01 sec)
And here is a describe on products, items, and brands:
mysql> describe products;
+-------------+--------------+------+-----+-------------------+-----------------------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+--------------+------+-----+-------------------+-----------------------------+
| ProductId | int(11) | NO | PRI | NULL | auto_increment |
| ItemId | int(11) | YES | | NULL | |
| Code | varchar(15) | YES | MUL | NULL | |
| Name | varchar(100) | YES | | NULL | |
| MaterialId | int(11) | YES | MUL | NULL | |
| PriceRetail | decimal(6,2) | YES | | NULL | |
| PriceSell | decimal(6,2) | YES | | NULL | |
| PriceHold | decimal(6,2) | YES | | NULL | |
| Cost | decimal(6,2) | YES | | NULL | |
| FlagDefault | char(1) | NO | | N | |
| FlagStatus | char(1) | YES | MUL | NULL | |
| ImagetnURL | varchar(50) | YES | | NULL | |
| ImagefsURL | varchar(50) | YES | | NULL | |
| ImagelsURL | varchar(50) | YES | | NULL | |
| DateStatus | timestamp | NO | | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
| DateCreated | timestamp | YES | | NULL | |
+-------------+--------------+------+-----+-------------------+-----------------------------+
16 rows in set (0.02 sec)
mysql> describe items
-> ;
+-----------------+--------------+------+-----+-------------------+-----------------------------+
| Field | Type | Null | Key | Default | Extra |
+-----------------+--------------+------+-----+-------------------+-----------------------------+
| ItemId | int(11) | NO | PRI | NULL | auto_increment |
| Code | varchar(25) | YES | | NULL | |
| Name | varchar(100) | YES | MUL | NULL | |
| BrandCode | char(2) | YES | MUL | NULL | |
| CatalogPage | int(3) | YES | | NULL | |
| BrandCategoryId | int(11) | YES | | NULL | |
| TypeId | int(11) | YES | MUL | NULL | |
| StyleId | int(11) | YES | MUL | NULL | |
| GenderId | int(11) | YES | MUL | NULL | |
| PriceRetail | decimal(6,2) | YES | | NULL | |
| PriceSell | decimal(6,2) | YES | | NULL | |
| PriceHold | decimal(6,2) | YES | | NULL | |
| Cost | decimal(6,2) | YES | | NULL | |
| PriceNote | longtext | YES | | NULL | |
| FlagTaxable | char(1) | YES | | NULL | |
| FlagStatus | char(1) | YES | MUL | NULL | |
| FlagFeatured | char(1) | YES | | NULL | |
| MaintFlagStatus | char(1) | YES | | NULL | |
| Descr | longtext | YES | | NULL | |
| DescrNote | longtext | YES | | NULL | |
| ImagetnURL | varchar(50) | YES | | NULL | |
| ImagefsURL | varchar(50) | YES | | NULL | |
| ImagelsURL | varchar(50) | YES | | NULL | |
| DateCreated | date | NO | | 0000-00-00 | |
| DateStatus | timestamp | NO | | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
+-----------------+--------------+------+-----+-------------------+-----------------------------+
25 rows in set (0.00 sec)
mysql> describe brands;
+--------------+------------------+------+-----+-------------------+-----------------------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+------------------+------+-----+-------------------+-----------------------------+
| BrandId | int(11) unsigned | NO | PRI | NULL | auto_increment |
| Code | varchar(6) | YES | | NULL | |
| PriceCode | varchar(4) | YES | | NULL | |
| Name | varchar(50) | YES | | NULL | |
| WebsiteURL | varchar(50) | YES | | NULL | |
| LogoURL | varchar(50) | YES | | NULL | |
| LogoTopURL | varchar(50) | YES | | NULL | |
| BrandURL | varchar(50) | YES | | NULL | |
| Descr | longtext | YES | | NULL | |
| DescrShort | longtext | YES | | NULL | |
| BeltDescr | longtext | YES | | NULL | |
| ImageURL | varchar(50) | YES | | NULL | |
| SaleImageURL | varchar(50) | YES | | NULL | |
| SaleCode | varchar(6) | YES | | NULL | |
| SaleDateBeg | date | YES | | NULL | |
| SaleDateEnd | date | YES | | NULL | |
| FlagStatus | char(1) | YES | | NULL | |
| DateStatus | timestamp | NO | | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
| DateCreated | timestamp | YES | | NULL | |
+--------------+------------------+------+-----+-------------------+-----------------------------+
19 rows in set (0.00 sec)
Possibilities that I am exploring
Subselect that grinds everything to a halt
I have a select statement that might work in a perfect, zero-execution-time world: it selects the first product for each item, ordered by that FlagDefault field, e.g.:
AND products.ProductId =
    (SELECT p.ProductId
     FROM products p
     WHERE p.ItemId = items.ItemId
       AND p.FlagStatus != 'U'
     -- the comparison yields 1 for defaults and 0 otherwise, so sort descending to put defaults first
     ORDER BY p.FlagDefault = 'Y' DESC
            , p.ProductId -- tie-break; ItemId is identical for every row in this subselect
     LIMIT 1);
replacing the check for a manually toggled default with an id that’s only ordered by default, even if it’s not toggled, and only takes the first result.
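One detail worth checking in a subselect like this is the sort direction. The comparison FlagDefault = 'Y' evaluates to 1 or 0, so an ascending sort puts non-default rows first and only a descending sort picks the default. Below is a minimal sketch of that behaviour, using SQLite through Python's sqlite3 module as a self-contained stand-in for MySQL. The sample rows, the status value 'A', and the helper name first_product are made up for illustration.

```python
import sqlite3

# In-memory stand-in for the products table (only the relevant columns).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (ProductId INTEGER PRIMARY KEY, ItemId INTEGER, "
    "FlagDefault TEXT, FlagStatus TEXT)"
)
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?, ?)",
    [
        (1, 10, "N", "A"),  # non-default; 'A' is a hypothetical "visible" status
        (2, 10, "Y", "A"),  # the hand-picked default
        (3, 10, "N", "A"),  # non-default
    ],
)


def first_product(direction):
    """Return the ProductId picked for item 10 with the given sort direction."""
    # direction is a fixed keyword ("ASC" or "DESC"), never user input.
    sql = (
        "SELECT ProductId FROM products "
        "WHERE ItemId = ? AND FlagStatus != 'U' "
        f"ORDER BY FlagDefault = 'Y' {direction}, ProductId LIMIT 1"
    )
    return conn.execute(sql, (10,)).fetchone()[0]


asc_pick = first_product("ASC")    # 0 sorts before 1, so a non-default row wins
desc_pick = first_product("DESC")  # 1 sorts first, so the default row wins
print(asc_pick, desc_pick)
```

This only illustrates the ordering; it says nothing about why the subselect is slow on the real tables.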
That statement grinds to a halt, and actually causes other MySQL statements from the site to deadlock (I suppose because reading those tables makes them unavailable elsewhere).
Join that makes sure one table is distinct and not the next?
One way to get around it that might work is doing a:
select distinct ItemId from products ORDER BY default
And then going further to obtain data for those ItemIds specifically. But I’m not sure how to make that happen in a single statement, or how to join select distincts well. I also expect that making that select “distinct” in the first place isn’t ideal, since it selects more than is needed and then cuts the results down afterwards, but I don’t really have a better alternative for determining distinctness.
Advice?
In general, the select statement could use a lot of improvement, and specifically I could really use some advice on how to filter down the results for the most specific table and only -then- join upstream to the table that is the “one” in the one to many relationship.
Remove from WHERE:
Add to FROM:
Add to WHERE:
I’m using the term “non-hidden” for rows which have FlagStatus != 'U', since I’m assuming that’s what the flag is for.
The first SELECT gives the ProductId of all default products, and the second one gives a ProductId for all the items without a default product. Hidden items are filtered by both, so if a default product has been hidden, a non-default product is displayed instead. When concatenated, you get a ProductId for every item that has some non-hidden product.
I’m assuming FlagDefault can only have values 'Y' or 'N'. The second query filters out the items having a default product by using MAX(FlagDefault), which works because 'Y' > 'N'.
By joining this to the products table of the original query, instead of filtering with FlagDefault, you should get the same results as the original, except you also get one row for every item which does not have a default product.
I’ve tested this query, but I haven’t tested it with your original one since I don’t have any meaningful data (read: your data) to test it against. This one works, so the combination should also work. For the same reason, I don’t have any real numbers about performance – and I’m not an expert on query performance, either (more like a newbie). However, from what I’ve heard, subqueries in the WHERE clause are supposed to be bad for the performance, but in the FROM clause they should be okay. So, test it, I hope it’s fast enough and fits the job.
As others mentioned, if you haven’t got an index for the products.ItemId and BrandCode columns, you should definitely add them. You should also consider if requiring every item to have one hand-picked default would be okay, or maybe ditching the hand-picked defaults and always using random ones. Another thing to consider is if you really need the data from a product when there is no default – could you live without the image url, product name (use the item name?) and product code for those products?
Edit: One more possibility: You could change products.FlagDefault to items.DefaultProductId. That way it’d be easier to find out if an item has a default product and it enforces only one default product per item.
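To make the two-SELECT idea described above concrete, here is a minimal, self-contained sketch. It is a reconstruction from the description, not necessarily the exact query: the first SELECT picks non-hidden default products, the second picks one non-hidden product for each item whose MAX(FlagDefault) is 'N', and the UNION of the two is joined back to products. It runs on SQLite through Python's sqlite3 module as a stand-in for MySQL. The sample rows, the status value 'A', the alias chosen, and the use of MIN(ProductId) to pick the fallback are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE items (ItemId INTEGER PRIMARY KEY, Name TEXT, FlagStatus TEXT);
    CREATE TABLE products (
        ProductId INTEGER PRIMARY KEY, ItemId INTEGER, Name TEXT,
        FlagDefault TEXT NOT NULL DEFAULT 'N', FlagStatus TEXT
    );
    -- 'A' is a hypothetical "visible" status; 'U' means hidden.
    INSERT INTO items VALUES
        (1, 'Park Avenue', 'A'),   -- has a visible default
        (2, 'Strand', 'A'),        -- no default set at all
        (3, 'Hidden item', 'U'),   -- the item itself is hidden
        (4, 'Fifth Avenue', 'A');  -- its default product is hidden
    INSERT INTO products VALUES
        (101, 1, 'Merlot calfskin', 'Y', 'A'),
        (102, 1, 'Black calfskin',  'N', 'A'),
        (201, 2, 'Walnut calfskin', 'N', 'A'),
        (202, 2, 'Brown suede',     'N', 'A'),
        (301, 3, 'Black calfskin',  'Y', 'A'),
        (401, 4, 'Black calfskin',  'Y', 'U'),
        (402, 4, 'Brown calfskin',  'N', 'A');
    """
)

QUERY = """
SELECT items.ItemId, items.Name, products.ProductId, products.Name
FROM items
JOIN products ON products.ItemId = items.ItemId
JOIN (
    -- every non-hidden default product
    SELECT ProductId FROM products
    WHERE FlagDefault = 'Y' AND FlagStatus != 'U'
    UNION
    -- one non-hidden product per item that has no non-hidden default;
    -- MAX(FlagDefault) = 'N' works because 'Y' > 'N'
    SELECT MIN(ProductId) FROM products
    WHERE FlagStatus != 'U'
    GROUP BY ItemId
    HAVING MAX(FlagDefault) = 'N'
) AS chosen ON chosen.ProductId = products.ProductId
WHERE items.FlagStatus != 'U'
ORDER BY items.ItemId
"""

rows = conn.execute(QUERY).fetchall()
picked = {item_id: product_id for item_id, _, product_id, _ in rows}
for row in rows:
    print(row)
```

With this sample data, item 1 keeps its default, items 2 and 4 fall back to a non-default product, and the hidden item 3 is dropped. If an item has more than one non-hidden default, the first SELECT still returns several rows for it, just like the original query.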