OK, so I have some data in a product table whose category ID column holds only 2 of the up to 4 characters used in the category list table.
Example: a product has category ID ‘AA’ but could be from either the Desktops (AAA) or Servers (AAB) category.
In the category table we have 3 columns that contain info to help us: include, atrID and valID. The valID column contains a comma-separated list of one or more value IDs (e.g. ‘K01819’ or ‘K00846,K00851’). The atrID column contains a single ID in a similar format (e.g. ‘A00432’). The include column is either 1 or 0, based on some work done by a programmer who came and went long ago.
Basically he had trawled through the various tables and found that for certain categories (e.g. AA) we could join another table from the category table and use the atrID and list of valIDs to tie it all in.
Example:
Product “a” has a catID of AA and is a desktop machine. This belongs in category ‘AAA’ which has 0 for include, ‘A00432’ for atrID and ‘K01819’ for valID.
Example SQL to get category info:
SELECT cat.atrID, cat.valID, cat.[include]
FROM db.dbo.cat AS cat
WHERE cat.catID = 'AAA'
I would then save these values in the .NET code and build the SQL depending on the value of include. I would also split the valID up and pass the pieces in as individual parameters, resulting in the following code being executed on the SQL server.
SELECT DISTINCT prod.prodID
FROM db.dbo.prod AS prod
INNER JOIN db.dbo.atr AS atr ON atr.prodID = prod.prodID
WHERE prod.catID = 'AA'
AND atr.atrID = 'A00432'
AND atr.valID NOT IN('K01819')
To change this for servers I would change the NOT IN to an IN as servers (AAB) have an include value of 1. Example:
SELECT DISTINCT prod.prodID
FROM db.dbo.prod AS prod
INNER JOIN db.dbo.atr AS atr ON atr.prodID = prod.prodID
WHERE prod.catID = 'AA'
AND atr.atrID = 'A00432'
AND atr.valID IN('K01819')
I use IN and NOT IN because some of the categories have more than one value in the valID column.
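The build step described above can be sketched roughly like this. It is an illustration only: Python and an in-memory SQLite database stand in for the .NET code and SQL Server, the table layout is reduced to the columns used here, and build_product_query and products_for are made-up names.

```python
import sqlite3


def build_product_query(cat_prefix, atr_id, val_csv, include):
    """Build (sql, params) that lists the products of one category row."""
    val_ids = [v.strip() for v in val_csv.split(",") if v.strip()]
    operator = "IN" if include else "NOT IN"        # include = 1 -> IN, 0 -> NOT IN
    placeholders = ", ".join("?" for _ in val_ids)  # one parameter per valID
    sql = (
        "SELECT DISTINCT prod.prodID "
        "FROM prod "
        "INNER JOIN atr ON atr.prodID = prod.prodID "
        "WHERE prod.catID = ? AND atr.atrID = ? "
        f"AND atr.valID {operator} ({placeholders}) "
        "ORDER BY prod.prodID"
    )
    return sql, [cat_prefix, atr_id, *val_ids]


# Scratch database holding the sample rows from the question (assumed layout).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE prod (prodID TEXT, catID TEXT);
    CREATE TABLE atr  (prodID TEXT, catID TEXT, atrID TEXT, valID TEXT);
    INSERT INTO prod VALUES ('S101010','AA'), ('S202020','AA'),
                            ('S303030','ED'), ('S404040','ED');
    INSERT INTO atr VALUES
        ('S101010','AA','A00432','K01817'),
        ('S202020','AA','A00432','K01819'),
        ('S303030','ED','A00172','K00846'),
        ('S303030','ED','A00172','K00851'),
        ('S404040','ED','A00172','K00845');
""")


def products_for(cat_prefix, atr_id, val_csv, include):
    """Run the generated query and return the matching product IDs."""
    sql, params = build_product_query(cat_prefix, atr_id, val_csv, include)
    return [row[0] for row in conn.execute(sql, params)]


desktops = products_for("AA", "A00432", "K01819", 0)        # category AAA
servers = products_for("AA", "A00432", "K01819", 1)         # category AAB
lasers = products_for("ED", "A00172", "K00846,K00851", 1)   # category EDA
```

Only the IN / NOT IN keyword and the number of placeholders are spliced into the SQL text; the IDs themselves always travel as parameters.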
My question is: if I want to get the category for a specific product without knowing it beforehand, is it even possible? (Short of pulling the list of products for each category I think it might be in and checking which list contains it.) I’ve been telling my boss I can’t figure it out for days now and I’m not getting anywhere. Any help would be greatly appreciated.
EDIT:
What I’m trying to do is get the category for any product. I included the above SQL as an example of how I get a list of products for the Desktop and Server categories.
I want the reverse, i.e. getting the 3 letter category from the categories table when I only have the two letter category from the product table.
EDIT 2:
Product table columns:
prodID varchar(40)
catID char(2)
Sample rows:
‘S101010’, ‘AA’ (desktop)
‘S202020’, ‘AA’ (server)
‘S303030’, ‘ED’ (laser printer)
‘S404040’, ‘ED’ (inkjet printer)
Category table columns:
catID varchar(4)
description varchar(50)
include bit (actually a tinyint which should go to show what I deal with on a daily basis =P)
atrID varchar(6)
valID varchar(250)
Sample rows:
‘AAA’, ‘Desktops’, 0, ‘A00432’, ‘K01819’
‘AAB’, ‘Servers’, 1, ‘A00432’, ‘K01819’
‘EDA’, ‘Laser Printers’, 1, ‘A00172’, ‘K00846,K00851’
‘EDB’, ‘Inkjet Printers’, 1, ‘A00172’, ‘K00845’
Attribute table columns:
prodID varchar(40)
catID char(2)
atrID varchar(10)
valID varchar(10)
Sample rows:
‘S101010’, ‘AA’, ‘A00432’, ‘K01817’
‘S202020’, ‘AA’, ‘A00432’, ‘K01819’
‘S303030’, ‘ED’, ‘A00172’, ‘K00846’
‘S303030’, ‘ED’, ‘A00172’, ‘K00851’
‘S404040’, ‘ED’, ‘A00172’, ‘K00845’
What I have included above is how I get a list of products when I know category.catID. What I want is to be able to get the category when I only have product.prodID and product.catID.
I’m after something like this, but it’s incomplete and won’t work as it is. I would then check catIn.catID and catNotIn.catID to see which one isn’t null, and that would be my category, but I can’t figure out the joins, if it is even possible. It also doesn’t take into account the categories that have more than two variants.
SELECT prod.prodID, catIn.catID, catNotIn.catID
FROM product AS prod
INNER JOIN attributes AS atr ON atr.prodID = prod.prodID
LEFT OUTER JOIN category AS catIn ON LEFT(catIn.catID, 2) = prod.catID AND catIn.atrID = atr.atrID --more conditions needed here
LEFT OUTER JOIN category AS catNotIn ON LEFT(catNotIn.catID, 2) = prod.catID AND catNotIn.atrID = atr.atrID --more conditions needed here
I hope this explains what I’m after a bit better.
I’ll start by saying your data model is pretty bad. You shouldn’t have a comma-list of values where you need to then split and match against. It should be a table relationship. But I digress…
So if the only difference is whether you use IN vs NOT IN based on the value of include, then if I were to write this wholly on the server I would do this:
First, get yourself a nice CSV split UDF. “How to split and insert CSV data into a new table in single statement?” is a good example: nice and simple.
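A split along those lines can be sketched as follows. The linked UDF is not reproduced here. This sketch assumes SQLite (driven from Python's sqlite3) rather than SQL Server, with a recursive CTE named split (a made-up name) doing the job of the UDF, and it uses the category sample rows from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cat (catID TEXT, description TEXT, [include] INTEGER,
                      atrID TEXT, valID TEXT);
    INSERT INTO cat VALUES
        ('AAA','Desktops',        0,'A00432','K01819'),
        ('AAB','Servers',         1,'A00432','K01819'),
        ('EDA','Laser Printers',  1,'A00172','K00846,K00851'),
        ('EDB','Inkjet Printers', 1,'A00172','K00845');
""")

SPLIT_SQL = """
WITH RECURSIVE split(catID, val, rest) AS (
    -- seed row: nothing split off yet; the list gets a trailing comma
    SELECT catID, '', valID || ',' FROM cat
    UNION ALL
    -- peel the text before the first comma off the front of rest
    SELECT catID,
           substr(rest, 1, instr(rest, ',') - 1),
           substr(rest, instr(rest, ',') + 1)
    FROM split
    WHERE rest <> ''
)
SELECT catID, val
FROM split
WHERE val <> ''          -- drop the empty seed rows
ORDER BY catID, val
"""

# One row per (category, single valID) instead of one comma list per category.
split_rows = conn.execute(SPLIT_SQL).fetchall()
```

Each comma-separated valID becomes its own row, which is what makes the list joinable against the attribute table.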
Then, the query (which I would turn into a stored procedure) would likely look something like this:
It’s very hard to make sure my syntax above is accurate as I don’t have a full working model. But here’s the general idea: your UDF inline_split_me splits records into a “table” which you can then join against. By using LEFT JOIN, if there’s no match with the atr table, then ism.Value will be null. This can be used to do NOT IN. The opposite is, then, also true for the IN simulation.
EDIT 1: Based on your feedback, here’s a test bed I came up with. I don’t have a complete view of the data, but I think this is what you are looking for (or at least getting close). Comment and I’ll adjust.
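A test bed of that kind can be sketched as below. It is not the original script. It assumes SQLite through Python's sqlite3 instead of SQL Server temp tables, uses a recursive CTE (split, aliased ism) in place of the split UDF, and loads the sample rows exactly as they appear in the question, including the S404040 attribute row. LOOKUP_SQL, rows and unique_rows are illustrative names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE prod (prodID TEXT, catID TEXT);
    CREATE TABLE cat  (catID TEXT, description TEXT, [include] INTEGER,
                       atrID TEXT, valID TEXT);
    CREATE TABLE atr  (prodID TEXT, catID TEXT, atrID TEXT, valID TEXT);
    INSERT INTO prod VALUES ('S101010','AA'), ('S202020','AA'),
                            ('S303030','ED'), ('S404040','ED');
    INSERT INTO cat VALUES
        ('AAA','Desktops',        0,'A00432','K01819'),
        ('AAB','Servers',         1,'A00432','K01819'),
        ('EDA','Laser Printers',  1,'A00172','K00846,K00851'),
        ('EDB','Inkjet Printers', 1,'A00172','K00845');
    INSERT INTO atr VALUES
        ('S101010','AA','A00432','K01817'),
        ('S202020','AA','A00432','K01819'),
        ('S303030','ED','A00172','K00846'),
        ('S303030','ED','A00172','K00851'),
        ('S404040','ED','A00172','K00845');
""")

LOOKUP_SQL = """
WITH RECURSIVE split(catID, val, rest) AS (
    -- one row per single valID of each category (seed rows carry val = '')
    SELECT catID, '', valID || ',' FROM cat
    UNION ALL
    SELECT catID,
           substr(rest, 1, instr(rest, ',') - 1),
           substr(rest, instr(rest, ',') + 1)
    FROM split
    WHERE rest <> ''
)
SELECT {distinct} prod.prodID, cat.catID
FROM prod
INNER JOIN atr ON atr.prodID = prod.prodID
-- candidate categories: same two-letter prefix and same attribute
INNER JOIN cat ON substr(cat.catID, 1, 2) = prod.catID
              AND cat.atrID = atr.atrID
-- ism.val stays NULL when the product's valID is not in the category's list
LEFT JOIN split AS ism ON ism.catID = cat.catID
                      AND ism.val = atr.valID
WHERE (cat.[include] = 1 AND ism.val IS NOT NULL)   -- simulated IN
   OR (cat.[include] = 0 AND ism.val IS NULL)       -- simulated NOT IN
ORDER BY prod.prodID, cat.catID
"""

# Without DISTINCT there is one row per matching attribute row.
rows = conn.execute(LOOKUP_SQL.format(distinct="")).fetchall()
# With DISTINCT there is one row per (prodID, catID) pair.
unique_rows = conn.execute(LOOKUP_SQL.format(distinct="DISTINCT")).fetchall()
```

The NOT IN branch is evaluated per attribute row, the same way as the NOT IN query in the question, so a product with several attribute rows in an include = 0 category would need extra care.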
Given your sample data, I ran this in my tempdb. Since I didn’t have an Attribute record for EDB I can’t be certain if I’m getting back what I’m supposed to get back. I also get back two records S303030, one for each of the valIDs in the Attribute table. You can use DISTINCT to get back just one if you’re looking only for uniqueness across ProdID + Category.CatID, or if there’s a piece of the puzzle I didn’t understand, let me know and I’ll tweak my answer. This works for everything except Ink Jets, because your sample data didn’t have an inkjet record in the #Category table. If I fake one with say
UNION SELECT 'S404040', 'ED', 'A00172', 'K00845' then it seems to work.