I have two tables.
The ‘Store’ table is a demographics table, with fields like these:
ID, ParentID, ConsolidationType
ID is the unique identifier for a given store.
ParentID is either NULL or, if the store has a ConsolidationType of ‘Secondary’, the ID of its parent store.
ConsolidationType is either NULL, ‘Primary’, or ‘Secondary’.
The ‘Sales’ table has fields like these:
StoreID, SalesDate, SalesAmount
StoreID refers to an ID in the Store table.
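For anyone who wants to reproduce this, the two tables could be sketched roughly as follows. The column types, key definitions, and constraint choices are assumptions for illustration; only the table and column names come from the description above.

```sql
-- Hypothetical schema: types and constraints are assumed, not the real definitions.
CREATE TABLE Store (
    ID                INT         NOT NULL PRIMARY KEY,
    ParentID          INT         NULL,      -- set only for 'Secondary' stores
    ConsolidationType VARCHAR(20) NULL,      -- NULL, 'Primary', or 'Secondary'
    FOREIGN KEY (ParentID) REFERENCES Store (ID)
);

CREATE TABLE Sales (
    StoreID     INT            NOT NULL,
    SalesDate   DATE           NOT NULL,
    SalesAmount DECIMAL(12, 2) NOT NULL,
    FOREIGN KEY (StoreID) REFERENCES Store (ID)
);
```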
I’m trying to get the total consolidated and individual sales for a given store in a single row of a single query. I’ve written a SQL query like the following:
    SELECT
      Store.ID AS 'Store ID',
      SUM(IF(YEAR(main.SalesDate) = 2012 AND QUARTER(main.SalesDate) = 1, main.SalesAmount, 0)) AS 'Individual Sales',
      SUM(IF(YEAR(main.SalesDate) = 2012 AND QUARTER(main.SalesDate) = 1 AND YEAR(secondaries.SalesDate) = 2012 AND QUARTER(secondaries.SalesDate) = 1, main.SalesAmount + secondaries.SalesAmount, 0)) AS 'Consolidated Sales'
    FROM Store
    LEFT JOIN Sales AS main ON Store.ID = main.StoreID
    LEFT JOIN Sales AS secondaries ON Store.ParentID = secondaries.StoreID
    GROUP BY Store.ID
I can’t figure out why this doesn’t work as intended. What am I missing? What’s wrong with my logic?
The problem is that your query is generating a Cartesian-product-like result set.
Basically, a row from ‘main’ is getting repeated multiple times, for each matching row from ‘secondaries’.
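One way to see the repetition is to compare the number of joined rows per store with the number of rows each join contributes on its own. This is only a diagnostic sketch, assuming MySQL and that ID is the primary key of Store; the output column names (joined_rows, own_sales_rows, parent_sales_rows) are made up for illustration.

```sql
-- Diagnostic sketch: same joins as the question, but counting rows instead of summing.
SELECT
  Store.ID,
  COUNT(*) AS joined_rows,
  -- rows the first join matches on its own
  (SELECT COUNT(*) FROM Sales WHERE Sales.StoreID = Store.ID)       AS own_sales_rows,
  -- rows the second join matches on its own
  (SELECT COUNT(*) FROM Sales WHERE Sales.StoreID = Store.ParentID) AS parent_sales_rows
FROM Store
LEFT JOIN Sales AS main        ON Store.ID = main.StoreID
LEFT JOIN Sales AS secondaries ON Store.ParentID = secondaries.StoreID
GROUP BY Store.ID;
```

For a store with 3 sales rows of its own whose parent has 4, joined_rows comes back as 12, so every amount from ‘main’ is added into the SUM four times.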
To get the result you want, join to the sales table just one time, and match on both the StoreID and the ParentID.
To get Individual Sales, only include in the SUM the rows where the StoreID matches, something like this:

UPDATE
My bad. (DOH!)
ParentID is on the Store table, not the Sales table. The query above does not return the specified result. (Working on it.)
I think you had the solution already…
Use the ParentID column in place of the ID column from the Store table. Where the ParentID column is NULL, we use the value from the ID column.
This query returns the specified result set:
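A minimal sketch of such a query, reconstructed from the description above rather than quoted from anywhere. It assumes MySQL, and it assumes that "consolidated" for a store means its own sales plus those of every store whose ParentID points at it; the aliases s, t, a, v and the column name ConsolidatedID are made up for illustration.

```sql
SELECT
  s.ID AS `Store ID`,
  -- only rows sold by this store itself
  SUM(IF(v.StoreID = s.ID, v.SalesAmount, 0))        AS `Individual Sales`,
  -- rows sold by this store or by any store that rolls up to it
  SUM(IF(v.ConsolidatedID = s.ID, v.SalesAmount, 0)) AS `Consolidated Sales`
FROM Store AS s
LEFT JOIN (
    -- inline view: one row per sale, tagged with the store it consolidates into
    SELECT
      t.ID                       AS StoreID,
      COALESCE(t.ParentID, t.ID) AS ConsolidatedID,
      a.SalesDate,
      a.SalesAmount
    FROM Store AS t
    JOIN Sales AS a ON a.StoreID = t.ID
) AS v
  ON (v.StoreID = s.ID OR v.ConsolidatedID = s.ID)
 AND v.SalesDate >= '2012-01-01'   -- first quarter of 2012
 AND v.SalesDate <  '2012-04-01'
GROUP BY s.ID;
```

Each sale is joined to a given store at most once, so nothing is double counted. Under these assumptions a ‘Secondary’ store reports 0 for Consolidated Sales, because no other store rolls up to it; if it should report its parent's total instead, the join condition would need to change.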
This is not the most efficient approach: the inline view (or “derived table”) is on the order of the size of the Sales table. It would be more efficient to push the date predicates down into the inline view, or to summarize by month or quarter in the derived table.
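As a rough sketch of pushing the predicates down, the date filter and the aggregation can both move inside the inline view, so the derived table holds at most one row per store instead of one row per sale. Same assumptions as before: MySQL, made-up aliases (s, t, a, v) and made-up column names (ConsolidatedID, Amount).

```sql
SELECT
  s.ID AS `Store ID`,
  SUM(IF(v.StoreID = s.ID, v.Amount, 0))        AS `Individual Sales`,
  SUM(IF(v.ConsolidatedID = s.ID, v.Amount, 0)) AS `Consolidated Sales`
FROM Store AS s
LEFT JOIN (
    -- filtered and pre-summarized: one row per store that had sales in the period
    SELECT
      t.ID                       AS StoreID,
      COALESCE(t.ParentID, t.ID) AS ConsolidatedID,
      SUM(a.SalesAmount)         AS Amount
    FROM Store AS t
    JOIN Sales AS a ON a.StoreID = t.ID
    WHERE a.SalesDate >= '2012-01-01'
      AND a.SalesDate <  '2012-04-01'
    GROUP BY t.ID, t.ParentID
) AS v
  ON v.StoreID = s.ID OR v.ConsolidatedID = s.ID
GROUP BY s.ID;
```

Writing the filter as a range on SalesDate, rather than wrapping the column in YEAR() and QUARTER(), also leaves the optimizer free to use an index on SalesDate if one exists.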
I would more likely return the “quarter” as part of the result set, to allow me to pull more than one quarter.
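Returning the quarter could look something like the sketch below, which summarizes by year and quarter in the derived table and groups the outer query the same way. It is written under the same assumptions as above; the names SalesYear, SalesQuarter, and Amount are made up, and the 2012 date range is only an example.

```sql
SELECT
  s.ID           AS `Store ID`,
  v.SalesYear    AS `Year`,
  v.SalesQuarter AS `Quarter`,
  SUM(IF(v.StoreID = s.ID, v.Amount, 0))        AS `Individual Sales`,
  SUM(IF(v.ConsolidatedID = s.ID, v.Amount, 0)) AS `Consolidated Sales`
FROM Store AS s
JOIN (
    -- one row per store per quarter that had sales
    SELECT
      t.ID                       AS StoreID,
      COALESCE(t.ParentID, t.ID) AS ConsolidatedID,
      YEAR(a.SalesDate)          AS SalesYear,
      QUARTER(a.SalesDate)       AS SalesQuarter,
      SUM(a.SalesAmount)         AS Amount
    FROM Store AS t
    JOIN Sales AS a ON a.StoreID = t.ID
    WHERE a.SalesDate >= '2012-01-01'
      AND a.SalesDate <  '2013-01-01'
    GROUP BY t.ID, t.ParentID, YEAR(a.SalesDate), QUARTER(a.SalesDate)
) AS v
  ON v.StoreID = s.ID OR v.ConsolidatedID = s.ID
GROUP BY s.ID, v.SalesYear, v.SalesQuarter
ORDER BY s.ID, v.SalesYear, v.SalesQuarter;
```

This version uses an inner join, so a store only appears for quarters in which it, or a store that rolls up to it, actually had sales.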