I am teaching myself MS-SQL and am trying to find different ways to get the count of paid and unpaid claims for 2012, grouped by region, from these 3 tables. If there is a returned date, the claim is unpaid; if the returned date is null, the claim is paid.
I will attach the code I have run, but I am not sure whether there are better ways to do it.
Thanks.
Here is the code:
SET dateformat ymd;
CREATE TABLE Claims
(
ClaimID INT,
SubID INT,
[Claim Date] DATETIME
);
CREATE TABLE Phoneship
(
ClaimID INT,
[Shipping Number] INT,
[Claim Date] DATETIME,
[Ship Date] DATETIME,
[Returned Date] DATETIME
);
CREATE TABLE Enrollment
(
SubID INT,
Enrollment_Date DATETIME,
Channel NVARCHAR(255),
Region NVARCHAR(255),
Status FLOAT,
Drop_Date DATETIME
);
INSERT INTO [Phoneship]
([ClaimID],
[Shipping Number],
[Claim Date],
[Ship Date],
[Returned Date])
VALUES (102,
201,
'2011-10-13 00:00:00',
'2011-10-14 00:00:00',
NULL);
INSERT INTO [Phoneship]
([ClaimID],
[Shipping Number],
[Claim Date],
[Ship Date],
[Returned Date])
VALUES (103,
202,
'2011-11-02 00:00:00',
'2011-11-03 00:00:00',
'2011-11-20 00:00:00');
INSERT INTO [Phoneship]
([ClaimID],
[Shipping Number],
[Claim Date],
[Ship Date],
[Returned Date])
VALUES (103,
203,
'2011-11-02 00:00:00',
'2011-11-22 00:00:00',
NULL);
INSERT INTO [Phoneship]
([ClaimID],
[Shipping Number],
[Claim Date],
[Ship Date],
[Returned Date])
VALUES (105,
204,
'2012-01-16 00:00:00',
'2012-01-17 00:00:00',
NULL);
INSERT INTO [Phoneship]
([ClaimID],
[Shipping Number],
[Claim Date],
[Ship Date],
[Returned Date])
VALUES (106,
205,
'2012-02-15 00:00:00',
'2012-02-16 00:00:00',
'2012-02-26 00:00:00');
INSERT INTO [Phoneship]
([ClaimID],
[Shipping Number],
[Claim Date],
[Ship Date],
[Returned Date])
VALUES (106,
206,
'2012-02-15 00:00:00',
'2012-02-27 00:00:00',
'2012-03-06 00:00:00');
INSERT INTO [Phoneship]
([ClaimID],
[Shipping Number],
[Claim Date],
[Ship Date],
[Returned Date])
VALUES (107,
207,
'2012-03-12 00:00:00',
'2012-03-13 00:00:00',
NULL);
INSERT INTO [Phoneship]
([ClaimID],
[Shipping Number],
[Claim Date],
[Ship Date],
[Returned Date])
VALUES (108,
208,
'2012-05-11 00:00:00',
'2012-05-12 00:00:00',
NULL);
INSERT INTO [Phoneship]
([ClaimID],
[Shipping Number],
[Claim Date],
[Ship Date],
[Returned Date])
VALUES (109,
209,
'2012-05-13 00:00:00',
'2012-05-14 00:00:00',
'2012-05-28 00:00:00');
INSERT INTO [Phoneship]
([ClaimID],
[Shipping Number],
[Claim Date],
[Ship Date],
[Returned Date])
VALUES (109,
210,
'2012-05-13 00:00:00',
'2012-05-30 00:00:00',
NULL);
INSERT INTO [Claims]
([ClaimID],
[SubID],
[Claim Date])
VALUES (101,
12345678,
'2011-03-06 00:00:00');
INSERT INTO [Claims]
([ClaimID],
[SubID],
[Claim Date])
VALUES (102,
12347190,
'2011-10-13 00:00:00');
INSERT INTO [Claims]
([ClaimID],
[SubID],
[Claim Date])
VALUES (103,
12348723,
'2011-11-02 00:00:00');
INSERT INTO [Claims]
([ClaimID],
[SubID],
[Claim Date])
VALUES (104,
12349745,
'2011-11-09 00:00:00');
INSERT INTO [Claims]
([ClaimID],
[SubID],
[Claim Date])
VALUES (105,
12347190,
'2012-01-16 00:00:00');
INSERT INTO [Claims]
([ClaimID],
[SubID],
[Claim Date])
VALUES (106,
12349234,
'2012-02-15 00:00:00');
INSERT INTO [Claims]
([ClaimID],
[SubID],
[Claim Date])
VALUES (107,
12350767,
'2012-03-12 00:00:00');
INSERT INTO [Claims]
([ClaimID],
[SubID],
[Claim Date])
VALUES (108,
12350256,
'2012-05-11 00:00:00');
INSERT INTO [Claims]
([ClaimID],
[SubID],
[Claim Date])
VALUES (109,
12347701,
'2012-05-13 00:00:00');
INSERT INTO [Claims]
([ClaimID],
[SubID],
[Claim Date])
VALUES (110,
12350256,
'2012-05-15 00:00:00');
INSERT INTO [Claims]
([ClaimID],
[SubID],
[Claim Date])
VALUES (111,
12350767,
'2012-06-30 00:00:00');
INSERT INTO [Enrollment]
([SubID],
[Enrollment_Date],
[Channel],
[Region],
[Status],
[Drop_Date])
VALUES (12345678,
'2011-01-05 00:00:00',
'Retail',
'Southeast',
1,
NULL);
INSERT INTO [Enrollment]
([SubID],
[Enrollment_Date],
[Channel],
[Region],
[Status],
[Drop_Date])
VALUES (12346178,
'2011-03-13 00:00:00',
'Indirect Dealers',
'West',
1,
NULL);
INSERT INTO [Enrollment]
([SubID],
[Enrollment_Date],
[Channel],
[Region],
[Status],
[Drop_Date])
VALUES (12346679,
'2011-05-19 00:00:00',
'Indirect Dealers',
'Southeast',
0,
'2012-03-15 00:00:00');
INSERT INTO [Enrollment]
([SubID],
[Enrollment_Date],
[Channel],
[Region],
[Status],
[Drop_Date])
VALUES (12347190,
'2011-07-25 00:00:00',
'Retail',
'Northeast',
0,
'2012-05-21 00:00:00');
INSERT INTO [Enrollment]
([SubID],
[Enrollment_Date],
[Channel],
[Region],
[Status],
[Drop_Date])
VALUES (12347701,
'2011-08-14 00:00:00',
'Indirect Dealers',
'West',
1,
NULL);
INSERT INTO [Enrollment]
([SubID],
[Enrollment_Date],
[Channel],
[Region],
[Status],
[Drop_Date])
VALUES (12348212,
'2011-09-30 00:00:00',
'Retail',
'West',
1,
NULL);
INSERT INTO [Enrollment]
([SubID],
[Enrollment_Date],
[Channel],
[Region],
[Status],
[Drop_Date])
VALUES (12348723,
'2011-10-20 00:00:00',
'Retail',
'Southeast',
1,
NULL);
INSERT INTO [Enrollment]
([SubID],
[Enrollment_Date],
[Channel],
[Region],
[Status],
[Drop_Date])
VALUES (12349234,
'2012-01-06 00:00:00',
'Indirect Dealers',
'West',
0,
'2012-02-14 00:00:00');
INSERT INTO [Enrollment]
([SubID],
[Enrollment_Date],
[Channel],
[Region],
[Status],
[Drop_Date])
VALUES (12349745,
'2012-01-26 00:00:00',
'Retail',
'Northeast',
0,
'2012-04-15 00:00:00');
INSERT INTO [Enrollment]
([SubID],
[Enrollment_Date],
[Channel],
[Region],
[Status],
[Drop_Date])
VALUES (12350256,
'2012-02-11 00:00:00',
'Retail',
'Southeast',
1,
NULL);
INSERT INTO [Enrollment]
([SubID],
[Enrollment_Date],
[Channel],
[Region],
[Status],
[Drop_Date])
VALUES (12350767,
'2012-03-02 00:00:00',
'Indirect Dealers',
'West',
1,
NULL);
INSERT INTO [Enrollment]
([SubID],
[Enrollment_Date],
[Channel],
[Region],
[Status],
[Drop_Date])
VALUES (12351278,
'2012-04-18 00:00:00',
'Retail',
'Midwest',
1,
NULL);
INSERT INTO [Enrollment]
([SubID],
[Enrollment_Date],
[Channel],
[Region],
[Status],
[Drop_Date])
VALUES (12351789,
'2012-05-08 00:00:00',
'Indirect Dealers',
'West',
0,
'2012-07-04 00:00:00');
INSERT INTO [Enrollment]
([SubID],
[Enrollment_Date],
[Channel],
[Region],
[Status],
[Drop_Date])
VALUES (12352300,
'2012-06-24 00:00:00',
'Retail',
'Midwest',
1,
NULL);
INSERT INTO [Enrollment]
([SubID],
[Enrollment_Date],
[Channel],
[Region],
[Status],
[Drop_Date])
VALUES (12352811,
'2012-06-25 00:00:00',
'Retail',
'Southeast',
1,
NULL);
And Query1
SELECT Count(ClaimID) AS 'Paid Claim',
(SELECT Count(ClaimID)
FROM dbo.phoneship
WHERE [returned date] IS NOT NULL) AS 'Unpaid Claim'
FROM dbo.Phoneship
WHERE [Returned Date] IS NULL
GROUP BY claimid
Query2
SELECT Count(*) AS 'Paid Claims',
(SELECT Count(*)
FROM dbo.Phoneship
WHERE [Returned Date] IS NOT NULL) AS 'Unpaid Claims'
FROM dbo.Phoneship
WHERE [Returned Date] IS NULL;
Query3
Select Distinct(C.[Shipping Number]), Count(C.ClaimID) AS 'COUNT ClaimID',
A.Region, A.SubID
From dbo.Enrollment A
Inner Join dbo.Claims B On A.SubId = B.SubId
Inner Join dbo.Phoneship C On B.ClaimID = C.ClaimID
Where C.[Returned Date] IS NULL
Group By A.Region, A.Subid, C.ClaimID, C.[Shipping Number] Order By A.Region
This question is difficult to answer: I see what you are asking, but a variety of other small problems lead up to your query difficulties.
My Answer
So, to answer the core of your question, here is what I would do, if and only if I am interpreting your table structure properly (more on that to follow).
I did not include the number of claims because that would throw off the numbers. I have included two queries: the final query and the breakdown query. For the sum of sums (since you are learning), I used WITH ROLLUP, which adds a total row for the grouped columns.
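A minimal sketch of how such a final query might look. It assumes that each Phoneship row counts as one paid or unpaid item, that "for 2012" refers to the Claim Date in the Claims table, and that each SubID appears only once in Enrollment (true for the sample data). The column aliases are illustrative.

```sql
-- Sketch under the assumptions above, not a definitive implementation.
-- Paid   = [Returned Date] IS NULL
-- Unpaid = [Returned Date] IS NOT NULL
SELECT e.Region,
       SUM(CASE WHEN p.[Returned Date] IS NULL     THEN 1 ELSE 0 END) AS [Paid Claims],
       SUM(CASE WHEN p.[Returned Date] IS NOT NULL THEN 1 ELSE 0 END) AS [Unpaid Claims]
FROM dbo.Enrollment AS e
INNER JOIN dbo.Claims    AS c ON c.SubID   = e.SubID
INNER JOIN dbo.Phoneship AS p ON p.ClaimID = c.ClaimID
-- Half-open date range: everything from 1 Jan 2012 up to, but not including, 1 Jan 2013.
WHERE c.[Claim Date] >= '20120101'
  AND c.[Claim Date] <  '20130101'
-- WITH ROLLUP appends a grand-total row in which Region is NULL.
GROUP BY e.Region WITH ROLLUP;
```

Using one conditional SUM per status keeps paid and unpaid counts side by side in a single pass over the joined rows, instead of running a separate subquery for each status.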
The Break Down
This is the breakdown query, using a subselect over your inner virtual table (result table, result set, etc.). I did this on purpose to demonstrate a point.
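A hedged sketch of what a breakdown query of this shape could look like. The inner SELECT produces one row per shipment with a 1/0 flag for each status, and the outer query sums those flags. The alias x and the Paid/Unpaid column names are illustrative, and the same assumptions apply (one Enrollment row per SubID, 2012 taken from the Claims table's Claim Date).

```sql
-- Sketch only: the derived table "x" is the inner virtual table.
SELECT x.Region,
       SUM(x.Paid)   AS [Paid Claims],
       SUM(x.Unpaid) AS [Unpaid Claims]
FROM (
    -- One row per shipment, flagged as paid or unpaid.
    -- Run this inner SELECT on its own to inspect the row-level breakdown.
    SELECT e.Region,
           p.ClaimID,
           p.[Shipping Number],
           CASE WHEN p.[Returned Date] IS NULL     THEN 1 ELSE 0 END AS Paid,
           CASE WHEN p.[Returned Date] IS NOT NULL THEN 1 ELSE 0 END AS Unpaid
    FROM dbo.Enrollment AS e
    INNER JOIN dbo.Claims    AS c ON c.SubID   = e.SubID
    INNER JOIN dbo.Phoneship AS p ON p.ClaimID = c.ClaimID
    WHERE c.[Claim Date] >= '20120101'
      AND c.[Claim Date] <  '20130101'
) AS x
GROUP BY x.Region WITH ROLLUP;
```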
As useful as subselects are, avoid them when a plain join will do. They can hurt performance, though of course there will be times when you can't avoid them.
Your table structure and relationships are the root cause of your difficulty with this query. After reviewing the structure, I see that you are replicating data (which is a no-no) and that you are having trouble pulling all the details into one nice query.
The problem areas that I saw (and some friendly advice)
1. You replicated the Claim Date column from the Claims table into the Phoneship table. I am not sure whether the two have different meanings, but if it is a duplicate, avoid this.
2. The SubID that is in the Claims table should probably be removed. It would be better to put the ClaimID as a Foreign Key (FK) into the Enrollment table.
3. Give the Phoneship table its own Primary Key (PK). It is for ease of use, making each row unique aside from the combination of ClaimID and Shipping Number. Look into table relationships and unique constraints.
4. I am a bit iffy about using NULL as an indicator of whether something was paid or unpaid. Only the designer would know that a null field means paid. You might be better off using a bit field for this purpose, with a default value of zero and marked as NOT NULL, so that it is never null. That saves you the trouble of writing a CASE expression: you can sum the bit after a cast, e.g. SUM(CAST(x.Paid AS INT)), since SQL Server's SUM does not accept a bit operand directly. Also, columns can sometimes end up NULL unintentionally for a variety of reasons.
5. Consider pulling the Channel and Region columns out of the Enrollment table completely. Put them in their own tables with an integer PK. You can reference the PK wherever needed using ChannelID and RegionID. This way, if the names need to change, you won't have to worry about data integrity issues (UPDATE Table SET NameCol = 'a' WHERE NameCol = 'b' could cause an unintended renaming disaster).
6. Put the RegionID and ChannelID into the Claims table. Now you don't need them in the Enrollment table if you follow step 2 above (have a ClaimID FK in the Enrollment table).
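Some of the schema suggestions above (a surrogate key and unique constraint on Phoneship, a NOT NULL bit flag, and a lookup table for regions) could be sketched roughly as follows. Every new name here (PhoneshipID, Paid, the Region table, RegionID, RegionName, and all constraint names) is hypothetical and chosen only for illustration. GO is a batch separator understood by SSMS and sqlcmd, not a T-SQL statement; it is needed because a newly added column cannot be referenced in the batch that adds it.

```sql
-- Surrogate primary key for Phoneship (hypothetical column and constraint names).
ALTER TABLE dbo.Phoneship
    ADD PhoneshipID INT IDENTITY(1,1) NOT NULL
        CONSTRAINT PK_Phoneship PRIMARY KEY;

-- Keep the natural key unique as well.
ALTER TABLE dbo.Phoneship
    ADD CONSTRAINT UQ_Phoneship_Claim_Shipping UNIQUE (ClaimID, [Shipping Number]);

-- Explicit paid flag: never NULL, defaults to 0 (unpaid).
ALTER TABLE dbo.Phoneship
    ADD Paid BIT NOT NULL
        CONSTRAINT DF_Phoneship_Paid DEFAULT (0);

-- Lookup table for regions; a Channel table would follow the same pattern.
CREATE TABLE dbo.Region
(
    RegionID   INT IDENTITY(1,1) NOT NULL CONSTRAINT PK_Region PRIMARY KEY,
    RegionName NVARCHAR(255)     NOT NULL CONSTRAINT UQ_Region_Name UNIQUE
);
GO

-- Backfill the flag for existing rows from the old NULL convention.
UPDATE dbo.Phoneship
SET Paid = CASE WHEN [Returned Date] IS NULL THEN 1 ELSE 0 END;

-- SUM does not accept a bit operand, so cast the flag to INT first.
SELECT e.Region,
       SUM(CAST(p.Paid AS INT)) AS [Paid Claims]
FROM dbo.Enrollment AS e
INNER JOIN dbo.Claims    AS c ON c.SubID   = e.SubID
INNER JOIN dbo.Phoneship AS p ON p.ClaimID = c.ClaimID
GROUP BY e.Region;
```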
Congrats on taking the initiative to learn this stuff. It is invaluable knowledge (unless you go to college, in which case it is worth about 50K or worse… student loans… sigh…).