I have a table of test results for different samples. The samples differ in desirability (P1 is better than S1, which is better than S3, which is better than S2), and any of these tests may be repeated: if a test fails, someone has to do it over.
I want my query to display only the best sample for each subject (P1 > S1 > S3 > S2) and, where a test was repeated, only the repeated result (not the original one).
The following query works, but as you can see it is rather long and complicated. I am still a junior SQL person, so how can I accomplish the same thing with a shorter/better query?
I am trying to learn better SQL so that I do not always have to ask these types of questions, so explanations of why your query works better would be very helpful!
DECLARE @TempTable TABLE (Sample_ID varchar(10), TestRepeat int, TestResult varchar(1))
-- In the end, ONLY the samples with Y should be displayed
INSERT INTO @TempTable VALUES('61-0001-P1', 0, 'R') -- 1 Y
INSERT INTO @TempTable VALUES('61-0002-P1', 0, 'R') -- 2 Y
INSERT INTO @TempTable VALUES('61-0003-S1', 0, 'S') -- 3 Y
INSERT INTO @TempTable VALUES('61-0004-S1', 0, 'R') -- 4 Y
INSERT INTO @TempTable VALUES('61-0005-P1', 0, 'I') -- 5
INSERT INTO @TempTable VALUES('61-0005-P1', 1, 'S') -- 6 Y
INSERT INTO @TempTable VALUES('61-0006-P1', 0, 'S') -- 7 Y
INSERT INTO @TempTable VALUES('61-0006-S3', 0, 'R') -- 8
INSERT INTO @TempTable VALUES('61-0007-P1', 0, 'S') -- 9 Y
INSERT INTO @TempTable VALUES('61-0008-S3', 0, 'I') -- 10
INSERT INTO @TempTable VALUES('61-0008-S3', 1, 'R') -- 11 Y
INSERT INTO @TempTable VALUES('61-0009-P1', 0, 'R') -- 12 Y
INSERT INTO @TempTable VALUES('61-0009-S1', 0, 'S') -- 13
INSERT INTO @TempTable VALUES('61-0010-P1', 0, 'S') -- 14 Y
INSERT INTO @TempTable VALUES('61-0011-S3', 0, 'S') -- 15 Y
DECLARE @TempTable1 TABLE (Subject_ID varchar(7), Sample_ID varchar(10), SampleOrder int, TestRepeat int, TestResult varchar(1))
INSERT @TempTable1
SELECT LEFT(Sample_ID,7) AS Subject_ID,
       Sample_ID,
       SampleOrder =
           CASE
               WHEN RIGHT(Sample_ID,2) = 'P1' THEN 4
               WHEN RIGHT(Sample_ID,2) = 'S1' THEN 3
               WHEN RIGHT(Sample_ID,2) = 'S3' THEN 2
               WHEN RIGHT(Sample_ID,2) = 'S2' THEN 1
           END,
       TestRepeat,
       TestResult
FROM @TempTable
ORDER BY Subject_ID, SampleOrder;
--SELECT * FROM @TempTable1;
DECLARE @TempTable2 TABLE (Sample_ID varchar(10), TestRepeat int, TestResult varchar(1))
INSERT @TempTable2
SELECT tt1.Sample_ID,
       tt1.TestRepeat,
       tt1.TestResult
FROM @TempTable1 tt1
INNER JOIN (
    SELECT Subject_ID, MAX(SampleOrder) AS Max_SampleOrder
    FROM @TempTable1
    GROUP BY Subject_ID) subQ1
    ON (tt1.Subject_ID = subQ1.Subject_ID AND tt1.SampleOrder = subQ1.Max_SampleOrder)
ORDER BY tt1.Sample_ID;
SELECT tt2.Sample_ID,
       tt2.TestRepeat,
       tt2.TestResult
FROM @TempTable2 tt2
INNER JOIN (
    SELECT Sample_ID, MAX(TestRepeat) AS Max_TestRepeat
    FROM @TempTable2
    GROUP BY Sample_ID) subQ
    ON (tt2.Sample_ID = subQ.Sample_ID AND tt2.TestRepeat = subQ.Max_TestRepeat)
ORDER BY tt2.Sample_ID, tt2.TestResult;
You can use row_number() in a subquery for this.
You can test the query on SE-Data
Explanation:
row_number() enumerates your rows starting from 1. The partition by clause controls when the numbering starts from 1 again, and the order by clause specifies the order of the numbering. The over() clause in the query gives a row_number() of 1 to the rows you are interested in. It is not possible to use row_number() in the where clause of a query, so you have to use a derived table to be able to filter your rows on the result of row_number().
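As a minimal sketch of the approach described above (a reconstruction under assumptions, not necessarily the exact query from the answer), the whole selection could be written as one derived table. It assumes that the first 7 characters of Sample_ID identify the subject and the last 2 identify the sample type, as in the question's own query. The aliases T and rn and the reduced sample data are only illustrative.

```sql
-- Reduced copy of the question's sample data, so the sketch runs on its own.
DECLARE @TempTable TABLE (Sample_ID varchar(10), TestRepeat int, TestResult varchar(1));

INSERT INTO @TempTable VALUES ('61-0005-P1', 0, 'I');
INSERT INTO @TempTable VALUES ('61-0005-P1', 1, 'S');
INSERT INTO @TempTable VALUES ('61-0006-P1', 0, 'S');
INSERT INTO @TempTable VALUES ('61-0006-S3', 0, 'R');
INSERT INTO @TempTable VALUES ('61-0008-S3', 0, 'I');
INSERT INTO @TempTable VALUES ('61-0008-S3', 1, 'R');
INSERT INTO @TempTable VALUES ('61-0009-P1', 0, 'R');
INSERT INTO @TempTable VALUES ('61-0009-S1', 0, 'S');

SELECT T.Sample_ID,
       T.TestRepeat,
       T.TestResult
FROM (
    SELECT Sample_ID,
           TestRepeat,
           TestResult,
           -- Restart the numbering for every subject (first 7 characters).
           -- Within a subject, the best sample type sorts first, and for
           -- the same sample the latest repeat sorts first.
           ROW_NUMBER() OVER (
               PARTITION BY LEFT(Sample_ID, 7)
               ORDER BY CASE RIGHT(Sample_ID, 2)
                            WHEN 'P1' THEN 1
                            WHEN 'S1' THEN 2
                            WHEN 'S3' THEN 3
                            WHEN 'S2' THEN 4
                            ELSE 5  -- assumption: unknown sample types rank last
                        END,
                        TestRepeat DESC
           ) AS rn
    FROM @TempTable
) AS T
WHERE T.rn = 1  -- the filter has to be applied outside the derived table
ORDER BY T.Sample_ID;
```

With this reduced data the query should return 61-0005-P1 (repeat 1), 61-0006-P1, 61-0008-S3 (repeat 1) and 61-0009-P1, which matches the rows marked Y in the question. One difference from the original multi-step query: if two rows tie on both sample type and TestRepeat, row_number() keeps only one of them, whereas the MAX joins would return both.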