In my previous question (Select a distinct RowId based on series of trainnumbers) I got a nice list of RowIds.
Now I’d like to make that list complete and totally awesome! 🙂
My raw data (Excel 2010) looks like this:
Time   Wagons  Delete  DayType  Plate     trainnumber  RowId
05.28  1               1        0901-046  2            38676
08.20  2               1        0901-003  2            18676
05.25  2       x       1        0901-046  2            28676
15.28  2               1        0901-046  2            3676
23.20  3               1        0601-001  2            3867
05.08  3               1        0901-046  2            3876
00.28  L       x       1        0901-046  2            8676
00.00                  1        0901-046  2            367
I need a list that groups the Primary RowIds (the list from the previous question) with the RowIds that match them by the following criteria:
- on the Primary and Matching rows the following must be the SAME:
  - trainnumber
  - DayType
  - Time
- on the Primary and Matching rows the Plate must be DIFFERENT
- Wagons must be MORE THAN 1 (The column contains rows with numbers, letters and nothing)
- Delete must be empty
When matches are found, I need the RowId of the match.
Ideally a dataset like this:
PrimaryRowId Match#1 Match#2 Match#3 Match#4
15674 5465 456 5456 45656
5564 231 132 1321 7862
It’s possible that there are more matches per Primary RowId, but that’s ok.
My SQL skills are somewhat limited, so that’s why I’m asking you guys. 🙂
I think it might be something like this:
SELECT RowId
FROM Conversion
WHERE trainnumber=trainnumber and daytype=daytype and
time=time and plate<>plate and Wagons>1 and delete=""
GROUP BY RowId
But it would only give me one (1) RowId at a time. :-/
Chris,
To use SQL to match items like this, you need to tell SQL to find the matches in a pair of tables (actually two copies of the same table). This is known as a self join.
The two copies of Conversion are given the aliases A and B.
On the INNER JOIN line, we specify the fields which have to match.
On the WHERE line, we specify the fields which have to be different, as well as the other conditions (which must be specified for both A and B).
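As a rough sketch of the self join described above (not necessarily the exact query), here is a version that runs against an in-memory SQLite database from Python. The sample rows are made up for illustration, and the `CAST(... AS INTEGER)` trick for the mixed Wagons column and the `COALESCE` check for an empty Delete are SQLite-specific assumptions; another engine may need a different conversion. `MIN(A.RowId)` with `GROUP BY B.RowId` is one way to keep only the lowest matching row ID per match.

```python
import sqlite3

# Hypothetical sample rows (not the asker's data), in the column order
# Time, Wagons, Delete, DayType, Plate, trainnumber, RowId.
ROWS = [
    ("05.28", "2", "",  1, "0901-046", 2, 101),
    ("05.28", "3", "",  1, "0901-003", 2, 102),
    ("05.28", "2", "",  1, "0601-001", 2, 103),
    ("05.28", "L", "",  1, "0901-777", 2, 104),  # Wagons is a letter: excluded
    ("05.28", "2", "x", 1, "0901-888", 2, 105),  # Delete is set: excluded
    ("05.28", "1", "",  1, "0901-999", 2, 106),  # only one wagon: excluded
    ("08.20", "2", "",  1, "0901-046", 2, 107),  # no other row at this Time
]

# A and B are two copies of the same table.
# ON lists what must be the same; WHERE lists what must differ, plus the
# Wagons/Delete conditions, which are applied to both copies.
PAIR_SQL = """
SELECT MIN(A.RowId) AS PrimaryRowId, B.RowId AS MatchRowId
FROM Conversion AS A
INNER JOIN Conversion AS B
        ON A.trainnumber = B.trainnumber
       AND A.DayType = B.DayType
       AND A.Time = B.Time
WHERE A.Plate <> B.Plate
  AND A.RowId < B.RowId
  AND CAST(A.Wagons AS INTEGER) > 1 AND CAST(B.Wagons AS INTEGER) > 1
  AND COALESCE(A."Delete", '') = '' AND COALESCE(B."Delete", '') = ''
GROUP BY B.RowId
ORDER BY PrimaryRowId, MatchRowId
"""

def find_pairs(rows):
    """Load rows into an in-memory Conversion table and run the self join."""
    con = sqlite3.connect(":memory:")
    con.execute(
        'CREATE TABLE Conversion (Time TEXT, Wagons TEXT, "Delete" TEXT, '
        "DayType INTEGER, Plate TEXT, trainnumber INTEGER, RowId INTEGER)"
    )
    con.executemany("INSERT INTO Conversion VALUES (?, ?, ?, ?, ?, ?, ?)", rows)
    result = con.execute(PAIR_SQL).fetchall()
    con.close()
    return result

for primary, match in find_pairs(ROWS):
    print(primary, match)
```

Delete is quoted because it is a reserved word in SQL. The `A.RowId < B.RowId` condition stops a pair from being reported twice in both directions.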
The first column gives the lowest row ID for each matching train, and the second column the other matching row IDs.
Now that we have a list of master rows, we can join it to Conversion again to produce a list of slave rows:
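One way this second step might look, again as a hedged sketch in SQLite via Python rather than the exact query: a derived table `M` picks the lowest qualifying RowId per trainnumber/DayType/Time as the master, `P` looks up that master's Plate, and `S` collects the slave rows. The sample rows, the alias names and the `match_table` helper are assumptions made for illustration, and the sketch assumes RowId is unique. The small loop at the end folds the pairs into the "one primary, many matches" shape asked for in the question.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample rows (not the asker's data), in the column order
# Time, Wagons, Delete, DayType, Plate, trainnumber, RowId.
ROWS = [
    ("05.28", "2", "",  1, "0901-046", 2, 101),
    ("05.28", "3", "",  1, "0901-003", 2, 102),
    ("05.28", "2", "",  1, "0901-046", 2, 103),  # same Plate as 101: no match
    ("05.28", "2", "x", 1, "0601-001", 2, 104),  # Delete is set: excluded
    ("05.28", "4", "",  1, "0601-002", 2, 105),
    ("08.20", "2", "",  1, "0901-046", 2, 201),
    ("08.20", "2", "",  1, "0601-001", 2, 202),
    ("15.28", "2", "",  1, "0901-046", 2, 301),  # master with no slaves
]

SLAVE_SQL = """
SELECT M.MasterRowId, S.RowId AS SlaveRowId
FROM (
    -- lowest qualifying RowId per train / day type / time
    SELECT trainnumber, DayType, Time, MIN(RowId) AS MasterRowId
    FROM Conversion
    WHERE CAST(Wagons AS INTEGER) > 1 AND COALESCE("Delete", '') = ''
    GROUP BY trainnumber, DayType, Time
) AS M
INNER JOIN Conversion AS P ON P.RowId = M.MasterRowId
INNER JOIN Conversion AS S
        ON S.trainnumber = M.trainnumber
       AND S.DayType = M.DayType
       AND S.Time = M.Time
WHERE S.Plate <> P.Plate
  AND CAST(S.Wagons AS INTEGER) > 1
  AND COALESCE(S."Delete", '') = ''
ORDER BY M.MasterRowId, S.RowId
"""

def match_table(rows):
    """Return {master RowId: [slave RowIds]} for the given rows."""
    con = sqlite3.connect(":memory:")
    con.execute(
        'CREATE TABLE Conversion (Time TEXT, Wagons TEXT, "Delete" TEXT, '
        "DayType INTEGER, Plate TEXT, trainnumber INTEGER, RowId INTEGER)"
    )
    con.executemany("INSERT INTO Conversion VALUES (?, ?, ?, ?, ?, ?, ?)", rows)
    grouped = defaultdict(list)
    for master, slave in con.execute(SLAVE_SQL):
        grouped[master].append(slave)
    con.close()
    return dict(grouped)

for master, slaves in match_table(ROWS).items():
    print(master, *slaves)
```

Folding the pairs in application code avoids needing a pivot in SQL, which matters here because the number of Match# columns is not known in advance.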
I’ve tested this with a little extra test data on SQL Fiddle.