Sometimes I want to join a record that I can easily identify as “sort by this and that and pick the first record”. I wonder what the best practice here.
For example in my data model I have Services and Implementations. A Service may have multiple Implementations. Implementations have version numbers; zero or more implementations may be installed and one of them may also be enabled. Now I want to select the “current implementation” for each service using the following rules:
- If nothing is installed, then the current implementation is the most recent one.
- If more than one is installed, then the current is either the one that is enabled or, if nothing is enabled, the most recent one again.
Some examples:
Service Version Installed Enabled
======= ======= ========= =======
A 1 False False
A 2 False False <- current for service A because most recent
B 1 True False <- current for service B because installed
B 2 False False
C 1 True False
C 2 True False <- current for service C because most recent
C 3 False False among installed
D 1 True True <- current for service D because enabled
D 2 True False
I figured that if I sort by these fields and pick the first record, it will do the trick. I.e. I do something like that:
SELECT s.service, <other service fields>, i.version, <other impl. fields>
FROM service s, implementation i
WHERE i.rowid == (
SELECT rowid
FROM implementation o
WHERE o.service == s.service
ORDER BY installed DESC, enabled DESC, version ASC)
Update. This one is rather specific to SQLite, I guess. First, rowid is the internal record ID that is used by SQLite. Second, in SQLite the SELECT expression in this context returns a single value. I guess it could be rewritten into a more generic SQL like this:
...
FROM service s, implementation i
WHERE i.service == s.service
AND i.version == (
SELECT version
FROM ...
ORDER BY ...
LIMIT 1)
It works here, but sometimes there’s no rowid or another single field that can serve as an identifier. Are there any other ways to get the same result? Something more generic, perhaps? (And for me something more specific is OK too, as long as it’s SQLite-specific 🙂
You can use a NOT EXISTS condition to exclude all but the desired match. Join rows of [service] and [implementation] that match on the [service] column so long as there does not exist another row of [implementation] that also matches on the [service] column but is preferred because it would rank higher in the desired ordering. Here is the idea.
Using Microsoft SQL Server or another SQL dialect that supports the CROSS APPLY operator, this is much, much simpler:
(Neither solution has been tested with sample data, as CREATE TABLE and INSERT statements were not posted.)
Steve Kass