The basic gist of my issue is, for every event A, I need to find the earliest following event B that’s associated with the same user. Currently, I have:
SELECT e.UserID, e.date, min(e2.date)
FROM Event e INNER JOIN
Event e2 ON e.UserID = e2.UserID AND e.date <= e2.date
WHERE e.Event LIKE 'A' AND e2.Event LIKE 'B'
GROUP BY e.UserID, e.date
However, for every event A (which can happen any number of times for a user), numerous B events follow, so the inner join creates many extra rows that the MIN function then has to weed through. Is there a more efficient or faster way of doing this?
(The server is Microsoft SQL Server 2008.)
UPDATE:
Would it be faster with Rank()?
SELECT UserID, date, date2
FROM (
SELECT e.UserID, e.date, e2.date AS date2, RANK() OVER (PARTITION BY e.date, e.UserID ORDER BY e2.date) AS rnk
FROM Event e INNER JOIN Event e2 ON e.UserID = e2.UserID
WHERE e.Event = 'A' AND e2.Event = 'B' AND e.date <= e2.date
) AS ranked
WHERE rnk = 1
Or will the optimizer reduce the two to basically equivalent plans?
Is it faster to join a third time, like this? Probably not, but it might be worth trying. Here, any rows returned from table “e3” represent dates in between the e date and the e2 date, so we left join with that and keep only the NULL values. I am thinking this probably uses the same strategy as your MIN query, but maybe not? I’m curious to know either way.
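For reference, the third-join idea described above might look roughly like the following. This is only a sketch: the table and column names (Event, UserID, date, Event) come from the question, the date2 alias is just a label, and it has not been run against real data.

```sql
-- Sketch of the three-way join: e is the A event, e2 is a candidate B event,
-- and e3 looks for any other B event that falls between them.
SELECT e.UserID, e.date, e2.date AS date2
FROM Event e
INNER JOIN Event e2
    ON  e2.UserID = e.UserID
    AND e2.Event  = 'B'
    AND e2.date  >= e.date
LEFT JOIN Event e3
    ON  e3.UserID = e.UserID
    AND e3.Event  = 'B'
    AND e3.date  >= e.date
    AND e3.date  <  e2.date     -- a B strictly earlier than e2, but not before the A
WHERE e.Event = 'A'
  AND e3.UserID IS NULL;         -- no earlier B was found, so e2 is the earliest one
```

Two caveats. If a user has two B events sharing the same earliest date, this returns one row for each of them, whereas the MIN version collapses them into a single row. And like the original query, it drops A events that have no following B. Comparing the actual execution plans on your data is the only reliable way to tell whether it beats the MIN or ranking versions.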