I have this SQL that works fine.
I want my filter to return the LATEST row for each unique SessionGuid, i.e. the one with the highest UserSessionSequenceID.
Problem is performance sucks – even though I have good indexes.
How can I rewrite this – to omit the ROW_NUMBER line?
SELECT TOP(@resultCount) * FROM
(
    SELECT
        [UserSessionSequenceID]
        ,[SessionGuid]
        ,[IP]
        ,[Url]
        ,[UrlTitle]
        ,[SiteID]
        ,[BrowserWidth]
        ,[BrowserHeight]
        ,[Browser]
        ,[BrowserVersion]
        ,[Referer]
        ,[Timestamp]
        ,ROW_NUMBER() over (PARTITION BY [SessionGuid]
                            ORDER BY UserSessionSequenceID DESC) AS sort
    FROM [tblSequence]
) AS t
WHERE ([Timestamp] > DATEADD(mi, -@minutes, GETDATE()))
    AND (SiteID = @siteID)
    AND sort = 1
ORDER BY [UserSessionSequenceID] DESC
Thanks a lot 🙂
No offense, but let us be the judge of that. Always post the exact schema for your tables, including all indexes and cardinalities, when asking SQL Server performance questions.
For example, let’s consider the following table structure:
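A minimal sketch of such a table. The column types, the filler width, the object names and above all the choice of clustered index are assumptions made for illustration; none of them is given in the question:

```sql
-- Sketch only: types, filler width, names and indexes are assumptions.
CREATE TABLE tblSequence (
    UserSessionSequenceID INT IDENTITY(1,1) NOT NULL,
    SessionGuid           UNIQUEIDENTIFIER  NOT NULL,
    SiteID                INT               NOT NULL,
    [Timestamp]           DATETIME          NOT NULL,
    -- Stands in for IP, Url, UrlTitle, Browser*, Referer and so on.
    Filler                VARCHAR(256)      NULL,
    CONSTRAINT pk_tblSequence
        PRIMARY KEY NONCLUSTERED (UserSessionSequenceID)
);

-- Clustering on (SiteID, Timestamp) lets a "one site, last N minutes"
-- filter be answered with a narrow range scan instead of a full scan.
CREATE CLUSTERED INDEX cdx_tblSequence
    ON tblSequence (SiteID, [Timestamp]);
```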
This is the same as yours, but all fields not relevant to the performance problem are aggregated into a generic filler. Let’s see how bad the performance is on, say, 1M rows for about 50k sessions. Let’s fill up the table with random data, but we’ll simulate what amounts to ‘user activity’:
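One way such a fill could look, as a rough set-based sketch. The number of sites (10), the one-day time spread, the filler length and the temp table name are assumptions, and it expects a tblSequence table with SessionGuid, SiteID, [Timestamp] and Filler columns plus an identity UserSessionSequenceID:

```sql
-- Sketch only: site count, time spread and filler size are assumptions.
SET NOCOUNT ON;

-- About 50k sessions, each pinned to one of 10 sites.
CREATE TABLE #sessions (
    SessionNo   INT IDENTITY(1,1) PRIMARY KEY,
    SessionGuid UNIQUEIDENTIFIER NOT NULL DEFAULT NEWID(),
    SiteID      INT NOT NULL
);

INSERT INTO #sessions (SiteID)
SELECT TOP (50000) ABS(CHECKSUM(NEWID()) % 10) + 1
FROM sys.all_objects AS a
CROSS JOIN sys.all_objects AS b;

-- 1M page views: each row picks a random session and a random moment
-- within the last 24 hours. The ORDER BY makes the identity values
-- follow time, so a higher UserSessionSequenceID means a later action.
INSERT INTO tblSequence (SessionGuid, SiteID, [Timestamp], Filler)
SELECT s.SessionGuid,
       s.SiteID,
       DATEADD(second, -r.SecondsAgo, GETDATE()),
       REPLICATE('X', 200)
FROM (
    SELECT TOP (1000000)
           ABS(CHECKSUM(NEWID()) % 50000) + 1 AS SessionNo,
           ABS(CHECKSUM(NEWID()) % 86400)     AS SecondsAgo
    FROM sys.all_objects AS a
    CROSS JOIN sys.all_objects AS b
) AS r
JOIN #sessions AS s ON s.SessionNo = r.SessionNo
ORDER BY r.SecondsAgo DESC;

DROP TABLE #sessions;
```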
This takes about 1 minute to fill up. Now let’s run the query you asked about: what is the last action of any user session on site X in the last Y minutes? I have to use a specific date for @now instead of GETDATE() because my data is simulated, not real, so I’m using whatever max timestamp was filled in randomly for SiteId 1:
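A sketch of that query against the simplified table, with the SiteID and Timestamp predicates placed inside the derived table so they are applied before ROW_NUMBER() numbers the rows. The parameter values and the Filler column are assumptions for illustration:

```sql
-- Sketch only: parameter values are illustrative.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

DECLARE @resultCount INT = 30,
        @minutes     INT = 60,
        @siteID      INT = 1;
DECLARE @now DATETIME;

-- Simulated data, so anchor "now" at the newest row for this site.
SELECT @now = MAX([Timestamp])
FROM tblSequence
WHERE SiteID = @siteID;

SELECT TOP (@resultCount) *
FROM (
    SELECT UserSessionSequenceID,
           SessionGuid,
           SiteID,
           [Timestamp],
           Filler,
           ROW_NUMBER() OVER (PARTITION BY SessionGuid
                              ORDER BY UserSessionSequenceID DESC) AS sort
    FROM tblSequence
    -- Restrictive filters first: only these rows get numbered.
    WHERE SiteID = @siteID
      AND [Timestamp] > DATEADD(mi, -@minutes, @now)
) AS t
WHERE sort = 1
ORDER BY UserSessionSequenceID DESC;
```

Note that with the filters inside, sort = 1 marks the latest row of each session within the site and time window, not necessarily the latest row of the session overall.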
This is the same query as yours, but the restrictive filters are moved inside the ROW_NUMBER() subquery. The results come back in:
31 ms response time on a warm cache, 12 pages read out of the nearly 60k pages of the table.
Updated
After reading the original query again I realize my modified query is different. You only need new sessions. I still believe that filtering by SiteID and Timestamp first is the only way to get the necessary performance, so the solution is to validate the candidate rows with a NOT EXISTS condition:
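A sketch of what that could look like. The supporting index, its name, the parameter values and the Filler column are assumptions; the idea is that a candidate row survives only if no row with the same SessionGuid has a higher UserSessionSequenceID, which also removes the ROW_NUMBER() line entirely:

```sql
-- Sketch only: index name and parameter values are illustrative.
-- Assumed helper index so the NOT EXISTS probe is a cheap seek per candidate.
CREATE INDEX idx_tblSequence_Session
    ON tblSequence (SessionGuid, UserSessionSequenceID);

DECLARE @resultCount INT = 30,
        @minutes     INT = 60,
        @siteID      INT = 1;
DECLARE @now DATETIME;

SELECT @now = MAX([Timestamp])
FROM tblSequence
WHERE SiteID = @siteID;

SELECT TOP (@resultCount)
       s.UserSessionSequenceID,
       s.SessionGuid,
       s.SiteID,
       s.[Timestamp],
       s.Filler
FROM tblSequence AS s
-- Candidates: one site, recent time window.
WHERE s.SiteID = @siteID
  AND s.[Timestamp] > DATEADD(mi, -@minutes, @now)
  -- Validation: keep a candidate only if it is the session's last row.
  AND NOT EXISTS (
        SELECT 1
        FROM tblSequence AS later
        WHERE later.SessionGuid = s.SessionGuid
          AND later.UserSessionSequenceID > s.UserSessionSequenceID
      )
ORDER BY s.UserSessionSequenceID DESC;
```

In production the @now variable would simply be GETDATE(), as in the original query.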
On my laptop, for 1M rows over 400k sessions, this returns in 40 ms from a warm cache.