Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 9019207
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: June 16, 20262026-06-16T04:43:17+00:00 2026-06-16T04:43:17+00:00

I am using SQL Server to store data about ticket validation. Single ticket can

  • 0

I am using SQL Server to store data about ticket validation. Single ticket can be validated at multiple places. I need to group records by “entry” and “exit” place and calculate statistics about duration which has passed between two validations.
Here is the table (simplified for clarity) :

CREATE TABLE TestDuration
(VALIDATION_TIMESTAMP datetime, 
ID_TICKET bigint, 
ID_PLACE bigint)

And data:

INSERT INTO TestDuration(VALIDATION_TIMESTAMP,ID_TICKET,ID_PLACE) VALUES ('2012-07-25 19:24:05.700', 1, 1)
INSERT INTO TestDuration(VALIDATION_TIMESTAMP,ID_TICKET,ID_PLACE) VALUES ('2012-07-25 20:08:04.250', 2, 2)
INSERT INTO TestDuration(VALIDATION_TIMESTAMP,ID_TICKET,ID_PLACE) VALUES ('2012-07-26 10:18:13.040', 3, 3)
INSERT INTO TestDuration(VALIDATION_TIMESTAMP,ID_TICKET,ID_PLACE) VALUES ('2012-07-26 10:18:20.990', 1, 2)
INSERT INTO TestDuration(VALIDATION_TIMESTAMP,ID_TICKET,ID_PLACE) VALUES ('2012-07-26 10:18:29.290', 2, 4)
INSERT INTO TestDuration(VALIDATION_TIMESTAMP,ID_TICKET,ID_PLACE) VALUES ('2012-07-26 10:25:37.040', 1, 4)

Here is the aggregation query:

SELECT VisitDurationCalcTable.ID_PLACE AS ID_PLACE_IN, 
VisitDurationCalcTable.ID_NEXT_VISIT_PLACE AS ID_PLACE_OUT, 
COUNT(visitduration) AS NUMBER_OF_VISITS, AVG(visitduration) AS AVERAGE_VISIT_DURATION 
FROM (
      SELECT EntryData.VALIDATION_TIMESTAMP, EntryData.ID_TICKET, EntryData.ID_PLACE, 
      (
       SELECT TOP 1 ID_PLACE FROM TestDuration 
          WHERE ID_TICKET=EntryData.ID_TICKET 
          AND VALIDATION_TIMESTAMP>EntryData.VALIDATION_TIMESTAMP 
          ORDER BY VALIDATION_TIMESTAMP ASC
      ) 
      AS ID_NEXT_VISIT_PLACE, 
      DATEDIFF(n,EntryData.VALIDATION_TIMESTAMP,
               (
                SELECT TOP 1 VALIDATION_TIMESTAMP FROM TestDuration WHERE ID_TICKET=EntryData.ID_TICKET and VALIDATION_TIMESTAMP>EntryData.VALIDATION_TIMESTAMP ORDER BY VALIDATION_TIMESTAMP ASC
               )
              ) AS visitduration 
     FROM TestDuration EntryData)
AS VisitDurationCalcTable 
WHERE VisitDurationCalcTable.ID_NEXT_VISIT_PLACE IS NOT NULL
GROUP BY VisitDurationCalcTable.ID_PLACE, VisitDurationCalcTable.ID_NEXT_VISIT_PLACE

The query works, but I’ve hit a performance problem pretty fast. For 40K rows in table query execution time is about 3 minutes. I’m no SQL guru so cannot really see how to transform the query to work faster. It’s not a critical report and is made only about once per month, but nevertheless it makes my app look bad. I have a feeling I’m missing something simple here.

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-06-16T04:43:18+00:00Added an answer on June 16, 2026 at 4:43 am

    TLDR Version

    You are clearly missing an index that would help this query. Adding the missing index will likely cause an order of magnitude improvement on its own.

    If you are on SQL Server 2012 rewriting the query using LEAD would also do it (though that too would benefit from the missing index).

    If you are still on 2005/2008 then you can make some improvements to the existing query but the effect will be relatively minor compared to the index change.

    Longer Version

    For this to take 3 minutes I assume you have no useful indexes at all and that the biggest win would be to simply add an index (for a report run once a month simply copying the data from the three columns into an appropriately indexed #temp table might suffice if you don’t want to create a permanent index).

    You say that you simplified the table for clarity and that it has 40K rows. Assuming the following test data

    CREATE TABLE TestDuration
      (
         Id                   UNIQUEIDENTIFIER DEFAULT NEWID() PRIMARY KEY,
         VALIDATION_TIMESTAMP DATETIME,
         ID_TICKET            BIGINT,
         ID_PLACE             BIGINT,
         OtherColumns         CHAR(100) NULL
      )
    
    INSERT INTO TestDuration
                (VALIDATION_TIMESTAMP,
                 ID_TICKET,
                 ID_PLACE)
    SELECT TOP 40000 DATEADD(minute, ROW_NUMBER() OVER (ORDER BY (SELECT 0)), GETDATE()),
                     ABS(CHECKSUM(NEWID())) % 10,
                     ABS(CHECKSUM(NEWID())) % 100
    FROM   master..spt_values v1,
           master..spt_values v2 
    

    Your original query takes 51 seconds on my machine at MAXDOP 1 and the following IO stats

    Table 'Worktable'. Scan count 79990, logical reads 1167573, physical reads 0
    Table 'TestDuration'. Scan count 3, logical reads 2472, physical reads 0.
    

    Plan 1

    For each of the 40,000 rows in the table it is doing two sorts of all matching ID_TICKET rows in order to identify the next one in order of VALIDATION_TIMESTAMP

    Simply adding an index as below brings the elapsed time down to 406ms, an improvement of more than 100 times (the subsequent queries in this answer assume this index is now in place).

    CREATE NONCLUSTERED INDEX IX
      ON TestDuration(ID_TICKET, VALIDATION_TIMESTAMP)
      INCLUDE (ID_PLACE) 
    

    The plan now looks as follows with the 80,000 sorts and spool operations replaced with index seeks.

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0
    Table 'TestDuration'. Scan count 79991, logical reads 255707, physical reads 0
    

    plan 2

    It is still doing 2 seeks for every row however. Rewriting with CROSS APPLY allows these to be combined.

    SELECT VisitDurationCalcTable.ID_PLACE            AS ID_PLACE_IN,
           VisitDurationCalcTable.ID_NEXT_VISIT_PLACE AS ID_PLACE_OUT,
           COUNT(visitduration)                       AS NUMBER_OF_VISITS,
           AVG(visitduration)                         AS AVERAGE_VISIT_DURATION
    FROM   (SELECT EntryData.VALIDATION_TIMESTAMP,
                   EntryData.ID_TICKET,
                   EntryData.ID_PLACE,
                   CA.ID_PLACE                                                          AS ID_NEXT_VISIT_PLACE,
                   DATEDIFF(n, EntryData.VALIDATION_TIMESTAMP, CA.VALIDATION_TIMESTAMP) AS visitduration
            FROM   TestDuration EntryData
                   CROSS APPLY (SELECT TOP 1 ID_PLACE,
                                             VALIDATION_TIMESTAMP
                                FROM   TestDuration
                                WHERE  ID_TICKET = EntryData.ID_TICKET
                                       AND VALIDATION_TIMESTAMP > EntryData.VALIDATION_TIMESTAMP
                                ORDER  BY VALIDATION_TIMESTAMP ASC) CA) AS VisitDurationCalcTable
    GROUP  BY VisitDurationCalcTable.ID_PLACE,
              VisitDurationCalcTable.ID_NEXT_VISIT_PLACE 
    

    This gives me an elapsed time of 269 ms

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0
    Table 'TestDuration'. Scan count 40001, logical reads 127988, physical reads 0
    

    PLAN 3

    Whilst the number of reads is still quite high the seeks are all reading pages that have just been read by the scan so they are all pages in cache. The number of reads can be reduced by using a table variable.

    DECLARE @T TABLE (
      VALIDATION_TIMESTAMP DATETIME,
      ID_TICKET            BIGINT,
      ID_PLACE             BIGINT,
      RN                   INT
      PRIMARY KEY(ID_TICKET, RN) )
    
    INSERT INTO @T
    SELECT VALIDATION_TIMESTAMP,
           ID_TICKET,
           ID_PLACE,
           ROW_NUMBER() OVER (PARTITION BY ID_TICKET ORDER BY VALIDATION_TIMESTAMP) AS RN
    FROM   TestDuration
    
    SELECT T1.ID_PLACE                                                        AS ID_PLACE_IN,
           T2.ID_PLACE                                                        AS ID_PLACE_OUT,
           COUNT(*)                                                           AS NUMBER_OF_VISITS,
           AVG(DATEDIFF(n, T1.VALIDATION_TIMESTAMP, T2.VALIDATION_TIMESTAMP)) AS AVERAGE_VISIT_DURATION
    FROM   @T T1
           INNER MERGE JOIN @T T2
             ON T1.ID_TICKET = T2.ID_TICKET
                AND T2.RN = T1.RN + 1
    GROUP  BY T1.ID_PLACE,
              T2.ID_PLACE 
    

    However for me at least that slightly increased the elapsed time to 301 ms (43 ms for the insert + 258 ms for the select) but this could still be a good option in lieu of creating a permanent index.

    (Insert)
    Table 'TestDuration'. Scan count 1, logical reads 233, physical reads 0
    
    (Select)
    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0
    Table '#0C50D423'. Scan count 2, logical reads 372, physical reads 0
    

    Plan

    Finally if you are using SQL Server 2012 you can use LEAD (SQL Fiddle)

    WITH CTE
         AS (SELECT ID_PLACE AS ID_PLACE_IN,
                    LEAD(ID_PLACE) OVER (PARTITION BY ID_TICKET 
                                             ORDER BY VALIDATION_TIMESTAMP) AS ID_PLACE_OUT,
                    DATEDIFF(n, 
                             VALIDATION_TIMESTAMP, 
                             LEAD(VALIDATION_TIMESTAMP) OVER (PARTITION BY ID_TICKET 
                                                                  ORDER BY VALIDATION_TIMESTAMP)) AS VISIT_DURATION
             FROM   TestDuration)
    SELECT ID_PLACE_IN,
           ID_PLACE_OUT,
           COUNT(*)            AS NUMBER_OF_VISITS,
           AVG(VISIT_DURATION) AS AVERAGE_VISIT_DURATION
    FROM   CTE
    WHERE  ID_PLACE_OUT IS NOT NULL
    GROUP  BY ID_PLACE_IN,
              ID_PLACE_OUT 
    

    That gave me an elapsed time of 249 ms

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0
    Table 'TestDuration'. Scan count 1, logical reads 233, physical reads 0
    

    PLAN 4

    The LEAD version also performs well without the index. Omitting the optimal index adds an additional SORT to the plan and means it has to read the wider clustered index on my test table but it still completed in an elapsed time of 293 ms.

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0
    Table 'TestDuration'. Scan count 1, logical reads 824, physical reads 0
    
    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

We are considering using a single SQL Server database to store data for multiple
I'm using MS SQL Server 2005 and I need to store date values partially.
I've created a web enabled database using SQL Server 2005 to store the data
I've been using SQL Server to store historical time series data for a couple
Our application (using a SQL Server 2008 R2 back-end) stores data about remote hardware
I need your suggest about data processing. My server is a data server (using
I have been spoiled by either using sql server to store data, or using
I am using SQL Server 2000, Data stored in the database is Photo data
My company is using SQL Server Compact to store and manage a local database
In my website i store User Visit DateTime information in sql server using date

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.