I am facing a conceptual problem that I am having a hard time overcoming.

Asked: May 17, 2026 (2026-05-17T15:53:17+00:00)

I am facing a conceptual problem that I am having a hard time overcoming. I am hoping the SO folks can help me overcome it with a nudge in the right direction.

I am doing some ETL work where the source rows are very similar to one another and the volume is very large. I am loading the data into a table that is intended for replication, and I want only the most basic information in this target table.

My source table looks something like this:

[image: source table]

I need my target table to reflect it as such:

[image: target table]

As you can see, I didn’t duplicate the InTransit status where it was duplicated in the source table. The steps I am trying to work out are:

1. Get any new distinct rows entered since the last time the query ran. (Easy.)
2. For each TrackingId, check whether each new status is already the most recent status in the target; if so, disregard it, otherwise insert it. This means I also have to start at the earliest of the new statuses and work forward. (I have no *(!#in clue how I’ll do this.)
3. Do this every 15 minutes so that statuses are kept very recent, so step #2 must be performant.

My source table could easily contain 100k+ rows, and because this has to run every 15 minutes it needs to be very performant, which is why I am really trying to avoid cursors.

Right now the only way I can see to do this is with a CLR sproc, but I think there may be better ways, so I am hoping you guys can nudge me in the right direction.

I am sure I am probably leaving something out that you may need so please let me know what info you may need and I’ll happily provide.

Thank you in advance!

EDIT:
OK, I wasn’t explicit enough in my question. My source table is going to contain multiple TrackingIds. It may hold 100k+ rows covering multiple TrackingIds, with multiple statuses for each TrackingId. I have to update the target table as above for each individual TrackingId, but my source will be an amalgam of TrackingIds.

1 Answer

Editorial Team · Answer 1 · 2026-05-17T15:53:18+00:00

Here you go. I’ll let you clean it up and optimize it: one of the subqueries can go into a view, and the messy date comparison can be cleaned up. If you’re on SQL Server 2008 or later, use CAST(statusdate AS DATE) instead of the YEAR/MONTH/DAY string concatenation.

declare @tbl1 table(
    id int, Trackingid int, Status varchar(50), StatusDate datetime
)

declare @tbl2 table(
    id int, Trackingid int, Status varchar(50), StatusDate datetime
)

----Source data
insert into @tbl1 (id, trackingid, status, statusdate) values(1,1,'PickedUp','10/01/10  1:00') --
insert into @tbl1 (id, trackingid, status, statusdate) values(2,1,'InTransit','10/02/10 1:00') --
insert into @tbl1 (id, trackingid, status, statusdate) values(8,1,'InTransit','10/02/10  3:00')
insert into @tbl1 (id, trackingid, status, statusdate) values(4,1,'Delayed','10/03/10 1:00')
insert into @tbl1 (id, trackingid, status, statusdate) values(5,1,'InTransit','10/03/10 1:01')
insert into @tbl1 (id, trackingid, status, statusdate) values(6,1,'AtDest','10/03/10 2:00')
insert into @tbl1 (id, trackingid, status, statusdate) values(7,1,'Deliv','10/03/10 3:00') --
insert into @tbl1 (id, trackingid, status, statusdate) values(3,2,'InTransit','10/03/10 1:00')
insert into @tbl1 (id, trackingid, status, statusdate) values(9,2,'AtDest','10/04/10 1:00')
insert into @tbl1 (id, trackingid, status, statusdate) values(10,2,'Deliv','10/04/10 1:05')
insert into @tbl1 (id, trackingid, status, statusdate) values(11,1,'Delayed','10/02/10 2:05')

----Target data
insert into @tbl2 (id, trackingid, status, statusdate) values(1,1,'PickedUp','10/01/10  1:00')
insert into @tbl2 (id, trackingid, status, statusdate) values(2,1,'InTransit','10/02/10 1:00')
insert into @tbl2 (id, trackingid, status, statusdate) values(3,1,'Deliv','10/03/10 3:00')


select d.* from
(
    select 
    * ,
    ROW_NUMBER() OVER(PARTITION BY trackingid, CAST((STR( YEAR( statusdate ) ) + '/' +STR( MONTH(statusdate ) ) + '/' +STR( DAY( statusdate ) )) AS DATETIME) ORDER BY statusdate) AS 'RN'
    from @tbl1
) d

where 
not exists
(
    select RN from
    (
        select 
        * ,
        ROW_NUMBER() OVER(PARTITION BY trackingid, CAST((STR( YEAR( statusdate ) ) + '/' +STR( MONTH(statusdate ) ) + '/' +STR( DAY( statusdate ) )) AS DATETIME) ORDER BY statusdate) AS 'RN'
        from @tbl1
    )f where f.RN = d.RN + 1 and d.status = f.status and f.trackingid = d.trackingid and 
    CAST((STR( YEAR( f.statusdate ) ) + '/' +STR( MONTH(f.statusdate ) ) + '/' +STR( DAY( f.statusdate ) )) AS DATETIME) =
            CAST((STR( YEAR( d.statusdate ) ) + '/' +STR( MONTH(d.statusdate ) ) + '/' +STR( DAY( d.statusdate ) )) AS DATETIME)
)

and
not exists 
(
    select 1 from @tbl2 t2
    where (t2.trackingid = d.trackingid
    and t2.statusdate = d.statusdate
    and t2.status = d.status)
)
and (
    not exists
    (
        select 1 from
        (
            select top 1 * from @tbl2 t2 
            where t2.trackingid = d.trackingid
            order by t2.statusdate desc
        ) g
        where g.status = d.status
    )
    or not exists
    (
        select 1 from
        (
            select top 1 * from @tbl2 t2 
            where t2.trackingid = d.trackingid
            and t2.statusdate <= d.statusdate
            order by t2.statusdate desc
        ) g
        where g.status = d.status
    )
)
order by trackingid,statusdate
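
For what it's worth, here is a rough sketch of the two cleanups mentioned at the top, applied only to the first NOT EXISTS check from the query above. It assumes SQL Server 2008 or later (for the date type and multi-row VALUES). It swaps the suggested view for a CTE so the example stays in one batch. The names @src, numbered, statusday and rn are made up for illustration, and the sample rows are a cut-down copy of the source data.

```sql
-- Sketch only: assumes SQL Server 2008 or later.
-- @src is a hypothetical stand-in for the source table (@tbl1 above).
declare @src table(
    id int, trackingid int, status varchar(50), statusdate datetime
);

-- ISO 8601 literals avoid the locale-dependent mm/dd/yy parsing.
insert into @src (id, trackingid, status, statusdate) values
    (1, 1, 'PickedUp',  '2010-10-01T01:00:00'),
    (2, 1, 'InTransit', '2010-10-02T01:00:00'),
    (8, 1, 'InTransit', '2010-10-02T03:00:00'),
    (4, 1, 'Delayed',   '2010-10-03T01:00:00');

-- Number the rows once, per tracking id and calendar day,
-- instead of repeating the same subquery twice.
with numbered as
(
    select
        s.id, s.trackingid, s.status, s.statusdate,
        cast(s.statusdate as date) as statusday,
        row_number() over(
            partition by s.trackingid, cast(s.statusdate as date)
            order by s.statusdate
        ) as rn
    from @src s
)
select d.id, d.trackingid, d.status, d.statusdate
from numbered d
where not exists
(
    -- Drop a row when the next row for the same tracking id and day
    -- carries the same status (same rule as the first NOT EXISTS above).
    select 1
    from numbered f
    where f.trackingid = d.trackingid
      and f.statusday  = d.statusday
      and f.rn         = d.rn + 1
      and f.status     = d.status
)
order by d.trackingid, d.statusdate;
```

With these four sample rows the query should return ids 1, 8 and 4: the earlier of the two InTransit rows on 10/02 is dropped because the next row that day has the same status. If you would rather keep the first row of a run of repeats, comparing against f.rn = d.rn - 1 should do that. The two checks against the target table would still need to be added back on top of this.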
