The Archive Base

Editorial Team
Asked: June 4, 2026 (2026-06-04T16:01:29+00:00)

I have a set of stored procedures. Each stored procedure supposedly keeps a specific database table in sync with an identical one in another database.

The database tables have up to hundreds of millions of records. I need to find the quickest way to validate that these procedures really are keeping everything in sync, and I need to be able to locate the records that differ between the two tables for each procedure (for debugging purposes).

I was told that the following approach (found somewhere on SO, I believe, but I no longer have the link) wouldn’t be fast enough:

Insert into target_table(columns)
select columns from table1
except
select columns from table2

Insert into target_table(columns)
select columns from table2
except
select columns from table1

Can anyone suggest another way to do this that would be faster, either using T-SQL procedures or even external C# code? (I thought C# code might let me store PKs for hashing purposes, so I could at least track the primary keys and find which were superfluous or missing even if I didn’t track the rest of the fields.)

1 Answer

  1. Editorial Team
    Added an answer on June 4, 2026 at 4:01 pm

    This is fairly difficult to do, but you can get some mileage out of checksums. One approach is to split the key range into several subranges that can be verified a) in parallel and/or b) at different scheduled intervals. For example:

    use master;
    go
    
    set nocount on;
    go
    
    if db_id('test') is not null
    begin
        alter database test set single_user with rollback immediate;
        drop database test;
    end
    go
    
    create database test;
    go
    
    use test;
    go
    
    create table data (id int identity(1,1) not null primary key, 
        data1 varchar(38),
        data2 bigint,
        created_at datetime not null default getdate());
    go  
    
    declare @i int = 0;
    begin transaction   
    while @i < 1000000
    begin
        insert into data (data1, data2) values (newid(), @i);
        set @i += 1;
        if @i % 1000 = 0
        begin
            commit;
            raiserror (N'Inserted %d', 0, 0, @i);
            begin tran;
        end
    end
    commit  
    raiserror (N'Inserted %d', 0, 0, @i);
    go
    
    backup database test to disk='c:\temp\test.bak' with init;
    go
    
    if db_id('copy') is not null
    begin
        alter database copy set single_user with rollback immediate;
        drop database copy;
    end
    go
    
    restore database copy from disk='c:\temp\test.bak'
    with move 'test' to 'c:\temp\copy.mdf', move 'test_log' to 'c:\temp\copy_log.ldf';
    go
    
    -- create some differences
    --
    update test..data set data1 = newid() where id = cast(rand()*1000000 as int)
    update copy..data set data1 = newid() where id = cast(rand()*1000000 as int)
    
    delete from test..data where id = cast(rand()*1000000 as int);
    insert into copy..data (data1, data2) values (newid(), -1);
    
    
    -- do the check
    --
    declare @id int = 0;
    while @id < 1010000
    begin
        declare @chk1 int, @chk2 int;
        select @chk1 = checksum_agg(binary_checksum(*)) from test..data where id >= @id and id < @id + 10000
        select @chk2 = checksum_agg(binary_checksum(*)) from copy..data where id >= @id and id < @id + 10000
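        -- checksum_agg returns NULL for an empty range, so also flag NULL vs non-NULL
        if @chk1 != @chk2
            or (@chk1 is null and @chk2 is not null)
            or (@chk1 is not null and @chk2 is null)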
        begin
            -- locate the different row(s)
            --
            select t.id, binary_checksum(*) as chk
                from test..data t
                where t.id >= @id and t.id < @id + 10000
            except
            select id, binary_checksum(*) as chk
                from copy..data c
                where c.id >= @id and c.id < @id + 10000;
    
            select c.id, binary_checksum(*) as chk
                from copy..data c
                where c.id >= @id and c.id < @id + 10000
            except
            select t.id, binary_checksum(*) as chk
                from test..data t
                where t.id >= @id and t.id < @id + 10000;
        end
        else
        begin
            raiserror (N'Range %d is OK', 0,0, @id);
        end
        set @id += 10000;
    end
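
    To verify ranges in parallel or at different scheduled intervals, the per-range comparison from the loop above could be wrapped in a procedure that takes the range bounds as parameters. This is only a sketch under the assumptions of the demo (databases test and copy, table data, integer key id). The procedure name check_range and its parameters are hypothetical and not part of the original script.

    ```sql
    use test;
    go

    -- Hypothetical helper (name and parameters are illustrative):
    -- compares one key range [@from_id, @to_id) between test..data and copy..data.
    create procedure dbo.check_range
        @from_id int,
        @to_id   int            -- exclusive upper bound
    as
    begin
        set nocount on;

        declare @chk1 int, @chk2 int;

        select @chk1 = checksum_agg(binary_checksum(*))
            from test..data where id >= @from_id and id < @to_id;
        select @chk2 = checksum_agg(binary_checksum(*))
            from copy..data where id >= @from_id and id < @to_id;

        -- checksum_agg yields NULL for an empty range, so compare NULL-safely
        select @from_id as from_id,
               @to_id   as to_id,
               case when @chk1 = @chk2
                      or (@chk1 is null and @chk2 is null)
                    then 1 else 0 end as in_sync;
    end
    go

    -- Example call: a scheduled job or a separate session could each take one range
    exec dbo.check_range @from_id = 0, @to_id = 10000;
    ```

    Each call scans only its own range, so several sessions can work on disjoint ranges at the same time, and a range reported with in_sync = 0 can then be examined with the EXCEPT queries shown above.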
    

    The main issue is that the differences can only be identified by scanning all the rows, which is very expensive. By splitting the work into ranges, you can submit different ranges for verification on a rotating schedule. Both functions return a 32-bit int, so matching checksums make a difference unlikely but do not rule it out. The CHECKSUM_AGG and BINARY_CHECKSUM(*) restrictions apply, of course:

    BINARY_CHECKSUM ignores columns of noncomparable data types in its
    computation. Noncomparable data types include text, ntext, image,
    cursor, xml, and noncomparable common language runtime (CLR)
    user-defined types.
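
    Before relying on the checksums, it may be worth checking whether a table contains any of the column types listed in the quote, since changes in those columns would go undetected. The following is a minimal sketch using the catalog views, assuming the demo table data in database test (that table has no such columns, so the query returns no rows there). Alias types are not resolved to their base types, and CLR types are only flagged for manual review, since only some of them are noncomparable.

    ```sql
    use test;
    go

    -- List columns of dbo.data that BINARY_CHECKSUM(*) may leave out of its computation
    select c.name as column_name,
           t.name as type_name
    from sys.columns c
    join sys.types t on t.user_type_id = c.user_type_id
    where c.object_id = object_id(N'dbo.data')
      and (t.name in (N'text', N'ntext', N'image', N'xml')
           or t.is_assembly_type = 1);   -- CLR types: review manually
    ```

    Any column this returns would need a separate comparison outside the checksum.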

© 2021 The Archive Base. All Rights Reserved