
The Archive Base

Editorial Team
Asked: May 30, 2026


I'm trying to create a SQL query that will detect (possible) duplicate customers in my database.

I have two tables:

  1. Customer, with the columns cid, firstname, lastname, and zip. cid is the unique customer ID and the primary key of this table.
  2. IgnoreForDuplicateCustomer, with the columns cid1 and cid2. Both columns are foreign keys that reference Customer(cid). A row in this table states that the customer with cid1 is not the same person as the customer with cid2.

For example, suppose I have

  • a Customer entry with cid = 1, firstname = "foo", lastname = "anonymous", and zip = "11231"
  • and another Customer entry with cid = 2, firstname = "foo", lastname = "anonymous", and zip = "11231".

My SQL query should search for customers that have the same firstname, lastname, and zip, and then detect that the customer with cid = 1 is the same as the customer with cid = 2.

However, it should be possible to declare that the customers with cid = 1 and cid = 2 are not the same by storing a new entry in the IgnoreForDuplicateCustomer table with cid1 = 1 and cid2 = 2.

Detecting the duplicate customers works well with this SQL query:

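    -- Note: selecting cid without grouping or aggregating it only works on engines
    -- that allow it (for example MySQL without ONLY_FULL_GROUP_BY); the cid shown
    -- for each group is then an arbitrary member of that group.
    SELECT cid, firstname, lastname, zip, COUNT(*) AS NumOccurrences
    FROM Customer
    GROUP BY firstname, lastname, zip
    HAVING COUNT(*) > 1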

My problem is that I am not able to integrate the IgnoreForDuplicateCustomer table so that, as in my previous example, the customers with cid = 1 and cid = 2 are no longer reported as the same, since there is an entry/rule for them in the IgnoreForDuplicateCustomer table.

So I tried to extend my previous query by adding a WHERE clause:

    SELECT cid, firstname, lastname, COUNT(*) AS NumOccurrences
    FROM Customer
    WHERE cid NOT IN (
        SELECT cid1 FROM IgnoreForDuplicateCustomer WHERE cid2 = cid
        UNION
        SELECT cid2 FROM IgnoreForDuplicateCustomer WHERE cid1 = cid
    )
    GROUP BY firstname, lastname, zip
    HAVING COUNT(*) > 1

Unfortunately this additional WHERE clause has absolutely no impact on my result.
Any suggestions?

1 Answer

  1. Editorial Team
    Added an answer on May 30, 2026 at 1:01 pm

    Here you are:

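    -- Candidate duplicate pairs: same name and ZIP, each pair listed once (cid1 < cid2)
    SELECT a.*
    FROM (
      SELECT c1.cid AS cid1, c2.cid AS cid2
      FROM Customer c1
      JOIN Customer c2
        ON c1.firstname = c2.firstname
       AND c1.lastname = c2.lastname
       AND c1.zip = c2.zip
       AND c1.cid < c2.cid
    ) a
    -- Ignore rules in both orientations, so a rule (1, 2) also covers (2, 1)
    LEFT JOIN (
      SELECT cid1, cid2 FROM ignoreforduplicatecustomer
      UNION
      SELECT cid2 AS cid1, cid1 AS cid2 FROM ignoreforduplicatecustomer
    ) b ON a.cid1 = b.cid1 AND a.cid2 = b.cid2
    -- Keep only the pairs that have no matching ignore rule
    WHERE b.cid1 IS NULL;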
    

    This returns the ID pairs of duplicate records in the customer table that are not listed in the ignoreforduplicatecustomer table.
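
For context, the WHERE clause tried in the question filters single rows rather than pairs. For each customer, the subquery returns the other cid of any ignore rule, and a customer's own cid never equals its partner's cid, so the NOT IN test is true for every row. The sketch below illustrates this under some assumptions: SQLite (through Python's sqlite3 module) stands in for MySQL, the sample rows are the ones from this answer's test setup, the cid column is left out of the SELECT list, and the correlated column is written as c.cid for readability.

```python
import sqlite3

# Assumption: SQLite (via sqlite3) stands in for the MySQL setup used in this answer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (cid INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT, zip TEXT);
    CREATE TABLE IgnoreForDuplicateCustomer (cid1 INTEGER, cid2 INTEGER);
    INSERT INTO Customer VALUES
        (1, 'John', 'Smith', '1234'), (2, 'John', 'Smith', '1234'),
        (3, 'John', 'Smith', '1234'), (4, 'Jane', 'Doe', '1234');
    INSERT INTO IgnoreForDuplicateCustomer VALUES (1, 2);
""")

# Grouped duplicate detection from the question, with a slot for an optional filter.
GROUPED = """
    SELECT firstname, lastname, zip, COUNT(*) AS NumOccurrences
    FROM Customer c
    {where}
    GROUP BY firstname, lastname, zip
    HAVING COUNT(*) > 1
"""

# The filter from the question: compares a customer's cid with its partner's cid.
QUESTION_FILTER = """
    WHERE c.cid NOT IN (
        SELECT i.cid1 FROM IgnoreForDuplicateCustomer i WHERE i.cid2 = c.cid
        UNION
        SELECT i.cid2 FROM IgnoreForDuplicateCustomer i WHERE i.cid1 = c.cid
    )
"""

without_filter = conn.execute(GROUPED.format(where="")).fetchall()
with_filter = conn.execute(GROUPED.format(where=QUESTION_FILTER)).fetchall()

# The filter never removes a row, so both result lists are identical.
print(without_filter)
print(with_filter)
```

Both lists contain the single John Smith group with a count of 3, which matches the behaviour described in the question. An ignore rule describes a pair, so the exclusion has to be applied to pairs of customers, which is what the self-join above does.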

    Tested with:

    CREATE TABLE IF NOT EXISTS `customer` (
     `CID` int(11) NOT NULL AUTO_INCREMENT,
     `Firstname` varchar(50) NOT NULL,
     `Lastname` varchar(50) NOT NULL,
     `ZIP` varchar(10) NOT NULL,
     PRIMARY KEY (`CID`)) 
    ENGINE=InnoDB  DEFAULT CHARSET=latin1 AUTO_INCREMENT=100 ;
    
    INSERT INTO `customer` (`CID`, `Firstname`, `Lastname`, `ZIP`) VALUES
    (1, 'John', 'Smith', '1234'),
    (2, 'John', 'Smith', '1234'),
    (3, 'John', 'Smith', '1234'),
    (4, 'Jane', 'Doe', '1234');
    

    And:

    CREATE TABLE IF NOT EXISTS `ignoreforduplicatecustomer` (
     `CID1` int(11) NOT NULL,
     `CID2` int(11) NOT NULL
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
    
    
    INSERT INTO `ignoreforduplicatecustomer` (`CID1`, `CID2`) VALUES
    (1, 2);
    

    Results for my test setup are:

    CID1  CID2
     1     3
     2     3
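
The result above can be cross-checked without a MySQL server. The sketch below is only an illustration under stated assumptions: it uses SQLite through Python's sqlite3 module instead of MySQL, simplifies the column types, and writes the exclusion as NOT EXISTS rather than the LEFT JOIN ... IS NULL form, which is an equivalent alternative and not the query that was tested above.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; names follow the test setup above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (cid INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT, zip TEXT);
    CREATE TABLE ignoreforduplicatecustomer (cid1 INTEGER NOT NULL, cid2 INTEGER NOT NULL);
    INSERT INTO customer VALUES
        (1, 'John', 'Smith', '1234'), (2, 'John', 'Smith', '1234'),
        (3, 'John', 'Smith', '1234'), (4, 'Jane', 'Doe', '1234');
    INSERT INTO ignoreforduplicatecustomer VALUES (1, 2);
""")

pairs = conn.execute("""
    SELECT c1.cid AS cid1, c2.cid AS cid2
    FROM customer c1
    JOIN customer c2
      ON c1.firstname = c2.firstname
     AND c1.lastname = c2.lastname
     AND c1.zip = c2.zip
     AND c1.cid < c2.cid              -- list each pair once
    WHERE NOT EXISTS (                -- drop pairs covered by an ignore rule
        SELECT 1
        FROM ignoreforduplicatecustomer i
        WHERE (i.cid1 = c1.cid AND i.cid2 = c2.cid)
           OR (i.cid1 = c2.cid AND i.cid2 = c1.cid)   -- rule stored in either order
    )
    ORDER BY cid1, cid2
""").fetchall()

print(pairs)
```

With the sample rows, the pair (1, 2) is suppressed by the ignore rule and the pairs (1, 3) and (2, 3) remain, as in the table above. Checking both orientations inside NOT EXISTS plays the same role as the UNION in the original query.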
    

© 2021 The Archive Base. All Rights Reserved