I’m trying to find duplicate customers in a table that looks like this: customer_id

Asked: June 1, 2026

I’m trying to find duplicate customers in a table that looks like this:

customer_id | first_name | last_name 
-------------------------------------
          0 | Rich       | Smith
          1 | Paul       | Jones
          2 | Richard    | Smith
          3 | Jimmy      | Roberts

In this situation, I need a query that returns both customer_id 0 and customer_id 2. The query needs to find matches where a customer may have shortened their name, such as Rich instead of Richard, or Rob instead of Robert.

I have this query but it’s only returning ONE (not both) of the matches. I need both Rich & Richard returned by the query.

select distinct customers.customer_id, concat(customers.first_name,' ',customers.last_name) as name from customers
inner join customers dup on customers.last_name = dup.last_name
where (dup.first_name like concat('%', customers.first_name, '%')
and dup.customer_id <> customers.customer_id )
order by name

Can someone please point me in the right direction?

Per @tsOverflow , this is the final query that solved my problem:

select distinct customers.customer_id, concat(customers.first_name,' ',customers.last_name) as name 
from customers
    inner join customers dup on customers.last_name = dup.last_name
where ((dup.first_name like concat('%', customers.first_name, '%') 
            OR (customers.first_name like concat('%', dup.first_name, '%')) 
        )
    and dup.customer_id <> customers.customer_id )
order by name

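The above solution may have performance issues on large tables, since a LIKE pattern that starts with a wildcard generally cannot use an index on first_name.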


1 Answer

Editorial Team · Answer 1 · 2026-06-01T16:54:21+00:00

The problem is that Rich is a substring of Richard, but Richard is not a substring of Rich, so the LIKE test only succeeds for the row holding the shorter name.

This checks both directions:

select distinct randomtest.customer_id, concat(randomtest.first_name,' ',randomtest.last_name) as name 
from randomtest
    inner join randomtest dup on randomtest.last_name = dup.last_name
where ((dup.first_name like concat('%', randomtest.first_name, '%') 
            OR (randomtest.first_name like concat('%', dup.first_name, '%')) 
        )
    and dup.customer_id <> randomtest.customer_id )
order by name

I added the OR so that the LIKE check also runs the other way around.
Note that using LIKE in a query has performance implications. I am not an expert in this; it is just a thought.
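
To see the asymmetry concretely, here is a minimal sketch that loads the four sample rows from the question into an in-memory SQLite database and runs both versions of the check. It is only an illustration under a few assumptions: it uses Python's sqlite3 module rather than the asker's database, SQLite's `||` operator stands in for concat(), and the names ONE_WAY and BOTH_WAYS are mine.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "customer_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        (0, "Rich", "Smith"),
        (1, "Paul", "Jones"),
        (2, "Richard", "Smith"),
        (3, "Jimmy", "Roberts"),
    ],
)

# Original check: only asks whether c.first_name occurs inside dup.first_name.
# SQLite uses || for concatenation where the question uses concat().
ONE_WAY = """
SELECT DISTINCT c.customer_id
FROM customers c
JOIN customers dup ON c.last_name = dup.last_name
WHERE dup.first_name LIKE '%' || c.first_name || '%'
  AND dup.customer_id <> c.customer_id
ORDER BY c.customer_id
"""

# Fixed check: the substring test is applied in both directions.
BOTH_WAYS = """
SELECT DISTINCT c.customer_id
FROM customers c
JOIN customers dup ON c.last_name = dup.last_name
WHERE (dup.first_name LIKE '%' || c.first_name || '%'
       OR c.first_name LIKE '%' || dup.first_name || '%')
  AND dup.customer_id <> c.customer_id
ORDER BY c.customer_id
"""

one_way_ids = [row[0] for row in conn.execute(ONE_WAY)]
both_ways_ids = [row[0] for row in conn.execute(BOTH_WAYS)]

print(one_way_ids)    # [0]     only Rich is reported
print(both_ways_ids)  # [0, 2]  Rich and Richard are both reported
```

The one-way query finds only customer 0 because 'Richard' contains 'Rich', while 'Rich' does not contain 'Richard'.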

EDIT:
As others mentioned in the comments, this will only catch cases where the shortened version is really just a substring. It won't catch cases such as Michael -> Mike or William -> Bill, and on the other hand John and someone named Johnson might be two totally different people.
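
One possible way around both limitations is to replace the substring test with a nickname lookup, so names are compared by equality after being mapped to a canonical form. The sketch below is only an assumption about how that could look: the `nicknames` table, its columns, and the extra customer rows (ids 4 to 7) are hypothetical and not part of the original question, and it again runs on SQLite through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
    -- Hypothetical lookup table: short form -> full name it stands for.
    CREATE TABLE nicknames (nickname TEXT PRIMARY KEY, canonical TEXT NOT NULL);
    """
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        (0, "Rich", "Smith"),
        (1, "Paul", "Jones"),
        (2, "Richard", "Smith"),
        (3, "Jimmy", "Roberts"),
        # Rows below are invented for illustration only.
        (4, "Mike", "Brown"),
        (5, "Michael", "Brown"),
        (6, "John", "Lee"),
        (7, "Johnson", "Lee"),
    ],
)
conn.executemany(
    "INSERT INTO nicknames VALUES (?, ?)",
    [("Rich", "Richard"), ("Mike", "Michael"), ("Bill", "William")],
)

# Map each first name to its canonical form (or itself if no nickname entry
# exists), then self-join on exact equality instead of LIKE '%...%'.
NICKNAME_DUPLICATES = """
WITH canon AS (
    SELECT c.customer_id,
           c.last_name,
           COALESCE(n.canonical, c.first_name) AS canonical_first
    FROM customers c
    LEFT JOIN nicknames n ON n.nickname = c.first_name
)
SELECT DISTINCT a.customer_id
FROM canon a
JOIN canon b
  ON a.last_name = b.last_name
 AND a.canonical_first = b.canonical_first
 AND a.customer_id <> b.customer_id
ORDER BY a.customer_id
"""

duplicate_ids = [row[0] for row in conn.execute(NICKNAME_DUPLICATES)]
print(duplicate_ids)  # [0, 2, 4, 5]
```

With this data, Mike and Michael Brown are flagged even though neither name is a substring of the other, while John and Johnson Lee are not. The trade-off is that the lookup table has to be maintained, and any nickname missing from it goes undetected.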
