
Question


Asked: May 12, 2026


I want to find possible candidate duplicate records in a large database matching on fields like COMPANYNAME and ADDRESSLINE1

Example:

For a record with the following COMPANYNAME:

“Acme, Inc.”

I would like for my query to spit out other records with these COMPANYNAME values as possible dups:

“Acme Corporation”
“Acme, Incorporated”
“Acme”

I know how to do the joins, correlated subqueries, etc. to handle the mechanics of pulling the set of data I want, and I know that has been covered on here before. I am interested in hearing thoughts on the best way to do the fuzzy searching: should I use full-text indexing, the SOUNDEX function, or something else that I am unaware of? (I am using SQL Server 2005.)

Any help is appreciated!


1 Answer

Editorial Team · Answer 1 · 2026-05-12T22:04:57+00:00


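It depends on your exact requirements, but the full-text CONTAINS predicate lets you run proximity searches (NEAR) as well as prefix-term, inflectional, and thesaurus-based matches (FORMSOF), which covers much of what "fuzzy" searching usually means for names like these. Note that CONTAINS only works on columns that have a full-text index.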

http://www.developer.com/db/article.php/3446891/Understanding-SQL-Server-Full-Text-Indexing.htm

http://msdn.microsoft.com/en-us/library/ms187787(SQL.90).aspx
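
As a rough sketch of how this could look for the "Acme" example, the T-SQL below creates a full-text index and then runs a prefix-term search. The table name dbo.Companies, the key index PK_Companies, and the catalog name CompanyCatalog are hypothetical placeholders, not names from the question; only COMPANYNAME and ADDRESSLINE1 come from it.

```sql
-- Assumption: a hypothetical table dbo.Companies with a unique,
-- single-column, non-nullable key index named PK_Companies.
CREATE FULLTEXT CATALOG CompanyCatalog;
GO

CREATE FULLTEXT INDEX ON dbo.Companies (COMPANYNAME, ADDRESSLINE1)
    KEY INDEX PK_Companies
    ON CompanyCatalog;
GO

-- The index is populated in the background, so the query below may
-- return no rows until population has finished.

-- Prefix term: matches rows where any word in COMPANYNAME starts with "Acme".
DECLARE @search nvarchar(100);
SET @search = N'"Acme*"';

SELECT c.COMPANYNAME, c.ADDRESSLINE1
FROM dbo.Companies AS c
WHERE CONTAINS(c.COMPANYNAME, @search);
```

A prefix term like this should pick up "Acme Corporation", "Acme, Incorporated", and "Acme" as candidates for "Acme, Inc.". It will not catch misspellings of the leading word, so it is probably best treated as a first-pass filter for candidates rather than a complete duplicate check.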
