The Archive Base

Editorial Team
Asked: May 22, 2026

I’m currently writing a script tasked with going through tens of thousands of rows

I'm currently writing a script that goes through tens of thousands of rows of account information, cleans mistyped addresses, and prints a report on how each address was cleaned. The biggest source of unclean addresses is mistyped street names (it's amazing how many ways you can spell a street name).

My script currently takes the input street name, applies a series of edits specific to Norwegian (v. becomes vegen, gt. becomes gata, etc.), and searches for the result in a database of about 2 million addresses. If it doesn't find a match, it splits off the latter half of the street name, replaces it with a wildcard, and tries different variations of the wildcard search.
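The steps described above could be sketched roughly as follows. This is only an illustration under assumptions: the function names are hypothetical, the rule list contains just the two abbreviations mentioned here, abbreviations are assumed to sit at the end of the name, and "variations of the wildcard search" is read as progressively shorter prefixes.

```python
# Abbreviation rules taken from the examples in the question; a real list would be longer.
ABBREVIATIONS = {"v.": "vegen", "gt.": "gata"}


def expand_abbreviations(street):
    """Replace a trailing abbreviation such as 'gt.' with its full form."""
    cleaned = street.strip()
    for abbr, full in ABBREVIATIONS.items():
        if cleaned.lower().endswith(abbr):
            # Drop the abbreviation (and any space before it), then append the full form.
            return cleaned[: -len(abbr)].rstrip() + full
    return cleaned


def wildcard_patterns(street, min_prefix=3):
    """Build LIKE patterns from the longest prefix down to min_prefix characters.

    Assumption: the name contains no literal '%' or '_', which would need
    escaping before being used in a LIKE clause.
    """
    return [street[:n] + "%" for n in range(len(street) - 1, min_prefix - 1, -1)]


if __name__ == "__main__":
    name = expand_abbreviations("Storgt.")
    print(name)
    print(wildcard_patterns(name))
```

Each pattern would then be passed as a bound parameter to a `LIKE` query, stopping at the first pattern that returns rows.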

Anyway, my question is:

Does MySQL include anything that could make this easier for me? I recall hearing about a "search" function in MySQL that finds the cells in a column with the most matching characters, or something along those lines. It would be a great tool to have for the cases where my wildcard search fails.

Anything else that would help with finding matches to mistyped addresses would be great.

1 Answer

  1. Editorial Team
    Added an answer on May 22, 2026 at 8:49 pm

    One option is MySQL's SOUNDEX() function. It encodes a string by how it sounds, so two spellings that are pronounced alike get the same code. That helps when people mistype a street name phonetically. Bear in mind that Soundex was designed around English pronunciation, so results on Norwegian names may be less reliable.
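To show what phonetic matching does, here is a minimal Python sketch of the classic four-character American Soundex. It is an illustration only, not MySQL's implementation: MySQL's SOUNDEX() can return longer codes and may differ in detail, and the function name `soundex` below is just a local helper.

```python
# Consonant groups of classic American Soundex; vowels and other letters carry no code.
CODES = {}
for digit, letters in enumerate(["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"], start=1):
    for ch in letters:
        CODES[ch] = str(digit)


def soundex(word):
    """Return the four-character Soundex code for word ('' if it has no letters)."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters that share a code
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code after it counts again
    return (result + "000")[:4]  # pad with zeros, then cut to four characters


if __name__ == "__main__":
    for name in ("Storgata", "Storrgata", "Kirkeveien", "Kirkevegen"):
        print(name, soundex(name))
```

Misspellings such as a doubled consonant collapse to the same code, which is the property that makes Soundex useful as a cheap first filter.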

    You might also try the Levenshtein distance algorithm, which is probably closer to what you are looking for. It counts the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another, so a small distance suggests a likely typo. It is commonly used for spell checking and should be useful for finding bad data in address fields. Here is a description of it:

    http://www.merriampark.com/ld.htm
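For illustration, here is a minimal Python sketch of the distance calculation and of how it might be used to rank candidate street names. The helper `closest_matches`, its `max_distance` threshold of 2, and the sample names are assumptions made for this example, not part of any library.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions, and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop to save memory
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def closest_matches(street, candidates, max_distance=2):
    """Return (distance, name) pairs within max_distance, nearest first."""
    scored = [(levenshtein(street.lower(), c.lower()), c) for c in candidates]
    return sorted((d, c) for d, c in scored if d <= max_distance)


if __name__ == "__main__":
    known = ["Storgata", "Stortorget", "Kirkevegen"]
    print(closest_matches("Storgtaa", known))
```

Computing the distance against every row of a large table is slow, so it is probably best applied to a short candidate list, for example the rows returned by a loose wildcard query.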

    MySQL does not ship a built-in Levenshtein function, so to use the algorithm inside MySQL you would add it as a stored function. You can look at an example here:

    http://www.artfulsoftware.com/infotree/queries.php#552
