The Archive Base

Editorial Team
Asked: May 24, 2026


I was given the beautiful task 😉 of designing some tables in a MySQL database that should hold human names.

Criteria:

  1. I have only the full names. (There is no separation into, e.g., first name, surname and so on.)
  2. The storage should be diacritic-sensitive. (The following names stand for different persons.)

    • “Voss” and “Voß”.
    • “Joel” and “Joël”.
    • “franc” and “Franc” and “Fránc”.
  3. A search should return all names similar to the search string. E.g., a search for “franc” should return [“franc”, “Franc”, “Fránc”] and so on. (It would be awesome if the search returned not only the diacritic-insensitive matches but also similar-sounding names or names that partially match the search string.)

I thought of using the collation utf8_bin for the column (declared as unique) in which I will store the names. This would satisfy point 2, but it would hurt point 3. Declaring the name column as unique with the collation utf8_unicode_ci satisfies point 3, but it hurts point 2.
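To make the conflict concrete, here is a minimal Python sketch of one possible way to keep both behaviours side by side: the exact spelling is stored unchanged, and a separate folded key is used only for searching. The names `search_key` and `NameStore` are hypothetical, and the in-memory dictionary merely stands in for what might be a second indexed column in MySQL; that mapping is an assumption, not something tested against a real schema.

```python
import unicodedata
from collections import defaultdict


def search_key(name):
    """Fold case and diacritics, e.g. 'Fránc' -> 'franc', 'Voß' -> 'voss'.

    Hypothetical helper: NFKD splits accented letters into a base letter
    plus combining marks, the marks are dropped, and casefold() handles
    case (including 'ß' -> 'ss').
    """
    decomposed = unicodedata.normalize("NFKD", name)
    without_marks = "".join(
        c for c in decomposed if not unicodedata.combining(c)
    )
    return without_marks.casefold()


class NameStore:
    """Toy stand-in for a table with an exact column and a search column."""

    def __init__(self):
        # folded key -> set of exact spellings (the set keeps them distinct)
        self._by_key = defaultdict(set)

    def add(self, name):
        self._by_key[search_key(name)].add(name)

    def search(self, text):
        return sorted(self._by_key.get(search_key(text), ()))


store = NameStore()
for exact in ["Voss", "Voß", "Joel", "Joël", "franc", "Franc", "Fránc"]:
    store.add(exact)

print(store.search("franc"))  # all three spellings of "franc"
print(store.search("Voss"))   # both "Voss" and "Voß"
```

Every exact spelling stays a separate entry (point 2), while a lookup through the folded key returns all of its variants (point 3).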

So my question is: is there a way to solve this task while respecting all criteria? And since I don’t want to reinvent the wheel: is there an elegant way to handle human names (and searches on them) in databases? (Sadly, I cannot split the names into first names, surnames and optional middle names.)

Edit:

The number of names is around a million (~1,000,000 entries). And if it matters: I am using Python as the scripting language to populate the database and query the data later on.

1 Answer

  1. Editorial Team
    Added an answer on May 24, 2026 at 3:00 am

    It helps if you can decompose the full name into component “name words” and store a phonetic encoding (Metaphone or one of the many other choices) for each of them. You only need the notion of name words, without categorizing each one as first, middle or last name, which is fine because those categories don’t work well across cultures anyway. You can still use positional order later in ranking if you want, so that a search for “Paul Carl” matches “Paul Karl” better than “Carl Paul”. Be aware of ambiguous punctuation, which may require storing multiple versions of some name words. For instance, Bre-Anna Heim would be broken into the name words “bre”, “anna”, “breanna” and “heim”. Sometimes the dash is irrelevant, as in “Bre-Anna”, and sometimes it is not, as in “Sally-June”: Bre-Anna never goes by just Bre or Anna, but Sally-June may sometimes go by just Sally or just June. It’s hard to know which case applies, so cover both possibilities.
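The decomposition described above could look roughly like the following Python sketch. Everything named here (`fold`, `name_words`, `phonetic_key`) is hypothetical, and the phonetic key is a deliberately coarse Soundex-like stand-in written for illustration, not Metaphone; a real system would more likely use an established phonetic algorithm from a library.

```python
import re
import unicodedata

# Coarse sound classes (an assumption for this sketch, loosely after Soundex).
# Vowels share the class "0"; h and w are ignored entirely.
CODES = {
    ch: digit
    for letters, digit in (
        ("aeiouy", "0"), ("bfpv", "1"), ("cgjkqsxz", "2"),
        ("dt", "3"), ("l", "4"), ("mn", "5"), ("r", "6"),
    )
    for ch in letters
}


def fold(text):
    """Lower-case the text and strip diacritics."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(
        c for c in decomposed if not unicodedata.combining(c)
    ).casefold()


def name_words(full_name):
    """Split a full name into name words, covering both readings of a dash."""
    words = []
    for token in fold(full_name).split():
        # Split on ASCII hyphen and the common Unicode dashes.
        parts = [p for p in re.split(r"[-\u2010-\u2015]", token) if p]
        words.extend(parts)
        if len(parts) > 1:
            # Also keep the joined form, e.g. "bre" + "anna" -> "breanna".
            words.append("".join(parts))
    return words


def phonetic_key(word):
    """Map a folded word to its sequence of sound classes, collapsing repeats."""
    key, prev = "", ""
    for ch in word:
        code = CODES.get(ch, "")
        if not code:
            continue  # h, w and non-letters are skipped
        if code != prev:
            key += code
        prev = code
    return key


print(name_words("Bre-Anna Heim"))  # ['bre', 'anna', 'breanna', 'heim']
print(phonetic_key("carl"), phonetic_key("karl"))  # 2064 2064
```

Each name word and its key would then be stored in its own row next to the id of the full name, together with its position if order should matter for ranking.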

    You can write your query against this by similarly decomposing and phonetically encoding the full name you’re searching for. Your query can return, say, those full names that have two or more phonetic matches among their component name words (or one if there is only one name word in the search or the source). This gives you a subset of full names to consider further. You could come up with a simple ranking of them, or even run a distance matching algorithm on this subset, which would be too computationally expensive to run against the entire million names. By distance matching, I mean edit-distance algorithms such as Levenshtein distance.

    (edit) The reasoning for this is handling cases like the following name: Maria de los Angeles Gomez-Rodriguez. One data entry person may just enter Maria Gomez. Another might enter Maria Gomez Rodriguez. Yet another might enter Maria Angeles Rodrigus.
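As a rough end-to-end illustration of the query side, the sketch below builds an in-memory index from phonetic keys to names, selects candidates by the "two or more matches, or one if either side has a single name word" rule, and ranks the candidates by Levenshtein distance. All function names are hypothetical, the phonetic key is the same coarse Soundex-like stand-in rather than Metaphone, and the dictionary index stands in for a database table; treat it as a sketch under those assumptions.

```python
import re
import unicodedata
from collections import defaultdict

# Coarse sound classes (assumption for this sketch); h and w are ignored.
CODES = {
    ch: d
    for letters, d in (("aeiouy", "0"), ("bfpv", "1"), ("cgjkqsxz", "2"),
                       ("dt", "3"), ("l", "4"), ("mn", "5"), ("r", "6"))
    for ch in letters
}


def fold(text):
    """Lower-case the text and strip diacritics."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(c for c in decomposed
                   if not unicodedata.combining(c)).casefold()


def phonetic_key(word):
    """Sequence of sound classes with consecutive repeats collapsed."""
    key, prev = "", ""
    for ch in word:
        code = CODES.get(ch, "")
        if code and code != prev:
            key += code
        prev = code or prev
    return key


def name_keys(full_name):
    """Phonetic keys of all name words, including joined dash variants."""
    keys = set()
    for token in fold(full_name).split():
        parts = [p for p in re.split(r"[-\u2010-\u2015]", token) if p]
        if len(parts) > 1:
            parts.append("".join(parts))
        keys.update(phonetic_key(p) for p in parts)
    keys.discard("")
    return keys


def levenshtein(a, b):
    """Classic dynamic-programming edit distance, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1,
                           prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def build_index(names):
    """Map each phonetic key to the positions of the names containing it."""
    index = defaultdict(set)
    for name_id, name in enumerate(names):
        for key in name_keys(name):
            index[key].add(name_id)
    return index


def search(query, names, index):
    """Return candidate names, closest to the query first."""
    query_keys = name_keys(query)
    hits = defaultdict(int)
    for key in query_keys:
        for name_id in index.get(key, ()):
            hits[name_id] += 1
    candidates = []
    for name_id, count in hits.items():
        source_keys = name_keys(names[name_id])
        # One match suffices if either side has only one name word.
        needed = 1 if min(len(query_keys), len(source_keys)) <= 1 else 2
        if count >= needed:
            candidates.append(names[name_id])
    return sorted(candidates, key=lambda n: levenshtein(fold(query), fold(n)))


NAMES = ["Maria de los Angeles Gomez-Rodriguez", "Maria Gomez",
         "Paul Karl", "Carl Paul"]
INDEX = build_index(NAMES)
print(search("Maria Angeles Rodrigus", NAMES, INDEX))
print(search("Paul Carl", NAMES, INDEX))
```

With this toy data, the misspelled “Maria Angeles Rodrigus” still finds the full Gomez-Rodriguez entry, and “Paul Carl” ranks “Paul Karl” ahead of “Carl Paul” because the edit distance is computed only on the small candidate subset.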
