The Archive Base

Editorial Team
Asked: May 10, 2026


Is there any simple algorithm to determine the likelihood that two names represent the same person?

I’m not asking for something at the level a Customs department might be using. Just a simple algorithm that would tell me whether ‘James T. Clark’ is most likely the same name as ‘J. Thomas Clark’ or ‘James Clerk’.

If there is an algorithm in C#, that would be great, but I can translate from any language.

1 Answer

  1. Added an answer on May 10, 2026 at 3:16 pm

    I faced a similar problem and tried Levenshtein distance first, but it did not work well for me. I came up with an algorithm that gives a ‘similarity’ value between two strings (a higher value means more similar strings; ‘1’ means identical strings). The value is not very meaningful by itself (if it is not ‘1’, it is always 0.5 or less), but it works quite well when you add the Hungarian algorithm to find matching pairs from two lists of strings.

    Use like this:

    PartialStringComparer cmp = new PartialStringComparer();
    tbResult.Text = cmp.Compare(textBox1.Text, textBox2.Text).ToString();

    The code behind:

    using System;
    using System.Collections.Generic;

    // A slice of MasterString running from Start up to (not including) End.
    public class SubstringRange {
        string masterString;

        public string MasterString {
            get { return masterString; }
            set { masterString = value; }
        }

        int start;

        public int Start {
            get { return start; }
            set { start = value; }
        }

        int end;

        public int End {
            get { return end; }
            set { end = value; }
        }

        public int Length {
            get { return End - Start; }
            set { End = Start + value; }
        }

        public bool IsValid {
            get { return MasterString.Length >= End && End >= Start && Start >= 0; }
        }

        public string Contents {
            get {
                if(IsValid) {
                    return MasterString.Substring(Start, Length);
                } else {
                    return "";
                }
            }
        }

        public bool OverlapsRange(SubstringRange range) {
            return !(End < range.Start || Start > range.End);
        }

        public bool ContainsRange(SubstringRange range) {
            return range.Start >= Start && range.End <= End;
        }

        // Grows the range if newContents matches at Start and is longer than the current range.
        public bool ExpandTo(string newContents) {
            if(MasterString.Substring(Start).StartsWith(newContents, StringComparison.InvariantCultureIgnoreCase) && newContents.Length > Length) {
                Length = newContents.Length;
                return true;
            } else {
                return false;
            }
        }
    }

    // The ranges of MasterString that also occur in the string being compared.
    public class SubstringRangeList: List<SubstringRange> {
        string masterString;

        public string MasterString {
            get { return masterString; }
            set { masterString = value; }
        }

        public SubstringRangeList(string masterString) {
            this.MasterString = masterString;
        }

        public SubstringRange FindString(string s) {
            foreach(SubstringRange r in this) {
                if(r.Contents.Equals(s, StringComparison.InvariantCultureIgnoreCase))
                    return r;
            }
            return null;
        }

        public SubstringRange FindSubstring(string s) {
            foreach(SubstringRange r in this) {
                if(r.Contents.StartsWith(s, StringComparison.InvariantCultureIgnoreCase))
                    return r;
            }
            return null;
        }

        public bool ContainsRange(SubstringRange range) {
            foreach(SubstringRange r in this) {
                if(r.ContainsRange(range))
                    return true;
            }
            return false;
        }

        // Expands existing ranges with substring, or records each new occurrence of it in MasterString.
        public bool AddSubstring(string substring) {
            bool result = false;
            foreach(SubstringRange r in this) {
                if(r.ExpandTo(substring)) {
                    result = true;
                }
            }
            if(FindSubstring(substring) == null) {
                bool patternfound = true;
                int start = 0;
                while(patternfound) {
                    patternfound = false;
                    start = MasterString.IndexOf(substring, start, StringComparison.InvariantCultureIgnoreCase);
                    patternfound = start != -1;
                    if(patternfound) {
                        SubstringRange r = new SubstringRange();
                        r.MasterString = this.MasterString;
                        r.Start = start++;
                        r.Length = substring.Length;
                        if(!ContainsRange(r)) {
                            this.Add(r);
                            result = true;
                        }
                    }
                }
            }
            return result;
        }

        private static bool SubstringRangeMoreThanOneChar(SubstringRange range) {
            return range.Length > 1;
        }

        // Average length of the multi-character ranges, relative to the length of MasterString.
        public float Weight {
            get {
                if(MasterString.Length == 0 || Count == 0)
                    return 0;
                float numerator = 0;
                int denominator = 0;
                foreach(SubstringRange r in this.FindAll(SubstringRangeMoreThanOneChar)) {
                    numerator += r.Length;
                    denominator++;
                }
                if(denominator == 0)
                    return 0;
                return numerator / denominator / MasterString.Length;
            }
        }

        public void RemoveOverlappingRanges() {
            SubstringRangeList l = new SubstringRangeList(this.MasterString);
            l.AddRange(this);//create a copy of this list
            foreach(SubstringRange r in l) {
                if(this.Contains(r) && this.ContainsRange(r)) {
                    Remove(r);//try to remove the range
                    if(!ContainsRange(r)) {//see if the list still contains 'superset' of this range
                        Add(r);//if not, add it back
                    }
                }
            }
        }

        // Feeds every substring of s into the list, stopping each run at the first part that adds nothing.
        public void AddStringToCompare(string s) {
            for(int start = 0; start < s.Length; start++) {
                for(int len = 1; start + len <= s.Length; len++) {
                    string part = s.Substring(start, len);
                    if(!AddSubstring(part))
                        break;
                }
            }
            RemoveOverlappingRanges();
        }
    }

    public class PartialStringComparer {
        // Averages the weights computed in both directions, so the result is symmetric.
        public float Compare(string s1, string s2) {
            SubstringRangeList srl1 = new SubstringRangeList(s1);
            srl1.AddStringToCompare(s2);
            SubstringRangeList srl2 = new SubstringRangeList(s2);
            srl2.AddStringToCompare(s1);
            return (srl1.Weight + srl2.Weight) / 2;
        }
    }
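    The pairing of two name lists can be sketched without a full Hungarian algorithm. The block below is only an illustration and deliberately swaps in a simpler greedy matching: it scores every pair, then keeps taking the best-scoring pair whose two names are both still unused. Unlike the Hungarian algorithm, it is not guaranteed to find the best overall assignment. `GreedyNameMatcher`, `Match` and the `minScore` cut-off are hypothetical names, not part of the code above; the scorer is passed in as a delegate, so any function returning a higher value for more similar strings can be used.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GreedyNameMatcher
{
    // Pairs each name in 'left' with at most one name in 'right'.
    // Greedy, so the result may differ from an optimal (Hungarian) assignment.
    public static List<Tuple<string, string, float>> Match(
        IList<string> left, IList<string> right,
        Func<string, string, float> similarity, float minScore)
    {
        // Score every combination once.
        var candidates = new List<Tuple<int, int, float>>();
        for (int i = 0; i < left.Count; i++)
            for (int j = 0; j < right.Count; j++)
                candidates.Add(Tuple.Create(i, j, similarity(left[i], right[j])));

        var usedLeft = new HashSet<int>();
        var usedRight = new HashSet<int>();
        var result = new List<Tuple<string, string, float>>();

        // Walk the pairs from best to worst score.
        foreach (var c in candidates.OrderByDescending(x => x.Item3))
        {
            if (c.Item3 < minScore) break; // everything after this scores lower still
            if (usedLeft.Contains(c.Item1) || usedRight.Contains(c.Item2)) continue;
            usedLeft.Add(c.Item1);
            usedRight.Add(c.Item2);
            result.Add(Tuple.Create(left[c.Item1], right[c.Item2], c.Item3));
        }
        return result;
    }
}
```

    With the classes above in scope, the scorer could be passed as `new PartialStringComparer().Compare`. A suitable `minScore` depends on the data and has to be found by experiment.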

    The Levenshtein distance version is much simpler than PartialStringComparer (adapted from http://www.merriampark.com/ld.htm):

    public class Distance {
        /// <summary>
        /// Compute Levenshtein distance
        /// </summary>
        /// <param name="s">String 1</param>
        /// <param name="t">String 2</param>
        /// <returns>Distance between the two strings.
        /// The larger the number, the bigger the difference.
        /// </returns>
        public static int LD(string s, string t) {
            int n = s.Length; //length of s
            int m = t.Length; //length of t
            int[,] d = new int[n + 1, m + 1]; // matrix
            int cost; // cost
            // Step 1
            if(n == 0) return m;
            if(m == 0) return n;
            // Step 2
            for(int i = 0; i <= n; d[i, 0] = i++) ;
            for(int j = 0; j <= m; d[0, j] = j++) ;
            // Step 3
            for(int i = 1; i <= n; i++) {
                //Step 4
                for(int j = 1; j <= m; j++) {
                    // Step 5
                    cost = (t.Substring(j - 1, 1) == s.Substring(i - 1, 1) ? 0 : 1);
                    // Step 6
                    d[i, j] = System.Math.Min(System.Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1), d[i - 1, j - 1] + cost);
                }
            }
            // Step 7
            return d[n, m];
        }
    }
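    A raw edit distance grows with string length, which makes it hard to compare across names of different sizes. One common way to turn it into a score between 0 and 1, sketched here as an assumption and not as part of the answer's code, is to divide the distance by the length of the longer string. `NameSimilarity`, `Distance` and `Similarity` are hypothetical names; the block carries its own compact Levenshtein (two rolling rows instead of a full matrix) so that it runs on its own, and the trimming and case folding are illustrative choices.

```csharp
using System;

public static class NameSimilarity
{
    // Levenshtein distance, keeping only the previous and current row.
    public static int Distance(string s, string t)
    {
        int[] prev = new int[t.Length + 1];
        int[] curr = new int[t.Length + 1];
        for (int j = 0; j <= t.Length; j++) prev[j] = j;

        for (int i = 1; i <= s.Length; i++)
        {
            curr[0] = i;
            for (int j = 1; j <= t.Length; j++)
            {
                int cost = s[i - 1] == t[j - 1] ? 0 : 1;
                curr[j] = Math.Min(Math.Min(prev[j] + 1, curr[j - 1] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; prev = curr; curr = tmp; // the finished row becomes 'prev'
        }
        return prev[t.Length];
    }

    // 1.0 for identical strings (ignoring case and outer whitespace),
    // 0.0 when the distance equals the length of the longer string.
    public static double Similarity(string a, string b)
    {
        a = a.Trim().ToUpperInvariant();
        b = b.Trim().ToUpperInvariant();
        int longest = Math.Max(a.Length, b.Length);
        if (longest == 0) return 1.0;
        return 1.0 - (double)Distance(a, b) / longest;
    }
}
```

    For example, ‘James Clark’ and ‘James Clerk’ differ by one substitution over 11 characters, so `Similarity` returns 1 - 1/11, roughly 0.91. Abbreviated or reordered name parts such as ‘J. Thomas Clark’ still score poorly with a plain character-level distance, so this alone does not fully answer the question.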