The Archive Base

Editorial Team
Asked: May 10, 2026

Is there any simple algorithm to determine the likeliness of 2 names representing the same person?

I’m not asking for anything at the level a customs department might use, just a simple algorithm that would tell me whether ‘James T. Clark’ is most likely the same name as ‘J. Thomas Clark’ or ‘James Clerk’.

If there is an algorithm in C# that would be great, but I can translate from any language.

1 Answer

  1. Added an answer on May 10, 2026 at 3:16 pm

    I faced a similar problem and tried Levenshtein distance first, but it did not work well for me. I came up with an algorithm that gives a ‘similarity’ value between two strings (a higher value means more similar strings; ‘1’ means identical strings). The value is not very meaningful by itself (if it is not ‘1’, it is always 0.5 or less), but it works quite well when you feed it to the Hungarian algorithm to find matching pairs from two lists of strings.

    Use like this:

    PartialStringComparer cmp = new PartialStringComparer();
    tbResult.Text = cmp.Compare(textBox1.Text, textBox2.Text).ToString();

    The code behind:

    using System;
    using System.Collections.Generic;

    // A slice of MasterString running from Start (inclusive) to End (exclusive).
    public class SubstringRange {
        public string MasterString { get; set; }
        public int Start { get; set; }
        public int End { get; set; }

        public int Length {
            get { return End - Start; }
            set { End = Start + value; }
        }

        public bool IsValid {
            get { return MasterString.Length >= End && End >= Start && Start >= 0; }
        }

        public string Contents {
            get { return IsValid ? MasterString.Substring(Start, Length) : ""; }
        }

        public bool OverlapsRange(SubstringRange range) {
            return !(End < range.Start || Start > range.End);
        }

        public bool ContainsRange(SubstringRange range) {
            return range.Start >= Start && range.End <= End;
        }

        // Grow this range if newContents is a longer match starting at the same position.
        public bool ExpandTo(string newContents) {
            if (MasterString.Substring(Start).StartsWith(newContents, StringComparison.InvariantCultureIgnoreCase)
                    && newContents.Length > Length) {
                Length = newContents.Length;
                return true;
            }
            return false;
        }
    }

    // The ranges of MasterString that also occur in the string being compared.
    public class SubstringRangeList : List<SubstringRange> {
        public string MasterString { get; set; }

        public SubstringRangeList(string masterString) {
            this.MasterString = masterString;
        }

        public SubstringRange FindString(string s) {
            foreach (SubstringRange r in this) {
                if (r.Contents.Equals(s, StringComparison.InvariantCultureIgnoreCase))
                    return r;
            }
            return null;
        }

        public SubstringRange FindSubstring(string s) {
            foreach (SubstringRange r in this) {
                if (r.Contents.StartsWith(s, StringComparison.InvariantCultureIgnoreCase))
                    return r;
            }
            return null;
        }

        public bool ContainsRange(SubstringRange range) {
            foreach (SubstringRange r in this) {
                if (r.ContainsRange(range))
                    return true;
            }
            return false;
        }

        public bool AddSubstring(string substring) {
            bool result = false;
            // First try to extend ranges that are already in the list.
            foreach (SubstringRange r in this) {
                if (r.ExpandTo(substring)) {
                    result = true;
                }
            }
            // Otherwise record every occurrence not already covered by a range.
            if (FindSubstring(substring) == null) {
                int start = 0;
                while (true) {
                    start = MasterString.IndexOf(substring, start, StringComparison.InvariantCultureIgnoreCase);
                    if (start == -1)
                        break;
                    SubstringRange r = new SubstringRange();
                    r.MasterString = this.MasterString;
                    r.Start = start++; // continue the search one character further on
                    r.Length = substring.Length;
                    if (!ContainsRange(r)) {
                        this.Add(r);
                        result = true;
                    }
                }
            }
            return result;
        }

        private static bool SubstringRangeMoreThanOneChar(SubstringRange range) {
            return range.Length > 1;
        }

        // Average length of the multi-character ranges, relative to the length of MasterString.
        public float Weight {
            get {
                if (MasterString.Length == 0 || Count == 0)
                    return 0;
                float numerator = 0;
                int denominator = 0;
                foreach (SubstringRange r in this.FindAll(SubstringRangeMoreThanOneChar)) {
                    numerator += r.Length;
                    denominator++;
                }
                if (denominator == 0)
                    return 0;
                return numerator / denominator / MasterString.Length;
            }
        }

        public void RemoveOverlappingRanges() {
            SubstringRangeList l = new SubstringRangeList(this.MasterString);
            l.AddRange(this); // create a copy of this list
            foreach (SubstringRange r in l) {
                if (this.Contains(r) && this.ContainsRange(r)) {
                    Remove(r); // try to remove the range
                    if (!ContainsRange(r)) { // see if the list still contains a 'superset' of this range
                        Add(r); // if not, add it back
                    }
                }
            }
        }

        public void AddStringToCompare(string s) {
            for (int start = 0; start < s.Length; start++) {
                for (int len = 1; start + len <= s.Length; len++) {
                    string part = s.Substring(start, len);
                    if (!AddSubstring(part))
                        break;
                }
            }
            RemoveOverlappingRanges();
        }
    }

    public class PartialStringComparer {
        // Average of the two one-directional weights, so the result is symmetric.
        public float Compare(string s1, string s2) {
            SubstringRangeList srl1 = new SubstringRangeList(s1);
            srl1.AddStringToCompare(s2);
            SubstringRangeList srl2 = new SubstringRangeList(s2);
            srl2.AddStringToCompare(s1);
            return (srl1.Weight + srl2.Weight) / 2;
        }
    }
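    The pairing step mentioned at the top of this answer is not shown in the original code. As a rough sketch only: for short lists, an exhaustive search over all pairings finds the same highest-scoring assignment the Hungarian algorithm would, just far more slowly (factorial time). The class `NamePairing` and method `BestPairing` are hypothetical names made up for this illustration, and the similarity function is passed in as a delegate so the block stands on its own.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the original answer.
public static class NamePairing {
    // Returns, for each index i of `left`, the index of the `right` entry
    // paired with it so that the summed similarity is as large as possible.
    // Exhaustive search: O(n!) pairings, so only suitable for short lists.
    public static int[] BestPairing(IList<string> left, IList<string> right,
                                    Func<string, string, float> similarity) {
        if (left.Count != right.Count)
            throw new ArgumentException("Lists must have the same length.");
        int n = left.Count;

        // Precompute the similarity of every left/right combination.
        float[,] score = new float[n, n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                score[i, j] = similarity(left[i], right[j]);

        int[] current = new int[n];
        int[] best = new int[n];
        bool[] used = new bool[n];
        float bestTotal = float.NegativeInfinity;
        Search(0, 0f, n, score, used, current, best, ref bestTotal);
        return best;
    }

    // Depth-first enumeration: assign a free column to each row in turn.
    private static void Search(int row, float total, int n, float[,] score,
                               bool[] used, int[] current, int[] best,
                               ref float bestTotal) {
        if (row == n) {
            if (total > bestTotal) {
                bestTotal = total;
                Array.Copy(current, best, n);
            }
            return;
        }
        for (int col = 0; col < n; col++) {
            if (used[col]) continue;
            used[col] = true;
            current[row] = col;
            Search(row + 1, total + score[row, col], n, score, used, current, best, ref bestTotal);
            used[col] = false;
        }
    }
}
```

    With the comparer above, a call would look something like `NamePairing.BestPairing(listA, listB, new PartialStringComparer().Compare)`. For lists of more than roughly ten names, a real Hungarian algorithm implementation is the better choice.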

    A Levenshtein distance implementation is much simpler than PartialStringComparer (adapted from http://www.merriampark.com/ld.htm):

    public class Distance {
        /// <summary>
        /// Compute Levenshtein distance
        /// </summary>
        /// <param name="s">String 1</param>
        /// <param name="t">String 2</param>
        /// <returns>Distance between the two strings.
        /// The larger the number, the bigger the difference.
        /// </returns>
        public static int LD(string s, string t) {
            int n = s.Length; // length of s
            int m = t.Length; // length of t
            int[,] d = new int[n + 1, m + 1]; // matrix
            int cost; // cost
            // Step 1: an empty string is as far away as the other string is long
            if (n == 0) return m;
            if (m == 0) return n;
            // Step 2: initialize the first column and the first row
            for (int i = 0; i <= n; i++) d[i, 0] = i;
            for (int j = 0; j <= m; j++) d[0, j] = j;
            // Step 3: walk over each character of s
            for (int i = 1; i <= n; i++) {
                // Step 4: walk over each character of t
                for (int j = 1; j <= m; j++) {
                    // Step 5: substitution is free when the characters match
                    cost = (t[j - 1] == s[i - 1] ? 0 : 1);
                    // Step 6: cheapest of deletion, insertion and substitution
                    d[i, j] = System.Math.Min(System.Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1), d[i - 1, j - 1] + cost);
                }
            }
            // Step 7
            return d[n, m];
        }
    }
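    A raw edit distance is hard to compare across names of different lengths, so one common option is to scale it into a 0 to 1 similarity. The sketch below is only an illustration under that assumption: `NameSimilarity`, `Levenshtein` and `Similarity` are hypothetical names, the case folding and trimming are my own choices, and the block carries its own compact distance function so it compiles by itself.

```csharp
using System;

// Hypothetical helper, not part of the original answer.
public static class NameSimilarity {
    // Levenshtein distance using two rolling rows instead of a full matrix.
    public static int Levenshtein(string s, string t) {
        int[] prev = new int[t.Length + 1];
        int[] curr = new int[t.Length + 1];
        for (int j = 0; j <= t.Length; j++) prev[j] = j;

        for (int i = 1; i <= s.Length; i++) {
            curr[0] = i;
            for (int j = 1; j <= t.Length; j++) {
                int cost = s[i - 1] == t[j - 1] ? 0 : 1;
                curr[j] = Math.Min(Math.Min(prev[j] + 1, curr[j - 1] + 1), prev[j - 1] + cost);
            }
            // The row just filled becomes the previous row for the next character.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[t.Length];
    }

    // Assumed normalization: 1.0 for identical strings (ignoring case and
    // surrounding whitespace), 0.0 when the distance equals the longer length.
    public static double Similarity(string a, string b) {
        a = a.Trim().ToUpperInvariant();
        b = b.Trim().ToUpperInvariant();
        int longest = Math.Max(a.Length, b.Length);
        if (longest == 0) return 1.0;
        return 1.0 - (double)Levenshtein(a, b) / longest;
    }
}
```

    On the names from the question, this measure rates ‘James Clerk’ as closer to ‘James T. Clark’ than ‘J. Thomas Clark’ is, because expanding or abbreviating a given name costs many single-character edits. That is one way plain edit distance can fall short for names.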
© 2021 The Archive Base. All Rights Reserved