I’m using ASP.NET 4 and SQL Server 2008 R2.
I would like to know whether there is any class or tool, in the database or in the .NET Framework, for calculating the similarity between two string values.
What I need is a percentage indicating how similar the two strings are, so that
I can execute some logic based on it (for example, rejecting the user’s input if it is too similar to data already present in my system).
Any ideas? Thanks
PS: please comment if you need more information or if my question is not appropriate.
There is fuzzy comparison in SQL (in SQL Server, the built-in SOUNDEX and DIFFERENCE functions), but it’s not great. Instead, use the Levenshtein algorithm, for which implementations exist in both SQL and C#.
http://en.wikipedia.org/wiki/Levenshtein_distance
Or use a similar approach; the Wikipedia page has a lot of information.
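
To get a percentage out of the Levenshtein distance, one common approach is to divide the distance by the length of the longer string. Below is a minimal sketch of that idea, written in Python only to keep it short; the same loops translate almost line for line to C#. The function names and the 80% threshold in the usage note are my own assumptions for illustration, not part of any framework.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the number of single-character insertions, deletions,
    or substitutions needed to turn a into b."""
    # Keep the shorter string in b so each row is as small as possible.
    if len(a) < len(b):
        a, b = b, a

    # previous[j] = distance between the first i-1 chars of a and first j chars of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning i chars into an empty prefix takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def similarity_percent(a: str, b: str) -> float:
    """Hypothetical helper: scale the distance to a 0-100 similarity score."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0  # two empty strings are treated as identical
    return 100.0 * (1 - levenshtein(a, b) / longest)
```

You could then reject input with something like `if similarity_percent(new_value, existing_value) >= 80: ...`, where 80 is an arbitrary cut-off you would need to tune for your data. You will probably also want to normalise case and whitespace before comparing.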