The Archive Base

Editorial Team
Asked: May 10, 2026

Hey, I'm using the Levenshtein algorithm to get the distance between a source string and a target string.

I also have a method that returns a similarity value from 0 to 1:

    /// <summary>
    /// Gets the similarity between two strings.
    /// All similarity scores are in the [0, 1] range;
    /// a score of 1 means the two strings are identical.
    /// </summary>
    /// <param name="s1">The first string.</param>
    /// <param name="s2">The second string.</param>
    /// <returns>The similarity score, from 0 to 1.</returns>
    public static float CalculateSimilarity(String s1, String s2)
    {
        if ((s1 == null) || (s2 == null)) return 0.0f;

        float dis = LevenshteinDistance.Compute(s1, s2);
        float maxLen = s1.Length;
        if (maxLen < s2.Length)
            maxLen = s2.Length;
        if (maxLen == 0.0F)
            return 1.0F;
        else
            return 1.0F - dis / maxLen;
    }

But this is not enough for me, because I need a more sophisticated way to match two sentences.

For example, I want to tag some music automatically. I have the original song names, and I have song files whose names contain junk such as "super", "quality", or years like 2007, 2008, etc. Some files are named just http://trash..thash..song_name_mp3.mp3, while others are normal. I want to create an algorithm that works better than my current one. Maybe someone can help me?
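To make the kind of input concrete, here is a minimal sketch of turning such a file name into lowercase keywords before any matching happens. The class name FileNameTokenizer, the separator set, and the choice to strip only a leading URL scheme and a trailing ".mp3" are assumptions for illustration; they are not part of the code in this question.

```csharp
using System;
using System.Collections.Generic;

public static class FileNameTokenizer
{
    // Assumed separator set for this sketch; the real ignoreCharsList is longer.
    private static readonly char[] Separators =
        { '-', ' ', '.', ',', '/', ':', '_', '(', ')', '[', ']' };

    // Splits a raw file name into distinct lowercase keywords, in order of first appearance.
    public static List<string> Tokenize(string rawName)
    {
        var tokens = new List<string>();
        if (string.IsNullOrWhiteSpace(rawName)) return tokens;

        string name = rawName.Trim().ToLowerInvariant();

        // Drop a leading URL scheme such as "http://".
        int schemeEnd = name.IndexOf("://", StringComparison.Ordinal);
        if (schemeEnd >= 0) name = name.Substring(schemeEnd + 3);

        // Drop a trailing ".mp3" extension (only this extension is handled here).
        if (name.EndsWith(".mp3", StringComparison.Ordinal))
            name = name.Substring(0, name.Length - 4);

        foreach (string token in name.Split(Separators, StringSplitOptions.RemoveEmptyEntries))
        {
            // Keep each keyword once.
            if (!tokens.Contains(token)) tokens.Add(token);
        }
        return tokens;
    }
}
```

With this sketch, Tokenize("http://trash..thash..song_name_mp3.mp3") yields the keywords trash, thash, song, name, mp3. Deciding which of those are junk is a separate step.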

Here is my current algorithm:

    /// <summary>
    /// Determines whether this target keyword should be ignored.
    /// </summary>
    /// <param name="targetString">The target string.</param>
    /// <returns>True if the keyword is an ignore word or matches a special case.</returns>
    private bool doIgnore(String targetString)
    {
        if (String.IsNullOrEmpty(targetString)) return false;

        //* special cases like years (Regex) do not depend on the ignore list, so check them once.
        if (isMatchInSpecialCases(targetString)) return true;

        for (int i = 0; i < ignoreWordsList.Length; ++i)
        {
            //* found an ignore word.
            if (targetString == ignoreWordsList[i]) return true;
        }
        return false;
    }

    /// <summary>
    /// Removes the duplicates. Assumes the list is sorted, so duplicates are adjacent.
    /// </summary>
    /// <param name="list">The list.</param>
    private void removeDuplicates(List<String> list)
    {
        if ((list != null) && (list.Count > 0))
        {
            for (int i = 0; i < list.Count - 1; ++i)
            {
                if (list[i] == list[i + 1])
                {
                    list.RemoveAt(i);
                    --i;
                }
            }
        }
    }

    /// <summary>
    /// Does the fuzzy match.
    /// </summary>
    /// <param name="targetTitle">The target title.</param>
    /// <returns>The best match, or null if no title scores high enough.</returns>
    private TitleMatchResult doFuzzyMatch(String targetTitle)
    {
        TitleMatchResult matchResult = null;
        if (targetTitle != null && targetTitle != String.Empty)
        {
            try
            {
                //* change target title (string) to lower case.
                targetTitle = targetTitle.ToLower();

                //* scores, we will select the highest score at the end.
                Dictionary<Title, float> scores = new Dictionary<Title, float>();

                //* split on special chars: '-', ' ', '.', ',', '?', '/', ':', ';', '%', '(', ')', '#', '\'', '!', '|', '^', '*', '[', ']', '{', '}', '=', '+', '_'
                List<String> targetKeywords = new List<string>(targetTitle.Split(ignoreCharsList, StringSplitOptions.RemoveEmptyEntries));

                //* remove all trash from keywords, like super, quality, etc.
                targetKeywords.RemoveAll(delegate(String x) { return doIgnore(x); });

                //* nothing left to compare once the trash is removed.
                if (targetKeywords.Count == 0) return null;

                //* sort keywords.
                targetKeywords.Sort();

                //* remove duplicates.
                removeDuplicates(targetKeywords);

                //* go through all original titles.
                foreach (Title sourceTitle in titles)
                {
                    float tempScore = 0f;

                    //* split the original title into a keywords list.
                    List<String> sourceKeywords = new List<string>(sourceTitle.Name.Split(ignoreCharsList, StringSplitOptions.RemoveEmptyEntries));
                    sourceKeywords.Sort();
                    removeDuplicates(sourceKeywords);

                    //* go through all source title keywords.
                    foreach (String keyw1 in sourceKeywords)
                    {
                        //* best similarity of this source keyword against any target keyword.
                        float max = float.MinValue;
                        foreach (String keyw2 in targetKeywords)
                        {
                            float currentScore = StringMatching.StringMatching.CalculateSimilarity(keyw1.ToLower(), keyw2);
                            if (currentScore > max)
                            {
                                max = currentScore;
                            }
                        }
                        tempScore += max;
                    }

                    //* calculate average score.
                    float averageScore = (tempScore / Math.Max(targetKeywords.Count, sourceKeywords.Count));

                    //* keep the score if it reaches the minimal score and the target title is not in this source title's ignore list.
                    if (averageScore >= minimalScore && !sourceTitle.doIgnore(targetTitle))
                    {
                        //* add score.
                        scores.Add(sourceTitle, averageScore);
                    }
                }

                //* choose the biggest score.
                float maxi = float.MinValue;
                foreach (KeyValuePair<Title, float> kvp in scores)
                {
                    if (kvp.Value > maxi)
                    {
                        maxi = kvp.Value;
                        matchResult = new TitleMatchResult(maxi, kvp.Key, MatchTechnique.FuzzyLogic);
                    }
                }
            }
            catch { } //* note: this swallows every exception silently.
        }

        //* return result.
        return matchResult;
    }

This works, but only in some cases; a lot of titles that should match do not. I think I need some kind of formula to play with weights and so on, but I can't think of one.
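One possible direction for such a formula, sketched under assumptions: weight each keyword of an original title by how rare it is across all original titles (an inverse-document-frequency style weight), and normalize by the source title's weights only. In the code above the average divides by the larger of the two keyword counts, so junk left over in the target name lowers the score; here leftover junk costs nothing. The names WeightedTitleScore, BuildWeights and Score are hypothetical, the weight formula is just one reasonable choice, and the Levenshtein routine is a textbook version included only so the sketch is self-contained.

```csharp
using System;
using System.Collections.Generic;

public static class WeightedTitleScore
{
    // Textbook dynamic-programming Levenshtein distance using two rolling rows.
    public static int Levenshtein(string a, string b)
    {
        int[] prev = new int[b.Length + 1];
        int[] curr = new int[b.Length + 1];
        for (int j = 0; j <= b.Length; j++) prev[j] = j;
        for (int i = 1; i <= a.Length; i++)
        {
            curr[0] = i;
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                curr[j] = Math.Min(Math.Min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; prev = curr; curr = tmp;
        }
        return prev[b.Length];
    }

    // Same normalization as CalculateSimilarity in the question: 1 - distance / longer length.
    public static double Similarity(string a, string b)
    {
        int maxLen = Math.Max(a.Length, b.Length);
        return maxLen == 0 ? 1.0 : 1.0 - (double)Levenshtein(a, b) / maxLen;
    }

    // Assumed weighting: keywords that occur in fewer catalog titles get a larger weight.
    public static Dictionary<string, double> BuildWeights(IList<string[]> catalog)
    {
        var docCount = new Dictionary<string, int>();
        foreach (string[] title in catalog)
        {
            // Count each keyword once per title.
            foreach (string token in new HashSet<string>(title))
            {
                int n;
                docCount.TryGetValue(token, out n);
                docCount[token] = n + 1;
            }
        }
        var weights = new Dictionary<string, double>();
        foreach (KeyValuePair<string, int> kvp in docCount)
            weights[kvp.Key] = Math.Log(1.0 + (double)catalog.Count / kvp.Value);
        return weights;
    }

    // Weighted average of the best per-keyword similarity, normalized by source weights only.
    public static double Score(string[] source, string[] target, Dictionary<string, double> weights)
    {
        double total = 0.0, matched = 0.0;
        foreach (string s in source)
        {
            double w;
            if (!weights.TryGetValue(s, out w)) w = 1.0; // unseen keyword: neutral weight
            double best = 0.0;
            foreach (string t in target)
                best = Math.Max(best, Similarity(s, t));
            total += w;
            matched += w * best;
        }
        return total == 0.0 ? 0.0 : matched / total;
    }
}
```

With a catalog of "wild horses" and "wild world", a target such as super, quality, wild, horses scores higher against "wild horses" than against "wild world", because the shared keyword "wild" counts for less than the distinguishing one. The trade-off is that a short original title whose keywords all happen to appear in a long, unrelated file name will also score high, so a minimal score threshold is still needed.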

Ideas? Suggestions? Algorithms?

By the way, I already know this topic (my colleague posted it, but we could not come up with a proper solution for this problem): Approximate string matching algorithms

1 Answer

  1. Added an answer on May 10, 2026 at 2:46 pm

    Your problem here may be distinguishing between noise words and useful data:

    • Rolling_Stones.Best_of_2003.Wild_Horses.mp3
    • Super.Quality.Wild_Horses.mp3
    • Tori_Amos.Wild_Horses.mp3

    You may need to produce a dictionary of noise words to ignore. That seems clunky, but I’m not sure there’s an algorithm that can distinguish between band/album names and noise.
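A minimal sketch of what such a noise dictionary could look like, assuming the file name has already been split into keywords. The class name NoiseFilter, the word list, and the year pattern are all hypothetical; a real list would have to be built from the actual file collection, and words like "best" can of course also appear in genuine titles.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class NoiseFilter
{
    // Hypothetical noise list, compared case-insensitively.
    private static readonly HashSet<string> NoiseWords =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        { "super", "quality", "best", "of", "mp3" };

    // Assumption: a four-digit keyword starting with 19 or 20 is a year.
    private static readonly Regex YearPattern = new Regex(@"^(19|20)\d{2}$");

    // True if the keyword is empty, a listed noise word, or looks like a year.
    public static bool IsNoise(string token)
    {
        if (string.IsNullOrEmpty(token)) return true;
        return NoiseWords.Contains(token) || YearPattern.IsMatch(token);
    }

    // Returns the keywords that survive the filter, in their original order.
    public static List<string> RemoveNoise(IEnumerable<string> tokens)
    {
        var kept = new List<string>();
        foreach (string token in tokens)
        {
            if (!IsNoise(token)) kept.Add(token);
        }
        return kept;
    }
}
```

Applied to the keywords of the first example above (Rolling, Stones, Best, of, 2003, Wild, Horses), this leaves Rolling, Stones, Wild, Horses. The band name survives, which is exactly the limitation described: the filter can drop known junk, but it cannot tell a band or album name from the song title.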
