The Archive Base


Editorial Team
Asked: June 18, 2026

I am using the Levenshtein edit distance to find how similar two strings are.

I am using the Levenshtein edit distance to find how similar two strings are. The first string is the longer of the two, if they differ in length at all; it is the untruncated, unmodified string that I want to compare the other one to. The second string may be truncated at the end and may be missing characters. There can be many distinct first strings and many distinct second strings.

I read in the list of second strings, each on its own line in this format:
“[string two] – $0.00”. That is, string two followed by a space, a dash, a space, and then a price.

So I have a list of second strings in that format, and I have two options: remove both the price and the “ – ” separator, or remove only the price and leave the separator in place.

  • If I remove it: I read in each string two and tokenize it with the delimiter “$”. I do not know how long any string two is, so I have to call stringtwo.removeAll("-") to get rid of the dash and then .trim() to drop the whitespace. But if string two itself contains a dash, that dash is removed as well, unintentionally. So with this option I get one of four results: exact strings (Levenshtein = 0); truncated but otherwise exact strings (the strings match up to the length of string one minus the Levenshtein distance); strings that are truncated and missing some whole number of dashes (they match in stretches between the dashes and, being truncated, are also missing the end); or strings that are not truncated but are missing some whole number of dashes.

  • If I leave it: I still read in each string two and tokenize it with the delimiter “$”. Now string two has the format “[string two] – ”, so every Levenshtein distance is off by 3. The problem is that with a string one such as “dog food is yummy”, the truncated string two “dog food is yum – ” gives levD = 3, which is the same levD I get for the complete string two “dog food is yummy – ”.
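To make the two options concrete, here is a minimal sketch that reproduces both problems. The class name `LevenshteinDemo`, the method `distance`, and the product name "stir-fry mix" are made up for illustration, and the separator is assumed to be a plain ASCII hyphen. It uses the usual two-row dynamic-programming form of the Levenshtein distance, not any particular library.

```java
class LevenshteinDemo {
    // Two-row dynamic-programming Levenshtein distance
    // (insert, delete, and substitute each cost 1).
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev; // reuse the two rows instead of allocating
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        String one = "dog food is yummy";

        // Option 2 (separator kept): truncated and complete inputs tie.
        System.out.println(distance(one, "dog food is yum - "));   // prints 3
        System.out.println(distance(one, "dog food is yummy - ")); // prints 3

        // Option 1 (every dash stripped): a dash inside the name is lost too.
        // "stir-fry mix" is a hypothetical product name.
        String stripped = "stir-fry mix - ".replace("-", "").trim();
        System.out.println(stripped);                            // prints stirfry mix
        System.out.println(distance("stir-fry mix", stripped));  // prints 1
    }
}
```

With the separator kept, the distance alone cannot tell the truncated line from the complete one; with every dash stripped, an otherwise exact match picks up a distance of 1.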

As you can see, both options cause problems, and I have not found a way around them in my program when matching the input list of string twos against my list of string ones.

Can anyone see a better way of doing this? Are there other string comparators I could use that would make this less problematic?

1 Answer

  1. Editorial Team
    Added an answer on June 18, 2026 at 8:01 am

    Try this. It cuts each string at the last “-” it finds and keeps everything before that dash intact.

    int dash = stringTwo.lastIndexOf("-");  // -1 when the line contains no dash
    String name = (dash >= 0 ? stringTwo.substring(0, dash) : stringTwo).trim();
    

    These String manipulations can be expensive, so if you are working with a lot of strings you might look into other optimizations.

    Also, this solution is brittle because it hardcodes the delimiter that decides where to trim. The delimiter can be defined elsewhere and passed in, so that it can vary.
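As a rough sketch of passing the delimiter in rather than hardcoding it, something like the following could work. The class name `LineParser`, the method `nameBeforeLast`, and the sample lines are hypothetical, and the separator is assumed to be an ASCII hyphen with a space on each side.

```java
class LineParser {
    // Returns the text before the last occurrence of `separator`, trimmed.
    // If the separator does not occur, the whole line is returned trimmed
    // instead of throwing from substring(0, -1).
    static String nameBeforeLast(String line, String separator) {
        int idx = line.lastIndexOf(separator);
        return (idx >= 0 ? line.substring(0, idx) : line).trim();
    }

    public static void main(String[] args) {
        // The separator is supplied by the caller, so it can change
        // without touching the parsing code.
        String separator = " - ";
        System.out.println(nameBeforeLast("dog food is yum - $0.00", separator)); // prints dog food is yum
        System.out.println(nameBeforeLast("stir-fry mix - $2.50", separator));    // prints stir-fry mix
    }
}
```

Passing " - " with its surrounding spaces, instead of a bare "-", also leaves a dash inside the name untouched, as long as that dash has no space on both sides.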

    Once you have that working reasonably well and safely, look into StringUtils from Apache, which offers more extensive String manipulation.

    org.apache.commons.lang.StringUtils from Apache Commons Lang
    
© 2021 The Archive Base. All Rights Reserved