Approximate string matching is not a stranger problem. I am learning and trying to

Question

0

Asked: June 4, 20262026-06-04T21:09:08+00:00 2026-06-04T21:09:08+00:00

Approximate string matching is not a stranger problem. I am learning and trying to

0

Approximate string matching is not a stranger problem.

I am learning and trying to understand how to solve it. I even now don’t want to get too deep into it and just want to understand the brute-force way.

In its wiki page (Approximate string matching), it says

A brute-force approach would be to compute the edit distance to P (the pattern) for all substrings of T, and then choose the substring with the minimum distance. However, this algorithm would have the running time O(m * n^3), n is the length of T, m is the length of P

Ok. I understand this statement in the following way:

We find out all possible substrings of T
We compute the edit distance of each pair of strings {P, t1}, {P, t2}, …
We find out which substring has the shortest distance from P and this substring is the answer.

I have the following question:

a. I can use two for-loop to get all possible substrings and this requires O(n^2). So when I try to compute the edit distance of one substring and the patter, does it need O(n*m)? Why?

b. How exactly do I compute the distance of one pair (one substring and the patter)? I know I can insert, delete, substitute, but can anyone give me a algorithm that do just the calculation for one pair?

Thanks

Edit

Ok, I should use Levenshtein distance, but I don’t quite understand its method.

Here is part of the code

for j from 1 to n
{
    for i from 1 to m
    {
      if s[i] = t[j] then  
        d[i, j] := d[i-1, j-1]       // no operation required
      else
        d[i, j] := minimum
                   (
                     d[i-1, j] + 1,  // a deletion
                     d[i, j-1] + 1,  // an insertion
                     d[i-1, j-1] + 1 // a substitution
                   )
    }
  }

So, assume I am now comparing {"suv", "svi"}.

So 'v' != 'i', then I have to see three other pairs:

{"su", "sv"}
{"suv", "sv"}
{"su", "svi"}

How can I understand this part? Why I need to see these 3 parts?

Does the distance between two prefixes mean that we need distance number of changes in order to make the two prefixes (or strings) equal?

So, let’s take a look at {"su", "sv"}. We can see that distance of {"su", "sv"} is 1. Then how can {"su", "sv"} become {"suv", "svi"} by just adding 1? I think we need to insert ‘v’ into “su” and ‘v’ into “sv” and then substitute the last ‘i’ with ‘v’, which has 3 operations involved, right?

Report

Leave an answer
Cancel reply

You must login to add an answer.

Need An Account,

1 Answer

Editorial Team · Answer 1 · 2026-06-04T21:09:09+00:00

The standard way of measuring the edit distance between two strings is called Levenshtein distance – the wikipedia page contains pseudocode for the algorithm.

As for your edit: You need to look at {"su", "sv"} because it is possible that the best way to change "suv" into "svi" is to replace the last v by i, whose cost will come on top of the cost for changing "su" to "sv". Or, it could be that the best way is to change "suv" into "sv" somehow and then add an i. Or, it could be that the best way is to first delete the v from "suv" and then change "su" into "svi". The first way turns out to be best (or as good as the other options) in this case. The edit distance is indeed 2, and the operations are to change the u into a v and the v into an i.

Sign Up

Sign In

Forgot Password

The Archive Base Latest Questions

Approximate string matching is not a stranger problem. I am learning and trying to

Edit

Leave an answerCancel reply

1 Answer

Leave an answer
Cancel reply