Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 7898221
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: June 3, 20262026-06-03T08:22:29+00:00 2026-06-03T08:22:29+00:00

Let TARGET be a set of strings that I expect to be spoken. Let

  • 0
  • Let TARGET be a set of strings that I expect to be spoken.
  • Let SOURCE be the set of strings returned by a speech recognizer (that is, the possible sentences that it has heard).

I need a way to choose a string from TARGET. I read about the Levenshtein distance and the Damerau-Levenshtein distance, which basically returns the distance between a source string and a target string, that is the number of changes needed to transform the source string into the target string.

But, how can I apply this algorithm to a set of target strings?

I thought I’d use the following method:

  1. For each string that belongs to TARGET, I calculate the distance from each string in SOURCE. In this way we obtain an m-by-n matrix, where n is the cardinality of SOURCE and n is the cardinality of TARGET. We could say that the i-th row represents the similarity of the sentences detected by the speech recognizer with respect to the i-th target.
  2. Calculating the average of the values ​​on each row, you can obtain the average distance between the i-th target and the output of the speech recognizer. Let’s call it average_on_row(i), where i is the row index.
  3. Finally, for each row, I calculate the standard deviation of all values in the row. For each row, I also perform the sum of all the standard deviations. The result is a column vector, in which each element (Let’s call it stadard_deviation_sum(i)) refers to a string of TARGET.

The string which is associated with the shortest stadard_deviation_sum could be the sentence pronounced by the user. Could be considered the correct method I used? Or are there other methods?
Obviously, too high values ​​indicate that the sentence pronounced by the user probably does not belong to TARGET.

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-06-03T08:22:31+00:00Added an answer on June 3, 2026 at 8:22 am

    You need to calculate these probabilities first: probability of insertion, deletion and substitution. Then use log of these probabilities as penalties for each operation.

    In a “context independent” situation, if pi is probability of insertion, pd is probability of deletion and ps probability of substitution, the probability of observing the same symbol is pp=1-ps-pd.

    In this case use log(pi/pp/k), log(pd/pp) and log(ps/pp/(k-1)) as penalties for insertion, deletion and substitution respectively, where k is the number of symbols in the system.

    Essentially if you use this distance measure between a source and target you get log probability of observing that target given the source. If you have a bunch of training data (i.e. source-target pairs) choose some initial estimates for these probabilities, align source-target pairs and re-estimate these probabilities (AKA EM strategy).

    You can start with one set of probabilities and assume context independence. Later you can assume some kind of clustering among the contexts (eg. assume there are k different sets of letters whose substitution rate is different…).

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

I have a TARGET variable that can be set to dev, test or prod.
let's say I've got: target.o: target.h target.c gcc $(CFLAGS) -c target.c But I'd like
Let's consider the below example. There, I have: target MAIN calls target t and
Let me explain best with an example. Say you have node class that can
Let's say that I have a SQLite database that I create in a separate
Let's say there is a set of number 1, 2, 3, 4, 5, 6,
I'm trying to produce the set of entities to be created in a target
Let me first stress the fact that I'm talking about the Mac OS X
When I have my project's target framework set to Framework 4.0 and run the
How do I set the path in for coldfusion file/template so that files referenced

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.