The Archive Base

Editorial Team
Asked: May 27, 2026 (2026-05-27T07:16:59+00:00)

Why is speech recognition so difficult? What are the specific challenges involved?

Why is speech recognition so difficult? What are the specific challenges involved? I’ve read through a question on speech recognition, which did partially answer some of my questions, but the answers were largely anecdotal rather than technical. It also didn’t really answer why we still can’t just throw more hardware at the problem.

I’ve seen tools that perform automated noise reduction using neural nets and ambient FFT analysis with excellent results, so I can’t see a reason why we’re still struggling with noise except in difficult scenarios like ludicrously loud background noise or multiple speech sources.
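For readers unfamiliar with the FFT-based side of this, one classical approach is spectral subtraction: estimate the average noise magnitude per frequency bin from a stretch of audio with no speech, then subtract it from each noisy frame. The sketch below is only an illustration of that idea, not how any particular tool works. The test tone, frame length, noise level and all function names are made up for the example, and a naive DFT stands in for a real FFT so that it runs with the standard library alone.

```python
import cmath
import math
import random


def dft(frame):
    """Naive O(n^2) discrete Fourier transform (an FFT computes the same values faster)."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse transform; keeps the real part because the input signal is real."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def noise_profile(noise_frames):
    """Average magnitude per frequency bin over frames that contain only noise."""
    spectra = [dft(frame) for frame in noise_frames]
    return [
        sum(abs(s[k]) for s in spectra) / len(spectra)
        for k in range(len(spectra[0]))
    ]


def spectral_subtract(frame, profile):
    """Subtract the noise magnitude from each bin, keeping the noisy phase."""
    cleaned = []
    for value, noise_mag in zip(dft(frame), profile):
        magnitude = max(abs(value) - noise_mag, 0.0)  # never go below zero
        cleaned.append(cmath.rect(magnitude, cmath.phase(value)))
    return idft(cleaned)


def squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def demo(seed=0, n=64, noise_std=0.3):
    """Hypothetical test case: a pure tone buried in stationary white noise."""
    rng = random.Random(seed)
    tone = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]
    noisy = [s + rng.gauss(0, noise_std) for s in tone]
    noise_only = [[rng.gauss(0, noise_std) for _ in range(n)] for _ in range(20)]
    cleaned = spectral_subtract(noisy, noise_profile(noise_only))
    return squared_error(noisy, tone), squared_error(cleaned, tone)


if __name__ == "__main__":
    before, after = demo()
    print(f"error before: {before:.3f}, after: {after:.3f}")
```

This only works because the noise in the demo is stationary, so a profile measured once stays valid. Noise that changes over time, or a second voice, breaks that assumption, which fits the difficult scenarios mentioned above.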

Beyond this, isn’t it just a case of using very large, complex, well-trained neural nets to do the processing, then throwing hardware at it to make it work fast enough?

I understand that strong accents are a problem and that we all have our colloquialisms, but these recognition engines still get basic things wrong when the person is speaking in a slow and clear American or British accent.

So, what’s the deal? What technical problems are there that make it still so difficult for a computer to understand me?

1 Answer

  1. Editorial Team
    Added an answer on May 27, 2026 at 7:16 am

    Some technical reasons:

    • You need lots of tagged training data, which can be difficult to acquire once you take into account all the different accents, sounds etc.
    • Neural networks and similar gradient descent algorithms don’t scale that well – just making them bigger (more layers, more nodes, more connections) doesn’t guarantee that they will learn to solve your problem in a reasonable time. Scaling up machine learning to solve complex tasks is still a hard, unsolved problem.
    • Many machine learning approaches require normalised data (e.g. a defined start point, a standard pitch, a standard speed). They don’t work well once you move outside these parameters. There are techniques such as convolutional neural networks etc. to tackle these problems, but they all add complexity and require a lot of expert fine-tuning.
    • Data size for speech can be quite large – the size of the data makes the engineering problems and computational requirements a little more challenging.
    • Speech data usually needs to be interpreted in context for full understanding: the human brain is remarkably good at “filling in the blanks” based on understood context. Missing information and ambiguous interpretations are resolved with the help of other modalities (like vision). Current algorithms don’t “understand” context, so they can’t use it to help interpret the speech data. This is particularly problematic because many sounds and words are ambiguous unless taken in context.
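To make the last point concrete, here is a minimal sketch of how even very shallow context can separate words that sound identical. It scores candidate transcripts with a bigram model using add-one smoothing. The corpus, the candidate sentences and the function names are invented for illustration, and the sketch assumes the acoustic scores of the candidates are equal, as they would be for true homophones. Counting word pairs is not the kind of understanding the answer is talking about; it only shows why context matters at all.

```python
import math
from collections import Counter

# Tiny made-up corpus; a real language model is trained on vastly more text.
CORPUS = [
    "turn right at the corner",
    "please write a letter",
    "write your name here",
    "the shop is on the right",
]


def train_bigrams(sentences):
    """Count single words and adjacent word pairs, with a sentence-start marker."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        words = ["<s>"] + sentence.split()
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    return unigrams, bigrams


def log_prob(sentence, unigrams, bigrams):
    """Log-probability of a sentence under the bigram model (add-one smoothing)."""
    vocab_size = len(unigrams)
    words = ["<s>"] + sentence.split()
    total = 0.0
    for prev, cur in zip(words, words[1:]):
        total += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size))
    return total


def pick_transcript(candidates, unigrams, bigrams):
    """Choose the candidate the model finds most plausible."""
    return max(candidates, key=lambda c: log_prob(c, unigrams, bigrams))


if __name__ == "__main__":
    unigrams, bigrams = train_bigrams(CORPUS)
    # "right" and "write" sound the same; only the neighbouring words differ.
    print(pick_transcript(["please right a letter", "please write a letter"],
                          unigrams, bigrams))   # please write a letter
    print(pick_transcript(["turn write at the corner", "turn right at the corner"],
                          unigrams, bigrams))   # turn right at the corner
```

A model like this fails as soon as the deciding clue is more than one word away or lies outside the text entirely, which is where the “filling in the blanks” described above comes in.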

    Overall, speech recognition is a complex task. Not unsolvably hard, but hard enough that you shouldn’t expect any sudden miracles, and it will certainly keep many researchers busy for many more years.
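As an illustration of the normalisation point in the list above (a standard speed in particular), the sketch below compares a stored template with the same pattern spoken at half speed. A frame-by-frame comparison reports a large mismatch, while dynamic time warping (DTW) aligns the two and finds none. DTW is a classical alignment technique chosen here for the example; the answer itself names convolutional neural networks. The sequences are made-up one-dimensional stand-ins for per-frame features, and real systems compare feature vectors rather than single numbers.

```python
def naive_distance(a, b):
    """Frame-by-frame comparison; silently ignores whatever does not line up."""
    return sum(abs(x - y) for x, y in zip(a, b))


def dtw_distance(a, b):
    """Dynamic time warping: cheapest alignment allowing frames to stretch or repeat."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best alignment cost of a[:i] against b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b frame is reused
                cost[i][j - 1],      # b advances, a frame is reused
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


if __name__ == "__main__":
    template = [0, 1, 2, 3, 2, 1, 0]                    # hypothetical stored pattern
    slow = [v for v in template for _ in range(2)]      # same pattern, half speed
    other = [3, 2, 1, 0, 1, 2, 3]                       # a different pattern

    print(naive_distance(template, slow))   # 8
    print(dtw_distance(template, slow))     # 0.0
    print(dtw_distance(template, other))    # greater than zero
```

The table has one cell per pair of frames, so the cost grows with the product of the two lengths. That extra work and tuning is a small-scale version of the added complexity the answer mentions.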
