I developed an application that converts voice to text using SAPI 5.1.
Because the accuracy was too low, I decided to create my own grammar, which recognizes only the numbers from one to ten.
The accuracy was still poor, so I dug deeper into the grammar file and came across the lexicon file, which is used for pronunciation. So my questions are:
- Will a lexicon file improve the accuracy, so that I can put the pronunciations of the numbers one to ten in the lexicon file and then use it?
- I need a template showing how to create a lexicon file.
If your speech recognition accuracy is weak, it could be for any of the following reasons:
Not enough training data: creating a speaker-dependent speech recognition system (one tied to a single speaker) requires a large number of samples of each word (one to ten in your case). Isolated samples of each word are needed to train the initial models, and embedded training data may then be required to improve the models further.
A speaker-independent speech recognition model will require even more data.
There is a mismatch between the testing and training data. If the models were created from noise-free data or from data with one accent, it may be difficult to get good results when testing on data that is noisy or has a different accent.
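One way to check whether a mismatch is the culprit is to log what the recognizer returns under each condition and compare the accuracy per condition. The following is only a minimal sketch in Python: the function name `accuracy_by_condition`, the condition labels, and the sample log are all hypothetical, and it assumes you have already collected (condition, expected word, recognized word) triples from your own application.

```python
from collections import defaultdict


def accuracy_by_condition(results):
    """Return the fraction of correctly recognized words per condition.

    `results` is an iterable of (condition, expected, recognized) triples,
    e.g. ("noisy", "nine", "five"). Comparison ignores case and
    surrounding whitespace.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for condition, expected, recognized in results:
        totals[condition] += 1
        if expected.strip().lower() == recognized.strip().lower():
            correct[condition] += 1
    return {c: correct[c] / totals[c] for c in totals}


# Hypothetical test log, made up purely for illustration.
sample_results = [
    ("quiet", "one", "one"),
    ("quiet", "two", "two"),
    ("quiet", "nine", "nine"),
    ("quiet", "five", "five"),
    ("noisy", "one", "nine"),
    ("noisy", "two", "two"),
    ("noisy", "nine", "five"),
    ("noisy", "five", "five"),
]

print(accuracy_by_condition(sample_results))
# → {'quiet': 1.0, 'noisy': 0.5}
```

A large gap between conditions, as in this made-up log, would point to a training/testing mismatch rather than to the grammar itself.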
More details about the speech recognition system you are trying to build would help narrow this down.
Update 1: Since you mention in the comments that you are using the Microsoft Speech SDK, here is a guide to training the Speech SDK on sounds/accents. Just follow the instructions and that should set you on your way.