I’m quite curious about this.
Broadly speaking, how does one go about the following:
- Detecting word separations.
- Detecting syllables.
- Compensating for the way words run together in normal, connected speech.
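As a concrete starting point for the first item, a common baseline is to look for stretches of low short-time energy (silence) between words. The sketch below is a minimal, illustrative version; the frame size, threshold, and minimum-run length are assumed values that would need tuning against real audio.

```python
# Minimal sketch: find candidate word separations in a mono signal by
# locating runs of low short-time energy. All parameter values here
# (frame_size, threshold, min_frames) are illustrative assumptions.

def short_time_energy(samples, frame_size=400):
    """Average squared amplitude per non-overlapping frame."""
    return [
        sum(s * s for s in samples[i:i + frame_size]) / frame_size
        for i in range(0, len(samples) - frame_size + 1, frame_size)
    ]

def find_silences(energies, threshold=1e-4, min_frames=3):
    """Return (start_frame, end_frame) spans where energy stays below threshold."""
    spans, start = [], None
    for i, e in enumerate(energies):
        if e < threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_frames:
                spans.append((start, i))
            start = None
    if start is not None and len(energies) - start >= min_frames:
        spans.append((start, len(energies)))
    return spans
```

For example, a synthetic signal of loud–quiet–loud segments yields one silence span in the middle. Note this naive approach is exactly what breaks down in connected speech, where words often have no silence between them, which is why model-based methods are needed.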
This is still a subject of active research today. One usually starts by building a model based on a linguistic analysis of the language you will be recognizing, capturing the rules governing word separations and syllables. Recognition itself is then mainly done with Hidden Markov Models over the signal.
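To make the Hidden Markov Model part concrete, here is a hedged sketch of Viterbi decoding, the core computation in HMM-based recognizers, over a toy two-state model ("silence" vs. "speech") with quantized energy observations. The states, transition probabilities, and emission probabilities are invented for illustration; a real recognizer would have many more states (e.g. per phoneme) and continuous emission densities.

```python
import math

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most likely hidden state sequence for the observations.

    Works in log-probabilities to avoid numeric underflow on long sequences.
    """
    V = [{s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]]) for s in states}]
    path = {s: [s] for s in states}
    for t in range(1, len(obs)):
        V.append({})
        new_path = {}
        for s in states:
            # best predecessor state for reaching s at time t
            prev = max(states, key=lambda p: V[t - 1][p] + math.log(trans_p[p][s]))
            V[t][s] = (V[t - 1][prev]
                       + math.log(trans_p[prev][s])
                       + math.log(emit_p[s][obs[t]]))
            new_path[s] = path[prev] + [s]
        path = new_path
    best = max(states, key=lambda s: V[-1][s])
    return path[best]

# Toy model: observations are quantized frame energies ("low"/"high").
STATES = ("silence", "speech")
START = {"silence": 0.6, "speech": 0.4}
TRANS = {"silence": {"silence": 0.7, "speech": 0.3},
         "speech": {"silence": 0.3, "speech": 0.7}}
EMIT = {"silence": {"low": 0.9, "high": 0.1},
        "speech": {"low": 0.2, "high": 0.8}}
```

Calling `viterbi(["low", "low", "high", "high"], STATES, START, TRANS, EMIT)` decodes the sequence as silence followed by speech, which is the segmentation idea scaled down to its smallest form.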
Here are some references that might give you some better ideas:
http://lands.let.kun.nl/literature/eric.2004.2.pdf
http://www.asel.udel.edu/icslp/cdrom/vol4/778/a778.pdf
http://en.wikipedia.org/wiki/Speech_segmentation