How would you go about comparing a spoken word to an audio file and determining if they match? For example, if I say “apple” to my iPhone application, I would like for it to record the audio and compare it with a prerecorded audio file of someone saying “apple”. It should be able to determine that the two spoken words match.
What kind of algorithm or library could I use to perform this kind of voice-based audio file matching?
CMU Sphinx does speech recognition, and PocketSphinx has been ported to the iPhone by Brian King.
Check out his VocalKit project: https://github.com/KingOfBrian/VocalKit
He has provided excellent details and made it easy to implement yourself. I’ve run his example and adapted it into my own version.
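With a recognizer in place, you don't have to compare raw audio at all: you can compare the recognized text against the word you expect. Here is a rough sketch of that last step in Python, just to show the idea. The function names are my own and are not part of VocalKit or PocketSphinx, and the hypothesis string is assumed to come from whatever recognizer you end up using.

```python
import string


def normalize(hypothesis: str) -> str:
    """Lowercase, strip ASCII punctuation, and collapse whitespace."""
    table = str.maketrans("", "", string.punctuation)
    return " ".join(hypothesis.lower().translate(table).split())


def words_match(spoken_hypothesis: str, expected_word: str) -> bool:
    """Return True if the recognizer's text output equals the expected word.

    spoken_hypothesis is assumed to be the text the recognizer produced
    for the user's recording (hypothetical input, not a real API result).
    """
    return normalize(spoken_hypothesis) == normalize(expected_word)
```

For example, `words_match(" Apple. ", "apple")` is true and `words_match("maple", "apple")` is false. The same check works if you run the prerecorded file through the recognizer and compare the two hypotheses instead of hard-coding the expected word.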