I’m making an app that needs to detect when there is a sound, for example when someone starts talking, and then record it.
I found this tutorial http://mobileorchard.com/tutorial-detecting-when-a-user-blows-into-the-mic/ but it starts recording at the beginning and then detects the sound from that recording.
Is there any other way to detect a sound without actually starting the recorder first? What I thought of would be having 2 recorders, one for detection and one for actually recording the sound. Another solution would be to edit (trim) the sound after it’s recorded.
Are these approaches somehow standard or is there a better way to detect sound?
Thanks.
edit: if anyone ever reads this, I also found this http://bonkel.wordpress.com/2010/03/03/frequency-detection-using-fourier-transform/
If you don’t mind getting a little dirty, you could go down to a lower level, to CoreAudio, and read data out of the input buffers until you see values exceeding your threshold, and then either start recording those input buffers or trigger a high-level recording call. You can similarly stop recording after a period of silence.
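To make the threshold idea concrete, here is a minimal sketch in plain C. It is not CoreAudio code: it assumes you already receive 16-bit signed samples in a buffer from your input callback, and the names `SoundGate`, `gate_init`, and `gate_process` are made up for illustration. The threshold and hold time are placeholders you would have to tune.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, not part of any Apple API. */
typedef struct {
    int    threshold;     /* peak amplitude that counts as "sound" */
    size_t hold_samples;  /* length of silence that ends a recording */
    size_t quiet_run;     /* consecutive quiet samples seen while recording */
    bool   recording;     /* true while buffers should be kept */
} SoundGate;

static void gate_init(SoundGate *g, int threshold, size_t hold_samples) {
    g->threshold = threshold;
    g->hold_samples = hold_samples;
    g->quiet_run = 0;
    g->recording = false;
}

/* Feed one input buffer. Returns true if this buffer should be saved. */
static bool gate_process(SoundGate *g, const int16_t *samples, size_t count) {
    int peak = 0;
    for (size_t i = 0; i < count; i++) {
        int v = abs((int)samples[i]);   /* widen first so -32768 is safe */
        if (v > peak) peak = v;
    }

    if (peak > g->threshold) {
        /* Loud buffer: start (or continue) recording, reset the silence count. */
        g->recording = true;
        g->quiet_run = 0;
    } else if (g->recording) {
        /* Quiet buffer while recording: stop once silence lasts long enough. */
        g->quiet_run += count;
        if (g->quiet_run >= g->hold_samples) {
            g->recording = false;
            g->quiet_run = 0;
        }
    }
    return g->recording;
}
```

You would call `gate_process` once per input buffer and write the buffer to your file only when it returns true. Peak detection is the simplest choice here; an RMS or averaged level would be less jumpy on short clicks.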
If you use CoreAudio, you have a lot of control over what you record. You could, pretty easily, filter out background noise, or add beeps to signify when the recording stopped due to silence, and even add markers to use later to match time to the recording.
CoreAudio does require you to do more work. You will have to read the microphone buffers on a timely basis and either save or discard the data pretty quickly in order not to drop any sound data. This isn’t that hard, as the devices have plenty of CPU power to do that and other tasks at the same time – you just have to have a good grasp of CoreAudio.
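One way to keep the per-buffer work small is to copy incoming samples into a fixed-size ring buffer that overwrites its oldest data. Discarding then happens automatically, and saving means draining the ring, which also gives you a little audio from just before the threshold was crossed. This is only a sketch under assumptions: `SampleRing`, `ring_write`, and `ring_drain` are hypothetical names, the capacity is tiny for illustration, and the code is single-threaded. If your callback and your file writer run on different threads, you would need a properly synchronized or lock-free version.

```c
#include <stddef.h>
#include <stdint.h>

#define RING_CAPACITY 8  /* illustrative; real use would hold e.g. a fraction of a second */

/* Hypothetical helper, not part of any Apple API. */
typedef struct {
    int16_t data[RING_CAPACITY];
    size_t  head;   /* index of the next write */
    size_t  count;  /* number of valid samples stored */
} SampleRing;

static void ring_init(SampleRing *r) {
    r->head = 0;
    r->count = 0;
}

/* Append samples, overwriting the oldest ones when the ring is full. */
static void ring_write(SampleRing *r, const int16_t *samples, size_t n) {
    for (size_t i = 0; i < n; i++) {
        r->data[r->head] = samples[i];
        r->head = (r->head + 1) % RING_CAPACITY;
        if (r->count < RING_CAPACITY) r->count++;
    }
}

/* Copy stored samples, oldest first, into out (room for RING_CAPACITY),
   empty the ring, and return how many samples were copied. */
static size_t ring_drain(SampleRing *r, int16_t *out) {
    size_t start = (r->head + RING_CAPACITY - r->count) % RING_CAPACITY;
    size_t n = r->count;
    for (size_t i = 0; i < n; i++) {
        out[i] = r->data[(start + i) % RING_CAPACITY];
    }
    r->count = 0;
    return n;
}
```

While no sound is detected you only call `ring_write`. When detection fires, call `ring_drain` once, save what it returns, and then keep saving the live buffers.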
There are plenty of Apple CoreAudio samples that can guide you. The WWDC 2010 CoreAudio sessions are also a must-see.