I want to identify an audio sample (provided by the user) within an audio file I have (mp3).
The mp3 file is a recording of a radio stream that I’ve kept for testing purposes, and I have the pre-roll of the show. I want to find the pre-roll in the file and get the timestamp at which it plays.
Note: The solution can be in any of the following programming languages: Java, Python or C++. I don’t know how to analyze the audio file, so any reference on this subject would help.
This problem falls under the category of audio fingerprinting. Once you have matched a sample to a recording, the match itself tells you the timestamp where the sample occurs within it. There is a great paper by the guys behind Shazam that describes their technique: http://www.ee.columbia.edu/~dpwe/papers/Wang03-shazam.pdf They basically pick out the local maxima (peaks) in the spectrogram and build hashes from pairs of peaks, using the two peak frequencies and the time difference between them. Hashes from the sample that also occur in the recording then agree on a single time offset, and that offset is the timestamp you are after.
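To make the peak-and-hash idea more concrete, here is a minimal Python sketch. It is not Shazam’s actual implementation, only an illustration of the general approach. It assumes you already have a magnitude spectrogram for both the sample and the recording, indexed as `spec[frame][bin]`. The function names (`find_peaks`, `fingerprint`, `locate`) and the parameters (`neighborhood`, `fan_out`, `max_dt`) are made up for this example, and their default values are untuned guesses.

```python
from collections import Counter, defaultdict


def find_peaks(spec, neighborhood=3):
    """Return (frame, bin) cells that are strictly larger than every
    other cell within `neighborhood` frames/bins of them."""
    n_frames, n_bins = len(spec), len(spec[0])
    peaks = []
    for t in range(n_frames):
        for f in range(n_bins):
            value = spec[t][f]
            is_peak = True
            for tt in range(max(0, t - neighborhood), min(n_frames, t + neighborhood + 1)):
                for ff in range(max(0, f - neighborhood), min(n_bins, f + neighborhood + 1)):
                    if (tt, ff) != (t, f) and spec[tt][ff] >= value:
                        is_peak = False
                        break
                if not is_peak:
                    break
            if is_peak:
                peaks.append((t, f))
    return peaks  # ordered by frame, then bin, because of the loop order


def fingerprint(peaks, fan_out=5, max_dt=20):
    """Pair each anchor peak with up to `fan_out` later peaks.
    Each entry is ((f1, f2, dt), t1): a hash plus the anchor's frame."""
    hashes = []
    for i, (t1, f1) in enumerate(peaks):
        paired = 0
        for j in range(i + 1, len(peaks)):
            t2, f2 = peaks[j]
            dt = t2 - t1
            if dt > max_dt:
                break  # peaks are time-ordered, so later ones are even further away
            if dt == 0:
                continue  # skip peaks in the same frame as the anchor
            hashes.append(((f1, f2, dt), t1))
            paired += 1
            if paired >= fan_out:
                break
    return hashes


def locate(sample_spec, stream_spec):
    """Return (frame_offset, votes) for the best alignment of the sample
    inside the stream, or None if no hash matched at all."""
    index = defaultdict(list)
    for h, t_stream in fingerprint(find_peaks(stream_spec)):
        index[h].append(t_stream)
    votes = Counter()
    for h, t_sample in fingerprint(find_peaks(sample_spec)):
        for t_stream in index.get(h, ()):
            votes[t_stream - t_sample] += 1  # true matches agree on one offset
    if not votes:
        return None
    return votes.most_common(1)[0]
```

The offset returned by `locate` is in spectrogram frames; multiplying it by the hop size and dividing by the sample rate gives the timestamp in seconds. The pure-Python loops are only meant to show the logic and would probably be too slow for an hour-long recording without vectorizing them.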
Here is a good review on audio fingerprinting algorithms: http://mtg.upf.edu/files/publications/MMSP-2002-pcano.pdf
In any case, you’ll likely be working a lot with FFT and spectrograms. This post talks about how to do that in Python.
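As a rough illustration of the spectrogram step, here is a minimal sketch using NumPy. It assumes the mp3 has already been decoded to mono PCM samples (for example with a tool such as ffmpeg, then read into an array); the function name `spectrogram` and the window and hop sizes are arbitrary choices for this example, not values taken from the paper.

```python
import numpy as np


def spectrogram(samples, sample_rate, window_size=1024, hop=512):
    """Short-time Fourier transform magnitudes.

    Returns (times, freqs, mags), where mags[t, f] is the magnitude of
    frequency bin f in frame t, times is in seconds and freqs in Hz.
    """
    samples = np.asarray(samples, dtype=float)
    n_frames = 1 + (len(samples) - window_size) // hop
    if n_frames < 1:
        raise ValueError("signal is shorter than one window")
    window = np.hanning(window_size)  # taper each frame to reduce spectral leakage
    frames = np.stack([
        samples[i * hop : i * hop + window_size] * window
        for i in range(n_frames)
    ])
    mags = np.abs(np.fft.rfft(frames, axis=1))
    freqs = np.fft.rfftfreq(window_size, d=1.0 / sample_rate)
    times = np.arange(n_frames) * hop / sample_rate
    return times, freqs, mags


# Usage: a one-second 1000 Hz test tone sampled at 8 kHz.
sr = 8000
t = np.arange(sr) / sr
times, freqs, mags = spectrogram(np.sin(2 * np.pi * 1000 * t), sr)
# The strongest bin of the first frame should sit at about 1000 Hz.
print(freqs[mags[0].argmax()])
```

Each row of `mags` is one time frame, so a frame index converts to a timestamp through `times`. A smaller hop gives finer time resolution at the cost of more frames to process.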