I want to make a program that detects the note that is being played in front of the microphone. I am testing the FFT function of Naudio, but with the tests that I did in audacity it seems that FFT does not detect the pitch correctly. I played an C5, but the highest pick was at E7.
I changed the first dropdown box in the frequency analysis window to “enchanced autocorrelation” and after that the highest pick was at C5.
I googled “enchanced autocorrelation” and had no luck.
The highest peak in an audio spectrum is not necessarily the musical pitch as a human would perceive it, especially in a sound with strong overtones. That’s because pitch is a human psycho-perceptual phenomena, the brain will often deduce frequencies that aren’t even present in a waveform.
Auto-correlation methods of frequency or pitch estimation (roughly, finding how far apart even a funny-looking and/or non-sinusoidal waveform repeats in time) is usually a better match for what a human would call pitch. The reason for various enhancements to the autocorrelation algorithm is that simple autocorrelation will find an near infinite number of repeating wavelengths (e.g. if it repeats every 1 second it also repeats twice every 2 seconds, etc.) So the trick is to weight the correlation to somehow statistically better match what a human would guess about the same waveform.