Audio processing is pretty new for me. And currently using Python Numpy for processing wave files. After calculating FFT matrix I am getting noisy power values for non-existent frequencies. I am interested in visualizing the data and accuracy is not a high priority. Is there a safe way to calculate the clipping value to remove these values, or should I use all FFT matrices for each sample set to come up with an average number ?
regards
Edit:
from numpy import *
import wave
import pymedia.audio.sound as sound
import time, struct
from pylab import ion, plot, draw, show
fp = wave.open("500-200f.wav", "rb")
sample_rate = fp.getframerate()
total_num_samps = fp.getnframes()
fft_length = 2048.
num_fft = (total_num_samps / fft_length ) - 2
temp = zeros((num_fft,fft_length), float)
for i in range(num_fft):
tempb = fp.readframes(fft_length);
data = struct.unpack("%dH"%(fft_length), tempb)
temp[i,:] = array(data, short)
pts = fft_length/2+1
data = (abs(fft.rfft(temp, fft_length)) / (pts))[:pts]
x_axis = arange(pts)*sample_rate*.5/pts
spec_range = pts
plot(x_axis, data[0])
show()
Here is the plot in non-logarithmic scale, for synthetic wave file containing 500hz(fading out) + 200hz sine wave created using Goldwave.

Simulated waveforms shouldn’t show FFTs like your figure, so something is very wrong, and probably not with the FFT, but with the input waveform. The main problem in your plot is not the ripples, but the harmonics around 1000 Hz, and the subharmonic at 500 Hz. A simulated waveform shouldn’t show any of this (for example, see my plot below).
First, you probably want to just try plotting out the raw waveform, and this will likely point to an obvious problem. Also, it seems odd to have a wave unpack to unsigned shorts, i.e. “H”, and especially after this to not have a large zero-frequency component.
I was able to get a pretty close duplicate to your FFT by applying clipping to the waveform, as was suggested by both the subharmonic and higher harmonics (and Trevor). You could be introducing clipping either in the simulation or the unpacking. Either way, I bypassed this by creating the waveforms in numpy to start with.
Here’s what the proper FFT should look like (i.e. basically perfect, except for the broadening of the peaks due to the windowing)
Here’s one from a waveform that’s been clipped (and is very similar to your FFT, from the subharmonic to the precise pattern of the three higher harmonics around 1000 Hz)
Here’s the code I used to generate these