I’m developing a program for iPhone.
I have read this article and I have some questions. After I get the amplitude data from the sound file, which ranges of the spectrum do I need to analyze with the FFT (Fast Fourier Transform)? In the article, the author uses “40-80, 80-120, 120-180, 180-300”; how does he know which ranges to use? After I run the FFT (using OouraFFT) I have a frequency spectrum, and as I understand it I must then pick control points. How do I pick them?
I have a few more questions, but please help me with these first.
He didn’t know them – he made them up.
Those ranges are very low in frequency. Low frequency sounds tend to have the longest sustain/decay, so you’re less likely to have temporal aliasing problems by using lower frequencies. That’s important in the application you’re looking to implement. The sounds vary over time, and the input samples could be at any given offset of the song/sounds, and most likely won’t exactly match your window offsets. Lower frequency parts of the sound are still susceptible to this, but much less so than higher frequency parts.
OouraFFT is written in C, not Objective-C. Can you link to the wrapper you are using?
If you’re using this wrapper, it looks to me like you’re going to have to low-pass filter your data beforehand, and maybe modify or additionally process the results of the library to do exactly what you’re trying to accomplish. Or find a different iPhone FFT library that wraps higher-level concepts on top of the FFT.
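To make the "low-pass filter, then reduce the sample rate" idea concrete, here is a minimal sketch in Python (used only because it is compact; the same arithmetic carries over to C). The function name `decimate` and the factor of 8 are my own choices for illustration, not something from the wrapper or the article. The averaging filter is deliberately crude; a real app would probably want a proper FIR low-pass filter.

```python
def decimate(samples, factor):
    """Crude low-pass filter plus downsampling (illustrative only).

    Averages each run of `factor` consecutive samples (a boxcar filter)
    and keeps one value per run, so the sample rate drops by `factor`.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    usable = len(samples) - len(samples) % factor  # drop the ragged tail
    return [
        sum(samples[i:i + factor]) / factor
        for i in range(0, usable, factor)
    ]


# Assumed numbers: 44100 Hz input reduced by a factor of 8.
original_rate = 44100
factor = 8
new_rate = original_rate / factor   # 5512.5 Hz
nyquist = new_rate / 2              # highest frequency the reduced stream can hold
```

Since the ranges in question stop at 300 Hz, a reduced rate in this neighborhood still leaves plenty of headroom while making each FFT bucket much narrower for the same window length.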
That library calls OouraFFT rdst, and does so in such a way that all buckets are evenly distributed (pretty much just the raw FFT data, with no higher level concepts bolted on). Unless you go with smaller buckets and aggregate them, you’re not going to get those specific buckets described in the article you read.
You could try simply using a different evenly spaced bucket selection instead, and end up with something like 40-80, 80-120, 120-160, 160-200, 200-240, 240-280, 280-320. Or you could use 40-60, 60-80, etc., and combine/average buckets when you are done.
To get the bucket sizes you want, you’ll need to do some math. From that library’s readme:
The longer the window, the greater the number of buckets, but the more likely you are to have temporal issues. So, select your window size, then low-pass filter your input data and reduce the sample rate so you get bucket sizes (frequency ranges) that fit your needs, and run the filtered data through.
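The bucket math itself is short. For an evenly spaced FFT, each bucket is (sample rate / window size) Hz wide, and bucket k is centered on k times that width. Here is a rough sketch, again in Python for brevity. The helper names, the 4000 Hz sample rate, and the 400-sample window are assumptions I picked so the buckets come out at a round 10 Hz; they are not from the wrapper's readme. Picking the strongest bucket in each band is one common way to choose "control points", but whether it matches what the article does is my assumption.

```python
import math


def bin_width(sample_rate, window_size):
    """Width of one FFT bucket in Hz."""
    return sample_rate / window_size


def band_to_bins(low_hz, high_hz, sample_rate, window_size):
    """Bucket indices whose center frequency lies in [low_hz, high_hz)."""
    width = bin_width(sample_rate, window_size)
    first = math.ceil(low_hz / width)
    end = math.ceil(high_hz / width)  # exclusive upper index
    return range(first, end)


def band_peaks(magnitudes, bands, sample_rate, window_size):
    """For each (low_hz, high_hz) band, return the index of its strongest bucket."""
    peaks = []
    for low_hz, high_hz in bands:
        bins = band_to_bins(low_hz, high_hz, sample_rate, window_size)
        peaks.append(max(bins, key=lambda k: magnitudes[k]))
    return peaks


# Assumed setup: 4000 Hz sample rate, 400-sample window, so 10 Hz per bucket.
SAMPLE_RATE = 4000
WINDOW_SIZE = 400
BANDS = [(40, 80), (80, 120), (120, 180), (180, 300)]

# Fake spectrum with energy at 60 Hz (bucket 6) and 150 Hz (bucket 15).
magnitudes = [0.0] * (WINDOW_SIZE // 2 + 1)
magnitudes[6] = 5.0
magnitudes[15] = 2.0

peaks = band_peaks(magnitudes, BANDS, SAMPLE_RATE, WINDOW_SIZE)
```

With these numbers the 40-80 band maps to buckets 4 through 7, so uneven bands like 120-180 and 180-300 are just different-sized groups of the same evenly spaced buckets. A band with no energy at all simply reports its first bucket, so in practice you would also want to check the magnitude before trusting a peak.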
BTW, I am not sure about that implementation, but I read here that you have to throw out the lowest frequency bucket when using FFT. This article has a similar notice, saying that the lowest bucket has only half the width.
If you really want accurate results out of this project, I suggest you generate test data with those specific frequencies and window periods so you can verify that your array data is getting populated correctly and isn’t accidentally getting skewed (off-by-one errors, incorrect window and filter calculations, etc.). Otherwise your success will come down to sheer luck and fiddling around, because you won’t be able to diagnose where any problems in your code lie.
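As a rough illustration of that kind of test, the sketch below generates a pure 60 Hz tone and checks that the energy lands in the bucket you expect. It uses a slow, naive DFT written out by hand instead of OouraFFT, purely so the example is self-contained; the function names, sample rate, and window size are my own assumptions. In your app you would feed the same tone through the wrapper and compare its output against this kind of reference.

```python
import cmath
import math


def sine_wave(freq_hz, sample_rate, num_samples, amplitude=1.0):
    """Generate a pure test tone as a list of floats."""
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate)
        for n in range(num_samples)
    ]


def dft_magnitudes(samples):
    """Naive O(n^2) DFT; returns magnitudes for buckets 0 .. n/2."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n)
            for t in range(n)
        )
        mags.append(abs(acc))
    return mags


def peak_bin(magnitudes):
    """Index of the strongest bucket, skipping bucket 0 (DC)."""
    return max(range(1, len(magnitudes)), key=lambda k: magnitudes[k])


# Assumed setup: 4000 Hz sample rate, 400-sample window, so 10 Hz per bucket.
sample_rate = 4000
window_size = 400
test_freq = 60.0

tone = sine_wave(test_freq, sample_rate, window_size)
mags = dft_magnitudes(tone)
expected_bin = round(test_freq * window_size / sample_rate)

# If this fails, the indexing or window math is off somewhere.
assert peak_bin(mags) == expected_bin
```

Once a tone that sits exactly on a bucket center behaves, try frequencies between bucket centers and at the band edges (for example 80 Hz), since those are where off-by-one mistakes tend to show up.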