I would like to optimize this algorithm. The function makeFrame divides the audio signal into time frames using a Hanning window of about 37 ms. The function divideFreqs then performs a fast Fourier transform on each time frame using the JTransforms library; this is the most time-consuming step. How can I cut down the running time? For a 5-second audio file the operation takes around 13 seconds, which is far too long. I was thinking about using multithreading, but I have never used it before.
public double[][] makeFrame(double[] audioOutput) {
    int length = audioOutput.length;
    //calculate a Hanning window size of 37 ms
    int window = (int) Math.round(0.37 * sampleRate);
    int interval = (int) Math.round(0.0116 * sampleRate);
    length = length - window;
    int numintervals = length / interval;
    //calculate Hanning window values
    double[] hanw = hanning(window);
    double[][] sections = new double[numintervals + 1][25];
    //divide the signal into time frames using a Hanning window of 37 ms
    int k = 0;
    for (int i = 0; i < length; i += interval) {
        double[] temp = new double[88200];
        int t = 0;
        int s;
        s = i;
        for (; s < i + window; s++) {
            temp[t] = audioOutput[s] * hanw[t];
            t++;
        }
        sections[k] = divideFreqs(temp, sampleRate);
        k++;
    }
    return sections;
}
public static double[] hanning(int window) {
    int w = 0;
    double h_wnd[] = new double[window]; //Hanning window
    for (int i = 1; i < window; i++) { //calculate the Hanning window
        h_wnd[i] = 0.5 * (1 - Math.cos(2.0 * Math.PI * i / (window + 1)));
    }
    return h_wnd;
}
public static double[] divideFreqs(double[] audioData, float fs) {
    DoubleFFT_1D fft = new DoubleFFT_1D(44100);
    int len;
    double[] secenergy;
    //Frequency bands in the range of 1Hz-20000Hz
    int[][] bandsec = new int[][]{
        {1, 100},
        {100, 200},
        {200, 300},
        {300, 400},
        {400, 510},
        {510, 630},
        {630, 770},
        {770, 920},
        {920, 1080},
        {1080, 1270},
        {1270, 1480},
        {1480, 1720},
        {1720, 2000},
        {2000, 2320},
        {2320, 2700},
        {2700, 3150},
        {3150, 3700},
        {3700, 4400},
        {4400, 5300},
        {5300, 6400},
        {6400, 7700},
        {7700, 9500},
        {9500, 12000},
        {12000, 15500},
        {15500, 20000}};
    //perform FFT on the data
    fft.realForwardFull(audioData);
    //splitting real and imaginary numbers
    double[] real = new double[22050];
    double[] imaginary = new double[22050];
    for (int row = 0; row < 22050; row++) {
        real[row] = (double) Math.round(audioData[row + row] * 100000000) / 100000000;
        imaginary[row] = (double) Math.round(audioData[row + row + 1] * 100000000) / 100000000;
    }
    len = bandsec.length;
    secenergy = new double[len];
    //calculate energy for each critical band
    double[] tempReal;
    double[] tempImag;
    for (int i = 0; i < len; i++) {
        int k = 0;
        tempReal = new double[bandsec[i][1] - (bandsec[i][0] - 1)];
        tempImag = new double[bandsec[i][1] - (bandsec[i][0] - 1)];
        for (int j = bandsec[i][0] - 1; j < bandsec[i][1]; j++) {
            tempReal[k] = real[j];
            tempImag[k] = imaginary[j];
            k++;
        }
        secenergy[i] = energy(tempReal, tempImag);
    }
    return secenergy;
}
public static double energy(double[] real, double[] imaginary) {
    double e = 0;
    Complex sum = new Complex(0, 0);
    ArrayList<Complex> complexList = new ArrayList<Complex>();
    for (int i = 0; i < real.length; i++) {
        Complex comp = new Complex(real[i], imaginary[i]);
        complexList.add(comp.multiply(comp));
    }
    for (int i = 0; i < complexList.size(); i++) {
        Complex comp = new Complex(complexList.get(i).getReal(), complexList.get(i).getImaginary());
        sum = Complex.add(comp, sum);
    }
    e = Math.sqrt(sum.magnitude());
    e = (double) Math.round(e * 10000) / 10000;
    return e;
}
I don't know your library, since I use FFTW myself, but these are the things I noticed:
1) Your FFT size is not a power of 2.
2) Your window is 370 ms, not 37 ms.
3) Since your window is 370 ms long (i.e. ~16k samples), why feed an 88200-element array into it (or does the constructor value say "take only 44100 values")? It is fully sufficient to take pow(2.0, ceil(log2(0.37*44100))) = 2^14 = 16384 as your FFT size. Zero padding won't add any frequency resolution, I'm afraid.
4) You instantiate a new FFT object on every call to divideFreqs. I'm not sure how expensive the construction is, so try making it a class member.
5) Last but not least (I think this is the major speed loss here): your hop size is much too small! A common overlap is 1/2 or 2/3 of the window size (in terms of your code: interval = windowSize/3). Your hop is roughly 1/32 of the window size. That's really overkill and gives you many redundant results.
Cheers
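To put rough numbers on points 3) and 5), here is a minimal sketch in plain Java with no FFT library involved. The class name FrameParams and the helpers nextPowerOfTwo and frameCount are made up for illustration; they are not part of the code in the question or of JTransforms. A 44100 Hz sample rate and a 5 s signal are assumed.

```java
final class FrameParams {
    private FrameParams() {}

    /** Smallest power of two that is >= n (hypothetical helper). */
    public static int nextPowerOfTwo(int n) {
        if (n <= 0 || n > (1 << 30)) {
            throw new IllegalArgumentException("n must be in 1..2^30, got " + n);
        }
        int highest = Integer.highestOneBit(n);
        return (highest == n) ? n : highest << 1;
    }

    /** Number of full windows that fit when the start advances by hop samples. */
    public static int frameCount(int signalLength, int window, int hop) {
        if (hop <= 0) {
            throw new IllegalArgumentException("hop must be positive, got " + hop);
        }
        if (signalLength < window) {
            return 0;
        }
        return (signalLength - window) / hop + 1;
    }

    public static void main(String[] args) {
        int sampleRate = 44100;                 // assumed sample rate
        int signalLength = 5 * sampleRate;      // assumed 5 s of audio

        // Window and hop exactly as computed in the question's makeFrame.
        int window = (int) Math.round(0.37 * sampleRate);
        int originalHop = (int) Math.round(0.0116 * sampleRate);

        // Point 3: FFT size rounded up to the next power of two.
        int fftSize = nextPowerOfTwo(window);

        // Point 5: hop of one third of the window (2/3 overlap).
        int suggestedHop = window / 3;

        System.out.println("window samples: " + window);
        System.out.println("fft size:       " + fftSize);
        System.out.println("frames, hop " + originalHop + ": "
                + frameCount(signalLength, window, originalHop));
        System.out.println("frames, hop " + suggestedHop + ": "
                + frameCount(signalLength, window, suggestedHop));
    }
}
```

Under these assumptions the window is 16317 samples, the FFT size becomes 16384, and the number of frames (and therefore FFT calls) drops from 399 with the original hop of 512 samples to 38 with a hop of window/3. Whether such a coarse hop is acceptable depends on the time resolution you need afterwards.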