Let’s say I have an array of zeros:
a = numpy.zeros(1000)
I then introduce some repetitive ‘events’:
a[range(0, 1000, 30)] = 1
Question is, how do I detect the ‘signal’ there? Because it’s far from the ideal signal if I do the ‘regular’ FFT I don’t get a clear indication of where my ‘true’ signal is:
f = abs(numpy.fft.rfft(a))
Is there a method to detect these repetitions with some degree of certainty? Especially if I have few of those mixed in, for example here:
a[range(0, 1000, 30)] = 1
a[range(0, 1000, 110)] = 1
a[range(0, 1000, 48)] = 1
I’d like to get three ‘spikes’ on the resulting data…
As an analytical technique acf/pacf/ccf are used
to identify periodicity in time-dependent signals, hence the correlogram, the graphical display of acf or pacf, displays self-similarity in a signal as a function of different lags. (So for instance, if you see values on the y axis peak at a lag of 12, and if your date is in months, that’s evidence of annual periodicity.)
To calculate and plot ‘similarity’ versus lag, if you don’t want to roll your own, i am not aware of a native Numpy/Scipy option; I also couldn’t find one in the ‘time series’ scikit (one of the libraries in the Scipy ‘Scikits’, domain-specific modules not included in the standard Scipy distribution) but it’s worth checking again. The other option is to install Python bindings to R (RPy2, available on SourceForge) which will allow you to access the relevant R functions, including ‘acf‘ which will calculate and plot the correlogram just by passing in your time series and calling the function.
On the other hand, if you want to identify continuous (unbroken) streams of a given type in your signal, then “run-length encoding” is probably what you want: