I’m trying to write a program to automate one of my more boring and repetitive work tasks. I have some programming experience but none with processing or interpreting large volumes of data so I am seeking your advice (both suggestions of techniques to try and also things to read to learn more about doing this stuff).
I have a piece of equipment that monitors an experiment by taking repeated samples and displays the readings on its screen as a graph. The input of the experiment can be altered, and one of these changes should produce a change in a section of the graph. That change is what I’m looking for in the experiment, and I currently identify it by eye. I want to automate this so that a computer looks at a set of results and spots the experiment input that causes the change.
I can already extract the results from the machine. Currently the results for a run are in the form of an integer array, with the index being the sample number and the corresponding value being the measurement.
The overall shape of the graph will be similar for each experiment run. The change I’m looking for will be roughly the same and will occur in approximately the same place every time for the correct experiment input. Unfortunately there are a few gotchas that make this problem more difficult.
- There is some noise in the measuring process, which means there is some random variation in the measured values between different runs, although the overall shape of the graph remains the same.
- The time the experiment takes varies slightly each run, causing two effects. First, the whole graph may be shifted slightly on the x axis relative to another run’s graph. Second, individual features may appear slightly wider or narrower in different runs.
In both these cases the variation isn’t particularly large, and you can assume that the only non-random variation is caused by the correct input being found.
I think you’re looking for information on Digital Signal Processing. It can range from very simple to very hard to understand.
If, say, your pre-event signal was 0, and every signal after the relevant signal was 1, you could just look for the first 1, figure out the time at which it occurred, and you’d be done. That’s basically the limiting case of simplicity, and it might be a good place to start. Implement that, and you’ve got the beginnings of a sense of how to answer your question.
Now, then, you’ve got noise. So, say, pre-event might range from -10 to 10, and post-event might range from 90 to 110. Still simple; watch for the first value greater than 10. But of course it’s never that simple. You might have to average a window of readings, might look for some threshold of change from previous measurement, etc.
In advanced cases, you could find yourself using transformations into other spaces, applying filters, pattern matching, and the like. But from your description, it sounds like reasonably simple methods should do the job for you. Don’t get intimidated by concepts like FFT – you probably don’t need them, yet.
For now, at least, assume that it can be solved simply. Start with a trivially simple (but insufficient) solution, and work your way towards the solution that works.
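To make the first two steps concrete, here is a rough sketch in Python of the "first value above a threshold" check and the "average a window of readings" refinement. The function names, the sample values, and the thresholds are all made up for illustration (the values just follow the -10 to 10 and 90 to 110 ranges used above), so treat it as a starting point rather than a finished solution.

```python
from typing import List, Optional, Sequence


def first_crossing(samples: Sequence[float], threshold: float) -> Optional[int]:
    """Return the index of the first sample strictly above threshold, or None."""
    for index, value in enumerate(samples):
        if value > threshold:
            return index
    return None


def moving_average(samples: Sequence[float], window: int) -> List[float]:
    """Mean of each run of `window` consecutive samples.

    The result has len(samples) - window + 1 entries.
    """
    if window < 1 or window > len(samples):
        raise ValueError("window must be between 1 and len(samples)")
    total = sum(samples[:window])
    averages = [total / window]
    for i in range(window, len(samples)):
        # Slide the window: add the newest sample, drop the oldest one.
        total += samples[i] - samples[i - window]
        averages.append(total / window)
    return averages


def first_smoothed_crossing(
    samples: Sequence[float], threshold: float, window: int
) -> Optional[int]:
    """Index of the last sample in the first window whose mean exceeds threshold."""
    index = first_crossing(moving_average(samples, window), threshold)
    return None if index is None else index + window - 1


# Hypothetical runs: low level before the event, high level after it.
clean_run = [3, -7, 9, -2, 5, 95, 104, 91, 108, 99]
spiky_run = [3, -7, 40, -2, 5, 95, 104, 91, 108, 99]  # one noise spike at index 2

print(first_crossing(clean_run, 10))              # 5
print(first_crossing(spiky_run, 10))              # 2 (fooled by the spike)
print(first_smoothed_crossing(spiky_run, 50, 3))  # 6
```

The plain threshold is fooled by a single noisy sample, while the windowed version ignores it at the cost of reporting the event a sample or so late. The window size and threshold are things you would have to tune against your own runs.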