I want to write a program that plays an audio file of a text being read aloud. I want to highlight the syllable currently being spoken in green and the rest of the current word in red. What kind of data structure should I use to store the audio file and the information that tells the program when to switch to the next word or syllable?
This is a slightly left-field suggestion, but have you looked at karaoke software? It may not seem ‘serious’ enough, but it sounds very similar to what you’re doing. For example, Aegisub is a subtitling program that lets you create subtitles in the SSA/ASS format. It has karaoke tools for highlighting the chosen word or part of a word.
It’s most commonly used for subtitling anime, but it also works for audio provided you have a suitable player. These are sadly quite rare on the Mac.
The format looks similar to the one proposed by Yuval A:
The lengths are durations rather than absolute offsets. This makes it easier to shift the start of the line without recalculating all the offsets. The double entry indicates a pause.
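If you do end up keeping this inside your own program, the duration-based idea could be sketched roughly like this. This is only an illustration, not Aegisub's or anyone else's actual format: `TimedLine`, `Syllable`, `activeIndex` and `restOfWord` are names I made up, durations are assumed to be in milliseconds, and a pause is modelled as a syllable with empty text. The audio itself stays in an ordinary audio file; you would feed in the playback position from whatever audio API you use (for example `Clip.getMicrosecondPosition()` in `javax.sound.sampled`, divided by 1000).

```java
import java.util.ArrayList;
import java.util.List;

/** One line of text: an absolute start time plus per-syllable durations (hypothetical sketch). */
final class TimedLine {

    /** A syllable, how long it is spoken, and whether it is the last syllable of its word. */
    static final class Syllable {
        final String text;       // empty string models a pause
        final long durationMs;
        final boolean endsWord;

        Syllable(String text, long durationMs, boolean endsWord) {
            this.text = text;
            this.durationMs = durationMs;
            this.endsWord = endsWord;
        }
    }

    private final long startMs;  // only this value changes when the whole line is shifted
    private final List<Syllable> syllables = new ArrayList<>();

    TimedLine(long startMs) {
        this.startMs = startMs;
    }

    TimedLine add(String text, long durationMs, boolean endsWord) {
        syllables.add(new Syllable(text, durationMs, endsWord));
        return this;
    }

    /** Index of the syllable being spoken at the given playback position, or -1 if none. */
    int activeIndex(long positionMs) {
        long remaining = positionMs - startMs;
        if (remaining < 0) {
            return -1;
        }
        for (int i = 0; i < syllables.size(); i++) {
            remaining -= syllables.get(i).durationMs;
            if (remaining < 0) {
                return i;
            }
        }
        return -1; // past the end of the line
    }

    /** Text of the syllable at the given index (the part to draw in green). */
    String textAt(int index) {
        return syllables.get(index).text;
    }

    /** Remaining syllables of the same word after the given index (the part to draw in red). */
    String restOfWord(int index) {
        StringBuilder rest = new StringBuilder();
        if (index < 0) {
            return "";
        }
        for (int i = index; i + 1 < syllables.size() && !syllables.get(i).endsWord; i++) {
            rest.append(syllables.get(i + 1).text);
        }
        return rest.toString();
    }
}
```

A line would then be built with something like `new TimedLine(1000).add("Ka", 200, false).add("ra", 300, false).add("o", 250, false).add("ke", 400, true)`. On each repaint you call `activeIndex` with the current playback position, draw `textAt(i)` in green and `restOfWord(i)` in red. Because only `startMs` is absolute, moving the line means changing a single number.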
Is there a good reason this needs to be part of your Java program, or would an off-the-shelf solution work?