I need to find the number, the in and out timecode points and all lines of the text.
9
00:09:48,347 --> 00:09:52,818
- Let's see... what else she's got?
- Yea... ha, ha.
10
00:09:56,108 --> 00:09:58,788
What you got down there, missy?
11
00:09:58,830 --> 00:10:00,811
I wouldn't do that!
12
00:10:03,566 --> 00:10:07,047
-Shit, that's not enough!
-Pull her back!
I'm currently using this pattern, but it misses the second line whenever the text spans two lines:
(?<Order>\d+)\r\n(?<StartTime>(\d\d:){2}\d\d,\d{3}) --> (?<EndTime>(\d\d:){2}\d\d,\d{3})\r\n(?<Sub>.+)(?=\r\n\r\n\d+|$)
Any help would be much appreciated.
I think there are two problems with the regex. The first is that the dot (.) near the end, in (?<Sub>.+), does not match newlines. So you could replace the dot with something that does, for example (?<Sub>[\s\S]+). Or you could specify RegexOptions.Singleline as an option to the regex. The only thing that option does is make the dot match newlines.
The second problem is that .+ matches as many lines as it can. You can make it non-greedy, like (?<Sub>.+?). This matches the least amount of text that ends with an empty line or the end of the string.
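Putting both fixes together, a minimal C# sketch might look like the following. It keeps the pattern from the question, changes only the Sub group to .+?, and passes RegexOptions.Singleline. The names SrtParser, Cue and Parse are made up for illustration, and the sketch assumes the input uses \r\n line endings with a blank line between cues, as the original pattern does.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SrtParser
{
    // Singleline lets '.' match newlines; '+?' makes the Sub group lazy, so it
    // stops at the first blank line followed by digits (or at the end of input).
    private static readonly Regex Cue = new Regex(
        @"(?<Order>\d+)\r\n" +
        @"(?<StartTime>(\d\d:){2}\d\d,\d{3}) --> (?<EndTime>(\d\d:){2}\d\d,\d{3})\r\n" +
        @"(?<Sub>.+?)(?=\r\n\r\n\d+|$)",
        RegexOptions.Singleline);

    // Hypothetical helper: returns one match per subtitle cue.
    public static MatchCollection Parse(string srt)
    {
        return Cue.Matches(srt);
    }

    public static void Main()
    {
        // Assumed sample: two cues separated by a blank line, CRLF line endings.
        string srt =
            "9\r\n00:09:48,347 --> 00:09:52,818\r\n" +
            "- Let's see... what else she's got?\r\n- Yea... ha, ha.\r\n" +
            "\r\n" +
            "10\r\n00:09:56,108 --> 00:09:58,788\r\n" +
            "What you got down there, missy?";

        foreach (Match m in Parse(srt))
        {
            Console.WriteLine("{0}: {1} -> {2}",
                m.Groups["Order"].Value,
                m.Groups["StartTime"].Value,
                m.Groups["EndTime"].Value);
            // Trim removes a stray '\r' that Sub can pick up when the
            // input ends with a trailing line break.
            Console.WriteLine(m.Groups["Sub"].Value.Trim());
        }
    }
}
```

One thing to watch for: without RegexOptions.Multiline, $ also matches just before a final \n, so if the file ends with a trailing line break the last Sub value can end in \r. Trimming the captured text, as above, is one way to deal with that.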