Using Ruby, I’m trying to parse some documentation. I need to split it into blocks of text, each consisting of a heading followed by an unknown length of text, and push each block to an array:
SECTION 1. A HEADING
Some undetermined length of text,
which can be multiple lines and paragraphs.
SECTION 2. ANOTHER HEADING
Another big block of text.
should become
["SECTION 1. A HEADING
Some undetermined length of text,
which can be multiple lines and paragraphs.",
"SECTION 2. ANOTHER HEADING
Another big block of text."]
I could just use string.split(/\n\n\n/), but I want something more specific, as I can’t guarantee that each section will have two blank lines after it. A little more experimenting led me to this:
string.split(/(?:^|\n)(SECTION.+\n)/).each do |s|
  sections << s
end
but that returns each heading and its body as separate elements, so I’d have to process the output again to get what I need.
Is there some way to get this done without having to do multiple passes?
Thanks for your help.
You can use String#scan with a multiline-mode regexp and a positive look-ahead:
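A minimal sketch of that approach, assuming every heading starts its line with "SECTION", a number, and a period, as in the sample. The helper name split_sections and the constant SECTION_PATTERN are made up for illustration:

```ruby
# Hypothetical pattern: a heading line, then everything (lazily) up to
# the next heading or the end of the string. The /m flag lets "." match
# newlines; "^" already matches at every line start in Ruby.
SECTION_PATTERN = /^SECTION \d+\..*?(?=^SECTION \d+\.|\z)/m

# Hypothetical helper: returns one string per section, heading included.
def split_sections(text)
  # strip removes the blank lines left at the end of each match
  text.scan(SECTION_PATTERN).map(&:strip)
end

doc = <<~TEXT
  SECTION 1. A HEADING
  Some undetermined length of text,
  which can be multiple lines and paragraphs.


  SECTION 2. ANOTHER HEADING
  Another big block of text.
TEXT

sections = split_sections(doc)
sections.each { |s| p s }
```

Because the look-ahead does not consume the next heading, each match ends right before it, so the result is built in a single pass and the number of blank lines between sections does not matter. If a body line could itself begin with "SECTION 3." or similar, the pattern would need to be tightened.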