i’m trying to get a group of text in between 2 strings in ruby, and i can’t seem to get the right method or use the right regex.
text:
<html>
<body>
<!-- begin posts -->
<h1>all kinds of html</h1>
<p> blah blah </p>
<p> i've been working on this forever </p>
<!-- end posts -->
</body>
</html>
and i just want to get everything from <!-- begin posts --> to <!-- end posts -->, inclusive, and save that block of text in a text file.
i figured out how to print the line in the beginning:
File.foreach("index.html") do |line|
  # prints only the line that contains the opening marker
  puts line if line =~ /<!-- begin/
end
but not the lines after it, up to and including the end marker.
i have a rubular here http://rubular.com/r/0W9QDpMGkM where i haven’t been able to figure out anything.
thanks everyone in advance.
Don’t do it line by line, just slurp the whole thing into a string and rip it apart:
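The code that went with this step is missing here, so what follows is a reconstruction rather than the original snippet. It is a minimal sketch of the slurp-and-match approach; the variable name html and the inline sample string are assumptions, and in real use the string would come from File.read("index.html").

```ruby
# In practice: html = File.read("index.html")
# A literal string stands in for the file so this sketch runs on its own.
html = <<~HTML
  <html>
  <body>
  <!-- begin posts -->
  <h1>all kinds of html</h1>
  <p> blah blah </p>
  <!-- end posts -->
  </body>
  </html>
HTML

# The /m modifier lets . match newlines, so the group can span several lines.
want = html.match(/<!-- begin posts -->(.*)<!-- end posts -->/m)[1]
```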
And now everything between your markers is in want. Don’t forget the m modifier on the regex.

While you’re mangling your input you can strip out the stray leading and trailing whitespace too:
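Again a reconstruction, not the original snippet: a short sketch of the same match with String#strip chained on the end. The sample string is an assumption standing in for the file contents.

```ruby
# Stand-in for File.read("index.html")
html = "<body>\n<!-- begin posts -->\n  <p> blah blah </p>\n<!-- end posts -->\n</body>\n"

# strip removes the newline and indentation left around the captured block
want = html.match(/<!-- begin posts -->(.*)<!-- end posts -->/m)[1].strip
```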
As Tudor notes below, you might want to use a non-greedy (.*?) for the group if you think there is any chance of multiple <!-- end posts --> markers; doesn’t hurt to be a little paranoid when they really are out to get you.

References: File.read (actually IO.read), String#match, String#strip

UPDATE: the match method on a string returns a MatchData object. The array access operator, [], is used to access the matching parts. There’s only one group in the regex, so [1] gets you the contents of that group without the surrounding HTML comment delimiters.
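To make the greedy versus non-greedy difference and the MatchData indexing concrete, here is a small sketch. The sample input with two end markers is made up for illustration, and greedy and lazy are just local names.

```ruby
# Made-up input with two end markers
html = <<~HTML
  <!-- begin posts -->
  <p>first</p>
  <!-- end posts -->
  <p>footer</p>
  <!-- end posts -->
HTML

greedy = html.match(/<!-- begin posts -->(.*)<!-- end posts -->/m)
lazy   = html.match(/<!-- begin posts -->(.*?)<!-- end posts -->/m)

# (.*) runs on to the LAST end marker, so the first one ends up inside the group
greedy_body = greedy[1].strip

# (.*?) stops at the FIRST end marker
lazy_body = lazy[1].strip

# [0] is the whole match, markers included; [1] is only the group
with_markers = lazy[0]
```

Since the question asks for the block including the markers, [0] is probably the piece to save; something like File.write("posts.txt", with_markers) would do it, where posts.txt is a hypothetical file name.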