I need to write a small Ruby function that does word wrapping. I have the following function:
def word_wrap(text, line_width)
  if line_width.nil? || line_width < 2
    line_width = 40
  end
  text.split("\n").collect do |line|
    line.length > line_width ? line.gsub(/(.{1,#{line_width}})(\s+|$)/, "\\1\n").strip : line
  end * "\n"
end
This is basically the word_wrap function included in Rails.
I would like to write a similar function that parses a string containing span elements, except that the tags should not be counted when wrapping the line.
Example:
s = "Lorem <span>ipsum dolor</span> si<span>t</span> amet, conse<span>ctetur adipiscing elit</span> Praesent"
At the moment, word_wrap(s, 20) gives something like this:
Lorem <span>ipsum
dolor</span>
si<span>t</span>
amet,
conse<span>ctetur
adipiscing
elit</span> Praesent
It should be:
Lorem <span>ipsum dolor</span> si
<span>t</span> amet, conse<span>ctetur
adipiscing elit</span>
Praesent
As you can see, the new word_wrap function creates lines of (max) 20 characters, without counting the <span> and </span> tags.
How would you do that? All suggestions are welcome!
Thanks in advance for your help.
Here’s a regex solution
Basically we grab as many characters as we can
up to the word width, ignoring spans (and making
sure we don’t grab parts of spans), and making sure
we end on a non-space character, followed by
either a span, or a word break.
I’ll break down how the regex works:
SPAN_RE matches one span tag (either <span> or </span> or …). ALL_SPAN_RE matches all the spans at a given position, guaranteeing that the next character matched is not the start of a span tag.
This means that we can match one character after the ALL_SPAN_RE and be sure that we’re not grabbing part of a span.
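To make the idea concrete, here is a minimal sketch of how two such building blocks could be written. The definitions below are assumptions for illustration, not the regexes from this answer: the atomic group, the `\G` anchor, `ONE_CHAR_RE`, and the `visible_length` helper are all hypothetical choices.

```ruby
# Hypothetical definitions; the answer's own regexes are not shown here.
SPAN_RE = /<\/?span\b[^>]*>/

# The atomic group (?>...) stops the engine from backtracking out of the
# run of tags, so whatever follows cannot match the "<" of a consumed tag.
ALL_SPAN_RE = /(?>(?:#{SPAN_RE})*)/

# One visible character, preceded by any tags that sit in front of it.
# \G pins each match to the end of the previous one, so scan cannot
# restart in the middle of a trailing tag.
ONE_CHAR_RE = /\G#{ALL_SPAN_RE}./

# Counts the characters that are not part of a span tag (hypothetical helper).
def visible_length(text)
  text.scan(ONE_CHAR_RE).length
end

p "si<span>t</span>".scan(ONE_CHAR_RE)                  # => ["s", "i", "<span>t"]
p visible_length("Lorem <span>ipsum dolor</span> si")   # => 20
p "<span>".match?(/\A(?:#{SPAN_RE})*./)                 # => true, backtracks into the tag
p "<span>".match?(/\A#{ALL_SPAN_RE}./)                  # => false
```

The last two lines show why a plain greedy `*` might not be enough on its own: when the rest of the pattern fails, the engine can give the tags back and match the `<` as an ordinary character.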
The full_re then just greedily matches as many characters as it can, up to the desired width (ignoring spans), making sure that it ends on a non-space character that is either the end of a word or followed by a span.
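For comparison, the same behaviour can be sketched without one big regex. This is a token-based alternative, not the full_re described above: it splits each line into tags, whitespace, and word pieces, then fills lines greedily while counting only the non-tag characters. All names (`SPAN_TAG`, `TOKEN_RE`, `wrap_line_ignoring_spans`, `word_wrap_ignoring_spans`) are hypothetical, and it assumes that closing tags stay on the current line while opening tags travel with the next word piece.

```ruby
# Hypothetical names throughout; a sketch, not the regex from the answer.
SPAN_TAG = /<\/?span\b[^>]*>/
# A span tag, a run of whitespace, a run of other characters, or a stray "<".
TOKEN_RE = /#{SPAN_TAG}|\s+|[^\s<]+|</

def wrap_line_ignoring_spans(line, line_width)
  lines   = []
  current = String.new  # text of the line being built, tags included
  visible = 0           # characters in `current` that are not part of a tag
  pending = []          # whitespace and tags seen since the last word piece

  line.scan(TOKEN_RE) do |token|
    if token.strip.empty?
      pending << token
    elsif token.match?(SPAN_TAG)
      # A closing tag right after a word piece stays on the current line.
      if token.start_with?("</") && pending.empty?
        current << token
      else
        pending << token
      end
    else
      gap = pending.sum { |t| t.strip.empty? ? t.length : 0 }
      if visible > 0 && visible + gap + token.length > line_width
        lines << current
        current = String.new
        visible = 0
      end
      if visible.zero?
        pending.reject! { |t| t.strip.empty? }  # no whitespace at line start
        gap = 0
      end
      current << pending.join << token
      visible += gap + token.length
      pending.clear
    end
  end

  current << pending.reject { |t| t.strip.empty? }.join  # trailing tags
  lines << current unless current.empty?
  lines.join("\n")
end

def word_wrap_ignoring_spans(text, line_width = 40)
  line_width = 40 if line_width.nil? || line_width < 2
  text.split("\n").map { |l| wrap_line_ignoring_spans(l, line_width) }.join("\n")
end

s = "Lorem <span>ipsum dolor</span> si<span>t</span> amet, conse<span>ctetur adipiscing elit</span> Praesent"
puts word_wrap_ignoring_spans(s, 20)
# Lorem <span>ipsum dolor</span> si
# <span>t</span> amet, conse<span>ctetur
# adipiscing elit</span>
# Praesent
```

Like the Rails function, a word piece longer than the width is left on its own line rather than split. Treating a span boundary as a possible break point is what allows "si" and "<span>t</span>" to end up on different lines, as in the expected output from the question.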