I am a Ruby beginner and I ran into a problem. I would like to know if there is a more ‘Ruby way’
to solve it.
My problem is:
I have a string like this:
str = "<div class=\"yui-u first\">\r\n\t\t\t\t\t<h1>Jonathan Doe</h1>\r\n\t\t\t\t\t<h2>Web Designer, Director</h2>\r\n\t\t\t\t</div>"
# Now I want to replace the substrings inside <h1></h1> and <h2></h2> with
# these two strings: "fooo" and "barr".
Here is what I did:
# First, I get the exact matched substrings of str:
r = str.scan(/(?<=<h\d>).*?(?=<\/h\d>)/)
# Then I create a hash that maps each match to its replacement string
h = {r[0] => 'fooo', r[1] => 'barr'}
# Finally, I use str.gsub! to replace the matched strings
str.gsub!(/(?<=<h\d>).*?(?=<\/h\d>)/, h)
# or like this
str.gsub!(/(?<=<h\d>).*?(?=<\/h\d>)/) {|v| h[v]}
PS: The substrings inside <h1></h1> and <h2></h2> are not fixed, so I have
to get these strings FIRST in order to build the hash. But I
really don’t like the code above (because I wrote two almost identical lines);
I think there must be a more elegant way to do this. I have tried something like this:
str.gsub!(/(?<=<h\d>).*?(?=<\/h\d>)/) { ['fooo', 'barr'].each {|v| v}}
But this didn’t work, because the block returns ['fooo', 'barr'] EVERY TIME!
If there is a way to make this block (or something similar) return one element at a time (return 'fooo' the first time, then 'barr' the second), my problem will be solved!
Thank you!
You really have no business parsing HTML with a regexp: a library like Nokogiri makes this significantly easier because you can modify the DOM directly. That said, the mistake you’re making is presuming that the iterator will execute only once per substitution and that the block will return only one value. The each method actually returns the object being iterated, so the block hands the whole array back on every call.
Here’s a way to avoid all the Regexp insanity:
If you want to do multiple substitutions where each gets a different value, the simple way is to just rip off values from a stack:
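A minimal sketch of that idea, reusing the pattern from the question. The constant HEADING_TEXT and the method replace_headings! are names made up for this example, and the sample string is a shortened version of the one in the question:

```ruby
# Pattern from the question: the text between <hN> and </hN>.
HEADING_TEXT = /(?<=<h\d>).*?(?=<\/h\d>)/

# Hypothetical helper: replaces each match with the next value from subs.
def replace_headings!(str, subs)
  # gsub! calls the block once per match; shift removes and returns
  # the first remaining element, so every match gets a different value.
  str.gsub!(HEADING_TEXT) { subs.shift }
end

str = "<h1>Jonathan Doe</h1>\n<h2>Web Designer, Director</h2>"
subs = %w[fooo barr]

replace_headings!(str, subs)
puts str
p subs  # => []
```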
Keep in mind that subs, the array of replacement values, is consumed here. An alternative that leaves the array intact would be to increment some kind of index variable and use that value instead, but this is a trivial modification.
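The index variant could look roughly like this. replace_by_index is a hypothetical helper name, not part of Ruby; it uses the non-destructive gsub, so neither the string nor the array is modified:

```ruby
# Hypothetical helper: replaces the n-th match of pattern with subs[n].
def replace_by_index(str, pattern, subs)
  index = -1
  # The block runs once per match; bump the counter and look up the value.
  str.gsub(pattern) { subs[index += 1] }
end

subs = %w[fooo barr]
html = "<h1>Jonathan Doe</h1>\n<h2>Web Designer, Director</h2>"
result = replace_by_index(html, /(?<=<h\d>).*?(?=<\/h\d>)/, subs)

puts result
p subs  # => ["fooo", "barr"]
```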