Take this snippet of code which is supposed to replace a href tag with

Asked: June 13, 2026 (2026-06-13T12:20:24+00:00)

Take this snippet of code, which is supposed to replace an <a href> tag with its URL:

irb> s='<p><a href="http://localhost/activate/57f7e805827f" style="color:#F19300;font-weight:bold">Click here!</a></p>'
irb> s.gsub(/<a href="([^ '"]*)"([^>]*)?>([^<]*)<\/a>/, "#{$1}")
=> "<p></p>"

This regex fails (URL is not found). Then I escape the < character in the regex, and it works:

irb> s.gsub(/<a href="([^ '"]*)"([^>]*)?>([^\<]*)<\/a>/, "#{$1}")
=> "<p>http://localhost/activate/57f7e805827f</p>"

1: According to RubyMine’s inspections, this escape should not be necessary. Is this correct? If so, why is the escape of > apparently not necessary as well?

2: Afterwards in the same IRB session, with the same string, the original regex suddenly works too:

irb> s.gsub(/<a href="([^ '"]*)"([^>]*)?>([^<]*)<\/a>/, "#{$1}")
=> "<p>http://localhost/activate/57f7e805827f</p>"

Is this because the $1 variable is not cleared when calling gsub again? If so, is it intentional behaviour or is this a Ruby regex bug?

3: When I change the string and re-execute the same command, $1 only changes after calling gsub twice on the changed string:

irb> s='<p><a href="http://localhost/activate/xxxxyyy" style="color:#F19300;font-weight:bold">Click here!</a></p>'
=> "<p><a href=\"http://localhost/activate/xxxxyyy\" style=\"color:#F19300;font-weight:bold\">Click here!</a></p>"
irb> s.gsub(/<a href="([^ '"]*)"([^>]*)?>([^\<]*)<\/a>/, "#{$1}")
=> "<p>http://localhost/activate/57f7e805827f</p>"
irb> s.gsub(/<a href="([^ '"]*)"([^>]*)?>([^\<]*)<\/a>/, "#{$1}")
=> "<p>http://localhost/activate/xxxxyyy</p>"

Is this intentional? If so, what is the logic behind this?

4: As the replacement string, some tutorials suggest using "#{$n}", others suggest using '\n' (where n is the group number, for example '\1'). With the backslash variant, the problems above do not appear. Why? What is the difference between the two?

Thank you!

1 Answer

Editorial Team · Answer 1 · 2026-06-13T12:20:25+00:00

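$1 holds the first capture group of the most recent successful match. In your example, the double-quoted string "#{$1}" is interpolated when the argument is evaluated, which is before gsub is even called, so the replacement is whatever $1 held at that moment. On the first call nothing has been matched yet, $1 is nil, and the replacement is the empty string. Every later call inserts the first capture of the previous match. That is also why escaping < seemed to help: it was simply the second call. You do not need to change your original regex to get the expected result the second time: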

s='<p><a href="http://localhost/activate/57f7e805827f" style="color:#F19300;font-weight:bold">Click here!</a></p>'

s.gsub(/<a href="([^ '"]*)"([^>]*)?>([^<]*)<\/a>/, "#{$1}")
# => "<p></p>"

s.gsub(/<a href="([^ '"]*)"([^>]*)?>([^<]*)<\/a>/, "#{$1}")
# => "<p>http://localhost/activate/57f7e805827f</p>"
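To make the evaluation order visible, here is a minimal sketch that builds the replacement string in a separate step before calling gsub. The method name stale_replacement_demo and the variable names are made up for illustration, and $~ = nil is only there to simulate a fresh session with no previous match.

```ruby
# Hypothetical helper: shows that "#{$1}" is built before gsub runs.
def stale_replacement_demo(html)
  pattern = /<a href="([^ '"]*)"([^>]*)?>([^<]*)<\/a>/

  $~ = nil                  # no previous match, as in a fresh irb session
  replacement = "#{$1}"     # interpolated right here: $1 is nil, so this is ""

  first = html.gsub(pattern, replacement)   # every link is replaced by ""

  # gsub has set $1 by now, so the very same expression yields the URL
  second = html.gsub(pattern, "#{$1}")

  [replacement, first, second]
end

sample = '<p><a href="http://localhost/activate/57f7e805827f" style="color:#F19300;font-weight:bold">Click here!</a></p>'
replacement, first, second = stale_replacement_demo(sample)

p replacement   # => ""
puts first      # <p></p>
puts second     # <p>http://localhost/activate/57f7e805827f</p>
```

The second call only looks correct because it reuses the capture left behind by the first one; with a different input string it would insert the old URL.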

You can pass a block to gsub instead. The block is evaluated after each match, so $1 refers to the current match, e.g.:

s.gsub(/<a href="([^ '"]*)"([^>]*)?>([^<]*)<\/a>/){ $1 }
# => "<p>http://localhost/activate/57f7e805827f</p>"
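Because the block runs once per match, it also does the right thing when the string contains several links. A small sketch, assuming a made-up helper name links_to_urls and placeholder example.com URLs:

```ruby
# Hypothetical helper: replaces each <a href="..."> element with its own URL.
def links_to_urls(html)
  # The block is called for every match, after $1 has been set for that match.
  html.gsub(/<a href="([^ '"]*)"([^>]*)?>([^<]*)<\/a>/) { $1 }
end

two_links = '<a href="http://a.example.com/1">one</a> and <a href="http://b.example.com/2">two</a>'
puts links_to_urls(two_links)
# http://a.example.com/1 and http://b.example.com/2
```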

This way, $1 behaves as you’d expect. I prefer named captures, though, so I don’t have to keep track of the numbers when I add a capture:

s.gsub(/<a href="(?<href>([^ '"]*))"([^>]*)?>([^<]*)<\/a>/){ $~[:href] }
# => "<p>http://localhost/activate/57f7e805827f</p>"
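The backslash variant from the question avoids the problem for a different reason: '\1' is an ordinary string that gsub itself expands for each match, so nothing is interpolated ahead of time. The sketch below shows the numbered and the named form; the method names url_by_number and url_by_name are made up for illustration, and the named pattern is a simplified version of the one above.

```ruby
# Hypothetical helpers: let gsub expand the backreference itself.
def url_by_number(html)
  # Single quotes keep the backslash, so gsub receives the two characters \1.
  # In a double-quoted string this would have to be written "\\1".
  html.gsub(/<a href="([^ '"]*)"([^>]*)?>([^<]*)<\/a>/, '\1')
end

def url_by_name(html)
  # \k<href> refers to the named group in the pattern.
  html.gsub(/<a href="(?<href>[^ '"]*)"[^>]*>[^<]*<\/a>/, '\k<href>')
end

link = '<p><a href="http://localhost/activate/57f7e805827f" style="color:#F19300;font-weight:bold">Click here!</a></p>'
puts url_by_number(link)   # <p>http://localhost/activate/57f7e805827f</p>
puts url_by_name(link)     # <p>http://localhost/activate/57f7e805827f</p>
```

Both forms work on the first call, because the substitution happens inside gsub rather than when the argument is evaluated.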
