I’m following along a tutorial (Ruby) that uses a regex to remove all html tags from a string:
product.description.gsub(/<.*?>/,'').
I don’t know how to interpret the ?. Does it mean: “at least one of the previous”? In that case, wouldn’t /<.+>/ have been more adequate?
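In this case, it makes * lazy. A ? placed directly after a quantifier such as * or + switches it from greedy to lazy; it does not mean "at least one of the previous".
1* : match as many 1s as possible.
1*? : match as few 1s as possible.
Here, when you have <a>text<b>some more text, <.*> will match <a>text<b>. <.*?>, however, will match <a> and <b>.
See also: Laziness Instead of Greediness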
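To see the difference on the example string, here is a small sketch using only core Ruby String methods (scan and gsub). The variable names are just for illustration. It also shows that <.+> from the question is still greedy, so it would not help.

```ruby
html = "<a>text<b>some more text"

# Greedy: .* runs to the end of the line, then backs up to the LAST ">".
greedy_matches = html.scan(/<.*>/)      # one match spanning both tags

# Lazy: .*? stops at the FIRST ">" after each "<".
lazy_matches = html.scan(/<.*?>/)       # one match per tag

# .+ is greedy as well; it only additionally requires at least one character.
plus_matches = html.scan(/<.+>/)

# The substitution from the tutorial, next to its greedy variant.
lazy_stripped   = html.gsub(/<.*?>/, '')
greedy_stripped = html.gsub(/<.*>/, '')

p greedy_matches   # ["<a>text<b>"]
p lazy_matches     # ["<a>", "<b>"]
p plus_matches     # ["<a>text<b>"]
p lazy_stripped    # "textsome more text"
p greedy_stripped  # "some more text"
```

With the greedy pattern, the text between the two tags is removed along with them, which is why the tutorial uses the lazy form.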
Another important note here is that this regex can easily fail on valid HTML. It is better to use an HTML parser and extract the text from the parsed document.
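As one illustration of how the regex can fail (the input string is made up for this example), a ">" inside a quoted attribute value is allowed in HTML, but the lazy pattern treats it as the end of the tag:

```ruby
# A ">" inside a quoted attribute value is legal HTML,
# but the regex has no notion of quoting.
html = '<a title="1 > 0">link</a>'

result = html.gsub(/<.*?>/, '')

# The first match ends at the ">" inside the attribute,
# so the rest of the tag leaks into the output.
p result   # " 0\">link"
```

The intended result would be just "link". An HTML parser (in Ruby, the Nokogiri gem is a common choice) handles quoting, comments and similar cases that a single regex cannot.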