I’m trying to parse html page and I use the following regular expression: var

Question

Asked: June 12, 2026

I’m trying to parse an HTML page, and I use the following regular expression:

var regex = new Regex(@"<tag1 id=.id1.>.*<tag2>", RegexOptions.Singleline);

“<tag1 id=.id1.>” occurs in the document only once. “<tag2>” occurs nearly 50 times after the occurrence of “tag1”. But when I match the page source against my regular expression, it returns only one match. Moreover, when I change RegexOptions to “None” or “Multiline”, no matches are returned. I’m very confused by this and would appreciate any help.

1 Answer

Editorial Team · Answer 1 · 2026-06-12T05:41:39+00:00

Leaving aside the obvious exhortations about not using regex to parse HTML, I can explain to you why you’re seeing what you’re seeing.

If tag1 occurs in your text only once, then the regex can only match it once, so there can never be more than one match. Regular expression matches “consume” the text they have matched, so the next match attempt starts at the end of the last successful match.
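A minimal sketch of this point. The sample string and the `ConsumeDemo` class are made up for illustration and are not the asker's actual page. The lazy `.*?` is used here only to keep the match short.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ConsumeDemo
{
    // Hypothetical input: one <tag1 id="id1"> followed by three <tag2>.
    public const string Sample = "<tag1 id=\"id1\"><tag2>a<tag2>b<tag2>c";

    // Counts the non-overlapping matches of a pattern in the sample.
    public static int Count(string pattern) =>
        Regex.Matches(Sample, pattern, RegexOptions.Singleline).Count;

    public static void Main()
    {
        // Every match has to begin at <tag1 ...>, which exists only once,
        // so the count can never exceed 1.
        Console.WriteLine(Count(@"<tag1 id=.id1.>.*?<tag2>")); // 1

        // Without the tag1 prefix, each <tag2> is found separately.
        Console.WriteLine(Count(@"<tag2>")); // 3
    }
}
```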

This leads to the next problem: .* is greedy, so it matches (with RegexOptions.Singleline) all the way to the end of the string and then backtracks to the last <tag2> it finds in order to allow a successful match. That is another reason why you only get one match.
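To see the greedy behaviour in isolation, here is a small sketch on an invented input. `GreedyDemo` and the sample string are assumptions for illustration. The lazy variant `.*?` is not part of the original pattern and is shown only for contrast.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyDemo
{
    // Hypothetical input: one <tag1 id="id1"> followed by three <tag2>.
    public const string Sample = "<tag1 id=\"id1\">x<tag2>a<tag2>b<tag2>c";

    // Returns the text of the first match of the pattern in the sample.
    public static string FirstMatch(string pattern) =>
        Regex.Match(Sample, pattern, RegexOptions.Singleline).Value;

    public static void Main()
    {
        // Greedy .* runs to the end of the string, then backtracks
        // to the LAST <tag2>.
        Console.WriteLine(FirstMatch(@"<tag1 id=.id1.>.*<tag2>"));
        // <tag1 id="id1">x<tag2>a<tag2>b<tag2>

        // Lazy .*? expands one character at a time and stops
        // at the FIRST <tag2>.
        Console.WriteLine(FirstMatch(@"<tag1 id=.id1.>.*?<tag2>"));
        // <tag1 id="id1">x<tag2>
    }
}
```

Either way there is still only one match, because the single tag1 is consumed by it.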

As for your second question: Why do the matches go away if you don’t use RegexOptions.Singleline? Simple: Without that option, the dot . cannot match newlines, and there appears to be at least one newline between tag1 and the first tag2.
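The effect of the option can be checked with a short sketch. The two-line sample string and the `SinglelineDemo` class are assumptions for illustration. The pattern is the one from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SinglelineDemo
{
    // Hypothetical input with a newline between tag1 and tag2.
    public const string Sample = "<tag1 id=\"id1\">\n<tag2>";
    public const string Pattern = @"<tag1 id=.id1.>.*<tag2>";

    // Reports whether the pattern matches the sample under the given options.
    public static bool Matches(RegexOptions options) =>
        Regex.IsMatch(Sample, Pattern, options);

    public static void Main()
    {
        // By default the dot does not match '\n', so .* cannot cross the line break.
        Console.WriteLine(Matches(RegexOptions.None));       // False

        // Multiline only changes how ^ and $ behave; the dot is unaffected.
        Console.WriteLine(Matches(RegexOptions.Multiline));  // False

        // Singleline lets the dot match '\n' as well.
        Console.WriteLine(Matches(RegexOptions.Singleline)); // True
    }
}
```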
