I’m trying to write a search and replace regex that will detect whether HTML

Asked: May 17, 2026

I’m trying to write a search and replace regex that will detect whether HTML that has been returned by a web request is complete. I have had cases when the server returns incomplete HTML (half of the page), so I want to detect that in the client and request the page again.

I was thinking the regex could look for the presence of <html[^>]*>, and then the absence of </html>. The replace part would then replace the whole HTML with a bit of special text.

I can’t just check for the absence of </html> because the returned data might be a text file, and I can’t check MIME types.

Any ideas? I just can’t wrap my head around the look-behinds this would require. I’m not trying to parse HTML, just searching for bits of text, which is what regexes are for, right?

EDIT:

The regexes will be run by C#, but I write them in a regex editor. I can only use a search and replace regex to solve this, nothing else.

1 Answer

Editorial Team · Answer 1 · 2026-05-17T00:01:29+00:00

Oded is correct. You cannot parse HTML with regex. But of course you can see whether some (multiline) string contains <html> not followed by </html>. If you are sure that whatever your web request returns will be consistent and not contain any weird things like html tags inside comments, then

<html\b[^>]*>(?:(?!<\s*/\s*html).)*\Z

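will find such a string, provided you set the “dot matches newlines” option. How to set it depends on the regex implementation; in .NET, which the question's edit mentions, that option is RegexOptions.Singleline, or (?s) written inline at the start of the pattern.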

<html\b[^>]*>          # match <html> tag
(?:                    # match the following:
  (?!<\s*/\s*html)     # If it's impossible to match </html here
  .                    # then match any character
)*                     # zero or more times.
\Z                     # Then assert that we are indeed at the end of the string
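
To see the search-and-replace behave as described, here is a minimal sketch in Python, chosen only for illustration. The names `INCOMPLETE_HTML`, `MARKER` and `mark_incomplete` are hypothetical, and the marker text is an assumption, since the question only asks for "a bit of special text". `re.DOTALL` is Python's spelling of the "dot matches newlines" option.

```python
import re

# The pattern from the answer above, unchanged.
# re.DOTALL makes "." match newlines as well.
INCOMPLETE_HTML = re.compile(
    r"<html\b[^>]*>(?:(?!<\s*/\s*html).)*\Z",
    re.DOTALL,
)

# Hypothetical marker; substitute whatever special text the client looks for.
MARKER = "[[INCOMPLETE_HTML]]"


def mark_incomplete(body: str) -> str:
    """Replace an unterminated <html>... tail with MARKER.

    Text with a closing </html>, or with no <html> tag at all,
    is returned unchanged.
    """
    return INCOMPLETE_HTML.sub(MARKER, body)


complete = "<html><body>done</body></html>\n"
truncated = '<html lang="en">\n<body>\n<p>half a pa'
plain_text = "just a text file, no markup"

for sample in (complete, truncated, plain_text):
    print(repr(mark_incomplete(sample)))
```

Note that the match starts at the `<html>` tag, so anything before it (a doctype, for instance) stays in place next to the marker. The pattern is also case-sensitive as written, so an uppercase `<HTML>` tag would need the ignore-case option.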
