Before somebody points me to that question, I know that one can’t parse html


Asked: June 6, 2026

Before somebody points me to that question, I know that one can’t parse html with regex 🙂 And this is not what I am trying to do.

What I need is:

Input: a string containing html.
Output: replace all opening tags <tag> with

***<tag>

So if I get

<a><b><c></a></b></c>, I want

***<a>***<b>***<c></a></b></c>

as output.

I’ve tried something like:

(<[~/].+>)

and replace it with

***$1

But it doesn’t really seem to work the way I want it to. Any pointers?

Clarification: it’s guaranteed that there are no self-closing tags or comments in the input.

1 Answer

Editorial Team · Answer 1 · 2026-06-06T04:40:00+00:00

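You have two problems: ^, not ~, is the character that negates a character class; and .+ is greedy, so it matches as many characters as possible before the final >. Make the quantifier lazy, and use * rather than + so that one-letter tags still match: with .+? the dot has to consume at least one character after the first letter, so on <a><b> it swallows the first > and the match runs on to the end of <b>. Change it to:

(<[^/].*?>)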

You can probably also drop the parentheses and refer to the whole match in the replacement with $0 or $&, depending on the language.
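
For anyone who wants to try this out, here is a minimal sketch in Python (the question does not say which language is in use, so that choice is an assumption, and the name mark_opening_tags is made up for illustration). It uses the lazy pattern <[^/].*?> so that one-letter tags such as <a> are matched, and a function replacement in place of $0 or $&:

```python
import re

# "<", then one character that is not "/", then lazily anything up to the next ">".
# Relies on the question's guarantee: no self-closing tags and no comments.
OPENING_TAG = re.compile(r"<[^/].*?>")


def mark_opening_tags(html: str, marker: str = "***") -> str:
    """Return `html` with `marker` inserted before every opening tag."""
    # A function replacement inserts the marker literally, so no
    # escaping of backslashes or group references is needed.
    return OPENING_TAG.sub(lambda m: marker + m.group(0), html)


print(mark_opening_tags("<a><b><c></a></b></c>"))
# prints: ***<a>***<b>***<c></a></b></c>
```

Note that . does not match a newline by default, so a tag split across lines would be skipped, and an attribute value containing > would end the match early. Neither case is ruled out by the question, so treat this as a starting point rather than a general solution.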
