I’m not so able with regex and I’m looking for the syntax to exclude

Asked: May 25, 2026 (2026-05-25T06:55:04+00:00)

I’m not so able with regex and I’m looking for the syntax to exclude something.
I’m parsing <, >, " and & in HTML code (to replace them with &lt;, etc.) and I need to exclude <br/> from parsing.
For example:

<html><br/>
   <head><title></title></head><br/>
   <body><br/>
   </body><br/>
</html>

I tried something like r'<\b?![br]' and others, but they don’t work completely. I use re.sub() to do the replacement.

1 Answer

Editorial Team · Answer 1 · 2026-05-25T06:55:05+00:00

Ok, now the question is open again, I can do it as an answer, so…

Unless I’m missing something, and provided it’s only <br/> (not any variants), you can just replace <(?!br/>) with &lt; and (?<!<br/)> with &gt; and that’s it?

In Python, it looks like that means this:

import re
text = re.sub( r'<(?!br/>)' , '&lt;' , text )   # '<' not followed by 'br/>'
text = re.sub( r'(?<!<br/)>' , '&gt;' , text )  # '>' not preceded by '<br/'
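For anyone who wants to try this end to end, here is a minimal sketch that wraps the two substitutions in a function and runs them on a shortened version of the HTML from the question. The function name escape_except_br is just something I made up for illustration, and it only handles < and > (not & or "), exactly like the two lines above.

```python
import re

def escape_except_br(text):
    """Escape < and > everywhere except in the literal tag <br/>.

    Hypothetical helper name; only the exact spelling <br/> is preserved.
    """
    # Escape every '<' that is not the start of the exact sequence '<br/>'.
    text = re.sub(r'<(?!br/>)', '&lt;', text)
    # Escape every '>' that is not immediately preceded by '<br/'.
    text = re.sub(r'(?<!<br/)>', '&gt;', text)
    return text

sample = "<html><br/>\n</html>"
print(escape_except_br(sample))
```

The order of the two calls does not matter here, because the replacement strings &lt; and &gt; contain neither < nor >.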

To explain what’s going on, (?!…) is a negative lookahead – it only successfully matches at a position if the following text does not match the sub-expression it contains.
(Note lookaheads do not consume the text matched by their sub-expression, they only verify if it exists, or not.)

Similarly, (?<!…) is a negative lookbehind, and does the same thing but using the preceding text.
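A quick way to see that lookarounds only check the surrounding text without consuming it is to look at what a match actually covers. This is just a small sketch using re.search on a made-up test string; in both cases the match is a single character.

```python
import re

demo = "<br/><p>"

# Negative lookahead: the first '<' is skipped because 'br/>' follows it.
ahead = re.search(r'<(?!br/>)', demo)
print(ahead.span(), repr(ahead.group()))

# Negative lookbehind: the first '>' is skipped because '<br/' precedes it.
behind = re.search(r'(?<!<br/)>', demo)
print(behind.span(), repr(behind.group()))
```

Each match is one character wide (just the '<' or the '>'), which is why re.sub() replaces only that character and leaves the text around it alone.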

However, lookbehinds do have a slight difference to lookaheads (in some regex implementations), which is that the sub-expressions inside lookbehinds must represent fixed-width or limited-width matches.

Python’s re module is one of the ones that requires a fixed width, so whilst the above expression works (because <br/ is always four characters), if it was (?<!<br\s*/?)> then it would not be a valid regex for Python because it represents a variable-length match. (However, you can stack multiple lookbehinds, so you could potentially manually list the assorted options, if that was necessary.)
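To make the fixed-width point concrete, here is a rough sketch showing that the variable-width lookbehind is rejected by the standard re module, followed by the stacked-lookbehind workaround. The three variants I list (<br>, <br/> and <br />) and the function name escape_except_br_variants are my own assumptions for illustration; adjust the list to whatever spellings actually occur in your input.

```python
import re

# The variable-width lookbehind fails to compile with the standard re module.
try:
    re.compile(r'(?<!<br\s*/?)>')
    variable_width_rejected = False
except re.error:
    variable_width_rejected = True

def escape_except_br_variants(text):
    """Escape < and > but keep <br>, <br/> and <br /> (assumed variant list)."""
    # Lookaheads may be variable width, so one pattern covers all three forms.
    text = re.sub(r'<(?!br(?: ?/)?>)', '&lt;', text)
    # Lookbehinds must be fixed width, so stack one per variant.
    text = re.sub(r'(?<!<br)(?<!<br/)(?<!<br /)>', '&gt;', text)
    return text

print(variable_width_rejected)
print(escape_except_br_variants("<p><br><br/><br /></p>"))
```

Stacking works because each (?<!…) is checked independently at the same position, and each one on its own has a fixed length (three, four and five characters here).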
