I’m trying to delete some things from a block of text using regex. I

Question

Asked: June 7, 2026 (2026-06-07T17:15:10+00:00)

I’m trying to delete some things from a block of text using regex. I have all of my patterns ready, but I can’t remove two (or more) matches that overlap.

For example:

import re

r1 = r'I am'
r2 = r'am foo'

text = 'I am foo'

re.sub(r1, '', text)   # Returns ' foo'
re.sub(r2, '', text)   # Returns 'I '

How do I replace both of the occurrences simultaneously and end up with an empty string?

I ended up using a slightly modified version of Ned Batchelder’s answer:

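def clean(self, text):
  # One zero byte per character; a nonzero byte marks a character to delete.
  mask = bytearray(len(text))

  for pattern in patterns:
    for match in re.finditer(pattern, text):
      # Mark the whole matched span; a bytearray slice takes bytes, not str.
      start, end = match.span()
      mask[start:end] = b'x' * (end - start)

  # Keep only the characters whose mask byte is still zero.
  return ''.join(character for character, bit in zip(text, mask) if not bit)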

1 Answer

Editorial Team · Answer 1 · 2026-06-07T17:15:13+00:00

You can’t do it with consecutive re.sub calls as you have shown, because the first substitution removes text that the second pattern needs in order to match. Instead, use re.finditer to find all the matches against the original text. Each match is a match object whose .start() and .end() methods give its position. You can collect all of those positions together, and then remove the characters in a single pass at the end.
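As a small illustration of what re.finditer reports, here is a sketch that uses the text and the two patterns from the question. The names patterns and spans are only for this example:

```python
import re

text = 'I am foo'
patterns = [r'I am', r'am foo']

# Collect the (start, end) span of every match against the original text.
spans = [m.span() for pat in patterns for m in re.finditer(pat, text)]

print(spans)  # [(0, 4), (2, 8)]
```

The two spans overlap on positions 2 and 3, which is why neither re.sub call alone can remove everything.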

Here I use a bytearray as a mutable mask with one byte per character of the text. It’s initialized to zero bytes, and I mark with an ‘x’ every position that falls inside a match of any regex. Then I use the mask to select the characters to keep from the original string, and build a new string with only the unmatched characters:

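bits = bytearray(len(text))  # one zero byte per character of text
for pat in patterns:
    for m in re.finditer(pat, text):
        # Mark every matched position; a bytearray slice needs bytes, not str.
        bits[m.start():m.end()] = b'x' * (m.end() - m.start())
# Keep only the characters whose mask byte is still zero.
new_string = ''.join(c for c, bit in zip(text, bits) if not bit)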
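For anyone who wants to try this directly, here is a self-contained sketch of the same mask idea wrapped in a function, assuming Python 3. The name remove_all is hypothetical and not part of the re module:

```python
import re

def remove_all(text, patterns):
    """Delete every character covered by a match of any pattern."""
    mask = bytearray(len(text))  # zero means "keep this character"
    for pat in patterns:
        for m in re.finditer(pat, text):
            start, end = m.span()
            # Overlapping matches simply mark the same positions twice.
            mask[start:end] = b'x' * (end - start)
    return ''.join(c for c, bit in zip(text, mask) if not bit)

print(repr(remove_all('I am foo', [r'I am', r'am foo'])))          # ''
print(repr(remove_all('say I am foo now', [r'I am', r'am foo'])))  # 'say  now'
```

Every pattern is matched against the original text, so the order of the patterns does not matter. Note that re.finditer only yields non-overlapping matches for a single pattern, so this handles overlaps between different patterns, not overlapping matches of the same pattern.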
