Given this re.sub and 'replace' function – thanks, Ignacio, for the pointer! – I am able to replace all matches in my very long text blob with the string '* NONSENSE *' – so far, so good!
Along the way, I'd like to find the substring within the matchobj, calling it 'findkey', so I can do additional work with it…
How do I do this?
data = re.sub('(:::[A-Z,a-z,:]+:::)', replace, data)

def replace(matchobj):
    if matchobj.group(0) != '':
        # this seems to work:
        tag = matchobj.group(1)
        # but this doesn't:
        findkey = re.search(':::([A-Z,a-z]+):::', tag)
        return '******************** NONSENSE ********************'
    else:
        return ''
Try this. You can match the inner part as part of the initial sub call.
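A minimal sketch of that idea, not necessarily the exact code intended here: nest a second capture group inside the first, so group 1 is the whole tag and group 2 is the key. The sample text, the `TAG_PATTERN` and `found_keys` names, and the tightened character class (letters and ':' only, assuming the commas in the original class were meant as separators) are all assumptions for illustration.

```python
import re

# Group 1 is the whole tag, group 2 is the key between the ':::' delimiters.
# Assumption: the commas in the original character class were not meant to be
# matched literally, so the class here is just letters and ':'.
TAG_PATTERN = re.compile(r'(:::([A-Za-z:]+):::)')

found_keys = []  # hypothetical list collecting keys for the "additional work"

def replace(matchobj):
    tag = matchobj.group(1)      # the full tag, e.g. ':::name:::'
    findkey = matchobj.group(2)  # the inner key as a plain string, e.g. 'name'
    found_keys.append(findkey)
    return '* NONSENSE *'

# Made-up sample input standing in for the long text blob.
data = 'before :::name::: middle :::other:key::: after'
data = TAG_PATTERN.sub(replace, data)
```

If you would rather keep the second search inside the function, note that re.search returns a match object (or None when nothing matches), not a string, so the key itself would be findkey.group(1). The inner pattern in the question also leaves ':' out of its character class, so it finds nothing for tags that contain a colon in the middle.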
returns the following