I have the following text:
xml = '''
<accessibility_info>
<accessibility role="captions" available="true" />
</accessibility_info>
<crew_member billing="top">
<display_name>John Viscount</display_name>
</crew_member>
<products>
<territory>GB</territory>
</products>'''
I need to remove the <crew_member> block. This is what I am currently doing:
clean_xml = re.sub('<crew_member>.*</crew_member>', '', metadata_contents,
flags=re.DOTALL)
However, it leaves a blank line where the block used to be:
<accessibility_info>
<accessibility role="captions" available="true" />
</accessibility_info>

<products>
<territory>GB</territory>
</products>
How would I change the regex to strip the newline as well, so it looks like:
<accessibility_info>
<accessibility role="captions" available="true" />
</accessibility_info>
<products>
<territory>GB</territory>
</products>
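Try this. The [^>]* part lets the opening tag carry attributes such as billing="top", the non-greedy .*? stops at the first closing tag, and the trailing \n consumes the newline that follows the block:
import re
print(re.sub(r'<crew_member[^>]*>.*?</crew_member>\n', '', xml, flags=re.DOTALL))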
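For reference, here is a minimal self-contained sketch of the same idea, using the sample text from the question. The names CREW_MEMBER_BLOCK and remove_crew_members are made up for illustration, and the trailing newline is made optional (\n?) on the assumption that the block might also be the last thing in the text:

```python
import re

xml = '''
<accessibility_info>
<accessibility role="captions" available="true" />
</accessibility_info>
<crew_member billing="top">
<display_name>John Viscount</display_name>
</crew_member>
<products>
<territory>GB</territory>
</products>'''

# \b keeps the pattern from matching tags such as <crew_members>.
# [^>]* allows attributes on the opening tag.
# .*? is non-greedy, so several blocks are removed one at a time.
# \n? also removes the newline right after the block, if there is one.
CREW_MEMBER_BLOCK = re.compile(
    r'<crew_member\b[^>]*>.*?</crew_member>\n?', flags=re.DOTALL
)

def remove_crew_members(text):
    """Return text with every <crew_member> block (hypothetical helper) removed."""
    return CREW_MEMBER_BLOCK.sub('', text)

clean_xml = remove_crew_members(xml)
print(clean_xml)
```

With a greedy .* instead of .*?, two separate <crew_member> blocks would be matched as one span and everything between them would be deleted as well, so the non-greedy form is probably the safer default.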