Currently I have code that does something like this:
soup = BeautifulSoup(value)
for tag in soup.findAll(True):
    if tag.name not in VALID_TAGS:
        tag.extract()
soup.renderContents()
Except I don’t want to throw away the contents inside the invalid tag. How do I get rid of the tag but keep the contents inside when calling soup.renderContents()?
The strategy I used is to replace a tag with its contents if they are of type NavigableString; if they aren't, recurse into them and replace their contents with NavigableString, and so on.
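A minimal sketch of this idea, assuming the bs4 package (BeautifulSoup 4) rather than the BeautifulSoup 3 API used in the question. It swaps the hand-written recursion for bs4's `Tag.unwrap()`, which replaces a tag with its own children and so should have the same effect. The whitelist contents and the `strip_invalid_tags` name are hypothetical, chosen only for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical whitelist; substitute whatever VALID_TAGS holds in your code.
VALID_TAGS = {"p", "b"}


def strip_invalid_tags(html, valid_tags=VALID_TAGS):
    """Remove tags that are not whitelisted but keep everything inside them."""
    soup = BeautifulSoup(html, "html.parser")
    # find_all(True) returns a list of every tag, so unwrapping while
    # looping does not disturb the iteration.
    for tag in soup.find_all(True):
        if tag.name not in valid_tags:
            # unwrap() drops the tag itself and leaves its children in place.
            tag.unwrap()
    return str(soup)


print(strip_invalid_tags("<div><p>Keep <span>this <b>text</b></span> here</p></div>"))
# prints: <p>Keep this <b>text</b> here</p>
```

Nested invalid tags are handled as well, since each one is unwrapped in turn and its children move up to the nearest surviving parent.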
I gave this same answer on another question. It seems to come up a lot.