I have a string containing text and HTML. I want to remove or otherwise disable some HTML tags, such as <script>, while allowing others, so that I can render it on a web page safely. I have a list of allowed tags. How can I process the string to remove any other tags?
Here’s a simple solution using BeautifulSoup:
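The answer's code is not reproduced here, so the following is only a minimal sketch of the approach, assuming the `bs4` package (Beautiful Soup 4) and its bundled `html.parser` backend. The allow-list `VALID_TAGS` and the function name `sanitize_html` are illustrative names, not something the question specifies.

```python
from bs4 import BeautifulSoup

# Hypothetical allow-list; replace with your own set of permitted tags.
VALID_TAGS = {"strong", "em", "p", "ul", "li", "br"}


def sanitize_html(value):
    """Return `value` with every tag outside VALID_TAGS hidden."""
    # "html.parser" avoids the <html>/<body> wrappers some parsers add.
    soup = BeautifulSoup(value, "html.parser")

    # find_all(True) matches every tag in the document.
    for tag in soup.find_all(True):
        if tag.name not in VALID_TAGS:
            # A hidden tag is not rendered, but its contents still are.
            tag.hidden = True

    return str(soup)


if __name__ == "__main__":
    print(sanitize_html("<p>Hello <em>world</em><script>alert(1)</script></p>"))
```

With this sketch a disallowed tag such as <script> disappears from the output while the text inside it stays behind as plain text. Attributes on allowed tags are left untouched, so they would need separate handling.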
If you want to remove the contents of the invalid tags as well, substitute tag.extract() for tag.hidden.
You might also look into using lxml and Tidy.