I have an old website originally created in MS FrontPage that I’m trying to defrontpagify. I’ve written a BeautifulSoup script that does most of it. The only thing left is to remove empty tables, e.g. tables with no text content or data in any of their td tags.
The problem I’m stuck on is that what I’ve tried so far removes the table if at least one of its td tags contains no data, even if others do. That removes all the tables in the entire document, including ones with data I want to preserve.
tags = soup.findAll('table',text=None,recursive=True)
[tag.extract() for tag in tags]
Any suggestions on how to remove only the tables in which none of the td tags contain any data? (I don’t care if they contain img or empty anchor tags, as long as there’s no text.)
Use the .text property. It retrieves all text content (recursively) within that element. Example:
Outputs:
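A minimal sketch of that approach, assuming BeautifulSoup 4 (the `bs4` package) and Python's built-in `html.parser`. The sample markup, the table ids, and the helper name `remove_empty_tables` are made up for illustration and are not from the original post. It uses `get_text()`, which returns the same recursive text as `.text`, and strips whitespace so that tables holding only spaces or `&nbsp;` also count as empty:

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup: one table with only an image and an empty
# anchor, one with a mix of empty and non-empty cells, one with only &nbsp;.
html = """
<table id="empty"><tr><td></td><td><img src="x.gif"><a name="top"></a></td></tr></table>
<table id="partial"><tr><td></td><td>Keep me</td></tr></table>
<table id="blank"><tr><td>&nbsp; </td></tr></table>
"""


def remove_empty_tables(soup):
    """Extract every table whose combined text is empty or whitespace only.

    Returns the number of tables removed.
    """
    removed = 0
    for table in soup.find_all("table"):
        # get_text() joins the text of all descendants, so a single
        # non-empty td anywhere inside is enough to keep the table.
        if not table.get_text().strip():
            table.extract()
            removed += 1
    return removed


soup = BeautifulSoup(html, "html.parser")
removed = remove_empty_tables(soup)
remaining = [t.get("id") for t in soup.find_all("table")]
print(removed, remaining)  # 2 ['partial']
```

Because the check runs on the whole table rather than on each td, a table survives as soon as any one of its cells contains text, which is the behaviour the question asks for. Tables that hold only img or empty anchor tags have no text and are removed.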