I need to do a seemingly simple thing in Python which turned out to be quite complex. What I need to do is:
- Open an HTML file.
- Match all instances of a specific HTML element, for example table.
- For each instance, extract the element as a string, pass that string to an external command which will do some modifications, and finally replace the original element with a new string returned from the external command.
I can’t simply do a re.sub(), because in each case the replacement string is different and based on the original string.
Any suggestions?
Sounds like you want BeautifulSoup. Likely, you’d want to do something like:
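A minimal sketch of that approach, assuming the bs4 package is installed. The helper names run_external and rewrite_elements are made up for illustration, and the external command is assumed to read the element's HTML on stdin and write the modified HTML to stdout:

```python
import subprocess

from bs4 import BeautifulSoup


def run_external(fragment, command):
    """Send one element's HTML to `command` on stdin and return its stdout.

    Hypothetical helper: assumes the external tool works as a stdin/stdout filter.
    """
    result = subprocess.run(
        command, input=fragment, capture_output=True, text=True, check=True
    )
    return result.stdout


def rewrite_elements(html, tag_name, command):
    """Replace every `tag_name` element with whatever the command returns for it."""
    soup = BeautifulSoup(html, "html.parser")
    # find_all() returns a list collected up front, so replacing elements
    # inside the loop does not disturb the iteration.
    for element in soup.find_all(tag_name):
        new_html = run_external(str(element), command)
        # Parse the returned string first; passing the raw string to
        # replace_with() would insert it as escaped text, not as markup.
        replacement = BeautifulSoup(new_html, "html.parser")
        element.replace_with(replacement)
    return str(soup)
```

Calling rewrite_elements(html, "table", ["some-filter"]) would then run the (hypothetical) some-filter command once per table. Each call can return a different string, which is what re.sub() with a fixed replacement could not do. Note that find_all() also returns nested tables, so tables inside tables are sent to the command as well.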
Alternatively, you may be looking for something closer to soup.replace_with.
EDIT: Updated to the eventual solution.