I have a bunch of Word docs that were “saved as” filtered HTML. The HTML files contain extraneous OLE-link anchors that I need to delete. For example, I want to replace:
<h3><a name="OLE_LINK25">My Section Title</a></h3>
with
<h3>My Section Title</h3>
Any suggestions for how I might do this in an automated way?
Jsoup could help here: select every anchor tag whose name attribute starts with “OLE” and unwrap it, which drops the tag itself but keeps the text inside it.
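A minimal sketch of that idea, assuming jsoup is on the classpath. The class name OleLinkCleaner and the helper stripOleAnchors are made up for illustration. It unwraps the anchors rather than removing them, so the heading text survives:

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class OleLinkCleaner {

    // Hypothetical helper: returns the body HTML with OLE anchors unwrapped.
    static String stripOleAnchors(String html) {
        Document doc = Jsoup.parse(html);
        // Keep the output compact instead of jsoup's pretty-printed layout.
        doc.outputSettings().prettyPrint(false);
        // a[name^=OLE] matches <a> elements whose name attribute starts with "OLE".
        // unwrap() removes each matched tag but leaves its children in place.
        doc.select("a[name^=OLE]").unwrap();
        return doc.body().html();
    }

    public static void main(String[] args) {
        String input = "<h3><a name=\"OLE_LINK25\">My Section Title</a></h3>";
        System.out.println(stripOleAnchors(input));
    }
}
```

For a whole folder of files, the same select-and-unwrap step should work on a Document loaded with Jsoup.parse(File, String) and written back out with doc.outerHtml(). Note that jsoup normalizes the markup it parses, so the saved files may differ from the originals in small ways beyond the removed anchors.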