I have html as follows:
html = '<html><table>this is a table<p>some text</p></table><p>text outside of table</p></html>'
I want to move to the end of the table and then find the next tag. I tried using findNext, but if there is a tag inside the table, it returns that tag instead of the next tag outside of the table.
soup = BeautifulSoup(html)
table = soup.find('table')
test = table.findNext()
This code gives me:
<p>some text</p>
However, I want it to give me:
<p>text outside of table</p>
The main problem is that I can’t always specify that a tag is a ‘p’ tag. I could have html like this:
html = '<html><table>this is a table<td>some text</td></table><table>text outside of table</table></html>'
So I can’t rely on tag names to get to the next one. For the HTML above, I want to get:
<table>text outside of table</table>
I realize that I could just use findNext twice, but often there are hundreds of tags inside each table and so that wouldn’t work.
would
work for you?
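For reference, here is a minimal sketch of one way to skip everything inside the table, assuming the newer bs4 package (where findNextSibling is spelled find_next_sibling) and the built-in html.parser. The helper name next_tag_after is made up for illustration and is not part of BeautifulSoup. It asks for the next sibling tag rather than the next tag in document order, and climbs to the parent when the tag is the last child of its container.

```python
from bs4 import BeautifulSoup


def next_tag_after(tag):
    """Return the first tag that starts after `tag` closes, or None.

    Hypothetical helper: find_next_sibling() skips the tag's own
    descendants; if there is no later sibling, retry from the parent.
    """
    node = tag
    while node is not None:
        sibling = node.find_next_sibling()
        if sibling is not None:
            return sibling
        node = node.parent
    return None


html = ('<html><table>this is a table<p>some text</p></table>'
        '<p>text outside of table</p></html>')
soup = BeautifulSoup(html, 'html.parser')
table = soup.find('table')

inside = table.find_next()        # first tag in document order: the inner <p>
outside = next_tag_after(table)   # first tag after </table>
```

Because the lookup does not depend on the tag name, the same call on the second example returns the second table, no matter how many tags the first table contains.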