I was wondering what the easiest way is to wrap an element with another element using lxml and Python. For example, if I have an HTML snippet:
<h1>The cool title</h1>
<p>Something Neat</p>
<table>
<tr>
<td>aaa</td>
<td>bbb</td>
</tr>
</table>
<p>The end of the snippet</p>
And I want to wrap the table element with a section element like this:
<h1>The cool title</h1>
<p>Something Neat</p>
<section>
<table>
<tr>
<td>aaa</td>
<td>bbb</td>
</tr>
</table>
</section>
<p>The end of the snippet</p>
Another thing I would like to do is search the XML document for h1s with a certain attribute and then wrap each one, together with all of the elements up to the next h1 tag, in an element. For example:
<h1 class='neat'>Subject 1</h1>
<p>Here is a bunch of boring text</p>
<h2>Minor Heading</h2>
<p>Here is some more</p>
<h1 class='neat'>Subject 2</h1>
<p>And Even More</p>
Converted to:
<section>
<h1 class='neat'>Subject 1</h1>
<p>Here is a bunch of boring text</p>
<h2>Minor Heading</h2>
<p>Here is some more</p>
</section>
<section>
<h1 class='neat'>Subject 2</h1>
<p>And Even More</p>
</section>
Thanks for all the help,
Chris
lxml is great for parsing well-formed XML, but it's not so good if you've got non-XHTML HTML. If that's the case, then go for BeautifulSoup, as suggested by systemizer.
With lxml, this is a fairly easy way to insert a section around all tables in the document:
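Here is a minimal sketch of that idea rather than a definitive implementation. It assumes the snippet from the question has been placed inside a single `<body>` root so that it parses as XML; the helper name `wrap_tables` and its `wrapper_tag` parameter are made up for illustration. Appending an element that already lives in the tree moves it, so the table ends up inside the new section.

```python
from lxml import etree

# The question's snippet, wrapped in <body> (an assumption) so it has one root.
SNIPPET = """<body>
<h1>The cool title</h1>
<p>Something Neat</p>
<table>
<tr>
<td>aaa</td>
<td>bbb</td>
</tr>
</table>
<p>The end of the snippet</p>
</body>"""


def wrap_tables(root, wrapper_tag="section"):
    """Wrap every <table> below root in a new wrapper element (hypothetical helper)."""
    # Build the list first, because the loop modifies the tree it iterates over.
    # Note: this would fail if root itself were a <table>, since it has no parent.
    for table in list(root.iter("table")):
        wrapper = etree.Element(wrapper_tag)
        table.addprevious(wrapper)   # put the wrapper right before the table
        wrapper.tail = table.tail    # keep the text after the table outside the wrapper
        table.tail = None
        wrapper.append(table)        # append() moves the table into the wrapper
    return root


root = etree.fromstring(SNIPPET)
wrap_tables(root)
print(etree.tostring(root, encoding="unicode"))
```

The tail handling matters because lxml stores the text that follows an element on the element itself, so it would otherwise travel into the section along with the table.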
You could do something similar to wrap the headings, but you would need to iterate through all the elements you want to wrap and move them to the section. element.index(child) and list slices might help here.
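A rough sketch of the heading case, using `index()` and a slice as suggested above. Again this assumes the snippet sits inside a `<body>` root. The names `wrap_heading_groups`, `class_name` and `wrapper_tag` are invented for this example, the attribute test is an exact match on `class`, and a group is assumed to end at the next h1 of any kind.

```python
from lxml import etree

# The question's second snippet, wrapped in <body> (an assumption) so it has one root.
SNIPPET = """<body>
<h1 class="neat">Subject 1</h1>
<p>Here is a bunch of boring text</p>
<h2>Minor Heading</h2>
<p>Here is some more</p>
<h1 class="neat">Subject 2</h1>
<p>And Even More</p>
</body>"""


def wrap_heading_groups(root, class_name="neat", wrapper_tag="section"):
    """Wrap each matching <h1> and its following siblings, up to the next <h1>."""
    # The XPath variable $cls is filled in from the keyword argument.
    for h1 in root.xpath(".//h1[@class=$cls]", cls=class_name):
        parent = h1.getparent()
        if parent is None:
            continue  # an <h1> that is the root element cannot be wrapped this way
        start = parent.index(h1)
        end = start + 1
        # Walk forward until the next <h1> or the end of the parent's children.
        while end < len(parent) and parent[end].tag != "h1":
            end += 1
        group = parent[start:end]        # slicing returns a plain list of elements
        wrapper = etree.Element(wrapper_tag)
        wrapper.tail = group[-1].tail    # keep trailing text outside the wrapper
        group[-1].tail = None
        parent.insert(start, wrapper)    # the wrapper takes the heading's position
        wrapper.extend(group)            # extend() moves the elements into the wrapper
    return root


doc = etree.fromstring(SNIPPET)
wrap_heading_groups(doc)
print(etree.tostring(doc, encoding="unicode"))
```

If the headings carry more than one class, the exact-match test would need to be replaced with something looser, such as an XPath `contains()` check.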