I need to parse a website that has a lot of nested <div>s all over. I tried XML::Simple to get a nice tree structure, but the parse fails every time because there seem to be two or three unclosed <p> tags somewhere. I also tried HTML::Parser, but that only lets me define handler functions that are called for the tags I want; it does not give me their nested elements.
Is there any way to get XML::Simple to accept non-valid XML, or to get HTML::Parser to give me a handy tree structure?
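HTML::TreeBuilder builds nice trees and gives you tons of handy methods to traverse them. It is built on top of HTML::Parser and copes with missing closing tags, so the unclosed <p> elements should not be a problem.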
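Here is a minimal sketch of how that could look. The sample markup, the `item` class, and the `outer` id are made up for illustration; your page will have its own structure, so adjust the `look_down` criteria accordingly.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical sample markup: nested <div>s, one of them with an unclosed <p>.
my $html = <<'HTML';
<html><body>
<div id="outer">
  <div class="item"><p>first</div>
  <div class="item"><p>second</p></div>
</div>
</body></html>
HTML

# Parse the whole string into a tree of HTML::Element objects.
my $tree = HTML::TreeBuilder->new_from_content($html);

# In list context, look_down returns every element matching all criteria.
my @items = $tree->look_down(_tag => 'div', class => 'item');

for my $div (@items) {
    # as_trimmed_text returns the text below the element,
    # with leading and trailing whitespace removed.
    print $div->as_trimmed_text, "\n";
}

# Every element knows its parent, so you can walk upwards as well.
my $parent_id = $items[0]->parent->attr('id');
print "parent id: $parent_id\n";

# Collect the texts before freeing the tree.
my @texts = map { $_->as_trimmed_text } @items;

# Trees contain circular references; delete them explicitly when done.
$tree->delete;
```

To parse a real page, fetch the HTML first (for example with LWP) and pass the content to `new_from_content`, or use `new_from_file` for a saved copy.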