I’m using HtmlAgilityPack to parse and analyze HTML pages, and I need to know the “depth” of each node, meaning its distance from the body node. Example (the “depth” attributes are for illustration only):
<html>
<head></head>
<body depth="0">
<div depth="1">
<ul depth="2">
<li depth="3">
<p depth="4">foo</p>
</li>
<li depth="3">
<p depth="4">bar</p>
</li>
</ul>
</div>
</body>
</html>
I’m trying to avoid the two obvious solutions:
- Scan the HTML tree (DFS, BFS, etc.), calculate the depth of each node, and store the values in a Dictionary or similar.
- Calculate the depth of each node “on demand” by counting node.ParentNode until body is reached.
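For reference, the second option could be sketched roughly as below. This is only a minimal sketch: NodeDepth and FromBody are made-up names for illustration, not part of HtmlAgilityPack, and returning -1 for nodes outside body is an assumption of mine.

```csharp
using System;
using HtmlAgilityPack;

// Hypothetical helper, not part of HtmlAgilityPack.
public static class NodeDepth
{
    // Counts the ParentNode hops from `node` up to the nearest <body> ancestor.
    // Returns 0 for <body> itself and -1 (an assumed convention) if no <body> is found.
    public static int FromBody(HtmlNode node)
    {
        int depth = 0;
        for (HtmlNode current = node; current != null; current = current.ParentNode)
        {
            if (string.Equals(current.Name, "body", StringComparison.OrdinalIgnoreCase))
                return depth;
            depth++;
        }
        return -1;
    }
}
```

Each call walks the whole ancestor chain again, so the cost per lookup grows with the nesting level of the node.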
Is there a way to avoid these by somehow using the already existing data collected by HtmlAgilityPack on Load?
Are you asking if there’s a built-in NodeDepth property or something like that? I’m pretty sure the answer is no, as calculating that for every node parsed by the library would create an overhead that would rarely be warranted. Since counting node depth is pretty easily done with some recursion, I don’t think they’d include that by default. Why do you want to avoid the obvious solutions?
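For what it’s worth, the recursion I have in mind could look something like the sketch below. DepthMap, Build and Visit are names I made up for illustration. It assumes you start from the body node and only care about element nodes, so text and comment nodes are skipped.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

// Hypothetical helper, not part of HtmlAgilityPack.
public static class DepthMap
{
    // Walks the subtree under `root` once and records the depth of every element,
    // with `root` itself at depth 0.
    public static Dictionary<HtmlNode, int> Build(HtmlNode root)
    {
        var depths = new Dictionary<HtmlNode, int>();
        Visit(root, 0, depths);
        return depths;
    }

    private static void Visit(HtmlNode node, int depth, Dictionary<HtmlNode, int> depths)
    {
        depths[node] = depth;
        foreach (HtmlNode child in node.ChildNodes)
        {
            // Skip text and comment nodes (an assumption about what should be counted).
            if (child.NodeType == HtmlNodeType.Element)
                Visit(child, depth + 1, depths);
        }
    }
}
```

You would call DepthMap.Build(doc.DocumentNode.SelectSingleNode("//body")) once after loading and then look up nodes in the returned dictionary. For extremely deeply nested pages, an explicit stack instead of recursion may be the safer choice.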