I am currently trying to parse an HTML document to retrieve all of the

Asked: June 6, 2026 (2026-06-06T12:48:17+00:00)

I am trying to parse an HTML document to retrieve all of the footnotes inside it; the document contains dozens of them. I can't figure out which expressions to use to extract the content I want. The problem is that the class names (e.g. "calibre34") are randomized in every document. The only reliable way to locate a footnote is to search for "hide": the footnote text always comes right after it and ends at a closing </td> tag. Below is an example of one of the footnotes in the HTML document; all I want is the text. Any ideas? Thanks!

<td class="calibre33">1.<span><a class="x-xref" href="javascript:void(0);">
[hide]</a></span></td>
<td class="calibre34">
Among the other factors on which the premium would be based are the
average size of the losses experienced, a margin for contingencies,
a loading to cover the insurer's expenses, a margin for profit or
addition to the insurer's surplus, and perhaps the investment
earnings the insurer could realize from the time the premiums are
collected until the losses must be paid.</td>

1 Answer

Editorial Team · Answer 1 · 2026-06-06T12:48:21+00:00

Use HtmlAgilityPack to load the HTML document, then extract the footnotes with this XPath:

//td[contains(., '[hide]')]/following-sibling::td[1]

It first selects every td node that contains [hide], then moves to the next sibling of each one, which is the following td. Once you have that collection of nodes, you can extract their inner text (in C#, through each node's InnerText property in HtmlAgilityPack).
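If you are not on .NET, the same "find the cell containing [hide], then take the next td" logic can be sketched without XPath. The example below is a minimal illustration in Python using only the standard library's html.parser; it is a swapped-in technique, not the HtmlAgilityPack approach described above. The names FootnoteExtractor and extract_footnotes are made up for this sketch, and it assumes the footnote cells are not nested inside other td elements.

```python
from html.parser import HTMLParser


class FootnoteExtractor(HTMLParser):
    """Collect the text of each <td> that directly follows a <td> containing '[hide]'.

    Hypothetical helper for illustration; assumes <td> elements are not nested.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.footnotes = []          # finished footnote strings
        self._in_td = False          # are we currently inside a <td>?
        self._capture = False        # is the current <td> a footnote body?
        self._next_is_note = False   # did the previous <td> contain the marker?
        self._buf = []               # text chunks seen in the current <td>

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._in_td = True
            self._capture = self._next_is_note
            self._next_is_note = False
            self._buf = []

    def handle_data(self, data):
        if self._in_td:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag != "td" or not self._in_td:
            return
        # Collapse line breaks and runs of spaces into single spaces.
        text = " ".join("".join(self._buf).split())
        if self._capture:
            self.footnotes.append(text)
        elif "[hide]" in text:
            self._next_is_note = True
        self._in_td = False
        self._capture = False


def extract_footnotes(html):
    """Return the footnote texts found in an HTML string."""
    parser = FootnoteExtractor()
    parser.feed(html)
    parser.close()
    return parser.footnotes


# The footnote from the question, used as sample input.
sample = """<td class="calibre33">1.<span><a class="x-xref" href="javascript:void(0);">
[hide]</a></span></td>
<td class="calibre34">
Among the other factors on which the premium would be based are the
average size of the losses experienced, a margin for contingencies,
a loading to cover the insurer's expenses, a margin for profit or
addition to the insurer's surplus, and perhaps the investment
earnings the insurer could realize from the time the premiums are
collected until the losses must be paid.</td>"""

for note in extract_footnotes(sample):
    print(note)
```

Because the match is on the [hide] marker rather than on class names, the randomized "calibre" classes do not matter. Whitespace is collapsed so each footnote comes back as a single line of text.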
