I’m developing a program in C# and need some help. I’m trying to build an array or list of the items displayed on a certain website. Specifically, I want to read each anchor’s text and its href. For example, given this HTML:
<div class="menu-1">
<div class="items">
<div class="minor">
<ul>
<li class="menu-item">
<a class="menu-link" title="Item-1" id="menu-item-1"
href="/?item=1">Item 1</a>
</li>
<li class="menu-item">
<a class="menu-link" title="Item-1" id="menu-item-2"
href="/?item=2">Item 2</a>
</li>
<li class="menu-item">
<a class="menu-link" title="Item-1" id="menu-item-3"
href="/?item=3">Item 3</a>
</li>
<li class="menu-item">
<a class="menu-link" title="Item-1" id="menu-item-4"
href="/?item=4">Item 4</a>
</li>
<li class="menu-item">
<a class="menu-link" title="Item-1" id="menu-item-5"
href="/?item=5">Item 5</a>
</li>
</ul>
</div>
</div>
</div>
So from that HTML I would like to read this:
string[,] array = {{"Item 1", "/?item=1"}, {"Item 2", "/?item=2"},
{"Item 3", "/?item=3"}, {"Item 4", "/?item=4"}, {"Item 5", "/?item=5"}};
The HTML is an example I wrote; the actual site does not look like that.
As others have said, HtmlAgilityPack is the best choice for HTML parsing. Also be sure to download HAP Explorer from the HtmlAgilityPack site and use it to test your selects. This SelectNodes call will get all anchors that have an id starting with menu-item:
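A minimal, self-contained sketch of that kind of select, in case it helps to see the whole flow. It assumes the HtmlAgilityPack NuGet package is referenced and uses a trimmed copy of the example markup from the question. The class name `MenuLinkReader` and the `items` list are illustrative names only, and the XPath would need adjusting to whatever the real site uses.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // assumption: HtmlAgilityPack NuGet package is installed

class MenuLinkReader // hypothetical name, for illustration only
{
    static void Main()
    {
        // Assumption: a shortened copy of the question's example markup.
        // For a live page, new HtmlWeb().Load(url) returns an HtmlDocument instead.
        const string html = @"
<ul>
  <li class=""menu-item""><a class=""menu-link"" id=""menu-item-1"" href=""/?item=1"">Item 1</a></li>
  <li class=""menu-item""><a class=""menu-link"" id=""menu-item-2"" href=""/?item=2"">Item 2</a></li>
</ul>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // XPath: every <a> element whose id attribute begins with "menu-item".
        HtmlNodeCollection anchors =
            doc.DocumentNode.SelectNodes("//a[starts-with(@id, 'menu-item')]");

        var items = new List<KeyValuePair<string, string>>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        if (anchors != null)
        {
            foreach (HtmlNode a in anchors)
            {
                // Anchor text with HTML entities decoded and whitespace trimmed.
                string text = HtmlEntity.DeEntitize(a.InnerText).Trim();
                // Second argument is the fallback when the attribute is missing.
                string href = a.GetAttributeValue("href", string.Empty);
                items.Add(new KeyValuePair<string, string>(text, href));
            }
        }

        foreach (var item in items)
            Console.WriteLine(item.Key + " -> " + item.Value);
    }
}
```

A list of pairs is used here rather than a `string[,]` because the number of matching anchors is not known in advance; it can be copied into a two-dimensional array afterwards if that shape is really needed.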