I am attempting to use the Agility pack to parse certain bits of info from various pages. I am kind of worried that using this might be overkill for what I need, if that is case feel free to let me know. Anyway, I am attempting to parse a page from motley fool to get the name of a company based on the ticker. I will be parsing several pages to get stock info in a similar way.
The HTML that I want to parse looks like:
<h1 class="subHead">
Microsoft Corp <span>(NASDAQ:MSFT)</span>
</h1>
Also, the page I want to parse is: http://caps.fool.com/Ticker/MSFT.aspx
So, I guess my question is how do I simply get the Microsoft Corp from the html and should I even be using the agility pack to do things like this?
Edit: Current code
public String getStockName(String ticker)
{
String text ="";
HtmlAgilityPack.HtmlWeb web = new HtmlAgilityPack.HtmlWeb();
HtmlAgilityPack.HtmlDocument doc = web.Load("http://caps.fool.com/Ticker/" + ticker + ".aspx");
var node = doc.DocumentNode.SelectSingleNode("/h1[@class='subHead']");
text = node.FirstChild.InnerText.Trim();
return text;
}
This would give you a list of all stock names, for your sample Html just of Microsoft:
Edit based on updated question – this should work for you: