I want to extract a value from the source of a webpage. I've narrowed it down to the part that is relevant here:
<div class="sideInfoPlayer">
<a class="signLink" href="spieler.php?uid=12345" title="Profile">
<span class="wrap">Wagamama</span>
</a>
The trick is that I want to show the word Wagamama in a message box, but that word is different on every page of the site, and the element has no ID. So I was thinking of searching for the element with the class "sideInfoPlayer" first and then finding the element with the class "wrap" inside that block.
I have written the code below to find the first one, but I do not know how to tackle the second one and then get the desired value.
HtmlElementCollection col = webBrowser1.Document.GetElementsByTagName("div");
foreach (HtmlElement element in col)
{
    string cls = element.GetAttribute("className");
    if (String.IsNullOrEmpty(cls) || !cls.Equals("sideInfoPlayer"))
        continue;
}
I hope you can help me get unstuck on this one.
You have better options. Look at http://htmlagilitypack.codeplex.com/
See also the related question "How can i parse html string".
First you'll need to add a reference to the HtmlAgilityPack library, either by downloading it manually or through the NuGet package manager.
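With the reference in place, the lookup could be sketched roughly as follows. This is an untested sketch, not a definitive implementation: the class name PlayerNameExtractor and the method ExtractPlayerName are made up for illustration, and the XPath uses a descendant step (//span) because in the markup from the question the span is nested inside the a element rather than sitting directly under the div.

```csharp
using HtmlAgilityPack;

// Hypothetical helper; the class and method names are illustrative only.
public static class PlayerNameExtractor
{
    // Returns the text of the first span.wrap inside div.sideInfoPlayer,
    // or null when the page contains no such element.
    public static string ExtractPlayerName(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // "//span" matches descendants at any depth, so the intermediate
        // <a class="signLink"> element does not get in the way.
        HtmlNode span = doc.DocumentNode.SelectSingleNode(
            "//div[@class='sideInfoPlayer']//span[@class='wrap']");

        // SelectSingleNode returns null when nothing matches.
        return span == null ? null : span.InnerText.Trim();
    }
}
```

From the form it could then be called as MessageBox.Show(PlayerNameExtractor.ExtractPlayerName(webBrowser1.DocumentText)). If you put this code directly into the form's file, note that System.Windows.Forms and HtmlAgilityPack both define a type called HtmlDocument, so you would have to write HtmlAgilityPack.HtmlDocument in full there.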
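//div[@class='sideInfoPlayer']//span[@class='wrap'] is called an XPath expression, and this one literally means "get me all span elements with class=wrap that are descendants of a div element with class=sideInfoPlayer". Note the double slash before span: in your markup the span sits inside the a element, so it is not a direct child of the div, and a single slash would match nothing. I didn't test it, but it should work.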
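If adding a library is not an option, the loop from the question can probably be finished with the WebBrowser DOM alone, by searching for span elements inside the matching div. Again, this is only an untested sketch: the class name WebBrowserPlayerName and the method Find are hypothetical, and it assumes each element's class attribute holds exactly one class name, as in the markup shown.

```csharp
using System;
using System.Windows.Forms;

// Hypothetical helper; the class and method names are illustrative only.
public static class WebBrowserPlayerName
{
    // Returns the text of the first span.wrap found inside a
    // div.sideInfoPlayer, or null when there is no such element.
    public static string Find(HtmlDocument document)
    {
        if (document == null)
            return null;

        foreach (HtmlElement div in document.GetElementsByTagName("div"))
        {
            // Skip every div that is not the player block.
            if (!string.Equals(div.GetAttribute("className"), "sideInfoPlayer",
                               StringComparison.Ordinal))
                continue;

            // HtmlElement.GetElementsByTagName searches all descendants,
            // so the span is found even though it is nested inside <a>.
            foreach (HtmlElement span in div.GetElementsByTagName("span"))
            {
                if (string.Equals(span.GetAttribute("className"), "wrap",
                                  StringComparison.Ordinal))
                    return span.InnerText;
            }
        }

        return null;
    }
}
```

You would call it once the page has finished loading, for example from the DocumentCompleted handler: MessageBox.Show(WebBrowserPlayerName.Find(webBrowser1.Document)).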