I am trying to learn how to get the src of every img element on a page, given its URL. However, the imgs variable in my code is always null. What am I doing wrong?
    static void Main(string[] args)
    {
        HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
        doc.LoadHtml("http://archive.ncsa.illinois.edu/primer.html");
        HtmlAgilityPack.HtmlNodeCollection imgs = doc.DocumentNode.SelectNodes("//img");
        if (imgs != null)
        {
            foreach (HtmlAgilityPack.HtmlNode img in imgs)
            {
                string imgSrc = img.Attributes["src"].Value;
            }
        }
        Console.ReadKey();
    }
You are using HtmlDocument.LoadHtml, which expects HTML source, not a URL. The URL string itself is parsed as the document, so it contains no img elements and SelectNodes returns null.
You could use WebClient to download the HTML first and then pass the downloaded string to LoadHtml.
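A minimal sketch of the WebClient approach, assuming the HtmlAgilityPack package is referenced and the page is reachable. The `//img[@src]` filter and the `Console.WriteLine` call are my additions for illustration, not part of the original code. Note that WebClient is marked obsolete in recent .NET versions, although it still works.

```csharp
using System;
using System.Net;
using HtmlAgilityPack;

class Program
{
    static void Main(string[] args)
    {
        string url = "http://archive.ncsa.illinois.edu/primer.html";

        // Download the page source as a string.
        string html;
        using (var client = new WebClient())
        {
            html = client.DownloadString(url);
        }

        // LoadHtml now receives real HTML rather than a URL.
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Select only img elements that carry a src attribute (assumed filter).
        HtmlNodeCollection imgs = doc.DocumentNode.SelectNodes("//img[@src]");
        if (imgs != null)
        {
            foreach (HtmlNode img in imgs)
            {
                // GetAttributeValue returns the fallback instead of throwing
                // when the attribute is missing.
                string imgSrc = img.GetAttributeValue("src", string.Empty);
                Console.WriteLine(imgSrc);
            }
        }
    }
}
```

The null check stays in place because SelectNodes returns null, not an empty collection, when nothing matches.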
HtmlDocument also has Load overloads that read content from other sources, such as a file path, a Stream, or a TextReader.
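As a rough sketch of the Load route, the response stream can be handed to HtmlDocument.Load directly instead of downloading a string first. This assumes the same URL as the question and uses WebClient.OpenRead to obtain the stream. Printing the results is my addition.

```csharp
using System;
using System.IO;
using System.Net;
using HtmlAgilityPack;

class Program
{
    static void Main(string[] args)
    {
        var doc = new HtmlDocument();

        // Open the response body as a stream and let HtmlDocument parse it.
        using (var client = new WebClient())
        using (Stream stream = client.OpenRead("http://archive.ncsa.illinois.edu/primer.html"))
        {
            doc.Load(stream);
        }

        HtmlNodeCollection imgs = doc.DocumentNode.SelectNodes("//img");
        if (imgs != null)
        {
            foreach (HtmlNode img in imgs)
            {
                // An empty string is returned for img elements without src.
                Console.WriteLine(img.GetAttributeValue("src", string.Empty));
            }
        }
    }
}
```

HtmlAgilityPack also ships an HtmlWeb class whose Load method takes a URL and returns an HtmlDocument, which may be the shorter option if you do not need control over the download.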