What I want is, opening a Link from a Website (from HtmlContent) and get

Question

0

Asked: May 26, 20262026-05-26T17:59:24+00:00 2026-05-26T17:59:24+00:00

What I want is, opening a Link from a Website (from HtmlContent) and get

0

What I want is, opening a Link from a Website (from HtmlContent)
and get the Html of this new opened site..

Example: I have http://www.google.com, now I want to find all Links.
For each Link I want to have the HTMLContent of the new Site.

I do something like this:

foreach (String link in GetLinksFromWebsite(htmlContent))
            {
                using (var client = new WebClient())
                {
                    htmlContent = client.DownloadString("http://" + link);
                }

                foreach (Match treffer in istBildURL)
                {
                    string bildUrl = treffer.Groups[1].Value;
                    bildLinks.Add(bildUrl);
                }
            }




   public static List<String> GetLinksFromWebsite(string htmlSource)
    {
        string linkPattern = "<a href=\"(.*?)\">(.*?)</a>";
        MatchCollection linkMatches = Regex.Matches(htmlSource, linkPattern, RegexOptions.Singleline);
        List<string> linkContents = new List<string>();
        foreach (Match match in linkMatches)
        {
            linkContents.Add(match.Value);
        }
        return linkContents;
    }

The other problem is, that I only get Links, not Linkbuttons (ASP.NET)..
How can I solve the problem?

Report

Leave an answer
Cancel reply

You must login to add an answer.

Need An Account,

1 Answer

Editorial Team · Answer 1 · 2026-05-26T17:59:25+00:00

Steps to follow:

Download Html Agility Pack
Reference the assembly you have downloaded in your project
Throw everything that starts with the word regex or regular expression out from your project and which deals with parsing HTML (read this answer to better understand why). In your case this would be the contents of the GetLinksFromWebsite method.
Replace what you have thrown away with a simple call to the Html Agility Pack parser.

Here’s an example:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        using (var client = new WebClient())
        {
            var htmlSource = client.DownloadString("http://www.stackoverflow.com");
            foreach (var item in GetLinksFromWebsite(htmlSource))
            {
                // TODO: you could easily write a recursive function
                // that will call itself here and retrieve the respective contents
                // of the site ...
                Console.WriteLine(item);
            }
        }
    }

    public static List<String> GetLinksFromWebsite(string htmlSource)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(htmlSource);
        return doc
            .DocumentNode
            .SelectNodes("//a[@href]")
            .Select(node => node.Attributes["href"].Value)
            .ToList();
    }
}

Sign Up

Sign In

Forgot Password

The Archive Base Latest Questions

What I want is, opening a Link from a Website (from HtmlContent) and get

Leave an answerCancel reply

1 Answer

Leave an answer
Cancel reply