I have a C# program that currently downloads data from several sites synchronously and then does some work on the downloaded data. I am trying to change it so that the downloads run asynchronously and the processing happens once they have finished, but I am having trouble with the sequencing. Below is a snapshot of the code I am using:
class Program
{
    static void Main(string[] args)
    {
        Console.WriteLine("Started URL downloader");
        UrlDownloader d = new UrlDownloader();
        d.Process();
        Console.WriteLine("Finished URL downloader");
        Console.ReadLine();
    }
}

class UrlDownloader
{
    public void Process()
    {
        List<string> urls = new List<string>() {
            "http://www.stackoverflow.com",
            "http://www.microsoft.com",
            "http://www.apple.com",
            "http://www.google.com"
        };
        foreach (var url in urls)
        {
            WebClient Wc = new WebClient();
            Wc.OpenReadCompleted += new OpenReadCompletedEventHandler(DownloadDataAsync);
            Uri varUri = new Uri(url);
            Wc.OpenReadAsync(varUri, url);
        }
    }

    void DownloadDataAsync(object sender, OpenReadCompletedEventArgs e)
    {
        StreamReader k = new StreamReader(e.Result);
        string temp = k.ReadToEnd();
        PrintWebsiteTitle(temp, e.UserState as string);
    }

    void PrintWebsiteTitle(string temp, string source)
    {
        Regex reg = new Regex(@"<title[^>]*>(.*)</title[^>]*>");
        string title = reg.Match(temp).Groups[1].Value;
        Console.WriteLine(new string('*', 10));
        Console.WriteLine("Source: {0}, Title: {1}", source, title);
        Console.WriteLine(new string('*', 10));
    }
}
Essentially, my problem is this. My output from above is:
Started URL downloader
Finished URL downloader
"Results of d.Process()"
What I want to do is complete the d.Process() method and then return to the “Main” method in my Program class. So, the output I am looking for is:
Started URL downloader
"Results of d.Process()"
Finished URL downloader
My d.Process() method runs asynchronously, but I can’t figure out how to wait for all of my processing to complete before returning to my Main method. Any ideas on how to do this in C# 4.0? I am not sure how I’d go about ‘telling’ my Process() method to wait until all its asynchronous activity is complete before returning to the Main method.
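If you are on .NET 4.0 or later, you can use the Task Parallel Library (TPL): start one task per download and block on Task.WaitAll until every task has completed.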
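As a rough illustration of the TPL approach, here is a minimal sketch rather than a definitive implementation. It assumes blocking the calling thread is acceptable, as it is in a console app. Several things here are my own assumptions and not part of the question's code: Process takes the URL list as a parameter and returns the formatted lines instead of printing them, the constructor that accepts a Func<string, string> is a hypothetical seam for swapping out the download step, and the title regex is loosened to be case-insensitive and to match across line breaks.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Text.RegularExpressions;
using System.Threading.Tasks;

class UrlDownloader
{
    // Hypothetical seam (not in the original code): how a URL becomes HTML.
    private readonly Func<string, string> _download;

    // Default: download with WebClient, as in the question.
    public UrlDownloader() : this(DownloadWithWebClient) { }

    public UrlDownloader(Func<string, string> download)
    {
        _download = download;
    }

    // Starts one task per URL, blocks until all of them have finished,
    // then returns one formatted line per URL in the original order.
    public IList<string> Process(IEnumerable<string> urls)
    {
        Task<string>[] tasks = urls
            .Select(url => Task.Factory.StartNew<string>(
                () => string.Format("Source: {0}, Title: {1}",
                                    url, ExtractTitle(_download(url)))))
            .ToArray();

        // Blocks the caller; throws AggregateException if any task failed.
        Task.WaitAll(tasks);

        return tasks.Select(t => t.Result).ToList();
    }

    // Returns the trimmed contents of the first <title> element, or "" if none.
    public static string ExtractTitle(string html)
    {
        Match m = Regex.Match(html, @"<title[^>]*>(.*?)</title\s*>",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);
        return m.Success ? m.Groups[1].Value.Trim() : string.Empty;
    }

    private static string DownloadWithWebClient(string url)
    {
        using (var wc = new WebClient())
        {
            return wc.DownloadString(url);
        }
    }
}
```

With this shape, Main would call new UrlDownloader().Process(urls) between the two Console.WriteLine calls and print each returned line, so "Finished URL downloader" appears only after every download has completed. Printing from the main thread after Task.WaitAll also keeps the output of different sites from interleaving. A failed download surfaces as an AggregateException from Task.WaitAll, which the caller may want to catch.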
I would also use HtmlAgilityPack to parse the page instead of a regex.