I am new to parallel programming; in fact, this is the first time I am trying it. I am currently working on a project in .NET 4 and would like to have 4 or 5 parallel executions.
I see some options: Task.Factory.StartNew, Parallel.For, Parallel.ForEach, etc.
What I am going to do is post to a web-site and fetch the responses for about 200 URLs.
When I used Parallel.ForEach I couldn't find a way to control the number of threads; the application ended up using 130+ threads and the website became unresponsive 🙂
I am interested in using Task.Factory.StartNew within a for loop and dividing the URLs into 4 or 5 tasks.
List<Task> tasks = new List<Task>();
for (int i = 0; i < 5; i++)
{
    // Let's say this returns something like slice i of 5 of the list of URLs.
    List<string> urlsForTask = GetUrlsForTask(i, 5);
    var task = Task.Factory.StartNew(() =>
    {
        List<PageSummary> result = GetSummary(urlsForTask);
        Summary.AddRange(result); // Summary is a public variable shared by all tasks
    });
    tasks.Add(task);
}
I believe these Tasks more or less boil down to threads. So if I make Summary a List<PageSummary>, will it be thread safe? (I understand there are issues when multiple threads access a shared variable.)
Is this where we should use ConcurrentQueue<T>?
Do you know of a good resource for learning about accessing and updating a shared variable from multiple tasks?
What do you think is the best approach for this type of task?
Parallel.ForEach has overloads that take a ParallelOptions instance. The MaxDegreeOfParallelism property of that class is what you need to use.
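As a rough sketch of how that could look for the scenario in the question: the PageSummary class, the single-URL GetSummary stub (which only sleeps instead of making an HTTP request), and the example.com URLs below are placeholders made up for illustration, not the asker's real code. Results are collected in a ConcurrentBag<T>, since a plain List<T> is not safe for concurrent writes.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Placeholder result type; the real PageSummary is not shown in the question.
class PageSummary
{
    public string Url;
    public int Length;
}

static class Program
{
    // Hypothetical stand-in for the real post/fetch; it only simulates latency.
    static PageSummary GetSummary(string url)
    {
        Thread.Sleep(10);
        return new PageSummary { Url = url, Length = url.Length };
    }

    static void Main()
    {
        // Placeholder URLs; the real list would come from the application.
        List<string> urls = Enumerable.Range(1, 200)
            .Select(i => "http://example.com/page/" + i)
            .ToList();

        // Limit the loop to at most 5 concurrent iterations.
        var options = new ParallelOptions { MaxDegreeOfParallelism = 5 };

        // ConcurrentBag<T> accepts Add calls from several threads without extra locking.
        var summary = new ConcurrentBag<PageSummary>();

        Parallel.ForEach(urls, options, url =>
        {
            summary.Add(GetSummary(url));
        });

        Console.WriteLine("Fetched " + summary.Count + " summaries");
    }
}
```

Note that ConcurrentBag<T> does not preserve insertion order. If the order of the summaries matters, one alternative is to keep the List<PageSummary> and wrap each AddRange or Add call in a lock on a shared object.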