The Archive Base

Editorial Team
Asked: June 13, 2026

I have a collection of strings which I need to perform two operations on.


The first of these can safely be processed independently in any order (yay), but then the output must be processed sequentially (boo) in the original order.

The following PLINQ query gets me most of the way there:

myStrings.AsParallel().AsOrdered()
         .Select( str => Operation1(str) )
         .AsSequential()
         .Select( str => Operation2(str) );
// imagine Operation2() maintains some sort of state and must take the outputs from Operation1 in the original order

The problem is that, because of the AsOrdered(), Operation1 gets executed on every string first, then the result elements are sorted back into their original order, and only then does Operation2 start executing.

Ideally, as soon as the first string (i.e. myStrings[0], not the first one returned) is returned by an Operation1 call, I’d like Operation2 to begin its work.

So this is my attempt to solve the problem generically:

using System;
using System.Collections.Generic;
using System.Linq;

public static class ParallelHelper
{
    public static IEnumerable<U> SelectAsOrdered<T, U>(this ParallelQuery<T> query, Func<T, U> func)
    {
        // results that finished before their turn, keyed by source index
        var completedTasks = new Dictionary<int, U>();
        // tag each input with its index so the order can be restored later
        var queryWithIndexes = query.Select((x, y) => new { Input = x, Index = y })
                                    .AsParallel()
                                    .Select(t => new { Value = func(t.Input), Index = t.Index })
                                    .WithMergeOptions(ParallelMergeOptions.NotBuffered);

        int i = 0; // index of the next result that may be yielded
        foreach (var task in queryWithIndexes)
        {
            if (i==task.Index)
            {
                Console.WriteLine("immediately yielding task: {0}", i);
                i++;
                yield return task.Value;

                // drain any buffered results that are now next in line
                U previouslyCompletedTask;
                while (completedTasks.TryGetValue(i, out previouslyCompletedTask))
                {
                    completedTasks.Remove(i);
                    Console.WriteLine("delayed yielding task: {0}", i);
                    yield return previouslyCompletedTask;
                    i++;
                }
            }
            else
            {
                // arrived early: park it until its turn comes
                completedTasks.Add(task.Index, task.Value);
            }
        }
        yield break;
    }
}

Then I can re-write my original code block as:

myStrings.AsParallel()
         .SelectAsOrdered( str => Operation1(str) )
         .Select(str => Operation2(str));

and Operation2 kicks off as soon as myStrings[0] comes out from Operation1.
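
The buffering idea inside SelectAsOrdered can be looked at separately from PLINQ. The following is only a minimal sketch, assuming results arrive as (index, value) pairs in arbitrary order; ReorderSketch and InIndexOrder are names made up for illustration and are not part of the .NET Framework:

```csharp
using System;
using System.Collections.Generic;

public static class ReorderSketch
{
    // Yields values in index order (0, 1, 2, ...) from a stream of
    // (index, value) pairs that may arrive in any order. Early arrivals
    // are parked in a dictionary until their turn comes.
    public static IEnumerable<T> InIndexOrder<T>(IEnumerable<KeyValuePair<int, T>> arrivals)
    {
        var parked = new Dictionary<int, T>();
        int next = 0; // index of the next value allowed out

        foreach (var item in arrivals)
        {
            parked[item.Key] = item.Value;

            // Drain every consecutive value that is now available.
            while (parked.TryGetValue(next, out T ready))
            {
                parked.Remove(next);
                next++;
                yield return ready;
            }
        }
    }
}
```

Parking every arrival first and then draining keeps the loop to a single code path, at the cost of one extra dictionary write for items that arrive exactly on time.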

What I’d like to know is:

  1. This is a fairly common problem/pattern within parallelisation. Have I missed something out of the box in the .NET Framework that does this? Or is there a simpler way?
  2. While the above extension method seems to do the job, how could it be improved? Does anything in the code look like it’s a bad idea?

Thanks!

Andy

Just in case you’re interested:

  • Without the call to .WithMergeOptions(ParallelMergeOptions.NotBuffered) Operation2 doesn’t begin its work until all Operation1 calls have been started (which is better than the original code which waited until they were all completed).

  • The real life problem:
    Operation1 is searching for legal citations and references within large bodies of text (e.g. “children act 1989”).
    These references are usually independent, but occasionally a transcript will contain something like “section 6 of the previously mentioned act”.
    Operation2 relies on captures from Operation1 to pick up these partial references.


1 Answer

  1. Editorial Team added an answer on June 13, 2026 at 9:03 am

    If you need speed, you can parallelize the whole process (loading, preparing, processing and aggregating the data). For that, I think a producer/consumer pattern is the better choice.

    If you want to stay with LINQ, however, you cannot easily generate (load) the data in parallel as part of a fully parallel workflow, although you can prepare, process and summarize it in parallel.

    On the other hand, I think it is a mistake (although it is possible) to use LINQ as “parallel(A) + sequential(B)”. Your process, as I understand it, is

    B = f(A)
    

    so B must wait for A.

    Why not simply do “parallel(A/B)”?

    You can write a helper (extension method), but I don’t think it is useful in general.

    In your real case, simply use a Semaphore to prevent premature access to an “Article ID”.

    Complete code that prepares, processes and summarizes in parallel (without the generation step) is:

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text.RegularExpressions;
    using System.Threading;

    class Text {
        public static Regex rx = new Regex(@" (PREVID|ACTID\=([0-9]+)) ");

        private Text prv; // previous article
        private string ot; // original text
        private int id; // act id on text
        // starts closed; opened by ProcessText once id is known
        private Semaphore isComputed = new Semaphore(0, 1);

        // blocks until ProcessText has run for this article
        public int ActID {
            get {
                isComputed.WaitOne();
                int _id = id;
                isComputed.Release(); // reopen for the next reader
                return _id;
            }
        }

        // returns true when the article carries its own ACTID
        public bool ProcessText() {
            var mx = rx.Match(ot);
            var prev = mx.Groups[1].Value == "PREVID";
            if(prev)
                id = prv == null ? 0 : prv.ActID; // waits for the previous article
            else
                if(!int.TryParse(mx.Groups[2].Value, out id))
                    throw new Exception(string.Format(@"Incorrect article id ""{0}""", mx.Groups[0].Value));
            isComputed.Release();
            return !prev;
        }

        public Text(string original_text, Text previous) {
            prv = previous;
            ot = original_text;
        }

    }

    class Program {
        public static void Main(String[] args) {

            // same random stream (for debugging)
            var rnd = new Random(1);

            var noise = @"These references are usually independent, but occasionally";

            // some noise text
            var bit = new Func<string>(() =>
                noise.Substring(0, rnd.Next(noise.Length)));

            // random article
            var text = new Func<string>(() =>
                string.Format(@"{0}{1}{2}", bit(),
                    rnd.Next() % 2 == 0 ? " PREVID "
                                        : string.Format(@" ACTID={0} ", rnd.Next()), bit()));

            // random data input
            var data = new List<Text>();
            Text prv = null;
            for(var n = 0; n < 1000000; n++)
                // producer / consumer is better to parallelize load data step
                data.Add(prv = new Text(text(), prv));

            Console.Write("Press key to start...");
            Console.ReadKey();

            // parallel processing
            Console.WriteLine("{0} unique ID's", data.AsParallel().Where(n => n.ProcessText()).Count());

            Console.WriteLine("Process completed.");
        }
    }
    

    As you can see, ProcessText processes all articles in parallel. Only PREVID articles wait, until their previous article has computed its own id.
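
The waiting behavior can also be shown in isolation. This is only a sketch of the same gating idea, and it swaps the Semaphore used above for a ManualResetEventSlim, which stays open for every reader once it has been set. GatedValue, Publish and Read are hypothetical names chosen for the example:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the id field of the Text class above:
// a value that readers can only see once a writer has published it.
public sealed class GatedValue
{
    private readonly ManualResetEventSlim ready = new ManualResetEventSlim(false);
    private int value;

    // Called once by whoever computes the value.
    public void Publish(int computed)
    {
        value = computed;
        ready.Set(); // opens the gate for all current and future readers
    }

    // Blocks until Publish has been called, then returns the value.
    public int Read()
    {
        ready.Wait();
        return value;
    }
}
```

A reader that calls Read() before Publish() simply blocks, which is the role ActID plays for PREVID articles.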

    The real problem in abstracting this behavior, I think, is the relations between items (one item depends on another). In LINQ, the natural model is that items are unrelated (you must use “group by” to relate them).

    I suggest you use a producer/consumer pattern.
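
One possible shape for such a producer/consumer pipeline is sketched below. It is an illustration under assumptions rather than a finished implementation: OrderedPipeline, Run, stage1, stage2 and capacity are made-up names, stage1 stands for Operation1 and stage2 for Operation2, and error handling is kept to a minimum. The producer queues one task per input in source order, and the consumer waits on the tasks in that same order, so the sequential step starts as soon as the first item is ready.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class OrderedPipeline
{
    // Runs stage1 on thread-pool threads and feeds its results to stage2
    // one at a time, in source order. At most `capacity` stage1 tasks
    // are queued ahead of the consumer.
    public static List<TOut> Run<TIn, TMid, TOut>(
        IEnumerable<TIn> source,
        Func<TIn, TMid> stage1,
        Func<TMid, TOut> stage2,
        int capacity = 64)
    {
        var results = new List<TOut>();
        using (var queue = new BlockingCollection<Task<TMid>>(capacity))
        {
            // Producer: start stage1 for each item and enqueue the task
            // in source order. Add blocks while the queue is full.
            var producer = Task.Run(() =>
            {
                try
                {
                    foreach (var item in source)
                    {
                        var captured = item;
                        queue.Add(Task.Run(() => stage1(captured)));
                    }
                }
                finally
                {
                    queue.CompleteAdding(); // lets the consumer loop end
                }
            });

            // Consumer: wait for each task in turn, so stage2 sees the
            // results in source order, on the calling thread.
            foreach (var task in queue.GetConsumingEnumerable())
            {
                results.Add(stage2(task.Result));
            }

            producer.Wait(); // surface any exception from the producer
        }
        return results;
    }
}
```

The bounded capacity limits how far the parallel stage can run ahead of the sequential one, which keeps memory use in check for large inputs.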

© 2021 The Archive Base. All Rights Reserved