Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • Home
  • SEARCH
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 9048069
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: June 16, 20262026-06-16T12:03:45+00:00 2026-06-16T12:03:45+00:00

I have a list of 500000 randomly generated Tuple<long,long,string> objects on which I am

  • 0

I have a list of 500000 randomly generated Tuple<long,long,string> objects on which I am performing a simple “between” search:

var data = new List<Tuple<long,long,string>>(500000);
...
var cnt = data.Count(t => t.Item1 <= x && t.Item2 >= x);

When I generate my random array and run my search for 100 randomly generated values of x, the searches complete in about four seconds. Knowing of the great wonders that sorting does to searching, however, I decided to sort my data – first by Item1, then by Item2, and finally by Item3 – before running my 100 searches. I expected the sorted version to perform a little faster because of branch prediction: my thinking has been that once we get to the point where Item1 == x, all further checks of t.Item1 <= x would predict the branch correctly as “no take”, speeding up the tail portion of the search. Much to my surprise, the searches took twice as long on a sorted array!

I tried switching around the order in which I ran my experiments, and used different seed for the random number generator, but the effect has been the same: searches in an unsorted array ran nearly twice as fast as the searches in the same array, but sorted!

Does anyone have a good explanation of this strange effect? The source code of my tests follows; I am using .NET 4.0.


private const int TotalCount = 500000;
private const int TotalQueries = 100;
private static long NextLong(Random r) {
    var data = new byte[8];
    r.NextBytes(data);
    return BitConverter.ToInt64(data, 0);
}
private class TupleComparer : IComparer<Tuple<long,long,string>> {
    public int Compare(Tuple<long,long,string> x, Tuple<long,long,string> y) {
        var res = x.Item1.CompareTo(y.Item1);
        if (res != 0) return res;
        res = x.Item2.CompareTo(y.Item2);
        return (res != 0) ? res : String.CompareOrdinal(x.Item3, y.Item3);
    }
}
static void Test(bool doSort) {
    var data = new List<Tuple<long,long,string>>(TotalCount);
    var random = new Random(1000000007);
    var sw = new Stopwatch();
    sw.Start();
    for (var i = 0 ; i != TotalCount ; i++) {
        var a = NextLong(random);
        var b = NextLong(random);
        if (a > b) {
            var tmp = a;
            a = b;
            b = tmp;
        }
        var s = string.Format("{0}-{1}", a, b);
        data.Add(Tuple.Create(a, b, s));
    }
    sw.Stop();
    if (doSort) {
        data.Sort(new TupleComparer());
    }
    Console.WriteLine("Populated in {0}", sw.Elapsed);
    sw.Reset();
    var total = 0L;
    sw.Start();
    for (var i = 0 ; i != TotalQueries ; i++) {
        var x = NextLong(random);
        var cnt = data.Count(t => t.Item1 <= x && t.Item2 >= x);
        total += cnt;
    }
    sw.Stop();
    Console.WriteLine("Found {0} matches in {1} ({2})", total, sw.Elapsed, doSort ? "Sorted" : "Unsorted");
}
static void Main() {
    Test(false);
    Test(true);
    Test(false);
    Test(true);
}

Populated in 00:00:01.3176257
Found 15614281 matches in 00:00:04.2463478 (Unsorted)
Populated in 00:00:01.3345087
Found 15614281 matches in 00:00:08.5393730 (Sorted)
Populated in 00:00:01.3665681
Found 15614281 matches in 00:00:04.1796578 (Unsorted)
Populated in 00:00:01.3326378
Found 15614281 matches in 00:00:08.6027886 (Sorted)
  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-06-16T12:03:46+00:00Added an answer on June 16, 2026 at 12:03 pm

    When you are using the unsorted list all tuples are accessed in memory-order. They have been allocated consecutively in RAM. CPUs love accessing memory sequentially because they can speculatively request the next cache line so it will always be present when needed.

    When you are sorting the list you put it into random order because your sort keys are randomly generated. This means that the memory accesses to tuple members are unpredictable. The CPU cannot prefetch memory and almost every access to a tuple is a cache miss.

    This is a nice example for a specific advantage of GC memory management: data structures which have been allocated together and are used together perform very nicely. They have great locality of reference.

    The penalty from cache misses outweighs the saved branch prediction penalty in this case.

    Try switching to a struct-tuple. This will restore performance because no pointer-dereference needs to occur at runtime to access tuple members.

    Chris Sinclair notes in the comments that “for TotalCount around 10,000 or less, the sorted version does perform faster“. This is because a small list fits entirely into the CPU cache. The memory accesses might be unpredictable but the target is always in cache. I believe there is still a small penalty because even a load from cache takes some cycles. But that seems not to be a problem because the CPU can juggle multiple outstanding loads, thereby increasing throughput. Whenever the CPU hits a wait for memory it will still speed ahead in the instruction stream to queue as many memory operations as it can. This technique is used to hide latency.

    This kind of behavior shows how hard it is to predict performance on modern CPUs. The fact that we are only 2x slower when going from sequential to random memory access tell me how much is going on under the covers to hide memory latency. A memory access can stall the CPU for 50-200 cycles. Given that number one could expect the program to become >10x slower when introducing random memory accesses.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

I have list of string which I want to add as option in select
I have the following code to populate a DropDownList var list = new Dictionary<string,
I have List I want to sort Desc by Priority, which is int and
I have list of files which contain particular patterns, but those files have been
I have a list of numbers which total 540000. I would like to sort
I have List collection with around 35,000 strings Typical string looks like this: <i>füüs</i>ampri
I have list (array list)that can contain many instances (between 500-3000 instances ). during
Let's say I have a class which, internally, stores a List of data: import
I have the following simple class, containing a String and an Int. private static
I have a list of approx. 500,000 strings, each approx. 100 characters long. Given

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.