It seems that when performing an & operation between two longs it takes

Asked: 2026-05-27T16:23:10+00:00

It seems that performing an & operation between two longs takes the same amount of time as the equivalent operation on four 32-bit ints.

For example

long1 & long2

Takes as long as

int1 & int2
int3 & int4

This is running on a 64-bit OS and targeting 64-bit .NET.

In theory, this should be twice as fast. Has anyone encountered this previously?

EDIT

As a simplification, imagine I have two lots of 64 bits of data. I take those 64 bits and put them into a long, and perform a bitwise & on those two.

I also take those two sets of data, and put the 64 bits into two 32 bit int values and perform two &s. I expect to see the long & operation running faster than the int & operation.
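To make the comparison concrete, here is a minimal sketch of the two approaches applied to the same 64 bits. The class and method names (AndEquivalence, AndAsLong, AndAsTwoInts) are hypothetical and only for illustration; the sketch shows that both routes produce the same bits, not how fast either one runs.

```csharp
using System;

public static class AndEquivalence
{
    // Hypothetical helper: ANDs two 64-bit values with a single long operation.
    public static long AndAsLong(long x, long y)
    {
        return x & y;
    }

    // Hypothetical helper: ANDs the same 64 bits as two separate 32-bit halves.
    public static long AndAsTwoInts(long x, long y)
    {
        // Split each long into its low and high 32-bit halves.
        int xLo = (int)x, xHi = (int)(x >> 32);
        int yLo = (int)y, yHi = (int)(y >> 32);

        // Two 32-bit AND operations, one per half.
        int lo = xLo & yLo;
        int hi = xHi & yHi;

        // Recombine; the uint casts avoid sign extension when widening.
        return ((long)(uint)hi << 32) | (uint)lo;
    }

    public static void Main()
    {
        long x = 0x123456789ABCDEF0, y = 0x0F0F0F0FF0F0F0F0;
        // Reports whether the two routes agree for this pair of inputs.
        Console.WriteLine(AndAsLong(x, y) == AndAsTwoInts(x, y));
    }
}
```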

1 Answer

Editorial Team · Answer 1 · 2026-05-27T16:23:11+00:00

I couldn’t reproduce the problem.

My test was as follows (int version shown):

// deliberately made hard to optimise without whole program optimisation
public static int[] data = new int[1000000]; // long[] when testing long

// I happened to have a winforms app open, feel free to make this a console app..
private void button1_Click(object sender, EventArgs e)
{
    long best = long.MaxValue;
    for (int j = 0; j < 1000; j++)
    {
        Stopwatch timer = Stopwatch.StartNew();
        int a1 = ~0, b1 = 0x55555555, c1 = 0x12345678; // varies: see below
        int a2 = ~0, b2 = 0x55555555, c2 = 0x12345678;
        int[] d = data; // long[] when testing long
        for (int i = 0; i < d.Length; i++)
        {
            int v = d[i]; // long when testing long, see below
            a1 &= v; a2 &= v;
            b1 &= v; b2 &= v;
            c1 &= v; c2 &= v;
        }
        // don't average times: we want the result with minimal context switching
        best = Math.Min(best, timer.ElapsedTicks); 
        button1.Text = best.ToString() + ":" + (a1 + a2 + b1 + b2 + c1 + c2).ToString("X8");
    }
}

For testing longs a1 and a2 etc are merged, giving:

long a = ~0, b = 0x5555555555555555, c = 0x1234567812345678;

Running the two programs on my laptop (i7 Q720) as a release build outside of VS (.NET 4.5) I got the following best times, in Stopwatch ticks:

int: 2238, long: 1924

Now considering there’s a huge amount of loop overhead, and that the long version is working with twice as much data (8 MB vs 4 MB), it still comes out clearly ahead. So I have no reason to believe that C# is not making full use of the processor’s 64-bit bitwise operations.
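For anyone who wants to repeat this without a WinForms project, the long variant might look roughly like the console sketch below. It is an adaptation, not the exact program that produced the timings above: the class name LongAndBench and the helper BestTicks are hypothetical, and the array is left zero-filled as in the original test.

```csharp
using System;
using System.Diagnostics;

public static class LongAndBench
{
    // Same element count as the answer's array; static so it is harder to optimise away.
    public static long[] data = new long[1000000];

    // Hypothetical helper: repeats the timed loop and returns the best (lowest) tick count.
    public static long BestTicks(int runs, out long checksum)
    {
        long best = long.MaxValue;
        checksum = 0;
        for (int j = 0; j < runs; j++)
        {
            Stopwatch timer = Stopwatch.StartNew();
            // a1/a2, b1/b2 and c1/c2 from the int version merged into single longs.
            long a = ~0L, b = 0x5555555555555555, c = 0x1234567812345678;
            long[] d = data;
            for (int i = 0; i < d.Length; i++)
            {
                long v = d[i];
                a &= v;
                b &= v;
                c &= v;
            }
            // Keep the minimum rather than the average to limit context-switch noise.
            best = Math.Min(best, timer.ElapsedTicks);
            // Use the results so the loop cannot be discarded as dead code.
            checksum = a + b + c;
        }
        return best;
    }

    public static void Main()
    {
        long checksum;
        long best = BestTicks(1000, out checksum);
        Console.WriteLine(best + ":" + checksum.ToString("X16"));
    }
}
```

As with the original, run it as a release build outside the debugger; the tick counts will differ from machine to machine.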

But we really shouldn’t be benchmarking it in the first place. If there’s a concern, simply check the JIT-compiled code (Debug -> Windows -> Disassembly). Ensure the compiler’s using the instructions you expect it to use, and move on.

Attempting to measure the performance of those individual instructions on your processor (and this could well be specific to your processor model) in anything other than assembler is a very bad idea, and from within a JIT-compiled language like C# it is beyond futile. But there’s no need to anyway, as it’s all in Intel’s optimisation handbook should you need to know.

To this end, here’s the disassembly of the a &= for the long version of the program on x64 (release, but inside of debugger – unsure if this affects the assembly, but it certainly affects the performance):

00000111  mov         rcx,qword ptr [rsp+60h] ; a &= v
00000116  mov         rax,qword ptr [rsp+38h] 
0000011b  and         rax,rcx 
0000011e  mov         qword ptr [rsp+38h],rax

As you can see there’s a single 64-bit and instruction as expected, along with three 64-bit moves. So far so good, and exactly half the number of instructions of the int version:

00000122  mov         ecx,dword ptr [rsp+5Ch] ; a1 &= v
00000126  mov         eax,dword ptr [rsp+38h] 
0000012a  and         eax,ecx 
0000012c  mov         dword ptr [rsp+38h],eax 
00000130  mov         ecx,dword ptr [rsp+5Ch] ; a2 &= v
00000134  mov         eax,dword ptr [rsp+44h] 
00000138  and         eax,ecx 
0000013a  mov         dword ptr [rsp+44h],eax

I can only conclude that the problem you’re seeing is specific to something about your test suite, build options, processor… or quite possibly, that the & isn’t the point of contention you believe it to be. HTH.
