I need to iterate through a set of bytes, searching for a 4 byte

Question

Asked: June 4, 2026 (2026-06-04T06:38:47+00:00)

I need to iterate through a set of bytes, searching for a 4 byte value (all 4 bytes are the same). The length of the data is variable and these bytes can be anywhere inside the data; I’m looking for the first instance. I’m trying to find the fastest possible implementation because this logic runs in a critical part of my code.

This will only ever run on x86 & x64, under Windows.

typedef unsigned char Byte;
typedef Byte* BytePtr;
typedef unsigned int UInt32;
typedef UInt32* UInt32Ptr;

const Byte MARKER_BYTE = 0xAA;
const UInt32 MARKER = 0xAAAAAAAA;

UInt32 nDataLength = ...;
BytePtr pData = ...;
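// One past the last offset at which a full 4-byte read still fits
// (assumes nDataLength >= 4), so a marker at the very end is not missed.
BytePtr pEnd = pData + nDataLength - sizeof ( UInt32 ) + 1;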

// Option 1 -------------------------------------------
while ( pData < pEnd )
{
    if ( *( (UInt32Ptr) pData ) == MARKER )
    {
        ... // Do something here
        break;
    }

    pData++;
}

// Option 2 -------------------------------------------
while ( pData < pEnd )
{
    if ( ( *pData == MARKER_BYTE ) && ( *( (UInt32Ptr) pData ) == MARKER ) )
    {
        ... // Do something here
        break;
    }

    pData++;
}

I think Option 2 is faster but I’m not sure if my reasoning is correct.

Option 1 first reads 4 bytes from memory, checks them against the 4-byte constant and, if they do not match, steps to the next byte and starts over. The next 4-byte read from memory is going to overlap 3 bytes already read, so the same bytes need to be fetched again. Most bytes before my 4-byte marker would be read twice.

Option 2 only reads 1 byte at a time and if that single byte is a match, it reads the full 4-byte value from that address. This way, all bytes are read only once and only the 4 matching bytes are read twice.

Is my reasoning correct or am I overlooking something?

And before someone brings it up, yes, I really do need to perform this kind of optimization. 🙂

Edit: Note that this code will only ever run on Intel / AMD based computers. I don’t care if other architectures would fail to run this, as long as normal x86 / x64 computers (desktops / servers) run this without problems or performance penalties.

Edit 2: compiler is VC++ 2008, if that helps.
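
For reference, the two options can be written as self-contained functions. This is only a sketch under a few assumptions that are not in the original post: the function names and the LoadU32 helper are hypothetical, the 4-byte read goes through memcpy instead of a pointer cast (compilers commonly turn this into a single load on x86 / x64, but that is not guaranteed), and the loop bound is inclusive so a marker at the very end of the data is still found. It makes no claim about which option is faster.

```cpp
#include <cstddef>
#include <cstring>

typedef unsigned char Byte;
typedef unsigned int  UInt32;   // assumed to be 32 bits wide, as on x86 / x64 Windows

const Byte   MARKER_BYTE = 0xAA;
const UInt32 MARKER      = 0xAAAAAAAA;

// Hypothetical helper: read 4 bytes from any address without relying on
// alignment or on casting a Byte pointer to a UInt32 pointer.
static UInt32 LoadU32(const Byte* p)
{
    UInt32 v;
    std::memcpy(&v, p, sizeof v);
    return v;
}

// Option 1: do a full 4-byte compare at every offset.
const Byte* FindMarkerOption1(const Byte* pData, std::size_t nDataLength)
{
    if (nDataLength < sizeof(UInt32)) return NULL;
    const Byte* pLast = pData + nDataLength - sizeof(UInt32);  // last valid start
    for (; pData <= pLast; ++pData)
        if (LoadU32(pData) == MARKER) return pData;
    return NULL;
}

// Option 2: check a single byte first, and only do the 4-byte compare
// when that byte already matches.
const Byte* FindMarkerOption2(const Byte* pData, std::size_t nDataLength)
{
    if (nDataLength < sizeof(UInt32)) return NULL;
    const Byte* pLast = pData + nDataLength - sizeof(UInt32);  // last valid start
    for (; pData <= pLast; ++pData)
        if (*pData == MARKER_BYTE && LoadU32(pData) == MARKER) return pData;
    return NULL;
}
```

Both functions return a pointer to the first marker, or NULL when the data contains none.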

1 Answer

Editorial Team · Answer 1 · 2026-06-04T06:38:49+00:00

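You might also try a Boyer-Moore-style approach: compare each 4-byte window starting from its last byte and working backwards, and skip ahead as soon as a byte does not match.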

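// start is the beginning of the data (the initial pData in the question).
// For this loop to cover all of the data, pEnd should be one past the
// final byte (start + nDataLength), because pData points at the LAST
// byte of the current 4-byte window.
pData = start + 3;
int i;

while(pData < pEnd) {
    // Compare the window backwards, from its last byte to its first.
    for(i = 0; i < 4; ++i) {
        if (*(pData-i) != MARKER_BYTE) {
            // No window containing the mismatching byte can match, so
            // the next candidate window ends 4-i bytes further on.
            pData += 4-i;
            break;
        }
    }
    if (i == 4) {
        /* do something here with (pData-3) */
        break;
    }
}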

If you’re lucky, that tests only every fourth byte until you find a match.
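
To make the "every fourth byte" behaviour visible, here is a minimal sketch of the same skip loop as a standalone function that also counts how many bytes it examines. The function name and the nReads counter are hypothetical and exist only for illustration; indices are used instead of pointers so that a skip never forms a pointer beyond the end of the buffer.

```cpp
#include <cstddef>

typedef unsigned char Byte;

const Byte MARKER_BYTE = 0xAA;

// Hypothetical function: returns a pointer to the first run of four
// MARKER_BYTE values, or NULL if there is none. *nReads receives the
// number of bytes that were examined (illustrative only).
const Byte* FindMarkerSkip(const Byte* pStart, std::size_t nDataLength,
                           std::size_t* nReads)
{
    *nReads = 0;
    std::size_t pos = 3;  // index of the LAST byte of the current window
    while (pos < nDataLength) {
        int i;
        // Compare the window backwards, from its last byte to its first.
        for (i = 0; i < 4; ++i) {
            ++*nReads;
            if (pStart[pos - i] != MARKER_BYTE) {
                // No window containing this byte can match: skip past it.
                pos += 4 - i;
                break;
            }
        }
        if (i == 4) return pStart + (pos - 3);  // all four bytes matched
    }
    return NULL;
}
```

On 16 zero bytes followed by the marker, this examines 4 bytes to skip over the zeros plus 4 bytes to confirm the match, so 8 reads for 20 bytes of data. When the data contains many stray 0xAA bytes the skips get shorter and the advantage shrinks.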

Whether that’s faster or slower than testing every single byte is anybody’s guess for a pattern as short as this.
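
Since this comes down to measurement, a small timing harness along the following lines could be used to compare candidates on representative data. It is only a sketch: TimeSearch, SearchFn and FindMarkerSimple are hypothetical names, std::clock is coarse, and real numbers depend heavily on the data, the compiler settings and the CPU.

```cpp
#include <cstddef>
#include <ctime>

typedef unsigned char Byte;

// Hypothetical signature that every candidate search is assumed to share.
typedef const Byte* (*SearchFn)(const Byte*, std::size_t);

// Plain byte-by-byte reference search, included only so the harness has
// something to time; swap in the variants you actually want to compare.
const Byte* FindMarkerSimple(const Byte* p, std::size_t n)
{
    for (std::size_t i = 0; i + 4 <= n; ++i)
        if (p[i] == 0xAA && p[i + 1] == 0xAA &&
            p[i + 2] == 0xAA && p[i + 3] == 0xAA)
            return p + i;
    return NULL;
}

// Runs the search `reps` times and returns the elapsed CPU time in seconds.
double TimeSearch(SearchFn search, const Byte* data, std::size_t n, int reps)
{
    volatile std::size_t sink = 0;  // keeps the calls from being optimized away
    std::clock_t t0 = std::clock();
    for (int r = 0; r < reps; ++r) {
        const Byte* hit = search(data, n);
        sink = sink + (hit ? static_cast<std::size_t>(hit - data) : n);
    }
    std::clock_t t1 = std::clock();
    return static_cast<double>(t1 - t0) / CLOCKS_PER_SEC;
}
```

Call TimeSearch once per candidate with the same buffer and repetition count, and use enough repetitions that the elapsed time is well above the clock resolution.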
