Does anyone know of an optimized way of detecting a 37 bit sequence in

Question

Asked: May 12, 2026 (2026-05-12T08:47:58+00:00)

Does anyone know of an optimized way of detecting a 37-bit sequence in a chunk of binary data? Sure, I can do a brute-force compare using windowing (compare the 37 bits starting at bit index 0, then increment the index and loop until I find it), but is there a better way? Maybe some hashing search that returns a probability that the sequence lies within a binary chunk? Or am I just pulling that out of my butt? Anyway, I’m going ahead with the brute-force search, but I was curious whether there is something more optimal. This is in C, by the way.
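
For reference, the brute-force windowing described above could look roughly like the sketch below. It is only a baseline under my own assumptions: bits are numbered most-significant first within each byte, the pattern sits in the low 37 bits of a `uint64_t` with its first bit in bit 36, and the name `find_bits_naive` is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PATTERN_BITS 37
#define PATTERN_MASK ((UINT64_C(1) << PATTERN_BITS) - 1)

/* Returns the bit offset of the first match, or -1 if there is none.
   Assumption: bits are numbered MSB-first within each byte, and `pattern`
   holds the sequence in its low 37 bits (first bit of the sequence in bit 36). */
long long find_bits_naive(const unsigned char *data, size_t n, uint64_t pattern)
{
    uint64_t window = 0;          /* the most recent 37 bits seen */
    size_t nbits = n * 8;

    assert((pattern & ~PATTERN_MASK) == 0);

    for (size_t i = 0; i < nbits; i++) {
        unsigned bit = (data[i / 8] >> (7 - i % 8)) & 1u;
        window = ((window << 1) | bit) & PATTERN_MASK;
        /* Only compare once a full 37 bits have been shifted in. */
        if (i + 1 >= PATTERN_BITS && window == pattern)
            return (long long)(i + 1 - PATTERN_BITS);
    }
    return -1;
}
```

Keeping the last 37 bits in a shift register means each step is one shift, one mask and one compare, rather than re-extracting 37 bits at every index.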


1 Answer

Editorial Team · Answer 1 · 2026-05-12T08:47:59+00:00

Interesting question. I assume your 37-bit sequence can begin at any point in a byte. Let’s say your sequence is represented by this:

ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789@

With a byte-aligned algorithm, we could see any of these 32-bit sequences in four consecutive bytes (x marks a bit after the end of the sequence, which can be anything):

BCDEFGHIJKLMNOPQRSTUVWXYZ0123456 [call this pattern w_A]
CDEFGHIJKLMNOPQRSTUVWXYZ01234567 [w_B, etc.]
DEFGHIJKLMNOPQRSTUVWXYZ012345678
EFGHIJKLMNOPQRSTUVWXYZ0123456789
FGHIJKLMNOPQRSTUVWXYZ0123456789@
GHIJKLMNOPQRSTUVWXYZ0123456789@x
HIJKLMNOPQRSTUVWXYZ0123456789@xx
IJKLMNOPQRSTUVWXYZ0123456789@xxx

Only these values, and no others, can form the second through fifth bytes of a byte sequence containing the 37 bits of interest.
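
To make the table concrete, here is a rough sketch of how the eight values could be derived from the pattern. The names `build_cases` and `struct aligned_case` are my own, not from any library. It assumes the pattern is held in the low 37 bits of a `uint64_t` with the first bit (A) in bit 36, and that each 32-bit word is read most-significant byte first.

```c
#include <assert.h>
#include <stdint.h>

#define PATTERN_BITS 37
#define PATTERN_MASK ((UINT64_C(1) << PATTERN_BITS) - 1)

struct aligned_case {
    uint32_t word;   /* expected value of the four bytes, read MSB-first */
    uint32_t mask;   /* 0 bits mark the trailing "x" don't-care positions */
};

/* Fills cases[0..7]. cases[s-1] is the case where s bits of the pattern
   (s = 1..8) sit at the end of the byte before the word, so cases[0]
   corresponds to w_A, cases[1] to w_B, and so on. */
void build_cases(uint64_t pattern, struct aligned_case cases[8])
{
    assert((pattern & ~PATTERN_MASK) == 0);

    for (int s = 1; s <= 8; s++) {
        int shift = (PATTERN_BITS - 32) - s;   /* 4, 3, 2, 1, 0, -1, -2, -3 */
        uint64_t w = shift >= 0 ? pattern >> shift : pattern << -shift;
        uint32_t m = shift >= 0 ? UINT32_MAX
                                : (uint32_t)(UINT32_MAX << -shift);
        cases[s - 1].word = (uint32_t)w & m;
        cases[s - 1].mask = m;
    }
}
```

For the last three cases the mask clears the x bits, so a candidate word has to be ANDed with the mask before comparing it against `word`.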

This leads to a reasonably obvious implementation:

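#include <stddef.h> // size_t
#include <stdint.h> // uint32_t

const unsigned char *p = ...; // input data
size_t n = ...;  // bytes available
size_t bitpos = 0;

// Each step reads p[-1] through p[4], so start at the second byte and stop
// five bytes before the end; a match in the final bytes needs separate handling.
// 789@, A, etc. stand for the corresponding bits of the sequence, lined up with the mask.
for (++p, n = (n >= 6 ? n - 5 : 0); n--; ++p) {
  uint32_t word = *(uint32_t*)p; // nonportable, sorry: unaligned read, and w_A etc. must be in host byte order
  bitpos += 8; // compiler should be able to optimise this variable out completely

  if (word == w_A) {
    if (((p[4] & 0xF0) == 789@) && ((p[-1] & 1) == A)) {
      // we found the data starting at the 8th bit of p-1
      found_at(bitpos-1);
    }
  } else if (word == w_B) {
    if (((p[4] & 0xE0) == 89@) && ((p[-1] & 3) == AB)) {
      // we found the data starting at the 7th bit of p-1
      found_at(bitpos-2);
    }
  } else if (word == w_C) {
     ...
  }
...
}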

Obviously there are problems with this strategy. First, the loop has to avoid touching p[-1] before the start of the buffer and p[4] beyond its end, but that’s easy to fix. Second, it fetches a word from odd addresses; that won’t work on some CPUs, SPARC and 68k for example. But doing so is an easy way to roll 4 comparisons into one.
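
If the unaligned fetch is a concern, one possible workaround is to assemble the word from individual bytes, which also removes the dependence on host byte order. The sketch below is my own take on the idea rather than a tuned implementation. `load_be32` and `find_bits_bytewise` are hypothetical names, bits are assumed to be numbered most-significant first within each byte, and the pattern is assumed to sit in the low 37 bits of a `uint64_t` with its first bit in bit 36.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PATTERN_BITS 37
#define PATTERN_MASK ((UINT64_C(1) << PATTERN_BITS) - 1)

/* Assemble four bytes MSB-first; no unaligned or endian-dependent access. */
static uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Returns the bit offset of the first occurrence of `pattern`, or -1. */
long long find_bits_bytewise(const unsigned char *data, size_t n, uint64_t pattern)
{
    uint32_t word[8], mask[8];

    assert((pattern & ~PATTERN_MASK) == 0);

    /* Case s (1..8): s pattern bits end the byte before the word. */
    for (int s = 1; s <= 8; s++) {
        int shift = (PATTERN_BITS - 32) - s;
        mask[s - 1] = shift >= 0 ? UINT32_MAX
                                 : (uint32_t)(UINT32_MAX << -shift);
        word[s - 1] = (uint32_t)(shift >= 0 ? pattern >> shift
                                            : pattern << -shift) & mask[s - 1];
    }

    for (size_t i = 1; i + 4 <= n; i++) {
        uint32_t w = load_be32(data + i);
        int have_tail = i + 4 < n;
        /* 48-bit window: data[i-1], the word, and data[i+4] if it exists. */
        uint64_t win = ((uint64_t)data[i - 1] << 40) | ((uint64_t)w << 8) |
                       (have_tail ? data[i + 4] : 0u);

        for (int s = 8; s >= 1; s--) {   /* larger s means an earlier offset */
            if ((w & mask[s - 1]) != word[s - 1])
                continue;                /* cheap 32-bit rejection */
            if (s < 5 && !have_tail)
                continue;                /* tail bits would lie past the end */
            if (((win >> (3 + s)) & PATTERN_MASK) == pattern)
                return (long long)(i * 8) - s;
        }
    }
    return -1;
}
```

The 32-bit comparison rejects almost every position; the head and tail bits are only checked, via the 48-bit window, when a word matches. Whether the byte-wise load costs more than the unaligned fetch depends on the compiler and CPU, so it is worth measuring.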

kek444’s suggestion would allow you to use an algorithm like KMP to skip forward in the data stream. However, the maximum size of the skip is not huge, so while the Turbo Boyer-Moore algorithm may reduce the number of byte comparisons by a factor of 4 or so, that may not be much of a win if the cost of a byte comparison is similar to the cost of a word comparison.
