I’m writing a small slab allocator for my program, however instead of using lists

Asked: June 10, 2026

I’m writing a small slab allocator for my program. Instead of using lists with a locking mechanism, I decided to implement something like the lockless heaps described in a paper by IBM. Which GCC intrinsic (ffs, ffz, ctz, etc.) would be most efficient for finding a free bit in the bitmap, and why?

My most likely targets are ARMv7 and ARMv6 processors with the CLZ hardware instruction.

I’ve come up with something like this:

uint32_t tmp;
uint32_t new_bits;
uint32_t old_bits;

do {
    old_bits = slab->bitmap;
    tmp = <function>(old_bits);           /* index of a free block */
    new_bits = old_bits | (1u << tmp);    /* mark that block as used */
} while (cpu_atomic_cmpxchg(&slab->bitmap, old_bits, new_bits) != OS_OKAY);

/* cast to char * so the pointer arithmetic is in bytes */
return (char *)slab->start + (tmp * slab->blksize);

1 Answer

Editorial Team · Answer 1 · 2026-06-10T15:06:19+00:00

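If you have CTZ, just invert the bitmap and count trailing zeroes. The result is the index of the lowest clear (free) bit:

/* x is the 32-bit bitmap; if every bit is set, ~x is 0 and __builtin_ctz(0) is undefined */
if (x == UINT32_MAX)
    return /* failure? */;
int index = __builtin_ctz(~x);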

E.g.

                                      v
 x = 0000 1111 0000 1111 0000 1111 0000 1111
~x = 1111 0000 1111 0000 1111 0000 1111 0000
index = __builtin_ctz(~x) = 4
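
Putting the pieces together, a minimal sketch of the allocation loop might look like the following. It is not the asker's code: `struct slab`, `slab_alloc` and `slab_free` are hypothetical names, and GCC's `__atomic` builtins stand in for the unspecified `cpu_atomic_cmpxchg`. It assumes a 32-bit bitmap in which a set bit means "in use".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical slab descriptor: one bit per block, 1 = in use. */
struct slab {
    uint32_t bitmap;   /* allocation bitmap, at most 32 blocks */
    char    *start;    /* address of block 0 */
    size_t   blksize;  /* size of each block in bytes */
};

/* Claim the lowest free block, or return NULL if the slab is full. */
static void *slab_alloc(struct slab *s)
{
    uint32_t old_bits = __atomic_load_n(&s->bitmap, __ATOMIC_RELAXED);
    uint32_t new_bits;
    unsigned idx;

    do {
        if (old_bits == UINT32_MAX)
            return NULL;                  /* no free bit; ctz(0) is undefined */
        idx = (unsigned)__builtin_ctz(~old_bits);
        new_bits = old_bits | (UINT32_C(1) << idx);
        /* On failure, old_bits is refreshed with the current bitmap value. */
    } while (!__atomic_compare_exchange_n(&s->bitmap, &old_bits, new_bits,
                                          1 /* weak */, __ATOMIC_ACQUIRE,
                                          __ATOMIC_RELAXED));

    return s->start + (size_t)idx * s->blksize;
}

/* Release a block previously returned by slab_alloc(). */
static void slab_free(struct slab *s, void *p)
{
    size_t idx = (size_t)((char *)p - s->start) / s->blksize;
    assert(idx < 32);
    __atomic_fetch_and(&s->bitmap, ~(UINT32_C(1) << idx), __ATOMIC_RELEASE);
}
```

As for which intrinsic is cheapest: on cores that have CLZ but no bit-reverse instruction, a trailing-zero count of a nonzero value can be derived as 31 - clz(x & -x). Whether that, or searching from the top bit with CLZ directly, is faster on a given ARMv6 or ARMv7 part is worth measuring rather than assuming.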
