I have a kernel module that allocates a large buffer of memory; this buffer is then mmap-ed into userspace.
The module receives some data from hardware and then puts the new data into the buffer with a flag in front of it (memory is initialized to zero; the flag is 1).
The userspace program reads the flag in a loop before returning a pointer to valid data.
Simplified version of the code:
    uint8_t * getData()
    {
        while(1)
        {
            if(*((volatile uint32_t*)this->buffer) == 1)
                return this->buffer+sizeof(uint32_t);
        }
    }
The memory region is mapped as shared, and a full buffer memory dump confirms that the buffer is written to correctly.
The problem is that after a certain number of correct reads, this function stops returning.
Could this be due to CPU caching? Is there a way to circumvent that and make sure that the read is made directly from RAM each time and not from cache?
Yes, it’s likely due to the CPU cache on the reader side. One might think the “volatile” keyword should protect against this sort of problem, but that’s not quite right: volatile is only a directive to the compiler not to keep the variable in a register. That is not the same thing as directing the CPU to read from main memory every time.
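To make the distinction concrete, here is a minimal sketch of a reader that uses an acquire load instead of a volatile cast. It assumes GCC or Clang (for the `__atomic_load_n` builtin) and a 4-byte-aligned buffer; the name `poll_for_data` and the `max_spins` limit are hypothetical and only there so the loop can give up instead of hanging.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: returns a pointer to the payload once the flag word
// at the start of `buffer` reads 1, or nullptr after `max_spins` attempts.
inline std::uint8_t* poll_for_data(std::uint8_t* buffer, unsigned long max_spins)
{
    // Assumption: the flag word is naturally aligned for a 32-bit load.
    assert(reinterpret_cast<std::uintptr_t>(buffer) % alignof(std::uint32_t) == 0);
    auto* flag = reinterpret_cast<std::uint32_t*>(buffer);

    for (unsigned long i = 0; i < max_spins; ++i) {
        // Acquire load (GCC/Clang builtin): the flag is loaded on every
        // iteration, and later reads of the payload are not reordered
        // before this load.
        if (__atomic_load_n(flag, __ATOMIC_ACQUIRE) == 1)
            return buffer + sizeof(std::uint32_t);
    }
    return nullptr;  // flag never became 1 within the spin budget
}
```

The acquire load only helps if the writer publishes the flag with matching ordering, so this is one half of the fix rather than a replacement for the write-side change.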
The problem needs to be solved on the write side. From your description, it sounds like the write happens in the kernel module and the read happens in userspace. If these two operations run on different CPUs (different caching domains), and there’s nothing to trigger a cache invalidation on the read side, the reader will get stuck as you describe. You need to force a store buffer flush in the kernel after your store instruction. Assuming it’s the Linux kernel, inserting a call to smp_mb() right after the module has set the flag and the value will most likely do the right thing on all architectures.
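The write-side ordering can be illustrated with a small userspace analogue. This is only a sketch: it uses two threads and `std::atomic` rather than a kernel module and an mmap-ed buffer, and `SharedSlot`, `publish`, `consume` and `run_demo` are hypothetical names. In the kernel, the closest counterparts to the release store would be `smp_wmb()` between the data write and the flag write, or `smp_store_release()` on the flag.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <functional>
#include <thread>

// Hypothetical stand-in for the shared buffer: a flag word followed by data.
struct SharedSlot {
    std::atomic<std::uint32_t> flag{0};
    std::uint32_t payload{0};
};

// Writer: store the payload first, then publish it by setting the flag.
// The release store keeps the payload write from being reordered after
// the flag write.
inline void publish(SharedSlot& slot, std::uint32_t value)
{
    // Precondition for this sketch: the slot has not been published yet.
    assert(slot.flag.load(std::memory_order_relaxed) == 0);
    slot.payload = value;
    slot.flag.store(1, std::memory_order_release);
}

// Reader: spin until the flag is set, then read the payload. The acquire
// load pairs with the writer's release store.
inline std::uint32_t consume(SharedSlot& slot)
{
    while (slot.flag.load(std::memory_order_acquire) != 1) {
        std::this_thread::yield();
    }
    return slot.payload;
}

// Runs one writer thread against a reader on the calling thread and
// returns the value the reader observed.
inline std::uint32_t run_demo(std::uint32_t value)
{
    SharedSlot slot;
    std::thread writer(publish, std::ref(slot), value);
    std::uint32_t seen = consume(slot);
    writer.join();
    return seen;
}
```

The point of the design is that the data is written before the flag, with a barrier between the two stores, so a reader that sees flag == 1 also sees the complete data.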