I have read about the use of the C volatile keyword with memory-mapped hardware registers, ISRs, and multithreaded programs.
1) register
uint8_t volatile * pReg;
while (*pReg == 0) { /* do sth */ } // pReg points to a status register
2) ISR
int volatile flag = 0;
int main()
{
while (!flag) { /* do sth */ }
}
interrupt void rx_isr(void)
{
//change flag
}
3) multithread
int volatile var = 0;
int task1()
{
while (var == 0) { /* do sth */ }
}
int task2()
{
var++;
}
I can see why the compiler might wrongly optimize the while loop in case 1) if volatile is missing: the variable is changed by hardware, so the compiler may not see any change to it made from code.
But for cases 2) and 3), why is volatile ever needed? In both cases the variable is declared global, and the compiler can see it is used in more than one place. So why would the compiler optimize the while loop if the variable is not volatile?
Is it because a compiler by design has no notion of an “asynchronous call” (in the case of an ISR) or of multithreading? But that can’t be, right?
In addition, case 3) looks like a common multithreaded program without the volatile keyword. Let’s say I add some locking around the global variable (no volatile keyword):
int var = 0;
int task1()
{
lock(); // some mutex
while (var == 0) { /* do sth */ }
release();
}
int task2()
{
lock();
var++;
release();
}
It looks normal enough to me. So do I really need volatile in multithreading? How come I have never seen the volatile qualifier added to a variable to prevent optimization in a multithreaded program?
The main point of the volatile keyword is to prevent the compiler from generating code that keeps a variable in a CPU register as a faster stand-in for its memory location. It forces the compiled code to read or write the variable’s actual memory location on every access, so it picks up the latest value, which may have been changed by another entity. By adding volatile we make sure our code sees any change made to the variable by something else, such as hardware or an ISR, and does not keep working with a stale copy.
In the absence of the volatile keyword, the compiler tries to generate faster code by reading the variable from RAM into a CPU register once and using that cached value throughout a loop or function. Accessing RAM can be tens of times slower than accessing a CPU register.
I have experience with items 1 and 2, but I don’t think you need to declare a variable volatile in a multithreaded environment. A lock/unlock mechanism is necessary to solve the synchronization problem, and that is unrelated to what volatile is about.
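To make item 2 concrete, here is a minimal sketch that can be tried on a hosted system. It assumes a standard C signal handler is a fair stand-in for an ISR, since both run asynchronously to the main code. The names rx_flag, rx_handler, rx_init and wait_for_rx are hypothetical and chosen only for this illustration. Real ISR syntax is compiler-specific.

```c
#include <assert.h>
#include <signal.h>

/* Hypothetical flag shared between the handler and the polling loop.
 * volatile makes the compiler re-read it from memory on every access;
 * sig_atomic_t is the type the C standard allows a handler to write. */
static volatile sig_atomic_t rx_flag = 0;

/* Stand-in for rx_isr(): runs asynchronously to the main code. */
static void rx_handler(int signo)
{
    (void)signo;   /* unused */
    rx_flag = 1;   /* "change flag" */
}

/* Clears the flag and installs the handler.
 * Returns 0 on success, -1 on failure. */
int rx_init(void)
{
    rx_flag = 0;
    return signal(SIGINT, rx_handler) == SIG_ERR ? -1 : 0;
}

/* Polls the flag at most max_polls times.
 * Returns the number of polls completed before the flag was seen set,
 * or -1 if it was never seen. Without volatile, the compiler would be
 * free to read rx_flag once and reuse that value for the whole loop. */
long wait_for_rx(long max_polls)
{
    assert(max_polls >= 0);
    for (long i = 0; i < max_polls; i++) {
        if (rx_flag) {
            return i;
        }
        /* do sth */
    }
    return -1;
}
```

After rx_init() succeeds, wait_for_rx(5) returns -1 while nothing has fired. Once raise(SIGINT) has simulated the interrupt, wait_for_rx(5) returns 0 because the very first read sees the updated flag. The loop is bounded only so the sketch always terminates; a real main loop would typically poll without a limit.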