From my understanding, if the hardware supports cache coherence on a multi-processor system, then writes to a shared variable will be visible to threads running on other processors. To test this, I wrote a simple program in both Java and pthreads:
public class mainTest {
    public static int i = 1, j = 0;

    // Thread.sleep can throw InterruptedException, so main must declare it.
    public static void main(String[] args) throws InterruptedException {
        /*
         * Thread1: Sleeps for 30ms and then sets i to 0
         */
        (new Thread() {
            public void run() {
                synchronized (this) {
                    try {
                        Thread.sleep(30);
                        System.out.println("Thread1: j=" + mainTest.j);
                        mainTest.i = 0;
                    } catch (Exception e) {
                        throw new RuntimeException("Thread1 Error");
                    }
                }
            }
        }).start();
        /*
         * Thread2: Loops while i == 1 and then exits.
         */
        (new Thread() {
            public void run() {
                synchronized (this) {
                    while (mainTest.i == 1) {
                        //System.out.println("Thread2: i = " + i); Comment1
                        mainTest.j++;
                    }
                    System.out.println("\nThread2: i!=1, j=" + j);
                }
            }
        }).start();
        /*
         * Sleep the main thread for 30 seconds, instead of using join.
         */
        Thread.sleep(30000);
    }
}
/* pThreads */
#include <stdio.h>
#include <pthread.h>
#include <assert.h>
#include <unistd.h> /* declares sleep() */

int i = 1, j = 0;

/* Thread1: sleeps for 1 second and then sets i to 0 */
void *threadFunc1(void *args) {
    sleep(1);
    printf("Thread1: j = %d\n", j);
    i = 0;
    return NULL;
}

/* Thread2: loops while i == 1 and then exits */
void *threadFunc2(void *args) {
    while (i == 1) {
        //printf("Thread2: i = %d\n", i);
        j++;
    }
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    int res;
    printf("Main: creating threads\n");
    res = pthread_create(&t1, NULL, threadFunc1, "Thread1"); assert(res == 0);
    res = pthread_create(&t2, NULL, threadFunc2, "Thread2"); assert(res == 0);
    res = pthread_join(t1, NULL); assert(res == 0);
    res = pthread_join(t2, NULL); assert(res == 0);
    printf("i = %d\n", i);
    printf("Main: End\n");
    return 0;
}
I noticed that the pthreads program always terminates (I tested it with different sleep times for thread 1). The Java program, however, terminates only occasionally; most of the time it does not.
If I uncomment the line marked Comment1 in the Java program, it always terminates. It also always terminates in Java if I use volatile.
So my confusion is:
- If cache coherence is done in hardware, then 'i=0' should be visible to other threads unless the compiler optimized the code. But if the compiler optimized the code, then I don't understand why the thread terminates sometimes and not other times. Also, adding a System.out.println seems to change the behavior.
- Can anyone see a compiler optimization that Java performs (and the C compiler does not) that causes this behavior?
- Is there something additional that the compiler has to do to get cache coherence even if the hardware already supports it (such as enabling or disabling it)?
- Should I be using volatile for all shared variables by default?
Am I missing something? Any additional comments are welcome.
Your specific problem is that the 2nd thread needs to synchronize memory after i has been set to 0 by the 1st thread. Both threads synchronize on this, which, as @Peter and @Marko have pointed out, refers to a different object in each thread, so the two synchronized blocks do not coordinate with each other. It is possible for the 2nd thread to enter the while loop before the first thread sets i = 0. No additional memory barrier is crossed inside the while loop, so the thread's view of the field is never updated.
Uncommenting the println makes the program terminate because the underlying System.out PrintStream is synchronized, which causes a memory barrier to be crossed. Memory barriers force memory to be synchronized between the thread and central memory and ensure the ordering of memory operations. In the JDK source, PrintStream.println(...) has traditionally done its work inside a synchronized (this) block.
You have to remember that each processor has both a few registers and a lot of per-processor cache memory. It is the cached memory that is the main issue here, not compiler optimizations.
The use of cached memory and the reordering of memory operations are both significant performance optimizations. Processors are free to change the order of operations to improve pipelining, and they do not synchronize their dirty pages unless a memory barrier is crossed. This means that a thread can run asynchronously using local high-speed memory to significantly increase performance. The Java memory model allows for this and is vastly more complicated than pthreads.
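The point about the two threads locking different objects can be illustrated with a minimal sketch. It assumes a hypothetical shared LOCK object and a class named SharedLockDemo, neither of which appears in the original program. Both threads take the same monitor for every access to i, so the write made in one synchronized block is visible to a later synchronized block on that monitor. The lock is taken inside the loop on each iteration; holding it around the whole loop would keep the writer out forever.

```java
public class SharedLockDemo {
    // Hypothetical shared monitor: both threads lock this same object,
    // unlike synchronized (this) inside two separate anonymous Thread objects.
    private static final Object LOCK = new Object();
    private static int i = 1;
    private static long j = 0;

    /** Returns true if the spinning thread observed i != 1 and exited. */
    static boolean runDemo() {
        synchronized (LOCK) {
            i = 1;
            j = 0;
        }
        Thread writer = new Thread(() -> {
            try {
                Thread.sleep(30);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            synchronized (LOCK) {
                i = 0; // published when the monitor is released
            }
        });
        Thread spinner = new Thread(() -> {
            while (true) {
                // Acquire the lock per iteration so the writer can get in.
                synchronized (LOCK) {
                    if (i != 1) {
                        break;
                    }
                    j++;
                }
            }
        });
        spinner.setDaemon(true); // do not block JVM exit if something goes wrong
        writer.start();
        spinner.start();
        try {
            writer.join();
            spinner.join(5000); // bounded wait, then report whether it stopped
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !spinner.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("spinner exited: " + runDemo());
    }
}
```

Locking on every iteration is far more expensive than a volatile read, so this is meant to show the visibility guarantee rather than to be the preferred fix for a simple flag.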
If you expect thread #1 to update a field and thread #2 to see that update then yes, you will need to mark the field as volatile. Using the Atomic* classes is often recommended, and is required if you want to increment a shared variable (++ is a read followed by a write, not a single atomic operation).
If you are doing multiple operations (such as iterating across a shared collection), then the synchronized keyword should be used.
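As a rough sketch of the volatile and Atomic* advice above, the following uses a volatile stop flag and an AtomicInteger counter. The class name VolatileFlagDemo, the field names, and both helper methods are hypothetical and chosen only for illustration; they are not from the original program.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class VolatileFlagDemo {
    // volatile: every read in the loop must observe the latest write.
    private static volatile boolean running = true;
    // AtomicInteger: incrementAndGet() is one atomic read-modify-write.
    private static final AtomicInteger counter = new AtomicInteger();

    /** Starts a spinning thread, clears the flag, and reports whether it stopped. */
    static boolean stopSpinner() {
        running = true;
        Thread spinner = new Thread(() -> {
            long spins = 0;
            while (running) {
                spins++;
            }
        });
        spinner.setDaemon(true);
        spinner.start();
        try {
            Thread.sleep(30);
            running = false;    // volatile write, visible to the spinner
            spinner.join(5000); // bounded wait
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !spinner.isAlive();
    }

    /** Two threads each increment the shared counter perThread times. */
    static int countWithTwoThreads(int perThread) {
        counter.set(0);
        Runnable work = () -> {
            for (int n = 0; n < perThread; n++) {
                counter.incrementAndGet();
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println("spinner stopped: " + stopSpinner());
        System.out.println("total: " + countWithTwoThreads(100000));
    }
}
```

With a plain int and ++ in place of the AtomicInteger, the two incrementing threads could lose updates, because each increment would be a separate read and write.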