I could find the answer if I read a complete chapter/book about multithreading, but I’d like a quicker answer. (I know this stackoverflow question is similar, but not sufficiently.)
Assume there is this class:
public class TestClass {
    private int someValue;

    public int getSomeValue() { return someValue; }
    public void setSomeValue(int value) { someValue = value; }
}
There are two threads (A and B) that access the instance of this class. Consider the following sequence:
- A: getSomeValue()
- B: setSomeValue()
- A: getSomeValue()
If I’m right, someValue must be volatile, otherwise the 3rd step might not return the up-to-date value (because A may have a cached value). Is this correct?
Second scenario:
- B: setSomeValue()
- A: getSomeValue()
In this case, A will always get the correct value, because this is its first access, so it can’t have a cached value yet. Is this right?
If a class is accessed only in the second way, there is no need for volatile/synchronization, or is there?
Note that this example was simplified, and actually I’m wondering about particular member variables and methods in a complex class, and not about whole classes (i.e. which variables should be volatile or have synced access). The main point is: if more threads access certain data, is synchronized access needed by all means, or does it depend on the way (e.g. order) they access it?
After reading the comments, I try to present the source of my confusion with another example:
- From UI thread: threadA.start()
- threadA calls getSomeValue(), and informs the UI thread
- UI thread gets the message (in its message queue), so it calls: threadB.start()
- threadB calls setSomeValue(), and informs the UI thread
- UI thread gets the message, and informs threadA (in some way, e.g. message queue)
- threadA calls getSomeValue()
This is a totally synchronized structure, but why does this imply that threadA will get the most up-to-date value in step 6? (if someValue is not volatile, or not put into a monitor when accessed from anywhere)
The issue is that Java is simply a specification. There are many JVM implementations and many physical operating environments. On any given combination, an action may be safe or unsafe. For instance, on single-processor systems the volatile keyword in your example is probably completely unnecessary. Since the writers of the memory and language specifications can’t reasonably account for every possible set of operating conditions, they choose to white-list certain patterns that are guaranteed to work on all compliant implementations. Adhering to these guidelines ensures both that your code will work on your target system and that it will be reasonably portable.
In this case, “caching” typically refers to activity going on at the hardware level. Certain events in Java cause the cores of a multiprocessor system to “synchronize” their caches. Accesses to volatile variables are one example; synchronized blocks are another. Imagine a scenario where two threads, X and Y, are scheduled to run on different processors.
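A rough sketch of that scenario, covering the two mechanisms just mentioned (a volatile field and synchronized accessors). The names VisibilityDemo, VolatileHolder, SynchronizedHolder and observe are made up for illustration and are not from the question. Thread Y polls the getter until it sees the value thread X wrote; with either mechanism the write is guaranteed to become visible, so the loop ends.

```java
import java.util.function.IntConsumer;
import java.util.function.IntSupplier;

// Sketch only: all class and method names here are hypothetical.
final class VisibilityDemo {

    // Variant 1: a volatile field. A write by one thread is guaranteed to be
    // visible to subsequent reads of the same field by other threads.
    static final class VolatileHolder {
        private volatile int someValue;
        int getSomeValue() { return someValue; }
        void setSomeValue(int value) { someValue = value; }
    }

    // Variant 2: synchronized accessors. Releasing the monitor in the setter
    // happens-before a later acquisition of the same monitor in the getter.
    static final class SynchronizedHolder {
        private int someValue;
        synchronized int getSomeValue() { return someValue; }
        synchronized void setSomeValue(int value) { someValue = value; }
    }

    // Thread Y polls the getter until it returns a non-zero value;
    // thread X writes 42 through the setter. Returns what Y finally saw.
    static int observe(IntSupplier getter, IntConsumer setter) {
        int[] seen = new int[1];
        Thread y = new Thread(() -> {
            int v;
            while ((v = getter.getAsInt()) == 0) {
                Thread.yield();
            }
            seen[0] = v;
        });
        Thread x = new Thread(() -> setter.accept(42));
        y.start();
        x.start();
        try {
            x.join();
            y.join(); // join() makes Y's write to seen[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        VolatileHolder v = new VolatileHolder();
        SynchronizedHolder s = new SynchronizedHolder();
        System.out.println(observe(v::getSomeValue, v::setSomeValue));
        System.out.println(observe(s::getSomeValue, s::setSomeValue));
    }
}
```

With a plain field and plain accessors, nothing in the language guarantees that Y's loop ever sees the write, which is the situation the question describes.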
The point is that volatile (on compliant implementations) ensures that ordered writes will always be flushed to main memory, and that other processors’ caches will be flagged as ‘dirty’ before the next access, regardless of the thread from which that access occurs.
Disclaimer: volatile DOES NOT LOCK. This is especially important in the following case:
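As a hedged illustration of what "does not lock" can mean in practice (my own sketch, with made-up names OrderingDemo, readWithoutWaiting and readAfterLatch): a volatile field makes a completed write visible, but it does not make a reader wait for the writer. If the read has to come after the write, some extra coordination is needed; the sketch assumes a CountDownLatch for that, though other mechanisms would work too.

```java
import java.util.concurrent.CountDownLatch;

// Sketch only: class and method names are hypothetical.
final class OrderingDemo {
    private static volatile int someValue;

    // The reader does not wait. volatile gives visibility, not ordering, so
    // the result is 0 or 42 depending on how the threads get scheduled.
    static int readWithoutWaiting() {
        someValue = 0;
        int[] seen = new int[1];
        Thread writer = new Thread(() -> someValue = 42);
        Thread reader = new Thread(() -> seen[0] = someValue);
        writer.start();
        reader.start();
        joinBoth(writer, reader);
        return seen[0];
    }

    // The reader blocks on a latch that the writer releases after writing,
    // so the read can only run once the write has completed.
    static int readAfterLatch() {
        someValue = 0;
        int[] seen = new int[1];
        CountDownLatch written = new CountDownLatch(1);
        Thread writer = new Thread(() -> {
            someValue = 42;
            written.countDown();
        });
        Thread reader = new Thread(() -> {
            try {
                written.await();
                seen[0] = someValue;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        reader.start();
        writer.start();
        joinBoth(writer, reader);
        return seen[0];
    }

    private static void joinBoth(Thread a, Thread b) {
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("without waiting: " + readWithoutWaiting());
        System.out.println("after latch:     " + readAfterLatch());
    }
}
```

Only the latch variant has a predictable result; the first one is correct code that simply makes no promise about which call runs first.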
This could be relevant to your question if your intent is that setSomeValue must always be called before getSomeValue.
If the intent is that getSomeValue() must always reflect the most recent call to setSomeValue(), then this is a good place for the volatile keyword. Just remember that without it there is no guarantee that getSomeValue() will reflect the most recent call to setSomeValue(), even if setSomeValue() was scheduled first.
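For the message-queue example in the question, here is a minimal sketch under one loud assumption: the messages travel through a java.util.concurrent queue (the question does not say which mechanism is actually used). Those collections document their own memory consistency effects: actions in a thread before placing an element into the queue happen-before actions after another thread removes it. That makes it one more of the white-listed patterns, and it is why the plain, non-volatile field is still read correctly here. The UI thread in the middle is left out for brevity; QueueHandoffDemo and the message text are made-up names.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch only: QueueHandoffDemo is a hypothetical name, and the direct
// B-to-A queue stands in for the question's UI-thread message passing.
final class QueueHandoffDemo {

    // Same shape as the question's TestClass: a plain, non-volatile field.
    static final class TestClass {
        private int someValue;
        int getSomeValue() { return someValue; }
        void setSomeValue(int value) { someValue = value; }
    }

    static int run() {
        TestClass shared = new TestClass();
        BlockingQueue<String> toA = new LinkedBlockingQueue<>();
        int[] seen = new int[1];

        Thread threadB = new Thread(() -> {
            shared.setSomeValue(42); // plain write ...
            toA.add("value set");    // ... published by enqueueing the message
        });
        Thread threadA = new Thread(() -> {
            try {
                toA.take();                      // blocks until B's message arrives
                seen[0] = shared.getSomeValue(); // ordered after B's write
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        threadA.start();
        threadB.start();
        try {
            threadA.join();
            threadB.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The guarantee comes from the queue, not from the field. If the notification went through something with no such guarantee, such as a plain shared boolean, the field would need volatile or synchronized access again.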