I have two separate caches running in a JVM (one controlled by a third-party library), each using soft references. I would prefer the JVM to clear out the cache I control before the one controlled by the library. The SoftReference javadoc states:
All soft references to softly-reachable objects are guaranteed to have
been cleared before the virtual machine throws an OutOfMemoryError.
Otherwise no constraints are placed upon the time at which a soft
reference will be cleared or the order in which a set of such
references to different objects will be cleared. Virtual machine
implementations are, however, encouraged to bias against clearing
recently-created or recently-used soft references.

Direct instances of this class may be used to implement simple caches;
this class or derived subclasses may also be used in larger data
structures to implement more sophisticated caches. As long as the
referent of a soft reference is strongly reachable, that is, is
actually in use, the soft reference will not be cleared. Thus a
sophisticated cache can, for example, prevent its most recently used
entries from being discarded by keeping strong referents to those
entries, leaving the remaining entries to be discarded at the
discretion of the garbage collector.
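The last paragraph of the quoted Javadoc can be sketched roughly as follows. This is a minimal illustration, not code from any library: the class name `RecentlyUsedSoftCache`, the `maxStrong` parameter, and the `strongCount()` helper are all made up for the example. Every value is held through a `SoftReference`, and the few most recently used values are additionally held strongly, so the collector cannot clear them.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical cache: soft references for all entries, strong references for the most recent ones. */
final class RecentlyUsedSoftCache<K, V> {
    // Every cached value is reachable through a soft reference.
    private final Map<K, SoftReference<V>> softEntries = new HashMap<>();
    // Access-ordered map that drops its eldest entry once it holds more than maxStrong values.
    private final Map<K, V> recentlyUsed;

    RecentlyUsedSoftCache(final int maxStrong) {
        this.recentlyUsed = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxStrong;
            }
        };
    }

    synchronized void put(K key, V value) {
        softEntries.put(key, new SoftReference<>(value));
        recentlyUsed.put(key, value); // strong reference keeps the soft one from being cleared
    }

    synchronized V get(K key) {
        SoftReference<V> ref = softEntries.get(key);
        V value = (ref == null) ? null : ref.get();
        if (value == null) {
            softEntries.remove(key); // never cached, or already cleared by the collector
            return null;
        }
        recentlyUsed.put(key, value); // refresh the strong reference on every hit
        return value;
    }

    /** Number of entries currently protected by a strong reference. */
    synchronized int strongCount() {
        return recentlyUsed.size();
    }
}
```

Entries that fall out of the strong map stay in the cache, but from then on they are only softly reachable and may be cleared at the collector's discretion.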
How do common JVM implementations, especially HotSpot, handle SoftReferences in practice? Do they “bias against clearing recently-created or recently-used soft references”, as the spec encourages?
It looks like it could be tuneable, but it isn’t. The concurrent mark-sweep collector hangs on the default heap’s implementation of must_clear_all_soft_refs(), which apparently is only true when performing a _last_ditch_collection.

Normal handling of a failed allocation makes three successive calls to the heap’s do_collect method in CollectorPolicy.cpp: it tries to collect, tries to reallocate, tries to expand the heap if that fails, and then, as a last-ditch effort, tries to collect while clearing soft references.

The comment on the last collection (the only one that triggers clearing soft refs) is quite telling.
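To make the order of events concrete, here is a toy model of the sequence described above (collect, retry, expand, then a last-ditch collection that clears soft references). It is not HotSpot code and does not mirror its data structures: `ToyHeap`, its byte counters, and the log strings are all invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the allocation-failure sequence; every name here is hypothetical. */
final class ToyHeap {
    private long capacity;            // current heap size
    private final long maxCapacity;   // upper limit for expansion
    private long strongBytes;         // strongly reachable data, never reclaimed
    private long softBytes;           // data reachable only through soft references
    private long garbageBytes;        // unreachable data
    final List<String> log = new ArrayList<>();

    ToyHeap(long capacity, long maxCapacity) {
        this.capacity = capacity;
        this.maxCapacity = maxCapacity;
    }

    // Setup helpers for the demo; they do not check capacity.
    void addSoft(long bytes) { softBytes += bytes; }
    void addGarbage(long bytes) { garbageBytes += bytes; }
    long softBytes() { return softBytes; }

    private long free() { return capacity - strongBytes - softBytes - garbageBytes; }

    private boolean tryAllocate(long bytes) {
        if (bytes > free()) return false;
        strongBytes += bytes;
        return true;
    }

    private void collect(boolean clearAllSoftRefs) {
        garbageBytes = 0;                      // garbage is always reclaimed
        if (clearAllSoftRefs) softBytes = 0;   // soft data only in the last-ditch case
        log.add(clearAllSoftRefs ? "collect+clear-soft" : "collect");
    }

    void allocate(long bytes) {
        if (tryAllocate(bytes)) return;
        collect(false);                        // ordinary collection, soft refs survive
        if (tryAllocate(bytes)) return;        // retry the allocation
        if (capacity < maxCapacity) {          // expand the heap if possible
            capacity = maxCapacity;
            log.add("expand");
            if (tryAllocate(bytes)) return;
        }
        collect(true);                         // last ditch: clear soft references too
        if (tryAllocate(bytes)) return;
        throw new OutOfMemoryError("toy heap exhausted");
    }
}
```

In this model the soft data survives every ordinary collection and every expansion, and disappears only in the final attempt before the error would be thrown.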
-- Edited in response to the obvious: I was describing weak references, not soft ones --
In practice, I would imagine that SoftReferences are only “not” followed when the JVM is called for garbage collection in response to an attempt to avoid an OutOfMemoryError.

For SoftReferences to be compatible with all four Java 1.4 garbage collectors, and with the new G1 collector, the decision must lie only with the reachability determination. By the time that reaping and compacting occur, it is far too late to decide if an object is reachable. This suggests (but does not require) that a collection “context” exists which determines reachability based on free memory availability in the heap. Such a context would have to indicate not following SoftReferences prior to attempting to follow them.

Since OutOfMemoryError-avoidance garbage collection is specially scheduled in a full-collection, stop-the-world manner, it is not hard to imagine a scenario where the heap manager sets a “don’t follow SoftReference” flag before the collection occurs.

-- OK, so I decided that a “must work this way” answer just wasn’t good enough --
From the source code src/share/vm/gc_implementation/concurrentMarkSweep/vmCMSOperations.cpp (highlights are mine)
The operation to actually “do” garbage collection:
We better be a VM thread, otherwise a “program” thread is garbage collecting!
We are a concurrent collector, so we better be scheduled concurrently!
Grab the heap (which has the GCCause object in it).
Check to see if we need a foreground “young” collection
Are the program threads not meddling with the heap?
Fetch the garbage collection cause (the reason for this collection) from the heap.
Do a full collection of the young space
Note that this passes in the value of the heap’s must_clear_all_soft_refs flag,
which in an OutOfMemory scenario must have been set to true, and in either case
directs “do_full_collection” to not follow the soft references
The _gc_cause is an enum, which is (guesswork here) set to _allocation_failure in the first attempt at avoiding OutOfMemoryError and _last_ditch_collection after that fails (to attempt to collect transient garbage).

A quick look in the memory “heap” module shows that in do_full_collection, which calls do_collection, soft references are cleared explicitly (under the “right” conditions) with a single line.

-- Original post follows for those who want to learn about weak references --
In the mark-and-sweep algorithm, soft references are not followed from the main thread (and thus their targets are not marked unless a different branch can reach them through non-soft references).
In the copy algorithm, objects that soft references point to are not copied (again, unless they are reached through a different, non-soft reference).
Basically, when following the web of references from the “main” thread of execution, soft references are not followed. This allows their objects to be garbage collected just as if they didn’t have references pointing to them.
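The marking rule described here can be sketched as a toy graph walk. This is only an illustration under simplifying assumptions: `ToyMarker`, `Node`, and the `followSoft` flag are hypothetical names, and a real collector works on heap objects rather than on lists of edges.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Toy mark phase over a tiny object graph; all names are hypothetical. */
final class ToyMarker {
    static final class Node {
        final String name;
        final List<Node> strong = new ArrayList<>(); // ordinary references
        final List<Node> soft = new ArrayList<>();   // references the marker may skip
        Node(String name) { this.name = name; }
    }

    /** Returns the names of all nodes marked as reachable from root, sorted for readability. */
    static Set<String> mark(Node root, boolean followSoft) {
        Set<Node> marked = new HashSet<>();
        Deque<Node> pending = new ArrayDeque<>();
        pending.push(root);
        while (!pending.isEmpty()) {
            Node node = pending.pop();
            if (!marked.add(node)) continue;     // already visited
            pending.addAll(node.strong);         // strong edges are always followed
            if (followSoft) {
                pending.addAll(node.soft);       // soft edges only when the flag allows it
            }
        }
        Set<String> names = new TreeSet<>();
        for (Node node : marked) names.add(node.name);
        return names;
    }
}
```

With `followSoft` set to `false`, a node that hangs only off a soft edge stays unmarked and would be reclaimed, while a node that is also reachable through a strong path is still marked.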
It is important to mention that soft references are almost never used in isolation. They are typically used in objects where the design is to have multiple references to the object, but only one reference need be cleared to trigger garbage collection (for ease of maintaining the container, or run time performance of not needing to look up expensive references).
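As one possible shape for such a container, the sketch below stores values behind `SoftReference`s registered with a `ReferenceQueue`, and removes a map entry once its reference has been cleared and enqueued. `SoftValueMap`, `Entry`, and `simulateClear` are names made up for this example; `simulateClear` exists only so the behaviour can be shown without waiting for memory pressure.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical map whose values are softly held and whose stale entries are purged lazily. */
final class SoftValueMap<K, V> {
    /** Soft reference that remembers its key so the map entry can be found again. */
    private static final class Entry<K, V> extends SoftReference<V> {
        final K key;
        Entry(K key, V value, ReferenceQueue<V> queue) {
            super(value, queue);
            this.key = key;
        }
    }

    private final Map<K, Entry<K, V>> map = new HashMap<>();
    private final ReferenceQueue<V> queue = new ReferenceQueue<>();

    void put(K key, V value) {
        expunge();
        map.put(key, new Entry<>(key, value, queue));
    }

    V get(K key) {
        expunge();
        Entry<K, V> entry = map.get(key);
        return entry == null ? null : entry.get();
    }

    int size() {
        expunge();
        return map.size();
    }

    /** Demo hook: clears and enqueues one reference by hand, standing in for the collector. */
    boolean simulateClear(K key) {
        Entry<K, V> entry = map.get(key);
        if (entry == null) return false;
        entry.clear();
        return entry.enqueue();
    }

    /** Removes map entries whose references have been enqueued. */
    private void expunge() {
        Reference<? extends V> ref;
        while ((ref = queue.poll()) != null) {
            @SuppressWarnings("unchecked")
            Entry<K, V> entry = (Entry<K, V>) ref;
            map.remove(entry.key, entry); // only if the key still maps to this exact reference
        }
    }
}
```

The two-argument `remove` guards against deleting a newer value that was stored under the same key after the old reference was cleared.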