EVALUATION
For what it is worth: It would be really nice not to have to mark objects as we copy them to the survivors (to avoid the extra overhead during the GC pause, as well as avoid having to notify the marking phase that those objects have moved). Note that, if we do several GC pauses during a marking phase, the majority of objects in the survivors would be objects that were allocated since the start of the marking phase which, according to the SATB invariants, we do not have to visit during the marking phase; it's only the objects in the survivors after the initial-mark pause we really need to visit. I'll open a CR to track this idea (it's CR 6888336).
|
EVALUATION
The incomplete marking issue arises because, while marking is in progress, we handle the survivor spaces incorrectly.
In G1, there are two ways in which an object is considered live. First, if it's marked in the bitmap. Second, if it's over the "TAMS" (top at mark start) variable of its containing region. We keep two copies of this liveness information: one is the "previous" (the last one that was obtained and which is known to be consistent), the other is the "next" (the one currently in progress, which might be inconsistent). Here we deal with the next marking info, as it's the one that's being obtained during the marking cycle.
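The two liveness rules can be sketched as a single predicate. This is a simplified illustration, not the HotSpot code: the `Region` struct, its fields and `is_live_next` are hypothetical names, addresses are modeled as word indices from the region's bottom, and "over TAMS" is assumed to mean "at or above NTAMS".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified model of a G1 region. Addresses are word
// indices relative to the region's bottom (so bottom == 0).
struct Region {
  std::size_t top = 0;            // first unallocated word
  std::size_t ntams = 0;          // "next" top-at-mark-start (NTAMS)
  std::vector<bool> next_bitmap;  // "next" marking bitmap, one bit per word

  explicit Region(std::size_t words) : next_bitmap(words, false) {}
};

// An object is live with respect to the "next" marking info if it is
// explicitly marked in the bitmap, or implicitly live because it sits
// at or above NTAMS (assumption: "over TAMS" includes equality).
inline bool is_live_next(const Region& r, std::size_t addr) {
  assert(addr < r.top && "address must lie in the allocated part");
  return addr >= r.ntams || r.next_bitmap[addr];
}
```

With `ntams` set to bottom (0 in this model), every allocated address satisfies `addr >= r.ntams`, so the whole region is treated as implicitly live regardless of the bitmap.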
One more thing to point out is that, in G1, when we evacuate objects during evacuation pauses, if they are considered live we have to explicitly mark them in their new location too (typically, by marking them in the bitmap). In some cases we also have to notify the marking threads that an object has been evacuated.
The bug arises because, during marking, we explicitly set the NTAMS (next TAMS) variable of each region that contains survivors to bottom, thus making all its contents implicitly live. Consider the following scenario, in which we have:
a -> b -> c
with a and b being in a survivor space, and c being, say, in the old generation. Let's also assume that, when we start the evacuation pause, a is marked, b and c are not.
When we copy a and b to a survivor region, we'll propagate a's mark to the bitmap, notify the marking threads that they have to visit it, and then set the NTAMS field of that region to bottom, making them both implicitly marked (note that a is both explicitly and implicitly marked at this point).
When marking finally comes across a, it says "ah, b is already live" (because it's over NTAMS) and incorrectly doesn't process b any further. As a result, b is never visited by the marking threads and c is never marked.
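The scenario above can be reproduced with a toy model. This is only a sketch under simplifying assumptions, not the G1 marking code: `Obj`, `run_marking` and `c_is_marked` are hypothetical names, the heap is three objects in a vector, and "over NTAMS" is reduced to a per-object flag.

```cpp
#include <cassert>
#include <vector>

// Hypothetical toy object: explicit mark, implicit liveness, outgoing refs.
struct Obj {
  bool marked = false;       // explicitly marked in the "next" bitmap
  bool above_ntams = false;  // implicitly live (at or above NTAMS)
  std::vector<int> refs;     // indices of referenced objects
};

inline bool is_live(const Obj& o) { return o.marked || o.above_ntams; }

// Builds a -> b -> c, with a and b just copied into a survivor region and
// a already marked, then runs a simple marking loop starting from a
// (the object the marking threads were notified about).
// survivor_ntams_at_bottom == true models the buggy behaviour.
std::vector<Obj> run_marking(bool survivor_ntams_at_bottom) {
  constexpr int A = 0, B = 1, C = 2;
  std::vector<Obj> heap(3);
  heap[A].refs = {B};
  heap[B].refs = {C};
  heap[A].marked = true;
  heap[A].above_ntams = survivor_ntams_at_bottom;
  heap[B].above_ntams = survivor_ntams_at_bottom;

  std::vector<int> stack = {A};
  while (!stack.empty()) {
    int cur = stack.back();
    stack.pop_back();
    for (int ref : heap[cur].refs) {
      assert(ref >= 0 && ref < static_cast<int>(heap.size()));
      if (is_live(heap[ref])) continue;  // "already live": not visited
      heap[ref].marked = true;           // otherwise mark and visit later
      stack.push_back(ref);
    }
  }
  return heap;
}

inline bool c_is_marked(bool survivor_ntams_at_bottom) {
  return run_marking(survivor_ntams_at_bottom)[2].marked;
}
```

In this model, `c_is_marked(true)` is false because b is skipped as implicitly live, while `c_is_marked(false)` is true because b is marked, visited, and its reference to c is followed.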
|
SUGGESTED FIX
The fix is straightforward:
heapRegion.hpp:
  void note_end_of_copying() {
-   assert(top() >= _next_top_at_mark_start,
-          "Increase only");
-   // Survivor regions will be scanned on the start of concurrent
-   // marking.
-   if (!is_survivor()) {
+   assert(top() >= _next_top_at_mark_start, "Increase only");
      _next_top_at_mark_start = top();
    }
- }
|