Bug Database
Bug ID: 6722116
Votes: 0
Synopsis: CMS: Incorrect overflow handling when using parallel concurrent marking
Category: hotspot:garbage_collector
Reported Against:
Release Fixed: hs14(b06), hs11(b17) (Bug ID: 2166621), hs10(b27) (Bug ID: 2169514), 6u7-rev(b15) (Bug ID: 2169515)
State: 10-Fix Delivered, bug
Priority: 3-Medium
Related Bugs: 6578335, 6611406, 6681372, 6697967, 6752663
Submit Date: 03-JUL-2008
Description
The following description comes from the Evaluation field of 6578335;
this problem was first discovered during the investigation of that bug:

A third bug was found, relating to the handling of
"second-ring overflow" when using parallel concurrent marking,
that is, overflow of the global overflow stack (which itself handles
overflow from the local work queues). The intention was
that second-ring overflow should use the "restart mechanism"
to restart marking from the lowest address that overflowed.
That mechanism was not completely extended to the parallel
concurrent marking case: the restart_addr was not propagated
all the way through to the task that controls
parallel concurrent marking. Because that task's state was
only partially updated, we can, and often will, end up
missing the scan of some of the addresses at the upper end
of the CMS-collected generations. Because second-ring
overflow is very rare in practice, this appears not to have
been detected before (or at least not until the first two bugs
mentioned above were moved out of our way).
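
The restart idea described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not HotSpot code: the heap is an array of reference lists indexed by "address", the marking stack is a single bounded stack, and the class name RestartMarkingSketch, the honorRestart switch, and all field names are hypothetical. With honorRestart set to false the recorded restart address is never acted on, which loosely models the effect of the restart_addr not reaching the marking task.

```java
import java.util.ArrayDeque;
import java.util.BitSet;

// Toy model of marking with a bounded stack and a restart address.
// All names here are hypothetical; this is not the HotSpot implementation.
class RestartMarkingSketch {
    final int[][] refs;                 // refs[addr] = addresses referenced by the object at addr
    final BitSet marked = new BitSet(); // mark bitmap, one bit per address
    final ArrayDeque<Integer> stack = new ArrayDeque<>();
    final int capacity;                 // maximum number of entries the marking stack may hold
    int restartAddr = -1;               // lowest address dropped on overflow, -1 if none
    int restarts = 0;                   // number of restart sweeps performed

    RestartMarkingSketch(int[][] refs, int capacity) {
        this.refs = refs;
        this.capacity = capacity;
    }

    // Mark addr; if the bitmap sweep has already passed it, it must go on the stack.
    void markAndMaybePush(int addr, int finger) {
        if (marked.get(addr)) return;
        marked.set(addr);
        if (addr > finger) return;      // the sweep will reach it later
        if (stack.size() < capacity) {
            stack.push(addr);
        } else {
            // Overflow: drop the entry but remember the lowest dropped address.
            restartAddr = (restartAddr < 0) ? addr : Math.min(restartAddr, addr);
        }
    }

    void scan(int addr, int finger) {
        for (int r : refs[addr]) markAndMaybePush(r, finger);
    }

    // Sweep the bitmap upward from start, scanning every marked object.
    void sweepFrom(int start) {
        for (int finger = marked.nextSetBit(start); finger >= 0;
                 finger = marked.nextSetBit(finger + 1)) {
            scan(finger, finger);
            while (!stack.isEmpty()) scan(stack.pop(), finger);
        }
    }

    BitSet mark(int[] roots, boolean honorRestart) {
        for (int r : roots) marked.set(r);
        sweepFrom(0);
        // Restart mechanism: re-sweep from the lowest overflowed address until none remains.
        while (honorRestart && restartAddr >= 0) {
            int from = restartAddr;
            restartAddr = -1;
            restarts++;
            sweepFrom(from);
        }
        return marked;
    }

    public static void main(String[] args) {
        // Object 5 is the root; 1 -> 3 and 2 -> 4; object 6 is unreachable.
        int[][] heap = { {}, {3}, {4}, {}, {}, {0, 1, 2}, {} };
        System.out.println(new RestartMarkingSketch(heap, 1).mark(new int[] {5}, false));
        // prints {0, 1, 2, 5}: live objects 3 and 4 are missed
        System.out.println(new RestartMarkingSketch(heap, 1).mark(new int[] {5}, true));
        // prints {0, 1, 2, 3, 4, 5}
    }
}
```

In this model an ignored restart address leaves live objects unmarked, which is the kind of outcome that would let a collector reclaim objects that are still reachable.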

The obvious workaround is to switch off parallel concurrent
marking via -XX:-CMSConcurrentMTEnabled.
Posted Date : 2008-07-03 00:44:17.0
Work Around
-XX:-CMSConcurrentMTEnabled.

Otherwise, increasing the size of the marking stack via -XX:CMSMarkStackSize
and -XX:CMSMarkStackSizeMax would reduce the probability of hitting this bug.
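
To confirm that a workaround is actually in effect on a running JVM, the flag values can be read back at runtime. The following is a minimal sketch, assuming a HotSpot JVM on Java 7 or later (for ManagementFactory.getPlatformMXBean); the class name CmsWorkaroundCheck and the helper flagValue are hypothetical. On JVMs that do not have the CMS flags, the helper reports them as unavailable rather than failing.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper that reads back HotSpot flag values at runtime.
class CmsWorkaroundCheck {
    static String flagValue(String name) {
        HotSpotDiagnosticMXBean bean =
            ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        try {
            return bean.getVMOption(name).getValue();
        } catch (IllegalArgumentException e) {
            // getVMOption throws this when the named option does not exist.
            return "unavailable on this JVM";
        }
    }

    public static void main(String[] args) {
        String[] flags = { "CMSConcurrentMTEnabled", "CMSMarkStackSize", "CMSMarkStackSizeMax" };
        for (String flag : flags) {
            System.out.println(flag + " = " + flagValue(flag));
        }
    }
}
```

With the first workaround applied, CMSConcurrentMTEnabled would be expected to read false.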
Evaluation
This bug has existed since 6.0, when parallel concurrent marking was first introduced.
Because it involves not the first but the second level of overflow,
it is much less frequent (other than under really high stress
conditions), so customers are not likely to run into it
very often (I think).
Posted Date : 2008-07-03 00:44:17.0

http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/ebeb6490b814
Posted Date : 2008-08-27 01:19:31.0
Comments
  

Submitted On 24-SEP-2008
manu4ever
We have been seriously impacted by this on a major application. Our workaround has been to set CMSMarkStackSize to a very large value (8M in this case) to avoid the expand() function.

Generally we see the problem as a crash due to dereferencing of an invalid object. However a greater concern is the risk that the reference might now point to some quite different object than the one it originally pointed to. That would be a particularly serious and insidious form of data corruption. Is this a possibility?


