EVALUATION
The regression was introduced by the changes for 6417901 (integrated in
6u4 and 7b11). We have verified that the change described in the Suggested
Fix section resolves the problem for both customers who reported the issue.
|
|
|
SUGGESTED FIX
JPRT: [hotspotwest] job notification - success with job 2008-05-06-224147.ysr.hg_gc_fixes
JPRT Job ID: 2008-05-06-224147.ysr.hg_gc_fixes
JPRT System Used: hotspotwest
JPRT Version Used: 1.0: (2008-04-29) Case of the Misguided Missile
[c2c0735e7f00]
Job URL:
http://prt-web.sfbay.sun.com/archive/2008/05/2008-05-06-224147.ysr.hg_gc_fixes
Job ARCHIVE:
/net/prt-archiver.sfbay/data/jprt/archive/2008/05/2008-05-06-224147.ysr.hg_gc_fixes
User: ysr
Email: ###@###.###
Release: jdk7
Job Source: Mercurial:
/net/spot/workspaces/ysr/hg_gc_fixes/{make,src,agent}
Parent: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot
Push Parent:
ssh://###@###.###/jdk7/hotspot-gc-gate/hotspot
CR List: 6662086
Changeset:
http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/b5489bb705c9
File List: {.}
Exclude List: {build}
Command Line: jprt submit -m jprt.txt -noforest
Job submitted at: Tuesday May 6, 2008 15:41:49 PDT
Total time in queue: 2h 02m 35s
Job started at: Tuesday May 6, 2008 15:43:48 PDT
Job integrated at: Tuesday May 6, 2008 17:44:00 PDT
Job finished at: Tuesday May 6, 2008 17:44:24 PDT
Job run time: 2h 35s
Job state: success
Job flags: SYNC INTEGRATE PRECIOUS
Bundles: USE: jprt install 2008-05-06-224147.ysr.hg_gc_fixes
NOTE: Zip files containing exe or dll files on windows have had problems with
execute permissions. You may need to 'chmod a+x' the windows exe and
dll files.
User Comments:
6662086: 6u4+, 7b11+: CMS never clears referents when
-XX:+ParallelRefProcEnabled
Summary: Construct the relevant CMSIsAliveClosure used by CMS during parallel
reference processing with the correct span. It had incorrectly been
constructed with an empty span, a regression introduced in 6417901.
Reviewed-by: jcoomes
|
|
|
EVALUATION
Verified that only CMS was affected by the issue.
Reworked some of the code and added asserts to reduce the likelihood of
similar inadvertent regressions in the future.
|
|
|
EVALUATION
Based on data collected by ###@###.### using an instrumented
JVM that he built, and on visual inspection of the code, the problem
appears to be that the CMSIsAliveClosure passed into the work method
does not have a correctly initialized _span. We'll also need to check
whether the same or a similar problem exists in the other collectors.
In the case of CMS, when -XX:+ParallelRefProcEnabled is set, this was
causing the _is_alive closure to declare that all referents were
strongly reachable even when they were not.
The fix is to pass the _span to the CMSIsAliveClosure at construction
time so that it is correctly initialized. See the Suggested Fix section.
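
The failure mode described above can be illustrated with a small,
self-contained C++ model. This is a sketch only, not the HotSpot source:
the MemRegion and IsAliveClosure types, the use of plain integers as
addresses, and the std::set standing in for the mark bitmap are all
assumptions made for illustration. It assumes the closure treats any
address outside its span as alive, which is consistent with the symptom
reported here: with an empty span nothing is ever "inside", so every
referent is reported as reachable and is never cleared.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Hypothetical stand-in for a heap region: [start, end). Not the HotSpot type.
struct MemRegion {
    std::size_t start;
    std::size_t end;
    MemRegion() : start(0), end(0) {}  // default-constructed region is empty
    MemRegion(std::size_t s, std::size_t e) : start(s), end(e) {
        assert(s <= e);
    }
    bool is_empty() const { return start == end; }
    bool contains(std::size_t addr) const { return addr >= start && addr < end; }
};

// Hypothetical model of an is-alive closure used during reference processing.
// Assumption: objects outside the span are not managed by this collector and
// are therefore treated as alive; objects inside the span are alive only if
// they have been marked.
class IsAliveClosure {
public:
    IsAliveClosure(const MemRegion& span, const std::set<std::size_t>& marked)
        : _span(span), _marked(marked) {}

    bool is_alive(std::size_t addr) const {
        return !_span.contains(addr) || _marked.count(addr) != 0;
    }

private:
    MemRegion _span;                 // region this closure is responsible for
    std::set<std::size_t> _marked;   // stand-in for the collector's mark bitmap
};
```

In this model, a closure built with a default (empty) MemRegion reports an
unmarked address such as 200 as alive, whereas a closure built with the
span [0, 1000) reports the same address as dead, which is what allows the
referent to be cleared.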
|
|
|
WORK AROUND
Do not use -XX:+ParallelRefProcEnabled
(i.e. revert to the default, which disables parallel reference processing).
However, this might adversely affect CMS remark pause times in
applications that make heavy use of Reference objects (including,
for instance, Finalizers) and that run on large MP boxes.
|
|
|
EVALUATION
Added "when -XX:+ParallelRefProcEnabled" to the synopsis, based on further
tests at the customer site. The regression appears to have started with
6417901, where the parallel reference processing code was extensively
reworked to extend it to collectors other than CMS. We are now
investigating the root cause of the problem.
|
|
|