SUGGESTED FIX
Should make a single warning per CMS cycle.
Note that the messages themselves are harmless in the
sense that the GC proceeds correctly despite the
reported resoure constraints (but possibly with some
minor performance impact in terms of slightly longer pause
times or slightly longer collection cycles during such events).
Also, in the case of such overflow during remark (1-thread
rescan or 1-thread weak ref tracing), should expand
marking stack so as to reduce possibility of overflow
in the future, thus reducing the performance impact.
Will try to target this fix for Tiger-RC, if possible,
since the current warning appears to be causing some
customer confusion.
------
Here are the relevant diffs in a Tiger-based workspace:
------- concurrentMarkSweepGeneration.hpp -------
429a430,439
> // Counters in support of marking stack / work queue overflow handling:
> // a non-zero value indicates certain types of overflow event during
> // the most recent previous CMS cycle and could lead to stack
> // resizing efforts at an opprotune future time.
> size_t _ser_pmc_preclean_ovflw;
> size_t _ser_pmc_remark_ovflw;
> size_t _par_pmc_remark_ovflw;
> size_t _ser_kac_ovflw;
> size_t _par_kac_ovflw;
>
------- concurrentMarkSweepGeneration.cpp -------
438c438,443
< _stats(cmsGen)
---
> _stats(cmsGen),
> _ser_pmc_preclean_ovflw(0),
> _ser_pmc_remark_ovflw(0),
> _par_pmc_remark_ovflw(0),
> _ser_kac_ovflw(0),
> _par_kac_ovflw(0)
3439a3445,3467
>
> // If we encountered any (marking stack / work queue) overflow
> // events during the current CMS cycle, take appropriate
> // remedial measures, where possible, so as to try and avoid
> // recurrence of that condition.
> assert(_markStack.isEmpty(), "Zero grey objects");
> size_t ser_ovflw = _ser_pmc_remark_ovflw + _ser_pmc_preclean_ovflw +
> _ser_kac_ovflw;
> if (ser_ovflw > 0) {
> if (PrintGCDetails) {
> gclog_or_tty->print_cr("Marking stack overflow (benign) "
> "(pmc_pc=%d, pmc_rm=%d, kac=%d)",
> _ser_pmc_preclean_ovflw, _ser_pmc_remark_ovflw,
> _ser_kac_ovflw);
> }
> _markStack.expand();
> }
> if (PrintGCDetails && (_par_pmc_remark_ovflw > 0 || _par_kac_ovflw > 0)) {
> gclog_or_tty->print_cr("Work queue overflow (benign) "
> "(pmc_rm=%d, kac=%d)",
> _par_pmc_remark_ovflw, _par_kac_ovflw);
> }
>
4735c4763
< warning("hit CMSMarkStackMax limit");
---
> warning("Hit CMSMarkStackMax limit");
4756c4784
< warning("failed to expand marking stack from %d K to % d K",
---
> warning("Failed to expand marking stack from %d K to % d K",
5244c5272
< warning("Marking stack overflow during preclean");
---
> _collector->_ser_pmc_preclean_ovflw++;
5249c5277
< warning("Marking stack overflow during re-mark");
---
> _collector->_ser_pmc_remark_ovflw++;
5281c5309
< warning("Work queue overflow during re-mark");
---
> _collector->_par_pmc_remark_ovflw++; // imprecise OK: no need to CAS
5936d5963
< warning("Mark stack overflow in CMSKeepAliveClosure");
5937a5965
> _collector->_ser_kac_ovflw++;
5994c6022
< warning("Work queue overflow in CMSInnerParMarkAndPushClosure");
---
> _collector->_par_kac_ovflw++;
###@###.### 2004-06-07: Final fix can be obtained from
the following location; put back to Tiger b56:
Event: putback-to
Parent workspace: /net/jano.sfbay/export/disk05/hotspot/ws/main/gc_baseline
(jano.sfbay:/export/disk05/hotspot/ws/main/gc_baseline)
Child workspace: /prt-workspaces/20040604170326.ysr.tiger/workspace
(prt-web:/prt-workspaces/20040604170326.ysr.tiger/workspace)
User: ysr
Comment:
---------------------------------------------------------
Original workspace: neeraja:/net/jano.sfbay/export/hotspot/users1/ysr/tiger
Submitter: ysr
Archived data: /net/prt-archiver.sfbay/export2/archived_workspaces/main/gc_baseline/2004/20040604170326.ysr.tiger/
Webrev: http://analemma.sfbay.sun.com/net/prt-archiver.sfbay/export2/archived_workspaces/main/gc_baseline/2004/20040604170326.ysr.tiger/workspace/webrevs/webrev-2004.06.04/index.html
Fixed 5053106: A lot of warnings "Mark stack overflow in CMSKeepAliveClosure" is printed to console
http://analemma.sfbay/net/jano/export/disk05/hotspot/users/ysr/tiger/webrev
Previously each overflow event generated a message, even without any
remedial action to prevent its future recurrence. In some cases,
this would generate a lot of output which was both off-putting
as well as confusing to customers (several external customers
reported this confusion in the past several weeks; this bug is new
in Tiger).
The changes are:
. an event (overflow, stack expansion/failure) is logged to a counter
. per each cycle, the counters are checked and as
necessary a single message issued and remedial action
taken where appropriate
. the message is issued only when PrintGCDetails or
PrintCMSStatictis (as appropriate) are set
. the message goes to the GC log
. the message is prefixed with "benign" so as not to cause
customers to panic upon seeing such a message
We requested this fix in Tiger RC because:
. it is very low risk (risk & change limited to CMS)
. it fixes a performance anomaly (albeit a rare case)
. it reduces the possibility of customer confusion
Fixed some print format control strings (%d to SIZE_FORMAT
where appropriate).
Reviewed by: John Coomes, Paul Hohensee, Jon Masamitsu
Approved by: (low risk) core team
Fix verified (y/n): y
Verification testing:
. product/debug CMS with ridiculously small marking stack size
. debug CMS with CMSMarkStackOverflowALot
Other testing:
. PRT
. spec
. refworkload
Files:
update: src/share/vm/memory/concurrentMarkSweepGeneration.cpp
update: src/share/vm/memory/concurrentMarkSweepGeneration.hpp
Examined files: 3215
Contents Summary:
2 update
3213 no action (unchanged)
|