EVALUATION
Email from James McIlree <###@###.###>
To: <###@###.###>
Sent: Monday, August 05, 2002 1:12 AM
Subject: Potential problem for pause-time collectors
>
> I've spent some time looking at GenerateOopMap
> performance over the weekend. The goal was to track down
> a performance problem for iPlanet, but I've found some
> interesting data.
>
> I did a fairly quick hack to "CompileTheWorld"
> to cause it to OopMapTheWorld instead, and print out
> how long it takes to generate an OopMap for any given
> method.
>
> I ran against rt.jar for 1.4.0_02 and sorted
> the results. Some of the more interesting entries:
>
> [0.0101962 sun/io/ByteToCharCp949 : <clinit> (17663)]
> [0.0104368 javax/swing/plaf/basic/BasicLookAndFeel : initComponentDefaults (13763)]
> [0.0112697 sun/text/resources/DateFormatZoneData : getContents (14119)]
> [0.0169934 javax/swing/text/html/parser/Parser : parseAttributeSpecificationList (805)]
> [0.0186694 sun/io/ByteToCharCp933 : <clinit> (32638)]
> [0.0345885 sun/io/CharToByteCp949C : <clinit> (65393)]
> [0.0355946 sun/io/CharToByteCp970 : <clinit> (65393)]
> [0.0359730 sun/io/CharToByteCp933 : <clinit> (65389)]
> [0.0360463 sun/io/CharToByteCp949 : <clinit> (65393)]
> [0.0402428 java/util/HashMap : eq (19)]
> [0.0402560 sun/java2d/pipe/PixelToShapeConverter : fillRect (24)]
> [0.0801238 java/awt/image/Raster : getSample (25)]
> [0.0820165 sun/reflect/MethodAccessorGenerator : emitInvoke (1012)]
> [0.1201715 java/awt/font/TextLayout : equals (29)]
> [0.1203534 java/nio/CharBuffer : hashCode (38)]
> [0.1911322 sun/security/provider/certpath/Builder : targetDistance (741)]
>
> The times were generated on a 450MHz US2.
>
> Note that hitting just *one* targetDistance method during
> a stack crawl will blow CMS's entire time budget for Nortel. A few
> of the cheaper methods can still break the bank if they happen at
> the same time.
>
> Clearly, pause time oriented collectors are going to need
> a higher performance solution.
>
> James M
Will look into this for post-tiger.
###@###.### 2004-08-02
I have been working on this since before last week. I have attached a
one-pager that outlines some of the alternatives that we've discussed for
solving this problem.
Because this is an escalation, the idea of saving the oopmap information generated for large methods will be investigated first (it'll be easier to backport).
Fixed 4734748: Pathologically slow oopmap generation
The interpreter is modified to write tags to the interpreter stack in order
to find oopmaps. See one pager at:
http://j2se.east/jruntime/task_templates/Mustang/tagged_stack.1pager.html
Latest Webrev:
http://jruntime.east/~coleenp/webrev/tag4
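As a rough illustration of the tagged stack idea described above, the sketch below keeps a tag next to every stack value so that a collector can find references in one linear pass instead of generating an oopmap for the method. It is not the HotSpot code: the tag values, the TaggedSlot layout and all function names are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical tag encoding; the real values are not given in this report.
enum class Tag : std::uintptr_t { Value = 0, Reference = 1 };

// One interpreter stack element: a payload word plus a tag word (assumed layout).
struct TaggedSlot {
    std::uintptr_t payload;
    Tag tag;
};

using TaggedStack = std::vector<TaggedSlot>;

// Interpreter side: every push writes the tag along with the value.
inline void push_int(TaggedStack& stack, std::intptr_t v) {
    stack.push_back(TaggedSlot{static_cast<std::uintptr_t>(v), Tag::Value});
}

inline void push_ref(TaggedStack& stack, void* p) {
    stack.push_back(TaggedSlot{reinterpret_cast<std::uintptr_t>(p), Tag::Reference});
}

// Interpreter side: a pop that expects a reference checks the tag first.
inline void* pop_ref(TaggedStack& stack) {
    assert(!stack.empty() && stack.back().tag == Tag::Reference);
    void* p = reinterpret_cast<void*>(stack.back().payload);
    stack.pop_back();
    return p;
}

// GC side: one pass over the slots; no bytecode analysis is needed.
// The visitor receives the payload word by reference so it could update it.
template <typename Visitor>
std::size_t visit_references(TaggedStack& stack, Visitor visit) {
    std::size_t found = 0;
    for (TaggedSlot& slot : stack) {
        if (slot.tag == Tag::Reference) {
            visit(slot.payload);
            ++found;
        }
    }
    return found;
}
```

The cost of the scan depends only on the number of slots in the frame, not on the size or shape of the method's bytecode, which is the property the fix is after.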
The tagged stack interpreter is controlled by the switch TaggedStackInterpreter.
The current plan is for this switch to default to false and to document it
for users with large JSPs who notice pathological GC slowdowns.
Much of the code for the switch is common, so I'm less worried about bit rot
under a non-default switch than about users complaining that they're now
getting more StackOverflowErrors with mustang.
Inlined functions in interpreter_<cpu>.hpp were added to support tags.
A lot of changes were of the form:
wordSize => Interpreter::StackElementSize()
in places where the code assumed that a Java stack element was one wordSize.
Steve Goldman suggested the revised naming for the interpreter expression
stack and local items.
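To show why the substitution above matters, here is a minimal sketch of offset arithmetic that goes through an element-size function instead of a hard-coded word size. The names are invented for the example, and the layout (one extra tag word per element when tagging is on) is an assumption, not something this report states.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for the TaggedStackInterpreter switch; hypothetical type.
struct InterpreterConfig {
    bool tagged_stack;
};

constexpr std::size_t kWordSize = sizeof(void*);

// Assumed layout: value word only, or value word plus tag word.
constexpr std::size_t stack_element_words(InterpreterConfig c) {
    return c.tagged_stack ? 2 : 1;
}

constexpr std::size_t stack_element_size(InterpreterConfig c) {
    return stack_element_words(c) * kWordSize;
}

// Byte offset of local number n from the start of the locals area.
// Code that wrote n * kWordSize here would be wrong with tagging on.
inline std::size_t local_offset_in_bytes(InterpreterConfig c, std::size_t n,
                                         std::size_t max_locals) {
    assert(n < max_locals);
    return n * stack_element_size(c);
}

// Bytes needed for the locals and expression stack of one frame.
constexpr std::size_t frame_slots_in_bytes(InterpreterConfig c,
                                           std::size_t max_locals,
                                           std::size_t max_stack) {
    return (max_locals + max_stack) * stack_element_size(c);
}
```

Under this assumed layout the locals and expression stack of every interpreted frame take twice the space when tagging is on, which is consistent with the concern about more StackOverflowErrors.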
The other major changes were to write/check tags in interp_masm_<cpu>.cpp,
skip them in interpreterRT_<cpu>.cpp and sharedRuntime_<cpu>.cpp,
and to gc them in frame.cpp.
Performance: The Alacrity numbers are mixed but not bad. See:
http://alacrity.sfbay/query/ones.jsp?VersionSelector_baseline=1.6.0%7Cclean-rt-base_124500&VersionSelector=1.6.0%7Ctaggedstack_125110&esum=Summary&pval=0.01&type_baseline=o&type_build=o
The microbenchmark submitted with bug 5049261 shows a speedup with nested
try/finally blocks. See attached.
sample from before:
[GC 2153K->105K(3520K), 0.0896927 secs]
sample from tagged stack interpreter:
[GC 2153K->105K(3520K), 0.0002645 secs]
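For comparing the two samples above, a small helper like the following can pull the pause time out of each log line and compute the ratio. It is only a sketch: the struct and function names are made up for this example, and the format string covers only the line shape shown in this report.

```cpp
#include <cassert>
#include <cstdio>

// Fields of one "[GC before->after(capacity), seconds secs]" line.
struct GcSample {
    long before_kb;
    long after_kb;
    long capacity_kb;
    double seconds;
    bool ok;  // true when all four fields were parsed
};

inline GcSample parse_gc_line(const char* line) {
    GcSample s{0, 0, 0, 0.0, false};
    int matched = std::sscanf(line, "[GC %ldK->%ldK(%ldK), %lf secs]",
                              &s.before_kb, &s.after_kb, &s.capacity_kb,
                              &s.seconds);
    s.ok = (matched == 4);
    return s;
}

// How many times longer the first pause is than the second.
inline double pause_ratio(const GcSample& before, const GcSample& after) {
    assert(after.seconds > 0.0);
    return before.seconds / after.seconds;
}
```

Applied to the two lines above, the heap figures are identical and the pause is about 339 times shorter with the tagged stack interpreter (0.0896927 / 0.0002645).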
Fix checked in 8/8/2005 for mustang. I don't think that this will be easy or even possible to backport to earlier releases because the argument marshalling between the compilers and interpreted code was rewritten in mustang, and this relies on the new interface.