EVALUATION
Attaching four more J2DBench comparison files to this bug report to show
that the benefits are just as compelling on other platforms.
The solaris-sparc results were gathered on a SunBlade 2000, 900MHz USIII, XVR-1200.
The windows-i586 results were gathered on a 2x 2.8 GHz P4, ATI Radeon 9500 Pro,
Catalyst 6.4 drivers. (The results look especially good here because we are
essentially working around the OGLMaskFill operation, which is
pathologically slow on ATI hardware on Windows due to a glDrawPixels() bug
that has been filed with, but not yet fixed by, ATI.)
|
EVALUATION
As described above, we can add three specialized subclasses of OGLMaskFill:
- OGLMaskFill.Solid
- OGLMaskFill.Gradient
- OGLMaskFill.Texture
This mirrors the existing state of affairs in OGLRenderer. We can then refactor
the code in OGLRenderer.Gradient/Texture so that the respective OGLMaskFill
subclasses can easily call into the enable/disable*Paint() routines to leverage
that existing setup code. Finally, we can modify our existing native implementation
of OGLMaskFill to do the appropriate multitexturing setup, so that we effectively
modulate the paint fragments (autogenerated on texture unit 0) with the coverage
values from the mask tile (provided on texture unit 1). The result is in turn
modulated with the current color value (which contains the extra alpha value).
Perhaps a picture would help, where:
  Ea = extra alpha
  Px = gradient/texture paint color (generated for each fragment)
  Cx = coverage value from mask tile
  Rx = resulting color/alpha component

                    A    R    G    B
  primary color     Ea   Ea   Ea   Ea   (modulated with...)
  texture unit 0    Pa   Pr   Pg   Pb   (modulated with...)
  texture unit 1    Ca   Ca   Ca   Ca
  ---------------------------------------
  resulting color   Ra   Rr   Rg   Rb

In other words, Ra=Ea*Pa*Ca and so on...
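
To make the arithmetic in the picture above concrete, here is a minimal Java sketch of the per-fragment modulation. It is only an illustration of the math, not the native OGLMaskFill code: the class and method names (ModulateSketch, modulate) are hypothetical, components are assumed to be normalized floats in [0,1], and the real work would be done by the OpenGL texture environment rather than on the CPU.

```java
import java.util.Arrays;

// Hypothetical illustration only; not part of the OGL pipeline sources.
class ModulateSketch {

    /**
     * Multiplies the primary color (extra alpha in every component) by the
     * paint color from texture unit 0 and the coverage value from texture
     * unit 1. Arrays are ordered {A, R, G, B}; all values assumed in [0,1].
     */
    static float[] modulate(float extraAlpha, float[] paint, float coverage) {
        float[] result = new float[4];
        for (int i = 0; i < 4; i++) {
            // e.g. Ra = Ea * Pa * Ca, Rr = Ea * Pr * Ca, and so on
            result[i] = extraAlpha * paint[i] * coverage;
        }
        return result;
    }

    public static void main(String[] args) {
        float[] paint = {1.0f, 0.5f, 0.25f, 0.0f}; // {Pa, Pr, Pg, Pb}
        float[] r = modulate(0.5f, paint, 0.5f);   // Ea = 0.5, Ca = 0.5
        System.out.println(Arrays.toString(r));    // prints [0.25, 0.125, 0.0625, 0.0]
    }
}
```

A fragment with zero coverage comes out fully transparent regardless of the paint, which is the point of feeding the mask tile through the second texture unit.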
I'm attaching the full J2DBench results for solaris-i586 (executed on a Sun W2100z,
2x 2.0 GHz Opteron, Nvidia Quadro FX 1100, 87.11 drivers, Mustang b86). One file
(aapaint.solaris-i586.ogl-to-ogl.comp) compares the new OGL performance to the
old. Here is the summary:
ogl.old:
  Number of tests:  72
  Overall average:  224132.19274302718
  Best spread:      0.04% variance
  Worst spread:     12.44% variance
  (Basis for results comparison)
ogl.new:
  Number of tests:  72
  Overall average:  235543.05038378254
  Best spread:      0.04% variance
  Worst spread:     12.18% variance
  Comparison to basis:
    Best result:      2542.97% of basis
    Worst result:     92.45% of basis
    Number of wins:   44
    Number of ties:   20
    Number of losses: 8
The other file (aapaint.solaris-i586.x11-to-ogl.comp) compares both the old and
new OGL performance against the default (X11-based) pipeline as a baseline.
In summary:
Summary:
  x11:
    Number of tests:  72
    Overall average:  43299.88938471539
    Best spread:      0.04% variance
    Worst spread:     83.13% variance
    (Basis for results comparison)
  ogl.old:
    Number of tests:  72
    Overall average:  224132.19274302718
    Best spread:      0.04% variance
    Worst spread:     12.44% variance
    Comparison to basis:
      Best result:      63904.51% of basis
      Worst result:     17.15% of basis
      Number of wins:   40
      Number of ties:   0
      Number of losses: 32
  ogl.new:
    Number of tests:  72
    Overall average:  235543.05038378254
    Best spread:      0.04% variance
    Worst spread:     12.18% variance
    Comparison to basis:
      Best result:      68439.2% of basis
      Worst result:     27.0% of basis
      Number of wins:   54
      Number of ties:   5
      Number of losses: 13
The executive summary is (compared to OGL performance in b85):
- up to 25x improvement for antialiased TexturePaint operations
- 2-7x improvement for antialiased GradientPaint operations
- no significant impact on existing accelerated operations
|