EVALUATION
J2DBench results for d3d, optimized vs non-optimized:
graphics.imaging.tests.drawimage,graphics.imaging.src=unmanaged3ByteBgr opaque,graphics.opts.sizes=1000:
d3d_3byte_noopt: 86677.36757 (var=0.56%) (100.0%)
d3d_3byte_opt: 104417.67068 (var=0.77%) (120.47%)
graphics.imaging.tests.drawimage,graphics.imaging.src=unmanaged3ByteBgr opaque,graphics.opts.sizes=250:
d3d_3byte_noopt: 84648.67617 (var=1.55%) (100.0%)
d3d_3byte_opt: 103302.07501 (var=0.92%) (122.04%)
graphics.imaging.tests.drawimagescaledown,graphics.imaging.src=unmanaged3ByteBgr opaque,graphics.opts.sizes=1000:
d3d_3byte_noopt: 1433.121019 (var=1.24%) (100.0%)
d3d_3byte_opt: 58772.01761 (var=0.6%) (4100.98%)
graphics.imaging.tests.drawimagescaledown,graphics.imaging.src=unmanaged3ByteBgr opaque,graphics.opts.sizes=250:
d3d_3byte_noopt: 995.8624898 (var=0.45%) (100.0%)
d3d_3byte_opt: 57882.63793 (var=1.13%) (5812.31%)
graphics.imaging.tests.drawimagescaleup,graphics.imaging.src=unmanaged3ByteBgr opaque,graphics.opts.sizes=1000:
d3d_3byte_noopt: 1571.229050 (var=2.45%) (100.0%)
d3d_3byte_opt: 234860.55776 (var=0.4%) (14947.57%)
graphics.imaging.tests.drawimagescaleup,graphics.imaging.src=unmanaged3ByteBgr opaque,graphics.opts.sizes=250:
d3d_3byte_noopt: 1092.344644 (var=1.19%) (100.0%)
d3d_3byte_opt: 231106.37876 (var=0.64%) (21156.91%)
graphics.imaging.tests.drawimagetxform,graphics.imaging.src=unmanaged3ByteBgr opaque,graphics.opts.sizes=1000:
d3d_3byte_noopt: 1718.75 (var=1.19%) (100.0%)
d3d_3byte_opt: 126001.58982 (var=0.92%) (7331.0%)
graphics.imaging.tests.drawimagetxform,graphics.imaging.src=unmanaged3ByteBgr opaque,graphics.opts.sizes=250:
d3d_3byte_noopt: 1418.912175 (var=1.17%) (100.0%)
d3d_3byte_opt: 123388.15789 (var=0.89%) (8695.97%)
Results for OGL, including loops which we know are slower
and won't be included in the fix:
graphics.imaging.tests.drawimage,graphics.imaging.src=unmanaged3ByteBgr opaque,graphics.opts.sizes=1000:
ogl_3byte_noopt: 86021.50537 (var=0.56%) (100.0%)
ogl_3byte_opt: 7416.563658 (var=0.33%) (8.62%)
graphics.imaging.tests.drawimage,graphics.imaging.src=unmanaged3ByteBgr opaque,graphics.opts.sizes=250:
ogl_3byte_noopt: 80210.42084 (var=0.81%) (100.0%)
ogl_3byte_opt: 5773.092369 (var=1.17%) (7.2%)
graphics.imaging.tests.drawimagescaledown,graphics.imaging.src=unmanaged3ByteBgr opaque,graphics.opts.sizes=1000:
ogl_3byte_noopt: 3155.048076 (var=0.64%) (100.0%)
ogl_3byte_opt: 2708.667736 (var=0.89%) (85.85%)
graphics.imaging.tests.drawimagescaledown,graphics.imaging.src=unmanaged3ByteBgr opaque,graphics.opts.sizes=250:
ogl_3byte_noopt: 1621.264588 (var=1.91%) (100.0%)
ogl_3byte_opt: 2673.937004 (var=1.0%) (164.93%)
graphics.imaging.tests.drawimagescaleup,graphics.imaging.src=unmanaged3ByteBgr opaque,graphics.opts.sizes=1000:
ogl_3byte_noopt: 5601.659751 (var=1.25%) (100.0%)
ogl_3byte_opt: 10856.45355 (var=0.89%) (193.81%)
graphics.imaging.tests.drawimagescaleup,graphics.imaging.src=unmanaged3ByteBgr opaque,graphics.opts.sizes=250:
ogl_3byte_noopt: 2662.923045 (var=0.6%) (100.0%)
ogl_3byte_opt: 10749.11347 (var=1.56%) (403.66%)
graphics.imaging.tests.drawimagetxform,graphics.imaging.src=unmanaged3ByteBgr opaque,graphics.opts.sizes=1000:
ogl_3byte_noopt: 3942.973523 (var=0.41%) (100.0%)
ogl_3byte_opt: 5826.645264 (var=1.38%) (147.77%)
graphics.imaging.tests.drawimagetxform,graphics.imaging.src=unmanaged3ByteBgr opaque,graphics.opts.sizes=250:
ogl_3byte_noopt: 2360.369609 (var=1.32%) (100.0%)
ogl_3byte_opt: 5833.668139 (var=0.84%) (247.15%)
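For reference, the last column of each result line is the score expressed as a percentage of the corresponding noopt basis score. A minimal sketch of that arithmetic follows; the class and method names are hypothetical and are not part of J2DBench.

```java
// Hypothetical helper, not J2DBench code: reproduces the "% of basis" column.
class PercentOfBasis {
    /** Returns score as a percentage of basis, rounded to two decimals. */
    static double percentOfBasis(double score, double basis) {
        return Math.round(score / basis * 10000.0) / 100.0;
    }

    public static void main(String[] args) {
        // d3d drawimage, sizes=1000: opt vs noopt
        System.out.println(percentOfBasis(104417.67068, 86677.36757));
        // ogl drawimage, sizes=1000: opt vs noopt
        System.out.println(percentOfBasis(7416.563658, 86021.50537));
    }
}
```

Values above 100% mean the optimized build scored higher than the basis, values below 100% mean it scored lower.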
|
EVALUATION
Ok, here's the data with the latest version of the fix (full
J2DBench results files attached):
I have extended J2DBench with a set of
drawImage+touch tests. Currently one has to
specify accthreshold=0 to properly test texture
uploads (SwToTexture); otherwise we'd be
testing the 'unmanaged image' case (SwToSurface).
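The 'touched on each iteration' pattern can be sketched roughly as below. This is only an illustration, not the J2DBench test: the class and method names are hypothetical, and the destination here is a plain BufferedImage so the sketch runs anywhere. To exercise the accelerated pipelines the destination would need to be an accelerated surface (for example a VolatileImage or the screen), with the JVM started with -Dsun.java2d.accthreshold=0.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

// Hypothetical sketch of a drawImage+touch loop; not the J2DBench test itself.
class TouchedBlitSketch {
    /** Creates a solid red 3ByteBgr source image of the given size. */
    static BufferedImage newSource(int size) {
        BufferedImage img =
            new BufferedImage(size, size, BufferedImage.TYPE_3BYTE_BGR);
        Graphics2D g = img.createGraphics();
        g.setColor(Color.RED);
        g.fillRect(0, 0, size, size);
        g.dispose();
        return img;
    }

    /** Modifies the source before every blit, then draws it unscaled. */
    static void touchAndDraw(Graphics2D dst, BufferedImage src, int reps) {
        for (int i = 0; i < reps; i++) {
            // "Touch": rewrite one pixel with its own value so the image
            // is modified without changing its visible contents.
            src.setRGB(0, 0, src.getRGB(0, 0));
            dst.drawImage(src, 0, 0, null);
        }
    }

    public static void main(String[] args) {
        BufferedImage src = newSource(250);
        // Assumption: a software destination, used only to keep this runnable.
        BufferedImage dstImg =
            new BufferedImage(250, 250, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = dstImg.createGraphics();
        touchAndDraw(g, src, 100);
        g.dispose();
        System.out.println(Integer.toHexString(dstImg.getRGB(5, 5)));
    }
}
```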
So I ran the benchmarks, and now most variants show
improvement, and the results are consistent between
ogl and d3d.
I tested:
  unmanaged scale/blit/tx
  'managed, touched on each iteration' scale/blit/tx
d3d_3byte_noopt:
Number of tests: 32
Overall average: 1371683.1740006404
Best spread: 0.04% variance
Worst spread: 8.94% variance
(Basis for results comparison)
d3d_3byte_opt:
Number of tests: 32
Overall average: 1434948.8126830158
Best spread: 0.0% variance
Worst spread: 1.58% variance
Comparison to basis:
Best result: 20532.49% of basis
Worst result: 99.58% of basis
Number of wins: 24
Number of ties: 8
Number of losses: 0
ogl_3byte_noopt:
Number of tests: 32
Overall average: 557509.3068086591
Best spread: 0.04% variance
Worst spread: 1.31% variance
(Basis for results comparison)
ogl_3byte_opt:
Number of tests: 32
Overall average: 659663.2264295145
Best spread: 0.04% variance
Worst spread: 4.64% variance
Comparison to basis:
Best result: 11709.7% of basis
Worst result: 99.96% of basis
Number of wins: 24
Number of ties: 8
Number of losses: 0
The ties are "managed, untouched" cases, where we only pay
the penalty of missing loops once when uploading to the texture
for the first time.
|
EVALUATION
The same applies to the OpenGL pipeline, but with a twist.
Adding these loops only helps in the scaling case; straight
blits are much slower in OGL because we have to upload the data
scan line by scan line due to possible alignment issues (see
bug 6207877).
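As a rough illustration of how a 3-byte pixel format can run into row alignment trouble: the evaluation does not say which alignment is involved, so the sketch below assumes a 4-byte row alignment (the default value of OpenGL's GL_UNPACK_ALIGNMENT). The class and method names and the example widths are hypothetical.

```java
// Hypothetical sketch; the 4-byte alignment is an assumption, not taken
// from the bug report.
class RowAlignmentSketch {
    /** Number of bytes in one tightly packed row of pixels. */
    static int rowBytes(int widthPixels, int bytesPerPixel) {
        return widthPixels * bytesPerPixel;
    }

    /** True if every tightly packed row starts on an aligned boundary. */
    static boolean rowsAreAligned(int widthPixels, int bytesPerPixel,
                                  int alignment) {
        return rowBytes(widthPixels, bytesPerPixel) % alignment == 0;
    }

    public static void main(String[] args) {
        // 3 bytes per pixel: only some widths give rows that are a
        // multiple of 4 bytes long.
        System.out.println(rowsAreAligned(1000, 3, 4)); // 3000 bytes per row
        System.out.println(rowsAreAligned(250, 3, 4));  // 750 bytes per row
        // 4 bytes per pixel: rows are always a multiple of 4 bytes long.
        System.out.println(rowsAreAligned(250, 4, 4));
    }
}
```

Under that assumption, a 4-byte-per-pixel format never hits the problem, while a 3-byte format does for any width that is not a multiple of 4.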
Here's some performance data for the non-optimized vs. optimized (with the
loops added) cases. The benchmark tests blit/scale of an unmanaged 3ByteBgr image.
d3d:
graphics.imaging.tests.drawimage,graphics.opts.sizes=1000:
3byte_noopt: 86301.92230 (var=0.49%) (100.0%)
3byte_opt: 104693.14079 (var=0.73%) (121.31%)
graphics.imaging.tests.drawimage,graphics.opts.sizes=250:
3byte_noopt: 83373.93846 (var=1.73%) (100.0%)
3byte_opt: 103826.17728 (var=0.79%) (124.53%)
graphics.imaging.tests.drawimagescaleup,graphics.opts.sizes=1000:
3byte_noopt: 1527.494908 (var=2.59%) (100.0%)
3byte_opt: 235402.19134 (var=0.56%) (15411.0%)
graphics.imaging.tests.drawimagescaleup,graphics.opts.sizes=250:
3byte_noopt: 1087.896986 (var=1.48%) (100.0%)
3byte_opt: 233458.12958 (var=0.82%) (21459.58%)
Summary:
3byte_noopt:
Number of tests: 4
Overall average: 43072.81316563192
Best spread: 0.49% variance
Worst spread: 2.59% variance
(Basis for results comparison)
3byte_opt:
Number of tests: 4
Overall average: 169344.90975135576
Best spread: 0.56% variance
Worst spread: 0.82% variance
Comparison to basis:
Best result: 21459.58% of basis
Worst result: 121.31% of basis
Number of wins: 4
Number of ties: 0
Number of losses: 0
ogl:
graphics.imaging.tests.drawimage,graphics.opts.sizes=1000:
3byte_noopt: 85970.39250 (var=0.78%) (100.0%)
3byte_opt: 7409.440175 (var=0.95%) (8.62%)
graphics.imaging.tests.drawimage,graphics.opts.sizes=250:
3byte_noopt: 80193.14868 (var=5.45%) (100.0%)
3byte_opt: 5703.422053 (var=21.79%) (7.11%)
graphics.imaging.tests.drawimagescaleup,graphics.opts.sizes=1000:
3byte_noopt: 5492.270138 (var=1.89%) (100.0%)
3byte_opt: 319366.47955 (var=0.89%) (5814.84%)
graphics.imaging.tests.drawimagescaleup,graphics.opts.sizes=250:
3byte_noopt: 2676.494431 (var=6.58%) (100.0%)
3byte_opt: 313342.59059 (var=4.39%) (11707.2%)
Summary:
3byte_noopt:
Number of tests: 4
Overall average: 43583.076440138895
Best spread: 0.78% variance
Worst spread: 6.58% variance
(Basis for results comparison)
3byte_opt:
Number of tests: 4
Overall average: 161455.48309378154
Best spread: 0.89% variance
Worst spread: 21.79% variance
Comparison to basis:
Best result: 11707.2% of basis
Worst result: 7.11% of basis
Number of wins: 2
Number of ties: 0
Number of losses: 2
|