At first this looked like a problem with our VolatileImage punting mechanism.
We have a mechanism inside of VolatileImage that detects when lots of reads
are happening on that image (reads are a Bad Thing when an image is cached
in VRAM) and punts that image into system memory when the number of pixels
read surpasses some threshold. If the mechanism were not working correctly,
we could be stuck with a back buffer in VRAM that is being used for lots
of read operations, causing the kind of bad performance that we are seeing.
But, in fact, our image punting mechanism is working fine; we detect the
read-modify-write situation at work in this app and correctly punt the
image to live forever in system memory.
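
The punting rule described above can be sketched roughly as follows. This is
only an illustration, not the actual VolatileImage code: the class name
PuntTracker, its methods, and the threshold value are all hypothetical, and
the real threshold is not stated in this report.

```java
/** Hypothetical sketch: counts pixels read from a VRAM-resident image. */
final class PuntTracker {
    // Assumed tunable; the real threshold is not given in this report.
    private final long thresholdPixels;
    private long pixelsRead;
    private boolean punted;

    PuntTracker(long thresholdPixels) {
        this.thresholdPixels = thresholdPixels;
    }

    /** Records a read of width x height pixels and punts once the total surpasses the threshold. */
    void recordRead(int width, int height) {
        if (punted) {
            return; // once punted, the image stays in system memory
        }
        pixelsRead += (long) width * height;
        if (pixelsRead > thresholdPixels) {
            punted = true;
        }
    }

    boolean isPunted() {
        return punted;
    }
}
```

In this sketch the punt is one-way, matching the "live forever in system
memory" behavior described above.
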
The problem is that the two non-VolatileImage objects in this app end up
living in VRAM. And since we copy from those two images on every frame,
our performance problem is due to reading from those VRAM images and not
anything to do with the Swing back buffer itself.
The bug is in how we punt the image and how we determine when we can
accelerate operations such as copies between images. Currently, the
VolatileImage object lives as a DirectDraw surface on Windows. When we
punt the image into system memory, we do this within DirectDraw; it is still
represented as a DDraw surface, but just happens not to live in VRAM.
The bg and fg images are loaded by ImageIcon and thus live in system
memory (as BufferedImage objects). But we detect the case where we copy from
a software image to a DirectDraw image and attempt to accelerate this
situation by creating a DDraw/VRAM-cached version of the image under the hood.
Subsequently, copies from this type of image to any DirectDraw surface
(including the screen) will come from the cached version of the image
instead of the original system memory version.
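
As a rough model of this caching behavior (a sketch under assumptions, not
the real implementation): CachedSourceImage and its members are hypothetical
names, and a plain int array stands in for the DDraw/VRAM-cached surface.

```java
/** Hypothetical model of a software image with a lazily created accelerated copy. */
final class CachedSourceImage {
    private final int[] systemPixels; // original system-memory data
    private int[] vramCopy;           // stand-in for the DDraw/VRAM-cached version

    CachedSourceImage(int[] pixels) {
        this.systemPixels = pixels.clone();
    }

    /** Returns the pixel store a copy would read from for the given destination type. */
    int[] sourceFor(boolean destIsDirectDraw) {
        if (!destIsDirectDraw) {
            return systemPixels;
        }
        if (vramCopy == null) {
            vramCopy = systemPixels.clone(); // simulates the one-time upload to VRAM
        }
        return vramCopy;
    }

    boolean hasCachedCopy() {
        return vramCopy != null;
    }
}
```

Note that the only question this model asks about the destination is whether
it is a DirectDraw surface, not where that surface actually lives.
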
The problem here is that we do not distinguish between ddraw VRAM images
and ddraw system memory images. So when we decide to copy from the
bg image to the Swing back buffer, we detect that the back buffer is a ddraw
surface and thus use the cached bg image and call DirectDraw::Blt() to do
the copy. But since the back buffer now lives in system memory, this Blt
call will need to read from VRAM for the bg image (instead of the expected
hardware-accelerated VRAM->VRAM copy if the back buffer were actually in VRAM).
The fix could take several forms. But the most straightforward fix would
probably involve being able to distinguish between ddraw system and ddraw
VRAM surfaces and taking that information into account when deciding which
routines and image versions to use during copying operations.
Fixed as suggested above; we now set a flag at the Java level to indicate when
an image has been punted into ddraw system memory. Later operations on
that image (such as copying an image to that buffer, as in the AutoTest app
attached to this bug) will detect that punt and use the system memory version
of the source image instead, thus preventing the read-from-VRAM problem
that caused this bug.
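
A minimal sketch of the decision after the fix, assuming a per-surface
boolean flag. The names BlitPlanner, DestSurface, puntedToSystemMemory, and
chooseSource are hypothetical and do not come from the actual change:

```java
/** Hypothetical sketch of source selection that honors the punt flag. */
final class BlitPlanner {
    enum Source { SYSTEM_MEMORY, VRAM_CACHE }

    /** Stand-in for a destination surface carrying the Java-level punt flag. */
    static final class DestSurface {
        final boolean directDraw;
        boolean puntedToSystemMemory; // set when the image is punted out of VRAM

        DestSurface(boolean directDraw) {
            this.directDraw = directDraw;
        }
    }

    /** Uses the VRAM-cached source only when the destination really lives in VRAM. */
    static Source chooseSource(DestSurface dest) {
        if (dest.directDraw && !dest.puntedToSystemMemory) {
            return Source.VRAM_CACHE;
        }
        return Source.SYSTEM_MEMORY;
    }
}
```

With a check of this kind, a punted back buffer is treated like any other
system memory destination, so the copy never has to read from VRAM.
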