I have addressed the problem of Swing apps double-painting
translucent windows with a simple reordering: we first
go through the dirty components and, if they belong to
perpixel-translucent top-levels, paint them through the
UpdateWindow mechanism and remove them from the list.
We then process the list of remaining dirty components
as usual. This cuts the rendering time in half, even in the
non-accelerated case (when the hw pipeline is not enabled).
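The reordering described above can be sketched roughly as follows.
This is only an illustration: the Dirty class, the paintDirty method
and the log strings are hypothetical stand-ins, not the actual
RepaintManager code.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class DirtyRegionOrdering {
    /** Hypothetical stand-in for a dirty Swing component. */
    static final class Dirty {
        final String name;
        final boolean inTranslucentTopLevel;

        Dirty(String name, boolean inTranslucentTopLevel) {
            this.name = name;
            this.inTranslucentTopLevel = inTranslucentTopLevel;
        }
    }

    /**
     * Paints the dirty list in two passes and returns a log of the
     * order in which components were handled. The list is emptied.
     */
    static List<String> paintDirty(List<Dirty> dirty) {
        List<String> log = new ArrayList<>();

        // Pass 1: components in perpixel-translucent top-levels are
        // painted via the (assumed) UpdateWindow path and removed.
        for (Iterator<Dirty> it = dirty.iterator(); it.hasNext();) {
            Dirty d = it.next();
            if (d.inTranslucentTopLevel) {
                log.add("updateWindow:" + d.name);
                it.remove();
            }
        }

        // Pass 2: whatever is left is painted the usual way.
        for (Dirty d : dirty) {
            log.add("paint:" + d.name);
        }
        dirty.clear();
        return log;
    }
}
```

Because the translucent components leave the list in the first pass,
the second pass never paints them again.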
Another optimization is to always use the INT_ARGB_PRE image
format (since it's the native format for layered windows,
and also happens to be the format used for translucent
surfaces in D3D and OGL). This eliminates the need
for conversion when uploading pixels into the layered
window.
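As a small illustration of the format choice, the sketch below
allocates a buffer with BufferedImage.TYPE_INT_ARGB_PRE and exposes
its backing int[]. The ArgbPreBuffer class and its method names are
made up for this example; how the fix actually hands pixels to the
layered window is not shown here.

```java
import java.awt.image.BufferedImage;
import java.awt.image.DataBufferInt;

class ArgbPreBuffer {
    /** Creates a buffer in premultiplied ARGB format. */
    static BufferedImage create(int width, int height) {
        return new BufferedImage(width, height,
                                 BufferedImage.TYPE_INT_ARGB_PRE);
    }

    /**
     * Returns the int[] backing the image. Each element already holds
     * a premultiplied ARGB pixel, so no per-pixel conversion is needed
     * before passing the data to a consumer that expects that layout.
     */
    static int[] pixels(BufferedImage img) {
        return ((DataBufferInt) img.getRaster().getDataBuffer()).getData();
    }
}
```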
There's little hope of a fully accelerated path on Windows XP for
perpixel translucent windows since they can only be
implemented via layered windows, so at least the last step
will go through software.
But on modern PCI Express video boards it is worth it to
render the window contents into an accelerated offscreen surface,
and then pull that surface from vram to sysmem and update
the window using the layered window API.
On older PCI and AGP boards the advantage is less pronounced
and depends heavily on the amount of time spent
rendering vs. uploading to the layered window, which
is why this fix will not be enabled for AGP/PCI boards
(note that since it is nearly impossible to tell whether
or not a board is PCI Express, we rely on the board reporting
hw support for PS_30, which we hope is rough enough).
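The enabling heuristic amounts to a check along the following lines.
The class, method and parameter names are hypothetical; the real
pipeline would obtain these capabilities from the driver.

```java
class AccelerationPolicy {
    /**
     * Decides whether to render translucent window contents into an
     * accelerated offscreen surface. Pixel shader 3.0 support is used
     * only as a rough proxy for "this is a PCI Express board".
     */
    static boolean useAcceleratedOffscreen(boolean hwPipelineEnabled,
                                           boolean supportsPixelShader30) {
        return hwPipelineEnabled && supportsPixelShader30;
    }
}
```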
On Vista there is a way to use fully hw accelerated
perpixel translucent windows using DWM-specific APIs.
This fix will not go that far but will leave room for
integrating it in the future if it proves beneficial.
There are still issues with how updates of Swing
windows are handled. Currently each repaint is done twice: once
into Swing's back-buffer, and then once again
into the buffer which is then uploaded into the layered window.
This will be handled by a separate Swing-related bug.
Anyway, with this fix, performance has improved 2-3x,
depending on the complexity of the window content; it also
eliminates the full GCs which were happening on each update.