This issue was filed against Nvidia's drivers a few months ago under incident
number 162467. I also discussed the issue with driver engineers from Nvidia,
and they understand the impact, but claim it is a difficult issue to fix in
their drivers for render-to-texture surfaces (especially when the source and
destination regions are overlapping). I will continue to pursue this issue
with them, but in the meantime I have filed this bug report as a placeholder
for comments about this problem.
This problem also affects FBO surfaces on Nvidia's drivers (on all platforms)
because they are treated just like render-to-texture surfaces, as described above.
So far, we have not been able to find a way to work around this problem in the
###@###.### 2005-07-18 16:13:08 GMT
The first issue (162467: glCopyPixels() slow for pbuffers on Windows) is
supposedly fixed in Nvidia's 85.xx series, but those aren't yet publicly
The second issue (174034: glCopyPixels() slow for FBO destinations) is
still pending. Nvidia is investigating a fix for their 90.xx series drivers,
but nothing is set in stone. I'm hoping that this will be resolved soon as
it's the only issue preventing us from turning on our FBO codepath by default
I'll leave this bug open until driver fixes are publicly available for both
of these issues.
It turns out that Nvidia did not in fact fix the first issue (slow glCopyPixels()
for pbuffers on Windows); apparently they marked it fixed without actually
resolving the problem.
We have been communicating with Nvidia about this for a number of months
in the hopes that they would fix both problems in their drivers, but it appears
that the root cause is hardware related (due to the way textures are stored on
nv3x boards and earlier, basically any board earlier than the GeForce 6xxx
series), and any workaround they apply to try to speed up glCopyPixels()
probably will not meet our needs. They have told us that the problem does not
apply when using the GL_ARB_texture_rectangle extension since those textures
are stored more efficiently, so to work around the performance issue we could
start using the GL_TEXTURE_RECTANGLE_ARB target instead of GL_TEXTURE_2D.
While this requires a number of changes in our OGL pipeline, it seems to be
the only viable way to work around the problem in the short term, and at least
it has the added benefit that we will now use less VRAM since we can store
rectangular images directly instead of padding up to pow2-sized textures.
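
To give a rough sense of the VRAM savings mentioned above, here is a minimal
sketch that compares the footprint of an image padded up to pow2 dimensions
(as needed for GL_TEXTURE_2D without non-pow2 support) against the same image
stored at its exact size (as GL_TEXTURE_RECTANGLE_ARB allows). The class and
method names are hypothetical and are not taken from the OGL pipeline sources;
the figure of 4 bytes per texel (RGBA8, no mipmaps) is an assumption, and the
real allocation size is up to the driver.

```java
// Hypothetical helper for illustration only; not part of the actual pipeline.
final class TextureSizing {

    // Assumption: 4 bytes per texel (e.g. RGBA8), no mipmaps, no driver padding.
    static final long BYTES_PER_TEXEL = 4L;

    /** Returns the smallest power of two that is >= n, for n >= 1. */
    static int nextPow2(int n) {
        int p = 1;
        while (p < n) {
            p <<= 1;
        }
        return p;
    }

    /** Estimated bytes when each dimension is padded up to a power of two. */
    static long paddedBytes(int width, int height) {
        return BYTES_PER_TEXEL * nextPow2(width) * nextPow2(height);
    }

    /** Estimated bytes when the image is stored at its exact dimensions. */
    static long exactBytes(int width, int height) {
        return BYTES_PER_TEXEL * width * height;
    }

    public static void main(String[] args) {
        int w = 640, h = 480;
        // 640x480 pads up to 1024x512 under the pow2 restriction.
        System.out.println("pow2-padded: " + paddedBytes(w, h) + " bytes");
        System.out.println("rectangular: " + exactBytes(w, h) + " bytes");
    }
}
```

Under these assumptions a 640x480 image occupies 2097152 bytes when padded to
1024x512, versus 1228800 bytes when stored as a rectangle.
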
This extension is available on Nvidia boards going back to the GeForce 2 series
and on ATI Radeon 8500 and above. (We've had support for the
GL_ARB_texture_non_power_of_two extension since Tiger, but that extension is
only available on very new hardware, like GeForce 6xxx. So we will continue
to prefer that newer extension when it is available since it allows for
non-pow2 textures and the GL_REPEAT mode, which we use in our TexturePaint
acceleration code, whereas GL_REPEAT is not supported on
The original synopsis of this bug was:
OGL: copyArea is extremely slow for pbuffers on Nvidia drivers (Windows only)
but the larger issue here is that scrolling (and dragging internal frames)
in Swing apps is unusably slow since the gray rect fix was integrated in
Mustang b32. So while the copyArea() performance issue has been reproducible
since Tiger, the more visible performance problem is the slow scrolling
behavior, which is a regression in Mustang, so I am going to update the
synopsis appropriately and mark this as an important regression:
OGL: scrolling is extremely slow in Swing apps in Mustang (Nvidia only)
We should try to fix this for Mustang if possible, since the OGL pipeline
is essentially unusable on Nvidia hardware for end users of typical Swing apps.