Thank you. However, after a close look at your code, I see the following problems:
mTexName should not be changed at random times during the rendering of a frame, yet your code now changes it from the GL image creation threads. LL's code instead ensured it was only changed from the lambda stored in the work queue, which is executed on the main thread, outside the frame rendering code...
Nice idea, but alas, it may also cause issues in your code, since mTexName may be used in other methods not covered by your ensureSync() method, e.g. in setSubImage() when use_name = 0... A solution would be to replace all mTexName uses with getTexName() (so that ensureSync() is called there, when needed), but that would slow down the code (even if only a tiny bit)... Another issue with ensureSync() is that mTexNameSync is not atomic, while it is set from both the main and GL image threads; granted, having an image re-created fast enough (within the same frame) for this to become a problem is quite unlikely, but still, code correctness would demand an atomic variable here, meaning an even (slightly) slower ensureSync().
And we do not know either how it would work with Mesa under Linux (I cannot test it myself, for lack of an AMD card)...
For next release, I simply reverted the code to what it used to be for non-NVIDIA cards (NVIDIA can still enjoy a smoother experience, since the fence is placed within the GL image thread, and as a result does not affect/stall the main render thread GL pipeline).

EDIT: I found a better solution and kept your (excellent) idea for non-NVIDIA GPUs; the latter will still, by default, wait for the fence in the GL image creation thread, to avoid any risk of stalling the main thread GL pipeline, but I added a debug setting ("RenderGLImageSyncInThread") for them, in case people want to experiment with the new syncing algorithm. This allowed me to remove entirely the posting to the main thread work queue (simpler) while ensuring (via an atomic boolean test each time we use mTexName) that the tex name is swapped in the main thread when first needed.