Cool VL Viewer forum http://sldev.free.fr/forum/ |
|
Try a threadpool for llimageworker http://sldev.free.fr/forum/viewtopic.php?f=10&t=2141 |
Page 1 of 2 |
Author: | kathrine [ 2021-03-14 21:57:07 ] | ||
Post subject: | Try a threadpool for llimageworker | ||
Hi Henri, another little patch, not quite polished, but maybe have a look if it is worth it. It adds a 4 thread pool to the image decode worker thread to increase texture throughput a little bit. Greetings Kathrine
|
Author: | Henri Beauchamp [ 2021-03-15 08:57:15 ] |
Post subject: | Re: Try a threadpool for llimageworker |
Nice code, thanks ! A couple changes needed: isBusy() must also return true when STATUS_QUEUED, and LLImageBase::sRawImageCount (and likely sGlobalRawMemory as well) needs to be turned into an LLAtomicS32. Even after these changes, I'm still seeing two glitches that will need to be resolved however:
But aside from these glitches, the textures decoding throughput does increase nicely. With some more work, it could be a great addition to the viewer ! |
Author: | Henri Beauchamp [ 2021-03-15 10:37:43 ] | ||||||||||||||||||
Post subject: | Re: Try a threadpool for llimageworker | ||||||||||||||||||
I think I found (part of) the issue. Your sendToPool() code is racy:
The proper code is:
This solves the startup login screen issue, at least, but I am still seeing a few (but significantly less) "stuck DEC" in the textures console... |
Author: | Henri Beauchamp [ 2021-03-15 11:22:37 ] | |||||||||
Post subject: | Re: Try a threadpool for llimageworker | |||||||||
However I see much lower spikes in CPU usage; not a big surprise since instead of taking shortcuts (and risking to see the texture fetcher not being able to recover the decoded data for the previous decoded texture before a new one is assigned to the decode thread), we properly wait for the decode thread to expire. With a quick benchmark (on mainland, 512m draw distance, at a fixed position and camera FOV and with pre-cached textures, meshes, etc), I see a 15-20 seconds difference in rezzing time after login for a total of 1'45" with the threaded image decoder (and 7 threads, for an octo-core CPU) against 2'00" with the mono-threaded decoder (please note that the accuracy of this benchmark is of only +/- 5s). Not a dramatic increase in performances, but still worth it. For a better effect, the textures cache file reads/writes would need to be multi-threaded as well (i.e. multithreading LLLFSThread in the same way; note that you'll need to enable the ThreadedFilesystem setting as well): your next challenge ?... EDIT: scratch that ! isBusy() should return false for anything else than STATUS_QUEUED or STATUS_INPROGRESS, meaning the stuck DEC state is still unsolved (but also that we can expect a much higher throughput once this will be fixed). |
Author: | kathrine [ 2021-03-16 01:33:49 ] | |||
Post subject: | Re: Try a threadpool for llimageworker | |||
Hi Henri, i sprinkled a bunch of std::atomic<> things over the code and that seems to not get stuck on decode. Might be possible to port to LLAtomic but didn't look yet, Kathrine
|
Author: | Henri Beauchamp [ 2021-03-16 10:25:01 ] | ||||||||||||||||||
Post subject: | Re: Try a threadpool for llimageworker | ||||||||||||||||||
Congratulations for a very nice (and very useful) work and contribution ! I'm currently refining it (round-robin threads allocation to avoid always hammering the first threads with requests while they are busy, more stats, adjustment of the number of sub-threads to allocate depending on the number of CPU cores), but it shows a great improvement in rezzing already: "Yava-podding" at 100% speed with a 256m draw distance is becoming an enjoyable experience (only a few blurry textures, compared to a lot more before) ! Provided all my testings go fine, this contribution will be integrated to the next release !
|
Author: | ZaneZimer [ 2021-03-16 12:33:57 ] |
Post subject: | Re: Try a threadpool for llimageworker |
I have been playing around with kathrine's patch and integrated the suggestions from Henri, in this thread. I even managed, I think, to get sRawImageCount and sGlobalRawMemory switched to LLAtomicS32. My C knowledge is quite old so who knows if I actually did it properly. Given my hardware, I may not see as much benefit at you guys. It does seem to help a bit especially since I keep DD fairly small and use speed rezzing. Thanks for the great work guys. I look forward to the improvement being integrated. |
Author: | Henri Beauchamp [ 2021-03-16 14:16:09 ] | |||||||||
Post subject: | Re: Try a threadpool for llimageworker | |||||||||
|
Author: | ZaneZimer [ 2021-03-16 14:57:45 ] | |||||||||
Post subject: | Re: Try a threadpool for llimageworker | |||||||||
|
Author: | Henri Beauchamp [ 2021-03-20 11:23:06 ] |
Post subject: | Re: Try a threadpool for llimageworker |
Feature implemented in today's release ! I added a few refinements to it (round-robin threads affectation, configurable number of image decode threads in the pool, including auto-sizing of the pool based on available threads concurrency, and a couple useful stats), unleashed the power of the queue thread by removing the silly ms_sleep(1) (1ms is a very large delay for modern CPUs !) calls in LL's LLQueuedThread code and replacing them with yield() calls, and I modified the texture fetcher algorithm by adding "boosting" features to it. The latter features work as follow (and allow to take benefit from the new threaded decoding):
|
Page 1 of 2 | All times are UTC |
Powered by phpBB® Forum Software © phpBB Group https://www.phpbb.com/ |