Some Threadripper results
- Beatnutz
- Member
- 7 posts
- Joined: May 2017
- Offline
- neil_math_comp
- Member
- 1743 posts
- Joined: March 2012
- Offline
Could you post a HIP file and a description of what you were testing? It could be that whatever you were testing currently doesn't thread much, which we might be able to do something about. There have also been a few reports of Threadripper having poor performance or even crashing under moderately heavy load (e.g. compiling the Linux kernel) due to heating issues, but I don't have one, so I haven't independently tested that.
Writing code for fun and profit since... 2005? Wow, I'm getting old.
https://www.youtube.com/channel/UC_HFmdvpe9U2G3OMNViKMEQ [www.youtube.com]
- malexander
- Staff
- 5201 posts
- Joined: July 2005
- Offline
- Beatnutz
- Member
- 7 posts
- Joined: May 2017
- Offline
ndickson
Could you post a HIP file and a description of what you were testing? It could be that whatever you were testing currently doesn't thread much, which we might be able to do something about. There have also been a few reports of Threadripper having poor performance or even crashing under moderately heavy load (e.g. compiling the Linux kernel) due to heating issues, but I don't have one, so I haven't independently tested that.
I also had the crashes, but I think that only happens with the MSI X399 motherboard. If you turn off a setting in the BIOS, the crashes go away. It is a workaround until a new BIOS version comes out.
Here are the HIP-files. I exported them as Alembic (all 240 frames) and timed it.
- Skybar
- Member
- 166 posts
- Joined: March 2013
- Offline
- Beatnutz
- Member
- 7 posts
- Joined: May 2017
- Offline
twod
The pyro sim results show the biggest improvement of any of those benchmarks, even the non-Houdini ones. You don't consider that impressive?
To be honest, that is the smallest improvement I was expecting. TR has 10 extra cores over my old CPU. Twice as fast is very nice indeed, but anything below that is not enough.
If I was only doing pyro then this would have been amazing. But now I'm a bit pissed (not blaming SideFX or anything) about spending all that money and having a slower machine. Next time I'll just wait for other people's benchmarks. But it would be nice to see a fix for this in the future.
- Beatnutz
- Member
- 7 posts
- Joined: May 2017
- Offline
Skybar
You are testing a lot of GPU stuff for benchmarking a CPU. What about a Mantra test?
Yeah I mainly wanted to test it in software that I work/play in frequently. Regardless of GPU/CPU. The C4D test was mainly to see how well it handles large simulations from Houdini in Cinema 4D with 8K materials. So lots of polys etc and big VDB files to load etc.
I also tried caching an X-Particles sim but I had to ditch it since it gave me such poor results. It seemed to be single threaded which is strange. I'm going to check with them about it.
Never used Mantra to be honest, I always send Alembics/Digital assets over to C4D/Octane.
Edited by Beatnutz - 31 August 2017 11:21:25
- anon_user_37409885
- Member
- 4189 posts
- Joined: June 2012
- Offline
These are very basic scene file tests, so I wouldn't put too much value on them for real-world performance. Ways to improve the tests: turn on OpenCL CPU, turn on OpenCL in the solvers, and use file caches instead of going straight out to abc.
In the end, if you aren't using threading well, then CPUs haven't really got faster in 7 years.
- Beatnutz
- Member
- 7 posts
- Joined: May 2017
- Offline
aRtye
These are very basic scene file tests, so I wouldn't put too much value on them for real-world performance. Ways to improve the tests: turn on OpenCL CPU, turn on OpenCL in the solvers, and use file caches instead of going straight out to abc.
In the end, if you aren't using threading well, then CPUs haven't really got faster in 7 years.
OK, I wasn't aware that would make such a huge impact if they are both simulated the same way. If you have another HIP and your sim time, I'd be happy to try it out so we can check the difference.
- anon_user_37409885
- Member
- 4189 posts
- Joined: June 2012
- Offline
The main advantage of the newer processors is the same or better performance for much less money than before, plus the better infrastructure on the motherboard, e.g. M.2, PCIe 3, USB 3.1, DDR4 etc. These other bits play a huge role in the speed of working, e.g. loading/writing a 5 GB pyro frame! You still need to ‘work’ Houdini around its slower bits, as it performs much better when you know the weakest parts and how to build the fastest networks.
No time for a while to build a test file. Sorry.
Edited by anon_user_37409885 - 31 August 2017 14:44:39
- anon_user_40689665
- Member
- 648 posts
- Joined: July 2005
- Offline
Tested using a scene based on the Ryzen benchmark for Blender,
swapped the zero decal with volume.
AMD 16-core Threadripper 1950X, no OC (waiting for TR4 coolers 1st).
sim to disk….: 03m 52s
render to Mplay: 08m 02s
Intel 6-core i7-5930K CPU @ 3.50GHz
sim to disk….: 06m 22s
render to Mplay: 14m 49s
so in this case Threadripper is about
1.6x faster to sim
1.8x faster to render
1.8x more expensive
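As a rough check of the ratios in this post, here is a minimal Python sketch that converts the quoted times to seconds and divides them. The helper names (`to_seconds`, `speedup`) are made up for illustration, and the comparison against core count (16 vs 6) is only a crude indicator, since it ignores clock speed and architecture differences.

```python
def to_seconds(minutes, seconds):
    """Convert a 'MMm SSs' timing to whole seconds."""
    return minutes * 60 + seconds


def speedup(baseline_s, new_s):
    """How many times faster the new timing is than the baseline."""
    return baseline_s / new_s


# Timings quoted in the post above.
i7_sim = to_seconds(6, 22)       # i7-5930K, sim to disk
tr_sim = to_seconds(3, 52)       # Threadripper 1950X, sim to disk
i7_render = to_seconds(14, 49)   # i7-5930K, render to Mplay
tr_render = to_seconds(8, 2)     # Threadripper 1950X, render to Mplay

sim_speedup = speedup(i7_sim, tr_sim)
render_speedup = speedup(i7_render, tr_render)

# Crude scaling indicator (an assumption of this sketch, not a claim
# from the post): measured speedup divided by the core-count ratio.
core_ratio = 16 / 6

print(f"sim speedup:    {sim_speedup:.2f}x")     # 1.65x
print(f"render speedup: {render_speedup:.2f}x")  # 1.84x
print(f"sim speedup / core ratio:    {sim_speedup / core_ratio:.2f}")
print(f"render speedup / core ratio: {render_speedup / core_ratio:.2f}")
```

Both ratios round to the 1.6x and 1.8x figures given in the post.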
- tricecold
- Member
- 260 posts
- Joined: July 2006
- Offline
You need to be careful about how you squeeze the maximum performance from your hardware. Test with OpenCL both on and off, write straight to disk as bgeo.sc, and never render your simulations without caching. Why? Because you may want to play with shading, lighting etc. TR will open up an even bigger gap with higher-resolution simulations. Why? Because you keep the threads busy for longer instead of occupying the CPU with thread management.
It was the same case when small simulation times were compared between dual Xeons and one fast i7. There is no magic button that makes everything faster. Especially with the compile workflow, TR will be so much faster in Houdini. Why? Because the Compile SOP compiles your many small SOPs into an imaginary SOP that multithreads so much better. I've had speed improvements of 5 to 10 times after converting old tools with the Compile SOP on the same CPU.
Grain solver works best with a fast GPU, so make your comparisons with it turned on and off.
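The point about thread management can be illustrated with a toy cost model. This is only a sketch under stated assumptions: the work is assumed to split perfectly across threads, and each thread adds a fixed management cost. The function names and the `OVERHEAD_PER_THREAD_S` value are hypothetical and not measured from Houdini.

```python
def modeled_time(work_s, threads, overhead_per_thread_s):
    """Toy model: perfectly divisible work plus a fixed cost per thread."""
    return work_s / threads + overhead_per_thread_s * threads


def modeled_speedup(work_s, threads, overhead_per_thread_s):
    """Speedup of `threads` threads over one thread in the toy model."""
    single = modeled_time(work_s, 1, overhead_per_thread_s)
    multi = modeled_time(work_s, threads, overhead_per_thread_s)
    return single / multi


OVERHEAD_PER_THREAD_S = 0.01  # hypothetical per-thread management cost

# Larger jobs amortize the fixed overhead, so the 16-thread machine
# pulls further ahead of the 6-thread machine as the work grows.
for work_s in (1.0, 10.0, 100.0, 1000.0):
    s6 = modeled_speedup(work_s, 6, OVERHEAD_PER_THREAD_S)
    s16 = modeled_speedup(work_s, 16, OVERHEAD_PER_THREAD_S)
    print(f"work={work_s:7.1f}s  6 threads: {s6:5.2f}x  "
          f"16 threads: {s16:5.2f}x  ratio: {s16 / s6:4.2f}")
```

In this model a one-second job gains almost nothing from the extra threads, while a long job approaches the full 16-to-6 ratio, which matches the advice to benchmark with higher-resolution simulations.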
Edited by tricecold - 1 September 2017 13:48:02
Head of CG @ MPC
CG Supervisor/ Sr. FX TD /
https://gumroad.com/timvfx [gumroad.com]
www.timucinozger.com
- Beatnutz
- Member
- 7 posts
- Joined: May 2017
- Offline
tricecold
You need to be careful about how you squeeze the maximum performance from your hardware. Test with OpenCL both on and off, write straight to disk as bgeo.sc, and never render your simulations without caching. Why? Because you may want to play with shading, lighting etc. TR will open up an even bigger gap with higher-resolution simulations. Why? Because you keep the threads busy for longer instead of occupying the CPU with thread management.
It was the same case when small simulation times were compared between dual Xeons and one fast i7. There is no magic button that makes everything faster. Especially with the compile workflow, TR will be so much faster in Houdini. Why? Because the Compile SOP compiles your many small SOPs into an imaginary SOP that multithreads so much better. I've had speed improvements of 5 to 10 times after converting old tools with the Compile SOP on the same CPU.
Grain solver works best with a fast GPU, so make your comparisons with it turned on and off.
Thank you for your insights. Good to know!
I've done renders with and without OpenCL, and it was faster with it. But I never ran the original tests (5820K) with it, so it's hard to say how big the difference is.
I also tried using OpenCL CPU, but that slowed things down by a lot.
I never really use caches since I usually send my Alembics over to C4D, but I'll try some comparisons there too.
- sachiman
- Member
- 39 posts
- Joined: June 2015
- Offline
- RodTebisx
- Member
- 36 posts
- Joined: February 2015
- Offline
TR 1950X here, with H80i V2 premium liquid cooling in place.
Linux system, 32 GB RAM at 3200 MHz in just dual channel, not quad channel:
FLIP scene, just 20 seconds to write to disk! (very sparse use of cores…)
PYRO one, 5min 45s (and not using all cores fully, the CPU usage graph is very uneven)
GRAIN stuff, 1min 32s (again, not using all cores fully)
Anyway, I see very nice results here, but there is still a lot of room for improvement in using the cores more fully (in all algorithms) and in multithreading. Maybe Houdini 17? Maybe multi-GPU support too?
thnx.
Edited by RodTebisx - 1 May 2018 19:46:12
Rod Tebisx | Senior Creature FX TD | London
- BabaJ
- Member
- 2120 posts
- Joined: September 2015
- Offline
Really couldn't say there is no room for improvement for using all cores.
There likely is always room for improvement.
But sometimes there are operations that are necessarily single-threaded and cannot be turned into a multithreaded operation.
I made an HDA where the operation needed to be single-threaded, and it was maxing out the CPU but hardly touched memory.
In those instances I would take a fast cpu over any number of cores/threads capacity.
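The trade-off described here is usually expressed with Amdahl's law: if only a fraction p of a job can run in parallel, the best possible speedup on n cores is 1 / ((1 - p) + p / n). Below is a minimal Python sketch of that formula. The parallel fractions in the loop are illustrative values, not measurements from any of the scenes in this thread.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Upper bound on speedup from Amdahl's law.

    parallel_fraction: share of the runtime that can be parallelized (0..1).
    cores: number of cores working on the parallel part (>= 1).
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Illustrative fractions only: a fully serial job, and three partly
# parallel ones, compared on 6 and 16 cores.
for p in (0.0, 0.5, 0.9, 0.99):
    print(f"p={p:4.2f}  6 cores: {amdahl_speedup(p, 6):5.2f}x  "
          f"16 cores: {amdahl_speedup(p, 16):5.2f}x")
```

For a fully single-threaded operation (p = 0) the bound is 1x no matter how many cores are available, which is why a higher clock speed is the only thing that helps in that case.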