Threadripper 1950X Production Style Benchmark Scene

Member
260 posts
Joined: July 2006
Hi community,
After 6 years, I finally upgraded my 4.5 GHz overclocked 2600K to a Threadripper 1950X, so that is another 6 or so years of sticking with one CPU.

For those who don't know me, I am a hardware enthusiast, I have been a VFX hobbyist for the last 20+ years, and I also do it professionally at MPC as a lead FX TD.

I have been confused by the general claims about these many-core consumer PCs. People have been saying, “yes, but I render with GPU, I need fast single threading for SOPs and don't need slow cores”, and some think it is the other way around. I am here to prove those claims wrong and show that more cores at 3+ GHz is the only way to go as a VFX artist/freelancer.

So I prepared one of my recent scenes to share for benchmarking at a whole new level. This file mimics a real-life scenario.

I tried to make the scene OS-independent. Please let me know if you catch anything.

Requirements
Project File:

Download Project… [goo.gl]

A spinning drive if possible
Houdini 16.0.705
Deadline Monitor and a slave running on the same machine, with the Houdini plugins installed: DEADLINE - INSTALL ME [downloads.thinkboxsoftware.com]
Minimum 32 GB RAM

Why do we need Deadline? Because most often in production we cache every step, usually on a farm. A lot of SOP operators are not well multithreaded. With Deadline, I can tell a 100-frame job to be divided into 10 subtasks. Then I can tell Deadline to run all of these tasks simultaneously; it will launch 10 Houdinis in the background and process the information. As long as you can fit it into your RAM, you get almost full threading.
This makes a huge difference in scenarios like cleaning up caches, meshing geometry, and caching collision geometry or VDBs before pyro or FLIP simulations.
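
To make the idea concrete, here is a minimal Python sketch of the same pattern outside Deadline: split a frame range into chunks and run one background process per chunk. This is not how Deadline does it internally and it is not the setup from the project file. The function names are hypothetical, and the worker command is a placeholder that only prints the range where a real setup would start a Houdini batch process.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def chunk_frames(first, last, chunks):
    """Split the inclusive range first..last into contiguous (start, end) pairs."""
    total = last - first + 1
    base, extra = divmod(total, chunks)
    ranges = []
    start = first
    for i in range(chunks):
        # The first `extra` chunks get one more frame so nothing is dropped.
        size = base + (1 if i < extra else 0)
        if size == 0:
            continue
        ranges.append((start, start + size - 1))
        start += size
    return ranges


def run_chunk(frame_range):
    """Run one background process for a chunk (placeholder command, not Houdini)."""
    start, end = frame_range
    cmd = [sys.executable, "-c", f"print('cached frames {start}-{end}')"]
    done = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return done.stdout.strip()


def run_all(first, last, chunks, max_procs):
    """Run every chunk, keeping at most `max_procs` processes alive at once."""
    with ThreadPoolExecutor(max_workers=max_procs) as pool:
        return list(pool.map(run_chunk, chunk_frames(first, last, chunks)))


if __name__ == "__main__":
    # 100 frames in 10 subtasks, all 10 running at the same time.
    for line in run_all(1, 100, 10, 10):
        print(line)
```

Each process is single-threaded or poorly threaded on its own, but ten of them together keep ten or more cores busy, which is the whole point of the trick.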

Basically, after you have run the submit step and then the one COP job, you should get a sequence of EXRs resulting in this [image],
but in 1080p and with better sampling.



Mine is still running, but I will post the final video and results here; current progress is attached.
Deadline will be able to report the memory usage, CPU utilization, actual task run time, etc.

This is a long benchmark; it will take a few days to complete. I would be very happy to finally have some real numbers to compare.
Edited by tricecold - Sept. 18, 2017 09:19:11

Attachments:
DeadlineCap.png (392.9 KB)

Head of CG @ MPC
CG Supervisor/ Sr. FX TD /
https://gumroad.com/timvfx [gumroad.com]
www.timucinozger.com
Member
4677 posts
Joined: Feb. 2012
Awesome, Tim, you did it again!
Senior FX TD @ Industrial Light & Magic
Get to the NEXT level in Houdini & VEX with Pragmatic VEX! [www.pragmatic-vfx.com]

youtube.com/@pragmaticvfx | patreon.com/animatrix | pragmaticvfx.gumroad.com
Member
4189 posts
Joined: June 2012
I'm not a fan of poor man's threading. Nuke has the Frame Server, which appears to do exactly the same thing, launching many instances of Nuke, and I'm always running out of RAM on the CPU or GPU (64 GB RAM / 11 GB). In real-world scenarios it's too prone to failing and is only faster for simple ops like transcoding to JPGs.

So, do you always run it this way when you have access to a farm too?
Member
260 posts
Joined: July 2006
LoL, I am also not a fan of it, but for now it is the best way to keep my cores busy for any job other than rendering, pyro or FLIP. You would indeed be limited by RAM, though. At work the workflow is exactly the same, except we have quite a lot of Houdini Engine licenses and quite a lot of machines with 128 GB RAM.

But it is not really that easy to make it fail. As long as you have an idea of the RAM usage per frame, it isn't difficult to adjust the number of simultaneous tasks accordingly.
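
As a rough sketch of that adjustment in Python: divide the RAM you can spare by the peak RAM one task needs. The function name, the 4 GB reserve for the OS and the example figures are all assumptions for illustration, not measurements from the benchmark scene.

```python
def max_concurrent_tasks(total_ram_gb, ram_per_task_gb, reserve_gb=4.0):
    """Number of tasks that fit in RAM at once, keeping `reserve_gb` free for the OS."""
    if ram_per_task_gb <= 0:
        raise ValueError("ram_per_task_gb must be positive")
    usable = total_ram_gb - reserve_gb
    # Round down: a task that only partly fits would push the machine into swap.
    return max(0, int(usable // ram_per_task_gb))


# Hypothetical example: a 32 GB machine, tasks peaking at 2.5 GB each.
print(max_concurrent_tasks(32, 2.5))
```

If the result is lower than your core count, RAM is the limit; if it is higher, the core count is the limit and you can stop there.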

The problem is not really the hardware. I do understand that parallel computing is not an easy task for a programmer, but until more tools are rewritten for compiled SOPs, this seems to be a temporary solution.
Edited by tricecold - Sept. 18, 2017 21:07:33
Member
8034 posts
Joined: Sept. 2011
If you have work that is divisible by frames, it really is the best way to distribute work among servers. I don't see Houdini becoming an MP app anytime soon. Also, assume some level of thread parallelism: instead of running 32 processes on a server with 32 threads, assign each process 4-16 threads and distribute according to how much memory a task requires and how well it scales with threads. For example, a task with 200 frames is run as 20 batches of 10 frames, with 4 concurrent processes on each of 5 servers.
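
The arithmetic of that example can be sketched in a few lines of Python. The function and key names are hypothetical, and the even split of threads between processes is an assumption; in practice you would tune it per task.

```python
import math


def plan_batches(frames, batch_size, servers, procs_per_server, threads_per_server):
    """Work out batch count, concurrent slots, waves and threads per process."""
    batches = math.ceil(frames / batch_size)
    slots = servers * procs_per_server  # processes running at the same time
    return {
        "batches": batches,
        "slots": slots,
        "waves": math.ceil(batches / slots),  # rounds needed to finish every batch
        "threads_per_process": threads_per_server // procs_per_server,
    }


# 200 frames, 10 frames per batch, 5 servers with 32 threads, 4 processes each.
print(plan_batches(200, 10, 5, 4, 32))
```

With these numbers all 20 batches run in a single wave, and each process gets 8 threads, which sits inside the 4-16 range mentioned above.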
Edited by jsmack - Sept. 18, 2017 21:11:03
Member
83 posts
Joined: Feb. 2016
Why does the CPU speed read as 2.2 GHz?

The Threadripper can boost two cores (I think) to around the 4.2/4.0 GHz mark. That should take care of fast cores when multithreading isn't the better option. In renders, it can reach 4 GHz on all 16 cores.

Also, is the RAM ECC?

Looking forward to the results.

I am myself looking to build a quad/tri-GPU setup on the Threadripper platform.
Edited by nisachar - Sept. 18, 2017 21:19:21
Member
260 posts
Joined: July 2006
nisachar
Why does the CPU speed read as 2.2 GHz?


Hi, it shows the minimum CPU frequency, similar to what my OS reports:

lscpu | grep MHz
CPU MHz: 2200.000
CPU max MHz: 3400.0000
CPU min MHz: 2200.0000
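
If you want to check this in a script, a small Python sketch like the following reads the three values from that text. The function name is hypothetical, and it only handles lines in the exact "name: value" shape shown above.

```python
def parse_lscpu_mhz(text):
    """Collect the MHz lines of lscpu-style output into a name -> float dict."""
    values = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and "MHz" in key:
            values[key.strip()] = float(value)
    return values


sample = """CPU MHz: 2200.000
CPU max MHz: 3400.0000
CPU min MHz: 2200.0000"""

freqs = parse_lscpu_mhz(sample)
# True when the reported clock is sitting at the minimum (idle) frequency.
at_minimum = freqs["CPU MHz"] == freqs["CPU min MHz"]
print(freqs, at_minimum)
```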
Member
1 posts
Joined: July 2006
Great, thanks for sharing.
Looking forward to your update.
Cheers~
Member
2 posts
Joined: May 2017
This could be a bit off-topic, but how is the 1950X working out nowadays? After the price drop I'm tempted to build a machine for Houdini with a 1920X, 64 GB RAM and a 1080 Ti. Cool idea?
Nuke compositor at Do Postproduction.
Member
4 posts
Joined: Oct. 2010
Can you upload the scene file again? I want to try the test, but when I click the link, it's gone.