Copernicus Performance



9of9: Member; 39 posts; Joined: Oct. 2017; Offline

July 16, 2024 7:11 a.m.

Is there any advice on how to improve performance in Copernicus? Testing it out locally, it runs much slower than in any of the demos that I have seen.

A good example is the mask-painting content example that can be found here: https://www.sidefx.com/contentlibrary/texture-mask-paint/

If you look at the launch presentation here: https://player.vimeo.com/video/973697874 it shows the mask paint being drawn, and updating the quick material from Copernicus a bit slowly, but just about interactively, at maybe 5-10 fps or so.

I've loaded this file up on a couple of beefy workstations now, one with a 3090 and another with a 4090, with plenty of CPU and GPU horsepower, and I'm getting about 1 fps updates if I change the mask paint. Looking at a performance capture, a single update takes about 1.08s, with the biggest culprits being the monotosdf node (0.166s), idtomask (0.149s), the two dilate erodes (about 0.11s each), etc.
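As a rough sanity check on those numbers, the sketch below sums the per-node timings quoted above and compares them with the total update time. The timings are the ones from the performance capture in this post; the 0.11s figure for each dilate/erode is approximate, and the two dilate/erode keys are placeholder labels rather than the real node names.

```python
# Per-node cook times (seconds) as quoted from the performance capture.
# "dilateerode_a" / "dilateerode_b" are placeholder labels; ~0.11s each is approximate.
node_times = {
    "monotosdf": 0.166,
    "idtomask": 0.149,
    "dilateerode_a": 0.11,
    "dilateerode_b": 0.11,
}
total_update = 1.08  # seconds for a single update

named_total = sum(node_times.values())   # time spent in the named nodes
unaccounted = total_update - named_total  # time spent everywhere else
share = named_total / total_update

print(f"named nodes: {named_total:.3f}s ({share:.0%} of the update)")
print(f"everything else: {unaccounted:.3f}s")
```

The four named nodes account for only about half of the update, so roughly half a second is being spent somewhere other than the obvious culprits.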

The copnet node has parameters on it that sound like they could speed it up, like enabling Compiled Cook, or lowering the Default Resolution or the Precision, but none of these seem to have any impact at all on the quality or the speed of the cook. Even the Proxy 1:2 tickbox makes no difference to the resolution or speed of the network.

I've gone through the docs, but I feel like I'm missing some key part of understanding how this should be set up correctly?


alexwheezy: Member; 318 posts; Joined: Jan. 2013; Offline

July 16, 2024 8:44 a.m.

The heighttoambientocclusion5 node has a View Radius parameter; if you reduce it, performance will improve.


9of9: Member; 39 posts; Joined: Oct. 2017; Offline

July 16, 2024 10:38 a.m.

I mean, yeah, I've even tried outright bypassing the most expensive nodes, like monotosdf and heighttoambientocclusion, but at best I can get the update time down from just over one second to maybe 0.9 or 0.8 seconds. I can completely break the material that way, and it still remains largely unusable and very, very far from the performance shown at the keynote - so I feel like there's some broader principle I'm missing, beyond optimising individual nodes.

As far as computation running on the GPU goes, this material does not look that complex, and my gut says this math should fundamentally run on a modern GPU in maybe a couple of milliseconds at most. Obviously there will be some overhead from Houdini, but compiling the graph should, in principle, take care of that to some degree. Even smaller networks made of very simple maths functions run orders of magnitude slower than they should. I'd love to get a better sense of where the bottlenecks are and what the basic knobs are for tuning overall Copernicus performance. At the very least, getting up to the speed shown at the keynote seems like it should be plausible.

Edited by 9of9 - July 16, 2024 12:35:00


ikoon: Member; 212 posts; Joined: Jan. 2016; Offline

July 18, 2024 3:04 a.m.

Hi 9of9, I am not sure if I am doing this right, but I have tried this:

- open the file as it is
- set the display flag to /obj/Texture_Mask_Paint/merge1
- set the brush to Erase on the /obj/Texture_Mask_Paint/texturemaskpaint1
- hit Enter in the viewport to start the Paint tool
- (for some reason the painting transform is reversed, but I didn't investigate)

I am getting ~4 fps, as in the gif

My specs:
- gpu: 4090
- cpu: intel i9-12900K
- windows 11
- nvidia drivers: 560.70

If you want to reach support, I can give them my Houdini_Info.txt file (Help>About>Details)

Attachments:
paint.gif (5.9 MB)


kodra: Member; 373 posts; Joined: June 2023; Offline

July 18, 2024 3:37 a.m.

I don't know, even 4 fps sounds extremely unperformant...

Edited by kodra - July 18, 2024 03:39:59


Soothsayer: Member; 874 posts; Joined: Oct. 2008; Offline

July 18, 2024 3:39 a.m.

Wild random guess, but does Copernicus also selectively run OpenCL on the CPU or GPU based on an env variable setting?
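For anyone wanting to check this, Houdini documents environment variables such as HOUDINI_OCL_DEVICETYPE, HOUDINI_OCL_VENDOR and HOUDINI_OCL_DEVICENUMBER for choosing the OpenCL device. Whether Copernicus honours them in the same way as other OpenCL nodes is an assumption here, not something confirmed in this thread. A minimal sketch that reports what the current session sees (the helper name report_ocl_env is made up for illustration):

```python
import os

# OpenCL device-selection variables listed in Houdini's environment variable docs.
# That Copernicus reads the same variables is an assumption, not confirmed here.
OCL_VARS = (
    "HOUDINI_OCL_DEVICETYPE",
    "HOUDINI_OCL_VENDOR",
    "HOUDINI_OCL_DEVICENUMBER",
)

def report_ocl_env(env=None):
    """Return {variable: value or None} for the OpenCL-related variables."""
    env = os.environ if env is None else env
    return {name: env.get(name) for name in OCL_VARS}

# Print the current settings; unset variables are shown explicitly.
for name, value in report_ocl_env().items():
    print(f"{name} = {value if value is not None else '<not set>'}")
```

If this is run from the Python Shell inside Houdini and HOUDINI_OCL_DEVICETYPE turns out to be set to CPU, that would be worth ruling out before looking at individual nodes.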

--
Jobless


9of9: Member; 39 posts; Joined: Oct. 2017; Offline

July 18, 2024 8:20 a.m.

ikoon
Hi 9of9, I am not sure if I am doing this right, but I have tried this:

- open the file as it is
- set the display flag to /obj/Texture_Mask_Paint/merge1
- set the brush to Erase on the /obj/Texture_Mask_Paint/texturemaskpaint1
- hit Enter in the viewport to start the Paint tool
- (for some reason the painting transform is reversed, but I didn't investigate)

I am getting ~4 fps, as in the gif

That is far more reasonable than my results! Will try to attach a video. Your frame time looks to be about 230ms on average - mine is about 1650ms if following those exact steps, so that's a 7x difference!

While 4fps is low, that does look closer to what was demoed and is at least somewhat usable - I can see that being something that could be optimised down across the specific nodes, but I can't see a way for me to claw back ~1400ms of frame time as it stands!

My specs:
- GPU: NVIDIA 4090
- CPU: AMD Ryzen 3970X 32 Cores (64 CPUs), ~3.7GHz
- RAM: 64GB
- OS: Windows 11
- Driver: 555.99

Edited by 9of9 - July 18, 2024 08:20:29

Attachments:
2024-07-18 13-11-33.mp4 (11.9 MB)


ikoon: Member; 212 posts; Joined: Jan. 2016; Offline

July 18, 2024 9:02 a.m.

I watched my GPU and CPU usage. CPU goes to about 11% (probably a single core at full load). GPU goes to 70-90%.

Single-core performance of that i9 may be 60% higher than the Ryzen's. I am not sure where else the difference might come from. Maybe try updating the nvidia drivers too. (I have the Studio Drivers 560.70)


9of9: Member; 39 posts; Joined: Oct. 2017; Offline

July 18, 2024 10:53 a.m.

That's a good shout - upgrading to the latest 560.70 driver, whether Game-Ready or Studio, improves my frame time to about 850ms on the 4090, approximately halving it (though it is still over 1200ms on the 3090!). That's still almost four times slower than yours!

My GPU utilisation is about 90-100% while painting, whereas CPU remains steady at 5%.
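To put the frame times reported in this thread on a common footing, here is a small sketch that converts them to frames per second and to a slowdown relative to ikoon's ~230ms. The figures are the approximate ones quoted in the posts above; the labels are only for illustration.

```python
# Approximate frame times (milliseconds) as quoted in this thread.
frame_times_ms = {
    "ikoon, 4090, driver 560.70": 230,
    "9of9, 4090, driver 555.99": 1650,
    "9of9, 4090, driver 560.70": 850,
    "9of9, 3090": 1200,  # reported as "over 1200ms"
}

baseline_ms = frame_times_ms["ikoon, 4090, driver 560.70"]

def fps(ms):
    """Frames per second for a given frame time in milliseconds."""
    return 1000.0 / ms

for label, ms in frame_times_ms.items():
    slowdown = ms / baseline_ms
    print(f"{label}: {ms}ms = {fps(ms):.2f} fps, {slowdown:.1f}x baseline")
```

This works out to a little over 4 fps for the baseline, about 7x slower before the driver update and a little under 4x slower after it, in line with the estimates in the posts.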
