NVIDIA RTX 4090 driver issues?

Member
23 posts
Joined: May 2015
Offline
Guess I'm not the only one with this issue. I had instabilities with a certain Vellum setup, but noticed that everything was fine after switching to OpenCL on the CPU. On the GPU it crashes. 4090 drivers have been a mess... There's also a memory leak issue in Redshift where it can only use 16 GB of VRAM; I wonder if the two have anything to do with each other.
www.timvanhelsdingen.com
Member
8046 posts
Joined: Sept. 2011
Online
Has anyone tried disabling native command queueing to see if it fixes the crashes with OpenCL?

HOUDINI_OCL_FEATURE_DISABLE=CL_DEVICE_DEVICE_ENQUEUE_SUPPORT
Member
23 posts
Joined: May 2015
Offline
jsmack
Has anyone tried disabling native command queueing to see if it fixes the crashes with OpenCL?

HOUDINI_OCL_FEATURE_DISABLE=CL_DEVICE_DEVICE_ENQUEUE_SUPPORT

Just tried this on my 4090 and this seems to solve it! What exactly does this do?
www.timvanhelsdingen.com
Member
8046 posts
Joined: Sept. 2011
Online
Tim van Helsdingen
Just tried this on my 4090 and this seems to solve it! What exactly does this do?

It disables native command queueing, a feature of newer graphics cards (2000 series+) that allows for faster pressure solves in Vellum with OpenCL. It was broken when the 3000 series came out too, requiring a new Nvidia driver and a Houdini fix. Probably the same story with the 4000 series.
Member
8 posts
Joined: July 2017
Offline
jsmack
HOUDINI_OCL_FEATURE_DISABLE=CL_DEVICE_DEVICE_ENQUEUE_SUPPORT

Sorry for my lack of knowledge, but where should we be entering the above line to try and fix the issue?

Thanks
Member
8046 posts
Joined: Sept. 2011
Online
micro88
jsmack
HOUDINI_OCL_FEATURE_DISABLE=CL_DEVICE_DEVICE_ENQUEUE_SUPPORT

Sorry for my lack of knowledge, but where should we be entering the above line to try and fix the issue?

Thanks

To permanently set it for just the user, add the line to your houdini.env file in the preferences folder.

To add it temporarily, it needs to be exported in the environment that launches Houdini.

On Linux you would type export FOO=BAR before launching, or add it to the script wrapper that your studio uses to launch Houdini.

On Windows, the Houdini Command Line Tools shell lets you set env vars with the set command before launching Houdini.
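
A minimal shell sketch of both options on Linux, assuming a POSIX shell. The helper name add_houdini_env is hypothetical, and the houdini.env path in the usage comment assumes the default Houdini 19.5 preferences folder, so adjust it for your version.

```shell
# Temporary: applies only to this shell session and to anything launched from it.
export HOUDINI_OCL_FEATURE_DISABLE=CL_DEVICE_DEVICE_ENQUEUE_SUPPORT

# Permanent (per user): append a line to houdini.env unless it is already there.
# add_houdini_env is a hypothetical helper, not part of Houdini.
add_houdini_env() {
    env_file="$1"
    line="$2"
    mkdir -p "$(dirname "$env_file")"
    touch "$env_file"
    # If the file is non-empty and does not end with a newline, add one first.
    if [ -s "$env_file" ] && [ -n "$(tail -c 1 "$env_file")" ]; then
        printf '\n' >> "$env_file"
    fi
    # -x matches whole lines, -F treats the line as a fixed string.
    grep -qxF -- "$line" "$env_file" || printf '%s\n' "$line" >> "$env_file"
}

# Example call (the path is an assumption, check your own preferences folder):
# add_houdini_env "$HOME/houdini19.5/houdini.env" \
#     "HOUDINI_OCL_FEATURE_DISABLE=CL_DEVICE_DEVICE_ENQUEUE_SUPPORT"
```

In the Windows Command Line Tools window the temporary equivalent would be set HOUDINI_OCL_FEATURE_DISABLE=CL_DEVICE_DEVICE_ENQUEUE_SUPPORT, followed by launching Houdini from that same window.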
Member
8 posts
Joined: July 2017
Offline
Cool thanks that seems to have worked for me too!
Member
42 posts
Joined: Oct. 2018
Offline
I have to say, SideFX, you have the best rocking client support in the entire freaking world. It doesn't matter if I am a lonely guy or backed up by a powerhouse studio, you always listen and take action on it. ABSOLUTELY PRICELESS.

Thank you SideFX Staffers !
Member
580 posts
Joined: Aug. 2014
Offline
I think I have a similar problem, and I've experienced it the hard way, on a scene I've been working on for some time now. Several days ago I upgraded nvidia-drivers to version 520.56.06, and this might have triggered the issue. The previous driver version was 510.108.03-1, I believe, and OpenCL was working just fine.

Since the upgrade, Attribute Blur SOP stopped cooking with the following error:
OpenCL Exception: clBuildProgram (-11)

What errors out inside attribblur's subnet is of course the OpenCL SOP.

I already tried @jsmack's suggestion of running Houdini with the HOUDINI_OCL_FEATURE_DISABLE variable set, but sadly it didn't help. Neither did upgrading Houdini from production build 19.5.435 to 19.5.493.

I have an RTX 3070.

▶ apt list | grep nvidia | grep installed

firmware-nvidia-gsp/testing,now 520.56.06-2 amd64 [installed,automatic]
glx-alternative-nvidia/testing,now 1.2.2 amd64 [installed,automatic]
libegl-nvidia0/testing,now 520.56.06-2 amd64 [installed,automatic]
libegl-nvidia0/testing,now 520.56.06-2 i386 [installed,automatic]
libgl1-nvidia-glvnd-glx/testing,now 520.56.06-2 amd64 [installed,automatic]
libgl1-nvidia-glvnd-glx/testing,now 520.56.06-2 i386 [installed,automatic]
libgles-nvidia1/testing,now 520.56.06-2 amd64 [installed,automatic]
libgles-nvidia1/testing,now 520.56.06-2 i386 [installed,automatic]
libgles-nvidia2/testing,now 520.56.06-2 amd64 [installed,automatic]
libgles-nvidia2/testing,now 520.56.06-2 i386 [installed,automatic]
libglx-nvidia0/testing,now 520.56.06-2 amd64 [installed,automatic]
libglx-nvidia0/testing,now 520.56.06-2 i386 [installed,automatic]
libnvidia-allocator1/testing,now 520.56.06-2 amd64 [installed,automatic]
libnvidia-allocator1/testing,now 520.56.06-2 i386 [installed,automatic]
libnvidia-cfg1/testing,now 520.56.06-2 amd64 [installed,automatic]
libnvidia-compiler/testing,now 520.56.06-2 amd64 [installed,automatic]
libnvidia-egl-gbm1/testing,now 1.1.0-2 amd64 [installed,automatic]
libnvidia-egl-gbm1/testing,now 1.1.0-2 i386 [installed,automatic]
libnvidia-egl-wayland1/testing,now 1:1.1.10-1 amd64 [installed,automatic]
libnvidia-eglcore/testing,now 520.56.06-2 amd64 [installed,automatic]
libnvidia-eglcore/testing,now 520.56.06-2 i386 [installed,automatic]
libnvidia-encode1/testing,now 520.56.06-2 amd64 [installed,automatic]
libnvidia-encode1/testing,now 520.56.06-2 i386 [installed,automatic]
libnvidia-glcore/testing,now 520.56.06-2 amd64 [installed,automatic]
libnvidia-glcore/testing,now 520.56.06-2 i386 [installed,automatic]
libnvidia-glvkspirv/testing,now 520.56.06-2 amd64 [installed,automatic]
libnvidia-glvkspirv/testing,now 520.56.06-2 i386 [installed,automatic]
libnvidia-ml1/testing,now 520.56.06-2 amd64 [installed,automatic]
libnvidia-ptxjitcompiler1/testing,now 520.56.06-2 amd64 [installed,automatic]
libnvidia-ptxjitcompiler1/testing,now 520.56.06-2 i386 [installed,automatic]
libnvidia-rtcore/testing,now 520.56.06-2 amd64 [installed,automatic]
nvidia-alternative/testing,now 520.56.06-2 amd64 [installed,automatic]
nvidia-driver-bin/testing,now 520.56.06-2 amd64 [installed,automatic]
nvidia-driver-libs/testing,now 520.56.06-2 amd64 [installed,automatic]
nvidia-driver-libs/testing,now 520.56.06-2 i386 [installed,automatic]
nvidia-driver/testing,now 520.56.06-2 amd64 [installed]
nvidia-egl-common/testing,now 520.56.06-2 amd64 [installed,automatic]
nvidia-egl-icd/testing,now 520.56.06-2 amd64 [installed,automatic]
nvidia-egl-icd/testing,now 520.56.06-2 i386 [installed,automatic]
nvidia-installer-cleanup/testing,now 20220217+2 amd64 [installed,automatic]
nvidia-kernel-common/testing,now 20220217+2 amd64 [installed,automatic]
nvidia-kernel-dkms/testing,now 520.56.06-2 amd64 [installed,automatic]
nvidia-kernel-support/testing,now 520.56.06-2 amd64 [installed,automatic]
nvidia-legacy-check/testing,now 520.56.06-2 amd64 [installed,automatic]
nvidia-modprobe/testing,now 525.78.01-1 amd64 [installed,automatic]
nvidia-opencl-common/testing,now 520.56.06-2 amd64 [installed,automatic]
nvidia-opencl-icd/testing,now 520.56.06-2 amd64 [installed]
nvidia-persistenced/testing,now 520.56.06-1 amd64 [installed,automatic]
nvidia-settings/testing,now 520.56.06-1 amd64 [installed,automatic]
nvidia-smi/testing,now 520.56.06-2 amd64 [installed,automatic]
nvidia-support/testing,now 20220217+2 amd64 [installed,automatic]
nvidia-vdpau-driver/testing,now 520.56.06-2 amd64 [installed,automatic]
nvidia-vulkan-common/testing,now 520.56.06-2 amd64 [installed,automatic]
nvidia-vulkan-icd/testing,now 520.56.06-2 amd64 [installed,automatic]
nvidia-vulkan-icd/testing,now 520.56.06-2 i386 [installed,automatic]
xserver-xorg-video-nvidia/testing,now 520.56.06-2 amd64 [installed,automatic]

▶ apt list | grep opencl | grep installed

nvidia-opencl-common/testing,now 520.56.06-2 amd64 [installed,automatic]
nvidia-opencl-icd/testing,now 520.56.06-2 amd64 [installed]
ocl-icd-libopencl1/testing,now 2.3.1-1 amd64 [installed,automatic]
ocl-icd-libopencl1/testing,now 2.3.1-1 i386 [installed,automatic]
Edited by ajz3d - Feb. 11, 2023 14:26:30
Member
580 posts
Joined: Aug. 2014
Offline
Success!

Thanks to Krzysztof Marczak [bugs.debian.org] I found a workaround for the time being. I'll post it here, verbatim:

I have also tested it with libnvidia-nvvm4 installed. I got exactly the same 
result.
It could be a problem with wrong the path for libnvidia-nvvm.so.4. After
installation of the package it is located here:

/usr/lib/x86_64-linux-gnu/nvidia/current/libnvidia-nvvm.so.4
/usr/lib/x86_64-linux-gnu/nvidia/current/libnvidia-nvvm.so.525.85.12

But the nvidia compiller is looking of the library here:

/lib/x86_64-linux-gnu/libnvidia-nvvm.so.525.85.12
/usr/lib/x86_64-linux-gnu/libnvidia-nvvm.so.525.85.12
/lib/libnvidia-nvvm.so.525.85.12
/usr/lib/libnvidia-nvvm.so.525.85.12

When in created symlinks to the files libnvidia-nvvm.so.4 and
libnvidia-nvvm.so.525.85.12 in /usr/lib/x86_64-linux-gnu/ the OpenCL compiler
started to work properly.

So installation of libnvidia-nvvm4 and creating symlinks is actual
workaround.

This also fixes the problem with Vellum solver, and possibly other issues with OpenCL.
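
A rough shell sketch of the symlink step, assuming the directories quoted above. The function name link_nvvm is hypothetical, the real call needs root, and the library version in the filenames will differ between driver packages, which is why the sketch globs for libnvidia-nvvm.so.* instead of hard-coding 525.85.12.

```shell
# link_nvvm is a hypothetical helper: it symlinks every libnvidia-nvvm.so.*
# found in src_dir into dst_dir, and never overwrites an existing entry.
link_nvvm() {
    src_dir="$1"
    dst_dir="$2"
    for lib in "$src_dir"/libnvidia-nvvm.so.*; do
        # Skip the unexpanded pattern when nothing matches.
        [ -e "$lib" ] || continue
        target="$dst_dir/$(basename "$lib")"
        if [ -e "$target" ] || [ -L "$target" ]; then
            echo "skipping existing $target"
        else
            ln -s "$lib" "$target"
            echo "linked $target -> $lib"
        fi
    done
}

# Example call as root, using the paths from the bug report:
# link_nvvm /usr/lib/x86_64-linux-gnu/nvidia/current /usr/lib/x86_64-linux-gnu
```

Because the helper only ever creates symlinks, undoing the workaround later is a matter of deleting those links from the destination directory.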

Please note that Andreas Beckmann, one of Debian's nvidia-driver package maintainers, advises removing those symlinks [bugs.debian.org] once the problem is fixed in the next package version.

Cheers.

PS. Yet another note. On Debian Testing the current version of nvidia-driver is 520.56.06-2, not 525.85.12. Marczak and Beckmann are working on packages from Sid. Filenames in symlinks need to be adjusted appropriately.
Edited by ajz3d - Feb. 15, 2023 14:59:05
Staff
823 posts
Joined: July 2006
Offline
The Houdini 19.5 Production Build of 19.5.534 includes the following change which should resolve the issue with Vellum pressure constraints and the 4090. This fix has also been backported to the daily 19.0 build.

Rewrote the Pressure constraint in the Vellum Solver to make it deterministic on GPUs at high constraint counts, as well as to avoid hanging that was occurring with the latest Ada architecture GPUs from NVIDIA (e.g. 4090).
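
To check whether an installed build already contains this change, the build number can be compared against 19.5.534. The sketch below is only an illustration: version_ge is a hypothetical helper, it relies on sort -V (available in GNU coreutils), and the installed version string has to be supplied by hand.

```shell
# version_ge A B succeeds when version A is greater than or equal to version B.
# It sorts the two strings as versions and checks that B comes first.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

installed="19.5.493"   # assumption: replace with your own build number
if version_ge "$installed" "19.5.534"; then
    echo "Build $installed should include the Vellum pressure constraint rewrite."
else
    echo "Build $installed predates 19.5.534; the environment variable workaround may still be needed."
fi
```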
Member
2 posts
Joined: May 2023
Offline
I got rid of the problem by updating the NVIDIA driver from Device Manager [thegeekpage.com].