[OpenCL] Temporary NanoVDB Buffers?

ZephirFX (Member, 28 posts, joined April 2017)
Hey folks, I'm currently working on some OpenCL snippets and binding NanoVDBs with the new OpenCL syntax (really cool, by the way).
I often come across GPU race conditions when reading from and writing to the same field. For example:
#bind vdb &vel float3

@KERNEL 
{
   float3 advected_vel = advect(@vel);

   @vel.set(advected_vel);
}

This causes artefacts because of race conditions: a work item can read a voxel that another work item has already overwritten. The solution is simply to create a copy of the vel field beforehand and do the following:
#bind vdb &vel float3
#bind vdb &temp_vel float3

@KERNEL 
{
   float3 advected_vel = advect(@temp_vel);

   @vel.set(advected_vel);
}
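
To make the race concrete outside of Houdini, here is a minimal CPU sketch in Python rather than OpenCL. The function names are hypothetical, and the sequential loop only stands in for one possible ordering of GPU work items (on a real GPU the order is not defined). It shows that reading and writing the same buffer smears values, while reading from a copy does not.

```python
def advect_in_place(vel):
    """Shift values one cell to the right, reading and writing the same list."""
    for i in range(1, len(vel)):
        # vel[i - 1] may already have been overwritten earlier in this pass.
        vel[i] = vel[i - 1]
    return vel


def advect_from_copy(vel):
    """Same shift, but every read comes from an untouched snapshot."""
    temp_vel = list(vel)  # plays the role of the temp_vel field
    for i in range(1, len(vel)):
        vel[i] = temp_vel[i - 1]
    return vel


field = [1.0, 2.0, 3.0, 4.0]
in_place = advect_in_place(list(field))
buffered = advect_from_copy(list(field))
print(in_place)   # the first value has smeared across the whole field
print(buffered)   # every value moved exactly one cell
```

The second version is the CPU analogue of binding both vel and temp_vel: the kernel only reads from one and only writes to the other.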

Is there a way to create a copy directly in the kernel, like the following?
#bind vdb &vel float3

@KERNEL 
{
   auto temp_vel = copy(vel);
   float3 advected_vel = advect(temp_vel);

   @vel.set(advected_vel);
}

Those are dummy implementations, but I hope my problem is understandable.
Being able to create a copy would remove the overhead of the Name / Blast / Merge steps needed to create a temp copy of a field.
The point here is to reduce GPU <-> CPU transfers as much as possible.

Thanks !
Enzo C.
https://www.linkedin.com/in/enzocrema/ [www.linkedin.com]

PHENOMDESIGN (Member, 166 posts, joined May 2021)
I believe you need to add a temporary attribute beforehand and a write-back kernel to your current OpenCL setup, because the memory is fixed once the data has been sent to OpenCL on the GPU. Jeff Lait gave a great talk on this right after the switch to the new syntax:

Check right around the 32:30 mark and watch through for the debugging part.
New features of OpenCL in Houdini20 [www.youtube.com]
Edited by PHENOMDESIGN - July 18, 2024 20:47:53
PHENOM(enological) DESIGN;
Experimental phenomenology (study of experience) is a category of philosophy evidencing intentional variations of subjective human experiencing where both the independent and dependent variable are phenomenological. Lundh 2020

ZephirFX (Member, 28 posts, joined April 2017)
Okay, so there is no way to allocate a temp field directly in the code? I watched the video and that's already what I have in my code, but I was wondering whether that temp field was avoidable.

PHENOMDESIGN (Member, 166 posts, joined May 2021)
Once the data is on the GPU, the kernel cannot allocate anything new for it.
So you need to pre-allocate the memory for the temporary buffer as another field.
Then declare that temp field in the code so the GPU knows where in memory to hold the values.

It is just an attribute node before. Not too much extra...

Other than that, there is no "space" for OpenCL to put things without the field.
Edited by PHENOMDESIGN - July 19, 2024 11:43:09

ZephirFX (Member, 28 posts, joined April 2017)
Okay got it.

PHENOMDESIGN
It is just an attribute node before. Not too much extra...

Well, that's my current setup, and it is also the bottleneck: copying heavy VDBs for the calculation and then deleting them is "slow", around 20 ms/frame for 2M active voxels.
I guess the next step for performance is to use the HDK to avoid excessive data transfer.

PHENOMDESIGN (Member, 166 posts, joined May 2021)
I do not think there is any copying. There should be: your data, which is read, and then an empty field that the results are written to. What kind of calculations are you doing? That can have a large impact. Also the precision etc.

You are performing physics calculations at around 20 ms/frame? That rate is impressive for 2M active voxels.
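
The "one field that is read, one field that is written" layout is often run as a ping-pong scheme, where the two buffers swap roles every step so nothing is copied. The following Python sketch illustrates that generic pattern under stated assumptions: `step` and `simulate` are hypothetical names, the 3-point average is just a stand-in for a real kernel, and the thread does not confirm whether the OpenCL SOP allows swapping bindings this way.

```python
def step(src, dst):
    """Write a 3-point average of src into dst; src is never modified."""
    n = len(src)
    for i in range(n):
        left = src[max(i - 1, 0)]        # clamp at the left border
        right = src[min(i + 1, n - 1)]   # clamp at the right border
        dst[i] = (left + src[i] + right) / 3.0


def simulate(initial, steps):
    read_buf = list(initial)
    write_buf = [0.0] * len(initial)     # allocated once, reused every step
    for _ in range(steps):
        step(read_buf, write_buf)
        # Swap roles instead of copying: last step's output is the next input.
        read_buf, write_buf = write_buf, read_buf
    return read_buf


start = [0.0, 0.0, 3.0, 0.0, 0.0]
result = simulate(start, steps=1)
print(result)
```

Because every cell of the write buffer is overwritten each step, it never needs to be cleared or re-created between steps.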

ZephirFX (Member, 28 posts, joined April 2017)
PHENOMDESIGN
There should be: your data, which is read, and then an empty field that the results are written to.

That makes sense. I think I was approaching the problem the wrong way by copying the existing field, performing calculations on the temp field, then overwriting the pre-existing one.



I'm working on a fluid solver. The picture shows one frame with 3M active voxels; as you can see, the VDB nodes take 57 ms of the 400 ms total, which is the bottleneck (let's not talk about the VEX code taking 90 ms).
Edited by ZephirFX - July 19, 2024 15:52:04

Attachments:
Capture.PNG (30.4 KB)

ZephirFX (Member, 28 posts, joined April 2017)
After running some tests, it's way slower to init a field to 0 and write to it than to overwrite an existing field :/
Edited by ZephirFX - July 19, 2024 16:18:44

PHENOMDESIGN (Member, 166 posts, joined May 2021)
Are there multiple OpenCL SOPs, and is this your node graph as a tree? Or is this the performance trace of draw calls? I do not use this view, so I don't have a good understanding of the nodes from this.

Essentially, you want to keep the OpenCL nodes together sequentially and start using compile blocks, with the end block set to multithread when compiled.

Are you using any GAS Microsolvers yet?
https://www.youtube.com/watch?v=d3DZt1prjzI [www.youtube.com]
Edited by PHENOMDESIGN - July 19, 2024 18:19:28

ZephirFX (Member, 28 posts, joined April 2017)
Yes, there are 5 OpenCL SOPs. I ran tests using compile blocks too, but it didn't really improve performance.
For a better overview:


🟥 velocity advection
🟨 project non divergent
🟩 density advection
🟦 output

The only nodes besides the OpenCL ones are Name / Blast / Merge, used to create the temp field as efficiently as possible.

PHENOMDESIGN
Are you using any GAS Microsolvers yet?

Nope, my project is about running the simulation at the SOP level using OpenCL and NanoVDB, which is not possible in microsolvers.
Edited by ZephirFX - July 19, 2024 18:22:37

Attachments:
image (1).png (51.1 KB)

PHENOMDESIGN (Member, 166 posts, joined May 2021)
ZephirFX
Nope, my project is about running the simulation at the SOP level using OpenCL and NanoVDB, which is not possible in microsolvers.

Going to OpenCL with a NanoVDB kernel that you write yourself may not be as optimal as using the SIMD-optimized OpenVDB nodes. It doesn't "have" to be NanoVDB on the GPU to be fast; OpenVDB is a highly optimized structure.

That is possible in DOPs. It is essentially what DOPs are: Gas OpenCL, etc.
How scientifically accurate do you need to be?

The image is too small; I cannot see it.
Edited by PHENOMDESIGN - July 19, 2024 18:59:23

ZephirFX (Member, 28 posts, joined April 2017)
PHENOMDESIGN
ZephirFX
Nope, my project is about running the simulation at the SOP level using OpenCL and NanoVDB, which is not possible in microsolvers.
That is possible in DOPs. It is essentially what DOPs are: Gas OpenCL, etc.

Microsolvers don't use VDBs but Houdini native volumes, I believe.
Also, "sparsity" in DOPs is just computed using a Gas Occupancy Mask, which is just a grid of 16^3 blocks around your fields (density most of the time).

Enabling OpenCL on any node disables sparsity. Even if I wanted to go this route and use the active field as a compute mask in my OpenCL nodes, I'd need to check every voxel to see whether it's within the active field, which creates a lot of overhead.
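
To illustrate the block idea only, here is a small Python sketch of a 16^3 occupancy mask. It is an assumption-laden toy, not how Gas Occupancy Mask is implemented (the thread does not describe its internals), and `block_of`, `occupancy_mask`, and `is_in_active_block` are hypothetical names. Each active voxel marks the block that contains it, and a later query is a single lookup per voxel.

```python
BLOCK = 16  # block edge length in voxels, matching the 16^3 blocks above


def block_of(voxel):
    """Map integer voxel coordinates to the coordinates of their block."""
    x, y, z = voxel
    return (x // BLOCK, y // BLOCK, z // BLOCK)


def occupancy_mask(active_voxels):
    """Collect the set of blocks that contain at least one active voxel."""
    return {block_of(v) for v in active_voxels}


def is_in_active_block(voxel, mask):
    """Coarse test: is this voxel inside any occupied block?"""
    return block_of(voxel) in mask


active = [(0, 0, 0), (15, 15, 15), (16, 0, 0), (40, 3, 70)]
mask = occupancy_mask(active)
print(sorted(mask))
```

The per-voxel check described above is the `is_in_active_block` call: cheap on its own, but it still has to run for every voxel of a dense field, which is where the overhead comes from.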

I've optimized the solver many times, created my own from microsolvers, etc., and never got as much performance as running the sim in SOPs with straight VDBs and OpenCL, but it's still not enough lol. I can run a sim with 150M voxels for the density field alone in 1 sec, which is already way above the performance of DOPs.

Edit:
Compared to the native OpenVDB nodes, I'm 40x faster for the Project Non Divergent and 20x faster for the advection.
I'm not trying to be scientifically accurate, but it needs to look accurate enough not to be odd ahah
Edited by ZephirFX - July 19, 2024 19:11:21

PHENOMDESIGN (Member, 166 posts, joined May 2021)
I would look at the documentation again. If that does not work well, then investigate a frequency-domain spectral solver (FFT or wavelet approach) for the algorithm and transform back to the fluid domain.

OHHH I forgot, there is also the Linear Solver SOP that you can use to factorize the simulation and find your sparsity etc.

ZephirFX
Microsolvers don't use VDBs but Houdini native volumes, I believe.

https://www.sidefx.com/docs/houdini/nodes/dop/gasopenclmergevdb.html [www.sidefx.com]

Gas OpenCL Merge VDB dynamics node
Uses OpenCL to import VDB data from source geometry into simulation fields.

Gas OpenCL Merge VDB node imports VDB data into DOP fields. Its functionality is very similar to the volume sourcing features of the Volume Source DOP (when Source Type is set to Individual Volumes) and Volume Instance Source DOP (when Source Type is set to Packed Sets). The main difference is that the Gas OpenCL Merge VDB node leverages NanoVDB to perform the merging operations in OpenCL.

https://www.sidefx.com/docs/houdini/nodes/dop/gasopencl.html [www.sidefx.com]
This DOP provides a general interface to creating and running OpenCL kernels using a variable number of parameters. It also provides users with a way to automatically generate kernel headers from their list of parameters.


Volume

Bind a volume.

VDB

Bind a VDB.

Precision
Controls the precision the data of this parameter is bound with. The Node option will use the node’s precision, so will vary depending on its setting and the corresponding kernel code should use the fpreal or exint defines.

This is the precision the data is stored on the video card so using lower precision can save GPU memory. But note that 16-bit, which corresponds to half, often cannot be used for computation. The vload_half can be used to promote it to float for computation.

If the same attribute ends up bound with different precisions it will fail the binding.

Currently volumes only bind with 32bit data precision.

Readable
Determines if the OpenCL kernel will read from this attribute. If not set, the attributes values will not be copied onto the GPU. This is useful for write-only attributes as it avoids an unnecessary copy, but requires care as uninitialized data will be present.

Present for Attributes.

Writeable
Determines if the OpenCL kernel will write back to this attribute or field. Causes the CPU version of the attribute or field to be marked out of date so the next time it is needed it will be copied back from the GPU.

Present for Fields and Attributes.

Optional
Marks the attribute as not necessary. If the attribute isn’t present in the geometry, rather than erroring, a #define is set in the kernel options to disable the attribute. Note that this also changes the parameter signature, so the Generate Code button should be used to verify the syntax.

Note

The parameter name is used in the #define, so changing the parameter name requires changing the code.

Present for Attributes, Volumes, VDBs, and Options.

Default Value
Marks that if an optional attribute or volume is missing that a parameter value should still be bound to the kernel. A #define is set in the kernel options to disable the attribute and switch to the single value. Note that this also changes the parameter signature, so the Generate Code button should be used to verify the syntax.

The value of the bound parameter will be taken from the integer or float value of this parameter.

Ramp Size
The number of floating point values to evaluate the ramp in.
Edited by PHENOMDESIGN - July 19, 2024 19:40:26

ZephirFX (Member, 28 posts, joined April 2017)
Well, yes, it works for putting VDB data into DOPs, but that node just uses the VDB to init the corresponding dense field. You can't perform calculations on VDBs alone in DOPs. You can bind one to an OpenCL node, but then it will be converted back and forth.


Edited by ZephirFX - July 19, 2024 19:39:53

Attachments:
Capture.PNG (99.3 KB)
Captuaaaaaaaaae.PNG (770.8 KB)

ZephirFX (Member, 28 posts, joined April 2017)
PHENOMDESIGN
OHHH I forgot, there is also the Linear Solver SOP that you can use to factorize the simulation and find your sparsity etc.
yep good point, I absolutely don't understand this node tho

It seems highly specific, and I think making this performant would require learning some really advanced linear algebra :/
Edited by ZephirFX - July 19, 2024 19:48:55

PHENOMDESIGN (Member, 166 posts, joined May 2021)
I started learning physics in Houdini but do not write my solvers in there now.

For now, I use Julia for physics stuff because it has the most powerful math libraries for this when I need accuracy and literacy. Even though I have been trained as a designer and artist, I want "literacy" in physics because it is like a paint-brush in higher-dimensional space. The current tools in Houdini are too obtuse or black-box to really gain "literacy" from Houdini alone.

This channel provides a great explanation of Julia and the spectral approach.
3D Pseudo-Spectral Navier-Stokes Solver in Julia [www.youtube.com]

I plan to compile and use my libraries in Houdini as I have a ton of theoretical physics to test!

What Houdini does have and is probably THE MOST important thing for "understanding" physics is the visualization and parametric space exploration that provides a mental material for mental simulations. Like I can see the physics in my mind after using Houdini

ZephirFX
learning some really advanced linear algebra :/

Yes...that is what I have been researching at Grad School for Physics-based Neural Surrogate in Real-time Science-backed Co-design. It empowers "Futures" practices with scientifically accurate feedback to speed up the "front-end" of the design process with design constraints that can address pressing socio-ecological issues the world faces.
Edited by PHENOMDESIGN - July 19, 2024 21:53:26

PHENOMDESIGN (Member, 166 posts, joined May 2021)
Vellum Fluids is looking pretty good in Christopher Rutledge's demo:

how to use the SUPER FAST fluid and grains in Houdini 20.5 [www.youtube.com]

ZephirFX (Member, 28 posts, joined April 2017)
PHENOMDESIGN
Vellum Fluids is looking pretty good in Christopher Rutledge's demo:

how to use the SUPER FAST fluid and grains in Houdini 20.5 [www.youtube.com]
I mean for the amount of particles used this is extremely slow :/

PHENOMDESIGN
This channel provides a great explanation of Julia and the spectral approach.
3D Pseudo-Spectral Navier-Stokes Solver in Julia

This looks interesting.

The thing is, yes, Houdini gives you that visual feedback, which is needed when working on physics / graphics stuff. But when building performance-critical apps, Houdini is not powerful enough; you'll then need to work with the HDK.
For example, every realtime-ish fluid solver is based on the "Stable Fluids" algorithm by Jos Stam [www.researchgate.net], which is sadly not possible in Houdini due to poor performance and transfer overhead when working on VDBs or native volumes.
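
For readers who have not seen it, the advection step of Stable Fluids is semi-Lagrangian: each cell traces backward along the velocity and samples the old field there. Below is a minimal 1D Python sketch of that step only, assuming unit cell spacing, clamped borders, and linear interpolation; the `advect` name and signature are hypothetical, and the full method also includes diffusion and projection steps that are not shown.

```python
import math


def advect(field, velocity, dt):
    """Semi-Lagrangian advection on a 1D grid with unit cell spacing."""
    n = len(field)
    out = [0.0] * n                      # separate output, field is only read
    for i in range(n):
        x = i - dt * velocity[i]         # trace backward along the velocity
        x = min(max(x, 0.0), n - 1.0)    # clamp the sample point to the grid
        i0 = int(math.floor(x))
        i1 = min(i0 + 1, n - 1)
        t = x - i0                       # interpolation weight in [0, 1)
        out[i] = (1.0 - t) * field[i0] + t * field[i1]
    return out


density = [0.0, 1.0, 0.0, 0.0]
moved = advect(density, [1.0] * 4, dt=1.0)
half = advect(density, [1.0] * 4, dt=0.5)
print(moved)
print(half)
```

Note that the step reads from `field` and writes to `out`, which is the same read-buffer / write-buffer split discussed earlier in the thread.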

PHENOMDESIGN (Member, 166 posts, joined May 2021)
Are you willing to use Python and outside libraries?

There is also inlinecpp, from which you can reference the NanoVDB headers or a compiled Julia library. There will have to be some clever decomposition, abstraction, and/or multi-res approach.

Ok I guess my question is how large is the simulation domain and do you have reference for the type of effect you are going for? What are the colliders? Are we talking about the ocean, a river, air over an airplane etc?

How do you define performance? Are you staying in Houdini or deploying this elsewhere?

Here is the 3D Stable Fluids with FFT from the same channel:
https://www.youtube.com/watch?v=bvPi6XwdM0U&t=446s [www.youtube.com]

This may need to be a shader?
Edited by PHENOMDESIGN - July 20, 2024 10:34:24

PHENOMDESIGN (Member, 166 posts, joined May 2021)
VDB is moving toward neural representations as per the last update
OpenVDB - Ken Museth, NVIDIA & Jeff Lait, SideFX [www.youtube.com]

It is not open-sourced yet, but here are some insights:

The documentation for this is here and you could do this in Houdini.
Omni.VDB.NeuralVDB Extension [docs.omniverse.nvidia.com]

NeuralVDB: High-resolution Sparse Volume Representation using Hierarchical Neural Networks [arxiv.org]
It would then be a matter of using the "Mixture of Experts" they have for the physics as well.
Edited by PHENOMDESIGN - July 20, 2024 20:13:24