Since | 12.5 |
This DOP provides a general interface for creating and running OpenCL kernels with a variable number of parameters. It also provides a way to automatically generate kernel headers from the list of parameters.
Warning
This node requires that you understand OpenCL. It is very easy to write incorrect code using this node.
Syntax ¶
See OpenCL for VEX users for basic information on the syntax available.
Parameters ¶
Kernel ¶
Kernel Name
The name of the OpenCL kernel to execute with the loaded program.
Use Code Snippet
Use the code provided in the Kernel Code parameter rather than an external disk file. This makes for quicker editing and creation of OpenCL microsolvers.
Kernel File
The path to the OpenCL program file to compile. This can be a path to a file on disk or to an asset.
Enable @-Binding
In Code Snippet mode, enables the use of @-prefixed macros, which provide a simpler way to generate kernels and manipulate geometry in kernels.
Kernel Options
Specify any desired compile flags for the kernel. The most common is to use -D to provide #define directives for the pre-processor.
Note
The Apple OSX OpenCL compiler requires only a single space between kernel options!
Houdini defines additional flags while compiling kernels, depending on the OpenCL device. The flags __H_GPU__ or __H_CPU__ distinguish between GPU and CPU devices, and __H_NVIDIA__, __H_AMD__, __H_INTEL__, or __H_APPLE__ signify the hardware vendor. You can set the environment variable HOUDINI_OCL_REPORT_BUILD_LOGS to 1 before running Houdini to get a dump of all kernels compiled along with their preprocessor flags.
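As a rough illustration of how the device flags can drive conditional compilation, the sketch below uses plain C in place of OpenCL C. Only the __H_GPU__ and __H_CPU__ names come from this page; the TILE_SIZE macro, its values, and the tile_count helper are hypothetical.

```c
#include <assert.h>

/* Hypothetical tuning constant chosen from the device flags that
   Houdini defines at compile time. The values are illustrative only. */
#if defined(__H_GPU__)
#define TILE_SIZE 64   /* assumed: wider tiles on a GPU device */
#elif defined(__H_CPU__)
#define TILE_SIZE 8    /* assumed: narrower tiles on a CPU device */
#else
#define TILE_SIZE 16   /* fallback when neither flag is defined */
#endif

/* Number of tiles needed to cover n elements, rounding up. */
static int tile_count(int n)
{
    assert(n >= 0);
    return (n + TILE_SIZE - 1) / TILE_SIZE;
}
```

Compiled outside Houdini, neither flag is defined, so the fallback branch is the one that applies.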
Option Attribute
Specify a detail string attribute to be added as a compile flag to the kernel. The detail attribute of this name is read from the input geometry. If it is a string, it will be injected as kernel options. The string should include the -D option if specifying a define, for example.
Note
The value of the string should not change frequently or the kernel may keep recompiling, which can be very expensive.
Recompile Kernel
When loading kernels from disk the kernel is cached to avoid regenerating it every solve. Turning this on forces the re-loading and recompiling of the kernel. This is useful if #include files refer to code that has changed, or the kernel file is changed in an external text editor.
It should always be disabled when prototyping is complete.
Options ¶
Run Over
The provided OpenCL kernel is invoked once. The number of global ids, however, is controlled by this setting. First Writeable attribute sets it to the size of the first bound attribute that is marked writeable. All fields sets it to the total voxels of the fields.
The Worksets method will use a specified detail attribute to specify a list of begin values and length values. The kernel will be invoked once per non-zero length. The global id will vary from 0 to the length-1 on each invocation, and the begin value can be used to find an offset inside your bound attributes.
Note
The global ids will be rounded up to ensure efficient processing on the GPU, so you should always compare get_global_id(0) with the actual length of the bound attribute.
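The bounds check described in the note can be sketched in plain C, with a loop standing in for the OpenCL launch. The group_size argument, the rounding helper, and the scale_kernel body are all hypothetical; the actual padding applied by the node is not specified here.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next multiple of group_size, the way a launch size
   is typically padded. group_size is a hypothetical workgroup size. */
static size_t padded_global_size(size_t n, size_t group_size)
{
    assert(group_size > 0);
    return ((n + group_size - 1) / group_size) * group_size;
}

/* Stand-in for a kernel body: gid plays the role of get_global_id(0). */
static void scale_kernel(size_t gid, size_t length, float *values, float s)
{
    if (gid >= length)   /* padded ids past the end must do nothing */
        return;
    values[gid] *= s;
}

/* Host-side emulation of one launch over every padded id. */
static void launch_scale(float *values, size_t length, size_t group_size, float s)
{
    size_t global = padded_global_size(length, group_size);
    for (size_t gid = 0; gid < global; ++gid)
        scale_kernel(gid, length, values, s);
}
```

Without the early return, the ids between length and the padded size would read and write past the end of the bound attribute.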
Force Align
Force the alignment of the specified fields to match the output grid. When this option is selected, each grid has its values interpolated to match the alignment of the output grid, which allows the kernel to execute independently of field alignment.
Include Origin
Include the origin of the input/output grids.
Include Size
Include the size of the input/output grids.
Include Voxel Size
Include the size of the voxels.
Flush Attributes
After writing to attributes, the new values are left on the GPU until another solver requests the geometry attributes. This lets the attributes stay there and provides the most efficiency. Turning on flush attributes forces them to be copied back from the GPU into geometry memory explicitly. This should not be required.
Finish Kernels
When Finish Kernels is disabled, no attempt is made to wait for the OpenCL kernels to complete before continuing to the next solver. This lets them run in the background until their results are actually needed. To simplify debugging, it is useful to ensure kernels are finished so that errors are detected in the right spot.
Warn on Missing
If a bound attribute isn’t present, and is not marked optional, no computation will be done. Normally this will generate a warning as it is unexpected. However, sometimes the correct action is to silently do nothing, so turning off this flag will avoid spurious warnings.
Error on Mismatched Points
If any bound geometry has different point counts, raise an error and do not run the kernel. This allows the kernel code to assume that all the bound geometry has matching point counts, avoiding out of bound checks.
Error on Mismatched Primitives
If any bound geometry has different primitive counts, raise an error and do not run the kernel. This allows the kernel code to assume that all the bound geometry has matching primitive counts, avoiding out of bound checks.
Include Time
Include the current simulation time as a parameter.
Include Simulation Frame
Include the current simulation frame as a parameter.
Include Timestep
Include the current timestep as a parameter. This is useful because, if the OpenCL node is triggered from a Gas Substep, the value may be less than the full timestep.
Time Scale
For some operations you may wish to know the power of the timestep. Rather than recomputing it in the kernel, you can set this to e^Timestep and have the exponentiation pre-computed.
Include Simplex Noise Data
Include an opaque pointer that can be passed to the simplex noise functions in <xnoise.h> to generate simplex noise and curlnoise from OpenCL kernels.
Precision
Controls the precision of this node. The fpreal and exint types will be defined in the generated code to correspond with the specified precision. The vector variants will also be defined, i.e., fpreal3, fpreal4, etc. Additionally, the FPREAL_PREC symbol is defined as 16 for half, 32 for float, or 64 for double.
Auto will use the preferred precision of the first writeable geometry as set by the Attribute Cast SOP. In Run Over Field modes, preferred precision is currently always 32.
Note
16-bit cannot be used for computation in most drivers.
Worksets Geometry
Which DOP Geometry data to look for the workset detail attributes on.
Worksets Begin Attr.
An integer array detail attribute storing the start values for each workset.
Worksets Length Attr.
An integer array detail attribute storing the length of each workset. Worksets of zero length will not be invoked.
Note that it is your responsibility to validate that the workset length and begin values provided by the detail attributes are legitimate offsets into the bound attributes. If these values are not under your control, you should validate them before dereferencing.
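One way to check workset begin and length values before they are used as offsets is sketched below in plain C, followed by a loop that emulates the per-workset invocations. The function names and the increment used as a kernel body are hypothetical, and rejecting negative lengths is an assumption rather than documented behavior.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 when every non-empty workset [begin, begin + length) lies
   inside an attribute of nelem elements, otherwise 0. */
static int worksets_valid(const int *begin, const int *length,
                          size_t nsets, size_t nelem)
{
    for (size_t i = 0; i < nsets; ++i) {
        if (length[i] < 0)
            return 0;             /* assumption: negative lengths are invalid */
        if (length[i] == 0)
            continue;             /* zero-length worksets are never invoked */
        if (begin[i] < 0)
            return 0;
        /* Compare against the remaining room to avoid integer overflow. */
        if ((size_t)begin[i] > nelem ||
            (size_t)length[i] > nelem - (size_t)begin[i])
            return 0;
    }
    return 1;
}

/* Emulates the invocations: one pass per non-zero length, with the id
   running from 0 to length-1 and begin used as an offset. */
static void run_worksets(const int *begin, const int *length, size_t nsets,
                         int *attrib, size_t nelem)
{
    assert(worksets_valid(begin, length, nsets, nelem));
    for (size_t i = 0; i < nsets; ++i)
        for (int gid = 0; gid < length[i]; ++gid)
            attrib[begin[i] + gid] += 1;   /* hypothetical kernel body */
}
```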
Use Single Workgroup If Possible
When running over Worksets on a GPU, it can be faster to execute many small worksets on the GPU within one kernel call, performing synchronization within the kernel after each workset, rather than executing a kernel for each separate workset. When this option is enabled, if the largest workset will fit within one workgroup on the OpenCL GPU device, the SINGLE_WORKGROUP preprocessor flag will be defined, and the entire Worksets Begin and Worksets Length arrays will be passed to the kernel. It is up to the kernel to synchronize at the end of each workset, usually using barrier(CLK_GLOBAL_MEM_FENCE). The code generated by the Generate Kernel button shows one way of handling this synchronization.
There are three methods for this. The first will only invoke the single pass if all the worksets will fit in the workgroup. The second will interleave calls to batch up worksets that fall within the valid range. It sets the SINGLE_WORKGROUP_SPANS define in the kernel and also provides the start workoffset to process to the function. The third will always use the single work group approach and set the SINGLE_WORKGROUP_ALWAYS define in the kernel. This means the get_global_id(0) will be smaller than some of the work group sizes, so the kernel is responsible for handling those cases.
The span method usually provides the best efficiency.
Bindings ¶
OpenCL Parameters
The number of extra parameters within the OpenCL kernel.
Each parameter can either be a fixed constant value, evaluated during DOP network traversal, or read/write from a field or geometry attribute.
Parameter Name
The name of the parameter. This is used in the Generate Kernel button, but is otherwise only present as a comment. The actual binding to an OpenCL kernel is done by parameter order, not by the name.
Parameter Type
The type of parameter to create and bind.
Integer
A constant integer value, allowing you to bind channel references and expressions that are pre-computed.
Float
A constant float value. Optionally you can scale it by the timestep.
Float Vec4
A constant tuple of four floats, binding to a float4 OpenCL parameter.
Scalar Field
A floating point valued scalar field. The field name identifies the Scalar Field data that will be bound. The writeable flag controls whether the pointer is marked as const in OpenCL.
Note
If the field is writeable, the next time it is needed by Houdini it will be copied back.
Vector Field
A vector valued field. The field name identifies the Vector Field data that will be bound.
Matrix Field
A matrix valued field. The field name identifies the Matrix Field data that will be bound.
Ramp
A scalar ramp. Because evaluating a spline-based ramp inside of an OpenCL kernel is complex, the ramp is instead sampled into a uniform array of floats. The Ramp Size parameter controls the number of samples used.
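A kernel reading a sampled ramp has to turn a position into an index into that array. The plain C sketch below shows one possible lookup. It assumes the first sample sits at position 0 and the last at position 1, and it interpolates linearly between neighbours; neither detail is confirmed by this page, and ramp_lookup is a hypothetical helper.

```c
#include <assert.h>

/* Looks up a ramp that has been sampled into `size` evenly spaced floats.
   Assumption: sample 0 sits at position 0 and sample size-1 at position 1,
   with linear interpolation between neighbouring samples. */
static float ramp_lookup(const float *samples, int size, float pos)
{
    assert(size > 0);
    if (size == 1 || pos <= 0.0f)
        return samples[0];
    if (pos >= 1.0f)
        return samples[size - 1];

    float scaled = pos * (float)(size - 1);
    int lo = (int)scaled;            /* index of the sample at or below pos */
    if (lo > size - 2)
        lo = size - 2;               /* keep lo + 1 inside the array */
    float t = scaled - (float)lo;    /* fractional distance to the next sample */
    return samples[lo] + t * (samples[lo + 1] - samples[lo]);
}
```

A larger Ramp Size reduces the error introduced by the linear segments at the cost of a bigger array.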
Attribute
Bind a geometry attribute.
Volume
Bind a volume.
VDB
Bind a VDB.
Data Option
Bind option values from the given simulation data.
Field
The name of the DOP data to bind as a field.
Present for Fields.
Geometry
The name of the DOP data to bind as geometry.
Present for Attributes.
Attribute
Which attribute to bind. It is an error if it is missing, unless the optional flag is set.
Present for Attributes.
Class
The type of the attribute. Since the first writeable attribute can determine the iteration order, this can determine the number of global ids processed by the OpenCL solver.
Not all bound attributes need to be the same type, or even come from the same geometry data.
Present for Attributes.
Type
What sort of attribute to bind. Float and integer attributes are bound as single arrays containing all element values in order. Tuples are interleaved, i.e., P will be bound as xyzxyzxyz.
Array attributes are bound as two arrays. One array contains the offsets of each element’s array data. Thus, the difference of a pair of offsets provides the element’s array length. The second array is the data of all elements' arrays concatenated into a single array.
Present for Attributes.
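The interleaved tuple layout and the two-array layout for array attributes can be indexed as in the plain C sketch below. The helper names are hypothetical, and the offsets array is assumed to hold one more entry than there are elements so that the last element also has a closing offset.

```c
#include <assert.h>
#include <stddef.h>

/* Reads component c of element i from an interleaved tuple attribute,
   e.g. P stored as xyzxyzxyz with tuple_size 3. */
static float tuple_component(const float *data, size_t tuple_size,
                             size_t i, size_t c)
{
    assert(c < tuple_size);
    return data[i * tuple_size + c];
}

/* Length of element i's array: the difference of a pair of offsets.
   Assumption: offsets[i + 1] exists for the last element as well. */
static int array_length(const int *offsets, size_t i)
{
    return offsets[i + 1] - offsets[i];
}

/* Sums element i's values out of the concatenated data array. */
static float array_sum(const int *offsets, const float *values, size_t i)
{
    float sum = 0.0f;
    for (int k = offsets[i]; k < offsets[i + 1]; ++k)
        sum += values[k];
    return sum;
}
```

For offsets {0, 2, 2, 5}, element 0 owns two values, element 1 owns none, and element 2 owns the remaining three.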
Size
Tuple size of the attribute to bind. If greater than zero, the attribute must be able to provide this tuple size. If zero, it will bind automatically and an extra parameter will be generated storing the tuple size.
Present for Attributes.
Volume
The name or number of the volume or VDB primitive to bind.
Voxel Resolution
Add the resolution of the volume as a parameter.
Voxel Size
Add the size of the volume as a parameter, in SOP space.
Volume Transform to World
Add a matrix transform that converts from the volume’s voxel coordinates to the SOP coordinates.
Volume Transform to Voxel
Add a matrix transform that converts from SOP coordinates to the volume’s voxel coordinates.
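Applying one of these transforms to a coordinate amounts to a 4x4 matrix multiply. The plain C sketch below assumes row-major storage and a row-vector convention with the translation in the last row; how the node actually lays out the bound matrix is not stated here and should be checked against the generated kernel. The xform_point helper is hypothetical.

```c
#include <assert.h>

/* Transforms a point by a 4x4 matrix stored as 16 floats.
   Assumptions: row-major storage and a row-vector convention (p * M),
   with the translation held in the last row. */
static void xform_point(const float m[16], const float in[3], float out[3])
{
    assert(in != out);   /* in-place use would overwrite inputs too early */
    for (int c = 0; c < 3; ++c)
        out[c] = in[0] * m[0 * 4 + c] + in[1] * m[1 * 4 + c]
               + in[2] * m[2 * 4 + c] + m[3 * 4 + c];
}
```

Under these assumptions, a voxel-to-world matrix built from a uniform scale of 0.5 and a translation of (1, 2, 3) maps voxel coordinate (2, 0, 4) to (2, 2, 5).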
Data Name
Name of the simulation data whose Option values will be sent as kernel arguments.
Option Value Name
Name of the option value on the simulation data to read.
Option Value Type
Binding type for the option value. This controls the type of the argument in the OpenCL kernel.
Note
Float options can be cast and bound as integers and vice versa.
Option Value Size
Tuple size of the bound argument. This must be at least as large as the tuple size of the option value.
Precision
Controls the precision with which the data of this parameter is bound.
The Node option uses the node’s precision, so it will vary depending on that setting, and the corresponding kernel code should use the fpreal or exint defines.
This is the precision at which the data is stored on the video card, so using lower precision can save GPU memory. Note, however, that 16-bit, which corresponds to half, often cannot be used for computation. The vload_half function can be used to promote it to float for computation.
If the same attribute ends up bound with different precisions, the binding will fail.
Currently volumes only bind with 32-bit data precision.
Readable
Determines if the OpenCL kernel will read from this attribute. If not set, the attribute’s values will not be copied onto the GPU. This is useful for write-only attributes as it avoids an unnecessary copy, but requires care as uninitialized data will be present.
Present for Attributes.
Writeable
Determines if the OpenCL kernel will write back to this attribute or field. Causes the CPU version of the attribute or field to be marked out of date so the next time it is needed it will be copied back from the GPU.
Present for Fields and Attributes.
Optional
Marks the attribute as not necessary. If the attribute isn’t present in the geometry, rather than erroring, a #define is set in the kernel options to disable the attribute. Note that this also changes the parameter signature, so the Generate Code button should be used to verify the syntax.
Note
The parameter name is used in the #define, so changing the parameter name requires changing the code.
Present for Attributes, Volumes, VDBs, and Options.
Default Value
Specifies that, if an optional attribute or volume is missing, a parameter value should still be bound to the kernel. A #define is set in the kernel options to disable the attribute and switch to the single value. Note that this also changes the parameter signature, so the Generate Code button should be used to verify the syntax.
The value of the bound parameter will be taken from the integer or float value of this parameter.
Ramp Size
The number of floating point samples at which the ramp is evaluated.
Generated Code ¶
Generate Kernel
When @-Binding is enabled, produces the fully expanded code that is sent to the actual compiler. This can resolve line numbers for errors when compilers do not respect the #line directive, and also helps in understanding how the @-macros work. Note that the exact expansion of @-macros should not be relied upon.
Otherwise, creates a prototype for the required kernel function taking all of your current selected parameters into account. This can be used as a starting point or to update your interface when new parameters are added or removed.
Generated Code
The code snippet with all @-bindings expanded.
Note that this parameter is not used; it is purely informational.
Outputs ¶
First Output
The operation of this output depends on what inputs are connected to this node. If an object stream is input to this node, the output is also an object stream containing the same objects as the input (but with the data from this node attached).
If no object stream is connected to this node, the output is a data output. This data output can be connected to an Apply Data DOP, or connected directly to a data input of another data node, to attach the data from this node to an object or another piece of data.
Locals ¶
channelname
This DOP node defines a local variable for each channel and parameter on the Data Options page, with the same name as the channel. So for example, the node may have channels for Position (positionx, positiony, positionz) and a parameter for an object name (objectname).
Then there will also be local variables with the names positionx, positiony, positionz, and objectname. These variables will evaluate to the previous value for that parameter.
This previous value is always stored as part of the data attached to the object being processed. This is essentially a shortcut for a dopfield expression like:
dopfield($DOPNET, $OBJID, dataName, "Options", 0, channelname)
If the data does not already exist, then a value of zero or an empty string will be returned.
DATACT
This value is the simulation time (see variable ST) at which the current data was created. This value may not be the same as the current simulation time if this node is modifying existing data, rather than creating new data.
DATACF
This value is the simulation frame (see variable SF) at which the current data was created. This value may not be the same as the current simulation frame if this node is modifying existing data, rather than creating new data.
RELNAME
This value will be set only when data is being attached to a relationship (such as when Constraint Anchor DOP is connected to the second, third, or fourth inputs of a Constraint DOP).
In this case, this value is set to the name of the relationship to which the data is being attached.
RELOBJIDS
This value will be set only when data is being attached to a relationship (such as when Constraint Anchor DOP is connected to the second, third, or fourth inputs of a Constraint DOP).
In this case, this value is set to a string that is a space separated list of the object identifiers for all the Affected Objects of the relationship to which the data is being attached.
RELOBJNAMES
This value will be set only when data is being attached to a relationship (such as when Constraint Anchor DOP is connected to the second, third, or fourth inputs of a Constraint DOP).
In this case, this value is set to a string that is a space separated list of the names of all the Affected Objects of the relationship to which the data is being attached.
RELAFFOBJIDS
This value will be set only when data is being attached to a relationship (such as when Constraint Anchor DOP is connected to the second, third, or fourth inputs of a Constraint DOP).
In this case, this value is set to a string that is a space separated list of the object identifiers for all the Affector Objects of the relationship to which the data is being attached.
RELAFFOBJNAMES
This value will be set only when data is being attached to a relationship (such as when Constraint Anchor DOP is connected to the second, third, or fourth inputs of a Constraint DOP).
In this case, this value is set to a string that is a space separated list of the names of all the Affector Objects of the relationship to which the data is being attached.
ST
The simulation time for which the node is being evaluated.
Depending on the settings of the DOP Network Offset Time and Scale Time parameters, this value may not be equal to the current Houdini time represented by the variable T.
ST is guaranteed to have a value of zero at the start of a simulation, so when testing for the first timestep of a simulation, it is best to use a test like $ST == 0, rather than $T == 0 or $FF == 1.
SF
The simulation frame (or more accurately, the simulation time step number) for which the node is being evaluated.
Depending on the settings of the DOP Network parameters, this value may not be equal to the current Houdini frame number represented by the variable F. Instead, it is equal to the simulation time (ST) divided by the simulation timestep size (TIMESTEP).
TIMESTEP
The size of a simulation timestep. This value is useful for scaling values that are expressed in units per second, but are applied on each timestep.
SFPS
The inverse of the TIMESTEP value. It is the number of timesteps per second of simulation time.
SNOBJ
The number of objects in the simulation. For nodes that create objects such as the Empty Object DOP, SNOBJ increases for each object that is evaluated.
A good way to guarantee unique object names is to use an expression like object_$SNOBJ.
NOBJ
The number of objects that are evaluated by the current node during this timestep. This value is often different from SNOBJ, as many nodes do not process all the objects in a simulation.
NOBJ may return 0 if the node does not process each object sequentially (such as the Group DOP).
OBJ
The index of the specific object being processed by the node. This value always runs from zero to NOBJ-1 in a given timestep. It does not identify the current object within the simulation like OBJID or OBJNAME; it only identifies the object’s position in the current order of processing.
This value is useful for generating a random number for each object, or simply splitting the objects into two or more groups to be processed in different ways. This value is -1 if the node does not process objects sequentially (such as the Group DOP).
OBJID
The unique identifier for the object being processed. Every object is assigned an integer value that is unique among all objects in the simulation for all time. Even if an object is deleted, its identifier is never reused. This is very useful in situations where each object needs to be treated differently, for example, to produce a unique random number for each object.
This value is also the best way to look up information on an object using the dopfield expression function.
OBJID is -1 if the node does not process objects sequentially (such as the Group DOP).
ALLOBJIDS
This string contains a space-separated list of the unique object identifiers for every object being processed by the current node.
ALLOBJNAMES
This string contains a space-separated list of the names of every object being processed by the current node.
OBJCT
The simulation time (see variable ST) at which the current object was created.
To check if an object was created on the current timestep, the expression $ST == $OBJCT should always be used.
This value is zero if the node does not process objects sequentially (such as the Group DOP).
OBJCF
The simulation frame (see variable SF) at which the current object was created. It is equivalent to using the dopsttoframe expression on the OBJCT variable.
This value is zero if the node does not process objects sequentially (such as the Group DOP).
OBJNAME
A string value containing the name of the object being processed.
Object names are not guaranteed to be unique within a simulation. However, if you name your objects carefully so that they are unique, the object name can be a much easier way to identify an object than the unique object identifier, OBJID.
The object name can also be used to treat a number of similar objects (with the same name) as a virtual group. If there are 20 objects named “myobject”, specifying strcmp($OBJNAME, "myobject") == 0 in the activation field of a DOP will cause that DOP to operate on only those 20 objects.
This value is the empty string if the node does not process objects sequentially (such as the Group DOP).
DOPNET
A string value containing the full path of the current DOP network. This value is most useful in DOP subnet digital assets where you want to know the path to the DOP network that contains the node.
Note
Most dynamics nodes have local variables with the same names as the node’s parameters. For example, in a Position DOP, you could write the expression:
$tx + 0.1
…to make the object move 0.1 units along the X axis at each timestep.