The OpenCL COP provides a general interface to create and run OpenCL kernels on layers. It allows binding of constants, attributes, and volume data to OpenCL parameters in the kernel.
Warning
This node requires that you understand OpenCL. It is very easy to write incorrect code using this node.
This node is harder to use than the Wrangle COP. However, because COPs should keep all computations on the GPU, it is preferred over VEX.
Syntax ¶
See OpenCL for VEX users for basic information on the syntax available.
Note
Unless you are doing something esoteric, you likely do not need to edit the Signature or Bindings tabs manually. They are usually set up with the “Create inputs and spare parameters” button.
Parameters ¶
Kernel ¶
Kernel Code
OpenCL kernel to execute. This is usually a series of #bind commands to describe what the bindings to the kernel will be, along with an @KERNEL which contains the actual kernel function. Within the kernel function @ references can be used to refer to the bindings.
Options ¶
Compiler Options
Specify any desired compile flags for the kernel. The most common is to use -D to provide #define directives for the pre-processor.
Note
The Apple OSX OpenCL compiler requires only a single space between kernel options.
Houdini defines additional flags while compiling kernels, depending on the OpenCL device. The flags __H_GPU__ or __H_CPU__ distinguish between GPU and CPU devices, and __H_NVIDIA__, __H_AMD__, __H_INTEL__, or __H_APPLE__ signify the hardware vendor. You can set the environment variable HOUDINI_OCL_REPORT_BUILD_LOGS to 1 before running Houdini to get a dump of all kernels compiled along with their preprocessor flags.
Tile Size
The default behavior is to run the kernel for every buffer element in the source layer. Set this parameter to run the kernel once for every NxM block of buffer elements instead; the value gives the size of the blocks. Edge blocks may therefore extend past the edges of the buffer.
@ixy will give the start of the current tile, and @tilesize the configured tile size. min(@ixy+@tilesize, @res) can be used as the end point of a loop that stays within the buffer.
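The clipping rule above can be sketched on the CPU. The following is plain Python rather than kernel code, and the helper names `tile_bounds` and `visit_tiles` are hypothetical. It only illustrates how taking the minimum of the tile end and the resolution keeps an edge tile inside the buffer.

```python
def tile_bounds(ixy, tilesize, res):
    """Return the (start, end) pixel corners of one tile, clipped to the buffer.

    ixy, tilesize and res are (x, y) pairs standing in for the @ixy,
    @tilesize and @res bindings described in the text.
    """
    x0, y0 = ixy
    # Per-component version of min(@ixy + @tilesize, @res).
    x1 = min(x0 + tilesize[0], res[0])
    y1 = min(y0 + tilesize[1], res[1])
    return (x0, y0), (x1, y1)


def visit_tiles(res, tilesize):
    """Count how often each pixel is visited when looping tile by tile."""
    visits = [[0] * res[0] for _ in range(res[1])]
    for ty in range(0, res[1], tilesize[1]):
        for tx in range(0, res[0], tilesize[0]):
            (x0, y0), (x1, y1) = tile_bounds((tx, ty), tilesize, res)
            for y in range(y0, y1):
                for x in range(x0, x1):
                    visits[y][x] += 1
    return visits
```

With a 5x3 buffer and 2x2 tiles, the last tile starts at (4, 2) and is clipped to end at (5, 3), so every pixel is visited exactly once and nothing outside the buffer is touched.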
Iterations
The kernel can be re-executed a variable number of times. This avoids having to use more nodes to create a loop, and ensures all data remains on the video card during the successive evaluations.
Include Iteration
Binds an @Iteration integer containing the current iteration.
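As a rough mental model, repeated execution behaves like a loop around the kernel in which the buffer never leaves the device. The sketch below is plain Python with hypothetical names (`run_iterations`, `add_iteration`). It assumes the iteration counter starts at zero, which the text above does not state.

```python
def run_iterations(kernel, buffer, iterations):
    """Re-run kernel on buffer, passing the current iteration index each time.

    The index stands in for the @Iteration binding; starting at 0 is an
    assumption made for this sketch.
    """
    for iteration in range(iterations):
        buffer = kernel(buffer, iteration)
    return buffer


def add_iteration(buffer, iteration):
    """Toy kernel: add the iteration index to every element."""
    return [value + iteration for value in buffer]
```

Running `run_iterations(add_iteration, [0, 10], 4)` adds 0, 1, 2 and 3 in turn, so each element grows by 6.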
Use Write Back Kernel
After the kernel is executed, a second kernel may be immediately executed with the same set of parameters bound to it. You can avoid race conditions where multiple threads want to write to the same data by breaking it into a two-pass operation.
The second kernel is configured with @WRITEBACK.
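One hazard that a two-pass structure removes can be shown with a small CPU sketch. This is plain Python, not kernel code, and all function names are hypothetical. The first pass only reads the data and writes to a scratch buffer; the second pass, standing in for the write back kernel, copies the results over the original data.

```python
def main_pass(values):
    """First pass: read only from values, write only to a scratch buffer."""
    n = len(values)
    scratch = [0.0] * n
    for i in range(n):  # each i stands in for one GPU work item
        left = values[max(i - 1, 0)]
        right = values[min(i + 1, n - 1)]
        scratch[i] = (left + values[i] + right) / 3.0
    return scratch


def writeback_pass(values, scratch):
    """Second pass: every work item writes its own result back."""
    for i in range(len(values)):
        values[i] = scratch[i]


def blur_two_pass(values):
    """Three-tap average done as compute pass plus write back pass."""
    scratch = main_pass(values)
    writeback_pass(values, scratch)
    return values


def blur_in_place(values):
    """Single pass, for contrast: later items read already-modified neighbors."""
    n = len(values)
    for i in range(n):
        left = values[max(i - 1, 0)]
        right = values[min(i + 1, n - 1)]
        values[i] = (left + values[i] + right) / 3.0
    return values
```

On the input `[0.0, 3.0, 0.0]` the two-pass version yields `[1.0, 1.0, 1.0]`, while the in-place version gives a result that depends on processing order. On a GPU that order is not defined, which is why splitting the work into two passes is the safer design.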
Include Time
Binds @Time to the current evaluation time in seconds.
Use #import for Prequel Code
Several common headers are always included. These can be brought in with either a #include or a #import directive; this option picks the latter. #import produces messier Generate Code output, but can speed up caching of kernels on Windows.
Timing Messages
Synchronize the kernel when complete to ensure timing is accurate. Otherwise each kernel will run asynchronously, making it hard to tell which operation slowed things down.
Area Sampling Filter
The filtering method used for sampling the source layer. Consider using one of the area filters (Box and later) if the distortion shrinks large regions and you notice noise.
See Filters for more information.
Precision
Controls the precision of this node. The fpreal and exint types will be defined in the generated code to correspond with this specified precision. The vector variants will also be defined, i.e., fpreal3, fpreal4, etc. Additionally, the FPREAL_PREC symbol is defined as 16 for half, 32 for float, or 64 for double.
Auto will use the preferred precision of the incoming geometry, as set by the Attribute Cast SOP.
Note
16-bit cannot be used for computation in most drivers.
Signature ¶
The OpenCL kernel can read and write to multiple layers at once, so this configures how the inputs and outputs are defined.
You often want a kernel that works equally on Mono, UV, RGB, and RGBA layers. This can be done by using the Varying type.
Note
This tab is often not configured explicitly, but set up with the “Create inputs and spare parameters” button.
Inputs ¶
Input #
The name of the input. This is what is bound to in the #bind command.
Type
Varying
Accepts Mono, UV, RGB, or RGBA. All inputs of varying type will be set to the same type.
ID
Accepts ID layers.
Mono
Accepts Mono layers.
UV
Accepts UV layers.
RGB
Accepts RGB layers.
RGBA
Accepts RGBA layers.
Geometry
A geometry input. These must be referred to by name in an attribute or volume binding.
Metadata
This input is only used for size and meta data information and will not be read.
Optional
Marks the input as optional rather than mandatory. The corresponding binding should also be marked optional.
Outputs ¶
Output #
The name of the output. This is the target of the #bind.
Type
Varying
A Mono, UV, RGB, or RGBA layer matching the corresponding varying input layers.
ID
An ID layer.
Mono
A Mono layer.
UV
A UV layer.
RGB
An RGB layer.
RGBA
An RGBA layer.
Geometry
Geometry output. As geometry can’t be created by OpenCL, this must match a geometry input.
Metadata
Controls where the metadata of the layer will come from. This is where the size, display window, camera transforms, etc, will come from.
First Input
The first input of a COP is often the size reference, so is a good choice to use for the metadata.
Matching Name
The input whose name matches this output’s name will be used.
Input Name
A specific input of the provided name will be used.
Precision
Layers can be stored in 16-bit and 32-bit formats for floating point layers, and in 8, 16, and 32-bit formats for integer.
Input Precision
Maintain the precision of the input layer.
16-bit
Force 16-bit precision.
32-bit
Force 32-bit precision.
Type Info
The type info for the resulting layer. Choosing Input will use whatever the metadata’s type info was.
See the Copernicus glossary for a description of type infos.
Input for Metadata
The name of the input to use metadata from.
Bindings ¶
Each parameter can either be a fixed constant value, evaluated before kernel invocation, or read/write from a layer, volume or geometry attribute.
Bind# ¶
Name
The name of the parameter. This is used by the Generate Kernel button and by the @ bindings. The actual OpenCL kernel function is defined by parameter order, not by name, but if you use @KERNEL this is hidden from you.
Type
The type of parameter to create and bind.
Integer
A constant integer value, allowing you to bind channel references and expressions that are pre-computed.
Float
A constant float value. Optionally you can scale it by the timestep.
Vector2
A constant tuple of two floats, binding to a float2 OpenCL parameter.
Vector
A constant tuple of three floats, binding to a float3 OpenCL parameter. These actually occupy four floats in memory.
Vector4
A constant tuple of four floats, binding to a float4 OpenCL parameter.
Ramp
A scalar or color ramp. Because evaluating a spline-based ramp inside an OpenCL kernel is complex, the ramp is instead sampled into a uniform array of floats. The Ramp Size parameter controls the number of samples used.
Layer
A 2D image layer corresponding to a wire in Copernicus.
Geometry
Bind a geometry attribute.
Volume
Bind a volume.
VDB
Bind a VDB.
Ramp Size
The number of floating-point samples the ramp is evaluated into.
Data Type
The type of ramp to bind.
Float
A scalar ramp.
Vector
A color ramp.
Ramp
The scalar ramp values.
Ramp
The color ramp values.
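The idea of replacing a spline-based ramp with a uniform array can be sketched as follows. This is plain Python with hypothetical helper names. It assumes the samples are spaced evenly over [0, 1] including both endpoints and that lookups interpolate linearly; the text does not specify how Houdini places or reads the samples.

```python
def sample_ramp(ramp, ramp_size):
    """Evaluate ramp(t) at ramp_size evenly spaced positions in [0, 1].

    ramp is any callable; ramp_size plays the role of the Ramp Size parameter.
    """
    if ramp_size == 1:
        return [ramp(0.0)]
    return [ramp(i / (ramp_size - 1)) for i in range(ramp_size)]


def lookup_ramp(samples, t):
    """Linearly interpolate the sampled ramp at position t, clamped to [0, 1]."""
    if len(samples) == 1:
        return samples[0]
    t = min(max(t, 0.0), 1.0)
    pos = t * (len(samples) - 1)
    i = min(int(pos), len(samples) - 2)  # keep i + 1 inside the array
    frac = pos - i
    return samples[i] * (1.0 - frac) + samples[i + 1] * frac
```

Sampling `t * t` with five samples reproduces the ramp exactly at the sample positions, but a lookup at 0.125 returns 0.03125 instead of the true 0.015625. A larger Ramp Size reduces this kind of error at the cost of a bigger array.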
Data Type
What type of layer to bind. This refers to the OpenCL type system, not the COP system.
Integer
An ID layer.
Float
A Mono layer.
Float2
A UV layer.
Float3
An RGB layer.
Float4
An RGBA layer.
Float?
A Mono, UV, RGB, or RGBA layer.
Border
You can override the incoming layer’s border rules without having to edit its metadata directly.
Input
Use the input’s border values unchanged.
Constant
Force a constant, zero, value for out of bounds.
Clamp
Clamp or streak out of bounds.
Mirror
Reflect out of bounds on the boundary.
Wrap
Wrap out of bounds to the other side.
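The border rules amount to different ways of mapping an out-of-bounds index back into the buffer. The sketch below is plain Python for a single row, with hypothetical names. The mirror variant shown repeats the edge element when reflecting, which is an assumption; the exact convention is not stated here.

```python
def resolve_index(i, n, border):
    """Map index i onto [0, n) for the given border rule.

    Returns None for the constant rule when i is out of bounds, so the
    caller can substitute zero.
    """
    if 0 <= i < n:
        return i
    if border == "constant":
        return None
    if border == "clamp":
        return min(max(i, 0), n - 1)
    if border == "wrap":
        return i % n  # Python's % is non-negative for positive n
    if border == "mirror":
        period = 2 * n
        j = i % period
        return j if j < n else period - 1 - j
    raise ValueError("unknown border rule: " + border)


def fetch(row, i, border):
    """Read row[i], applying the border rule when i is out of bounds."""
    j = resolve_index(i, len(row), border)
    return 0.0 if j is None else row[j]
```

For the row `[10, 20, 30, 40]`, reading index -1 gives 0.0 with constant, 10 with clamp, 40 with wrap, and 10 with this mirror convention.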
Port
When binding geometry there may be more than one geometry input. This is the name of the input, as specified in the signature, to bind to.
Volume
The name or number of the volume or VDB primitive to bind.
Force Alignment
To simplify kernels one may often assume all volumes are aligned in resolution and transform. If Force Alignment is set, this is enforced and volumes that are misaligned generate errors.
Voxel Resolution
Add the resolution of the volume as a parameter.
Voxel Size
Add the size of the volume as a parameter, in SOP space.
Transform to World
Add a matrix transform that converts from the volume’s voxel coordinates to the SOP coordinates.
Transform to Voxel
Add a matrix transform that converts from SOP coordinates to the volume’s voxel coordinates.
Data Type
The types of VDBs to accept for the binding.
Any
Any type of VDB can be bound; the kernel will have to do introspection to validate the type.
Float
Floating point VDBs.
Vector
Vector VDBs.
Attribute
Which attribute to bind. It is an error if it is missing, unless the optional flag is set.
Present for Attributes.
Class
The type of the attribute.
Not all bound attributes need to be the same type, or even come from the same geometry data.
Present for Attributes.
Data Type
What sort of attribute to bind. Float and integer attributes are bound as single arrays containing all element values in order. Tuples are interleaved, i.e., P will be bound as xyzxyzxyz.
Array attributes are bound as two arrays. One array contains the offsets of each element’s array data. Thus, the difference of a pair of offsets provides the element’s array length. The second array is the data of all elements' arrays concatenated into a single array.
Present for Attributes.
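The two memory layouts described for tuple and array attributes can be sketched in plain Python. The helper names are hypothetical, and the sketch assumes the offsets array carries one extra trailing entry so that the last element's length can also be computed as a difference; the text does not confirm that detail.

```python
def pack_tuples(values):
    """Interleave per-element tuples: [(x, y, z), ...] -> [x, y, z, x, y, z, ...]."""
    return [component for tup in values for component in tup]


def pack_arrays(arrays):
    """Flatten per-element arrays into an (offsets, data) pair.

    offsets[i] is where element i starts in data; a final entry equal to
    len(data) is appended (an assumption made for this sketch).
    """
    offsets = [0]
    data = []
    for arr in arrays:
        data.extend(arr)
        offsets.append(len(data))
    return offsets, data


def element_array(offsets, data, i):
    """Recover element i's array; its length is offsets[i + 1] - offsets[i]."""
    start, end = offsets[i], offsets[i + 1]
    return data[start:end]
```

Packing the arrays `[[7, 8], [], [9]]` gives offsets `[0, 2, 2, 3]` and data `[7, 8, 9]`. The empty second element shows up as two equal consecutive offsets.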
Tuple Size
Tuple size of the attribute to bind. If greater than zero, the attribute must be able to provide this tuple size. If zero, it will bind automatically and an extra parameter will be generated storing the tuplesize.
Present for Attributes.
Precision
Controls the precision the data of this parameter is bound with.
The Node option uses the node’s precision, so it will vary depending on that setting, and the corresponding kernel code should use the fpreal or exint defines.
This is the precision at which the data is stored on the video card, so using lower precision can save GPU memory. 16-bit, which corresponds to half, often cannot be used for computation. The vload_half function can be used to promote it to float for computation.
If the same attribute is bound with different precisions, the binding will fail.
Currently volumes only bind with 32-bit data precision.
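The trade-off of 16-bit storage can be illustrated on the CPU with Python's standard `struct` module, whose `e` format is IEEE 754 half precision. The helper names below are hypothetical. This only mimics the store-as-half, compute-as-float pattern and is not how the node itself moves data.

```python
import struct


def to_half_bits(value):
    """Round a float to half precision and return its 16 storage bits."""
    return struct.unpack("<H", struct.pack("<e", value))[0]


def from_half_bits(bits):
    """Promote 16-bit half storage back to a float for computation."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


def round_trip(value):
    """Store a value as half, then load it back as a float."""
    return from_half_bits(to_half_bits(value))
```

A half takes two bytes instead of the four used by a float, so storage is halved. Values such as 0.5 survive the round trip exactly, while 0.1 comes back slightly changed, which is the precision cost to weigh against the memory saving.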
Read
Determines if the OpenCL kernel will read from this attribute or volume. If not set, the attribute’s values will not be copied onto the GPU. This is useful for write-only attributes as it avoids an unnecessary copy, but requires care as uninitialized data will be present.
Write
Determines if the OpenCL kernel will write back to this attribute or volume. Causes the CPU version of the attribute or volume to be marked out of date so the next time it is needed it will be copied back from the GPU.
Optional
Marks the attribute or volume as not necessary. If the attribute or volume isn’t present in the geometry, rather than erroring, a #define is set in the kernel options to disable the attribute. This also changes the parameter signature, so the Generate Code button should be used to verify the syntax.
Note
The parameter name is used in the #define, so changing the parameter name requires changing the code.
Default Value
Marks that, if an optional attribute or volume is missing, a parameter value should still be bound to the kernel. A #define is set in the kernel options to disable the attribute and switch to the single value. This also changes the parameter signature, so the Generate Code button should be used to verify the syntax.
The value of the bound parameter will be taken from the integer or float value of this parameter.
Value
The integer value used for integer parameters or default values.
Value
The float value used for float parameters or default values.
Value
The float2 value used for float2 parameters or default values.
Value
The float3 value used for float3 parameters or default values.
Value
The float4 value used for float4 parameters or default values.
Time Scale
How to scale the provided float value by the timestep. Because timeinc may not be known at time of parameter evaluation, it can be computed as a constant prior to evaluating the kernel and applied to the float value.
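The pre-scaling described above can be pictured as a small host-side step that runs before the kernel is launched. The sketch is plain Python; the function name and the mode names `"none"`, `"timestep"` and `"inverse_timestep"` are hypothetical stand-ins, not the actual menu entries.

```python
def scale_by_timestep(value, timeinc, mode="none"):
    """Return the constant that would be bound to the kernel.

    value is the float parameter, timeinc the current timestep in seconds.
    The mode names are made up for this sketch.
    """
    if mode == "none":
        return value
    if mode == "timestep":
        return value * timeinc
    if mode == "inverse_timestep":
        return value / timeinc
    raise ValueError("unknown mode: " + mode)
```

For example, a value of 8.0 with a timestep of 0.25 is bound as 2.0 when multiplied by the timestep and as 32.0 when divided by it, so the kernel receives a ready-made constant and needs no timestep of its own.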
Generated Code ¶
Display Code
Produces the fully expanded code that is sent to the actual compiler. This can resolve line numbers for errors when compilers do not respect the #line directive, and also help understand how the @-macros work. The exact expansion of @-macros should not be relied upon.
Generated Code
The code snippet with all @-bindings expanded.
Note
This parameter is not used but is purely informational.
Examples ¶
GameOfLife Example for OpenCL Copernicus node
Contains an implementation of Conway’s Game of Life in Copernicus using the OpenCL COP.