HDK
Houdini's RE library is a wrapper around OpenGL. It attempts to simplify GL usage by implementing objects around GL concepts, and abstracting implementation-specific behaviour. The main objects are RE_Geometry, RE_VertexArray, RE_Texture, RE_Shader, and RE_Render. Other object types exist for minor GL objects, such as sync objects.
RE_Geometry - defines vertex attributes and connectivity (element arrays).
RE_Shader - defines a GLSL shader, which uses an RE_Geometry to define vertex shader inputs.
RE_Texture - a single GL texture sampled from shaders.
RE_Render - the GL context containing the current state, including the current RE_Shader, active RE_Textures, and various GL drawing states (blending, depth, stencil, viewport).
The OpenGL state is stored in RE_Render. RE_Render caches the GL state to avoid redundant state changes, and also implements a variety of stacks for commonly changed states (texture, blending, depth, shader). It includes state for both the fixed function GL pipeline (GL 1.x) and the shader pipeline.
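The caching and stack behaviour described above can be illustrated with a small standalone sketch. It does not use the RE API: `CachedState` and its methods are hypothetical names, and the call counter stands in for a real GL state change.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for one cached piece of GL state (for example a
// blend mode). This is not RE_Render code; it only illustrates the idea.
class CachedState
{
public:
    explicit CachedState(int initial) : myCurrent(initial) {}

    // Change the state only if the value differs from the cached one.
    void set(int value)
    {
        if (value == myCurrent)
            return;             // redundant change, no driver call needed
        myCurrent = value;
        ++myDriverCalls;        // a real implementation would call GL here
    }

    // Save the current value so that it can be restored later.
    void push() { myStack.push_back(myCurrent); }

    // Restore the most recently pushed value (no-op on an empty stack).
    void pop()
    {
        if (myStack.empty())
            return;
        const int saved = myStack.back();
        myStack.pop_back();
        set(saved);
    }

    int current() const { return myCurrent; }
    int driverCalls() const { return myDriverCalls; }

private:
    int myCurrent;
    int myDriverCalls = 0;
    std::vector<int> myStack;
};
```

Setting a value that is already current costs nothing, and a push/pop pair restores the previous value with at most one state change.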
When GL3 shaders are used, the fixed function state is ignored. This includes:
gl_ModelViewMatrix, gl_NormalMatrix, gl_ProjectionMatrix (and inverses)
gl_LightSource[]
gl_Material
gl_ClipDistance
The relevant GL state is:
You can dump the current GL state using RE_Render::dumpNewState().
An RE_Render object always has a GL context behind it.
The viewport in Houdini 14+ requires OpenGL 3.3 or higher, so it is safe to assume that all features of OpenGL 3.3 are available. OpenGL 4 features should be queried by their extension names before use (using ::hasGLExtension(), with an extension from the list in RE_Extension.h).
An RE_Texture represents a single GL texture object and provides a simple interface for defining and using GL textures. Currently, Houdini supports 1D, 2D, 3D, cubemap, 1D array, 2D array, rectangle, buffer, and 2D multisample textures. The interface to these textures is very similar, so that they can all be manipulated through a generic RE_Texture handle.
Textures can be bound to samplers using RE_Render::bindTexture(). pushTextureState() can be used to preserve previous texture bindings. When binding textures for shaders, RE_Shader::getUniformTextureUnit() can be used to query the sampler index.
Textures can also be created as render targets for framebuffer objects. RE_OGLFramebuffer::createTexture() will create and bind a texture to a framebuffer object.
Shaders of all GLSL versions are represented by RE_Shader. Houdini currently supports Vertex, Tessellation, Geometry and Fragment shaders. A shader can be created from C-style strings or plain text files found in the HOUDINI_GLSL_PATH.
An RE_ShaderHandle is a simple interface for loading full shaders with a minimal amount of code. It uses a new .prog file format, which is a simple list of shader files to load with a few markups. Using an RE_ShaderHandle, you can easily load, set up and use a shader.
The .prog format is a text-based list of directives and files. Here is an example:
And a more complex example, with a shader pipeline to use if the GL extension GL_ARB_image_load_store is present, and a fallback to a different shader pipeline if it is not (version can also be used in this way; #version 430, #version 150):
The name directive gives the shader a descriptive name. This is optional; otherwise the .prog filename will be the shader's name.
The version directive defines the GLSL version this shader uses. This is the same as the GLSL directive, except that 'core', 'compatibility' and 'es' modifiers are not allowed. It may be defined multiple times, each time with a different value defining a separate set of files to load. The highest version supported by the implementation will be loaded. At least one #version directive must exist, before any shader files are specified.
The extension directive defines an OpenGL extension that must be present in order for the shader set to be accepted. The parameter must be the exact name of the GL extension to use, such as GL_ARB_compute_shader or GL_EXT_framebuffer_object. Multiple #extension directives may be specified for the same shader set, and all of them must be present in order for the set to be loaded. A new #version directive will clear the list of required extensions.
The input directive binds generic vertex attributes to specific attribute indices. Only generic vertex attributes can be assigned (no GL-builtins with the gl_ prefix may be bound), and only to valid indices (from 0 to the maximum GL-supported index, usually 15). This provides a similar function to the GL_ARB_explicit_attribute_location extension. These are optional, must be specified after a #version directive, and are cleared when a new #version directive is encountered.
There is also a hou_attrib_map directive, which assigns attributes known to Houdini to standard binding locations:
All other generic attributes are assigned starting at attribute location 9.
The output directive binds named fragment shader outputs to specific draw buffer indices. This is optional, must appear after a #version directive and is cleared when another #version directive is encountered. This is useful for shaders that output more than one value to different draw buffers/textures.
The define directive passes a defined symbol to the shader.
The shader files themselves are loaded from the HOUDINI_GLSL_PATH (default path value is HOUDINI_PATH/glsl). The files must have specific extensions: .vert for vertex shaders, .geom for geometry shaders, and .frag for fragment shaders. For GL2 shaders, any shader stage is optional, though at least one stage must exist. For GL3 shaders, only the geometry shader is optional, as the fixed function state will not be set up correctly for rendering (unless the GL state is initialized by the render hook itself, or transform feedback is being used without rasterization, in which case the fragment shader can be omitted).
Finally, a comment can be specified on a line by starting it with //. Comments cannot be specified on the same line as a directive or shader filename.
The .prog file format may be expanded with more directives in the future.
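The selection rules described above (one shader set per #version, all listed extensions required, highest supported version wins) can be sketched in standalone C++. This is not the HDK loader: `ShaderSet`, `parseProg` and `chooseSet` are hypothetical names, the one-argument form of `#extension` is an assumption, directives other than #version and #extension are simply skipped, and ties between sets of equal version go to the first one listed, which is also an assumption.

```cpp
#include <cassert>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// One candidate set of shader files, introduced by a #version directive.
struct ShaderSet
{
    int version = 0;
    std::vector<std::string> extensions;   // all must be present
    std::vector<std::string> files;        // .vert / .geom / .frag names
};

// Strip leading and trailing whitespace.
static std::string trim(const std::string &s)
{
    const char *ws = " \t\r\n";
    const std::string::size_type b = s.find_first_not_of(ws);
    if (b == std::string::npos)
        return std::string();
    const std::string::size_type e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Split .prog text into shader sets. Comment lines start with "//".
std::vector<ShaderSet> parseProg(const std::string &text)
{
    std::vector<ShaderSet> sets;
    std::istringstream in(text);
    std::string raw;
    while (std::getline(in, raw))
    {
        const std::string line = trim(raw);
        if (line.empty() || line.compare(0, 2, "//") == 0)
            continue;
        if (line[0] == '#')
        {
            std::istringstream words(line);
            std::string directive;
            words >> directive;
            if (directive == "#version")
            {
                // A new #version starts a new set with no extensions.
                ShaderSet s;
                words >> s.version;
                sets.push_back(s);
            }
            else if (directive == "#extension" && !sets.empty())
            {
                std::string name;
                if (words >> name)
                    sets.back().extensions.push_back(name);
            }
            // Other directives are ignored in this sketch.
        }
        else if (!sets.empty())
        {
            sets.back().files.push_back(line);  // files need a #version
        }
    }
    return sets;
}

// Pick the highest-version set whose requirements are all met, or nullptr.
const ShaderSet *chooseSet(const std::vector<ShaderSet> &sets,
                           int maxVersion,
                           const std::set<std::string> &available)
{
    const ShaderSet *best = nullptr;
    for (const ShaderSet &s : sets)
    {
        if (s.version > maxVersion)
            continue;
        bool ok = true;
        for (const std::string &ext : s.extensions)
            if (available.count(ext) == 0)
                ok = false;
        if (ok && (!best || s.version > best->version))
            best = &s;
    }
    return best;
}
```

Given a file with a `#version 430` set that requires GL_ARB_image_load_store and a plain `#version 150` set, `chooseSet` returns the 430 set only when both the version and the extension are available, and falls back to the 150 set otherwise.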
You can also create shaders directly from a C-style string, or load them individually:
Once a shader is loaded, uniforms can be bound to it using the bind*() family of functions. Uniform blocks can be bound as well, using RE_UniformBlock. Finally, transform feedback can be set up using the add/getFeedbackBuffer() family of methods (and enabled via RE_Render::begin/endTransformFeedback()).
To make a shader active, either RE_Render::push/popShader() or RE_Render::bindShader() can be used. A shader that does not compile or link will not have any effect when bound.
A uniform block is a GLSL structure which is bound to a buffer object. The structure may contain any GLSL type except for samplers. To declare a uniform block in a GLSL shader, define a structure at the global scope with the uniform qualifier. An example of a uniform block:
RE_Shader will enumerate all declared uniform blocks and these can be accessed through getNumUniformBlocks() and getUniformBlock(index). An RE_UniformBlock object will be returned for each uniform block declared. You can either modify this uniform block or provide an "override block" which will be bound instead of the default shader block. The override block should be generated via RE_UniformBlock::copy(), preferably from the uniform block it is replacing. This allows you to keep uniform blocks locally over several renders and pay a small binding cost per render, versus continuously modifying the shader's uniform block and uploading the buffer. Once you are done rendering with a local override block, it should be removed from the shader.
RE_UniformBlock has many of the same bind...() calls as RE_Shader to set values of its various elements. These don't really bind anything, but are named similarly to the RE_Shader methods for simplicity. Each bind() call will check if the new value matches the previous value, and only trigger an upload of the underlying buffer if the block has actually changed.
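The compare-before-upload behaviour can be sketched without any GL at all. The class below is a hypothetical CPU-side mirror of a uniform block, not RE_UniformBlock itself; deferring the upload to a separate `uploadIfDirty()` step is one possible arrangement and is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical CPU-side copy of a uniform block's storage.
class BlockMirror
{
public:
    explicit BlockMirror(std::size_t bytes) : myData(bytes, 0) {}

    // Write 'size' bytes at 'offset'. Returns true only if the stored
    // contents actually changed; identical values leave the block clean.
    bool bind(std::size_t offset, const void *value, std::size_t size)
    {
        if (offset > myData.size() || size > myData.size() - offset)
            return false;                   // out of range, ignored
        if (size == 0)
            return false;
        if (std::memcmp(&myData[offset], value, size) == 0)
            return false;                   // same value as before
        std::memcpy(&myData[offset], value, size);
        myDirty = true;
        return true;
    }

    // Called before drawing: performs at most one upload per change set.
    bool uploadIfDirty()
    {
        if (!myDirty)
            return false;
        ++myUploads;    // a real implementation would update the buffer object
        myDirty = false;
        return true;
    }

    int uploads() const { return myUploads; }

private:
    std::vector<unsigned char> myData;
    bool myDirty = false;
    int myUploads = 0;
};
```

Binding the same color twice in a row results in a single upload; only a value that differs from the stored bytes marks the block for re-upload.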
Transform feedback is named for reading back vertex values after running through vertex processing, but really any value output by the vertex or geometry shader can be read. This allows for pre-processing of vertex values on the GPU.
Transform feedback is set up on a shader by specifying varying outputs to capture with RE_Shader::addFeedbackBuffer(). If both a vertex and geometry shader are present in a shader, only the geometry shader outputs may be collected. There is a maximum number of buffers that can be collected at once (usually 4). Once these outputs have been set, the shader must be relinked. Since linking can destroy uniform values, this should be done early in the shader setup.
Transform feedback can be enabled during regular rendering via the RE_Render::beginTransformFeedback() method. Here you must specify the class of primitive being rendered (points, lines, or triangles) and whether the primitives carry on to the rasterizer stage. When the primitive type is points, one vertex is captured per primitive; lines, two; and triangles, three.
The shader must be active before beginTransformFeedback() is called, and the shader cannot be changed while transform feedback is active (until after endTransformFeedback() is called). The primitive types being rendered must also match the primitive class that transform feedback was started with. If a geometry shader is present, this refers to the primitive type output by the geometry shader, and not necessarily the type passed when drawing. These are OpenGL restrictions.
Transform feedback is very handy for debugging shaders when nothing is appearing in the viewport (after setting the fragment shader to output vec4(1) to ensure the fragment color isn't the issue). This code will print out the vertex positions, and X, Y, and Z should lie within [-1,1] once divided by W:
Similarly, you can inspect other values being produced by the vertex or geometry shader.
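Once captured positions have been read back to the CPU, the range check described above is simple arithmetic. The sketch below is standalone and makes no RE calls; `ClipPos`, `insideClipVolume`, `FeedbackPrim` and `capturedVertices` are hypothetical names, and the readback itself is assumed to have already happened.

```cpp
#include <cassert>
#include <cmath>

// A captured clip-space position (for example the value written to
// gl_Position), assumed to have been read back from a feedback buffer.
struct ClipPos
{
    float x, y, z, w;
};

// True if X, Y and Z all lie within [-1,1] once divided by W.
// Positions with W <= 0 (or NaN) are treated as outside.
bool insideClipVolume(const ClipPos &p)
{
    if (!(p.w > 0.0f))
        return false;
    const float x = p.x / p.w;
    const float y = p.y / p.w;
    const float z = p.z / p.w;
    return std::fabs(x) <= 1.0f && std::fabs(y) <= 1.0f &&
           std::fabs(z) <= 1.0f;
}

// Primitive classes for feedback; the value is the number of vertices
// captured per primitive (points 1, lines 2, triangles 3).
enum class FeedbackPrim
{
    Points = 1,
    Lines = 2,
    Triangles = 3
};

// Number of vertices a feedback buffer must hold for 'numPrims' primitives.
int capturedVertices(FeedbackPrim prim, int numPrims)
{
    return static_cast<int>(prim) * numPrims;
}
```

If every captured position fails `insideClipVolume`, the geometry is being transformed off-screen, which points at the matrices rather than the fragment shader.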
In order for a shader to run, it needs its vertex shader fed by vertex values. RE_Geometry contains vertex arrays which are fed to the shader. Drawing an RE_Geometry object with a current shader will automatically bind vertex attributes, making rendering of complex objects very simple:
Of course, it is up to you to define the proper vertex attributes and their data that a specific shader requires.
An RE_Geometry object contains multiple RE_VertexArray objects, each representing a specific vertex attribute. It also contains one or more connection groups, which define how these vertices are connected.
Each geometry object has a specific number of points, primitives and vertices. Generally only the number of points needs to be specified. This defines the length of each vertex attribute with the RE_ARRAY_POINT array type (the default). In GL3, primitive and vertex attribute types can also be defined with the RE_ARRAY_PRIMITIVE and RE_ARRAY_VERTEX types, though these are accessed through a texture buffer object rather than a vertex shader input. Attributes of the same array type must be the same length (or, have at least that many elements).
Vertex attributes have names which associate them with vertex shader inputs (for RE_ARRAY_POINT, RE_ARRAY_DETAIL (constant), and instanced vertex arrays). For TBO-based vertex attributes, the name of the samplerBuffer in the shader should be attr<name> (RE_ARRAY_PRIMITIVE, RE_ARRAY_VERTEX, RE_ARRAY_RANDOM).
Vertex attributes can be of nearly any supported RE_GPUType, though the specific depth and stencil formats are not supported (RE_GPU_FLOAT24, RE_GPU_UINT1,2,4). Matrices are also supported, though these require multiple GL attribute locations so care should be taken when using them, as there are a limited number of vertex attributes (mat2 uses 2 slots, mat3 uses 3, and mat4 uses 4). Some attribute formats are non-optimal on some hardware. For example, a vec3 FP16 attribute is about 4x slower than vec4 FP16 or vec3 FP32 on current AMD hardware, due to its 4B alignment preference.
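Because matrices consume several attribute locations, it can help to budget slots before creating attributes. The following standalone sketch only does the arithmetic; `AttribKind`, `slotsUsed` and `fitsInSlots` are hypothetical names, and the limit of 16 slots in the usage note is an assumption based on the "usually 15" maximum index mentioned earlier.

```cpp
#include <cassert>
#include <vector>

// Hypothetical classification of a vertex attribute by GLSL type.
enum class AttribKind { Scalar, Vec2, Vec3, Vec4, Mat2, Mat3, Mat4 };

// Attribute locations consumed: mat2 uses 2, mat3 uses 3, mat4 uses 4,
// and scalars and vectors use 1.
int slotsUsed(AttribKind kind)
{
    switch (kind)
    {
    case AttribKind::Mat2: return 2;
    case AttribKind::Mat3: return 3;
    case AttribKind::Mat4: return 4;
    default:               return 1;
    }
}

// True if all attributes fit in 'maxSlots' attribute locations.
bool fitsInSlots(const std::vector<AttribKind> &attribs, int maxSlots)
{
    int total = 0;
    for (AttribKind kind : attribs)
        total += slotsUsed(kind);
    return total <= maxSlots;
}
```

With a limit of 16 locations, four mat4 attributes already fill the budget, so adding a single extra vec3 no longer fits.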
RE_Geometry uses buffer objects for storage of the attribute data. Buffer objects allow the data to be sent to the GPU once and reused multiple times.
Caching is also available for buffers attached to an RE_Geometry object, as long as it is set up with a cache name:
This will store buffers in the cache and allow retrieval at a later time, or by other objects, using:
To create a new attribute, the above method can be used with the create_missing parameter set to true, or using one of the following methods:
The create methods may fail if the data format is incorrect (e.g., RE_GPU_FLOAT24), the length is too large for GL to create, or the vector size parameter is outside the range 1-4. The data pointer may be NULL if the buffer is used to read back data from a transform feedback or other GPU buffer write operation. It can also be NULL if data will be assigned to it later via the setArray() or map() calls.
The index is the ID of the connectivity group, used to draw, delete or add to it. For example,
would add another line strip to connectivity group line_index, in addition to the one added above.
To draw the connectivity group, call one of RE_Geometry's draw() methods. You can also use the indexed connectivity group calls, which use integers instead of strings for the group names and have slightly better performance (they are suffixed with 'I'). Drawing a connectivity group will set up the appropriate vertex arrays, and possibly the element array, and render using one of the glDraw...() functions.
More complex patterns of connectivity can be set up using RE_ElementArray. Similar to RE_VertexArray, this object creates a buffer object, but one which is used by GL to index vertices. It can be cached in the GL cache in the same way as RE_VertexArray.
GL3 can render a single object multiple times using instancing. A GL3 shader must be used in order to draw instanced geometry. There are two ways to do instancing:
Create an instanced attribute on the RE_Geometry using createInstancedAttribute(). This can only be accessed in the vertex shader as a vertex shader input.
Create a randomly accessed attribute on the RE_Geometry (of type RE_ARRAY_RANDOM), and use the vertex shader's gl_InstanceID to index the Texture Buffer Object that holds the buffer data. This can be accessed in any shader stage.
The former is a bit easier to set up, but the latter allows you to index the array however you want, rather than just advancing the index once every 'step' instances. More than one attribute can be instanced.
To set up instancing with a simple translation of the base object using the first method:
And the second:
It's also possible to mix the two methods, some attributes being instanced and others being randomly indexed in the shader. But generally, using instanced attributes is the preferred method to use.
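The difference between the two indexing schemes can be shown on the CPU. This standalone sketch mimics what the shader would see; it makes no RE or GL calls, and `instancedIndex`, `wrappedIndex` and `translate` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>

struct Vec3
{
    float x, y, z;
};

// Method 1: an instanced attribute advances once every 'step' instances,
// so the element seen by an instance is instanceID / step.
// A step of 0 is treated as "always element 0" in this sketch.
std::size_t instancedIndex(std::size_t instanceID, std::size_t step)
{
    return step ? instanceID / step : 0;
}

// Method 2: with a randomly indexed buffer the shader computes its own
// index from gl_InstanceID. Wrapping around a short table is one example.
std::size_t wrappedIndex(std::size_t instanceID, std::size_t count)
{
    return count ? instanceID % count : 0;
}

// Simple translation of a base-object point by a per-instance offset.
Vec3 translate(const Vec3 &p, const Vec3 &offset)
{
    return Vec3{p.x + offset.x, p.y + offset.y, p.z + offset.z};
}
```

With a step of 2, instances 4 and 5 both read element 2 of the instanced attribute, whereas `wrappedIndex` lets a three-entry table repeat across any number of instances.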
Besides the main RE objects required for rendering, there are some support classes which can be used for various rendering effects.
RE_OGLFramebuffer represents a Framebuffer Object (FBO), which is used to do offscreen rendering. It is invaluable in multipass drawing algorithms. An FBO can render to renderbuffers and textures. Textures used as a render target can then be read from a shader in a subsequent rendering pass. This is the basis of many multi-pass rendering tricks. FBOs can also be multisampled, to do such effects as antialiasing and order-independent transparency.
An RE_OGLFramebuffer can create textures and renderbuffers for specific attachments, or attach existing textures and renderbuffers. There are multiple color attachments, but one depth/stencil attachment (the separate stencil and depth attachments are generally not supported by OpenGL implementations). This allows you to render to up to 8 color attachments at a time, sharing a single depth/stencil buffer. In order to properly take advantage of multiple color attachments, a fragment shader with multiple outputs should be used.
Framebuffer attachments need to be compatible with one another, otherwise the FBO will be "incomplete". The criteria vary by OpenGL implementation, but some things that can make an FBO incomplete are missing attachments for draw buffers, incorrect texture types for render targets, or inconsistent texture sizes.
A framebuffer can be either the current draw or read framebuffer. The draw framebuffer is affected by rendering commands, while the read framebuffer is affected by read commands (e.g., glReadPixels). This is stored in the RE_Render object, and set by push/set/popDrawFramebuffer() and push/set/popReadFramebuffer(). Once a framebuffer is active, it can be operated on as if it were the default (window) framebuffer.
A sync object allows two OpenGL contexts in different threads to synchronize with one another. The most common case is an update/render thread setup, where one thread populates OpenGL objects with data (textures, buffers) and the other renders them. An RE_CommandSync object can be used to ensure that the setup thread has completed its operation before the render thread uses the object that was being written.
This will cause the GPU to wait on the sync point.
There is also a way to make the CPU wait on a sync point, using RE_RenderWait:
An occlusion query is used to determine if a rendered object produced any visible pixels, and how many pixels were visible. Note that this query only reports whether the pixels were rendered at the time of the query; if they are subsequently overwritten, this will not be reflected in the query. This sort of query is good for several cases, including conditional rendering (RE_Render::beginConditionalRender()).
An occlusion query is implemented by RE_OcclusionQuery. Queries cannot be shared between GL contexts. It has a simple begin/end operation, and once this has been done, the number of pixels can be queried (getNumDrawn()). The query is asynchronous, which means that the draw operation will not stall the CPU on end(), only on getNumDrawn().
It is important to note that you cannot create a new query while another query of any type is active (an OpenGL restriction).
A timer query can be used to determine how long an operation took to execute on the GPU. Because OpenGL is asynchronous, timing the GL commands on the CPU will not produce accurate results, since these commands are added to a queue and processed at a later time. A timer query places commands in the command queue which will accurately record the elapsed time that the operation took on the GPU. This is reported in nanoseconds (1e-9 seconds).
Timer queries are not shared between GL contexts, and they cannot be nested. If you need to nest queries, instead of using the begin/end syntax, request timestamps at the start and end of the operation using two timer queries, and compute the elapsed time manually.
To record timestamps:
This is an easy way to avoid the nesting restriction of timer objects.
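Computing elapsed time from two timestamps is plain subtraction on nanosecond values. The sketch below assumes the timestamps have already been read back from two timer queries; `TimeStampPair`, `elapsedNanoseconds` and `elapsedMilliseconds` are hypothetical names and no RE calls are made.

```cpp
#include <cassert>
#include <cstdint>

// Start and end GPU timestamps in nanoseconds, assumed to come from two
// separate timer queries.
struct TimeStampPair
{
    std::uint64_t startNs;
    std::uint64_t endNs;
};

// Elapsed time in nanoseconds; returns 0 if the pair is out of order.
std::uint64_t elapsedNanoseconds(const TimeStampPair &t)
{
    return t.endNs >= t.startNs ? t.endNs - t.startNs : 0;
}

// Convenience conversion to milliseconds (1 ms = 1e6 ns).
double elapsedMilliseconds(const TimeStampPair &t)
{
    return static_cast<double>(elapsedNanoseconds(t)) / 1.0e6;
}
```

Since each interval is just a pair of numbers, an inner operation can be timed inside an outer one by recording four timestamps and subtracting them pairwise.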