Overview

Houdini's RE library is a wrapper around OpenGL. It attempts to simplify GL usage by implementing objects around GL concepts, and abstracting implementation-specific behaviour. The main objects are RE_Geometry, RE_VertexArray, RE_Texture, RE_Shader, and RE_Render. Other object types exist for minor GL objects, such as sync objects.

RE_Geometry - defines vertex attributes and connectivity (element arrays).
RE_Shader - defines a GLSL shader, which uses an RE_Geometry to define vertex shader inputs.
RE_Texture - a single GL texture sampled from shaders.
RE_Render - the GL context containing the current state, including the current RE_Shader, active RE_Textures, and various GL drawing states (blending, depth, stencil, viewport).

RE_Render

The OpenGL state is stored in RE_Render. RE_Render caches the GL state to avoid redundant state changes, and also implements a variety of stacks for commonly changed states (texture, blending, depth, shader). It includes state for both the fixed function GL pipeline (GL 1.x) and the shader pipeline.
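
The caching and stack behaviour described above can be illustrated with a small, self-contained sketch. This is not RE_Render's implementation: the class CachedBlendState and its members are hypothetical, and a single boolean stands in for the blending state. It only shows the idea of skipping redundant changes and restoring a saved value on pop.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for one cached GL state (blending on/off).
// driverCalls() counts how often a real GL call would have been issued.
class CachedBlendState
{
public:
    // Only touch the driver when the requested value differs from the cache.
    void set(bool enabled)
    {
        if (enabled == myEnabled)
            return;          // redundant change, skipped
        myEnabled = enabled;
        ++myDriverCalls;     // a real wrapper would call glEnable/glDisable here
    }

    // Save the current value so a later pop() can restore it.
    void push() { myStack.push_back(myEnabled); }

    // Restore the value saved by the matching push().
    void pop()
    {
        assert(!myStack.empty() && "pop() without matching push()");
        const bool saved = myStack.back();
        myStack.pop_back();
        set(saved);
    }

    bool enabled() const { return myEnabled; }
    int  driverCalls() const { return myDriverCalls; }

private:
    bool              myEnabled = false;
    int               myDriverCalls = 0;
    std::vector<bool> myStack;
};
```

Setting the same value twice costs one driver call, and a push/set/pop sequence leaves the state as it was found.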

When GL3 shaders are used, the fixed function state is ignored. This includes:

transform state (gl_ModelViewMatrix, gl_NormalMatrix, gl_ProjectionMatrix and inverses)
lighting state (gl_LightSource[] )
material state (gl_Material )
fog
clip planes (GL3 uses gl_ClipDistance )

The relevant GL state is:

blending
current shader
current Vertex Array Object or Vertex Buffer bindings
scissor/viewport settings
depth state
framebuffer
textures

You can dump the current GL state using RE_Render::dumpNewState().

An RE_Render object always has a GL context behind it.

The viewport in Houdini 14+ requires OpenGL 3.3 or higher, so it is safe to assume that all features of OpenGL 3.3 are available. OpenGL 4 features should be queried by their extension names before use (using ::hasGLExtension(), with an extension from the list in RE_Extension.h).

RE_Texture

An RE_Texture represents a single GL texture object and provides a simple interface for defining and using GL textures. Currently, Houdini supports 1D, 2D, 3D, cubemap, 1D array, 2D array, rectangle, buffer, and 2D multisample textures. The interface to these textures is very similar, so that they can all be manipulated through a generic RE_Texture handle.

// Create and assign data to a 2D texture, given 'RE_Render *r'
RE_Texture *tex2d = RE_Texture::newTexture(RE_TEXTURE_2D);
tex2d->setResolution(200, 200);
tex2d->setFormat(RE_GPU_UINT8, 4);
tex2d->setTexture(r, texture_data); // data is uchar[4] array

// Create a 3D Texture
RE_Texture *tex3d = RE_Texture::newTexture(RE_TEXTURE_3D);
tex3d->setResolution(40,40,40);
tex3d->setFormat(RE_GPU_FLOAT16, 1);
tex3d->setTexture(r, texture_data); // data is fpreal16 array

// Create a buffer texture with 'RE_OGLBuffer *buf' (TBO support required)
RE_Texture *texbuf = RE_Texture::newTexture(RE_TEXTURE_BUFFER);
texbuf->setResolution(buf->getLength());
texbuf->setFormat(RE_GPU_FLOAT32, 1); // some restrictions on vec3 usage
texbuf->setTexture(r, buf, RE_TEXTURE_BUFFER); // data is RE_OGLBuffer with fpreal32 data

Textures can be bound to samplers using RE_Render::bindTexture(). pushTextureState() can be used to preserve previous texture bindings. When binding textures for shaders, RE_Shader::getUniformTextureUnit() can be used to query the sampler index.

Textures can also be created as render targets for framebuffer objects. RE_OGLFramebuffer::createTexture() will create and bind a texture to a framebuffer object.

RE_Shader

Shaders of all GLSL versions are represented by RE_Shader. Houdini currently supports Vertex, Tessellation, Geometry and Fragment shaders. A shader can be created from C-style strings or plain text files found in the HOUDINI_GLSL_PATH.

An RE_ShaderHandle is a simple interface for loading full shaders with a minimal amount of code. It uses the .prog file format, which is a simple list of shader files to load with a few markups. Using an RE_ShaderHandle, you can easily load, set up and use a shader.

// Given RE_Render *r, RE_Texture *tex
RE_ShaderHandle sh("sample.prog");
if(!sh.isValid(r))
{
   sh.printErrors();
   return; 
}
sh->bindInt(r, "someuniform", 1);  // -> operator accesses underlying RE_Shader
r->bindTexture(tex, sh->getUniformTextureUnit("somesampler")); // sampler index is queried from the shader
r->bindShader(sh); // make it the current shader
// draw something

The .prog format is a text-based list of directives and files. Here is an example:

#name Simple Surface Shader
#version 150
#input P 0
#input N 1
#input Cd 2
#input Alpha 3
#output color 0
surface/simple.vert
surface/simple.frag

And a more complex example, with a shader pipeline to use if the GL extension GL_ARB_image_load_store is present, and a fallback to a different shader pipeline if it is not (version can also be used in this way; #version 430, #version 150):

#name Multisample Depth blit shader
#version 150
#extension GL_ARB_image_load_store
#input pos 0
#output depth 0
#output mindepth 1
#output maxdepth 2
util/depth_blit.vert
util/depth_blit.geom
util/depth_blit_load_store.frag
// for GL3.2 without image load/store
#version 150
#input pos 0
#output depth 0
#output mindepth 1
#output maxdepth 2
util/depth_blit.vert
util/depth_blit.geom
util/depth_blit.frag

The name directive gives the shader a descriptive name. This is optional; otherwise the .prog filename will be the shader's name.

The version directive defines the GLSL version this shader uses. This is the same as the GLSL directive, except that 'core', 'compatibility' and 'es' modifiers are not allowed. It may be defined multiple times, each time with a different value defining a separate set of files to load. The highest version supported by the implementation will be loaded. At least one #version directive must exist, before any shader files are specified.

The extension directive defines an OpenGL extension that must be present in order for the shader set to be accepted. The parameter must be the exact name of the GL extension to use, such as GL_ARB_compute_shader or GL_EXT_framebuffer_object. Multiple #extension directives may be specified for the same shader set, and all of them must be present in order for the set to be loaded. A new #version directive will clear the list of required extensions.
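
As a rough illustration of how a shader set might be selected from the #version and #extension rules above, here is a self-contained sketch. ShaderSet and chooseShaderSet() are hypothetical names, not part of RE, and the tie-break (the earlier set wins when versions are equal) is an assumption inferred from the depth blit example, where the image load/store set is listed before its fallback.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical in-memory form of one '#version' section of a .prog file.
struct ShaderSet
{
    int                      version;     // value of the #version directive
    std::vector<std::string> extensions;  // every #extension line of this set
};

// Returns the index of the set to load, or -1 if none is usable.
// A set is usable if its version is supported and all of its extensions
// are available. Assumption: among usable sets with the same version,
// the one listed first wins.
int chooseShaderSet(const std::vector<ShaderSet> &sets,
                    int max_glsl_version,
                    const std::set<std::string> &available_extensions)
{
    int best = -1;
    for (int i = 0; i < static_cast<int>(sets.size()); ++i)
    {
        const ShaderSet &s = sets[i];
        if (s.version > max_glsl_version)
            continue;

        bool all_present = true;
        for (const std::string &ext : s.extensions)
        {
            if (available_extensions.count(ext) == 0)
            {
                all_present = false;
                break;
            }
        }
        if (!all_present)
            continue;

        if (best < 0 || s.version > sets[best].version)
            best = i;
    }
    return best;
}
```

With the two sets of the depth blit example, index 0 is chosen when GL_ARB_image_load_store is available and index 1 otherwise.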

The input directive binds generic vertex attributes to specific attribute indices. Only generic vertex attributes can be assigned (no GL-builtins with the gl_ prefix may be bound), and only to valid indices (from 0 to the maximum GL-supported index, usually 15). This provides a similar function to the GL_ARB_explicit_attribute_location extension. These are optional, must be specified after a #version directive, and are cleared when a new #version directive is encountered.

There is also a hou_attrib_map directive, which assigns attributes known to Houdini to standard binding locations:

P = 0 (vec3)
Cd = 1 (vec3)
Alpha = 2 (float)
N = 3 (vec3)
uv = 4 (vec2)
pointScale = 5 (float)
pointSelection = 6 (int)
pointID = 7 (int)
instIndex = 8 (int)

All other generic attributes are assigned starting at attribute location 9.
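
The table above can be expressed as a small lookup, sketched below. AttribLocationMap is a hypothetical helper, not an RE class. The order in which non-standard attributes receive locations 9 and up is an assumption (first requested, first assigned), and the sketch does not check the GL maximum attribute index.

```cpp
#include <string>
#include <unordered_map>

// Hands out attribute locations following the table above: known Houdini
// attributes get fixed slots, everything else is numbered from 9 upward
// in the order it is first requested (an assumption of this sketch).
class AttribLocationMap
{
public:
    int location(const std::string &name)
    {
        static const std::unordered_map<std::string, int> standard = {
            {"P", 0}, {"Cd", 1}, {"Alpha", 2}, {"N", 3}, {"uv", 4},
            {"pointScale", 5}, {"pointSelection", 6}, {"pointID", 7},
            {"instIndex", 8}
        };

        auto it = standard.find(name);
        if (it != standard.end())
            return it->second;

        // emplace() leaves an existing entry untouched, so a repeated
        // request returns the location assigned the first time.
        auto result = myGeneric.emplace(name, myNextGeneric);
        if (result.second)
            ++myNextGeneric;
        return result.first->second;
    }

private:
    std::unordered_map<std::string, int> myGeneric;
    int                                  myNextGeneric = 9;
};
```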

The output directive binds named fragment shader outputs to specific draw buffer indices. This is optional, must appear after a #version directive and is cleared when another #version directive is encountered. This is useful for shaders that output more than one value to different draw buffers/textures.

The define directive passes a defined symbol to the shader.

The shader files themselves are loaded from the HOUDINI_GLSL_PATH (default path value is HOUDINI_PATH/glsl ). The files must have specific extensions - .vert for vertex shaders, .geom for geometry shaders, and .frag for fragment shaders. For GL2 shaders, any shader stage is optional, though at least one stage must exist. For GL3 shaders, only the geometry shader is optional, as the fixed function state will not be set up correctly for rendering (unless the GL state is initialized by the render hook itself; or transform feedback is being used without rasterization, in which case the fragment shader can be omitted).

Finally, a comment can be specified on a line by starting it with // . Comments cannot be specified on the same line as a directive or shader filename.
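
To make the line-level rules concrete, the following sketch classifies a single .prog line using only what is stated above: directives start with #, comments start with //, and shader files are recognized by their extension. ProgLine and classifyProgLine() are hypothetical names, and this is not the parser Houdini uses. Only the three file extensions named in this section are handled.

```cpp
#include <string>

enum class ProgLine
{
    Blank, Comment, Directive, VertexFile, GeometryFile, FragmentFile, Unknown
};

// True if 's' ends with 'suffix'.
static bool endsWith(const std::string &s, const std::string &suffix)
{
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Classify one line of a .prog file. Assumes the line has no leading
// whitespace and no trailing newline. Since comments cannot share a line
// with a directive or filename, checking the line start is sufficient.
ProgLine classifyProgLine(const std::string &line)
{
    if (line.empty())                  return ProgLine::Blank;
    if (line.compare(0, 2, "//") == 0) return ProgLine::Comment;
    if (line[0] == '#')                return ProgLine::Directive;
    if (endsWith(line, ".vert"))       return ProgLine::VertexFile;
    if (endsWith(line, ".geom"))       return ProgLine::GeometryFile;
    if (endsWith(line, ".frag"))       return ProgLine::FragmentFile;
    return ProgLine::Unknown;
}
```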

The .prog file format may be expanded with more directives in the future.

You can also create shaders directly from a C-style string, or load them individually:

// given RE_Render *r, const char *shader_source_string
RE_Shader *sh = RE_Shader::create("My Shader");
UT_String errors;
sh->addShader(r, RE_SHADER_VERTEX, shader_source_string, 
              "optional readable name", 150, // glsl version
              &errors);
sh->loadShader(r, RE_SHADER_FRAGMENT, "shaderfile.txt", &errors, 
               "optional readable name");
sh->linkShaders(r, &errors);

Once a shader is loaded, uniforms can be bound to it using the bind*() family of functions. Uniform blocks can be bound as well, using RE_UniformBlock . Finally, transform feedback can be set up using the add/getFeedbackBuffer() family of methods (and enabled via RE_Render::begin/endTransformFeedback() ).

To make a shader active, either RE_Render::push/popShader() or RE_Render::bindShader() can be used. A shader that does not compile or link will not have any effect when bound.

Uniform Blocks

A uniform block is a GLSL structure which is bound to a buffer object. The structure may contain any GLSL type except for samplers. To declare a uniform block in a GLSL shader, define a structure at the global scope with the uniform qualifier. An example of a uniform block:

uniform LightingBlock
{
   vec3 diff;
   vec3 spec;
   vec3 pos;
   vec3 dir;
   vec3 atten;
   float cutoff;
};

RE_Shader will enumerate all declared uniform blocks and these can be accessed through getNumUniformBlocks() and getUniformBlock(index). An RE_UniformBlock object will be returned for each uniform block declared. You can either modify this uniform block or provide an "override block" which will be bound instead of the default shader block. The override block should be generated via RE_UniformBlock::copy(), preferably from the uniform block it is replacing. This allows you to keep uniform blocks locally over several renders and pay a small binding cost per render, versus continuously modifying the shader's uniform block and uploading the buffer. Once you are done rendering with a local override block, it should be removed from the shader.

RE_UniformBlock has many of the same bind...() calls as RE_Shader to set values of its various elements. These don't really bind anything, but are named similarly to the RE_Shader methods for simplicity. Each bind() call will check if the new value matches the previous value, and only trigger an upload of the underlying buffer if the block has actually changed.
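
The "only upload when changed" behaviour can be sketched with a CPU-side shadow copy and a dirty flag. ShadowBlock is a hypothetical class written for this illustration. It does not reflect how RE_UniformBlock is implemented, and addressing members by byte offset is an assumption made to keep the example short.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical CPU-side shadow of a uniform block's buffer.
class ShadowBlock
{
public:
    explicit ShadowBlock(std::size_t bytes) : myData(bytes, 0) {}

    // Copy 'size' bytes to byte 'offset'. The block is marked dirty only
    // if the stored bytes actually change. Returns false if the write
    // would fall outside the block.
    bool bind(std::size_t offset, const void *value, std::size_t size)
    {
        if (offset > myData.size() || size > myData.size() - offset)
            return false;

        unsigned char *dst = myData.data() + offset;
        if (std::memcmp(dst, value, size) != 0)
        {
            std::memcpy(dst, value, size);
            myDirty = true;
        }
        return true;
    }

    // Call before drawing. Returns true if an upload was needed.
    bool uploadIfDirty()
    {
        if (!myDirty)
            return false;
        myDirty = false;   // a real block would upload its buffer here
        ++myUploads;
        return true;
    }

    int uploads() const { return myUploads; }

private:
    std::vector<unsigned char> myData;
    bool                       myDirty = false;
    int                        myUploads = 0;
};
```

Binding the same value twice in a row results in a single upload, which is the saving the paragraph above describes.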

Transform Feedback

Transform feedback is named for reading back vertex values after running through vertex processing, but really any value output by the vertex or geometry shader can be read. This allows for pre-processing of vertex values on the GPU.

Transform feedback is set up on a shader by specifying varying outputs to capture with RE_Shader::addFeedbackBuffer() . If both a vertex and geometry shader are present in a shader, only the geometry shader outputs may be collected. There is a maximum number of buffers that can be collected at once (usually 4). Once these outputs have been set, the shader must be relinked. Since linking can destroy uniform values, this should be done early in the shader setup.

Transform feedback can be enabled during regular rendering via the RE_Render::beginTransformFeedback() method. Here you must specify the class of primitive being rendered - points, lines, or triangles, and whether the primitives carry on to the rasterizer stage. When the primitive type is points, one vertex is captured per primitive; lines, two; and triangles, three.

The shader must be active before beginTransformFeedback() is called, and the shader cannot be changed while transform feedback is active (until after endTransformFeedback() is called). The primitive types being rendered must also match the primitive class that transform feedback was started with. If a geometry shader is present, this refers to the primitive type output by the geometry shader, and not necessarily the type passed when drawing. These are OpenGL restrictions.

Transform feedback is very handy for debugging shaders when nothing is appearing in the viewport (after setting the fragment shader to output vec4(1) to ensure the fragment color isn't the issue). This code will print out the vertex positions, and X,Y, and Z should lie within [-1,1] once divided by W:

// Given RE_Render *r, RE_Shader *sh
sh->addFeedbackBuffer(RE_BUFFER_POSITION);
sh->linkShaders(r);
// set up uniform state
r->pushShader(sh);
r->beginTransformFeedback(RE_PRIM_TRIANGLES, true);
// draw
r->endTransformFeedback();
r->popShader();
RE_VertexArray *pos = sh->getFeedbackBuffer(RE_BUFFER_POSITION);
if(pos)
{
    // primitive count is for the last feedback only; 3 vertices are captured per triangle
    int num = r->getNumFeedbackPrimitives() * 3;
    UT_Vector4F *p = (UT_Vector4F *) pos->map(r, RE_BUFFER_READ_ONLY);
    if(p)
    {
        for(int i=0; i<num; i++)
        {
           p[i].dehomogenize();
           cerr << p[i] << endl;
        }
        pos->unmap(r);
    }
}        

Similarly, you can inspect other values being produced by the vertex or geometry shader.
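
The [-1,1] check mentioned above can be written as a small helper that is independent of RE. ClipPos and insideNDC() are hypothetical names for this sketch. Positions with W less than or equal to zero are treated as outside, since OpenGL's clip test (-W <= X,Y,Z <= W) cannot pass for negative W and the divide is undefined for zero.

```cpp
#include <cmath>

// Homogeneous clip-space position, as written to gl_Position.
struct ClipPos
{
    float x, y, z, w;
};

// True if the point lands inside the [-1,1] cube after the divide by W.
// Points with W <= 0 are reported as outside (see the note above).
bool insideNDC(const ClipPos &p)
{
    if (p.w <= 0.0f)
        return false;

    const float x = p.x / p.w;
    const float y = p.y / p.w;
    const float z = p.z / p.w;
    return std::fabs(x) <= 1.0f &&
           std::fabs(y) <= 1.0f &&
           std::fabs(z) <= 1.0f;
}
```

Running this over the captured positions quickly shows whether the geometry is being transformed out of view.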

RE_Geometry, RE_VertexArray and RE_ElementArray

In order for a shader to run, it needs its vertex shader fed by vertex values. RE_Geometry contains vertex arrays which are fed to the shader. Drawing an RE_Geometry object with a current shader will automatically bind vertex attributes, making rendering of complex objects very simple:

// given RE_Geometry *geo, RE_Shader *sh
r->pushShader(sh);
geo->draw(r, "shaded");
r->popShader();

Of course, it is up to you to define the proper vertex attributes and their data that a specific shader requires.

An RE_Geometry object contains multiple RE_VertexArray objects, each representing a specific vertex attribute. It also contains one or more connection groups, which define how these vertices are connected.

Each geometry object has a specific number of points, primitives and vertices. Generally only the number of points needs to be specified. This defines the length of each vertex attribute with the RE_ARRAY_POINT array type (the default). In GL3, primitive and vertex attribute types can also be defined with the RE_ARRAY_PRIMITIVE and RE_ARRAY_VERTEX types, though these are accessed through a texture buffer object rather than a vertex shader input. Attributes of the same array type must be the same length (or, have at least that many elements).

Vertex attributes have names which associate them with vertex shader inputs (for RE_ARRAY_POINT, RE_ARRAY_DETAIL (constant), and instanced vertex arrays). For TBO-based vertex attributes (RE_ARRAY_PRIMITIVE, RE_ARRAY_VERTEX, RE_ARRAY_RANDOM), the name of the samplerBuffer in the shader should be attr<name>.

Vertex attributes can be of nearly any supported RE_GPUType, though the specific depth and stencil formats are not supported (RE_GPU_FLOAT24, RE_GPU_UINT1,2,4). Matrices are also supported, though these require multiple GL attribute locations so care should be taken when using them - there are a limited number of vertex attributes (mat2 uses 2 slots, mat3 uses 3, and mat4 uses 4). Some attribute formats are non-optimal on some hardware. For example, a vec3 FP16 attribute is about 4x slower than vec4 FP16 or vec3 FP32 on current AMD hardware, due to its 4B alignment preference.
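
A quick way to budget attribute locations when matrices are involved is sketched below. AttribKind, slotsUsed() and fitsInSlots() are hypothetical names for this illustration. The default limit of 16 locations corresponds to the "usually 15" maximum index mentioned for the #input directive and should be replaced with the value reported by the GL implementation.

```cpp
#include <vector>

enum class AttribKind { Scalar, Vec2, Vec3, Vec4, Mat2, Mat3, Mat4 };

// Attribute locations consumed by one attribute. Matrices take one
// location per column; scalars and vectors take one.
int slotsUsed(AttribKind kind)
{
    switch (kind)
    {
    case AttribKind::Mat2: return 2;
    case AttribKind::Mat3: return 3;
    case AttribKind::Mat4: return 4;
    default:               return 1;
    }
}

// True if all attributes fit within 'max_slots' attribute locations.
bool fitsInSlots(const std::vector<AttribKind> &attribs, int max_slots = 16)
{
    int total = 0;
    for (AttribKind kind : attribs)
        total += slotsUsed(kind);
    return total <= max_slots;
}
```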

RE_Geometry uses buffer objects for storage of the attribute data. Buffer objects allow the data to be sent to the GPU once and reused multiple times.

Caching is also available for buffers attached to an RE_Geometry object, as long as it is set up with a cache name:

RE_Geometry geo;

geo.cacheBuffers("some_unique_id_string");

This will store buffers in the cache and allow retrieval at a later time, or by other objects, using:

RE_VertexArray *pos = geo.findCachedAttribOrArray(r, "P", RE_GPU_FLOAT32, 3,
                                                  RE_ARRAY_POINT, 
                                                  true); // create if not found
if(pos && pos->getCacheVersion() != some_version_number)
{
   // update data in buffer
   pos->setCacheVersion(some_version_number);
}

To create a new attribute, the above method can be used with the create_missing parameter set to true, or using one of the following methods:

// Given int npoints, UT_Vector3F pdata[npoints]
const fpreal32 col[3] = { 1.0, 0.5, 0.25 };
RE_VertexArray *attr;
geo.setNumPoints(npoints);
// normal point attribute arrays (GL1/2, GL3)
attr = geo.createArray(r, RE_BUFFER_POSITION, 0, RE_GPU_FLOAT32, 3,
                       pdata, RE_ARRAY_POINT);
attr = geo.createAttribute(r, "P", RE_GPU_FLOAT32, 3, pdata, RE_ARRAY_POINT);
// constant-valued (GL1/2, GL3)
attr = geo.createConstArray(r, RE_BUFFER_COLOR, 0, RE_GPU_FLOAT32, 3,
                            col);
attr = geo.createConstAttribute(r, "Cd", RE_GPU_FLOAT32, 3, col);

The create methods may fail if the data format is incorrect (e.g., RE_GPU_FLOAT24), the length is too large for GL to create, or the vector size parameter is outside the range 1-4. The data pointer may be NULL if the buffer is used to read back data from a transform feedback or other GPU buffer write operation. It can also be NULL if data will be assigned to it later via the setArray() or map() calls.

UT_Vector4F *pdata = (UT_Vector4F *) pos->map(r);
if(pdata)
{
    // assign values to pdata
    pos->unmap(r); // cannot be drawn or GPU-written until unmapped
}
pdata = (UT_Vector4F *) pos->mapRange(r, 0, 100); // map the first 100 UT_Vector4F elements
if(pdata)
{
    // assign values to pdata
    pos->unmap(r);
}
pos->setArray(r, position_data);
pos->setSubArray(r, position_data, 100, 100); // 100-199 UT_Vector4F elements

RE_VertexArray contains an underlying private object which contains the GL vertex array. It acts as a container object, allowing the cache to free buffers as it needs without deleting any RE_VertexArray objects that may be stored locally. It also abstracts the difference between variable and constant valued attributes (GL buffer or arrays vs. default GL attribute value).

Once vertex data is set up for the object, GL must be told how to connect these vertices. In some instances, this is very simple - such as points. In others, an element array must be created to define connectivity. The simple cases are set up with:

const int pnt_index = 0;
const int line_index = 1;
const int tri_index = 2;
geo.connectAllPrims(r, pnt_index, RE_PRIM_POINTS); // draw each vertex as a point
geo.connectAllPrims(r, line_index, RE_PRIM_LINES); // draw each pair of vertices as a line
geo.connectAllPrims(r, tri_index, RE_PRIM_TRIANGLES); // draw each trio of vertices as a triangle
geo.connectSomePrims(r, line_index, RE_PRIM_LINE_STRIP, 0, 12); // first 12 vertices form a continuous line
geo.connectSomePrims(r, tri_index, RE_PRIM_TRIANGLE_STRIP, 12, 22); // next 22 vertices form a triangle strip of 20 triangles

The index is the ID of the connectivity group, used to draw, delete or add to it. For example,

geo.connectSomePrims(r, line_index, RE_PRIM_LINE_STRIP, 34, 6);

would add another line strip to connectivity group line_index, in addition to the one added above.
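
The vertex and primitive counts quoted in the comments above (for example, 22 vertices forming a strip of 20 triangles) follow from how GL assembles primitives. The sketch below restates that arithmetic. PrimType and primitiveCount() are hypothetical names that stand in for the corresponding RE_PRIM_* types and are not part of RE.

```cpp
enum class PrimType { Points, Lines, LineStrip, Triangles, TriangleStrip };

// Number of primitives GL assembles from 'num_vertices' consecutive
// vertices. Leftover vertices that do not complete a primitive are ignored.
int primitiveCount(PrimType type, int num_vertices)
{
    if (num_vertices <= 0)
        return 0;

    switch (type)
    {
    case PrimType::Points:        return num_vertices;
    case PrimType::Lines:         return num_vertices / 2;
    case PrimType::LineStrip:     return num_vertices > 1 ? num_vertices - 1 : 0;
    case PrimType::Triangles:     return num_vertices / 3;
    case PrimType::TriangleStrip: return num_vertices > 2 ? num_vertices - 2 : 0;
    }
    return 0;
}
```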

To draw the connectivity group, call one of RE_Geometry's draw() methods. You can also use the indexed connectivity group calls, which use integers instead of strings for the group names and have slightly better performance (they are suffixed with 'I'). Drawing a connectivity group will set up the appropriate vertex arrays, and possibly the element array, and render using one of the glDraw...() functions.

More complex patterns of connectivity can be set up using RE_ElementArray. Similar to RE_VertexArray, this object creates a buffer object, but one which is used by GL to index vertices. It can be cached in the GL cache in the same way as RE_VertexArray.

const int num_elements_hint = 15; // hint does not need to be exact
const int tri_index = 2;
RE_ElementArray array(num_elements_hint); 
const uint single_tri_indices[] = {0,2,3};
const uint many_tri_indices[] = {4,5,6, 4,5,7, 5,6,7, 6,7,8 };
array.setCacheName("some_unique_name_element_group");
array.setPrimitiveType(RE_PRIM_TRIANGLES);
array.beginPrims(r);
array.addTriangle(r, single_tri_indices);
array.addPrimitives(r, many_tri_indices, 12);
array.endPrims(r);
array.setCacheVersion( some_cache_version );
geo.connectSomePrims(r, tri_index, RE_PRIM_TRIANGLES, &array);
// To draw...
geo.draw(r, tri_index);

Instanced Drawing

GL3 can render a single object multiple times using instancing. A GL3 shader must be used in order to draw instanced geometry. There are two ways to do instancing:

Create an instanced array on RE_Geometry using createInstancedAttribute(). This can only be accessed in the vertex shader as a vertex shader input.
Create a random-access array with RE_Geometry (of type RE_ARRAY_RANDOM), and use the vertex shader's gl_InstanceID to index the Texture Buffer Object that holds the buffer data. This can be accessed in any shader stage.

The former is a bit easier to set up, but the latter allows you to index the array however you want, rather than just advancing the index once every 'step' instances. More than one attribute can be instanced.
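
The 'step' behaviour of an instanced array can be written out as plain integer arithmetic. The sketch below assumes the usual GL attribute divisor semantics, where the array element advances once every 'step' instances. instancedElement() and requiredElements() are hypothetical helper names, not RE functions.

```cpp
#include <cassert>

// Element of an instanced array read by instance 'instance_id' when the
// array advances once every 'step' instances.
int instancedElement(int instance_id, int step)
{
    assert(instance_id >= 0 && step > 0);
    return instance_id / step;
}

// Minimum number of array elements needed to draw 'num_instances'
// instances with the given step (rounded up).
int requiredElements(int num_instances, int step)
{
    assert(num_instances >= 0 && step > 0);
    return (num_instances + step - 1) / step;
}
```

With a step of 1, every instance reads its own element. With a random-access array, the shader is free to compute any index from gl_InstanceID instead.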

To set up instancing with a simple translation of the base object using the first method:

UT_Vector3FArray xlate_data;
xlate_data.setSize(num_instances);
// -- Fill xlate_data with translations here --
geo.createInstancedAttribute(r, "xlate", RE_GPU_FLOAT32, 3, num_instances, xlate_data.array()->data(), 1);
geo.drawInstanced(r, connect_group_index, num_instances);

// vertex shader
#version 330
in vec3 P;
in vec3 xlate;
uniform mat4 glH_ProjectMatrix;
uniform mat4 glH_ObjectMatrix;
uniform mat4 glH_ViewMatrix;
void main()
{
   vec4 pos = vec4( P + xlate, 1.0);
   gl_Position = glH_ProjectMatrix * (glH_ViewMatrix * (glH_ObjectMatrix * pos));
}

And the second:

UT_Vector3FArray xlate_data;
xlate_data.setSize(num_instances);
// -- Fill xlate_data with translations here --
geo.createAttribute(r, "xlate", RE_GPU_FLOAT32, 3, xlate_data.array()->data(), RE_ARRAY_RANDOM, num_instances);
geo.drawInstanced(r, connect_group_index, num_instances);

// vertex shader
#version 330
in vec3 P;
uniform samplerBuffer attrxlate;
uniform mat4 glH_ProjectMatrix;
uniform mat4 glH_ObjectMatrix;
uniform mat4 glH_ViewMatrix;
void main()
{
   vec3 xlate = texelFetch(attrxlate, gl_InstanceID ).xyz;
   vec4 pos = vec4( P + xlate, 1.0);
   gl_Position = glH_ProjectMatrix * (glH_ViewMatrix * (glH_ObjectMatrix * pos));
}

It's also possible to mix the two methods, with some attributes instanced and others randomly indexed in the shader. Generally, though, instanced attributes are the preferred method.

Other RE Objects

Besides the main RE objects required for rendering, there are some support classes which can be used for various rendering effects.

RE_OGLFramebuffer

This represents a Framebuffer Object (FBO), which is used to do offscreen rendering. It is invaluable in multipass drawing algorithms. An FBO can render to both renderbuffers and textures. Textures used as a render target can then be read from a shader in a subsequent rendering pass. This is the basis of many multi-pass rendering tricks. FBOs can also be multisampled, for effects such as antialiasing and order-independent transparency.

An RE_OGLFramebuffer can create textures and renderbuffers for specific attachments, or attach existing textures and renderbuffers. There are multiple color attachments, but only one depth/stencil attachment (separate stencil and depth attachments are generally not supported by OpenGL implementations). This allows you to render to up to 8 color attachments at a time, sharing a single depth/stencil buffer. In order to properly take advantage of multiple color attachments, a fragment shader with multiple outputs should be used.

Framebuffer attachments need to be compatible with one another, otherwise the FBO will be "incomplete". The criteria vary by OpenGL implementation, but some things that can make an FBO incomplete are missing attachments for draw buffers, incorrect texture types for render targets, or inconsistent texture sizes.

A framebuffer can be either the current draw or read framebuffer. The draw framebuffer is affected by rendering commands, while the read framebuffer is the source for read commands (e.g., glReadPixels). This is stored in the RE_Render object, and set by push/set/popDrawFramebuffer() and push/set/popReadFramebuffer() . Once a framebuffer is active, it can be operated on as if it were the default (window) framebuffer.

Synchronization

A sync object allows two OpenGL contexts in different threads to synchronize with one another. The most common case is an update/render thread setup, where one thread populates OpenGL objects with data (textures, buffers) and the other renders them. An RE_CommandSync object can be used to ensure that the setup thread has completed its operation before the render thread uses the object that was being written.

// Given RE_Render *r
RE_CommandSync fence;
// do some GL stuff for object A
fence.insertSyncPoint(r);
// In another thread that depends on object A:
fence.insertWaitPoint(r);
// render object A - this command will be queued without stalling the CPU

This will cause the GPU to wait on the sync point.

There is also a way to make the CPU wait on a sync point, using RE_RenderWait:

// Given RE_Render *r
RE_RenderWait fence;
// Do some GL stuff
fence.insertSyncPoint(r);
// Other useful work not depending on the data before the sync
// This command will stall the CPU until the GPU has completed its prior work
fence.wait(r);
// now we can proceed with work on the data before the sync

RE_OcclusionQuery

An occlusion query is used to determine if a rendered object produced any visible pixels, and how many pixels were visible. Note that this query only reports whether the pixels were rendered at the time of the query - if they are subsequently overwritten, this will not be reflected in the query. This sort of query is good for several cases:

For a complex object, a bounding box can be pre-rendered to determine if the object should be drawn at all, possibly in conjunction with conditional rendering (RE_Render::beginConditionalRender() ).
When rendering multiple passes, an occlusion query can be used to eliminate objects to be drawn on the second and subsequent passes if they did not render any pixels in the first pass.
For effects such as lens flare, a light object can be rendered as a disc and the query can determine the percentage of the light that is visible to attenuate the intensity of the flare.
For debugging, it can tell you if the object draw actually rendered anything.

An occlusion query is implemented by RE_OcclusionQuery. Queries cannot be shared between GL contexts. It has a simple begin/end operation, and once this has been done, the number of pixels can be queried (getNumDrawn() ). The query is asynchronous, which means that the draw operation will not stall the CPU on end(), only on getNumDrawn() .

// Given RE_Render *r
RE_OcclusionQuery drawn(NUM_SAMPLES); // can also be 'BOOLEAN', true if any pixels were drawn
drawn.begin(r);
// draw some objects
drawn.end(r);
// this may stall the CPU waiting on the draw to complete. If you can do other
// work before this call, the latency can be hidden.
int num_drawn = drawn.getNumDrawn(r);

It is important to note that you cannot create a new query while another query of any type is active (an OpenGL restriction).

RE_TimerQuery

A timer query can be used to determine how long an operation took to execute on the GPU. Because OpenGL is asynchronous, timing the GL calls on the CPU will not produce accurate results: the commands are added to a queue and processed at a later time. A timer query places commands in the command queue which will accurately record the elapsed time that the operation took on the GPU. This is reported in nanoseconds (1e-9 seconds).

Timer queries are not shared between GL contexts, and they cannot be nested. If you need to nest queries, instead of using the begin/end syntax, request timestamps at the start and end of the operation using two timer queries, and compute the elapsed time manually.

// given RE_Render *r
RE_TimerQuery timer;
timer.begin(r);
// update buffers, draw stuff, etc.
timer.end(r);
// sometime later...
// nanoseconds to milliseconds: multiply by 1e-6 (the result is a double)
printf("Elapsed time for operation: %fms\n", timer.getElapsedTimeNS(r) * 1e-6);

To record timestamps:

// given RE_Render *r
RE_TimerQuery start, end;
// insert a timestamp request into the command stream
start.recordTimestamp(r);
// do some stuff
end.recordTimestamp(r);
// request the timestamps and compute the elapsed time. This may stall the CPU.
int64 time = end.getTimeStampNS(r) - start.getTimeStampNS(r);

This is an easy way to avoid the nesting restriction of timer objects.
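
Once two timestamps have been read back, converting their difference to a more convenient unit is simple arithmetic. The helper below is a sketch that is independent of RE, and elapsedMilliseconds() is a hypothetical name. It assumes both timestamps are in nanoseconds, as described above.

```cpp
#include <cstdint>

// Convert a pair of GPU timestamps (in nanoseconds) to elapsed
// milliseconds. One millisecond is 1e6 nanoseconds.
double elapsedMilliseconds(std::int64_t start_ns, std::int64_t end_ns)
{
    return static_cast<double>(end_ns - start_ns) / 1.0e6;
}
```

Because each interval only needs its own pair of timestamps, an inner interval can be measured inside an outer one without any nesting of begin/end queries.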