Houdini 20.5 HQueue

Jobs specification details

More detail on the internals of a job specification for users who want to submit custom jobs.

The Job Specification

Every HQueue job is defined by a specification – a JSON structure containing the job’s properties. Here is a simple example:

{ 
    "name": "Print Hello World",
}

The specification defines a job with exactly one property: the job name. This is a valid specification because only the name property is required. However, the job does nothing useful. It finishes and succeeds immediately, without performing any work on a client machine.

To execute tasks on the client, add a set of commands to the specification using the command property. For example:

{ 
    "name": "Print Hello World",
    "command" : "echo 'Hello World!'",
}

When the job is assigned to a client machine, it outputs “Hello World!” using the default shell installed with the machine’s operating system.

To execute the command with a particular shell, add the shell property to the specification like so:

{ 
    "name": "Print Hello World",
    "command" : "echo 'Hello World!'",
    "shell" : "bash",
}

Several commands can be stored in a job and executed in sequence by combining them as a single command. For example:

{ 
    "name": "Multiple Print Commands",
    "command" : "echo 'The First Command' && echo 'The Second Command'",
}

How commands are combined depends on the shell, but separating commands with && or ending each command with ; works in most cases.
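As a minimal sketch of the combining step, the helper below joins a list of shell commands into a single string for the command property. The function name combine_commands is hypothetical and not part of HQueue, and the sketch assumes a shell such as bash, where && runs the next command only if the previous one succeeded and ; runs it regardless.

```python
def combine_commands(commands, stop_on_error=True):
    """Join shell commands into one string for the "command" property.

    With stop_on_error=True the commands are chained with "&&", so a
    failing command prevents the later ones from running. Otherwise they
    are separated with ";" and run one after another regardless.
    """
    separator = " && " if stop_on_error else "; "
    return separator.join(commands)


job_spec = {
    "name": "Multiple Print Commands",
    "command": combine_commands(
        ["echo 'The First Command'", "echo 'The Second Command'"]
    ),
}
```

For anything longer than a few commands, a script file is usually easier to maintain than a long joined string.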

For complex commands, it is easier to store the commands in a script file and then point the command property to the script file. For example:

{ 
    "name": "Running a Script",
    "command" : "/path/to/myScript.sh",
}

For a complete list of the job properties that can be defined in the specification, see Job Properties.

Status Changes

An HQueue job passes through a sequence of status changes from when it is submitted to HQueue to when it completes. A basic job like the example previously mentioned undergoes this sequence:

waiting for machine → running → succeeded

When a job is submitted to HQueue, it is placed into the scheduling queue where it waits for a client machine. Once a machine becomes available and is assigned, then the machine runs the job’s commands. When execution of the job’s commands finishes without errors, the job is marked as succeeded.

For a complete list of job statuses, see Job Statuses.

Parent-Child Relationships

A dependency can be set between two jobs so that the first job cannot run until the second job completes. This dependency is called a parent-child relationship in HQueue. For example, if job A depends on job B, then job A is a parent of job B and job B is a child of job A.

A Simple Parent-Child Example

Suppose you want to create an AVI video from X frames rendered from a Houdini scene. Assuming that IFD files have already been generated for the frames, you can do this in two steps:

  1. Render the frames from the IFD files using Mantra.

  2. Encode the rendered images into a video.

For the first step, you can define X HQueue jobs, one job for every frame to be rendered. The job specification for rendering frame 1 looks like:

{ 
    "name": "Render Frame 1",
    "command": 
    """cd $HQROOT/houdini_distros/hfs;
       source houdini_setup;
       mantra < $HQROOT/path/to/ifds/frame0001.ifd""",
    "shell": "bash",
}

The job specifications for the rest of the frames are similar.

Now suppose that Mantra renders the images to $HQROOT/path/to/output/frame*.png, where $HQROOT is the mount point of the main network folder registered with HQueue. The job specification for the encoding step would then look like:

{ 
    "name": "Encode Video",
    "shell": "bash",
    "command": "myEncoderApp --input=$HQROOT/path/to/output/frame*.png --output=$HQROOT/path/to/output/myVideo.avi"
}

To ensure that the encoding job is executed after all the render jobs have completed, you create a dependency by making the encoding job a parent of the render jobs.

To create the dependency, you use the children job property in the encoding job’s specification. The children property accepts a list of child job specifications.

So the final specification for the encoding job would look like:

{ 
    "name": "Encode Video",
    "shell": "bash",
    "command": "myEncoderApp --input=$HQROOT/path/to/output/frame*.png --output=$HQROOT/path/to/output/myVideo.avi",
    "children": [
        {
            "name": "Render Frame 1",
            "shell": "bash",
            "command": 
            """cd $HQROOT/houdini_distros/hfs;
               source houdini_setup;
               mantra < $HQROOT/path/to/ifds/frame0001.ifd"""
        },
        {
            "name": "Render Frame 2",
            "shell": "bash",
            "command": 
            """cd $HQROOT/houdini_distros/hfs;
               source houdini_setup;
               mantra < $HQROOT/path/to/ifds/frame0002.ifd"""
        },
        {
            "name": "Render Frame 3",
            "shell": "bash",
            "command": 
            """cd $HQROOT/houdini_distros/hfs;
               source houdini_setup;
               mantra < $HQROOT/path/to/ifds/frame0003.ifd"""
        },
        .............................
    ]
}
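Writing one child specification per frame by hand quickly becomes tedious. The sketch below generates the same structure in Python. The function names render_job_spec and encode_job_spec are hypothetical, the paths are copied from the example above, and the four-digit frame padding is an assumption based on the file name frame0001.ifd.

```python
RENDER_COMMAND = """cd $HQROOT/houdini_distros/hfs;
source houdini_setup;
mantra < $HQROOT/path/to/ifds/frame{frame:04d}.ifd"""


def render_job_spec(frame):
    """Build the specification for rendering a single frame."""
    return {
        "name": "Render Frame {}".format(frame),
        "shell": "bash",
        "command": RENDER_COMMAND.format(frame=frame),
    }


def encode_job_spec(num_frames):
    """Build the encoding job with one child render job per frame."""
    return {
        "name": "Encode Video",
        "shell": "bash",
        "command": (
            "myEncoderApp --input=$HQROOT/path/to/output/frame*.png "
            "--output=$HQROOT/path/to/output/myVideo.avi"
        ),
        # The children must all finish before the encoding command runs.
        "children": [render_job_spec(f) for f in range(1, num_frames + 1)],
    }


spec = encode_job_spec(3)
print([child["name"] for child in spec["children"]])
```

The resulting dictionary has the same shape as the hand-written specification, so it can be submitted in the same way.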

Parent Status

The parent job’s status depends on the statuses of its child jobs. When the child jobs complete, if at least one child job has failed, then the parent job’s status is set to failed and the parent job’s commands are not executed. Otherwise, if all the child jobs complete successfully, then the parent job’s status is changed to waiting for machine and the parent job is placed into the scheduling queue where it waits for a client machine.
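The rule above can be modelled in a few lines. This is only an illustration of the described behaviour, not HQueue's scheduler code; the function name is hypothetical, and it assumes every child has already finished with a status of either "succeeded" or "failed".

```python
def parent_status_after_children(child_statuses):
    """Return the parent's next status once every child has finished.

    Any failed child fails the parent, and the parent's commands are
    never run. Otherwise the parent is queued to wait for a client.
    """
    if any(status == "failed" for status in child_statuses):
        return "failed"
    return "waiting for machine"
```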

Submitting Child Jobs from Within The Parent

It is possible to submit child jobs from within a running job. You may want this if the number of child jobs needed is not known until runtime. In such cases, you can add commands to the parent job that calculate how many child jobs are required and then submit the child job specifications to the HQueue server.

To submit child jobs, use the newjob() Python API function (see Python API).

Here is an example of a Python script that creates a new job and assigns it as a child to the currently running job:

import os
try:
    import xmlrpc.client as xmlrpclib  # Python 3
except ImportError:
    import xmlrpclib  # Python 2

# Connect to the HQueue server.
hq_server = xmlrpclib.ServerProxy("http://hq_server_hostname:5000")

# Define the child job.
child_job_spec = { 
    "name": "The Child Job",
    "shell": "bash",
    "command": "echo 'Hello World!'" 
}

# Get the id of the current job.
# It is defined automatically by HQueue in the JOBID environment variable.
current_job_id = os.environ["JOBID"]

# Submit the job to the server.
# newjob() returns a list of job ids (in case multiple jobs are passed in at once).
job_ids = hq_server.newjob(child_job_spec, current_job_id)

If the script is saved into a file, say createChild.py, then it can be added to the command property of the parent job. So the parent job’s specification would look like:

{ 
    "name": "The Parent Job",
    "shell": "bash",
    "command": "python $HQROOT/path/to/scripts/createChild.py"
}
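When the number of children is only known at runtime, the submission can be wrapped in a loop. The sketch below calls newjob() once per child with the same two arguments used in the script above. The names child_spec, submit_children and DryRunServer are hypothetical; DryRunServer is a stand-in that records calls and returns made-up ids so the logic can be tried without a farm.

```python
def child_spec(index):
    """Specification for one child job; the command is a placeholder."""
    return {
        "name": "Child Job {}".format(index),
        "shell": "bash",
        "command": "echo 'Processing chunk {}'".format(index),
    }


def submit_children(server, parent_id, count):
    """Submit `count` child jobs under `parent_id`, one newjob() call each."""
    job_ids = []
    for index in range(1, count + 1):
        # newjob() is described above as returning a list of job ids.
        job_ids.extend(server.newjob(child_spec(index), parent_id))
    return job_ids


class DryRunServer(object):
    """Hypothetical stand-in for the server proxy, for local testing only."""

    def __init__(self):
        self.submitted = []

    def newjob(self, spec, parent_id):
        self.submitted.append((spec, parent_id))
        return [len(self.submitted)]  # made-up ids: 1, 2, 3, ...


server = DryRunServer()
ids = submit_children(server, parent_id="42", count=3)
```

In a real job, the server argument would be the ServerProxy object and parent_id would come from the JOBID environment variable, as in the script above.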

Commandless Jobs

It is possible to have a job without the command property defined since the only required property is the name property.

Commandless jobs do not perform any work, but they can be useful at times. For example, if you write a Python script that submits jobs to HQueue, you can use commandless jobs to test calls to newjob() without burdening the farm with any real work. Also, you can use commandless jobs as “containers” for several independent but related jobs. This has an organizational benefit in the HQueue web interface since the work produced by the related child jobs can be viewed under a single job.
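A container job of this kind is simply a specification with a name and children but no command. The helper name container_job and the job names below are illustrative assumptions.

```python
def container_job(name, child_specs):
    """Group related jobs under a parent that has no command of its own."""
    return {"name": name, "children": list(child_specs)}


shot_jobs = container_job(
    "Shot 010",
    [
        {"name": "Simulate", "shell": "bash", "command": "echo 'simulate'"},
        {"name": "Render", "shell": "bash", "command": "echo 'render'"},
    ],
)
```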

Job Properties

Here is a list of properties that can be added to a job specification.

Each entry gives the property name, its type in parentheses, and a description of its value.

  • children (list/tuple of job specifications) - Job specifications that will be submitted and assigned as child jobs.

  • childrenIds (list/tuple of integers) - Ids for existing jobs that will be assigned as child jobs.

  • command (string) - The set of shell commands to execute on the assigned client machine.

  • conditions (list/tuple of strings) - Conditions that the HQueue scheduler must follow when choosing a machine to assign the job to. For more information, see Job Conditions.

  • cpus (integer) - The minimum number of CPUs that the job will use. The default is 1.

    Note that a job marked with 0 cpus can run on any machine, even a machine running another job. The exception is a machine running a job with the single tag. See Job Tags for more information.

  • description (string) - The job description.

  • emailReasons (string) - A comma-separated list of reasons to send emails to the addresses specified by the emailTo property. If this is empty or not specified, no emails will be sent. Valid reasons are 'abandoned', 'cancelled', 'ejected', 'ejecting', 'failed', 'paused', 'pausing', 'priority changed', 'queued', 'rescheduled', 'resumed', 'resuming', 'runnable', 'running', 'succeeded' and 'waiting'.

    Job email properties only apply if the HQueue Server is configured to send emails. See the error_email_from and smtp_server configuration variables in the Server Configuration help page.

  • emailTo (string) - A comma-separated list of addresses to send emails to based on reasons specified by the emailReasons property.

    Job email properties only apply if the HQueue Server is configured to send emails. See the error_email_from and smtp_server configuration variables in the Server Configuration help page.

  • environment (dictionary) - A dictionary of variables to define in the client’s environment when the job’s command set is executed. The keys and values of the dictionary are the variable names and values respectively.

  • host (string) - The hostname of the machine that the job should execute on. If this property is not set, then the job can execute on any machine.

  • inherit_conditions (boolean) - Whether the job should follow its root job’s conditions. This property is ignored when the job has conditions of its own. The default is True.

  • onCancel (string) - The set of shell commands to execute if the job is canceled while running on a client machine.

  • onError (string) - The set of shell commands to execute when the job fails.

  • onChildError (string) - The set of shell commands to execute if the job has child jobs that failed. The commands are run after all the child jobs finish and after the job executes its command set. Note that if the job already has client machines assigned to it, then the job will hold onto those clients and use them to run the onChildError command.

  • onReschedule (string) - The set of shell commands to execute if the job has been rescheduled. The commands are run before the job executes its command set.

  • onSuccess (string) - The set of shell commands to execute when the job completes successfully.

  • maxHosts (integer) - The maximum number of client machines allowed to be assigned to the job. The default is 1.

  • maxTime (integer) - The maximum amount of time (in seconds) that the job is permitted to run for. If the job’s running time exceeds the maximum time, then it is automatically canceled by HQueue. Note that if this property is not specified or is set to less than zero, then the job has no maximum time and can run indefinitely.

  • minHosts (integer) - The minimum number of client machines that must be assigned to the job. The default is 1.

  • name (string) - The job’s name.

  • priority (integer) - The job’s priority. Jobs with higher priorities are scheduled and processed before jobs with lower priorities. 0 is the lowest priority. The default is 0.

  • shell (string) - The terminal shell to use when executing the job’s command set.

  • submittedBy (string) - The name of the person that submitted the job. For child jobs, if this value is not specified, then it is inherited from a parent job.

  • tags (list/tuple of strings) - A list of tags to apply to the job. Tags can be used to control whether the job requires a dedicated machine or whether it can share a machine with other running jobs. For more information, see Job Tags.

  • triesLeft (integer) - The number of times a job should be automatically rescheduled in an attempt to make it succeed after a failure. If the job still fails after triesLeft retries, then it is marked as failed. The default value is 0.

  • triesDifferentClient (boolean) - Whether the job should be automatically rescheduled on a different client. This property is only relevant when triesLeft is greater than 0. The default is False.

  • resources (dictionary) - A mapping from the names of the HQueue resources used by the job to the amount of each resource used. For example, {"sidefx.license.houdini": 1, "custom_one": 2}. See the Resources help page for more details.

Job Properties Example

Here is an example of a job specification that demonstrates the use of some properties:

{ 
    "name": "The Main Job",
    "shell": "bash",
    "environment": { 
        "SHOW_MSG": "1", 
        "MSG": "Hello World!" 
    }, 
    "command":  
    """if [ $SHOW_MSG = 1 ]; then 
           echo $MSG; 
    fi""",
    "tags": [ "single" ], 
    "maxHosts": 1, 
    "minHosts": 1, 
    "priority": 0, 
    "children": [ 
        { 
            "name": "The Child Job",
            "shell": "bash",
            "command": "echo 'Hello World!'" 
        }
    ] 
}

The example above defines a job named 'The Main Job' which has a priority level of 0. It uses the bash shell to execute its command set and defines two environment variables, SHOW_MSG and MSG. Its command set directly references these two variables. The job requires one dedicated machine as defined by the single tag, and the maxHosts and minHosts properties. Finally, it has a single child job which prints out “Hello World”.
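Before submitting a specification, it can help to check property types locally. The sketch below covers only a subset of the properties listed above and maps each to the Python type suggested by its documented type. The names PROPERTY_TYPES and check_spec are hypothetical, and HQueue's own validation may differ.

```python
# Expected Python types for a subset of the documented properties.
PROPERTY_TYPES = {
    "name": str,
    "command": str,
    "shell": str,
    "description": str,
    "environment": dict,
    "resources": dict,
    "tags": (list, tuple),
    "children": (list, tuple),
    "childrenIds": (list, tuple),
    "cpus": int,
    "priority": int,
    "minHosts": int,
    "maxHosts": int,
    "maxTime": int,
    "triesLeft": int,
}


def check_spec(spec, path="job"):
    """Return a list of problems found in a job specification."""
    if not isinstance(spec, dict):
        return ["{}: specification must be a dictionary".format(path)]
    problems = []
    if "name" not in spec:
        problems.append("{}: missing required property 'name'".format(path))
    for key, expected in PROPERTY_TYPES.items():
        if key in spec and not isinstance(spec[key], expected):
            problems.append("{}: '{}' has the wrong type".format(path, key))
    # Child specifications are checked recursively.
    children = spec.get("children", [])
    if isinstance(children, (list, tuple)):
        for index, child in enumerate(children):
            child_path = "{}.children[{}]".format(path, index)
            problems.extend(check_spec(child, child_path))
    return problems
```

An empty list means no problems were found among the checked properties; properties outside the subset are ignored.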

Job Conditions

Job conditions inform the HQueue scheduler to assign a job to a restricted set of client machines. A job condition is defined by a type, name, operator and value. Together they specify a comparison test that the scheduler uses to determine whether a machine can be assigned to run the job. If a client machine passes ALL of the assigned conditions, then it can run the job.

Note

A job automatically inherits its root job’s conditions in addition to the job’s own conditions. Set the job’s inherit_conditions property to False to prevent the job from inheriting its root job’s conditions.

Below is a description of each of the condition components:

  • type - The type specifies what the condition applies to. Since HQueue only supports client conditions at the moment, the type should always be set to “client”. Client conditions determine whether a client machine can be assigned to the job or not.

  • name - The name of the client property to be tested. The supported names are:

      • hostname - Test against the client’s hostname.

      • hostname:name - Test against the client’s hostname and name. In a farm where multiple clients may be running on the same machine, this should be used to uniquely identify a client amongst clients with the same hostname (i.e. running on the same machine).

      • group - Test against the client’s group memberships.

  • op - The comparison operator to use when testing the client’s attribute against the condition’s value. The supported operators are:

      • == - returns true if the client’s attribute matches the condition value.

      • != - returns true if the client’s attribute does not match the condition value.

      • in - returns true if the client’s attribute matches any element in the condition value. Use commas to separate multiple elements in the value.

      • not_in - returns true if the client’s attribute does not match any element in the condition value. Use commas to separate multiple elements in the value.

      • any - Deprecated. Use “in” instead.

  • value - The value to test against the requested client attribute. If the condition operator is “in” or “not_in” (or the deprecated “any”), then the value can be a list of multiple items where commas are used to separate items.

Job Condition Examples

Here is an example that demonstrates how to attach a condition to a job specification:

{
    "name": "A Job with Conditions",
    "shell": "bash",
    "command": "echo 'I should be running on either machine1 or machine2!'",
    "conditions": [
        { 
            "type" : "client", 
            "name": "hostname", 
            "op": "in", 
            "value": "machine1,machine2"
        },
    ]
}

The example above defines a job which can only be assigned to a client machine named either “machine1” or “machine2”. Note that the conditions property is a list of dictionaries where each dictionary defines a single condition and its 4 components.

The next example demonstrates how to set a condition where the job can only be assigned to client machines that are members of the “Simulation” group:

{
    "name": "A Job for the Simulation Group",
    "shell": "bash",
    "command": "echo 'I should be running on a machine that is a member of the Simulation group!'",
    "conditions": [
        { 
            "type" : "client", 
            "name": "group", 
            "op": "==", 
            "value": "Simulation"
        },
    ]
}
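To make the matching rules concrete, here is a small model of how a condition could be evaluated against a client. It is an illustration only and not the scheduler's actual code. The client dictionary layout (a "hostname" string and a "groups" list) is an assumption, only the hostname and group names are covered, and treating == on group as a membership test is also an assumption.

```python
def client_values(client, name):
    """Return the set of client values that a condition name tests against."""
    if name == "hostname":
        return {client["hostname"]}
    if name == "group":
        return set(client["groups"])
    raise ValueError("unsupported condition name: {}".format(name))


def condition_passes(condition, client):
    """Evaluate a single condition dictionary against a client."""
    values = client_values(client, condition["name"])
    op = condition["op"]
    if op == "==":
        return condition["value"] in values
    if op == "!=":
        return condition["value"] not in values
    # The remaining operators take a comma-separated list of elements.
    wanted = {item.strip() for item in condition["value"].split(",")}
    if op in ("in", "any"):  # "any" is the deprecated spelling of "in"
        return bool(values & wanted)
    if op == "not_in":
        return not (values & wanted)
    raise ValueError("unsupported operator: {}".format(op))


def client_can_run(conditions, client):
    """A client qualifies only if it passes ALL of the job's conditions."""
    return all(condition_passes(c, client) for c in conditions)


client = {"hostname": "machine1", "groups": ["Simulation"]}
conditions = [
    {"type": "client", "name": "hostname", "op": "in",
     "value": "machine1,machine2"},
    {"type": "client", "name": "group", "op": "==", "value": "Simulation"},
]
eligible = client_can_run(conditions, client)
```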

Job Tags

Tags can be used to describe whether a job requires a dedicated machine or whether it can share a machine with other running jobs. If no tags are specified, then the job is configured to share the machine it is running on as long as the machine has enough CPUs.

To declare that a job needs a dedicated machine, add the single tag to the tags property. No other jobs will run on a machine running a job with the single tag.

You can also create custom single tags to control the sets of running jobs that can share machines and the sets that cannot. No two jobs with the same custom single tag will run on the same machine simultaneously.

To create a custom single tag, prefix the tag name with single: (for example, single:1).

For example, suppose you create two jobs, A and B, and assign them a custom single tag named single:1. And suppose you create two other jobs, C and D, and assign them another custom single tag named single:2. Then jobs A and B cannot concurrently run on the same machine and jobs C and D cannot concurrently run on the same machine. However, job A or B can concurrently run on a machine that is running job C or D, and vice versa.
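The sharing rules can be expressed as a short check. This is a sketch of the rules as described in this section, not HQueue's implementation; the function names are hypothetical, and CPU availability is deliberately ignored.

```python
def single_tags(tags):
    """Return the custom single tags (e.g. "single:1") among a job's tags."""
    return {tag for tag in tags if tag.startswith("single:")}


def can_share_machine(tags_a, tags_b):
    """Return True if two jobs may run on the same machine concurrently."""
    # A plain "single" tag demands a dedicated machine.
    if "single" in tags_a or "single" in tags_b:
        return False
    # Jobs with a common custom single tag exclude each other.
    return not (single_tags(tags_a) & single_tags(tags_b))


job_a = ["single:1"]
job_b = ["single:1"]
job_c = ["single:2"]
```

With these tags, jobs A and B exclude each other, while A and C may share a machine.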

Job Statuses

Here is a list of job statuses and their descriptions.

  • abandoned - The job is assigned to a client machine but the machine is not reporting the job’s progress or status. This can happen if the machine becomes unresponsive (e.g. it reboots or hangs).

  • cancelled - The job is no longer on the scheduling queue because it was interrupted by a user.

  • failed - The job is finished but an error was reported during execution of its command set or during execution of one of its child jobs.

  • paused - The job has been paused by a user. The scheduler does not assign a client machine to the job while it is paused. If the job is already running on a machine, then its execution is halted.

  • pausing - The job is running on a client machine but has been requested by a user to halt execution. The HQueue server is waiting for a response from the client to confirm that the job has been paused.

  • resuming - The job is assigned to a client machine and is currently paused but has been requested by a user to resume execution. The HQueue server is waiting for a response from the client to confirm that the job has been resumed.

  • running - The job is executing on a client machine.

  • running (X clients assigned) - One or more of the job’s child jobs is running and a total of X clients are assigned to the child jobs.

  • succeeded - The job is finished and no errors were reported during command execution.

  • waiting for resources - The job is ready for execution but is waiting for a resource. These can be HQueue resources, client machines, or job conditions. If the job does not have a child job, then a tooltip will display the exact reason the job is waiting.

Job Variables

Here is a list of the built-in, runtime variables that are defined in the job’s environment.

  • HQCLIENT - The folder path to the client code on the machine running the job.

  • HQCLIENTARCH - The platform of the client machine running the job. It consists of the operating system and machine architecture. Here is a quick list of the possible values:

      • linux-x86_64 → Linux 64-bit

      • macosx-arm64 → macOS M1 64-bit

      • macosx-x86_64 → macOS Intel 64-bit

      • windows-x86_64 → Windows 64-bit

  • HQHOSTS - The name of the machine(s) running the job.

  • HQROOT - The folder path to the main network drive registered with HQueue. See Network Folders Configuration.

  • HQSERVER - The address of the HQueue server. It consists of the HQueue server’s hostname and the port number that the server is listening on.

  • JOBID - The id of the current job.
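A script run by a job can read these variables from its environment. The sketch below collects whichever of them are set and resolves a path under HQROOT. The helper names job_variables and output_path are hypothetical, and the optional environ argument exists only so the functions can be tried outside a running job.

```python
import os

JOB_VARIABLES = (
    "HQCLIENT", "HQCLIENTARCH", "HQHOSTS", "HQROOT", "HQSERVER", "JOBID",
)


def job_variables(environ=None):
    """Collect the HQueue runtime variables that are set in `environ`."""
    if environ is None:
        environ = os.environ
    return {name: environ[name] for name in JOB_VARIABLES if name in environ}


def output_path(relative_path, environ=None):
    """Resolve a path relative to the shared HQROOT folder."""
    variables = job_variables(environ)
    # Raises KeyError if HQROOT is not defined, e.g. outside a job.
    return os.path.join(variables["HQROOT"], relative_path)
```

Outside a running job, job_variables() typically returns an empty dictionary because none of the variables are defined.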
