On this page |
The Job Specification ¶
Every HQueue job is defined by a specification – a JSON structure containing the job’s properties. Here is a simple example:
{ "name": "Print Hello World", }
The specification defines a job with exactly one property – the job name. This is a valid specification because only the name
property is required. The job does not do anything useful. It will immediately finish and succeed without performing any work assigned to a client machine.
To execute tasks on the client, a set of command can be added to the specification using the command
property. For example:
{ "name": "Print Hello World", "command" : "echo 'Hello World!'", }
When the job is assigned to a client machine, it outputs “Hello World!” using the default shell installed with the machine’s operating system.
To execute the command with a particular shell, add the shell
property to the specification like so:
{ "name": "Print Hello World", "command" : "echo 'Hello World!'", "shell" : "bash", }
Several commands can be stored in a job and executed in sequence by combining them as a single command. For example:
{ "name": "Multiple Print Commands", "command" : "echo 'The First Command' && echo 'The Second Command'", }
How commands are combined depend on the shell but separating commands with &&
or ending commands with ;
will work in most cases.
For complex commands, it is easier to store the commands in a script file and then point the command
property to the script file. For example:
{ "name": "Running a Script", "command" : "/path/to/myScript.sh", }
For a complete list of the job properties that can be defined in the specification, see Job Properties.
Status Changes ¶
An HQueue job passes through a sequence of status changes from when it is submitted to HQueue to when it completes. A basic job like the example previously mentioned undergoes this sequence:
When a job is submitted to HQueue, it is placed into the scheduling queue where it waits for a client machine. Once a machine becomes available and is assigned, then the machine runs the job’s commands. When execution of the job’s commands finishes without errors, the job is marked as succeeded.
For a complete list of job statuses, see Job Statuses.
Parent-Child Relationships ¶
A dependency can be set between two jobs so that the first job cannot run until the second job completes. This dependency is called a parent-child relationship in HQueue. For example, if job A depends on job B, then job A is a parent of job B and job B is a child of job A.
A Simple Parent-Child Example ¶
If you want to create an AVI video from X frames rendered from a Houdini scene. Assuming that IFD files have already been generated for the frames, then you can perform it by doing the following:
-
Render the frames from the IFD files using Mantra.
-
Encode the rendered images into a video.
For the first step, you can define X HQueue jobs, one job for every frame to be rendered. The job specification for rendering frame 1 looks like:
{ "name": "Render Frame 1", "command": """cd $HQROOT/houdini_distros/hfs; source houdini_setup; mantra < $HQROOT/path/to/ifds/frame0001.ifd""" "shell": "bash", }
The job specifications for the rest of the frames is similar.
Now suppose that Mantra renders the images to $HQROOT/path/to/output/frame*.png
where $HQROOT
is the mount point to the main network folder registered with HQueue, then the job specification for the encoding step would look like:
{ "name": "Encode Video", "shell": "bash", "command": "myEncoderApp --input=$HQROOT/path/to/output/frame*.png --output=$HQROOT/path/to/output/myVideo.avi" }
To ensure that the encoding job is executed after all the render jobs have completed, you create a dependency by making the encoding job a parent of the render jobs.
To create the dependency, you use the children
job property in the encoding job’s specification. The children
property accepts a list of child job specifications.
So the final specification for the encoding job would look like:
{ "name": "Encode Video", "shell": "bash", "command": "myEncoderApp --input=$HQROOT/path/to/output/frame*.png --output=$HQROOT/path/to/output/myVideo.avi", "children": [ { "name": "Render Frame 1", "shell": "bash", "command": """cd $HQROOT/houdini_distros/hfs; source houdini_setup; mantra < $HQROOT/path/to/ifds/frame0001.ifd""" }, "name": "Render Frame 2", "shell": "bash", "command": """cd $HQROOT/houdini_distros/hfs; source houdini_setup; mantra < $HQROOT/path/to/ifds/frame0002.ifd""" }, "name": "Render Frame 3", "shell": "bash", "command": """cd $HQROOT/houdini_distros/hfs; source houdini_setup; mantra < $HQROOT/path/to/ifds/frame0003.ifd""" }, ............................. ] }
Parent Status ¶
The parent job’s status depends on the statuses of its child jobs. When the child jobs complete, if at least one child job has failed, then the parent job’s status is set to failed and the parent job’s commands are not executed. Otherwise, if all the child jobs complete successfully, then the parent job’s status is changed to waiting for machine and the parent job is placed into the scheduling queue where it waits for a client machine.
Submitting Child Jobs from Within The Parent ¶
It is possible to submit child jobs from within a running job. You may want this if the number of child jobs needed is not known until runtime. In such cases, commands can be added to the parent job that calculate how many child jobs are required and then submits child job specifications to the HQueue server.
To submit child jobs, use the newjob()
Python API function (see Python API).
Here is an example of a Python script that creates a new job and assigns it as a child to the currently running job:
import os import xmlrpclib # Connect to the HQueue server. hq_server = xmlrpclib.ServerProxy("http://hq_server_hostname:5000") # Define the child job. child_job_spec = { "name": "The Child Job", "shell": "bash", "command": "echo 'Hello World!'" } # Get the id of the current job. # It is defined automatically by HQueue in the JOBID environment variable. current_job_id = os.environ["JOBID"] # Submit the job to the server. # newjob() returns a list of job ids (in case multiple jobs are passed in at once). job_ids = hq_server.newjob(child_job_spec, current_job_id)
If the script is saved into a file, say createChild.py, then it can be added to the command
property of the parent job. So the parent job’s specification would look like:
{ "name": "The Parent Job", "shell": "bash", "command": "python $HQROOT/path/to/scripts/createChild.py" }
Commandless Jobs ¶
It is possible to have a job without the command
property defined since the only required property is the name
property.
Commandless jobs do not perform any work, but they can be useful at times. For example, if you write a Python script that submits jobs to HQueue, you can use commandless jobs to test calls to newjob()
without burdening the farm with any real work. Also, you can use commandless jobs as “containers” for several independent but related jobs. This has an organizational benefit in the HQueue web interface since the work produced by the related child jobs can be viewed under a single job.
Job Properties ¶
Here is a list of properties that can be added to a job specification.
Property Name |
Property Type |
Property Value |
---|---|---|
children |
list/tuple of job specifications |
Job specifications that will be submitted and assigned as child jobs. |
childrenIds |
list/tuple of integers |
Ids for existing jobs that will be assigned as child jobs. |
command |
string |
The set of shell commands to execute on the assigned client machine. |
conditions |
list/tuple of strings |
Conditions that the HQueue scheduler must follow when choosing a machine to assign the job to. For more information, please read Job Conditions. |
cpus |
integer |
The minimum number of CPUs that the job will use. The default is 1. Note that a job marked with 0 cpus can run on any machine, even a machine running another job. The exception is a machine running a job with the |
description |
string |
The job description. |
emailReasons |
string |
A comma separated list of reasons to send emails to the addresses specified by the Job email properties only apply if the HQueue Server is configured to send emails. See the |
emailTo |
string |
A comma separated list of addresses to send emails to based on reasons specified by the Job email properties only apply if the HQueue Server is configured to send emails. See the |
environment |
dictionary |
A dictionary of variables to define in the client’s environment when the job’s command set is executed. The keys and values of the dictionary are the variable names and values respectively. |
host |
string |
The hostname of the machine that the job should execute on. If this property is not set, then the job can execute on any machine. |
inherit_conditions |
boolean |
Whether the job should follow its root job’s conditions. This property is ignored when the job has conditions of its own. The default is True. |
onCancel |
string |
The set of shell commands to execute if the job is canceled while running on a client machine. |
onError |
string |
The set of shell commands to execute when the job fails. |
onChildError |
string |
The set of shell commands to execute if the job has child jobs that failed. The commands are run after all the child jobs finish and after the job executes its command set. Note that if the job already has client machines assigned to it, then the job will hold onto those clients and use them to run the onChildError command. |
onReschedule |
string |
The set of shell commands to execute if the job has been rescheduled. The commands are run before the job executes its command set. |
onSuccess |
string |
The set of shell commands to execute when the job completes successfully. |
maxHosts |
integer |
The maximum number of client machines allowed to be assigned to the job. The default is 1. |
maxTime |
integer |
The maximum amount of time (in seconds) that the job is permitted to run for. If the job’s running time exceeds the maximum time then it is automatically canceled by HQueue. Note that if this property is not specified or set to less than zero then the job has no maximum time and can run indefinitely. |
minHosts |
integer |
The minimum number of client machines that must be assigned to the job. The default is 1. |
name |
string |
The job’s name. |
priority |
integer |
The job’s priority. Jobs with higher priorities are scheduled and processed before jobs with lower priorities. 0 is the lowest priority. The default is 0. |
shell |
string |
The terminal shell to use when executing the job’s command set. |
submittedBy |
string |
The name of the person that submitted the job. For child jobs, if this value is not specified, then it is inherited from a parent job. |
tags |
list/tuple of strings |
A list of tags to apply to the job. Tags can be used to control whether the job requires a dedicated machine or whether it can share a machine with other running jobs. For more information, see Job Tags. |
triesLeft |
integer |
The number of times a job should be automatically rescheduled in an attempt to make it succeed after a failure. If the job fails after the amount of |
triesDifferentClient |
boolean |
Whether or not should the task be automatically rescheduled on different clients. This property is useful when |
resources |
dictionary |
A name value pairing of the HQueue resources used by the job and the
amount of each resources used. For e.g., |
Job Properties Example ¶
Here is an example of a job specification that demonstrates the use of some properties:
{ "name": "The Main Job", "shell": "bash", "environment": { "SHOW_MSG": "1", "MSG": "Hello World!" }, "command": """if [ $SHOW_MSG = 1 ]; then echo $MSG; fi""", "tags": [ "single" ], "maxHosts": 1, "minHosts": 1, "priority": 0, "children": [ { "name": "The Child Job", "shell": "bash", "command": "echo 'Hello World!'" } ] }
The example above defines a job named 'The Main Job' which has a priority level of 0. It uses the bash shell to execute its command set and defines two environment variables, SHOW_MSG
and MSG
. Its command set directly references these two variables. The job requires one dedicated machine as defined by the single
tag, and the maxHosts
and minHosts
properties. Finally, it has a single child job which prints out “Hello World”.
Job Conditions ¶
Job conditions inform the HQueue scheduler to assign a job to a restricted set of client machines. A job condition is defined by a type, name, operator and value. Together they specify a comparison test that the scheduler uses to determine whether a machine can be assigned to run the job. If a client machine passes ALL of the assigned conditions, then it can run the job.
Note
A job automatically inherits its root job’s conditions in addition to the job’s own conditions. Set the job’s inherit_conditions
property to False
to prevent the job from inheriting its root job’s conditions.
Below is a description of each of the condition components:
Component |
Description |
---|---|
type |
The type specifies what the condition applies to. Since HQueue only supports client conditions at the moment, the type should always be set to “client”. Client conditions determine whether a client machine can be assigned to the job or not. |
name |
The name of the client property to be tested. The supported names are:
|
op |
The comparison operator to use when testing the client’s attribute against the condition’s value. The supported operators are:
|
value |
The value to test against the requested client attribute. If the condition operator is “any”, then the value can be a list of multiple items where commas are used to separate items. |
Job Condition Examples ¶
Here is an example that demonstrates how to attach a condition to a job specification:
{ "name": "A Job with Conditions", "shell": "bash", "command": "echo 'I should be running on either machine1 or machine2!'", "conditions": [ { "type" : "client", "name": "hostname", "op": "in", "value": "machine1,machine2" }, ] }
The example above defines a job which can only be assigned to a client machine named either “machine1” or “machine2”. Note that the conditions
property is a list of dictionaries where each dictionary defines a single condition and its 4 components.
The next example demonstrates how to set a condition where the job can only be assigned to client machines that are members of the “Simulation” group:
{ "name": "A Job for the Simulation Group", "shell": "bash", "command": "echo 'I should be running on a machine that is a member of the Simulation group!'", "conditions": [ { "type" : "client", "name": "group", "op": "==", "value": "Simulation" }, ] }
Job Tags ¶
Job Statuses ¶
Here is a list of job statuses and their descriptions.
Status |
Description |
---|---|
abandoned |
The job is assigned to a client machine but the machine is not reporting the job’s progress or status. This can happen if the machine becomes unresponsive (i.e. reboots, or hangs). |
cancelled |
The job is no longer on the scheduling queue because it was interrupted by a user. |
failed |
The job is finished but an error was reported during execution of its command set or during execution of one of its child jobs. |
paused |
The job has been paused by a user. The scheduler does not assign a client machine to the job while it is paused. If the job is already running on a machine, then its execution is halted. |
pausing |
The job is running on a client machine but has been requested by a user to halt execution. The HQueue server is waiting for a response from the client to confirm that the job has been paused. |
resuming |
The job is assigned to a client machine and is currently paused but has been requested by a user to resume execution. The HQueue server is waiting for a response from the client to confirm that the job has been resumed. |
running |
The job is executing on a client machine. |
running (X clients assigned) |
One or more of the job’s child jobs is running and a total of X clients are assigned to the child jobs. |
succeeded |
The job is finished and no errors were reported during command execution. |
waiting for resources |
The job is ready for execution but is waiting for a resource. These can be HQueue resources, client machines, or job conditions. If the job does not have a child job then a tooltip will display the exact reason the job is waiting. |
Job Variables ¶
Here is a list of the built-in, runtime variables that are defined in the job’s environment.
Environment Variable |
Description |
---|---|
HQCLIENT |
The folder path to the client code on the machine running the job. |
HQCLIENTARCH |
The platform of the client machine running the job. It consists of the operating system and machine architecture. Here is a quick list of the possible values:
|
HQHOSTS |
The name of the machine(s) running the job. |
HQROOT |
The folder path to the main network drive registered with HQueue. |
HQSERVER |
The address of the HQueue server. It consists of the HQueue server’s hostname and the port number that the server is listening on. |
JOBID |
The id of the current job. |