HQueue Scheduler TOP node

Schedules work items using HQueue.

Since	17.5

This node schedules work items using HQueue in order to execute them on remote machines.

For more information on configuring HQueue, see Getting Started with HQueue or PDG For Design Work Pt. 3 - Setting Up Distributed PDG.

Cook Modes ¶

This scheduler can operate in two different cook modes:

The normal cook mode connects to your HQueue scheduler and creates jobs for work items as they become ready to execute. The jobs then report status changes back to the submitting machine, so the submitting Houdini session must remain open for the duration of the cook.

This mode is used whenever you select Cook from any of the menus or buttons in the TOP UI.

The standalone job mode cooks the entire TOP network as a standalone job. In this mode, the submitting Houdini session is detached from the cooking of the TOP network, the .hip file is copied if necessary, and a hython process executes the TOP network using the default scheduler for that topnet. Your current Houdini session does not receive any updates, so to check the progress of the job you need to use the HQueue web portal.

This mode is used whenever you click the Submit Graph As Job > Submit button in the HQueue Scheduler’s parameters.

Network Requirements ¶

As part of the cook, a message queue (MQ) job is submitted. This job is used to communicate information from executing jobs back to the submitting machine. For this reason, your farm machines must be able to resolve the hostnames of other farm machines.

Tip

This can be as simple as editing the /etc/hosts file (Linux / macOS) or the C:\Windows\System32\Drivers\etc\hosts file (Windows).
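One way to confirm name resolution before a cook is to try resolving each farm hostname from the machine in question. The following is a minimal sketch using Python's standard socket module. The function name and the hostnames are hypothetical, and it only checks resolution on the machine where it runs.

```python
import socket


def unresolved_hosts(hostnames, resolve=socket.gethostbyname):
    """Return the hostnames that fail to resolve on this machine."""
    failed = []
    for name in hostnames:
        try:
            resolve(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            failed.append(name)
    return failed


# Example call with hypothetical farm hostnames:
#   unresolved_hosts(["farm01", "farm02"])
```

An empty result means every listed name resolved. Any names returned are candidates for a hosts file entry.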

In addition, there must be no firewalls between farm machines. If there are, use the Task Callback Port parameter to specify the open port to use.

When the cook starts, the submitting machine connects to the farm machine that is running the MQ job. For this reason, there must also be no firewalls between the farm machines and the submitting machine. If there are, use the Relay Port parameter to specify the open port to use.

Auto Connect

When on, the scheduler will try to send a command to create a remote visualizer when the job starts. If successful, then a remote graph is created and is automatically connected to the server executing the job. The client submitting the job must be visible to the server running the job or the connection will fail.

This parameter is only available when Enable Server is on.

When Finished

Determines what to do when the TOP Cook finishes. This allows the TOP Cook job to continue running after the graph cook completes so that it can be inspected by a wrangler using a Data Layer viewer. For example, with When Finished you can retry a failed work item without restarting its whole job.

Terminate

Exit the job as normal.

Keep Open If Error

Keep the job running only if there is an error detected. You will need to kill the job manually.

Keep Open

Keep the job running. You will need to kill the job manually.

Block on Failed Work Items

When on, if there are any failed work items on the scheduler, then the cook is blocked from completing and the PDG graph cook is prevented from ending. This allows you to manually retry your failed work items. You can cancel the scheduler’s cook when it is blocked by failed work items by pressing the ESC key, clicking the Cancels the current cook button in the TOP tasks bar, or by using the cancel API method.

Auto retry downstream tasks

When on, if a parent task is retried manually, then its child tasks will also be retried. This parameter is only available when Block on Failed Work Items is turned on.

Load Item Data From

Determines how jobs processed by this scheduler should load work item attributes and data.

Temporary JSON File

The scheduler writes out a .json file for each work item to the PDG temporary file directory. This option is selected by default.

RPC Message

The scheduler’s running work items request attributes and data over RPC. If the scheduler is a farm scheduler, then the job scripts running on the farm will also request item data from the submitter when creating their out-of-process work item objects.

This parameter option removes the need to write data files to disk and is useful when your local and remote machines do not share a file system.

Ignore RPC Errors

Determines whether RPC errors should cause out of process jobs to fail.

Never

RPC connection errors will cause work items to fail.

When Cooking Batches

RPC connection errors are ignored for batch work items, which typically make a per-frame RPC back to PDG to report output files and communicate sub item status. This option prevents long-running simulations from being killed on the farm, if the submitter Houdini session crashes or becomes unresponsive.

Always

RPC connection errors will never cause a work item to fail. Note that if a work item can’t communicate with the scheduler, it will be unable to report output files, attributes or its cook status back to the PDG graph.

Max RPC Errors

The maximum number of RPC failures that can occur before RPC is disabled in an out of process job.

Connection Timeout

The number of seconds to wait when an out-of-process job makes an RPC connection to the main PDG graph, before assuming the connection failed.

Connection Retries

The number of times to retry a failed RPC call made by an out of process job.

Retry Backoff

When Connection Retries is greater than 0, this parameter determines how much time should be spent between consecutive retries.

Batch Poll Rate

Determines how quickly an out-of-process batch work item should poll the main Houdini session for dependency status updates, if the batch is configured to cook when its first frame of work is ready. This has no impact on other types of batch work items.

Release Job Slot When Polling

Determines whether or not the scheduler should decrement the number of active workers when a batch is polling for dependency updates.

Windows

Windows Services cannot use network-mounted drives. Since HQueue jobs on Windows are executed by a Windows Service, you should only use UNC paths. For example, use //myserver/hq/project/myhip.hip instead of H:/project/myhip.hip. Also be careful with backslashes in paths, as they are interpreted as escape sequences when evaluated by Houdini or the command shell.

Tip

On the HQueue Scheduler Node, press the Load Path Map button in the Path Mapping section to automatically load the necessary path maps.
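To illustrate the UNC recommendation above, here is a small sketch that rewrites a mapped-drive path as a UNC path with forward slashes. The function name and the drive-to-share mapping are assumptions for illustration. The mapping of H: to //myserver/hq is inferred from the example paths above, and this is not how Houdini's own path mapping is implemented.

```python
def to_unc(path, drive_map):
    """Rewrite a mapped-drive path as a UNC path using forward slashes.

    drive_map is a hypothetical dict such as {"H:": "//myserver/hq"}.
    Paths that do not start with a mapped drive are returned unchanged
    apart from backslashes being converted to forward slashes.
    """
    normalized = path.replace("\\", "/")
    for drive, unc_root in drive_map.items():
        prefix = drive.rstrip("/").lower() + "/"
        if normalized.lower().startswith(prefix):
            return unc_root.rstrip("/") + "/" + normalized[len(prefix):]
    return normalized


print(to_unc("H:/project/myhip.hip", {"H:": "//myserver/hq"}))
```

Converting backslashes up front also sidesteps the escape-sequence problem mentioned above.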

TOP Attributes ¶

hqueue_jobid

integer

When the scheduler submits a work item to HQueue, it adds this attribute to the work item in order to track the HQueue job ID.

Parameters ¶

Scheduler ¶

These are the global parameters that configure the behavior of the connection and file paths for HQueue.

HQueue

HQueue Server

URL of the HQueue server. For example, http://localhost:5000.
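Before cooking, it can help to confirm that the URL is reachable. The sketch below assumes the HQueue server exposes its XML-RPC API at this URL and that the API provides a ping method, as described in the HQueue Python API documentation. The helper name is hypothetical.

```python
import xmlrpc.client


def hqueue_is_up(url):
    """Return True if an XML-RPC ping to the given HQueue URL succeeds."""
    server = xmlrpc.client.ServerProxy(url)
    try:
        # Assumption: the server implements ping() per the HQueue Python API.
        server.ping()
    except (OSError, xmlrpc.client.Error):
        # Connection refused, timeout, DNS failure, or an XML-RPC level error.
        return False
    return True


# Example call: hqueue_is_up("http://localhost:5000")
```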

Job Name

The name of the top-level HQueue Job for submitted cooks.

Job Description

The description of the top-level HQueue job. This can be seen in the Job Properties for the job.

Limit Jobs

When enabled, sets the maximum number of jobs that can be submitted by the scheduler at the same time.

For farm schedulers like Tractor or HQueue, this parameter can be used to limit the total number of jobs submitted to the render farm itself. Setting this parameter can help limit the load on the render farm, especially when the PDG graph has a large number of small tasks.

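Block on Failed Work Items

When on, if there are any failed work items on the scheduler, then the cook is blocked from completing and the PDG graph cook is prevented from ending. This allows you to manually retry your failed work items. You can cancel the scheduler’s cook when it is blocked by failed work items by pressing the ESC key, clicking the Cancels the current cook button in the TOP tasks bar, or by using the cancel API method.

Auto retry downstream tasks

When on, if a parent task is retried manually, then its child tasks will also be retried. This parameter is only available when Block on Failed Work Items is turned on.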

Advanced

Verbose Logging

Turn on printing output to console. Can be useful for debugging problems.

Tick Period

Sets the minimum time (in seconds) between calls to the onTick callback.

Max Items Per Tick

Sets the maximum number of ready item onSchedule callbacks between ticks.

Paths

Working Directory

Specifies the directory where the cook generates intermediate files and output. The intermediate files are placed in a subdirectory named pdgtemp.

If you are opening your .hip file in Houdini from the shared network path (for example, from H:/myproj/myhip.hip), you can use $HIP here (the default). However, if you are opening your .hip file from a local directory (for example, from C:/temp/myhip.hip), it has to be copied to a shared network location before farm machines can access it. In this case, the Working Directory should be an absolute or relative path to that shared network location (for example, //MYPC/Shared/myproj). The .hip file is copied automatically in that case, but note that for cross-platform compatibility you need to add a Path Map from your local $HIP path to the farm Working Directory (for example, c:/temp → /mnt/hq/pyproj).

Load Item Data From

Determines how jobs processed by this scheduler should load work item attributes and data.

Temporary JSON File

The scheduler writes out a .json file for each work item to the PDG temporary file directory. This option is selected by default.

RPC Message

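The scheduler’s running work items request attributes and data over RPC. If the scheduler is a farm scheduler, then the job scripts running on the farm will also request item data from the submitter when creating their out-of-process work item objects.

This parameter option removes the need to write data files to disk and is useful when your local and remote machines do not share a file system.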

Delete Temp Dir

Determines when PDG should automatically delete the temporary file directory associated with the scheduler.

Never

PDG never automatically deletes the temp file directory.

When Scheduler is Deleted

PDG automatically deletes the temp file directory when the scheduler is deleted or when Houdini is closed.

When Cook Completes

PDG automatically deletes the temp file directory each time a cook completes.

Compress Work Item Data

When on, PDG compresses the work item .json files when writing them to disk.

This parameter is only available when Load Item Data From is set to Temporary JSON File.

Validate Outputs When Recooking

When on, PDG validates the output files of the scheduler’s cooked work items when the graph is recooked to see if the files still exist on disk. Work items that are missing output files are then automatically dirtied and cooked again. If any work items are dirtied by parameter changes, then their cache files are also automatically invalidated. Validate Outputs When Recooking is on by default.

Check Expected Outputs on Disk

When on, PDG looks for any unexpected outputs (for example, outputs produced by custom output handling logic) that were not explicitly reported when the scheduler’s work items finished cooking. This check occurs immediately after the scheduler marks work items as cooked. Expected outputs that were reported normally are not checked. If PDG finds any files that differ from the expected outputs, then they are automatically added as real output files.

Path Mapping

Global

If the PDG Path Map exists, then it is applied to file paths.

None

Delocalizes paths using the PDG_DIR token.

Path Map Zone

When on, specifies a custom mapping zone to apply to all jobs executed by this scheduler. Otherwise, the zone is determined by the local platform: LINUX, MAC, or WIN.

Load Path Map

Opens the PDG Path Map Panel and populates it with path mappings based on the configuration of your HQueue Server for the default LINUX, MAC, and WIN zones.

Override Local Shared Root

When on, the location of the local shared root directory is overridden by the Local Shared Root Paths parameters.

Local Shared Root

The HQueue farm should be configured with a shared network filesystem and the mount point of this shared file system is specified for each platform.

These parameters are only available when Override Local Shared Root is on.

Load from HQueue

Queries the HQueue server to retrieve the local shared root paths for each platform and fills the parameters below.

Windows

The local shared root path on Windows machines. For example, I:/.

macOS

The local shared root path on macOS machines. For example, /Volumes/hq.

Linux

The local shared root path on Linux machines. For example, /mnt/hq.

HFS

Universal HFS

When on, a single path to the $HFS directory (the Houdini install directory) is used by all platforms. You can use $HQROOT and $HQCLIENTARCH to help specify the directory path.

Linux HFS Path

$HFS path for Linux.

This parameter is only available when Universal HFS is off.

macOS HFS Path

$HFS path for macOS.

Windows HFS Path

$HFS path for Windows.

This parameter is only available when Universal HFS is off.

Python

Determines which Python interpreter is used for your Python jobs. You can also specify this Python in a command using the PDG_PYTHON token.

From HFS

Use the Python interpreter that is installed with Houdini.

From HQClient

Use the same Python interpreter that HQClient is using on the farm machine.

Custom

Use the executable path specified by the Python Executable parameter.

Python Executable

This parameter is only available when Python is set to Custom.

The full path to the Python executable to use for your Python jobs.

Hython

Determines which Houdini Python interpreter (hython) is used for your Houdini jobs. You can also specify this hython in a command using the PDG_HYTHON token.

Default

Use the default hython interpreter that is installed with Houdini.

Custom

Use the executable path specified by the Hython Executable parameter.

Hython Executable

This parameter is only available when Hython is set to Custom.

The full path to the hython executable to use for your Houdini jobs.

Submit As Job ¶

Submit

Cooks the entire TOP network as a standalone job. Displays the status URI for the submitted job. The submitting Houdini session is detached from the cooking of the TOP network. The .hip file is copied if necessary and a hython process executes the TOP network normally using the default scheduler for that topnet.

Tip

You can restart a finished standalone job using the HQueue Web UI. However, you should restart the child job named TOP Cook instead of the parent job.

Job Name

Specifies the name of the submitted job.

Job Verbosity

Specifies the verbosity level of the standalone job.

Output Node

When on, specifies the path to the node to cook. If a node is not specified, the display node of the network that contains the Scheduler is cooked instead.

Save Task Graph File

When on, the submitted job will save a task graph .py file once the cook completes.

Job Parms

Assign To

Which clients to assign priority to.

Any Client

Assign to any client.

Listed Clients

Assign to specified clients.

Clients from Listed Groups

Assign to specified client groups.

Clients

Names of clients to assign jobs to separated by spaces.

This parameter is only available when Assign To is set to Listed Clients.

Client Groups

Names of client groups to assign jobs to separated by spaces.

This parameter is only available when Assign To is set to Clients from Listed Groups.

CPUs per Job

The maximum number of CPUs that will be consumed by the job. If the number exceeds a client machine’s number of free CPUs, then the client machine will not be assigned the job.

Data Layer Server

Enable Server

When on, turns on the data layer server for the TOP job that will cook on the farm. This allows PilotPDG or other WebSocket clients to connect to the cooking job remotely to view the state of PDG.

Server Port

Determines which server port to use for the data layer server.

This parameter is only available when Enable Server is on.

Automatic

A free TCP port to use for the data layer server chosen by the node.

Custom

A custom TCP port to use for the data layer server specified by the user.

This is useful when there is a firewall between the farm machine and the monitoring machine.

Auto Connect

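When on, the scheduler will try to send a command to create a remote visualizer when the job starts. If successful, then a remote graph is created and is automatically connected to the server executing the job. The client submitting the job must be visible to the server running the job or the connection will fail.

This parameter is only available when Enable Server is on.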

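When Finished

Determines what to do when the TOP Cook finishes. This allows the TOP Cook job to continue running after the graph cook completes so that it can be inspected by a wrangler using a Data Layer viewer. For example, with When Finished you can retry a failed work item without restarting its whole job.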

Terminate

Exit the job as normal.

Keep Open If Error

Keep the job running only if there is an error detected. You will need to kill the job manually.

Keep Open

Keep the job running. You will need to kill the job manually.

Message Queue ¶

The Message Queue (MQ) server is required to get work item results from the jobs running on the farm. Several types of MQ are provided to work around networking issues such as firewalls.

Type

The type of Message Queue (MQ) server to use.

Local

Starts or shares the MQ server on your local machine.

If another HQueue scheduler node (in the current Houdini session) already started a MQ server locally, then this scheduler node uses that MQ server automatically.

If there are not any firewalls between your local machine and the farm machines, then we recommend you use this setting.

Farm

Starts or shares the MQ server on the farm as a separate job.

If there are firewalls between your local machine and the farm machines, then we recommend you use this setting.

Connect

Connects to an already running MQ server.

The MQ server must already have been started manually. Use this option to manage the MQ yourself, for example to run it as a centralized service on a single machine that serves all PDG jobs using this setting.

Task Callback Port

Sets the TCP Port used by the Message Queue Server for the XMLRPC callback API. This port must be accessible between farm clients.

Relay Port

Sets the TCP Port used by the Message Queue Server connection between PDG and the client that is running the Message Queue Command. This port must be reachable on farm clients by the PDG/user machine.

Address

IP address of the machine running the persistent MQ server.

This parameter is only available when Type is set to Connect.
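When firewalls are involved, a quick TCP connection test can show whether a chosen Task Callback Port or Relay Port is actually reachable from a given machine. This is a generic sketch using Python's standard socket module. The function name is hypothetical, and a successful connection only shows that something is listening on that port, not that it is the MQ server.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unreachable, or unresolved host
        return False


# Example call with a hypothetical farm host and port:
#   port_is_open("farm01", 5001)
```

Run the check from the machine that needs to make the connection: from a farm client for the Task Callback Port, and from the submitting machine for the Relay Port.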

RPC Server ¶

Parameters for configuring the behavior of RPC connections from out of process jobs back to a scheduler instance.

Ignore RPC Errors

Determines whether RPC errors should cause out of process jobs to fail.

Never

RPC connection errors will cause work items to fail.

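When Cooking Batches

RPC connection errors are ignored for batch work items, which typically make a per-frame RPC back to PDG to report output files and communicate sub item status. This option prevents long-running simulations from being killed on the farm if the submitter Houdini session crashes or becomes unresponsive.

Always

RPC connection errors will never cause a work item to fail. Note that if a work item can’t communicate with the scheduler, it will be unable to report output files, attributes or its cook status back to the PDG graph.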

Max RPC Errors

The maximum number of RPC failures that can occur before RPC is disabled in an out of process job.

Connection Timeout

The number of seconds to wait when an out-of-process job makes an RPC connection to the main PDG graph, before assuming the connection failed.

Connection Retries

The number of times to retry a failed RPC call made by an out of process job.

Retry Backoff

When Connection Retries is greater than 0, this parameter determines how much time should be spent between consecutive retries.

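Batch Poll Rate

Determines how quickly an out-of-process batch work item should poll the main Houdini session for dependency status updates, if the batch is configured to cook when its first frame of work is ready. This has no impact on other types of batch work items.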

Release Job Slot When Polling

Determines whether or not the scheduler should decrement the number of active workers when a batch is polling for dependency updates.
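The interaction between Connection Retries and Retry Backoff can be pictured as a simple retry loop. The sketch below is illustrative only and is not PDG's implementation. It assumes a fixed delay between attempts and uses Python's built-in ConnectionError to stand in for an RPC failure. The function and argument names are hypothetical.

```python
import time


def call_with_retries(func, retries, backoff, sleep=time.sleep):
    """Call func(), retrying up to `retries` times after a ConnectionError.

    `backoff` is the number of seconds to wait between consecutive
    attempts (assumed constant here). The last error is re-raised once
    the retries are used up.
    """
    attempt = 0
    while True:
        try:
            return func()
        except ConnectionError:
            if attempt >= retries:
                raise
            attempt += 1
            sleep(backoff)
```

With retries set to 0 the call is attempted exactly once, which matches the parameter description: the backoff only matters when Connection Retries is greater than 0.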

Job Parms ¶

These job-specific parameters affect all submitted jobs, but can be overridden on a node-by-node basis. For more information, see Scheduler Job Parms / Properties.

Note

Many of these parameters correspond directly to the HQueue Job Properties.

Scheduling

Job Priority

The job’s HQueue priority.

Jobs with higher priorities are scheduled and processed before jobs with lower priorities. 0 is the lowest priority.

Assign To

Which clients to assign priority to.

Any Client

Assign to any client.

Listed Clients

Assign to specified clients.

Clients from Listed Groups

Assign to specified client groups.

Clients

Names of clients to assign jobs to separated by spaces.

This parameter is only available when Assign To is set to Listed Clients.

Select Clients

Selects clients from HQueue to populate the Clients list.

This parameter is only available when Assign To is set to Listed Clients.

Client Groups

Names of client groups to assign jobs to separated by spaces.

This parameter is only available when Assign To is set to Clients from Listed Groups.

Select Groups

Selects client groups from HQueue to populate the Client Groups list.

This parameter is only available when Assign To is set to Clients from Listed Groups.

CPUs per Job

The maximum number of CPUs that will be consumed by the job. If the number exceeds a client machine’s number of free CPUs, then the client machine will not be assigned the job.

Note that you can control the multi-threading of some jobs with Houdini Max Threads. You can also use the Tags parm to control if this job needs a dedicated machine.
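The assignment rule above amounts to a filter on free CPUs. As a rough illustration (not HQueue's actual scheduling code), the following sketch lists which clients could accept a job. The client names and the dictionary of free CPU counts are hypothetical.

```python
def eligible_clients(free_cpus_by_client, cpus_per_job):
    """Return the clients whose free CPU count can hold the job."""
    return [
        name
        for name, free in free_cpus_by_client.items()
        if cpus_per_job <= free
    ]


print(eligible_clients({"farm01": 8, "farm02": 2}, 4))
```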

Houdini Max Threads

When on, sets the HOUDINI_MAXTHREADS environment variable to the specified value. If CPUs per Job is enabled, HOUDINI_MAXTHREADS is set to the same value unless this parameter is also enabled.

A value of 0 indicates that the job should use all available CPU cores.

Positive values limit the number of threads that can be used. For example, a value of 1 disables multi-threading entirely by limiting the job to one thread. Positive values are also clamped to the number of available CPU cores.

If the value is negative, the value is added to the maximum number of processors to determine the threading limit for the job. For example, a value of -1 uses all CPU cores except 1.

See limiting resource usage.
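The three cases above (zero, positive, negative) can be summarized in a few lines. This is a sketch of the described rules rather than Houdini's implementation. The function name is hypothetical, and the floor of one thread for large negative values is an assumption that the text above does not state.

```python
def effective_max_threads(value, cpu_cores):
    """Return the thread limit implied by a HOUDINI_MAXTHREADS value."""
    if value == 0:
        # 0 means use all available CPU cores.
        return cpu_cores
    if value > 0:
        # Positive values are clamped to the number of available cores.
        return min(value, cpu_cores)
    # Negative values are added to the core count.
    # Assumption: never go below one thread.
    return max(1, cpu_cores + value)


print(effective_max_threads(-1, 8))
```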

Max Run Time

The maximum amount of time (in seconds) that the work item is permitted to run. If its run time exceeds this maximum, then it is automatically canceled by HQueue.

Create Container Job

Determines whether a node-level container job should be created in the job tree, and how it should be named.

Custom Container Name

When Create Container Job is set to Custom Name, this parameter can be set to an expression to define the container job name.

Job Description

Description property for the job.
