Tractor Scheduler TOP node

Schedules work items using Pixar’s Tractor.

Since	17.5

Overview ¶

This scheduler executes work items on a farm managed by Tractor.

Note

The Tractor Scheduler TOP supports Tractor 2.4 and greater.

Tractor 2.4 introduced a Python 3-compliant API, which is required by PDG and TOPs.

The Tractor Scheduler TOP requires the Tractor client Python API. You can install the client API with the Pixar RenderMan installer using the installer’s default options. To make the API available to TOPs and PDG, add the API’s location to the PYTHONPATH environment variable before launching Houdini.

With the installer’s default options, the Tractor client Python API (the tractor package) is installed in the following locations:

Windows

C:\Program Files\Pixar\RenderManProServer-X.X\bin

Mac

/Applications/Pixar/RenderManProServer-X.X/bin

Linux

/opt/pixar/RenderManProServer-X.X/bin
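
As a minimal sketch of the PYTHONPATH step described above, the helpers below build an environment in which the Tractor API directory comes first on PYTHONPATH. The function names and the `version` argument are hypothetical. The directories are the default install locations listed above, so substitute your actual RenderManProServer version and path.

```python
import os

# Default install locations listed above; "{version}" stands in for X.X.
DEFAULT_API_DIRS = {
    "Windows": r"C:\Program Files\Pixar\RenderManProServer-{version}\bin",
    "Darwin": "/Applications/Pixar/RenderManProServer-{version}/bin",
    "Linux": "/opt/pixar/RenderManProServer-{version}/bin",
}


def tractor_api_dir(version, system):
    """Return the default Tractor client API directory for a platform name
    as reported by platform.system() ("Windows", "Darwin" or "Linux")."""
    return DEFAULT_API_DIRS[system].format(version=version)


def env_with_tractor_api(api_dir, base_env=None):
    """Return a copy of the environment with api_dir first on PYTHONPATH."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("PYTHONPATH", "")
    env["PYTHONPATH"] = api_dir + (os.pathsep + existing if existing else "")
    return env
```

The returned mapping could, for example, be passed as the `env` argument of `subprocess.Popen` when launching Houdini from a wrapper script.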

Cook Modes ¶

This scheduler can operate in two different cook modes. The normal cook mode is used when selecting cook from any of the menus or buttons in the TOP UI. It connects to your Tractor engine and creates jobs for work items as they become ready to execute. The jobs then communicate back to the submitting machine with status changes. This means the submitting Houdini session must remain open for the duration of the cook.

Alternatively, you can use the Submit button in the Submit As Job section to cook the entire TOP Network as a standalone job. In this mode, the submitting Houdini session is detached from the cooking of the TOP Network. The HIP file is copied if necessary, and a hython task executes the TOP network as normal using whatever the default scheduler is for that topnet. In this mode, you will not see any updates in your current Houdini session. You should instead check the progress of your job using the Tractor web portal.

Network Requirements ¶

As part of the cook, a message queue (MQ) job is submitted. This job is used to communicate information from executing jobs back to the submitting machine. For this reason, your farm machines must be able to resolve the hostnames of other farm machines.

Tip

This can be as simple as editing the hosts file: /etc/hosts (Linux / macOS) or C:\Windows\System32\Drivers\etc\hosts (Windows).

In addition, farm machines must not have firewalls between them, or you need to use the Task Callback Port parameter to specify the open port to use.

When the cook starts, the submitting machine connects to the farm machine that is running the MQ job. So farm machines also must not have firewalls between them and the submitting machine, or you need to use the Relay Port parameter to specify the open port to use.
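
The two requirements above (hostname resolution and reachable ports) can be checked from any machine with a short script. This is a sketch using only the Python standard library. The helper names are hypothetical, and the host and port you test must come from your own farm setup, for example the value you chose for Task Callback Port or Relay Port.

```python
import socket


def can_resolve(hostname):
    """Return True if this machine can resolve hostname to an address."""
    try:
        socket.gethostbyname(hostname)
        return True
    except OSError:
        return False


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A connection check only succeeds while something is listening on the port, so run it while the MQ job is active.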

Enable Server

When on, turns on the data layer server for the TOP job that will cook on the farm. This allows PilotPDG or other WebSocket clients to connect to the cooking job remotely to view the state of PDG.

Server Port

Determines which server port to use for the data layer server.

This parameter is only available when Enable Server is on.

Automatic

A free TCP port to use for the data layer server chosen by the node.

Custom

A custom TCP port to use for the data layer server specified by the user.

This is useful when there is a firewall between the farm machine and the monitoring machine.

Auto Connect

When on, the scheduler will try to send a command to create a remote visualizer when the job starts. If successful, then a remote graph is created and is automatically connected to the server executing the job. The client submitting the job must be visible to the server running the job or the connection will fail.

This parameter is only available when Enable Server is on.

When Finished

Determines what to do when the TOP Cook finishes. This allows the TOP Cook job to continue running after the graph cook completes so that it can be inspected by a wrangler using a Data Layer viewer. For example, with When Finished you can retry a failed work item without restarting its whole job.

Terminate

Exit the job as normal.

Keep Open If Error

Keep the job running only if there is an error detected. You will need to kill the job manually.

Keep Open

Keep the job running. You will need to kill the job manually.

Block on Failed Work Items

When on, if there are any failed work items on the scheduler, then the cook is blocked from completing and the PDG graph cook is prevented from ending. This allows you to manually retry your failed work items. You can cancel the scheduler’s cook when it is blocked by failed work items by pressing the ESC key, clicking the Cancels the current cook button in the TOP tasks bar, or by using the cancel API method.

Auto retry downstream tasks

When on, if a parent task is retried manually, then its child tasks will also be retried. This parameter is only available when Block on Failed Work Items is turned on.

Hython

Determines which Houdini Python interpreter (hython) is used for your Houdini jobs. You can also specify this hython in a command using the PDG_HYTHON token.

Default

Use the default hython interpreter that is installed with Houdini.

Custom

Use the executable path specified by the Hython Executable parameter.

Hython Executable

This parameter is only available when Hython is set to Custom.

The full path to the hython executable to use for your Houdini jobs.

Load Item Data From

Determines how jobs processed by this scheduler should load work item attributes and data.

Temporary JSON File

The scheduler writes out a .json file for each work item to the PDG temporary file directory. This option is selected by default.

RPC Message

The scheduler’s running work items request attributes and data over RPC. If the scheduler is a farm scheduler, then the job scripts running on the farm will also request item data from the submitter when creating their out-of-process work item objects.

This parameter option removes the need to write data files to disk and is useful when your local and remote machines do not share a file system.

Delete Temp Dir

Determines when PDG should automatically delete the temporary file directory associated with the scheduler.

Never

PDG never automatically deletes the temp file directory.

When Scheduler is Deleted

PDG automatically deletes the temp file directory when the scheduler is deleted or when Houdini is closed.

When Cook Completes

PDG automatically deletes the temp file directory each time a cook completes.

Compress Work Item Data

When on, PDG compresses the work item .json files when writing them to disk.

This parameter is only available when Load Item Data From is set to Temporary JSON File.

Ignore RPC Errors

Determines whether RPC errors should cause out of process jobs to fail.

Never

RPC connection errors will cause work items to fail.

When Cooking Batches

RPC connection errors are ignored for batch work items, which typically make a per-frame RPC back to PDG to report output files and communicate sub item status. This option prevents long-running simulations from being killed on the farm, if the submitter Houdini session crashes or becomes unresponsive.

Always

RPC connection errors will never cause a work item to fail. Note that if a work item can’t communicate with the scheduler, it will be unable to report output files, attributes or its cook status back to the PDG graph.

Max RPC Errors

The maximum number of RPC failures that can occur before RPC is disabled in an out of process job.

Connection Timeout

The number of seconds to wait when an out of process job makes an RPC connection to the main PDG graph, before assuming the connection failed.

Connection Retries

The number of times to retry a failed RPC call made by an out of process job.

Retry Backoff

When Connection Retries is greater than 0, this parameter determines how much time should be spent between consecutive retries.

Batch Poll Rate

Determines how quickly an out of process batch work item should poll the main Houdini session for dependency status updates, if the batch is configured to cook when its first frame of work is ready. This has no impact on other types of batch work items.

Release Job Slot When Polling

Determines whether or not the scheduler should decrement the number of active workers when a batch is polling for dependency updates.

Authentication ¶

The artist submitting work to Tractor may need to supply PDG with login information. If the environment variables $TRACTOR_USER and $TRACTOR_PASSWORD are present, they are used to authenticate with the Tractor API.

The Job Owner parameter sets the owner of the job. However, the environment variable $PDG_TRACTOR_USER overrides it if present. This is useful in the Submit Graph As Job workflow, because in that case PDG needs to log in to the Tractor API from the blade that is actually executing the TOP Cook job. In that case, set $PDG_TRACTOR_USER in the Tractor client environment.

Only use the Tractor Password parameter for debugging, and never save it in the HIP file, because the parameter value is not encrypted.
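
The precedence described above can be modeled in a few lines. This is an illustrative sketch, not PDG's actual implementation. The function names are hypothetical, and treating an empty variable as "not present" is an assumption.

```python
import os


def resolve_tractor_login(parm_user, parm_password, env=None):
    """Pick the login to use, preferring environment variables over parameters.

    Assumption: a variable set to an empty string is treated as not present.
    """
    env = os.environ if env is None else env
    user = env.get("TRACTOR_USER") or parm_user
    password = env.get("TRACTOR_PASSWORD") or parm_password
    return user, password


def resolve_job_owner(job_owner_parm, env=None):
    """Return the job owner, letting PDG_TRACTOR_USER override the parameter."""
    env = os.environ if env is None else env
    return env.get("PDG_TRACTOR_USER") or job_owner_parm
```

Passing an explicit `env` dictionary makes it easy to check what a blade would see without changing your own session's environment.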

TOP Attributes ¶

tractor_id

integer

When the scheduler submits a work item to Tractor, it adds this attribute to the work item in order to track the Tractor Job and Task IDs. The first element is the Job jid and the second element is the Task tid.
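
Once you have read the attribute's values from a work item, splitting them into the two IDs is straightforward. The helper below is a hypothetical sketch. How the values are fetched from the work item is deliberately left out, since that depends on the PDG API you are using.

```python
def split_tractor_id(values):
    """Split a two-element tractor_id value into (jid, tid)."""
    if len(values) != 2:
        raise ValueError("expected [jid, tid], got %r" % (values,))
    jid, tid = (int(v) for v in values)
    return jid, tid
```

For example, `split_tractor_id([1042, 3])` gives a Job jid of 1042 and a Task tid of 3.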

Parameters ¶

Schedulers ¶

These are global parameters for all work items using this scheduler.

Tractor

Tractor Server

Specifies the Tractor server address.

Port

Specifies the Tractor server port.

Tractor User

Specifies the username for the Tractor server login. This user must have permission to submit and query job status. You can override this with $PDG_TRACTOR_USER.

Tractor Password

Specifies the password for the Tractor server login.

This is for convenience only. When saved, the password is saved in the HIP file with no encryption.

Instead, you should set $TRACTOR_PASSWORD in the Houdini environment. For more information, see how to set environment variables.

Limit Jobs

When enabled, sets the maximum number of jobs that can be submitted by the scheduler at the same time.

For farm schedulers like Tractor or HQueue, this parameter can be used to limit the total number of jobs submitted to the render farm itself. Setting this parameter can help limit the load on the render farm, especially when the PDG graph has a large number of small tasks.

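Block on Failed Work Items

When on, if there are any failed work items on the scheduler, then the cook is blocked from completing and the PDG graph cook is prevented from ending. This allows you to manually retry your failed work items. You can cancel the scheduler’s cook when it is blocked by failed work items by pressing the ESC key, clicking the Cancels the current cook button in the TOP tasks bar, or by using the cancel API method.

Auto retry downstream tasks

When on, if a parent task is retried manually, then its child tasks will also be retried. This parameter is only available when Block on Failed Work Items is turned on.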

Paths

Working Directory

Specifies the relative directory where the work generates intermediate files and output. The intermediate files are placed in a subdirectory. For the Local Scheduler or HQueue, typically $HIP is used. For other schedulers, this should be a relative directory to Local Shared Root Path and Remote Shared Root Path; this path is then appended to these root paths.

Load Item Data From

Determines how jobs processed by this scheduler should load work item attributes and data.

Temporary JSON File

The scheduler writes out a .json file for each work item to the PDG temporary file directory. This option is selected by default.

RPC Message

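The scheduler’s running work items request attributes and data over RPC. If the scheduler is a farm scheduler, then the job scripts running on the farm will also request item data from the submitter when creating their out-of-process work item objects.

This parameter option removes the need to write data files to disk and is useful when your local and remote machines do not share a file system.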

Compress Work Item Data

When on, PDG compresses the work item .json files when writing them to disk.

This parameter is only available when Load Item Data From is set to Temporary JSON File.

Python Executable

Specifies the full path to the Python executable on the farm machines. This is used to execute the job wrapper script for PDG work items.

Hython

Determines which Houdini Python interpreter (hython) is used for your Houdini jobs. You can also specify this hython in a command using the PDG_HYTHON token.

Default

Use the default hython interpreter that is installed with Houdini.

Custom

Use the executable path specified by the Hython Executable parameter.

Hython Executable

This parameter is only available when Hython is set to Custom.

The full path to the hython executable to use for your Houdini jobs.

Path Mapping

Global

If the PDG Path Map exists, then it is applied to file paths.

None

Delocalizes paths using the PDG_DIR token.

Path Map Zone

When on, specifies a custom mapping zone to apply to all jobs executed by this scheduler. Otherwise, the zone is the local platform: LINUX, MAC or WIN.

Validate Outputs When Recooking

When on, PDG validates the output files of the scheduler’s cooked work items when the graph is recooked to see if the files still exist on disk. Work items that are missing output files are then automatically dirtied and cooked again. If any work items are dirtied by parameter changes, then their cache files are also automatically invalidated. Validate Outputs When Recooking is on by default.

Check Expected Outputs on Disk

When on, PDG looks for any unexpected outputs (for example, outputs produced by custom output handling logic) that were not explicitly reported when the scheduler’s work items finished cooking. This check occurs immediately after the scheduler marks work items as cooked. Expected outputs that were reported normally are not checked. If PDG finds any files that are different from the expected outputs, then they are automatically added as real output files.

Shared File Root

NFS

Specifies the path to the shared file root for farm machines in the NFS zone.

UNC (Windows)

Specifies the path to the shared file root for farm machines in the UNC zone.

$HFS

NFS

Specifies the path to the Houdini installation for farm machines in the NFS zone.

UNC (Windows)

Specifies the path to the Houdini installation for farm machines in the UNC zone.

Temp Directory

Location

Determines where temporary files are written to. Files that are written to this location are needed for the PDG cook, but are not typically the end product and can be removed when the cook completes.

For example, log files and python scripts are some of the files that are usually written during the cook.

Working Directory

Use the pdgtemp subdirectory of the directory specified in the Working Directory field.

Custom

Use the custom directory specified in the Custom field.

Append PID

When on, a subdirectory is added to the location specified by the Location parameter and is named after the value of your Houdini session’s PID (Process Identifier). The PID is typically a 3-5 digit number.

This is necessary when multiple sessions of Houdini are cooking TOP graphs at the same time.

Custom

The full path to the custom temporary directory, which needs to be accessible by all blades involved in executing the Job.

Delete Temp Dir

Determines when PDG should automatically delete the temporary file directory associated with the scheduler.

Never

PDG never automatically deletes the temp file directory.

When Scheduler is Deleted

PDG automatically deletes the temp file directory when the scheduler is deleted or when Houdini is closed.

When Cook Completes

PDG automatically deletes the temp file directory each time a cook completes.

Job Spec ¶

Job Description

Job Owner

Specifies the username of the owner of the job.

Job Title

Specifies the title of the top-level job for the submitted cooks.

Job Priority

Specifies the priority of the cook jobs.

Job Options

Tier

Specifies a list of valid site-wide tiers, where each tier represents a particular global job priority and scheduling discipline.

Projects

Specifies the names of project affiliations for this job.

Max Active Tasks

When on, sets the maximum number of tasks that the PDG cook job is allowed to run concurrently.

After Jobs

Specifies which jobs need to be complete before job processing can begin. You can specify a single job ID or a space-separated list of multiple job IDs. Once spooled, this parameter’s settings will delay the start of job processing until the specified jobs have completed.

Verbose Logging

When on, detailed messages from the scheduler binding are printed to the console.

Use Session File

When on, the Tractor API will create a temporary file to avoid the need to authenticate the local user multiple times in one session of Houdini. The file will be created as $TEMP/.pdgtractor.{user}.{host}.session.
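
If you need to locate the session file, for example to remove a stale one, the documented name pattern can be expanded as in the sketch below. The helper is hypothetical, and exactly how PDG resolves $TEMP, the user, and the host is an assumption, so pass in the values that apply to your session.

```python
import os


def session_file_path(temp_dir, user, host):
    """Build the session file path from the documented name pattern."""
    name = ".pdgtractor.{user}.{host}.session".format(user=user, host=host)
    return os.path.join(temp_dir, name)
```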

Submit As Job ¶

Submit

Cooks the entire TOP Network as a standalone job and displays the status URI for the submitted job.

By default, the submitted job uses the Tractor login and sets it to the job environment with $PDG_TRACTOR_USER and $TRACTOR_PASSWORD. If these are not present, then the Tractor User and Tractor Password parameter values are used.

Job Title

Specifies the title of the job that is submitted.

Job Verbosity

Specifies the verbosity level of the standalone job.

Output Node

When on, specifies the path to the node to cook. If a node is not specified, the display node of the network that contains the scheduler is cooked instead.

Save Task Graph File

When on, the submitted job will save a task graph .py file once the cook completes.

Job Parms

Job Service Keys

Specifies the Tractor Service Key expression for the job that will execute the TOP graph on the farm.

You may want to use a cheaper slot for this because although executing the TOP graph requires a separate task, it does not consume much memory or CPU.

Data Layer Server

Enable Server

When on, turns on the data layer server for the TOP job that will cook on the farm. This allows PilotPDG or other WebSocket clients to connect to the cooking job remotely to view the state of PDG.

Server Port

Determines which server port to use for the data layer server.

This parameter is only available when Enable Server is on.

Automatic

A free TCP port to use for the data layer server chosen by the node.

Custom

A custom TCP port to use for the data layer server specified by the user.

This is useful when there is a firewall between the farm machine and the monitoring machine.

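Auto Connect

When on, the scheduler will try to send a command to create a remote visualizer when the job starts. If successful, then a remote graph is created and is automatically connected to the server executing the job. The client submitting the job must be visible to the server running the job or the connection will fail.

This parameter is only available when Enable Server is on.

When Finished

Determines what to do when the TOP Cook finishes. This allows the TOP Cook job to continue running after the graph cook completes so that it can be inspected by a wrangler using a Data Layer viewer. For example, with When Finished you can retry a failed work item without restarting its whole job.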

Terminate

Exit the job as normal.

Keep Open If Error

Keep the job running only if there is an error detected. You will need to kill the job manually.

Keep Open

Keep the job running. You will need to kill the job manually.

Message Queue ¶

Service Keys

Specifies the Tractor Service Key expression for the task that will execute the Message Queue Server.

You may want to use a cheaper slot for this because although the Message Queue Process requires a separate task, it does not consume much memory or CPU.

Please note that a Message Queue Task is not created when a graph is cooked via Submit Graph As Job.

Task Callback Port

When on, sets the TCP Port used by the Message Queue Server for the job callback API. The port must be accessible between farm blades.

Relay Port

When on, sets the TCP Port used by the Message Queue Server connection between PDG and the blade that is running the Message Queue Command. The port must be reachable on farm blades by the PDG/user machine.

RPC Server ¶

Parameters for configuring the behavior of RPC connections from out of process jobs back to a scheduler instance.

Ignore RPC Errors

Determines whether RPC errors should cause out of process jobs to fail.

Never

RPC connection errors will cause work items to fail.

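When Cooking Batches

RPC connection errors are ignored for batch work items, which typically make a per-frame RPC back to PDG to report output files and communicate sub item status. This option prevents long-running simulations from being killed on the farm if the submitter Houdini session crashes or becomes unresponsive.

Always

RPC connection errors will never cause a work item to fail. Note that if a work item can’t communicate with the scheduler, it will be unable to report output files, attributes or its cook status back to the PDG graph.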

Max RPC Errors

The maximum number of RPC failures that can occur before RPC is disabled in an out of process job.

Connection Timeout

The number of seconds to wait when an out of process job makes an RPC connection to the main PDG graph, before assuming the connection failed.

Connection Retries

The number of times to retry a failed RPC call made by an out of process job.

Retry Backoff

When Connection Retries is greater than 0, this parameter determines how much time should be spent between consecutive retries.

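Batch Poll Rate

Determines how quickly an out of process batch work item should poll the main Houdini session for dependency status updates, if the batch is configured to cook when its first frame of work is ready. This has no impact on other types of batch work items.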

Release Job Slot When Polling

Determines whether or not the scheduler should decrement the number of active workers when a batch is polling for dependency updates.

Job Parms ¶

These job-specific parameters affect all submitted jobs, but can be overridden on a node-by-node basis. See Scheduler Job Parms / Properties.

Service Key Expression

Specifies the job service key expression. This determines the type of blade that can run this job.

Limit Tags

Specifies the job limit tags. This is a space-separated list of strings representing the tags to be associated with every command of the job.

At Least Slots

Sets the minimum number of free slots that must be available on a Tractor blade in order to execute this command.

At Most Slots

When on, sets the maximum number of free slots that this command can use when launched. This is also used as the default Houdini Max Threads value unless that parameter is explicitly set.

Houdini Max Threads

When on, sets the HOUDINI_MAXTHREADS environment variable to the given value. By default, HOUDINI_MAXTHREADS is set to the value of At Most Slots when enabled.

The default value of 0 means to use all available processors.

If the value is positive, the value limits the number of threads that can be used. A value of 1 disables multithreading entirely by limiting it to only one thread. Positive values are clamped to the number of available CPU cores.

If the value is negative, the value is added to the maximum number of processors to determine the threading limit. For example, a value of -1 uses all CPU cores except 1.
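
The three cases above amount to a small piece of arithmetic. The function below is a sketch of those rules for a given core count. The function name is hypothetical, and keeping at least one thread for large negative values is an assumption that the text above does not state.

```python
def effective_thread_limit(max_threads, cpu_count):
    """Translate a HOUDINI_MAXTHREADS-style value into a thread count."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    if max_threads == 0:
        return cpu_count                      # 0: use all processors
    if max_threads > 0:
        return min(max_threads, cpu_count)    # positive: clamp to core count
    # Negative: added to the processor count. Keeping at least one thread
    # is an assumption of this sketch, not documented behavior.
    return max(1, cpu_count + max_threads)
```

On a 16-core blade, a value of 4 gives 4 threads, 32 is clamped to 16, and -1 gives 15.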

Env Keys

Specifies a space separated list of environment keys which are defined in the blade profiles.

Task Title

Specifies a custom task name prefix. By default, the corresponding work item name will be used. The name suffix is a value used internally by PDG for bookkeeping.

Maximum Run Time

Specifies a maximum runtime limit for tasks (in seconds). When tasks run past the time limit, they are then killed. By default, there is no maximum runtime limit as the default value is 0.

Post Success Wait

Specifies how many seconds to wait before exiting a successful job. This stops Tractor from immediately re-assigning a blade before a dependent, higher priority job can be spooled by PDG.

Metadata

Specifies an arbitrary string that is attached to the task definition.

Preview Launch

Specifies a launch expression to run an external application from the Tractor UI. This allows you to view in-progress cook results using an external application.

Tip

TOPs has its own internal viewer registry.

Non-Zero Exit Code Handling

Handle By

Determines what to do when the command fails (returns a non-zero exit code).

Reporting Error

The work item fails.

Reporting Warning

The work item succeeds and a warning is added to the node.

Retrying Task

The work item is retried by Tractor for the number of Retries remaining.

Ignoring Exit Code

The work item succeeds.

Handle All Non Zero

When off, you can specify a particular exit code.

Exit Code

Specifies the exit code that you want to handle using Handle By. All other non-zero exit codes will be treated as a failure as normal.

This parameter is only available when Handle All Non Zero is off.

Retries

Number of times to retry the job when the command fails.

This parameter is only available when Handle By is set to Retrying Task.
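
The interaction between Handle By, Handle All Non Zero, Exit Code, and Retries can be summarized as a decision function. This is a sketch of the rules as described above, not the scheduler's code. The string values are hypothetical stand-ins for the menu entries, and failing the work item once no retries remain is an assumption.

```python
def classify_exit(exit_code, handle_by, handle_all=True,
                  handled_code=None, retries_left=0):
    """Return "success", "warning", "retry" or "failure" for a finished command.

    handle_by is one of "error", "warning", "retry", "ignore" (hypothetical
    stand-ins for the Handle By menu entries).
    """
    if exit_code == 0:
        return "success"
    # With Handle All Non Zero off, only the chosen Exit Code gets special
    # handling; every other non-zero code is an ordinary failure.
    if not handle_all and exit_code != handled_code:
        return "failure"
    if handle_by == "warning":
        return "warning"
    if handle_by == "ignore":
        return "success"
    # Assumption: once no retries remain, the work item fails.
    if handle_by == "retry" and retries_left > 0:
        return "retry"
    return "failure"
```

For example, with Handle By set to Reporting Warning and Exit Code set to 3, an exit code of 3 produces a warning while an exit code of 1 still fails the work item.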

Task Environment

Inherit Local Environment

When on, environment variables in the current session of PDG are copied into the job’s environment.

Unset Variables

Space-separated list of environment variables that should be unset in the task environment.

Environment File

Specifies an environment file for environment variables to be added to the job’s environment. An environment variable from the file will overwrite an existing environment variable if they share identical names.

Environment Variables

Multiparm that lets you add custom key-value environment variables for each task. These will be added to the job’s environment. If the value of the variable is empty, it will be removed from the job’s environment.

Name

Name of the work item environment variable.

Value

Value of the work item environment variable.
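
The Task Environment settings above combine into a single job environment. The sketch below models one plausible way to combine them. The function and argument names are hypothetical, and the order in which the settings are applied (inherit, unset, environment file, then explicit variables) is an assumption rather than documented behavior.

```python
def build_task_env(local_env, inherit_local=True, unset=(),
                   file_vars=None, extra_vars=None):
    """Assemble a task environment from the Task Environment settings.

    The order of application (inherit, unset, file, explicit variables)
    is an assumption of this sketch.
    """
    env = dict(local_env) if inherit_local else {}
    for name in unset:
        env.pop(name, None)
    # Variables from the environment file overwrite same-named entries.
    env.update(file_vars or {})
    # Explicit variables are added; an empty value removes the variable.
    for name, value in (extra_vars or {}).items():
        if value == "":
            env.pop(name, None)
        else:
            env[name] = value
    return env
```

Calling it with a small dictionary instead of the real session environment is a quick way to check which variables a task would end up with.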

Job Scripts

Pre Shell

Specifies a shell script to be executed/sourced before the command is executed.

Post Shell

Specifies a shell script to be executed/sourced after the command is executed.

Pre Python

Specifies the Python script to be executed in the wrapper script before the command process is spawned.

Post Python

Specifies the Python script to be executed in the wrapper script after the command process exits.
