Houdini 20.5 Executing tasks with PDG/TOPs

File paths

Best practices for input/output file paths in TOP networks.

Overview

TOPs is designed to work with compute farms that may use a variety of different filesystems. For example, a TOPs user could be on a Windows machine while submitting work to a Linux-based farm. The problem is how to map file paths from one filesystem to another. TOPs provides the PDG Path Map to address this.
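To make the mapping problem concrete, the following is a minimal sketch of prefix-based path translation in plain Python. It is not the PDG Path Map implementation or its API. The function name `map_path`, the drive letter, and the mount point are hypothetical values chosen for illustration.

```python
def map_path(path, mappings):
    """Rewrite a path's prefix using the first matching (source, target) pair.

    Illustration only: this does not call or reproduce the PDG Path Map.
    """
    # Treat backslashes as forward slashes so Windows-style input compares cleanly.
    normalized = path.replace("\\", "/")
    for source, target in mappings:
        source = source.rstrip("/")
        # Match whole path components only, so "P:/projects" does not
        # match "P:/projectsX".
        if normalized == source or normalized.startswith(source + "/"):
            return target.rstrip("/") + normalized[len(source):]
    # No mapping applies: return the normalized path unchanged.
    return normalized


# Hypothetical mapping from a Windows drive to a Linux mount point.
EXAMPLE_MAPPINGS = [("P:/projects", "/mnt/projects")]

print(map_path(r"P:\projects\shot010\geo\box.bgeo.sc", EXAMPLE_MAPPINGS))
# prints: /mnt/projects/shot010/geo/box.bgeo.sc
```

The comparison here is case-sensitive, which is a simplification, since Windows paths are usually treated as case-insensitive.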

Most TOP nodes that do work let you specify input and/or output file paths. Each scheduler node can specify its own working directory, because different render farm software may use different shared network filesystems. Jobs see this directory as the PDG_DIR environment variable. If you are using a farm scheduler, make sure that every file you output is reachable by the farm machines relative to PDG_DIR.
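One way to sanity-check this before submitting work is to test whether each output path sits inside the working directory. The sketch below uses only the Python standard library and assumes POSIX-style paths. The helper name `is_under_working_dir` and the example paths are hypothetical and not part of PDG.

```python
from pathlib import PurePosixPath


def is_under_working_dir(output_path, working_dir):
    """Return True if output_path is working_dir itself or lies beneath it.

    Purely lexical: nothing is resolved on disk, so symlinks are not followed.
    """
    out = PurePosixPath(output_path)
    base = PurePosixPath(working_dir)
    # Reject ".." components, which could step outside the working directory.
    if ".." in out.parts:
        return False
    return out == base or base in out.parents


# Hypothetical working directory on a shared mount.
working_dir = "/mnt/farm/shot010"
print(is_under_working_dir("/mnt/farm/shot010/geo/box.bgeo.sc", working_dir))  # True
print(is_under_working_dir("/home/user/box.bgeo.sc", working_dir))             # False
```

A check like this only confirms that the path is written relative to the working directory. Whether the farm machines can actually mount that directory still depends on the network filesystem setup.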

How to

  1. Set the base working directory on the scheduler node. This directory is available to jobs as the PDG_DIR environment variable.

    • Use a separate working directory for each HIP file, so that two HIP files never write to the same PDG_DIR. Many of the default filenames generated by parameter defaults are only unique within a single HIP file.

    • For render farm schedulers, make sure that the directory is inside the network filesystem (such as an NFS mount or SMB share) and is shared with the render farm client machines.

  2. When you use PDG_DIR or PDG_TEMP in parameter filenames, use the form __PDG_DIR__ instead of ${PDG_DIR}. If you use ${PDG_DIR}, Houdini tries to expand the variable itself, and fails, before the dependency graph receives it. Houdini ignores the __PDG_DIR__ syntax, but the PDG scheduler knows to expand that token to the absolute path on the executing machine.

  3. Put intermediate files under __PDG_TEMP__ and final output files under __PDG_DIR__.

    • Categorize output files using subdirectories. For example, __PDG_TEMP__/geo for intermediate geometry files and __PDG_DIR__/geo for final geometry output.
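The token convention in steps 2 and 3 can be pictured as a late string substitution that happens on the executing machine. The sketch below is plain Python and is not the scheduler's actual implementation. The function name `expand_pdg_tokens` and the directory values are hypothetical.

```python
def expand_pdg_tokens(path, pdg_dir, pdg_temp):
    """Replace the __PDG_DIR__ and __PDG_TEMP__ tokens with absolute paths.

    Illustration only: the real expansion is performed by the PDG scheduler.
    """
    return path.replace("__PDG_DIR__", pdg_dir).replace("__PDG_TEMP__", pdg_temp)


# Hypothetical directories as they might look on a Linux farm machine.
pdg_dir = "/mnt/farm/shot010"
pdg_temp = "/mnt/farm/shot010/pdgtemp/1234"

# Final output goes under the working directory, intermediates under temp.
print(expand_pdg_tokens("__PDG_DIR__/geo/final.bgeo.sc", pdg_dir, pdg_temp))
# prints: /mnt/farm/shot010/geo/final.bgeo.sc
print(expand_pdg_tokens("__PDG_TEMP__/geo/cache.bgeo.sc", pdg_dir, pdg_temp))
# prints: /mnt/farm/shot010/pdgtemp/1234/geo/cache.bgeo.sc
```

Because the substitution uses the directories of whichever machine runs the job, the same parameter value can resolve to different absolute paths on a Windows workstation and a Linux farm machine.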

Environment Variables
