Houdini 20.5 Executing tasks with PDG/TOPs

External configuration and data

How to read external configuration and source data in TOPs and use it to drive work.

Overview

It is often useful to drive a TOP network from a configuration file, from input data, or from both. For example, a settings file might define variables to use when running the network, and you might run the network once for each part listed in a manifest or for each asset listed in an asset management system.

TOPs lets you read external data and generate work items/attributes based on the data.
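
As a rough illustration of that idea outside Houdini, the sketch below combines a settings file with a manifest to produce one dictionary of attributes per manifest entry. It uses only the Python standard library. The file contents, the build_work_items name, and the use of plain dictionaries to stand in for work items are all assumptions made for this example and are not part of PDG.

```python
import json

# Hypothetical settings file contents: values shared by every work item.
SETTINGS_JSON = '{"resolution": 512, "quality": "high"}'

# Hypothetical manifest contents: one entry per part to process.
MANIFEST_JSON = '[{"part": "wheel"}, {"part": "door"}, {"part": "hood"}]'


def build_work_items(settings_text, manifest_text):
    """Return one attribute dictionary per manifest entry.

    Each dictionary stands in for a work item: it carries the shared
    settings plus the values specific to that manifest entry.
    """
    settings = json.loads(settings_text)
    manifest = json.loads(manifest_text)
    items = []
    for index, entry in enumerate(manifest):
        attribs = dict(settings)   # start from the shared configuration
        attribs.update(entry)      # per-entry values override the settings
        attribs["index"] = index
        items.append(attribs)
    return items


work_items = build_work_items(SETTINGS_JSON, MANIFEST_JSON)
for attribs in work_items:
    print(attribs)
```

In a TOP network the input nodes described in the next section play this role, so a script like this is only needed when the data requires custom handling.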

Loading data

  • SQL Database:

    SQL Input runs a query on a database server and generates a work item for each returned row, with attributes taken from the columns.

    SQL Output takes incoming work items and writes an INSERT query to insert a row for each work item, with the columns taken from attributes you specify.

  • CSV (comma separated values, typically exported by a spreadsheet):

    CSV Input reads a CSV file and generates a work item for each row, with attributes taken from the columns.

    CSV Output takes incoming work items and writes out a new CSV file with a row for each work item, with the columns taken from the attributes.

  • JSON files:

    JSON Input generates work items with attributes based on JSON files. Because JSON is so free-form, the node has parameters to try to extract items and attributes from various data “shapes”.

    JSON Output takes incoming work items and writes out a JSON file with a list of objects representing work items, containing key/value pairs taken from the attributes.

  • The Environment Edit node lets you add extra environment variables to the environment in which work is run.
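
The row-to-work-item mapping described for the CSV nodes can be pictured with a short standard-library sketch. This is not how the nodes are implemented. The CSV text, the function names, and the use of dictionaries as stand-ins for work items are assumptions for illustration. Note that the csv module returns every value as a string, so any type conversion would be an extra step.

```python
import csv
import io

# Hypothetical CSV content, standing in for a file exported by a spreadsheet.
CSV_TEXT = "shot,detail,scale\nsh010,low,0.5\nsh020,high,2.0\n"


def rows_to_attribs(csv_text):
    """Return one attribute dictionary per CSV row, keyed by column name."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row) for row in reader]


def attribs_to_csv(items, columns):
    """Write one CSV row per item, keeping only the named columns."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=columns,
        extrasaction="ignore",   # silently skip attributes not listed
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(items)
    return buffer.getvalue()


items = rows_to_attribs(CSV_TEXT)
csv_out = attribs_to_csv(items, ["shot", "scale"])
print(items)
print(csv_out)
```

Passing an explicit column list on output mirrors the idea of choosing which attributes are written, so that intermediate values do not end up in the file.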

Manipulating data

  • Using Python:

    If you don’t mind programming, the simplest and most flexible way to pre-process data, or manipulate existing attributes, is with a Python snippet.

    If you want to edit one or more attributes on every incoming work item:

    1. Add a Python Script node. The Python Script node lets you edit incoming work items one at a time.

    2. In the parameters, set Evaluate Script During to Cook (In-Process). The script will then run in the current Houdini process when the work item cooks.

    3. Write a script to manipulate attributes. For example:

      # Assume we've ingested data that sets the "detail" attribute to a string
      # such as "low", "medium", or "high", and we want to translate that into
      # a numeric value, e.g. -1, 0, or 1.
      
      # Define a dictionary mapping string values to numeric values
      lookup = {"low": -1, "medium": 0, "high": 1}
      # Get the value of the "detail" string attribute
      detail = work_item.stringAttribValue("detail")
      # Translate the string into the numeric equivalent
      level = lookup.get(detail, 0)
      # Create a new "level" attribute with the numeric equivalent
      work_item.setIntAttrib("level", level)
      

    If you want to compute aggregate statistics (for example, the average of an attribute):

    1. Add a Wait for All node. This will pause processing until all work items are available, so the average will include all values.

    2. After the Wait for All, add a Python Script node.

    3. In the parameters, set Generate When to Each Upstream Item is Cooked and set Evaluate Script During to Cook (In-Process).

    4. In the script, you can use parent_item to refer to the single item in the Wait for All node. That item has a partitionItems list attribute you can use to access the items in the partition:

      # Get the work items in the input Wait for All's partition
      items = parent_item.partitionItems
      # Calculate the average of the "scale" attribute
      total = sum(it.floatAttribValue("scale") for it in items)
      average = total / float(len(items))
      # Set the average as an attribute on the outgoing item
      work_item.setFloatAttrib("average", average)
      
  • Alternatively, you can use TOP nodes that manipulate attributes:

    Attribute Create. You can add or redefine attributes using this node, and you can use expressions to compute the value of the new attribute based on existing attributes.

    Attribute Copy duplicates attributes from work items in one branch onto the work items in another branch, matching the work items up by index or by an attribute value.

    Attribute Delete removes attributes from work items. This can be useful, for example, before outputting to CSV to prevent “scratch” attributes from being written to disk.

    Attribute from String creates attributes by parsing components of an input string. This node is useful for extracting frame or shot information from a file path or parsing string data loaded by the CSV or JSON input nodes.

    Attribute Reduce applies operations to array attributes that reduce them to a single value. For example, this node can be used to find the min, max, average or total from the values in an integer or float array.
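
For comparison, the kind of parsing and reduction that Attribute from String and Attribute Reduce perform can also be written as plain Python, for example inside a Python Script node. The sketch below runs on its own with the standard library. The file path, its naming scheme, the regular expression, and the sample values are assumptions for illustration, and it does not show the nodes' own parameters or behavior.

```python
import re
from statistics import mean

# Hypothetical file path; the naming scheme is an assumption for illustration.
path = "/proj/shots/sh020/render/beauty.0042.exr"

# Parse the shot name and frame number out of the path.
match = re.search(r"/(?P<shot>sh\d+)/.*\.(?P<frame>\d+)\.exr$", path)
if match is None:
    raise ValueError("path does not follow the expected naming scheme: " + path)
shot = match.group("shot")
frame = int(match.group("frame"))

# Reduce an array of values to single summary values.
scales = [0.5, 2.0, 1.5]
summary = {
    "min": min(scales),
    "max": max(scales),
    "average": mean(scales),
    "total": sum(scales),
}

print(shot, frame, summary)
```

The dedicated nodes are usually the simpler choice when the built-in operations are enough. A script is mainly worth it when the string format or the reduction is unusual.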
