File Decompress TOP node

Decompresses archive files specified by incoming work items into individual files.

Since	17.5

This node looks in the output attribute on each incoming work item for file paths tagged file/archive. If the file name’s extension is supported (.zip, .tar.gz), the node extracts the files inside the archive into a given directory.

The node can also decompress gzipped files with the .gz extension, and writes the decompressed data to a file with the same name minus the .gz extension.
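The .gz behavior described above can be approximated outside of PDG with the Python standard library. This is only a sketch of the naming rule (same file name, minus the .gz extension), not the node's actual implementation; the helper name `gunzip_file` is hypothetical.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_file(gz_path):
    """Decompress gz_path next to itself, dropping the .gz extension.

    Hypothetical helper: mirrors the naming rule described above,
    not the node's internal code.
    """
    src = Path(gz_path)
    if src.suffix != ".gz":
        raise ValueError(f"expected a .gz file, got {src.name}")
    dst = src.with_suffix("")  # "notes.txt.gz" -> "notes.txt"
    # Stream the data so large files are not loaded into memory at once.
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst
```

Calling `gunzip_file("sources/notes.txt.gz")` would write `sources/notes.txt` alongside the original file.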

(Nodes that produce file results (for example, File Pattern or File Compress) should automatically tag archive files as file/archive based on the extension.)

If you just want to extract files from one or more pre-existing archive files, do the following:

1. Create a File Pattern node.
   - Set the Pattern to match the archive(s) you want to extract. For example, $PDG_DIR/sources/*.zip.
   - Make sure Split results into separate items is on.
   The node will generate a work item for each archive, with its output attribute set to the archive path. The path should automatically be tagged file/archive based on the extension.
2. Connect a File Decompress node. Turn on Output Folder and set it to the directory you want to extract files into.
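For comparison, the extraction step itself can be sketched in plain Python for the two archive extensions named on this page. This is an illustration under stated assumptions, not the node's implementation: `extract_archive` is a hypothetical helper, and it returns the output directory to mirror how the node reports its result.

```python
import tarfile
import zipfile
from pathlib import Path


def extract_archive(archive_path, output_dir):
    """Extract a .zip or .tar.gz archive into output_dir.

    Hypothetical helper. Creates output_dir if it does not exist and
    returns the directory path, not the list of extracted files.
    """
    archive = Path(archive_path)
    name = archive.name.lower()
    if not name.endswith((".zip", ".tar.gz")):
        raise ValueError(f"unsupported archive extension: {archive.name}")

    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Only extract archives you trust: members with crafted paths
    # can otherwise write outside of output_dir.
    if name.endswith(".zip"):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(out)
    else:
        with tarfile.open(archive, "r:gz") as tf:
            tf.extractall(out)
    return out
```

A call such as `extract_archive("sources/assets.zip", "extracted/assets")` would create `extracted/assets` if needed and unpack the archive into it.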

Warning

The File Decompress node sets each work item’s output to the path of the directory it extracted into, not to the list of extracted files, as you might expect. To also add the extracted file paths as outputs, turn on Add Extracted Files to Outputs.

TOP Attributes ¶

`input_archives`	string	A list of input archives that will be decompressed.
`output_dirs`	string	A list of directories into which the archives will be extracted.

Parameters ¶

Node ¶

Generate When

Determines when this node will generate work items. You should generally leave this set to “Automatic” unless you know the node requires a specific generation mode, or that the work items need to be generated dynamically.

All Upstream Items are Generated

This node will generate work items once all of the input nodes have generated their work items.

All Upstream Items are Cooked

This node will generate work items once all of the input nodes have cooked their work items.

Each Upstream Item is Cooked

This node will generate work items each time a work item in an input node is cooked.

Automatic

The generation mode is selected based on the generation mode of the input nodes. If any of the input nodes are generating work items when their inputs cook, this node will be set to Each Upstream Item is Cooked. Otherwise, it will be set to All Upstream Items are Generated.
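The Automatic rule above amounts to a single check across the input nodes. A minimal sketch, assuming each input is summarized by a boolean that says whether it generates work items when its own inputs cook; the function name and the boolean representation are hypothetical and not part of the PDG API.

```python
EACH_COOKED = "Each Upstream Item is Cooked"
ALL_GENERATED = "All Upstream Items are Generated"


def resolve_generate_when(inputs_generate_on_cook):
    """Pick the effective mode for a node set to Automatic.

    inputs_generate_on_cook: one boolean per input node, True if that
    input generates work items when its inputs cook (an assumed
    simplification of the real node state).
    """
    if any(inputs_generate_on_cook):
        return EACH_COOKED
    return ALL_GENERATED
```

For example, `resolve_generate_when([False, True])` selects Each Upstream Item is Cooked because one input generates its work items during the cook.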

Cook Type

Determines how work items in the node should cook, for example, whether they should run in process, out of process, or using services.

Files to Decompress

Determines whether the node should create work items that extract archives from input tasks, or from a custom file path.

Source Tag

When Files to Decompress is set to Upstream Output Files, this parameter determines the file tag used to match the upstream outputs.

Path

When Files to Decompress is set to Custom Path, this parameter determines the path to the file that should be decompressed. This path can contain expressions that resolve to a different value for each work item.

Output Folder

Turn this on and set it to the path to the directory you want to extract files into. The node will create the directory if it does not exist.

Add Extracted Files to Outputs

When this toggle is enabled, the extracted output file paths are added as output files on the work item.

Caching and Output Files ¶

Cache Mode

Determines how the processor node handles work items that report expected file results.

Automatic

If the expected result file exists on disk, the work item is marked as cooked without being scheduled. If the file does not exist on disk, the work item is scheduled as normal. If upstream work item dependencies write out new files during a cook, the cache files on work items in this node will also be marked as out-of-date.

Automatic (Ignore Upstream)

The same as Automatic, except upstream file writes do not invalidate cache files on work items in this node and this node will only check output files for its own work items.

Read Files

If the expected result file exists on disk, the work item is marked as cooked without being scheduled. Otherwise the work item is marked as failed.

Write Files

Work items are always scheduled and the expected result file is ignored even if it exists on disk.
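The cache modes above differ only in how they react to the expected result file. The following sketch captures that decision for a single work item. The mode strings and return values are hypothetical labels, and the upstream invalidation that distinguishes Automatic from Automatic (Ignore Upstream) is deliberately left out.

```python
import os


def resolve_cache_action(cache_mode, expected_file):
    """Return what happens to a work item under the given cache mode.

    cache_mode: "automatic", "read", or "write" (hypothetical labels).
    Returns "cooked", "schedule", or "failed".
    """
    if cache_mode == "write":
        # Write Files: always schedule, ignore any file on disk.
        return "schedule"

    exists = os.path.exists(expected_file)
    if cache_mode == "read":
        # Read Files: never schedule; a missing file is a failure.
        return "cooked" if exists else "failed"
    if cache_mode == "automatic":
        # Automatic: reuse the file if present, otherwise do the work.
        return "cooked" if exists else "schedule"
    raise ValueError(f"unknown cache mode: {cache_mode}")
```

The only case that can produce a failed work item without running anything is Read Files with a missing file.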

Expected Outputs From

Determines how expected output files should be specified.

Attribute Name

Specifies the name of the attribute that contains the file path(s).

This parameter is only available when Expected Outputs From is set to Attribute Name.

Custom File Tag

When on, the custom tag value will be assigned to all output files. Otherwise, PDG will use the existing tag for the file if one has been set, or pick one automatically based on the file extension if a tag does not exist.

Expected Outputs

Determines the number of file list entries.

This parameter is only available when Expected Outputs From is set to File List.

Output File

Specifies the path to the file.

Schedulers ¶

TOP Scheduler Override

This parameter overrides the TOP scheduler for this node.

Schedule When

When enabled, this parameter can be used to specify an expression that determines which work items from the node should be scheduled. If the expression returns zero for a given work item, that work item will immediately be marked as cooked instead of being queued with a scheduler. If the expression returns a non-zero value, the work item is scheduled normally.
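The effect of Schedule When can be pictured as a filter over the node's work items. In this sketch, work items are plain dictionaries and the expression is a Python callable; both are stand-ins chosen for illustration, not PDG types.

```python
def partition_by_schedule_when(work_items, expression):
    """Split work items into (scheduled, marked_cooked).

    expression: callable returning a number for each work item.
    Zero means the item is marked as cooked without being queued;
    any non-zero value means it is scheduled normally.
    """
    scheduled, marked_cooked = [], []
    for item in work_items:
        if expression(item) != 0:
            scheduled.append(item)
        else:
            marked_cooked.append(item)
    return scheduled, marked_cooked
```

With items carrying a hypothetical `frame` key and the expression `lambda item: item["frame"] % 2`, only odd frames would be scheduled and even frames would be marked as cooked immediately.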

Work Item Label

Determines how the node should label its work items. This parameter allows you to assign non-unique label strings to your work items which are then used to identify the work items in the attribute panel, task bar, and scheduler job names.

Use Default Label

The work items in this node will use the default label from the TOP network, or have no label if the default is unset.

Inherit From Upstream Item

The work items inherit their labels from their parent work items.

Custom Expression

The work item label is set to the result of the Label Expression parameter, which is evaluated for each work item.

Node Defines Label

The work item label is defined in the node’s internal logic.

Label Expression

When on, this parameter specifies a custom label for work items created by this node. The parameter can be an expression that includes references to work item attributes or built-in properties. For example, $OS: @pdg_frame will set the label of each work item based on its frame value.

Work Item Priority

This parameter determines how the current scheduler prioritizes the work items in this node.

Inherit From Upstream Item

The work items inherit their priority from their parent items. If a work item has no parent, its priority is set to 0.

Custom Expression

The work item priority is set to the value of Priority Expression.

Node Defines Priority

The work item priority is set based on the node’s own internal priority calculations.

This option is only available on the Python Processor TOP, ROP Fetch TOP, and ROP Output TOP nodes. These nodes define their own prioritization schemes that are implemented in their node logic.

Priority Expression

This parameter specifies an expression for work item priority. The expression is evaluated for each work item in the node.

This parameter is only available when Work Item Priority is set to Custom Expression.

Work Item Command

By default, work items use a command line string determined by the node. It’s possible to override the command line by setting this parameter, either by providing a custom script that’s invoked instead of the default one, or by specifying a completely custom command line string.

Use Default

Use the default command line string for the work item, as determined by the node.

Custom Script

Override the job script used to cook tasks in this node. The script is invoked with the same arguments as the default script. The script will be added as a file dependency, and is copied into the working directory of the graph automatically.

Custom Command

Specifies a completely custom command line string that fully replaces the default string.

Command

The command line (executable and arguments) to run when the work item runs. If this field is empty the work item will not be scheduled, and will be instantly marked done once all of its dependencies finish.

Examples ¶

example_top_filedecompress Example for File Decompress TOP node

This example demonstrates how to decompress files using TOPs / PDG.