Catalog Insert

Overview

Known Issue

In Snaplex version 4.38, pipelines using the Catalog Insert Snap no longer populate the Data Catalog tables in Manager with metadata.

Workaround: Do not use Snaplex version 4.38 to run pipelines using the Catalog Insert Snap.


The Catalog Insert Snap enables you to enrich your metadata catalog by inserting metadata into the catalog tables.

Expected Input and Output

  • Expected Input: A document containing metadata to be written into the metadata catalog. This document should have a field named schema with the document schema.
  • Expected Output: A document containing status messages on the result of the insert operation.
  • Expected Upstream Snaps: Required. Any Snap that offers document data. Examples: Parquet Writer and Mapper.
  • Expected Downstream Snaps: Any Snap that accepts document data in its input view. Examples: Mapper and File Writer.
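
As a rough illustration of the expected input, the sketch below models an input document as a Python dictionary and checks that it carries a schema field. Only the field name schema comes from this page. The nested layout of the schema value, the column names, and the has_schema_field helper are assumptions made for illustration, and the Snap's actual schema representation may differ.

```python
# Hypothetical input document. Only the "schema" field name is taken from the
# Snap description; the nested column-to-type layout is an assumption.
sample_document = {
    "schema": {
        "airline_code": "string",
        "flight_number": "string",
    },
}


def has_schema_field(document):
    """Return True if the document is a dict with a non-empty 'schema' field."""
    return isinstance(document, dict) and bool(document.get("schema"))


print(has_schema_field(sample_document))  # True
print(has_schema_field({"rows": 3}))      # False
```

A check like this could sit in an upstream Mapper-style step so that documents without a schema are caught before they reach the Snap.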

Prerequisites

Write access to the SnapLogic data catalog.

Configuring Accounts

Accounts are not used with this Snap.

Configuring Views

Input

This Snap has exactly one document input view.

Output

This Snap has at most one document output view.

Error

This Snap has at most one document error view.

Support for Ultra Pipelines

  • Does not work in Ultra Pipelines.


Snap Settings


Label

Required. The name for the Snap. Modify this to be more specific, especially if there is more than one of the same Snap in the pipeline.
Table Name

Required. The location and name of the table that you want to update. You can either enter this information manually or you can select the table from the suggestible drop-down.

Example: /<Org>/<Project_Space>/<Project>/<Table_Name>

Default value: None
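
The path format in the example above can be assembled from its four parts. The following is a minimal sketch, assuming each part is a plain name without slashes; the build_table_path helper and the sample names are hypothetical and are not part of the Snap.

```python
def build_table_path(org, project_space, project, table_name):
    """Join four names into /<Org>/<Project_Space>/<Project>/<Table_Name>."""
    parts = [org, project_space, project, table_name]
    for part in parts:
        # Reject empty names and names that would add extra path levels.
        if not part or "/" in part:
            raise ValueError(f"invalid path segment: {part!r}")
    return "/" + "/".join(parts)


# Hypothetical organization, project space, project, and table names.
print(build_table_path("MyOrg", "shared_space", "flights", "flight_metadata"))
# /MyOrg/shared_space/flights/flight_metadata
```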

Data location

Required. The location of the file whose metadata you want to insert. This is typically a location in AWS S3, and can be specified as a URL string, a pipeline parameter, or an upstream parameter.

Example: parquetesting1.parquet

Default value: None

Create table if not present

Specifies whether the table is created automatically if it does not already exist. Select this check box to create the table.

Selecting this option creates a table with all columns of type STRING.

Default value: Not selected

Insert Mode

Required. The insert mode to use when loading data into the table. Available options are:

  • OVERWRITE - Loads rows into the target table, replacing the existing rows.
  • APPEND - If data already exists in the table, the new rows are appended to the table. If data does not already exist, the new rows are simply loaded.
  • ERROR_IF_EXISTS - Throws an error if a table with the same name already exists.
  • IGNORE - Does not insert data into the table.
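
To make the differences between the modes concrete, the sketch below applies each mode to an in-memory dictionary that stands in for the catalog. This is an illustrative model of the descriptions above, not the Snap's implementation. The apply_insert_mode function is hypothetical, and creating the table under ERROR_IF_EXISTS when it does not yet exist is an assumption.

```python
def apply_insert_mode(catalog, table_name, new_rows, mode):
    """Apply new_rows to catalog[table_name] following the mode descriptions."""
    exists = table_name in catalog
    if mode == "OVERWRITE":
        # Replace whatever rows the table currently holds.
        catalog[table_name] = list(new_rows)
    elif mode == "APPEND":
        # Add to existing rows, or simply load them if the table is empty.
        catalog.setdefault(table_name, []).extend(new_rows)
    elif mode == "ERROR_IF_EXISTS":
        if exists:
            raise ValueError(f"table already exists: {table_name}")
        # Assumption: the rows are loaded when the table does not exist yet.
        catalog[table_name] = list(new_rows)
    elif mode == "IGNORE":
        # Per the description, no data is inserted.
        pass
    else:
        raise ValueError(f"unknown insert mode: {mode}")
    return catalog


catalog = {"flights": [{"airline_code": "10"}]}
apply_insert_mode(catalog, "flights", [{"airline_code": "20"}], "APPEND")
print(len(catalog["flights"]))  # 2
```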

Partition Keys

The partition keys for which you want to insert the metadata. List these by specifying Key Column and Key Value combinations that identify the precise row and column from which to create a partition.

Default value: None

    Key Column

The name of the column that contains the value that you want to use to specify the partition.

Example: airline_code

Default value: None

    Key Value

The value in the column listed in the Key Column field that you want to use to specify the partition.

Example: 10

Default value: None
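
The Key Column and Key Value pairs can be thought of as a small mapping from column name to partition value. The sketch below is one way to model that, using the example values from this page. The partition_spec helper, the conversion of values to strings, and the rejection of duplicate columns are assumptions for illustration and do not describe how the Snap stores partitions.

```python
def partition_spec(pairs):
    """Turn (key_column, key_value) pairs into a column-to-value mapping."""
    spec = {}
    for column, value in pairs:
        # Assumption: each key column may appear only once.
        if column in spec:
            raise ValueError(f"duplicate key column: {column}")
        spec[column] = str(value)
    return spec


print(partition_spec([("airline_code", 10)]))
# {'airline_code': '10'}
```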

Custom Metadata

Enables you to specify the custom metadata values in Key and Value pairs.

Default value: None

    Key

The key that you want to associate with the new metadata.

Example: airline_region

Default value: None

    Value

The value that you want to associate with the key as part of the new metadata.

Example: APAC

Default value: None

Snap execution

Select one of the three modes in which the Snap executes. Available options are:
  • Validate & Execute: Performs limited execution of the Snap, and generates a data preview during Pipeline validation. Subsequently, performs full execution of the Snap (unlimited records) during Pipeline runtime.
  • Execute only: Performs full execution of the Snap during Pipeline execution without generating preview data.
  • Disabled: Disables the Snap and all Snaps that are downstream from it.

Example


Inserting and Querying Custom Metadata from the Flight Metadata Table

The Pipeline in this zipped example, MetadataCatalog_Insert_Read_Example.zip, demonstrates how you can:

  • Use the Catalog Insert Snap to update metadata tables.
  • Use the Catalog Query Snap to read the updated metadata information.

In this example:

  1. We import a file containing the metadata.
  2. We create a Parquet file using the data in the imported file.
  3. We insert metadata that meets specific requirements into a partition in the target table.
  4. We read the newly inserted metadata using the Catalog Query Snap.


Understanding the Pipeline

The Pipeline is designed as follows: