Catalog Insert

Overview

The Catalog Insert Snap enables you to enrich your metadata catalog by inserting metadata into the catalog tables.

Expected Input and Output

  • Expected Input: A document containing the metadata to be written into the metadata catalog. This document must have a field named schema that contains the document schema.
  • Expected Output: A document containing status messages on the result of the insert operation.
  • Expected Upstream Snaps: Required. Any Snap that provides document data. Examples: Parquet Writer and Mapper.
  • Expected Downstream Snaps: Any Snap that accepts document data in its input view. Examples: Mapper and File Writer.
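The input and output documents can be pictured as small sketches. Apart from the required schema field, every field name and value below is hypothetical and shown only for illustration:

```python
# Input document: metadata to write into the catalog. The "schema" field is
# required and typically comes from an upstream Parquet Writer Snap.
input_doc = {
    "schema": {
        "flight_id": "INT64",       # hypothetical column -> type mapping
        "airline_code": "STRING",
    },
}

# Output document: status messages describing the result of the insert.
# The exact keys here are an assumption, not a documented contract.
output_doc = {
    "status": "SUCCESS",
    "table": "/MyOrg/MySpace/MyProject/flights",  # hypothetical table path
}
```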

Prerequisites

Write access to the SnapLogic data catalog.

Configuring Accounts

Accounts are not used with this Snap.

Configuring Views

Input

This Snap has exactly one document input view.
Output

This Snap has at most one document output view.

Error

This Snap has at most one document error view.

Troubleshooting

  • None.

Known Issues

  • Does not work in Ultra Pipelines.

Snap Settings


Label

Required. The name for the Snap. Modify this to be more specific, especially if there is more than one of the same Snap in the Pipeline.
Table Name

Required. The location and name of the table that you want to update. You can either enter this information manually or you can select the table from the suggestible drop-down.

Example: /<Org>/<Project_Space>/<Project>/<Table_Name>

Default value: None

Data location

Required. The location of the file whose metadata you want to insert. This is typically a location in AWS S3, and can either be specified as a URL string, a pipeline parameter, or an upstream parameter.

Example: parquetesting1.parquet

Default value: None

Create table if not present

Enables you to specify whether the table should be automatically created if not already present. Select this check box to create the table.

Selecting this option creates a table with all columns of type STRING.

Default value: Not selected

Insert Mode

Required. The insert mode to use when loading data into the table. The available modes are:

  • OVERWRITE - Loads rows into the target table, replacing the existing rows.
  • APPEND - If data already exists in the table, the new rows are appended to the table. If data does not already exist, the new rows are simply loaded.
  • ERROR_IF_EXISTS - Throws an error if a table with the same name already exists.
  • IGNORE - Does not insert data into the table.
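The four modes above can be sketched in plain Python, modeling the target table as a list of rows. This is a sketch of the semantics, not of the Snap's implementation; None stands for a table that does not yet exist:

```python
def insert_rows(existing, new_rows, mode):
    """Illustrate the four insert modes. `existing` is None if no table
    with the target name exists yet; otherwise it holds the current rows."""
    if mode == "OVERWRITE":
        return list(new_rows)                     # replace the existing rows
    if mode == "APPEND":
        return (existing or []) + list(new_rows)  # append, or simply load
    if mode == "ERROR_IF_EXISTS":
        if existing is not None:
            raise ValueError("table already exists")
        return list(new_rows)
    if mode == "IGNORE":
        return existing                           # no data is inserted
    raise ValueError(f"unknown insert mode: {mode}")
```

For example, insert_rows(["r1"], ["r2"], "APPEND") returns ["r1", "r2"], while the same call with "OVERWRITE" returns ["r2"].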
Partition Keys

The partition keys for which you want to insert the metadata. You list these by specifying Key Column and Key Value combinations that identify the precise row and column from which to create a partition.

Default value: None

    Key Column

The name of the column that contains the value that you want to use to specify the partition.

Example: airline_code

Default value: None

    Key Value

The value in the column listed in the Key Column field that you want to use to specify the partition.

Example: 10

Default value: None

Custom Metadata

Enables you to specify the custom metadata values in Key and Value pairs.

Default value: None

Key

Enables you to add a key that you want to associate with the new metadata you want to add.

Example: airline_region

Default value: None

Value

Enables you to add the value that you want to associate with the key as part of the new metadata that you want to upload.

Example: APAC

Default value: None
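Taken together, the Partition Keys and Custom Metadata settings amount to two key-value maps attached to the catalog entry. A sketch using the example values above; the table path, field names, and the filter helper are all hypothetical, since the Snap's internal catalog format is not documented:

```python
# Illustrative only: hypothetical representation of the settings above.
catalog_entry = {
    "table": "/MyOrg/MySpace/MyProject/flights",    # hypothetical path
    "partition": {"airline_code": "10"},            # Key Column -> Key Value
    "custom_metadata": {"airline_region": "APAC"},  # Key -> Value
}

def matches_partition(entry, **keys):
    """A Catalog Query-style filter on the partition keys (sketch)."""
    return all(entry["partition"].get(k) == v for k, v in keys.items())
```

A downstream Catalog Query could then select this entry with matches_partition(catalog_entry, airline_code="10").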

Snap Execution

Select one of the three modes in which the Snap executes. Available options are:

  • Validate & Execute: Performs limited execution of the Snap, and generates a data preview during Pipeline validation. Subsequently, performs full execution of the Snap (unlimited records) during Pipeline runtime.
  • Execute only: Performs full execution of the Snap during Pipeline execution without generating preview data.
  • Disabled: Disables the Snap and all Snaps that are downstream from it.

Example


Inserting and Querying Custom Metadata from the Flight Metadata Table

The Pipeline in this zipped example, MetadataCatalog_Insert_Read_Example.zip, demonstrates how you can:

  • Use the Catalog Insert Snap to update metadata tables.
  • Use the Catalog Query Snap to read the updated metadata information.

In this example:

  1. We import a file containing the metadata.
  2. We create a Parquet file using the data in the imported file.
  3. We insert metadata that meets specific requirements into a partition in the target table.
  4. We read the newly-inserted metadata using the Catalog Query Snap.


Understanding the Pipeline

The Pipeline is designed as follows:

The File Reader and JSON Formatter Snaps read flight statistics and convert the output into a JSON file.

The Parquet Writer Snap creates a Parquet file in an S3 bucket using the contents of the JSON file.

The output of the Parquet Writer Snap includes the schema of the file. This is the metadata that must be inserted into the catalog.

The Catalog Insert Snap picks up the schema from the Parquet file and associates it with a specific partition in the target table. It also adds a custom property to the partition.

Once the Snap completes execution, the table is inserted into the metadata catalog and can be viewed in the SnapLogic Manager. To view the table, navigate to the project concerned, click the Table tab, and then click the new table created after you executed the Pipeline. This displays the table. Click Show schema to view the metadata.

The Schema view does not display the custom metadata that you inserted into the partition. Use the Catalog Query Snap to view all the updates made by the Catalog Insert Snap.

Download this ZIP file.

Working with the Sample ZIP File

This ZIP file contains two files:

  • Metadata_Catalog_Insert_Read.slp
  • AllDataTypes.json

To import this Pipeline:

  1. Download the ZIP file and extract its contents into a local directory.
  2. Import the Metadata_Catalog_Insert_Read.slp Pipeline into a SnapLogic project.
  3. Open the Pipeline and click the File Reader Snap.
  4. In the File Reader Settings popup, use the  button to import and read the AllDataTypes.json file.
  5. Your Pipeline and test data are now ready. Review the other steps listed out in this example before validating or executing this Pipeline.

Downloads

Important steps to successfully reuse Pipelines

  1. Download and import the Pipeline into SnapLogic.
  2. Configure Snap accounts as applicable.
  3. Provide Pipeline parameters as applicable.

File: MetadataCatalog_Insert_Read_Example.slp (modified Feb 16, 2022 by Subhajit Sengupta)

Snap Pack History

Release (Snap Pack Version) - Type: Updates

  • 4.29 (main15993) - Stable: Upgraded with the latest SnapLogic Platform release.
  • 4.28 (main14627) - Stable: Upgraded with the latest SnapLogic Platform release.
  • 4.27 (main12833) - Stable: Upgraded with the latest SnapLogic Platform release.
  • 4.26 (main11181) - Stable: Upgraded with the latest SnapLogic Platform release.
  • 4.25 (main9554) - Stable: Upgraded with the latest SnapLogic Platform release.
  • 4.24 (main8556) - Stable: Upgraded with the latest SnapLogic Platform release.
  • 4.23 (main7430) - Stable: Upgraded with the latest SnapLogic Platform release.
  • 4.22 (main6403) - Stable: Upgraded with the latest SnapLogic Platform release.
  • 4.21 (snapsmrc542) - Stable: Upgraded with the latest SnapLogic Platform release.
  • 4.20 (snaprsmrc528) - Stable: Upgraded with the latest SnapLogic Platform release.
  • 4.19 (snaprsmrc528) - Stable: Upgraded with the latest SnapLogic Platform release.
  • 4.18 (snapsmrc523) - Stable: Upgraded with the latest SnapLogic Platform release.
  • 4.17 Patch (ALL7402) - Latest: Pushed automatic rebuild of the latest version of each Snap Pack to SnapLogic UAT and Elastic servers.
  • 4.16 (snapsmrc508) - Stable: New Snap. Added the Catalog Delete Snap to remove tables and table partitions from the Data Catalog.
  • 4.15 (snapsmrc500) - Stable: New Snap Pack: Introducing the Data Catalog Snap Pack with the following Snaps: Catalog Insert and Catalog Query.