Redshift - Table List

Snap Type:

Read

Description:

This Snap outputs a list of tables in a database. The Snap will connect to the database, read its metadata, and output a document for each table found in the database. The table names are output in a topological order so that tables with the fewest dependencies are output first. In other words, if table A has a foreign key reference to table B, then table B will be output before A. The ordering is intended to ease the process of replicating a group of tables from one database to another.
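
The ordering rule above can be illustrated with a topological sort. The following is a minimal sketch in Python, not the Snap's actual implementation; the `foreign_keys` mapping and the table names A and B are hypothetical stand-ins for metadata read from the database.

```python
from graphlib import TopologicalSorter

# Hypothetical foreign key map: each table maps to the set of tables it references.
foreign_keys = {
    "A": {"B"},   # table A has a foreign key reference to table B
    "B": set(),   # table B references no other table
}

# TopologicalSorter treats the mapped values as predecessors, so referenced
# tables are emitted before the tables that reference them.
output_order = list(TopologicalSorter(foreign_keys).static_order())
print(output_order)  # ['B', 'A']
```

Because B is emitted before A, a downstream replication step can create and load B before any row in A needs to reference it.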


  • Expected input: [None]
  • Expected output: Documents with the following fields:
    • name -  The fully-qualified name of the table. To use the table name in another Snap, like a Select or Insert, you can pass it through a ForEach Snap to another pipeline with the Select or Insert.
    • type - The type of table. This value is currently fixed to the string "TABLE".
    • dependents - (If Compute table graph is selected) A list of table names that have references to this table, including this table.

Replicating a Subset of Tables

The output of the Table List Snap can be used directly to replicate an entire database. However, if you are interested in only a subset of tables, you can use a Filter Snap to select those tables along with the tables that they reference. For example, consider the following diamond-shaped table graph, where A depends on B and C, and both B and C depend on D:

       A
      / \
     B   C
      \ /
       D
    

The Table List will output the following documents:

name=D; dependents=[A, B, C, D]
name=C; dependents=[A, C]
name=B; dependents=[A, B]
name=A; dependents=[A]
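
One way to arrive at these lists is to walk the foreign key references from each table and record that table as a dependent of everything it can reach. The sketch below assumes a hypothetical `references` mapping for the diamond graph; it illustrates the idea only and is not taken from the Snap's source.

```python
# Hypothetical foreign key map for the diamond graph: table -> tables it references.
references = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D"],
    "D": [],
}


def compute_dependents(references):
    """Return, for each table, the sorted list of tables that reference it
    directly or indirectly, including the table itself."""
    dependents = {table: {table} for table in references}

    def visit(origin, table):
        # Record `origin` as a dependent of every table reachable from it.
        for target in references[table]:
            if origin not in dependents[target]:
                dependents[target].add(origin)
                visit(origin, target)

    for table in references:
        visit(table, table)
    return {table: sorted(names) for table, names in dependents.items()}


dependents = compute_dependents(references)
for name in ["D", "C", "B", "A"]:
    print(f"name={name}; dependents={dependents[name]}")
```

The check before each recursive call also keeps the walk from looping if tables reference each other in a cycle.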


So, if you wanted to copy just table 'A' and its dependencies, you can add a Filter Snap with the following expression:

 $.dependents.indexOf('A') != -1


The filter will then remove any extra tables that happen to be in the schema.
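
To make the effect of the filter concrete, the following Python sketch applies the same membership test to the four documents above plus a hypothetical unrelated table E. The `keep_for` helper is an illustrative name, not part of SnapLogic.

```python
# Documents as listed above for the diamond graph, plus a hypothetical
# unrelated table E that happens to be in the same schema.
documents = [
    {"name": "D", "dependents": ["A", "B", "C", "D"]},
    {"name": "C", "dependents": ["A", "C"]},
    {"name": "B", "dependents": ["A", "B"]},
    {"name": "A", "dependents": ["A"]},
    {"name": "E", "dependents": ["E"]},
]


def keep_for(table, docs):
    """Mirror $.dependents.indexOf(table) != -1: keep the documents whose
    dependents list contains the given table."""
    return [doc for doc in docs if table in doc["dependents"]]


selected = [doc["name"] for doc in keep_for("A", documents)]
print(selected)  # ['D', 'C', 'B', 'A']
```

Table E is dropped because A does not appear in its dependents list, and the remaining tables keep their original dependency order.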

ETL Transformations & Data Flow

This Snap is a data source. It works by performing a standard JDBC DatabaseMetaData#getTables() query for the tables within the database. The Schema name value is used to populate the schemaPattern parameter in that query.

This Snap does not require any temporary files or other external resources.


Input & Output

  • Input:  [None]

    Any input documents are ignored.

    • Input Schema Provided: No

    • Output: This Snap produces one document per table upon successful execution. The fields are:

      • name -  The fully-qualified name of the table. To use the table name in another Snap, like a Select or Insert, you can pass it through a ForEach Snap to another pipeline with the Select or Insert.
      • type - The type of table. This value is currently fixed to the string "TABLE".
      • dependents - (If Compute table graph is selected) A list of table names that have references to this table, including this table.

    • Output Schema Provided: Yes

    • Preview Supported: Yes

    • Passthrough Supported: No

    • Output Examples:

      • Example output upon successful execution without Compute table graph selected:

        [
          {
            "name": "\"demo\".\"'redshiftbulkload'\"",
            "type": "TABLE"
          },
          {
            "name": "\"demo\".\"'shankaradp'\"",
            "type": "TABLE"
          },
          {
            "name": "\"demo\".\"'shankaradp1'\"",
            "type": "TABLE"
          },
          {
            "name": "\"demo\".\"'shankardemo'\"",
            "type": "TABLE"
          },
          {
            "name": "\"demo\".\"'shankardemo1'\"",
            "type": "TABLE"
          }
        ]
      • Example output upon successful execution with Compute table graph selected:

        [
          {
            "name": "\"demo\".\"account\"",
            "type": "TABLE",
            "dependents": [
              "\"demo\".\"account\""
            ]
          },
          {
            "name": "\"demo\".\"account_transaction\"",
            "type": "TABLE",
            "dependents": [
              "\"demo\".\"account_transaction\""
            ]
          },
          {
            "name": "\"demo\".\"accounts_oy\"",
            "type": "TABLE",
            "dependents": [
              "\"demo\".\"accounts_oy\""
            ]
          }
        ]
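
Once the JSON is decoded, each name value is a pair of double-quoted identifiers such as "demo"."account". If a downstream step needs the schema and table separately, the name can be split as in the Python sketch below. The helper name `split_qualified_name` is hypothetical, and treating a doubled quote as an escaped quote is an assumption based on standard SQL identifier quoting.

```python
import re

# Matches two double-quoted identifiers separated by a dot, e.g. "demo"."account".
# A doubled quote ("") inside an identifier is treated as an escaped quote
# (assumption: standard SQL quoting rules apply).
QUALIFIED_NAME = re.compile(r'^"((?:[^"]|"")*)"\."((?:[^"]|"")*)"$')


def split_qualified_name(name):
    """Split a name such as '"demo"."account"' into (schema, table)."""
    match = QUALIFIED_NAME.match(name)
    if match is None:
        raise ValueError(f"not a quoted schema.table name: {name!r}")
    schema, table = (part.replace('""', '"') for part in match.groups())
    return schema, table


print(split_qualified_name('"demo"."account"'))  # ('demo', 'account')
```

Names that do not follow the quoted schema.table pattern raise a ValueError rather than being split incorrectly.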

Expected upstream Snaps: Any Snap with a document output view. Note: The contents of the input view are ignored, so the input view serves only to sequence operations.

Expected downstream Snaps: Any Snap with a document input view, such as JSON Formatter, Mapper, and so on. The CSV Formatter Snap cannot be connected directly to this Snap since the output document map data is not flat.
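
The flattening that a CSV-style format needs can be pictured as joining the dependents list into a single string. The Python sketch below only illustrates that idea; the `flatten` helper and the ";" separator are assumptions, and in a pipeline the equivalent step would be done by an intermediate Snap such as a Mapper.

```python
def flatten(document, separator=";"):
    """Return a copy of the document in which list values are joined into one string."""
    return {
        key: separator.join(value) if isinstance(value, list) else value
        for key, value in document.items()
    }


document = {
    "name": '"demo"."account"',
    "type": "TABLE",
    "dependents": ['"demo"."account"'],
}
flat = flatten(document)
print(flat)
```

After this step every value is a plain string, so each document maps onto one row with the columns name, type, and dependents.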

Modes

Prerequisites:
  • The Redshift account must specify the Endpoint, Database name, Username, and Password.
  • The Redshift account does not need to specify the S3 Access-key ID, S3 Secret key, S3 Bucket, and S3 Folder.
  • The Redshift account security settings must allow access from the IP address of the Cloudplex or Groundplex.

Limitations and Known Issues: None at the moment.

Configurations:

This Snap uses account references created on the Accounts page of SnapLogic Manager to handle access to this endpoint. See Redshift Account for information on setting up this type of account.


Views:
Input

This Snap has at most one document input view.

All input documents are ignored.

Output

This Snap has exactly one document output view.

The output document contains map data:

  • a "name" field containing the fully-qualified name of the table.
  • a "type" field containing the type of table. This is currently limited to "TABLE".
  • an optional "dependents" field containing a list of tables with foreign key references to this table.

Error

This Snap has at most one document error view and produces zero or more documents in the view.

Troubleshooting: None at the moment.

Settings

Label

Required The name for the Snap. You can modify this to be more specific, especially if you have more than one of the same Snap in your pipeline.

Schema name



The database schema name. If it is not defined, the tables of all schemas are retrieved. The property is suggestible and retrieves the available database schemas when suggesting values.


This field appears to support expressions but does not.

Example: test

Default value: [None]

Expression property: No

Compute table graph

Computes the dependencies among tables and returns each table with a list of the tables that have foreign key references to it (its dependents). Tables are output in order from least dependent to most dependent.

Turning on this option significantly slows down the Snap; leave it off unless you need it.

Default value: Not selected

Snap Execution

Select one of the three modes in which the Snap executes. Available options are:

  • Validate & Execute: Performs limited execution of the Snap, and generates a data preview during Pipeline validation. Subsequently, performs full execution of the Snap (unlimited records) during Pipeline runtime.
  • Execute only: Performs full execution of the Snap during Pipeline execution without generating preview data.
  • Disabled: Disables the Snap and all Snaps that are downstream from it.

Examples


Basic Use Case

The following pipelines show how the Snap functions as a standalone Snap in a pipeline.

  • Below is a preview of the output from the Redshift Table List Snap, showing the table records from all schemas:

Refer to "Redshift Table List_ALL.slp" in the Downloads section for the pipeline reference.

  • Below is a preview of the output from the Redshift Table List Snap, showing the table records from the specified schema:

Refer to "Redshift Table List_1.slp" in the Downloads section for the pipeline reference.

  • Below is a preview of the output from the Redshift Table List Snap, showing the table records from the specified schema along with their dependents:


Refer to "Redshift Table List_2.slp" in the Downloads section for the pipeline reference.

Typical Snap Configurations

  • Listing the tables in all schemas without the graph of dependents, as depicted in the first example of the Basic Use Case, by leaving the "Schema name" empty.
  • Listing the tables in one schema without the graph of dependents, as depicted in the second example of the Basic Use Case, by providing the "Schema name".
  • Listing the tables in one schema with the graph of dependents, as depicted in the third example of the Basic Use Case, by providing the "Schema name" and selecting "Compute table graph".

Downloads

No files shared here yet.


Redshift IAM Account Setup

  • If the EC2 plex (where your Pipeline is running with IAM role), Redshift cluster, and S3 bucket are in the same AWS account, then you must use Redshift Account (normal IAM account).
  • If the EC2 plex (where your Pipeline is running with IAM role) is in one account and the Redshift cluster and S3 bucket are in a different AWS account, you must use Redshift Cross-account IAM role Account to run your Pipelines successfully.

This applies only to the Redshift - Bulk Load, Redshift - Unload, and Redshift - S3 Upsert Snaps.