Redshift - Lookup

Snap Type:

Read

Description:

This Snap looks up records in the target Redshift table and returns a selected set of fields for every matched record. To avoid making a request for every input record, the Snap executes one request for multiple input documents.

JSON paths can be used in the Snap properties; values from the incoming document are substituted into the properties. Documents that are missing a value for a given JSON path are written to the Snap's error view. After a query is executed, the query's results are merged into the incoming document.
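The substitution and error routing described above can be sketched as follows. This is only an illustration, not the Snap's implementation: it handles simple paths such as `$email` or `$address.city`, and the function names and the shape of the error document are hypothetical.

```python
def resolve_path(doc, path):
    """Resolve a simple path such as "$email" or "$address.city" against a dict."""
    if not path.startswith("$"):
        raise ValueError(f"Not a JSON path: {path}")
    current = doc
    for part in path[1:].split("."):
        if not isinstance(current, dict) or part not in current:
            raise KeyError(path)  # the document has no value for this path
        current = current[part]
    return current


def split_by_path(documents, paths):
    """Separate documents that supply every path from those that do not."""
    ready, errors = [], []
    for doc in documents:
        try:
            values = [resolve_path(doc, p) for p in paths]
        except KeyError as exc:
            # Hypothetical error document shape, for illustration only.
            errors.append({"error": f"Missing value for {exc.args[0]}",
                           "original": doc})
        else:
            ready.append((doc, values))
    return ready, errors
```

Documents in `ready` carry the values that would be substituted into the lookup; documents in `errors` correspond to what would be written to the error view.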

Queries produced by the Snap have the format:

SELECT [Output fields] FROM [Table name] WHERE
[C1 = V11 AND C2 = V21 AND ... AND Cn = Vn1] OR
[C1 = V12 AND C2 = V22 AND ... AND Cn = Vn2] OR
...................................... OR
[C1 = V1m AND C2 = V2m AND ... AND Cn = Vnm]

The Snap ignores any duplicated lookup condition in the input document stream since it maintains a cache for lookup conditions internally.
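As a rough illustration of the query format and the duplicate handling described above, the following sketch builds one batched query from several input documents and skips repeated lookup conditions. It is not the Snap's actual code: the function name, the `(column, input key)` representation of lookup conditions, and the `%s` placeholder style are all assumptions made for this example.

```python
def build_lookup_query(table, output_fields, conditions, documents):
    """Build one SELECT for a batch of input documents.

    conditions: list of (column_name, input_key) pairs, combined with AND.
    documents:  list of dicts; each contributes one AND group, joined with OR.
    """
    seen = set()    # stands in for the cache of lookup conditions already used
    groups = []
    params = []
    for doc in documents:
        values = tuple(doc[key] for _, key in conditions)
        if values in seen:
            continue  # duplicated lookup condition: do not repeat it in the query
        seen.add(values)
        clause = " AND ".join(f"{column} = %s" for column, _ in conditions)
        groups.append(f"({clause})")
        params.extend(values)
    select = ", ".join(output_fields) if output_fields else "*"
    sql = f"SELECT {select} FROM {table} WHERE " + " OR ".join(groups)
    return sql, params
```

For example, three input documents with cities Paris, Rome, and Paris again would yield a query with only two OR groups, because the second Paris condition is a duplicate.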

ETL Transformations & Data Flow

The Snap extracts records from a Redshift table based on lookup conditions whose values come from the input document stream or from Pipeline parameters.

Input & Output

  • Input: Each document in the input view should contain a Map of key-value entries. Input data may contain values needed to evaluate expressions in the Object type, Output fields, and Conditions properties. If the Pass-through on no lookup match property is unchecked, make sure the input data types match the column data types in the database table. Otherwise, you may encounter the error message "Cannot find an input data which is related to the output record .....". If the error view is open, all input data in the batch is routed to the error view with the same error information.
  • Output: Each document in the output view contains a Map of key-value entries, where the keys are the values of the Output fields property. The input data that produced the corresponding output data is also included in the output data under the "original" key.


Expected Upstream Snaps: Any Snap which produces documents in the output view, for example CSV Parser, JSON Parser, Structure, Data, and so on.

Expected Downstream Snaps: Any Snap which receives documents in the input view, for example JSON Formatter, Structure, Data, and so on. CSV Formatter will cause an error since the output data is not a flat Map (the "original" key holds a nested Map).
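To see why the output is not flat, and what flattening it would involve, consider the following minimal sketch. It is conceptual only: the `flatten` helper and the dotted key naming are assumptions made for this example, not a SnapLogic feature.

```python
def flatten(doc, prefix="", sep="."):
    """Flatten nested dicts into a single-level dict with dotted key names."""
    flat = {}
    for key, value in doc.items():
        name = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))  # descend into the nested Map
        else:
            flat[name] = value
    return flat


# A lookup output document nests the input under "original".
lookup_output = {"first": "Ann", "last": "Lee", "original": {"city": "Paris"}}
flat_output = flatten(lookup_output)
```

Here `flat_output` has the keys `first`, `last`, and `original.city`, each holding a scalar value, which is the kind of shape a row-oriented format expects.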


Prerequisites:

[None]

Limitations and Known Issues:

Works in Ultra Task Pipelines.

Account:

This Snap uses account references created on the Accounts page of SnapLogic Manager to handle access to this endpoint. See Redshift Account for information on setting up this type of account.

Views:
Input: This Snap allows exactly one document input view and expects documents in the view. Each document should have values for one AND clause in the WHERE statement.

Output: This Snap has exactly one output view and produces documents in the view. The output document includes the corresponding input data under the "original" key. If there are no results from the query, each output field will have a null value.

Error: This Snap has at most one error view and produces zero or more documents in the view.
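The shape of an output document can be sketched as below. This is a minimal illustration under stated assumptions: the `merge_result` helper is hypothetical, and a result row is represented as a plain dict (or `None` when the query returned nothing).

```python
def merge_result(input_doc, row, output_fields):
    """Build one output document from an input document and one result row.

    row is a dict of column values, or None when the query returned no results.
    """
    output = {
        field: (row.get(field) if row is not None else None)
        for field in output_fields
    }
    output["original"] = input_doc  # keep the input data next to the result
    return output
```

With a matching row, the output fields carry the row's values; with no results, every output field is null (`None` in Python) while the input is still available under "original".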

Settings

Label

Required. The name for the Snap. You can modify this to be more specific, especially if you have more than one of the same Snap in your pipeline.

Schema name

The database schema name. Selecting a schema filters the Table name list to show only those tables within the selected schema.

Table name

Required. Enter or select the name of the table to execute the lookup query on.

Example: people

Default value: [None]

Output fields

Required. Enter or select the output field names for the SQL SELECT statement. If this property is empty, the Snap selects all fields by executing the statement "SELECT * FROM ...".

Example: email, address, first, last

Default value: [None]

Lookup conditions


Required. The lookup conditions are created from the lookup column name and the lookup column value. Each row builds one condition, such as lookupColumn1 = $inputField. Additional rows are joined using a logical AND. Together, all rows form the lookup condition used to look up records in the lookup table.

Default value: Not selected

Value


Required. Enter or select the JSON path of the lookup column value. The value will be provided by the input data field.

Example: $email, $first, $last

Default value: [None]

Lookup column name

Required. Enter or select the lookup column name.

Example: email, first, last, etc.

Default value: [None]

Pass-through on no lookup match


When there is no lookup matching an input document, the input document will pass through to the output view if this property is checked. Otherwise, it will be written to the error view as an error condition.

Default value: False
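The effect of this property can be sketched as a small routing function. This is an illustration only: the `route` helper, the representation of matches as a list of dicts, and the error document shape are assumptions made for this example.

```python
def route(input_doc, matches, pass_through):
    """Decide where an input document goes after the lookup.

    matches:      list of result rows (dicts) found for this input document.
    pass_through: value of the "Pass-through on no lookup match" property.
    Returns a tuple (output_view_docs, error_view_docs).
    """
    if matches:
        # One output document per matched record, with the input attached.
        return [dict(row, original=input_doc) for row in matches], []
    if pass_through:
        return [input_doc], []  # no match: forward the input to the output view
    # No match and pass-through disabled: treat as an error condition.
    # Hypothetical error document shape, for illustration only.
    return [], [{"error": "No lookup match", "original": input_doc}]
```

With the default value (False), an unmatched document ends up in the second list, which stands in for the error view.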

Snap Execution

Select one of the three modes in which the Snap executes. Available options are:

  • Validate & Execute: Performs limited execution of the Snap, and generates a data preview during Pipeline validation. Subsequently, performs full execution of the Snap (unlimited records) during Pipeline runtime.
  • Execute only: Performs full execution of the Snap during Pipeline execution without generating preview data.
  • Disabled: Disables the Snap and all Snaps that are downstream from it.


Examples

Basic Use Cases 


The following example looks up the names of people based on their billing city. The city name is passed in by the JSON Generator Snap, and the Lookup Snap retrieves the records whose billing city matches the value $city received from the upstream Snap, as configured in the Lookup conditions.


In the pipeline below, the Redshift Execute Snap reads the records from a table and the Lookup Snap retrieves the records specified by the Lookup conditions.

The Execute Snap reads the records from the table public.lookup_tbl. Its output preview is shown below:


The Lookup Snap retrieves the records from the table lookup_tbl. Its output preview is shown below:

 


Typical Snap Configurations


The key configuration of the Snap lies in how the values for the SQL statement are passed to perform the lookup. The values can be passed:

Without Expressions

The values are passed directly into the Snap.

With Expressions

Using Pipeline parameters: The Table name is passed as a pipeline parameter. 

 

Advanced Use Case


The following pipeline shows how lookup functionality is typically used in an enterprise environment. The pipeline download link is in the Downloads section below.


In this pipeline, the table "COLORS_NEW" in the "TECTONIC" schema is selected from the Oracle DB. Its values and data structure are passed to the Redshift Insert Snap, which inserts them into another table, "colours_rslktest", in the "snappod" schema of the Redshift DB. The Redshift Insert Snap is configured to create the table if it does not exist and to insert the data from the Oracle Select Snap into it. The Redshift Lookup Snap is then used to look up all the records in this table that match the specified Lookup conditions. Output previews and configurations of each of the Snaps used in this pipeline are shown below.

Configuration and the output preview of the Oracle Select Snap respectively:

 


Note that the Oracle Select Snap has two output views: the first output view passes the table's data, whereas the second output view (output1) passes the data structure.

The first output view is shown below. Table data is passed in this view:

 


The second output view (output1) is shown below. The table's data structure is passed to input1 in the Redshift Insert Snap:


 


Configuration and output preview of the Redshift Insert Snap:

 


 


Configuration and output preview of the Redshift Lookup Snap:

 


 


Downloads

  

  • Advanced Use Case_Redshift Lookup.slp (modified Aug 31, 2017 by Aparna Tayi)
  • Basic Use Case_Redshift Lookup.slp (modified Aug 31, 2017 by Aparna Tayi)
  • Redshift Lookup_Advanced Use Case.slp (modified Aug 31, 2017 by Aparna Tayi)
  • Redshift Lookup Test_Basic Use Case.slp (modified Aug 31, 2017 by Aparna Tayi)
  • SQL_to_Redshift_2017_07_25.slp (modified Aug 31, 2017 by Aparna Tayi)
     



Redshift IAM Account Setup

  • If the EC2 plex (where your Pipeline is running with IAM role), Redshift cluster, and S3 bucket are in the same AWS account, then you must use Redshift Account (normal IAM account).
  • If the EC2 plex (where your Pipeline is running with IAM role) is in one account and the Redshift cluster and S3 bucket are in a different AWS account, you must use Redshift Cross-account IAM role Account to run your Pipelines successfully.

This applies only to the Redshift - Bulk Load, Redshift - Unload, and Redshift - S3 Upsert Snaps.