This Snap provides the functionality of SCD (Slowly Changing Dimension) Type 2 on a target Snowflake table. The Snap executes one SQL lookup request per set of input documents to avoid making a request for every input record. Its output is typically a stream of documents for the Snowflake - Bulk Upsert Snap, which updates or inserts rows into the target table. Therefore, this Snap must be connected to the Snowflake - Bulk Upsert Snap to accomplish the complete SCD2 functionality.
Expected input: Each document in the input view should contain a data map of key-value entries. The input data must contain data in the Natural Key (primary key) and Cause-historization fields.
Expected output: Each document in the output view contains a data map of key-value entries for all fields of a row in the target Snowflake table.
Expected upstream Snaps: Any Snap, such as a Mapper or JSON Parser Snap, whose output contains a map of key-value entries.
Expected downstream Snaps: The Snowflake - Bulk Upsert Snap must be used as the downstream Snap, because the Snowflake SCD2 Snap only generates the set of rows to be inserted or updated and does not perform any write operation on the table.
Security Prerequisites: You should have the following permissions in your Snowflake account to execute this Snap:
For more information on Snowflake privileges, refer to Access Control Privileges.
This Snap uses the SELECT command internally. It enables querying the database to retrieve a set of rows.
This Snap uses account references created on the Accounts page of SnapLogic Manager to handle access to this endpoint. See Snowflake Account for information on setting up this type of account.
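The batching idea described at the top of this page (one lookup per set of input documents instead of one per record) can be pictured with the sketch below. It is only an illustration, not the query the Snap actually issues: the function name `build_lookup_query`, the `CURRENT_ROW` column, and the `%s` placeholder style are all assumptions made for the example.

```python
from typing import Iterable, List, Tuple


def build_lookup_query(
    table: str, natural_key: str, current_flag: str, batch: Iterable[dict]
) -> Tuple[str, List[object]]:
    """Build one parameterized SELECT covering every natural key in the batch."""
    # Collect distinct key values, preserving first-seen order.
    keys = list(dict.fromkeys(doc[natural_key] for doc in batch))
    if not keys:
        raise ValueError("batch contains no documents")
    placeholders = ", ".join(["%s"] * len(keys))
    # Hypothetical query shape: fetch only the current row for each key.
    sql = (
        f"SELECT * FROM {table} "
        f"WHERE {natural_key} IN ({placeholders}) AND {current_flag} = %s"
    )
    return sql, keys + [True]


# Three input documents, two distinct keys: a single lookup is built.
docs = [{"id": 1}, {"id": 2}, {"id": 1}]
sql, params = build_lookup_query('"TestSchema"."TestTable"', "id", "CURRENT_ROW", docs)
```

Binding the key values as parameters, rather than formatting them into the SQL text, keeps the sketch safe for arbitrary key values.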
Input | This Snap has exactly one document input view. |
---|---|
Output | This Snap has exactly one document output view. |
Error | This Snap has at most one document error view. |
None.
None.
Label | Required. The name for the Snap. Modify this to be more specific, especially if there is more than one of the same Snap in the Pipeline. | |
---|---|---|
Schema name | The name of the schema containing the target table. Providing the schema name along with the table name in the Table name field is sufficient. This is a suggestible field that lists all available schemas in the configured account. Default value: None Example: "TestSchema" | |
Table name | Required. The name of the target table. Syntax is "<schema_name>"."<table_name>". This is a suggestible field that lists all available tables in the schema provided in the Schema name field. If the Schema name field is blank, it lists all tables within the account. Default value: None Example: "TestSchema"."TestTable"
| |
Natural key | Names of fields that identify a unique row in the target table. The identity key cannot be used as the Natural key, since a current row and its historical rows cannot have the same identity key value. Default value: None Example: id (Each record has to have a unique value) | |
Cause-historization fields | Names of fields where any change in value causes the historization of an existing row and the insertion of a new current row. Default value: None Example: gold bullion rate | |
SCD fields | Required. The historical and updated information for the Cause-historization field. Click + to add SCD fields. By default, there are four rows in this field-set:
| |
Meaning | Specifies the table columns that are to be updated for implementing the SCD2 type transformation. Default value:
| |
Field | The fields in the table will contain the historical information. Default value: None Below are the values that must be configured for each row:
| |
Value | The value to be assigned to the current or historical row. For date-related rows, the default is Date.now(). Default value:
The Value field should be configured as follows:
| |
Ignore unchanged rows | Specifies whether the Snap must ignore writing unchanged rows from the source table to the target table. If you enable this option, the Snap generates a corresponding document in the target only if the Cause-historization column in the source row is changed. Else, the Snap does not generate any corresponding document in the target. Default value: Not selected | |
Number of retries | The number of times that the Snap must try to write the fields in case of an error during processing. An error is displayed if the maximum number of tries has been reached. Default value: 0 | |
Retry interval (seconds) | The time interval, in seconds, between subsequent retry attempts. Default value: 1 | |
Auto Historization Query | This field-set specifies the fields that are used to historize table data. Historization follows the sort order specified, so each field must be sortable. You can add multiple fields here; historization occurs when even one of the fields is changed. | |
Field | The name of the field. This is a suggestible field and suggests all the fields in the target table. Example: Invoice_Number Default value: N/A
| |
Sort Order | The order in which the selected field is to be historized. Available options are:
Default value: Ascending Order | |
Manage Queued Queries | Select this property to decide whether the Snap should continue or cancel the execution of the queued Snowflake Execute SQL queries when you stop the pipeline.
Default value: Continue to execute queued queries when pipeline is stopped or if it fails | |
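Taken together, the Natural key, Cause-historization fields, SCD fields, and Ignore unchanged rows settings can be pictured with the simplified model below. It is a sketch under stated assumptions, not the Snap's implementation: the function `scd2_rows` and the column names `CURRENT_ROW`, `START_DATE`, and `END_DATE` are hypothetical, and re-emitting an unchanged row as the current row is an assumption about what happens when Ignore unchanged rows is not selected.

```python
from datetime import datetime, timezone
from typing import Dict, List, Optional, Sequence

Row = Dict[str, object]


def scd2_rows(
    incoming: Row,
    current: Optional[Row],
    cause_fields: Sequence[str],
    ignore_unchanged: bool = False,
    now: Optional[datetime] = None,
) -> List[Row]:
    """Return the rows to hand to an upsert step for one incoming document."""
    now = now or datetime.now(timezone.utc)

    # No current row exists for this natural key: emit a new current row.
    if current is None:
        return [{**incoming, "CURRENT_ROW": True, "START_DATE": now, "END_DATE": None}]

    changed = any(incoming.get(f) != current.get(f) for f in cause_fields)
    if not changed:
        # Assumption: an unchanged row is skipped, or re-emitted as the current row.
        if ignore_unchanged:
            return []
        return [{**current, **incoming, "CURRENT_ROW": True}]

    # A cause-historization field changed: close the old row, open a new one.
    historical = {**current, "CURRENT_ROW": False, "END_DATE": now}
    new_current = {**incoming, "CURRENT_ROW": True, "START_DATE": now, "END_DATE": None}
    return [historical, new_current]
```

The function only produces rows; writing them to the table is left to a separate step, which mirrors the division of work between this Snap and the Snowflake - Bulk Upsert Snap.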
This example demonstrates how you can use the Snowflake SCD2 Snap to auto-historize records. In this example, since the existing record in the Snowflake table is the latest, the incoming records are historized.
This Pipeline performs the following operations:
Before we start, let us look at the target table and understand some of its columns that are necessary for the Pipeline: We focus on the highlighted columns above to demonstrate auto-historization and describe their function in the table.
This Pipeline is configured to send records into the target table. The File Reader Snap is configured to read a CSV file that contains the records. The downstream CSV Parser Snap parses the CSV file read by the File Reader Snap. Below is a preview of this file:
Based on the values of HistoryStDate and HistoryEndDate, it is clear that the existing record in the target table is the latest (or current) record.
Since the CSV Parser Snap outputs every value as a string, each value has to be converted to the appropriate data type. Parsing and data mapping are done using the Mapper Snap, as shown below:
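The kind of conversion the Mapper performs at this step can be pictured with a short sketch. The Mapper itself uses SnapLogic expressions rather than Python, and the field names `POINT_ID`, `RATE`, and `RECORD_DATE`, as well as the date format, are hypothetical placeholders rather than values taken from the sample file.

```python
from datetime import datetime
from typing import Callable, Dict

# Hypothetical schema: field name -> converter for the CSV string value.
CONVERTERS: Dict[str, Callable[[str], object]] = {
    "POINT_ID": int,
    "RATE": float,
    "RECORD_DATE": lambda s: datetime.strptime(s, "%Y-%m-%d"),
}


def map_record(raw: Dict[str, str]) -> Dict[str, object]:
    """Convert string values from a parsed CSV row into typed values."""
    typed: Dict[str, object] = {}
    for field, value in raw.items():
        convert = CONVERTERS.get(field, str)  # unlisted fields stay as strings
        typed[field] = convert(value.strip())
    return typed
```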
This mapped data is then sent to the Snowflake SCD2 Snap.
The Snowflake SCD2 Snap performs SCD2 operations on the target table. We configure it as shown below:
Let us take a look at the highlighted Snap fields and how they affect the Snap functionality in this example:
All incoming records pertaining to a POINT ID are historized. The value F is assigned under the FLAG column to these records, and the corresponding STARTDATE and ENDDATE are evaluated by the expression Date.now().
This can be seen in the SCD2 Snap's output preview:
The Snap identifies the current and historical records and this data is now ready to be updated and inserted into the target table.
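The way current and historical records are told apart in this example can be approximated as follows. This is a rough model, not the Snap's algorithm: it assumes that the record sorting last on the chosen field is the current one, the sort field `RECORD_DATE` and the function `auto_historize` are hypothetical, and the flag value `T` for the current record is an assumption (only `F` appears in this example).

```python
from datetime import datetime, timezone
from typing import Dict, List, Optional

Row = Dict[str, object]


def auto_historize(
    existing: Row, incoming: List[Row], sort_field: str, now: Optional[datetime] = None
) -> List[Row]:
    """Flag every record except the one that sorts last as historical."""
    now = now or datetime.now(timezone.utc)
    # Ascending sort: the last record is treated as the current one.
    ordered = sorted([existing, *incoming], key=lambda r: r[sort_field])
    latest = ordered[-1]
    out: List[Row] = []
    for row in ordered:
        if row is latest:
            out.append({**row, "FLAG": "T"})  # assumed flag for the current row
        else:
            # Historical rows get F and date values taken from "now".
            out.append({**row, "FLAG": "F", "STARTDATE": now, "ENDDATE": now})
    return out
```

When the existing record sorts after every incoming record, as in this example, all incoming records come out flagged `F`.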
We use the Snowflake Bulk Upsert Snap to update the target table with this historized data. We configure the Snowflake Bulk Upsert Snap as shown below:
Download this Pipeline and sample data. This is a compressed file; unzip it to extract its contents before importing them into SnapLogic.