Overview
This Snap provides the functionality of SCD (Slowly Changing Dimension) Type 2 on a target Snowflake table. You can use this Snap to execute one SQL lookup request per set of input documents to avoid making a request for every input record. Its output is typically a stream of documents for the Snowflake - Bulk Upsert Snap, which updates or inserts rows into the target table. Therefore, this Snap must be connected to the Snowflake - Bulk Upsert Snap to accomplish the complete SCD2 functionality.
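The difference between one lookup per record and one lookup per set of input documents can be sketched as follows. This is an illustration only, not the Snap's implementation: it uses Python's built-in sqlite3 module as a stand-in for Snowflake, and the table `dim_customer`, its columns, and the function `lookup_current_rows` are hypothetical names chosen for the example.

```python
import sqlite3

def lookup_current_rows(conn, natural_keys):
    """Fetch the current rows for a whole batch of natural keys in one query."""
    if not natural_keys:
        return {}
    # One placeholder per key, so the batch is resolved by a single SELECT.
    placeholders = ", ".join("?" for _ in natural_keys)
    sql = (
        "SELECT customer_id, city FROM dim_customer "
        f"WHERE is_current = 1 AND customer_id IN ({placeholders})"
    )
    rows = conn.execute(sql, list(natural_keys)).fetchall()
    return {customer_id: city for customer_id, city in rows}

# Hypothetical target table: one historical row and two current rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dim_customer (customer_id INTEGER, city TEXT, is_current INTEGER)"
)
conn.executemany(
    "INSERT INTO dim_customer VALUES (?, ?, ?)",
    [(1, "Oslo", 0), (1, "Bergen", 1), (2, "Lima", 1)],
)

# A set of input documents; key 3 has no row in the target yet.
batch = [{"customer_id": 1, "city": "Bergen"}, {"customer_id": 3, "city": "Pune"}]
current = lookup_current_rows(conn, [doc["customer_id"] for doc in batch])
```

Here `current` maps each natural key found in the target to its current value, so every document in the batch can be compared locally without issuing further queries.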
...
Snap Type
The Snowflake SCD2 is a Read-type Snap that enables you to execute multiple queries as a single atomic unit.
...
Works in Ultra Pipelines. However, we recommend that you not use this Snap in an Ultra Pipeline.
Known Issues
Because of performance issues, all Snowflake Snaps now ignore the "Cancel queued queries when pipeline is stopped or if it fails" option for Manage Queued Queries, even when it is selected. Snaps behave as though the default option, "Continue to execute queued queries when the Pipeline is stopped or if it fails", were selected.
We plan to address this issue in a patch for the next monthly release in December.
Snap Views
Type | Format | Number of Views | Examples of Upstream and Downstream Snaps | Description |
---|---|---|---|---|
Input | Document |
|
| A document in the input view should contain a data map of key-value entries. The input data must contain data in the Natural Key (primary key) and Cause-historization fields. |
Output | Document |
|
| A document in the output view contains a data map of key-value entries for all fields of a row in the target Snowflake table. |
Error | Error handling is a generic way to handle errors without losing data or failing the Snap execution. You can handle the errors that the Snap might encounter while running the Pipeline by choosing one of the following options from the When errors occur list under the Views tab. The available options are:
Learn more about Error handling in Pipelines. |
Snap Settings
Field Name | Field Type | Description | ||
---|---|---|---|---|
Label* Default Value: Snowflake - SCD2 | String | Specify the name for the Snap. You can modify this to be more specific, especially if you have more than one of the same Snap in your pipeline. | ||
Schema name Default Value: N/A | String/Expression | Specify the database schema name. If no schema is specified, the suggestions for Table Name include the table names from all schemas. This property is suggestible and retrieves the available database schemas. | ||
Table Name Default Value: N/A | String/Expression | Specify the name of the table in the instance. The table name is suggestible and requires an account setting. The target table should have the following three columns for field historization to work:
Use the ALTER TABLE command to add these columns to your target table if they are not present. | ||
Natural key* Default Value: N/A | String/Expression | Specify the names of the fields that identify a unique row in the target table. The identity key cannot be used as the Natural key, because a current row and its historical rows must share the same natural key value, whereas an identity key is unique to each row. | ||
Cause-historization fields Default Value: N/A | String/Expression | Specify the names of fields where any change in value causes the historization of an existing row and the insertion of a new current row. | ||
SCD fields | The historical and updated information for the Cause-historization field. Click + to add SCD fields. By default, there are four rows in this field-set:
| |||
Meaning* Default Value:
| Dropdown list | Specifies the table columns that are to be updated for implementing the SCD2 type transformation. | ||
Field* Default Value: N/A | String/Expression | Specify the fields in the table that will contain the historical information. Below are the values that must be configured for each row:
By default, the start and end date for both Current row and Historical row are null. After the Snap is executed, the start date for the updated row data automatically becomes the end date for the earlier version of the data (Historical row). | ||
Value* Default Value:
| String/Expression | Specify the value to be assigned to the current or historical row. For date-related rows, the default is The Value field should be configured as follows:
| ||
Ignore unchanged rows Default Value: Deselected | Checkbox | Specifies whether the Snap skips writing unchanged rows from the source table to the target table. If you enable this option, the Snap generates a corresponding document for the target only if a Cause-historization column in the source row has changed. Otherwise, the Snap does not generate a document for that row. | ||
Number of Retries Default Value: 0 | Integer/Expression | The number of times that the Snap must try to write the fields in case of an error during processing. An error is displayed if the maximum number of tries has been reached. | ||
Retry Interval (Seconds) Default Value: 1 | Integer/Expression | The time interval, in seconds, between subsequent retry attempts. | ||
Auto Historization Query | Use this field-set to specify the fields that are used to historize table data. Historization follows the sort order specified, so make sure that each field is sortable. You can add multiple fields here; historization occurs when even one of the fields changes. | |||
Field* Default Value: N/A | String/Expression | Specify the name of the field. This is a suggestible field and suggests all the fields in the target table. If this field has null values in the incoming records, then the value in the Snowflake table is treated as the current value and the incoming record is historized. | ||
Sort Order* Default Value: Ascending Order | Dropdown list | The order in which the selected field is to be historized. Available options are:
| ||
Input Date Format Default Value: Continue to execute the snap with given input Date format | Dropdown list | The property has the following two options:
| ||
Manage Queued Queries Default Value: Continue to execute queued queries when the Pipeline is stopped or if it fails | Dropdown list | Select whether the Snap should continue or cancel the execution of the queued Snowflake Execute SQL queries when you stop the pipeline. If you select "Cancel queued queries when pipeline is stopped or if it fails", the read queries under execution are canceled, whereas the write queries under execution are not. Snowflake internally determines which queries are safe to cancel and cancels those queries. | ||
Snap Execution Default Value: Validate & Execute | Dropdown list |
Select one of the following three modes in which the Snap executes:
|
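The historization behavior described in the settings above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the Snap's actual logic: the function `historize` and the column names `start_date`, `end_date`, and `is_current` are hypothetical, and what is emitted for an unchanged row when Ignore unchanged rows is deselected is an assumption.

```python
from datetime import date

def historize(current_row, incoming, cause_fields, today, ignore_unchanged=True):
    """Return the documents to upsert for one incoming record."""
    # No current row in the target: the incoming record becomes the current row.
    if current_row is None:
        return [{**incoming, "start_date": today, "end_date": None, "is_current": True}]

    # A change in any cause-historization field triggers historization.
    changed = any(current_row.get(f) != incoming.get(f) for f in cause_fields)
    if not changed:
        # Mirrors "Ignore unchanged rows": emit nothing when it is enabled.
        # Assumption: otherwise pass the merged row through unchanged in status.
        return [] if ignore_unchanged else [{**current_row, **incoming}]

    # Close the existing row: its end date is the start date of the new row.
    historical = {**current_row, "end_date": today, "is_current": False}
    new_current = {**incoming, "start_date": today, "end_date": None, "is_current": True}
    return [historical, new_current]

# Hypothetical current row in the target table.
existing = {
    "customer_id": 1,
    "city": "Oslo",
    "start_date": date(2020, 1, 1),
    "end_date": None,
    "is_current": True,
}
docs = historize(existing, {"customer_id": 1, "city": "Bergen"}, ["city"], date(2024, 5, 1))
unchanged = historize(existing, {"customer_id": 1, "city": "Oslo"}, ["city"], date(2024, 5, 1))
```

In this sketch, `docs` holds two documents (the closed historical row and the new current row), which is the kind of stream a downstream upsert step would consume, while `unchanged` is empty because no cause-historization field differs.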
Examples
Historizing Incoming Records
...
This Pipeline performs the following operations:
Info |
---|
The Target Table: Before we start, let us look at the target table and understand the columns that the Pipeline requires. We focus on the highlighted columns above to demonstrate auto-historization and describe their function in the table.
|
...