...

Table of Contents

maxLevel	2
absoluteUrl	true

Overview

You can use this Snap to implement the Slowly Changing Dimension Type 2 (SCD2) function in your Snowflake, Redshift, Azure Synapse, Databricks Lakehouse Platform, or BigQuery table. Configure the database type and the access to corresponding tables in the Configuring ELT Database Accounts for the Snap.

Warning

title	Internal Change in Snap behavior

Starting from November 2021, 4.27 release, the ELT SCD2 Snap performs a MERGE INTO-based SCD2 operation instead of the previous Insert-and-then-Update approach. It is important that you take notice of this change in behavior. However, if you prefer to use the Insert-and-then-Update approach for your SCD2 operation, define the Pipeline parameter SCD2_INSERTUPDATE_FALLBACK_MODE with a value greater than zero (0), before running your Pipeline.

Image Removed

Prerequisites

Valid accounts and access permissions to connect to the following:

Source: AWS S3, Azure Cloud Storage, or Google Cloud Storage.
Target: Snowflake, Redshift, Azure Synapse, Databricks Lakehouse Platform, or BigQuery.

Limitation

Multiexcerpt include macro

name	ME_ELT_GBQ_StandardSQL_UOI
page	ELT Copy

Known Issues

If the last Snap in the Pipeline takes 2 to 5 seconds to update the runtime, the ELT Pipeline statistics are not displayed even after the Pipeline is completed. The UI does not auto-refresh to display the statistics after the runtime.
Workaround: Close the Pipeline statistics window and reopen it to see the ELT Pipeline statistics.
When you return to the Snap Statistics tab from the Extra Details tab in the Pipeline Execution Statistics pane, it contains the status bar (Pipeline execution status) instead of the Download Query Details hyperlink and the individual counts of Records Added, Records Updated, and Records Deleted.
ELT Pipelines built using an ELT SCD2 Snap fail to perform the DROP AND CREATE TABLE and ALTER TABLE operations on Delta tables when working with a Databricks SQL persona on the AWS Cloud with the error Operation not allowed: ALTER TABLE RENAME TO is not allowed for managed Delta tables on S3. However, the same action runs successfully with the Data Science and Engineering persona on the AWS Cloud.
- This is due to the limitation with the Databricks SQL Admin Console that does not allow you to add the configuration parameter spark.databricks.delta.alterTable.rename.enabledOnAWS true to its SQL Warehouse Settings.
Due to an issue with BigQuery table schema management (the time travel feature), an ALTER TABLE action (Add or Update column) that you attempt after deleting a column (DROP action) in your BigQuery target table causes the table to break and the Snap to fail.
- As a workaround, you can consider either avoiding ALTER TABLE actions on your BigQuery instance or creating (CREATE) a temporary copy of your table and deleting (DROP) it after you use it.
When using the ELT SCD2 Snap to load your source data into a Redshift target instance, the Snap fails if you use constant values such as ‘2022-01-01’ or SQL expressions such as GETDATE() and current_time() that evaluate to constant values instead of the target table column names in the Target Table Sort Fields.
In the case of Azure Synapse, if you are configuring the Start_Date column as a Target Table Natural Key, the Snap fails in each of the following scenarios:
- The source table/file contains one or more null values.
- The target table is empty.
- The End date of current row or End date of historical row has a static value, for example: '2021-01-01'.
When your Databricks Lakehouse Platform instance uses Databricks Runtime Version 8.4 or lower, ELT operations involving large amounts of data might fail due to the smaller memory capacity of 536870912 bytes (512MB) allocated by default. This issue does not occur if you are using Databricks Runtime Version 9.0.

Multiexcerpt include macro

name	ME_ELT_KI_Underscore_Refs
page	ELT Snap Pack

Snap Input and Output

...

Document

...

Min: 0
Max: 1

...

ELT Join
ELT Copy
ELT Transform

...

Document

...

Min: 0
Max: 1

...

ELT Select

...

A document containing the status of the SCD2 Load operation along with the SQL query that you can use in downstream ELT Snaps if any.

Snap Settings

Info

title	SQL Functions and Expressions for ELT

You can use the SQL Expressions and Functions supported for ELT to define your Snap or Account settings with the Expression symbol = enabled, where available. This list is common to all target CDWs supported. You can also use other expressions/functions that your target CDW supports.

Info

title	Understanding the ELT SCD2 terminology

See ELT SCD2 Scenarios | Terminology for definitions of the terms used in this Snap's configuration.

...

Multiexcerpt include macro

name	getpreviewdata
page	ELT Intersect

...

Overview

You can use this Snap to implement the Slowly Changing Dimension Type 2 (SCD2) function in your Snowflake, Redshift, Azure Synapse, Databricks Lakehouse Platform, or BigQuery table. Configure the database type and the access to corresponding tables in the Configuring ELT Database Accounts for the Snap.

Note

Switching the Snap behavior

The ELT SCD2 Snap performs a MERGE INTO-based SCD2 operation as opposed to its initial INSERT-and-then-UPDATE approach. If you prefer to use the INSERT-and-then-UPDATE approach for your SCD2 operation, define the Pipeline parameter SCD2_INSERTUPDATE_FALLBACK_MODE with a value greater than zero (0), before running your Pipeline.

Image Added

Snap Type

The ELT SCD2 Snap is a write-type Snap that writes SCD2 data from a source location to a supported CDW destination.

Prerequisites

Valid accounts and access permissions to connect to the following:

Source: AWS S3, Azure Cloud Storage, or Google Cloud Storage.
Target: Snowflake, Redshift, Azure Synapse, Databricks Lakehouse Platform, or BigQuery.

Limitation

Cfm tabs page

colorPalettes	Default
primaryTab	true
tabsPageTitle	Google BigQuery
color	#0052CC
hasBorder	true
id	tckventgizk
backgroundColorPalette	Default

ELT Snap Pack does not support Legacy SQL dialect of Google BigQuery. We recommend that you use only the BigQuery's Standard SQL dialect in this Snap.

Known Issues

Cfm tabs page

colorPalettes	Default
tabsPageTitle	All CDWs
primaryTab	true
color	#0052CC
hasBorder	true
id	wnu8rx0dqp9
backgroundColorPalette	Default

If the last Snap in the Pipeline takes 2 to 5 seconds to update the runtime, the ELT Pipeline statistics are not displayed even after the Pipeline is completed. The UI does not auto-refresh to display the statistics after the runtime.
Workaround: Close the Pipeline statistics window and reopen it to see the ELT Pipeline statistics.
When you return to the Snap Statistics tab from the Extra Details tab in the Pipeline Execution Statistics pane, it contains the status bar (Pipeline execution status) instead of the Download Query Details hyperlink and the individual counts of Records Added, Records Updated, and Records Deleted.
In any of the supported target databases, this Snap does not appropriately identify nor render column references beginning with an _ (underscore) inside SQL queries/statements that use the following constructs and contexts (the Snap works as expected in all other scenarios):
- WHERE clause (ELT Filter Snap)
- WHEN clause
- ON condition (ELT Join, ELT Merge Into Snaps)
- HAVING clause
- QUALIFY clause
- Insert expressions (column names and values in ELT Insert Select, ELT Load, and ELT Merge Into Snaps)
- Update expressions list (column names and values in ELT Merge Into Snap)
- Secondary AND condition
- Inside SQL query editor (ELT Select and ELT Execute Snaps)

Info

Workaround

Precede this Snap with an ELT Transform Snap to re-map the '_' column references to suitable column names (that do not begin with an _ ) and reference the new column names in the next Snap.

Cfm tabs page

colorPalettes	Default
tabsPageTitle	Amazon Redshift
color	#0052CC
hasBorder	true
id	ynhvqzkdmz
backgroundColorPalette	Default

When using the ELT SCD2 Snap to load your source data into a Redshift target instance, the Snap fails if you use constant values such as ‘2022-01-01’ or SQL expressions such as GETDATE() and current_time() that evaluate to constant values instead of the target table column names in the Target Table Sort Fields.

Cfm tabs page

colorPalettes	Default
tabsPageTitle	Azure Synapse
color	#0052CC
hasBorder	true
id	1l2ibmdeddei
backgroundColorPalette	Default

In the case of Azure Synapse, if you are configuring the Start_Date column as a Target Table Natural Key, the Snap fails in each of the following scenarios:

The source table/file contains one or more null values.
The target table is empty.
The End date of current row or End date of historical row has a static value, for example: '2021-01-01'.

Cfm tabs page

colorPalettes	Default
tabsPageTitle	DLP
color	#0052CC
hasBorder	true
id	3oqljw4a2cy
backgroundColorPalette	Default

ELT Pipelines built using an ELT SCD2 Snap fail to perform the DROP AND CREATE TABLE and ALTER TABLE operations on Delta tables when working with a Databricks SQL persona on the AWS Cloud with the error Operation not allowed: ALTER TABLE RENAME TO is not allowed for managed Delta tables on S3. However, the same action runs successfully with the Data Science and Engineering persona on the AWS Cloud.
- This is due to the limitation with the Databricks SQL Admin Console that does not allow you to add the configuration parameter spark.databricks.delta.alterTable.rename.enabledOnAWS true to its SQL Warehouse Settings.
When your DLP instance uses Databricks Runtime Version 8.4 or lower, ELT operations involving large amounts of data might fail due to the smaller memory capacity of 536870912 bytes (512MB) allocated by default. This issue does not occur if you are using Databricks Runtime Version 9.0.
In case of DLP where CSV files do not have a header (column names), a simple query like SELECT * FROM CSV.`/mnt/csv1.csv` returns default names such as _c0, _c1, _c2 for the columns which this Snap cannot interpret. To avoid this scenario, you can:
- Write the data in the CSV file to a DLP table beforehand, as in: CREATE TABLE csvdatatable (a1 int, b1 int,…) USING CSV `/mnt/csv1.csv` where a1, b1, and so on are the new column names.
- Then, read the data from this new table (with column names a1, b1, and so on) using a simple SELECT statement.
In case of DLP, all ELT Snaps' preview data (during validation) contains a value with precision higher than that of the actual floating point value (float data type) stored in the Delta. For example, 24.123404659344 instead of 24.1234. However, the Snap reflects the exact values during Pipeline executions.

Cfm tabs page

colorPalettes	Default
tabsPageTitle	Google BigQuery
color	#0052CC
hasBorder	true
id	q1jzzpt3pmh
backgroundColorPalette	Default

Due to an issue with BigQuery table schema management (the time travel feature), an ALTER TABLE action (Add or Update column) that you attempt after deleting a column (DROP action) in your BigQuery target table causes the table to break and the Snap to fail.
- As a workaround, you can consider either avoiding ALTER TABLE actions on your BigQuery instance or creating (CREATE) a temporary copy of your table and deleting (DROP) it after you use it.

Snap Views

Type

Format

Number of Views

Examples of Upstream and Downstream Snaps

Description

Input

Document

Min: 0
Max: 1

ELT Join
ELT Copy
ELT Transform

A document containing an SQL query that yields the historization data to load.

Output

Document

Min: 0
Max: 1

ELT Select

A document containing the status of the SCD2 Load operation along with the SQL query that you can use in downstream ELT Snaps if any.

Error

Error handling is a generic way to handle errors without losing data or failing the Snap execution. You can handle the errors that the Snap might encounter when running the Pipeline by choosing one of the following options from the When errors occur list under the Views tab:

Stop Pipeline Execution: Stops the current Pipeline execution if the Snap encounters an error.
Discard Error Data and Continue: Ignores the error, discards that record, and continues with the remaining records.
Route Error Data to Error View: Routes the error data to an error view without stopping the Snap execution.

Learn more about Error handling in Pipelines.

Snap Settings

Info

Asterisk ( * ): Indicates a mandatory field.
Suggestion icon (): Indicates a list that is dynamically populated based on the configuration.
Expression icon ( ): Indicates the value is an expression (if enabled) or a static value (if disabled). Learn more about Using Expressions in SnapLogic.
Add icon ( ): Indicates that you can add fields in the fieldset.
Remove icon ( ): Indicates that you can remove fields from the fieldset.
Upload icon ( ): Indicates that you can upload files.
You can use the SQL Expressions and Functions supported for ELT to define your Snap or Account settings with the Expression icon enabled, where available. Though this list covers only expressions that are common to all supported target CDWs, you can use (type-in manually) other expressions/functions permitted by your target CDW.
See ELT SCD2 Scenarios | Terminology for definitions of the ELT SCD2 terms used in this Snap's configuration.

Field Name

Field Type

Field Dependency

Description

Label*

Default Value: N/A
Example: Orders_SCD2_Data

String

None.

Specify a name for the Snap. You can modify this to be more specific, especially if you have more than one of the same Snap in your Pipeline.

Get preview data

Default Value: Not selected
Example: Selected

Checkbox

None.

Select this checkbox to include a preview of the query's output. The Snap performs limited execution and generates a data preview during Pipeline validation.

Info
In case of ELT Pipelines, only the SQL query flows through the Snaps but not the actual source data. Hence, the preview data for a Snap is the result of executing the SQL query that the Snap has generated in the Pipeline.

The number of records displayed in the preview (upon validation) is the smaller of the following:

Number of records available upon execution of the SQL query generated by the Snap.
The value set in the Preview Data Count setting (default is 50 records).

Note

Rendering Complex Data Types in Databricks Lakehouse Platform

Based on the data types of the fields in the input schema, the Snap renders the complex data types like map and struct as object data type and array as an array data type. It renders all other incoming data types as-is except for the values in binary fields are displayed as a base64 encoded string and as string data type.

Database Name

Default Value: N/A
Example: TestDB

String/Expression

None.

Specify the name of the database in which the target table exists. Leave this field blank if you want to use the database name specified in the

...

Default Database Name field

...

of your Snap Account settings.

...

Schema Name (Not applicable to Databricks Lakehouse Platform)*

Default Value: N/A
Example: Sales

String/Expression/Suggestion

...

None.

Specify

...

the name of the database schema.

...

Alternatively, choose from the suggested list of schemas associated with the database

...

Multiexcerpt include macro

name	ME_Schema_Name
page	ELT Insert-Select

...

.

Note
Ensure that you include the exactly same schema name including the double quotes, if used, when you repeat the schema name in the Target Table Name field. Leave this field blank if your target database is Databricks Lakehouse Platform.

Target Table Name*

Default Value: N/A
Example: SALES_ORDERS

String/Expression/Suggestion

...

None.

Specify

...

the name of the SCD2

...

table or view in which you want to load the data.

...

Alternatively,

...

choose from the suggested list of table names associated with the schema

...

.

Info
Only views that can be updated (have new rows) are listed as suggestions. So, Join views are not included. This also implies that the Snap account user has the Insert privileges on the views listed as suggestions.

...

Multiexcerpt include macro

name	ME_Schema_And_Table_Names
page	ELT Insert-Select

...

Note

Ensure that you include the same schema name, if at all, including the double quotes as specified in the Schema Name field.
If the target table does not exist during run-time, the Snap creates one with the name that you specify in this field and writes the data into it. During Pipeline validation, the Snap creates the new table but does not write any records into it.
- The new table thus created will not be dropped in the event of a subsequent/downstream Snap failure during validation.

...

The target table or view should have the following columns for field

...

historization to work:
- Required. One column to denote the starting date of the current row. For example, "START_DATE".
- Required. One column to denote when the row was historized. For example, "END_DATE".

...

- For an active row, it can be null. For a historical row it has the value that indicates it was effective

...

- until that date.
- Optional. One column to demarcate whether a row is a current row or not. For example, "CURRENT_ROW". For the

...

- current row, the value would be true or 1.
- Required when current row is specified. One column to demarcate whether a row is a historical row or not. For example, "HISTORICAL_ROW". For the

...

- historical row, the value would be false or 0.
Use the ALTER TABLE command to add these columns in your target table if they are missing.

...

Input Source Alias

Default Value: N/A
Example: src

String/Expression

None.

Specify an alias name for your source data table.

...

Target Table Action

Default Value: Merge data into target table
Example: Alter table

Drop-down list

None.

Select the SCD2 action to perform on the target table.

Available options are:

Merge data into target table. Loads the data from the source files into the specified target table.
Drop and Create table. Drops the target table if it exists, creates a new table with the columns provided in the

...

Table Columns field set, and then loads the data from the source files into the target table. Activates the

...

Table Columns

...

field set.
Alter table. Modifies the schema of the target table based on the configuration of the Modifier

...

field in the

...

Table Columns field set and loads the data from the source files into the target table. Activates the

...

Table Columns

...

field set with the Column, Data Type,

...

and

...

Modifier

...

fields.
Overwrite existing

...

Info

title	During Pipeline Validation

The Snap does not modify the existing tables during Pipeline validation, but if the specified target table does not exist, it creates a new target table.

...

table. Overwrites the target table with the data from the source table/files.

Note

Selecting the Target Table Action

Use the Drop and Create table option in this field only to create the target table for the first time. After the first execution of your Pipeline, ensure that you change the Target Table Action to Merge data into target table.
We recommend your utmost discretion in the intermittent use of the Alter table option (schema changes to target table).

...

Info

During Pipeline Validation

The Snap does not modify the existing tables during Pipeline validation, but if the specified target table does not exist, it creates a new target table.

Table Columns

Field set

Target Table Action is Drop and Create table or Alter table.

This field set enables you to configure the schema of the target table. You can use this field set to

...

create the target table or to add or drop columns from it.

This field set consists of the following fields:

Column
Data Type
Modifier

Column

Default Value: N/A
Example: ID, FNAME

String/Suggestion

Same as Table Columns.

Specify the name of the column that you want to load in the target table.

You can also specify the columns to drop if you select the

...

Alter table

...

option in

...

the Target Table Action

...

field.

...

Data Type

Default Value: N/A

...

ID

FNAME

...

Example: INT, VARCHAR

String

Same as Table Columns.

Specify the data type for the values in the specified column.

...

INT

VARCHAR

Modifier

Default Value: Add
Example: Drop

Drop-down list

...

Activates when you select the Alter table option in the Target Table Action field.

Select whether you want to add/drop the specified column.

Available options are:

Add
Drop

...

Target Table Action is Alter table.

Select whether you want to add/drop the specified column.

Available options are:

Add. To add a new column to the target table with the specified column name and data type.
Drop. To delete the column from the target table. This option deletes all data existing in the specified column.

Null Value Behavior

Default Value: Honor nulls as distinct values
Example: Ignore nulls

Drop-down list

None.

Select an option to specify how you want the Snap to treat rows with null values in Natural Key or Cause-Historization fields of the source table. The available options are:

Honor nulls as distinct values: To treat each null as a value in the field. And as a result, equate all null values (as the same value) for all practical purposes.
Ignore nulls: To treat any rows with nulls in Natural Key and Cause-Historization fields of the source table as absent and proceed with the SCD2 operation based on the values available in the remaining rows.
Error on nulls: To abort the Snap execution when a null value is found in the source data. The Snap writes the error (message) to the Pipeline runtime statistics and displays a corresponding error message in the Snap.

Refer to the ELT SCD2 scenarios to learn more about the impact of the selection you make in this field.

...

Invalid Row Handling

Default Value: Ignore invalid rows

Example: Insert duplicate invalid rows

Drop-down list

None.

Select an option to specify how you want the Snap to treat invalid rows, if any, in the source data. A row is considered invalid when it does not result in a new, valid, and coherent SCD2 entry/row in the target table.

The available options are:

Ignore invalid rows: To treat invalid rows as absent and proceed with the SCD2 operation based on the valid values available in the remaining fields. The Snap also ignores inserting any potentially invalid rows into the target table.
Insert duplicate invalid rows: To treat all rows in a uniform manner—allowing an entry for each invalid row, in the target table— and optionally flagging such invalid entries in the target table as invalid (Invalid = true or 1 or a similar boolean value) based on the Invalid historical row flag specified in the Target Table Temporal Field(s) fieldset.
Error on invalid rows: To abort the Snap execution when an invalid row is found in the source data. The Snap writes the error (message) to the Pipeline runtime statistics and displays a corresponding error message in the Snap.

Refer to the ELT SCD2 scenarios to learn more about the impact of the selection you make in this field.

...

Target Table Natural Key(s)*

...

This field set enables you to configure

...

the names of fields that identify a unique row in the target table. The identity key cannot be used as

...

the Natural key, since a current row and its historical rows cannot have the same identity value.

...

Specify the value in the following field for each new row added in this field set.

Target Table Natural Key(s)

Target Table Natural Key(s)

Default Value:N/A

Example: CUSTOMER_ID

String/Expression/Suggestion

Specify the name of the Natural Key column from the target table.

...

Target Table Cause-Historization Field(s)*

...

This field set enables you

...

to configure the names of fields where any change in value causes the historization of an existing row and the insertion of a new 'current' row.

...

Specify the value in the following field for each new row added in this field set.

Target Table Cause-Historization Field(s)

Target Table Cause-Historization Field(s)*

Default Value:N/A

Example: LAST_ORDER_AMT

String/Expression/Suggestion

...

None.

Specify the name of the historization-causing column from the target table.

...

Target Table Temporal Field(s)*

...

This field set enables you

...

to configure the historical and updated information for the Cause-historization field. Click

...

to add SCD fields. Specify the values in the following fields for each SCD field row in the field set.

Meaning
Field
Value

Meaning*

Default Value:Five rows with one list item in each row
(Without repetition and End date of historical row).

Example: Start date of current row

Drop-down list

...

None.

...

Select one from the following six permitted values for this field in each row. At the minimum, ensure that you configure appropriate values for the first three meanings, only one of the end dates, and the flag in this list.

Current row (Required)
Historical row (Required)
Start date of current row (Required)
End date of current row (Selected, by default)
End date of historical row
- Remove/replace End date of current row with this item, if you want to specify this date.
Invalid historical rows flag (Optional)

Info

...

Breaking change

In 4.26 GA, we have updated the last option in this

...

field—Invalid historical row flag—and it may cause your Pipelines to fail. See Snap Pack History for more information.

...

Five rows with one list item in each row
(without repetition and End date of historical row).

.

...

Start date of current row

...

Field*

Default Value:N/A

Examples: Start_Date

End_Date

Current_Row

Historical_Row

String/Expression/Suggestion

...

Required. Configure the field name from the target table that would contain the historical information. Click Image Removed to select an item from the suggested list of items.

...

None.

Associate with each Meaning selected the field name from the target table that would contain the historical information.

For Current row: The name of the column in the target table that holds the flag for the historized field.
For Historical row: The name of the column in the target table that holds the flag for the historized field. It has to be the same as the value configured for the Current row field.
For Start date of current row: The name of the column in the target table for denoting the start date for the current row.
For End date of current row: The name of the column in the target table

...

for denoting the end date for the current row. In this case, the start date of the current row specified above is automatically filled in as the end/closing date for the matching historical record.
For End date of historical row: The name of the column in the target table

...

for

...

For Start date of current row: The name of the column in the target table for denoting the start date for the current row.

...

For End date of current row: The name of the column in the target table for denoting the end date for the current row. In this case, the start date of the current row specified above is automatically filled in as the end/closing date for the matching historical record.

For End date of historical row: The name of the column in the target table for denoting the end date for the historical rows.

Info
If you are specifying the End date of historical row, we recommend that you specify an appropriate expression in the corresponding Value or Expression field. See Expressions to use for End date of Historical Row to choose the right expression from a set of expression templates.

...

While inserting an invalid row in the target table, the Snap inserts the specified value in the Invalid Row flag column to indicate that the row is invalid.
If no invalid rows are inserted, the Invalid Row flag column value remains empty for all valid rows.

...

denoting the end date for the historical rows.

Info
If you are specifying the End date of historical row, we recommend that you specify an appropriate expression in the corresponding Value or Expression field. See Expressions to use for End date of Historical Row to choose the right expression from a set of expression templates.

For Invalid historical row flag: The name of a nullable column that holds the flag for an invalid historical row. If you specify a non-null value for this field:
- While inserting an invalid row in the target table, the Snap inserts the specified value in the Invalid Row flag column to indicate that the row is invalid.
- If no invalid rows are inserted, the Invalid Row flag column value remains empty for all valid rows.

See the Troubleshooting section in this document and ELT SCD2 Scenarios to ensure that you have configured these fields in your Snap's settings properly.

Info
The date fields must be configured with a field and a corresponding value. Pass null as the value to indicate the field's value is null.

Value or Expression*

Default Value:N/A

Examples:
For date fields:
src.start_date,
'2021-12-31',
CURRENT_DATE,
null

For row fields:
'true', 'false'

String/Expression

None.

Configure the value for each of the fields selected in the Field field.

Info
The date fields must be configured with a field and a corresponding value.

...

Required. Configure the value for each of the fields selected in the Field field.

Info
The date fields must be configured with a field and a corresponding value. Pass null as the value to indicate the field's value is null.

Info
For the End date of historical row, we recommend that you specify an appropriate expression in this field. See Expressions to use for End date of Historical Row to choose the right expression from a set of expression templates.

...

For date fields:
src.start_date,
'2021-12-31',
CURRENT_DATE,
null

For row fields:
'true' or 'false'

Pass null as the value to indicate the field's value is null.

...

Start_Date

End_Date

Current_Row

Historical_Row

For the End date of historical row, we recommend that you specify an appropriate expression in this field. Refer to Expressions to use for End date of Historical Row to specify the right expression from a set of expression templates.

Here are a few more examples that you can use to define the expressions in this field based on your target CDW:

Snowflake: CURRENT_DATE(), CURRENT_DATE, or DATE.NOW(). Refer to Snowflake Documentation for more information.
Amazon Redshift: CURRENT_DATE, GETDATE(), or DATE.NOW(). Refer to Redshift Documentation for more information.
Azure Synapse: currentDate. Refer to Data flow: Date and time functions for more information.
DLP: current_date. Refer to Alphabetical list of built-in functions in Databricks SQL for more information.
Google BigQuery: CURRENT_DATE() or CURRENT_DATE. Refer to Date functions for more information.

Target Table Sort Field(s)*

...

This field set enables you

...

to sort the records in the target table, by one or more historization fields (for example, by natural key). This mapping is for columns that are not part of the Target Table

...

Temporal Fields. Specify the values in the following fields for each sort order row.

Sort Field
Sort Order
Null

...

Value Sort Preference

Sort Field

Default Value:N/A

Example: start_date

GENDER

String/Expression/Suggestion

Specify the column by which to sort the target table data set.

...

start_date

GENDER

Sort Order

Default Value: ASC

Example: DESC

Drop-down list

Choose either of the sort orders—ASC (ascending) or DESC (descending)—for the target table data set.

...

Null Value Sort Preference

Default Value: NULLS LAST

Example: NULLS FIRST

Drop-down list

Choose the position for the null values, if any, in

...

the selected column while sorting—NULLS FIRST (at the beginning) or NULLS LAST (at the end).

Info
This field is not applicable if you are working with an Azure Synapse target database.

...

Target-Source Column Mapping

This field set enables you to define a one-to-one mapping between the columns in the source and target tables for inserting new rows into the target table.

...

This field set contains the following fields:

Target Table Column Name
Source Column Mapping Expression

Target Table Column Name

Default Value: N/A
Example: ORD_AMT

String/Expression/Suggestion

None.

Choose a column from the target table's schema to map with a column from the source data set (schema).

Note
Ensure that the target table column name you specify for mapping with a source column is not already configured as a target table SCD field

...

.

Source Column Mapping Expression

Default Value: N/A
Example: src.ORD_AMT

...

Source Column Mapping Expression

String/Expression

None.

Choose a column from the source data set (schema) that corresponds to the respective column in the target table's schema. You can alternatively specify an expression for the source column name.

Note
If you have specified the target table column as a target table natural key or a target table cause-historization field already in this Snap, ensure that you specify the corresponding source column mapping expression (not null nor left blank). All other columns not mentioned in the Snap can be mapped to null

...

.

...

Pipeline Execution Statistics

As a Pipeline executes, the Snap shows the following statistics updating periodically. You can monitor the progress of the Pipeline as each Snap performs

...

execution.

Records Added
Records Updated
Records Deleted

You can view more information

...

by clicking the Download Query Details link.

...

Image Removed

...

This downloads a JSON file that contains the count of rejected records besides the above counts.

Image Added

Info
The statistics are also available in the output view

...

of the child ELT Pipeline.

Troubleshooting

Cfm tabs page

colorPalettes	Default
primaryTab	true
tabsPageTitle	Common Errors
color	#0052CC
id	r544rg6r8n
hasBorder	true
backgroundColorPalette	Default

Error	Reason	Resolution
Ambiguous SCD column mapping	You cannot map a field that is configured in Target Table SCD Fields again in the Target-Source Column Mapping field set.	Ensure that the target table column name you specify for mapping with a source column is not already configured as a target table temporal field.
Invalid target column - source expression mapping	You cannot map a target table column to null or a blank source column value/expression.	If you have specified the target table column as a target table natural key or a target table cause-historization field already in this Snap, ensure that you specify the corresponding source column mapping expression (not null nor left blank). All other columns not mentioned in the Snap can be mapped to null.
Database cannot be blank. (When seeking the suggested list for Schema Name field)	Suggestions in the Schema Name and Target Table Name fields do not work when you have not specified a valid value for the Database Name field in this Snap.	Specify the target Database Name in this Snap to view and choose from a suggested list in the Schema Name and Target Table Name fields respectively.
Invalid row flag not configured for target table	Invalid historical row flag is not defined in Target Table Temporal Field(s) field set.	Add an entry in Target Table Temporal Field(s) field set for the Meaning Invalid historical row flag with the corresponding field name in the target table and the boolean value (true or false) or an expression that resolves to a boolean true or false. Make sure that the target table contains the specified field.
Target table sort column %s not mapped to a source column or an expression	Target table sort fields should be mapped to a source column or an expression in one of the two places: Target Table Temporal Field(s) field set or Target-Source Column Mapping field set.	Map this column either in Target Table Temporal Field(s) field set or Target-Source Column Mapping field set before you can use it for sorting.
Invalid end date configuration for SCD2	You have provided both End date of current row and End date of historical row while one of them is adequate.	Remove one of End date of current row and End date of historical row from the Target Table Temporal Field(s) field set and try again.
Ambiguous source data (Only when you configure the Snap to check for nulls and duplicates.)	Source data contains nulls in one or more of columns you specified in the Target Table Natural Key(s) field set, Target Table Cause-Historization Field(s)

...

field set, or the Start date of current row in Target Table Temporal Field(s) field set.

Peruse your source data to ensure that it is valid and contains valid values required to perform the SCD2 operation and try again.

Source data has ambiguous records

(Only when you configure the Snap to check for nulls and duplicates.)

Source data contains duplicate keys in one or more of columns you specified in the Target Table Natural Key(s) field set, Target Table Cause-Historization Field(s)

...

field set, or the Start date of current row in Target Table Temporal Field(s) field set.	Peruse your source data to ensure that it is valid and contains only unique entries required for the SCD2 operation in these columns and try again.
Incorrect start date in source data (Only when you configure the Snap to check for nulls and duplicates.)	The column you specified for the Start date of current row in Target Table Temporal Field(s) field set has one or more nulls.	Ensure that your source data or the start date SQL expression resolves to a valid start date and try again.
Incorrect start date(s) in source data (Only when you configure the Snap to check for nulls and duplicates.)	Multiple rows for one or more Target Table Natural Keys that you defined contain the same start date.	Ensure that your source data contains only unique start dates across the rows for the same natural key. Perform this check for all natural keys you specified, before trying again.
Scenario: Sort order by a date column in the target table is incorrect after the SCD2 operation is complete.	Sorting the date values in a column with date data

...

type may yield different results when compared to sorting the same date values in a column with string data type.

Ensure that the values loaded in the target table column are of the same data type as defined for the respective column/field.

Cfm tabs page

colorPalettes	Default
tabsPageTitle	Snowflake
color	#0052CC
id	baxunwbt72f
backgroundColorPalette	Default

Error	Reason	Resolution
SQL exception

...

: Syntax error in one or more positions in the SQL query.

Column names in Snowflake tables are case-sensitive. It stores all columns in uppercase unless they are surrounded by quotes during the time of creation in which case, the exact case is preserved. See, Identifier Requirements — Snowflake Documentation.

Ensure that you follow the same casing for the column table

...

names across the Pipeline.

Cfm tabs page

colorPalettes	Default
tabsPageTitle	DLP
color	#0052CC
id	t4hxlxlzk3
backgroundColorPalette	Default

Error

Reason

Resolution

[Simba][SparkJDBCDriver](500051) ERROR processing query/statement. Error Code: 0

Cannot create table ('<schema name>`.

...

`<table name>`'). The associated location (

...

`…<table name>`) is not empty but it's not a Delta table

...

.

The specified location contains one of the following:

A non-Delta table (such as CSV, ORC, JSON, PARQUET)
A corrupted table
A Delta table with a different table schema

So, the Snap/Pipeline cannot overwrite this table with the target table as needed.

Ensure that you take appropriate action (mentioned below) on the existing table before running your Pipeline again (to create another Delta table at this location).

Move or drop the existing table from the schema manually using one of the following commands:

Access the DBFS through a terminal and run:

dbfs mv dbfs:/<current_table_path> dbfs:/<new_table_path> to move the table or
dbfs rm -r dbfs:/<table_path> to drop the table.

OR

Use a Python notebook and run:

dbutils.fs.mv(from: String, to: String, recurse: boolean = false): boolean to move the table/file/directory or
dbutils.fs.rm(dir: String, recurse: boolean = false): boolean to drop the table/file/directory.

Examples

Loading Historical Data From an S3 Bucket to a Snowflake Table

The following Pipeline is designed to view and capture Type 2 SCD data from a folder in S3 bucket and load it into the target table in Snowflake database.

...

Image Added

We configure an ELT Select Snap to capture SCD2 data from the source table SCD2_NEW_DATA in S3. The image below depicts the data available in the source table.

...

Image Added

ELT Select Snap
Image Modified
Snap Output

...

Image Added

We use this Snap's output as the input to an ELT SCD2 Snap and configure this Snap to update the table SCD2_TRG_TEST5 residing in the Snowflake database.

The image below depicts the data available in the target table, before the data load.

Existing Data in Target Table

...

ELT SCD2 Snap
Image Added
Image Modified

...

Target Table after SCD2 operation
Image Added

Notice the difference in the number of rows and the values in the END_DATE and the CURRENT_ROW columns. There is only one row with CURRENT_ROW = true in the updated table for each unique CUSTOMER_ID.

Download this Pipeline.

Downloads

Note

...

Important Steps to Successfully Reuse Pipelines

Download and import the Pipeline into SnapLogic.
Configure Snap accounts as applicable.
Provide Pipeline parameters as applicable.

Attachments

patterns	.slp, .zip

...

Snap Pack History

Refer to the Snap Pack History table in ELT Snap Pack.

...

Versions Compared

Old Version 51

New Version Current

Key

Overview

Prerequisites

Limitation

Known Issues

Snap Input and Output

Snap Settings

Overview

Switching the Snap behavior

Snap Type

Prerequisites

Limitation

Known Issues

Snap Views

Snap Settings

Pipeline Execution Statistics

Troubleshooting

Examples

Loading Historical Data From an S3 Bucket to a Snowflake Table

Downloads

Important Steps to Successfully Reuse Pipelines

See Also

Snap Pack History

Related Content

Page Comparison

Versions Compared

Old Version 51

New Version Current

Key

Overview

Prerequisites

Limitation

Known Issues

Snap Input and Output

Snap Settings

Overview

Switching the Snap behavior

Snap Type

Prerequisites

Limitation

Known Issues

Snap Views

Snap Settings

Pipeline Execution Statistics

Troubleshooting

Examples

Loading Historical Data From an S3 Bucket to a Snowflake Table

Downloads

Important Steps to Successfully Reuse Pipelines

See Also

Snap Pack History

Related Content