ELT Merge Into

In this article

Overview

Use this Snap to perform the MERGE INTO operation on the specified table. This operation updates, inserts, and deletes rows in/from the target table based on values in a source table. You can set match conditions in the Snap based on which the merge into operation is carried. The Snap creates a table and inserts the data if the target table does not exist.

After successfully running the Pipeline with this Snap, you can use this Snap's output view to verify whether the data would be inserted correctly in the target table.

Prerequisites

  • A valid SnapLogic account to connect to the database in which you want to perform the MERGE INTO operation. 
  • Appropriate database permissions to read, insert, update, or delete records.
  • Appropriate database permissions to create a table.
  • Appropriate read permissions for the source table that you want to use in the MERGE INTO operation. 

Limitations

  • Though Snowflake allows any number of merge (insert, update, and delete) actions in any combination, you can configure a maximum of three Merge-into Statements in this Snap. The ELT Merge Into Snap returns an error when you specify more than 3 actions, more than 2 WHEN MATCHED actions, or more than 1 WHEN NOT MATCHED action.

  • ELT Snap Pack does not support Legacy SQL dialect of Google BigQuery. We recommend that you use only the BigQuery's Standard SQL dialect in this Snap.

Known Issues

  • When your Databricks Lakehouse Platform instance uses Databricks Runtime Version 8.4 or lower, ELT operations involving large amounts of data might fail due to the smaller memory capacity of 536870912 bytes (512MB) allocated by default. This issue does not occur if you are using Databricks Runtime Version 9.0.
  • When you configure an ELT Merge Into Snap to perform an Update or Delete operation or an ELT Execute Snap with a MERGE INTO statement that performs Update or Delete operation on a Databricks Lakehouse Platform cluster, it may return an error if multiple source rows attempt to update or delete the same target row. To prevent such errors, you need to preprocess the source table to have only unique rows.

  • ELT Pipelines created prior to 4.24 GA release using one or more of the ELT Insert Select, ELT Merge Into, ELT Load, and ELT Execute Snaps may fail to show expected preview data due to a common change made across the Snap Pack for the current release (4.26 GA). In such a scenario, replace the Snap in your Pipeline with the same Snap from the Asset Palette and configure the Snap's Settings again.
  • Suggestions displayed for the Schema Name field in this Snap are from all databases that the Snap account user can access, instead of the specific database selected in the Snap's account or Settings.

  • ELT Pipelines targeting a Databricks Lakehouse Platform (DLP) instance might fail due to a very long or complex SQL query that they build. As a workaround, you can set an advanced (URL) property useNativeQuery to 1 in your ELT Database Account configuration as shown below:
  • In any of the supported target databases, this Snap does not appropriately identify nor render column references beginning with an _ (underscore) inside SQL queries/statements that use the following constructs and contexts (the Snap works as expected in all other scenarios):
    • WHERE clause (ELT Filter Snap)
    • WHEN clause
    • ON condition (ELT Join, ELT Merge Into Snaps)
    • HAVING clause
    • QUALIFY clause
    • Insert expressions (column names and values in ELT Insert Select, ELT Load, and ELT Merge Into Snaps)
    • Update expressions list (column names and values in ELT Merge Into Snap)
    • Secondary AND condition
    • Inside SQL query editor (ELT Select and ELT Execute Snaps)

Workaround

As a workaround while using these SQL query constructs, you can:

  • Precede this Snap with an ELT Transform Snap to re-map the '_' column references to suitable column names (that do not begin with an _ ) and reference the new column names in the next Snap, as needed.
  • In case of Databricks Lakehouse Platform where CSV files do not have a header (column names), a simple query like SELECT * FROM CSV.`/mnt/csv1.csv` returns default names such as _c0, _c1, _c2 for the columns which this Snap cannot interpret. To avoid this scenario, you can:
    • Write the data in the CSV file to a DLP table beforehand, as in: CREATE TABLE csvdatatable (a1 int, b1 int,…) USING CSV `/mnt/csv1.csv` where a1, b1, and so on are the new column names.
    • Then, read the data from this new table (with column names a1, b1, and so on) using a simple SELECT statement.
  • In case of Databricks Lakehouse Platform, all ELT Snaps' preview data (during validation) contains a value with precision higher than that of the actual floating point value (float data type) stored in the Delta. For example, 24.123404659344 instead of 24.1234. However, the Snap reflects the exact values during Pipeline executions.

Snap Input and Output

Input/OutputType of ViewNumber of ViewsExamples of Upstream and Downstream SnapsDescription
Input 

Document

  • Min: 1
  • Max: 1
  • ELT Transform
  • ELT Select
The source table you want to compare with the target table before updating, inserting, or deleting the values in the target table. 
Output

Document

  • Min: 0
  • Max: 1
  • ELT Transform
  • ELT Select

A document containing the SQL query executed on the target database and the rows data inserted in the target table (after running the Pipeline successfully). 

Snap Settings

SQL Functions and Expressions for ELT

You can use the SQL Expressions and Functions supported for ELT to define your Snap or Account settings with the Expression symbol = enabled, where available. This list is common to all target CDWs supported. You can also use other expressions/functions that your target CDW supports.

Parameter NameData TypeDescriptionDefault ValueExample 
LabelString
Specify a name for the Snap. You can modify this to be more specific, especially if you have more than one of the same Snap in your Pipeline.
ELT Merge IntoMerge Sales Data
Database NameStringThe name of the database in which the target table exists. Leave this blank if you want to use the database name specified in the Default Database Name field in the account settings.N/ATESTDB
Schema NameStringRequiredThe name of the database schema. In case it is not defined, then the suggestion for the schema name retrieves all schema names in the specified database when you click the suggest button.
  • Ensure that you include the exactly same schema name including the double quotes, if used, when you repeat the schema name in the Target Table Name field.
  • Leave this field blank if your target database is Databricks Lakehouse Platform.
N/ASALES
Target Table NameString

RequiredThe name of the table in which you want to perform the MERGE INTO operation. 

  • Ensure that you include the exactly same schema name, if at all, including the double quotes as specified in the Schema Name field.

  • If the target table does not exist, the Snap creates one with the name that you specify in this field and writes the data into it.
  • You can specify the table name without using double quotes (""). However, they must be used if you want to include special characters such as hyphens (-) in the table name.
  • A table name must always start with an alphabet.
  • Integers and underscores (_) can also be a part of the table name.
  • All characters are automatically converted to upper-case at the backend. Use double-quotes to retain lower casing.
N/ASALES_ORDERS
Target Table Hash Distribution Column (Azure Synapse Only)String/Expression

Specify the Hash distribution column name for the target table in Azure Synapse, if you configure INSERT action in the Snap.

Azure Synapse needs a table to be always Hash distributed, because the MERGE INTO statement on Azure Synapse supports only Hash distributed tables and does not support other types of target tables. See Troubleshooting section for related information on an edge case in creating a table in Azure Synapse.

If the target table is created outside the Snap or if you do not use Merge Into statement, you need not specify any value in this field.

N/Avar table
Target Table AliasString/Expression

Specify the alias name for the target table.
You can use the alias names in MERGE INTO statements, instead of using the lengthy target table names.

The Snap is equipped with the ability to auto-replace the actual table names (with the alias name), if any, used in the ON clause condition, secondary AND conditions, Update Expression list, or Insert Expression list. This applies to Snowflake, Redshift, and Synapse databases.

N/ASLS_ORDRS
Input Source AliasString

Enter the alias name that you want to use for the source table.

The alias name is optional. It is useful in cases where the source table's column and the target table's column names are the same. For example:

src.order_id = sales.order_id

In such cases, the database cannot resolve the column ambiguity and the table qualifier is required to resolve it.

N/Asrc
ON ConditionString

RequiredThe condition based on which you want to check the values in the target table. You can also use Pipeline parameters in this field to bind values. However, you must be careful to avoid SQL injection. See Preventing SQL Injection for details.

N/Asrc.PROD_ID=SALES.PROD_ID
Merge-into Statements

Use this field set to specify the conditions that activate the MERGE INTO operation and the additional conditions that must be met. Specify each condition in a separate row. Click  to add a row.

This field set contains the following fields:

  • When Clause
  • Condition
  • Action
  • The Snap allows the following combinations of actions:
    • INSERT
    • UPDATE
    • DELETE
    • UPDATE AND DELETE
    • UPDATE AND INSERT
    • DELETE AND INSERT
    • UPDATE, DELETE AND INSERT
When ClauseString/Expression/Suggestion

Specify the matching condition based on the outcome of the ON ConditionAlternatively, click  to view and select a clause from the suggestion list.

Available options are:

  • WHEN MATCHED: Applies the specified condition and action when the source data matches with the target.
  • WHEN NOT MATCHED: Applies the specified condition and action when the source data does not match with the target.
  • WHEN NOT MATCHED BY SOURCE: (Available only for Azure Synapse.) Applies the specified condition and action when the target rows do not match the source rows.

Redshift and Snowflake supports the following actions in Merge-into statements:

When clauseAction
WHEN MATCHEDUPDATE or DELETE
WHEN NOT MATCHEDINSERT

Azure Synapse supports the following actions in Merge-into statements:

When clauseAction
WHEN MATCHEDUPDATE or DELETE
WHEN NOT MATCHEDINSERT
WHEN NOT MATCHED BY SOURCEUPDATE or DELETE
WHEN MATCHEDWHEN NOT MATCHED
ConditionString

Specify the additional criteria that must be specified.
The action associated for the specified condition is not performed if the condition's criteria is not fulfilled. It can be a combination of both source and target tables, source table only, target table only, or may not contain references to any table at all.

Having this additional condition allows the Snap to identify whether the UPDATE or DELETE action must be performed (since both the actions correspond to the WHEN MATCHED clause).

You can also use Pipeline parameters in this field to bind values. However, you must be careful to avoid SQL injection. See Preventing SQL Injection for details.

N/ATOTAL_ORDERS = 0
ActionDrop-down list

Choose the action to apply on the condition.

Available options are:

  • INSERT
  • UPDATE
  • DELETE
INSERTDELETE
Insert Expression

Use this field set to specify the column in which you want to insert the data and also the values that you want to insert into them. Each column must be specified in a separate row. Click + to add a new row.

This field set contains the following fields:

  • Insert Column
  • Insert Value
Insert ColumnString/Expression/Suggestion

Specify the column in which the value must be inserted. Alternatively, click  to view and select a column from the suggestion list.

N/APRODUCT_ID
Insert ValueString/Expression

Specify the value to be inserted for the column. This field can be an SQL expression that references the source table's columns only or none. It cannot reference the target table's columns.

You can also use Pipeline parameters in this field to bind values. However, you must be careful to avoid SQL injection. See Preventing SQL Injection for details.

N/Asrc.PRODUCT_ID
Update Expression - When Matched

Use this field set to add expressions whose values must be updated when the the Merge Into condition matches the source data. Specify each column to be updated in a separate row. Click  to add a new row.

This field set contains the following fields:

  • Update Column
  • Update Value
Update ColumnString/Expression/Suggestion

Specify the column in which the value must be updated. Alternatively, click  to view and select a column name from the suggestion list.

N/ATOTAL_ORDERS
Update ValueString/Expression

Specify the new value to be inserted when the Merge Into condition matches the source.
This field can be an SQL expression that may reference columns in the source table, the target table, both source and target, or neither. 
You can also use Pipeline parameters in this field to bind values. However, you must be careful to avoid SQL injection. See Preventing SQL Injection for details.

N/Asrc.TOTAL_ORDERS + 1
Update Expression – When Not Matched By Source

Applicable only for Azure Synapse and is used only when you choose the WHEN NOT MATCHED BY SOURCE as the When clause. 
Use this field set to add expressions listing out columns whose values must be updated or deleted if the source rows do not match the target rows in the Merge Into operation. Click  to add a new row.

This field set contains the following fields:

  • Update Column
  • Update Value

If the options in this field set do not populate on choosing the WHEN NOT MATCHED BY SOURCE clause, the snap displays an error.

Update ColumnString/Expression/Suggestion

Specify the column name in which the value must be updated or deleted as specified in the conditions. Alternatively, click  to view and select a column from the suggestion list.

When you choose Delete action, the Snap ignores the Update Expression - When Not Matched by Source field set.

N/ATOTAL_ORDERS
Update ValueString/Expression

Specify the new value to be inserted when the Merge Into condition matches the source.
This field can be an SQL expression that may reference columns in the source table, the target table, both source and target, or neither. 
You can also use Pipeline parameters in this field to bind values. However, you must be careful to avoid SQL injection. See Preventing SQL Injection for details.

N/Asrc.TOTAL_ORDERS + 1
Get preview dataCheck box

Select this checkbox to include a preview of the query's output. The Snap performs limited execution and generates a data preview during Pipeline validation.

In case of ELT Pipelines, only the SQL query flows through the Snaps but not the actual source data. Hence, the preview data for a Snap is the result of executing the SQL query that the Snap has generated in the Pipeline.

The number of records displayed in the preview (upon validation) is the smaller of the following:

  • Number of records available upon execution of the SQL query generated by the Snap.

  • The value set in the Preview Data Count setting (default is 50 records).

Rendering Complex Data Types in Databricks Lakehouse Platform

Based on the data types of the fields in the input schema, the Snap renders the complex data types like map and struct as object data type and array as an array data type. It renders all other incoming data types as-is except for the values in binary fields are displayed as a base64 encoded string and as string data type.

Not selectedSelected

Preventing SQL Injection

You can pass Pipeline parameters as values in an SQL expression; however, if you do not phrase the expression properly it can lead to the parameter's name being bound as a value in the database. This potentially incorrect information being inserted into the database is known as SQL injection. It is thus necessary to take precautions when including Pipeline parameters in your SQL expression to prevent SQL injection. Based upon the intended use of the Pipeline parameter, use one or both the following methods to prevent accidental SQL injection:

Method-1: Simple Substitutions

You can reference the Pipeline parameter directly with a JSON-path without enabling expressions.

For example, if you want to use the Pipeline parameter, name, which contains the value of a column in the ON Condition, ConditionInsert Value, or Update Value fields: 

ON Condition or Condition field:
colname = _name

Insert Value or Update Value field:
_name

Method-2: Dynamic Substitutions

You must enable expressions when using Pipeline parameters for dynamic substitutions. Format the SQL expression, except the Pipeline parameter's reference, as a string. 

For example, if you want to use the Pipeline parameter, name, which contains the value of a column in the ON Condition, ConditionInsert Value, or Update Value fields: 

ON Condition or Condition field:
_columnname + “= _name”

Insert Value or Update Value field:
“_name”

The Snap evaluates the expression and also carries out path substitutions.

Here is how it works

The Snap pre-processes the query to extract any JSON-Paths and converts them to bound parameters. For example, consider the following query:

_columnname + “= _name”

The Snap converts this query into the following before turning it into a prepared statement for the database:

colname = ?

The Snap evaluates the JSON-Path to get the value to bind the Pipeline parameter in the prepared statement. 

Using escape characters

When expressions are disabled, use \ as an escape character to treat underscore (_) as a string.

For example:

colname = \_name 

Troubleshooting

ErrorReasonResolution

The Snap has not been configured properly.

(This error scenario occurs when the Target table does not exist (Snap needs to create one), and the Target Table Hash Distribution Column is not specified.)

Without the Target Table Hash Distribution Column defined for creating the new table in Azure Synapse, the Snap creates a new target table with Round-Robin distribution (default for Azure) instead of a Hash-distributed table. But, as the Snap cannot insert values into a non-Hash-distributed table, it errors out with this message.

If you have not created the Hash-distributed table before hand, ensure that you specify the Target Table Hash Distribution Column for creating the new table. To do so, validate the Pipeline to get the target table column names through the Input SQL coming from the upstream Snap.

This allows the ELT Merge Into Snap to create a Hash-distributed table in the Azure Synapse Database and insert the rows, as needed.

If the target table exists already and was created with Hash distribution, you can leave this field blank.

Database cannot be blank.

(when seeking the suggested list for Schema Name field)

Suggestions in the Schema Name and Target Table Name fields do not work when you have not specified a valid value for the Database Name field in this Snap.

Specify the target Database Name in this Snap to view and choose from a suggested list in the Schema Name and Target Table Name fields respectively.

SQL exception from Snowflake: Syntax error in one or more positions in the SQL query
Column names in Snowflake tables are case-sensitive. It stores all columns in uppercase unless they are surrounded by quotes during the time of creation in which case, the exact casing is preserved. See, Identifier Requirements — Snowflake Documentation.Ensure that you follow the same casing for the column table names across the Pipeline.

[Simba][SparkJDBCDriver](500051) ERROR processing query/statement. Error Code: 0

Cannot create table ('<schema name>`.`<table name>`'). The associated location (`…<table name>`) is not empty but it's not a Delta table

(Target CDW: Databricks Lakehouse Platform)

The specified location contains one of the following:

  • A non-Delta table (such as CSV, ORC, JSON, PARQUET)

  • A corrupted table

  • A Delta table with a different table schema

So, the Snap/Pipeline cannot overwrite this table with the target table as needed.

Ensure that you take appropriate action (mentioned below) on the existing table before running your Pipeline again (to create another Delta table at this location).

Move or drop the existing table from the schema manually using one of the following commands:

Access the DBFS through a terminal and run:

  • dbfs mv dbfs:/<current_table_path> dbfs:/<new_table_path> to move the table or

  • dbfs rm -r dbfs:/<table_path> to drop the table.

OR

Use a Python notebook and run:

  • dbutils.fs.mv(from: String, to: String, recurse: boolean = false): boolean to move the table/file/directory or

  • dbutils.fs.rm(dir: String, recurse: boolean = false): boolean to drop the table/file/directory.

Example

Retrieving and Transforming Records and Merging Them Into A Table

In this example, we want to retrieve records from two tables. First, we transform the retrieved records using conditions and use a JOIN clause to join the queries from the two tables. We then perform transformations on the records and apply aggregate function. Finally, we merge the records by applying conditions and actions. This example demonstrates how we can accomplish this task.

First, we use two ELT Select Snaps to build SELECT queries to retrieve all records from the tables (ORDERS_34R_2KB and CUSTOMER_50R_3KB). The configuration and query output is as shown below:

ELT Select SnapsOutput

The tables have several columns. But, we want to retrieve only the CUST_CODE and ORD_AMOUNT columns. Therefore, we configure the ELT Transform Snap with the following settings, which builds the query as shown below:

ELT TransformOutput

The SELECT queries in the ELT Select and ELT Transform Snaps form the inputs for the ELT Join Snap. Now, we apply an inner join based on matching values of the CUST_CODE column in the tables. Hence, we configure the ELT Join Snap and validate. Upon validation, the ELT Join Snap combines both the incoming SELECT queries and adds the JOIN clause. The output preview is as shown below:

ELT JoinOutput

We use the ELT Transform Snap to retrieve only the CUST_NAME and ORD_AMOUNT columns after joining the queries. We configure the Snap with the following setting which builds the query as shown below:

ELT Transform SnapOutput

Subsequently, we use an ELT Aggregate Snap to apply an aggregate function on the values in the ORD_AMOUNT column. Here, we want to calculate the SUM for the ORD_AMOUNT column.

The Snap builds the query as shown below:

Finally, we use the ELT Merge Into Snap to merge the records into the Target table MERGE_INTO_OUT_01 based on the ON Condition, apply Merge-into statements, update actions, and update expressions.
The ON condition checks for the customer's name - if the customer's name matches, it applies the Merge-into statements and applies conditions as configured below. In this case, if the customer's name matches and if the condition (SUM_COLUMN<5000) is met, then the Snap updates the record with customer's name. If the customer's name does not match and if the condition (SrcAlias.SUM_COLUMN<5000) is met, then it inserts the customer's name into the record.

Upon successful execution, the Pipeline adds and updates customer records in the Target database table in the configured database.

Download this Pipeline.

Downloads

Important Steps to Successfully Reuse Pipelines

  1. Download and import the Pipeline into SnapLogic.
  2. Configure Snap accounts as applicable.
  3. Provide Pipeline parameters as applicable.

  File Modified

File Example_ELT_Merge_Into.slp

Feb 05, 2021 by Kalpana Malladi


Snap Pack History

 Click here to expand...

Release

Snap Pack Version 

Date

Type

Updates

4.29main15993 Stable
  • Introduced the following new ELT Snaps:
    • ELT Cast Function: Snap to convert a data type of a column in the input SQL string into other supported data types.
    • ELT String Function: Snap to support the various string functions supported by the different databases.
    • ELT Router: Snap to enable routing input SQL queries into multiple output views based on the given conditional expressions.
  • Enhanced the following Snaps to display the final SQL query in their output preview upon Pipeline validation.
  • Enhanced the ELT Database Account to support OAuth2-based authentication on the target Snowflake database.

  • Enhanced the ELT Select, ELT Insert Select, ELT SCD2, ELT Merge Into, and ELT Load Snaps to display suggestions on the Schema Name field based on the Default Database Name provided in the Snap Account configuration when the Database Name is not specified in the respective Snap.

    • Improved usability of the suggestions features for these Snaps by making them case-insensitive. For example, typing default in the Schema Name field displays both default and DEFAULT, if they co-exist. You do not need to type DEFAULT to invoke and select the schema name DEFAULT from the suggestions list.

  • Enhanced the ELT SCD2 Snap to address different feature requests and issues raised by multiple customers. These changes provide more flexibility in configuring your SCD2 operations using this Snap.

  • Removed Check for nulls and duplicates in the source field and added two dropdown lists - Null Value Behavior and Invalid Row Handling.

  • Made the following items in the Meaning field of the Target Table Temporal Fields fieldset mandatory while making the Invalid historical rows flag optional.

    • Current row 

    • Historical row

  • Enhanced the ELT Aggregate and ELT Window Functions Snaps to support the following functions across all supported CDWs:

    • KURTOSIS

    • MODE

    • SKEW

  • Enhanced the ELT Aggregate Snap to support the following GROUP BY features across all supported CDWs:

    • Group by Cube

    • Group by Grouping Sets

    • Group by Rollup

    • Automatic GROUP BY for all input columns.

  • Fixed an issue with ELT Merge Into Snap where the Snap erroneously modified the target table column name when the column name contained the target table name.

  • Fixed an issue in ELT SCD2 Snap where the Snap causes incorrect results with Snowflake targets, when:

    • The Historical Row End Date value is provided.

    • Nulls and Invalid rows are recognized, but one or more start dates in the source are null.

  • Fixed the issue in ELT Transform Snap where the Output Schema of the Snap does not populate all the column names from its Input Schema.

4.28-Patch428patches15638 Latest

Fixed the issue with ELT Merge Into Snap where the Snap erroneously modified the target table column name when it contained the target table name, due to a misinterpretation of the target table name aliases.

4.28-Patch428patches15290 Latest
  • Updated ELT SCD2 Snap to address different feature requests and issues raised by multiple customers. These changes provide more flexibility in configuring your SCD2 operations using this Snap.

    • Removed Check for nulls and duplicates in source field and added two dropdown lists - Null Value Behavior and Invalid Row Handling.

    • Refer to the ELT SCD2 scenarios to learn more.

  • Introduced a new Snap ELT Router to enable routing input SQL queries into multiple output views based on the given conditional expressions.

4.28main14627 Stable
  • Subquery Pushdown Optimization: SnapLogic now optimizes SQL queries before they are passed to the CDW to ensure the queries are performant and cost-efficient in the respective CDW. An SQL subquery means a query inside a query. Pushdown optimization refers to rewriting these incremental (nested) SQL queries produced in your ELT Pipeline to form a more optimal/performant version.

  • Introduced the following new ELT Snaps:

    • ELT Case Expression: Snap to return the action to perform on an event based on a list of events and respective expected actions.

    • ELT Coalesce: Snap to return the first non-NULL value from a list of arguments.

    • ELT Conditional Functions: Snap to perform unary and binary conditional operations on data.

    • ELT Math Functions: Snap to perform mathematical—arithmetic, logarithmic, trigonometric, exponent, root, rounding, and truncation—operations on data.

  • Enhanced all the expression-enabled fields in ELT Snaps to display suggestions from the Input Schema (emanating from the upstream Snaps) in addition to the existing standard SQL expressions and functions list.

  • Enhanced the ELT Aggregate Snap to support:

    • HAVING clause within GROUP BY clause, when the WHERE clause cannot be used.

    • GROUP BY ROLLUP.

    • New aggregate functions for DLP: ANY, SOME, KURTOSIS, and STDDEV.

  • Enhanced the ELT Load Snap to support loading data into BigQuery targets from S3 buckets and Redshift CDW. These load operations use the BigQuery Data Transfer Service (DTS) client libraries and are carried out in asynchronous mode.

  • Enhanced the ELT Load and ELT Insert Select Snaps with a new fieldset Table Options List to support defining the Table Options for creating a new table in your target CDW.

  • Enhanced the ELT Select Snap to support Common Table Expression (CTE)-based SQL queries that contain a WITH clause inside the SQL Query Editor field, when your target CDW is Azure Synapse.

4.27-Patch

427patches13923

 Latest
  • Fixed the issue with ELT SCD2 Snap where the Snap did not equate null values in the corresponding cause-historization rows of both the source and target tables (with no other changes to data in the remaining fields) as the same and produced duplicate rows in the target table, as a result. After this fix, the Snap does not cause any new duplicate rows in the target table.

  • Fixed the issue with ELT Load Snap where the Snap fails with the error Database encountered an error during Bulk Load process when you specify a CSV file to load data from, with the Load Action as Alter Table. The Snap now performs the specified ALTER TABLE actions—ADD/DROP columns—and loads the data into the target table accordingly (without the need to manually modify the source or target tables beforehand).

4.27-Patch427patches13539 Latest
  • Fixed the issue with ELT SCD2 Snap where the Snap failed when you define more than one TargetTable Natural Key in the Snap configuration to load SCD2 data into the target CDW instance.

  • Fixed an issue with the ELT SCD2 Snap where the Snap failed to update the previous current rows in the target SCD2 table to historical rows when you define an End Date of Historical Row in Target Table Temporal Fields.

  • Fixed an issue with the ELT SCD2 Snap where the Snap failed to insert new rows when you define the values that exist in the most recent historical rows of the target SCD2 table as the cause-historization values.

  • Fixed the issue with ELT Transform Snap where the Snap does not omit the source columns marked for removal from the output view—using an empty Target Path for one or more columns selected in the Expression field of the Snap’s Mapping Table. See Using Empty Target Paths to Omit Rows from the Snap Output to understand how to perform this operation.

4.27-Patch

427patches13030

 Latest
  • Fixed the following issues with the ELT SCD2 Snap:

    • Where the Snap failed to get the right data type due to column name case mismatches between what is used in the Snap and what is actually used in the Azure Synapse tables (returned by the JDBC driver). You no longer need to type the column names in the exact case that Azure Synapse expects.

    • The Snap failed with the error—start_date does not exist—while writing SCD2 data to a Redshift table column start_date that is specified as the Start Date of Current Row in the Target Table Temporal Field(s) field set.

    • The Snap failed with the error—Reference 'END_DATE' is ambiguous—while merging SCD2 updates into DLP tables.

    • Where the Snap failed due to lack of required access privileges on the target database (for example, create table rights to create temporary tables as needed). The Snap now runs the input SQL statement and the elaborate sub-queries instead of attempting to create a temporary table in such scenarios.

  • Fixed the issue with the ELT SCD2 and ELT Load Snaps that fail to perform an add/drop operation involving multiple columns on a Redshift target table.

4.27main12833 Stable
  • Enhanced the ELT Aggregate Snap to support COUNT_IF aggregate function for Redshift and Azure Synapse target databases.
    Eliminating duplicates with COUNT_IF aggregate function
    Note: Selecting the Eliminate Duplicates checkbox while using COUNT_IF aggregate function does not eliminate duplicate records in case of Snowflake and BigQuery databases, as there is no native support for this feature. However, for Redshift, Azure Synapse and Databricks Lakehouse Platform (DLP), the duplicates are eliminated from the list of records when you select this checkbox for COUNT_IF function.

  • Enhanced the ELT Database Account connecting to a Databricks Lakehouse Platform (DLP) to support two new options - Optimize Write and Auto Compact for creating/replacing a table using any of the ELT Insert SelectELT Merge IntoELT Load, and ELT SCD2 Snaps.

  • Updated the Expressions and Functions Supported for ELT in the Snap and Account configuration sections. This list is common to all target CDWs supported. You can use these expressions to define your Snap or Account settings with the Expression symbol = enabled, where available.

  • Enhanced the ELT Load Snap to ensure that the Snap uses the default S3 Folder name specified in the Snap's account to accurately resolve the defined File Name Pattern.

  • Enhanced the ELT Select and ELT Execute Snaps to allow SQL comments inside the SQL Query Editor and SQL Statement Editor fields respectively.

  • Enhanced the ELT Transform Snap to display the exact data types of fields listed in the Input Schema and Target Schema in case of Azure Synapse.

  • Enhanced the ELT Merge Into Snap to to support MERGE INTO ALL option and automatic source table aliasing.

  • Enhanced the ELT Load Snap by adding the Source File to Target Table Columns Map field set to enable mapping of columns between the source file and the target table. In case of Databricks Lakehouse Platform (DLP) the Snap is enhanced further to support delimiters other than the comma in the source CSV files.

  • Enhanced the ELT Join Snap to support Left Anti and Left Semi join types in BigQuery though it does not natively support these join types.

  • Enhanced the underlying load mechanism for ELT SCD2 Snap from Insert-and-Update mode to Merge-into mode to substantially improve the Snap's performance while working with large and very large volumes of data (upwards of 500M rows or 50GB size).

4.26-Patch426patches12534 Latest
  • Fixed an issue with ELT Transform Snap where it may display incorrect schema only in the previews (during Pipeline validation). This occurs especially when the incoming SQL statement (defined in the SQL Statement Editor of the upstream Snap) contains one or more of the WHERE, GROUP BY, HAVING , ORDER BY, LIMIT, LIMIT followed by OFFSET, and SAMPLE clauses. Here are a few Pipeline scenarios where this issue might surface:

4.26-Patch426patches12021 Latest
  • Fixed an issue where the ELT Load Snap connecting to a Databricks Lakehouse Platform (DLP) instance failed to perform the load operation. Ensure that you provide a valid DBFS Folder path in the Snap's account settings as the Snap requires this folder path.

4.26-Patch426patches11646 Latest
  • Enhanced the ELT Database Account to support token-based authentication (Source Location Session Credentials) to S3 locations for Snowflake and Redshift target databases.
  • Enhanced the ELT Aggregate Snap with the following changes:
    • Revised the field labels from:
      • GROUP BY Fields List field set > Output Field to GROUP BY Field.
      • ORDER-By Fields to ORDER-BY Fields (Aggregate Concatenation Functions Only).
    • Removed the Suggestion option for Field Name field under General Aggregate Functions List field.
    • Made the Alias Name fields in the Aggregate Concatenation Functions List and the Percentile Distribution Functions List field sets mandatory.
  • If your target database is a Databricks Lakehouse Platform (DLP) instance, then the ELT Load Snap supports loading data from source CSV files that contain only comma as the separator between values.
4.26-Patch426patches11323 

Latest

  • Enhanced the ELT Database Account to allow parameterization of field values using Pipeline Parameters. You can define and use these parameters in expression-enabled fields to pass values during runtime.
4.26-Patch426patches11262Latest
  • Fixed the following Known Issues recorded in the 4.26 GA version:
    • For a Snowflake target instance, the ELT Insert Select Snap does not suggest column names to select for the Insert Column field in the Insert Expression List.
    • The Snaps—ELT Merge Into, ELT Select, ELT Join, and ELT Filter—do not prevent the risk of SQL injection when your target database is Databricks Lakehouse Platform (DLP).
    • Intermittent null-pointer exceptions in the ELT Load Snap on Databricks Lakehouse Platform (DLP).

    • The ELT Insert Select Snap attempts to create the target table even when it exists in the Snowflake database.
    • When loading data from a JSON file into a target Databricks Lakehouse Platform (DLP) instance using an ELT Load Snap, if you choose the Drop and Create Table option as the Load Action and specify an additional column (that is not available in the JSON file) for the new table, it results in one more column null added to the new target table.
    • When you use the SQL editor in the ELT Select Snap configuration to define your SQL query, the Pipeline validation fails due to a syntax error in the following scenarios. However, the Pipeline execution works as expected. The only workaround is to drop the LIMIT clause and the optional OFFSET clause from the SQL query during Pipeline validation.
      • The query contains a LIMIT clause on a Snowflake, Redshift or Databricks Lakehouse Platform target instance: The SQL query created during Pipeline validation includes an additional LIMIT clause, for example: SELECT * FROM "STORE_DATA"."ORDERS" LIMIT 10 LIMIT 990

      • The query contains an OFFSET clause (supported in case of Snowflake and Redshift): The SQL query created during Pipeline validation looks like SELECT * FROM "STORE_DATA"."ORDERS" LIMIT 10 OFFSET 4 LIMIT 990
4.26main11181 Stable
  • Enhanced the ELT Snap preview to support the following Snowflake data types: array, object, variant, and timestamp.

    • The Snaps convert the values to hexadecimal (HEX) equivalents—the default setting for the session parameter BINARY_OUTPUT_FORMAT in Snowflake. See Session Parameters for Binary Values for more information.

    • If this setting is different from hexadecimal (such as base64) in the Snowflake table, the Snaps still convert the values to hexadecimal equivalents for rendering them in the Snap preview.

  • Enhanced all ELT Snaps to display the Get preview data checkbox below the Snap's Label field.
  • The ELT Database account is now mandatory for all Snaps in the ELT Snap Pack.

    Breaking change

    Starting with the 4.26 release, all Snaps in the ELT Snap Pack (except the ELT Copy Snap) require an account to connect to the respective target database. Your existing Pipelines that do not use an account may fail. We recommend you to associate an ELT Database Account to each of the ELT Snaps (except ELT Copy Snap) for your Pipelines.

  • Enhanced the ELT Aggregate Snap to support Linear Regression functions on Redshift and Azure Synapse. The Snap also supports these functions on Databricks Lakehouse Platform.
  • Enhanced the ELT Execute Snap to enable running multiple DML, DDL, and DCL SQL statements from the same Snap instance.
  • Enhanced the ELT Join Snap to:
    • Support LEFT ANTI JOIN and LEFT SEMI JOIN types on all supported databases.
    • Display or hide the Resultant Column Names Prefix Type field based on the target database selected in the Snap's account.
  • Enhanced the ELT Load and ELT SCD2 Snaps to provide a list of suggested data types, while adding columns to or creating a table.
4.25-Patch425patches10017 Latest
  • Updated the ELT SCD2 Snap to replace End date of historical row option in the Meaning field of Target Table SCD2 Fields field set with End Date of Current Row.

    Breaking change

    This may cause the existing Pipelines to fail as the End date of historical row option no longer exists.

    You need to make the following update in the ELT SCD2 Snap's settings across your Pipelines after upgrading your Snap Pack to this patch:

    • Select End Date of Current Row from the Meaning drop-down list in the second entry (highlighted in the image).
  • Fixed the issue with the ELT Insert Select Snap containing an open output preview that fails to retrieve output preview data in case of Redshift and Azure Synapse databases, though the Pipeline runs work as expected.
  • Fixed an issue where the ELT Execute Snap does not error out (Snap turns Green) even when running an SQL query to drop a non-existent table from a Snowflake or Azure Synapse database.
  • [Update on ]: Enhanced the ELT Snap previews to support the following data types: array, object, variant, and timestamp.
    • The Snaps convert the values to hexadecimal (HEX) equivalents—the default setting for the session parameter BINARY_OUTPUT_FORMAT in Snowflake. See Session Parameters for Binary Values for more information.
    • If this setting is different from hexadecimal (such as base64) in the Snowflake table, the Snaps still convert the values to hexadecimal equivalents for rendering them in the Snap previews.
4.25-Patch425patches9725 Latest
  • Enhanced the ELT Snap preview to display the exact binary and varbinary values from Snowflake database during Pipeline validation, by converting the values to hexadecimal equivalents—the default setting in SnowflakeIf the setting is different from hexadecimal in the Snowflake table, then the Snaps still convert the values to hexadecimal for rendering the Snap preview.
  • Enhanced the ELT Transform Snap to display the appropriate data type (binary or varbinary) for the column names populated in the output schema.
  • Enhanced the ELT Window Functions Snap to address potential issues due to an incorrect definition for MINUS function in case of Redshift and Azure Synapse databases.
4.25main9554 Stable
  • Starting with the 4.25 release, SnapLogic has now certified the ELT Snap Pack to work with Snowflake hosted on Google Cloud Platform (GCP) as the target database, in addition to the other flavors of Snowflake hosted on AWS and Microsoft Azure
  • Introduced the ELT Execute Snap to enable you to run DML, DDL, and DCL SQL queries in Snowflake in Snowflake, Redshift, and Azure Synapse.
  • Introduced the ELT SCD2 Snap to support Type 2 Slowly Changing Dimensions (SCD2) updates to the target databases—Snowflake, Redshift, and Azure Synapse.
  • Enhanced the ELT Database Account to introduce:
    • Support for Google Cloud Storage as a storage location (source) in addition to AWS S3 and Azure Data Lake Storage (ADLS) when your target database is Snowflake.
    • Automatic download of the JDBC driver required for the selected Database Type using the new Download JDBC Driver Automatically check box.
  • Enhanced the ELT Load Snap to prevent changes to existing tables during Pipeline validation. If you set the Load Action as Drop and Create table, and the target table does not exist, the Snap creates a new (empty) target table based on the schema specified in its settings.
  • Enhanced the ELT Window Functions Snap to support Covariance, Correlation, and Linear Regression Functions on Snowflake, Redshift, and Azure Synapse databases. The Snap uses function-specific query re-writes to support these functions on Redshift and Azure Synapse databases.
  • Enhanced the ELT Merge Into and ELT Insert Select Snaps to support up to one output view, and added the Get Preview Data check box to these Snaps. You can now connect downstream ELT Snaps to these Snaps.
4.24-Patch424patches8793 Latest
  • Fixes the issue of production job failures due to ELT Insert Select Snap after upgrading to 4.24 GA by updating the ELT Transform Snap to continue allowing duplication of fields in the Expression list for the Pipeline to complete successfully.

No changes are needed to your existing Pipelines.

  • Fixes the column name collision issue in the Snap's output when the two tables being joined have columns with the same/identical names. You can specify the extent of prefix (that is, to prefix all columns, only duplicate columns, or no prefix) using the Resultant Column Names Prefix Type drop-down list. Based on the prefix you choose, a table alias name is prefixed to the identical columns in the output.

Behavior Change

The behavior of ELT Load Snap for Load Action during Pipeline validation across the supported databases is as follows:

Append rows to existing table: Does not append the data from the source files into the target table.

Overwrite existing table: Does not overwrite the data.

Drop and Create table: Does not drop the target table even if it exists, but the Snap creates a new target table if a table does not exist.

Alter table: Does not modify the schema of the target table.

4.24main8556 Stable
  • Adds support for Azure Synapse database. You can now use the ELT Snap Pack to transform tables in the Snowflake, Redshift as well as Azure Synapse databases.

Updates the Snap Pack with the following features:

  • ELT Database Account: Enhances the ELT Database Account to support the Azure Synapse database.
  • ELT Aggregate: Enhances the Snap to:
  • Support Azure Synapse's T-SQL aggregate functions and the aggregate functions in Snowflake and Redshift databases.
    • General Aggregate Function COUNT_IF in Snowflake.
    • General Aggregate Functions in Snowflake.
    • Linear Regression Aggregate Functions in Snowflake.
    • Aggregate Concatenation Functions in Snowflake, Redshift, and Azure Synapse.
    • Percentile Distribution Functions in Snowflake and Redshift.
  • Suggest appropriate column names to select from, in the Snap fields. This applies to Snowflake, Redshift, and Azure Synapse databases.
  • ELT Insert Select: Enhances the Snap to:
    • Suggest appropriate column names to select from, in the Snap fields.
    • Create Hash-distributed tables using the Target Table Hash Distribution Column (Azure Synapse Only) field when the Load Action is selected as Drop and Create table and a condition like WHEN NOT MATCHED BY TARGET.
  • ELT Join
    • Enhances the Snap to support Natural JOINS (NATURAL INNER JOIN, NATURAL LEFT OUTER JOIN, NATURAL RIGHT OUTER JOIN, and NATURAL FULL OUTER JOIN) in addition to the INNER, LEFT OUTER, RIGHT OUTER, FULL OUTER, and CROSS Joins in Azure Synapse Database. This enhancement also makes account configuration mandatory when using this Snap.
    • Fixes the column name collision issue in the Snap's result set when the two tables being joined have columns with the same/identical names.  You can specify the Resultant Column Names Prefix Type drop-down list. Based on the prefix type you choose, a table alias name is prefixed to identical columns in the output.
  • ELT Load: Enhances the Snap to:
    • Support the File Name Pattern option using Key Based Mechanism for Redshift database. 
    • Suggest appropriate column names to select from, in the Snap fields. This applies to Snowflake, Redshift, and Azure Synapse databases.
    • Create Hash-distributed tables using the Target Table Hash Distribution Column (Azure Synapse Only) field when the Load Action is selected as Drop and Create table.
  • ELT Merge Into: Enhances the Snap to:
    • Suggest appropriate column names to select from, in the Snap fields. This applies to Snowflake, Redshift, and Azure Synapse databases.
    • Include the Target Table Hash Distribution Column (Azure Synapse Only) field for the Snap to create hash-distributed tables always.
    • Include the Update Expression List - When Not Matched By Source field set to allow defining one or more Update Expressions for the WHEN clause - WHEN NOT MATCHED BY SOURCE. This applies to Azure Synapse database.
    • Include the Target Table Alias field to specify the alias name required for the target table. The Snap is also equipped with the ability to auto-replace the actual table names (with the alias name), if any, used in the ON clause condition, secondary AND conditions, Update Expression list, or Insert Expression list. This applies to Snowflake, Redshift, and Azure Synapse databases.
  • ELT Transform: Enhances the Snap to:
    • Display input schema and output schema based on the upstream and downstream Snaps connected to this Snap.
    • Delete fields mentioned in the Expression field from the Snap's output when the mappings have an empty Target Path
  • ELT Window Functions: Enhances the Snap to support the following Window Functions in addition to the existing ones:
    • Value Based Analytic Functions
    • LEAD and LAG Analytic Functions
  • Fixes the issue of displaying generic error messages for Triggered Task failures with ELT Pipelines by displaying detailed error messages for ease in debugging.
4.23main7430 StableIntroduces the following Snaps:
  • ELT Load: Loads data from AWS S3 buckets and Azure clusters into the Snowflake and Redshift tables.
  • ELT Sample: Generates a data subset from the source table. 
  • ELT Pivot: Converts row data into column data.
  • ELT Unpivot: Converts column data into row data.
  • ELT Window Functions: Provides support for SQL Window Functions in ELT Pipelines.

4.22

main6403

 

Stable

Introduces the ELT Snap Pack that provides you with the Extract, Load, and Transform (ELT) capabilities. Use the following Snaps to build SQL queries that are executed in the Snowflake database:

  • ELT Aggregate : Builds SQL query to perform aggregate functions such as SUM, COUNT, MIN, and MAX. Also offers the GROUP BY functionality.
  • ELT Copy: Creates copies of the input SQL query. 
  • ELT Filter: Adds a WHERE clause in the input SQL query. Use this capability to create filters/conditions for your data set. 
  • ELT Insert Select: Performs the INSERT INTO SELECT operation on the specified table. 
  • ELT Intersect: Adds an INTERSECT SQL operator in the input queries.
  • ELT Join: Builds SQL query with a JOIN clause.
  • ELT Limit: Adds a LIMIT clause in the incoming SQL query.
  • ELT Merge Into: Performs the MERGE INTO operation on the specified table.
  • ELT Minus: Adds a MINUS SQL operator in the input queries.
  • ELT Select: Builds an SQL SELECT query and provides a built-in SQL query editor that enables you to construct complex queries.
  • ELT Sort: Adds the ORDER BY keyword in the input query. 
  • ELT Transform: Builds transformation-based SQL queries for the specified table.
  • ELT Union: Adds a UNION ALL or UNION DISTINCT operator in the input queries.
  • ELT Unique: Builds a SELECT DISTINCT SQL query. 


See Also