In this article

...

Use this Snap to perform the MERGE INTO operation on the specified table. This operation updates and inserts rows in, and deletes rows from, the target table based on values in a source table. You can set match conditions in the Snap based on which the MERGE INTO operation is carried out. If the target table does not exist, the Snap creates the table and inserts the data.

After successfully running the Pipeline with this Snap, you can use the Snap's output view to verify whether the data is inserted correctly into the target table.
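Conceptually, the operation follows the standard SQL MERGE pattern. The sketch below is illustrative only; the table and column names (TARGET_TBL, SRC_TBL, ID, VAL) are hypothetical, and the exact query the Snap generates depends on your settings and target CDW.

```sql
-- Generic shape of a MERGE INTO operation (hypothetical names).
MERGE INTO TARGET_TBL
USING SRC_TBL
  ON SRC_TBL.ID = TARGET_TBL.ID                        -- match condition
WHEN MATCHED THEN
  UPDATE SET VAL = SRC_TBL.VAL                         -- update matching rows (or DELETE)
WHEN NOT MATCHED THEN
  INSERT (ID, VAL) VALUES (SRC_TBL.ID, SRC_TBL.VAL);   -- insert new rows
```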

...

Info
titleSQL Functions and Expressions for ELT

You can use the SQL Expressions and Functions supported for ELT to define your Snap or Account settings with the Expression symbol (=) enabled, where available. This list is common to all supported target CDWs. You can also use other expressions and functions that your target CDW supports.

...

Parameter Name | Data Type | Description | Default Value | Example
Label | String
Insert excerpt
File Writer
File Writer
nopaneltrue
ELT Merge Into | Merge Sales Data
Database Name | String | The name of the database in which the target table exists. Leave this blank if you want to use the database name specified in the Default Database Name field in the account settings. | N/A | TESTDB
Schema Name | String | Required. The name of the database schema. If it is not defined, the schema name suggestion retrieves all the schema names in the specified database when you click the suggest button.
Multiexcerpt include macro
nameME_Schema_Name
pageELT Insert Select
N/A | SALES
Target Table Name | String

Required. The name of the table on which you want to perform the MERGE INTO operation.

Multiexcerpt include macro
nameME_Schema_And_Table_Names
pageELT Insert Select

Note
  • If the target table does not exist, the Snap creates one with the name that you specify in this field and writes the data into it.
  • You can specify the table name without using double quotes (""). However, you must use them if you want to include special characters such as hyphens (-) in the table name.
  • A table name must always start with a letter.
  • Integers and underscores (_) can also be part of the table name.
  • All characters are automatically converted to uppercase at the backend. Use double quotes to retain lowercase.


N/A | SALES_ORDERS
Target Table Hash Distribution Column (Azure Synapse Only) | String/Expression

Specify the Hash distribution column name for the target table in Azure Synapse, if you configure the INSERT action in the Snap.

Info

Azure Synapse requires the target table to always be Hash distributed, because the MERGE INTO statement on Azure Synapse supports only Hash-distributed tables and does not support other types of target tables. See the Troubleshooting section for related information on an edge case in creating a table in Azure Synapse.

If the target table is created outside the Snap, or if you do not use a MERGE INTO statement, you need not specify any value in this field.


N/A | var table
Target Table Alias | String/Expression

Specify the alias name for the target table.
You can use the alias names in MERGE INTO statements, instead of using the lengthy target table names.

Info

The Snap automatically replaces the actual table names with the alias name, if any are used in the ON clause condition, secondary AND conditions, Update Expression list, or Insert Expression list. This applies to Snowflake, Redshift, and Azure Synapse databases.


N/A | SLS_ORDRS
Input Source Alias | String

Enter the alias name that you want to use for the source table.

Note

The alias name is optional. It is useful when the source table and the target table have identical column names. For example:

src.order_id = sales.order_id

In such cases, the database cannot resolve the column ambiguity on its own, and a table qualifier is required to resolve it.


N/A | src
ON Condition | String

Required. The condition based on which you want to check the values in the target table. You can also use Pipeline parameters in this field to bind values. However, you must be careful to avoid SQL injection. See Preventing SQL Injection for details.

N/A | src.PROD_ID=SALES.PROD_ID
Merge-into Statements

Use this field set to specify the conditions that activate the MERGE INTO operation and the additional conditions that must be met. Specify each condition in a separate row. Click + to add a row.

This field set contains the following fields:

  • When Clause
  • Condition
  • Action
Note
  • The Snap allows the following combinations of actions:
    • INSERT
    • UPDATE
    • DELETE
    • UPDATE AND DELETE
    • UPDATE AND INSERT
    • DELETE AND INSERT
    • UPDATE, DELETE AND INSERT


When Clause | String/Expression/Suggestion

Specify the matching condition based on the outcome of the ON Condition. Alternatively, click the suggest button to view and select a clause from the suggestion list.

Available options are:

  • WHEN MATCHED: Applies the specified condition and action when the source data matches with the target.
  • WHEN NOT MATCHED: Applies the specified condition and action when the source data does not match with the target.
  • WHEN NOT MATCHED BY SOURCE: (Available only for Azure Synapse.) Applies the specified condition and action when the target rows do not match the source rows.

Redshift and Snowflake support the following actions in Merge-into statements:

When clause | Action
WHEN MATCHED | UPDATE or DELETE
WHEN NOT MATCHED | INSERT

Azure Synapse supports the following actions in Merge-into statements:

When clause | Action
WHEN MATCHED | UPDATE or DELETE
WHEN NOT MATCHED | INSERT
WHEN NOT MATCHED BY SOURCE | UPDATE or DELETE


WHEN MATCHED | WHEN NOT MATCHED
Condition | String

Specify any additional criteria that must be met for the action to be performed.
The action associated with the specified condition is not performed if the condition's criteria are not fulfilled. The condition can reference both the source and target tables, only the source table, only the target table, or no table at all.

This additional condition allows the Snap to identify whether the UPDATE or the DELETE action must be performed (since both actions correspond to the WHEN MATCHED clause).

You can also use Pipeline parameters in this field to bind values. However, you must be careful to avoid SQL injection. See Preventing SQL Injection for details.

N/A | TOTAL_ORDERS = 0
Action | Drop-down list

Choose the action to apply when the condition is met.

Available options are:

  • INSERT
  • UPDATE
  • DELETE
INSERT | DELETE
Insert Expression

Use this field set to specify the columns into which you want to insert data and the values to insert into them. Specify each column in a separate row. Click + to add a new row.

This field set contains the following fields:

  • Insert Column
  • Insert Value
Insert Column | String/Expression/Suggestion

Specify the column into which the value must be inserted. Alternatively, click the suggest button to view and select a column from the suggestion list.

N/A | PRODUCT_ID
Insert Value | String/Expression

Specify the value to be inserted into the column. This field can be an SQL expression that references only the source table's columns, or none at all; it cannot reference the target table's columns.

You can also use Pipeline parameters in this field to bind values. However, you must be careful to avoid SQL injection. See Preventing SQL Injection for details.

N/A | src.PRODUCT_ID
Update Expression - When Matched

Use this field set to add expressions whose values must be updated when the Merge Into condition matches the source data. Specify each column to be updated in a separate row. Click + to add a new row.

This field set contains the following fields:

  • Update Column
  • Update Value
Update Column | String/Expression/Suggestion

Specify the column in which the value must be updated. Alternatively, click the suggest button to view and select a column name from the suggestion list.

N/A | TOTAL_ORDERS
Update Value | String/Expression

Specify the new value to be inserted when the Merge Into condition matches the source.
This field can be an SQL expression that may reference columns in the source table, the target table, both source and target, or neither. 
You can also use Pipeline parameters in this field to bind values. However, you must be careful to avoid SQL injection. See Preventing SQL Injection for details.

N/A | src.TOTAL_ORDERS + 1
Update Expression – When Not Matched By Source

This field set applies only to Azure Synapse and is used only when you choose WHEN NOT MATCHED BY SOURCE as the When clause.
Use this field set to add expressions listing the columns whose values must be updated or deleted when the source rows do not match the target rows in the Merge Into operation. Click + to add a new row.

This field set contains the following fields:

  • Update Column
  • Update Value
Note

If the options in this field set do not populate when you choose the WHEN NOT MATCHED BY SOURCE clause, the Snap displays an error.


Update Column | String/Expression/Suggestion

Specify the column name in which the value must be updated or deleted as specified in the conditions. Alternatively, click the suggest button to view and select a column from the suggestion list.

Note

When you choose the DELETE action, the Snap ignores the Update Expression – When Not Matched By Source field set.


N/A | TOTAL_ORDERS
Update Value | String/Expression

Specify the new value to be set when the target rows do not match the source rows.
This field can be an SQL expression that may reference columns in the source table, the target table, both source and target, or neither. 
You can also use Pipeline parameters in this field to bind values. However, you must be careful to avoid SQL injection. See Preventing SQL Injection for details.

N/A | src.TOTAL_ORDERS + 1
Get preview data | Check box

Multiexcerpt include macro
namegetpreviewdata
pageELT Intersect

Not selected | Selected
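Taken together, the settings in the table above combine into a single MERGE INTO statement. The sketch below uses the example values from this table (TESTDB, SALES, SALES_ORDERS, SLS_ORDRS, src, PROD_ID, TOTAL_ORDERS, PRODUCT_ID) and is illustrative only; the actual query depends on the target CDW and the actions you configure.

```sql
-- Illustrative sketch using this table's example values; not the exact query
-- the Snap generates.
MERGE INTO TESTDB.SALES.SALES_ORDERS AS SLS_ORDRS    -- Database/Schema/Target Table Name + Target Table Alias
USING ( <input SQL from the upstream Snap> ) AS src  -- Input Source Alias
  ON src.PROD_ID = SLS_ORDRS.PROD_ID                 -- ON Condition
WHEN MATCHED AND TOTAL_ORDERS = 0 THEN               -- When Clause + Condition
  DELETE                                             -- Action
WHEN MATCHED THEN
  UPDATE SET TOTAL_ORDERS = src.TOTAL_ORDERS + 1     -- Update Expression - When Matched
WHEN NOT MATCHED THEN
  INSERT (PRODUCT_ID) VALUES (src.PRODUCT_ID);       -- Insert Column / Insert Value
```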

...

Method-2: Dynamic Substitutions

You must enable expressions when using Pipeline parameters for dynamic substitutions. Format the SQL expression, except the Pipeline parameter's reference, as a string. 

For example, suppose you want to use the Pipeline parameter name, which contains the value of a column, in the ON Condition, Condition, Insert Value, or Update Value fields: 
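As an illustration only, a dynamically substituted ON Condition might look like the following. The SALES.CUST_NAME column is hypothetical; _name refers to the Pipeline parameter, and everything except the parameter reference is formatted as a string:

```
"SALES.CUST_NAME = '" + _name + "'"
```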

...

Error | Reason | Resolution

The Snap has not been configured properly.

(This error occurs when the target table does not exist, so the Snap needs to create one, and the Target Table Hash Distribution Column is not specified.)

Without the Target Table Hash Distribution Column defined for creating the new table in Azure Synapse, the Snap creates a new target table with Round-Robin distribution (default for Azure) instead of a Hash-distributed table. But, as the Snap cannot insert values into a non-Hash-distributed table, it errors out with this message.

If you have not created the Hash-distributed table beforehand, ensure that you specify the Target Table Hash Distribution Column for creating the new table. To do so, validate the Pipeline to get the target table column names through the input SQL coming from the upstream Snap.

This allows the ELT Merge Into Snap to create a Hash-distributed table in the Azure Synapse Database and insert the rows, as needed.

If the target table exists already and was created with Hash distribution, you can leave this field blank.

Database cannot be blank.

(when seeking the suggested list for Schema Name field)

Suggestions in the Schema Name and Target Table Name fields do not work when you have not specified a valid value for the Database Name field in this Snap.

Specify the target Database Name in this Snap to view and choose from the suggested lists in the Schema Name and Target Table Name fields.

SQL exception from Snowflake: Syntax error in one or more positions in the SQL query
Column names in Snowflake tables are case-sensitive. Snowflake stores all columns in uppercase unless they are surrounded by quotes at the time of creation, in which case the exact casing is preserved. See Identifier Requirements — Snowflake Documentation.

Ensure that you follow the same casing for the column and table names across the Pipeline.

[Simba][SparkJDBCDriver](500051) ERROR processing query/statement. Error Code: 0

Cannot create table (`<schema name>`.`<table name>`). The associated location (`…<table name>`) is not empty but it's not a Delta table

(Target CDW: Databricks Lakehouse Platform)

The specified location contains one of the following:

  • A non-Delta table (such as CSV, ORC, JSON, PARQUET)

  • A corrupted table

  • A Delta table with a different table schema

So, the Snap/Pipeline cannot overwrite this table with the target table as needed. The existing table needs to be dropped from the schema before creating a Delta-formatted table. However, such a table can only be dropped manually—by accessing the DBFS through a terminal. The Pipeline cannot perform this operation.

Drop or move the existing table and then try creating the new table in Delta format (using the Pipeline). Ensure that you take this action on the existing table before running your Pipeline again (to create another Delta table at this location).

Move or drop the existing table from the schema manually using one of the following commands:

Access the DBFS through a terminal and run:

  • dbfs mv dbfs:/<current_table_path> dbfs:/<new_table_path> to move the table, or

  • dbfs rm -r dbfs:/<table_path> to drop the table.

OR

Use a Python notebook and run:

  • dbutils.fs.mv(from: String, to: String, recurse: boolean = false): boolean to move the table/file/directory, or

  • dbutils.fs.rm(dir: String, recurse: boolean = false): boolean to drop the table/file/directory.


Example

Retrieving and Transforming Records and Merging Them Into A Table

In this example, we want to retrieve records from two tables, transform the retrieved records using conditions, and use a JOIN clause to join the queries from the two tables. We then perform further transformations on the records and apply an aggregate function. Finally, we merge the records by applying conditions and actions.

First, we use two ELT Select Snaps to build SELECT queries to retrieve all records from the tables (ORDERS_34R_2KB and CUSTOMER_50R_3KB). The configurations and query outputs are as shown below:

ELT Select Snaps | Output

The tables have several columns, but we want to retrieve only the CUST_CODE and ORD_AMOUNT columns. Therefore, we configure the ELT Transform Snap with the following settings, which build the query as shown below:

ELT Transform | Output

The SELECT queries in the ELT Select and ELT Transform Snaps form the inputs for the ELT Join Snap. Now, we apply an inner join based on matching values of the CUST_CODE column in the tables. Hence, we configure the ELT Join Snap and validate. Upon validation, the ELT Join Snap combines both the incoming SELECT queries and adds the JOIN clause. The output preview is as shown below:

ELT Join | Output

We use the ELT Transform Snap to retrieve only the CUST_NAME and ORD_AMOUNT columns after joining the queries. We configure the Snap with the following settings, which build the query as shown below:

ELT Transform Snap | Output

Subsequently, we use an ELT Aggregate Snap to apply an aggregate function on the values in the ORD_AMOUNT column. Here, we want to calculate the SUM for the ORD_AMOUNT column.

The Snap builds the query as shown below:

Finally, we use the ELT Merge Into Snap to merge the records into the Target table MERGE_INTO_OUT_01 based on the ON Condition, apply Merge-into statements, update actions, and update expressions.
The ON condition checks the customer's name: if the name matches, the Snap applies the Merge-into statements and the conditions as configured below. In this case, if the customer's name matches and the condition (SUM_COLUMN<5000) is met, the Snap updates the record with the customer's name. If the customer's name does not match and the condition (SrcAlias.SUM_COLUMN<5000) is met, it inserts the customer's name into the record.
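Under these settings, the statement the Snap resolves to would look roughly like the sketch below. This is illustrative only: the placeholder stands for the SELECT statement built by the upstream ELT Select, Transform, Join, and Aggregate Snaps, and the exact query the Snap generates may differ.

```sql
-- Illustrative sketch only; <upstream SELECT query> stands for the statement
-- built by the upstream Snaps in this example.
MERGE INTO MERGE_INTO_OUT_01
USING ( <upstream SELECT query> ) AS SrcAlias
  ON SrcAlias.CUST_NAME = MERGE_INTO_OUT_01.CUST_NAME
WHEN MATCHED AND SUM_COLUMN < 5000 THEN
  UPDATE SET CUST_NAME = SrcAlias.CUST_NAME
WHEN NOT MATCHED AND SrcAlias.SUM_COLUMN < 5000 THEN
  INSERT (CUST_NAME) VALUES (SrcAlias.CUST_NAME);
```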

Upon successful execution, the Pipeline adds and updates customer records in the Target database table in the configured database.

Download this Pipeline.

Downloads

Note
titleImportant Steps to Successfully Reuse Pipelines
  1. Download and import the Pipeline into SnapLogic.
  2. Configure Snap accounts as applicable.
  3. Provide Pipeline parameters as applicable.

...