- Created by Mohammed Iqbal, last modified by Kalpana Malladi on Feb 12, 2021
You are viewing an old version of this page. View the current version.
Compare with Current View Page History
« Previous Version 6 Next »
In this article
Overview
Use this Snap to add a MINUS SQL operator to the separate queries coming from upstream Snaps. The MINUS SQL operator corresponds to ANSI EXCEPT. In Azure Synapse, this Snap uses EXCEPT operation unlike in Redshift and Snowflake, wherein the Snap uses MINUS operation. This Snap also allows you to preview the result of the MINUS SQL operation on the incoming SQL queries. You can validate the modified query using this preview functionality.
- The MINUS SQL operation does not eliminate duplicate records. You can add the ELT Unique Snap to the ELT Minus Snap to remove duplicates.
- This Snap's output is not commutative. The Snap's output changes based on which table is connected in which input view. For example, if you use queries for tables A and B as inputs, where query A is connected to the first input and query is connected to the second, the result from the Snap's execution would not be the same if it was the other way around. Basically, A MINUS B is not the same as B MINUS A.
Prerequisites
None.
Limitations and Known Issues
None.
Snap Input and Output
Input/Output | Type of View | Number of Views | Examples of Upstream and Downstream Snaps | Description |
---|---|---|---|---|
Input | Document |
|
| Multiple SQL queries. |
Output | Document |
|
| The incoming SQL queries with the MINUS operator included. |
Snap Settings
Parameter Name | Data Type | Description | Default Value | Example | ||
---|---|---|---|---|---|---|
Label | String | Specify a name for the Snap. You can modify this to be more specific, especially if you have more than one of the same Snap in your pipeline. | ELT Minus | Exclude Common Records | ||
Get preview data | Check box | Select this checkbox to include a preview of the query's output. The Snap performs limited execution and generates a data preview during Pipeline validation. In the case of ELT Pipelines, only the SQL query flows through the Snaps but not the actual source data. Hence, the preview data for a Snap is the result of executing the SQL query that the Snap has generated in the Pipeline. The number of records displayed in the preview (upon validation) is the smaller of the following: Number of records available upon execution of the SQL query generated by the Snap. The value set in the Preview Data Count setting (default is 50 records). Rendering Complex Data Types in Databricks Lakehouse Platform Based on the data types of the fields in the input schema, the Snap renders the complex data types like map and struct as object data type and array as an array data type. It renders all other incoming data types as-is except for the values in binary fields are displayed as a base64 encoded string and as string data type. In Azure Synapse, this Snap uses EXCEPT operation unlike in Redshift and Snowflake, wherein the Snap uses MINUS operation.
| Not selected | Selected |
Troubleshooting
None.
Examples
Retrieving Uncommon Rows From a Table
We need a query with the MINUS operator if we want to compare two tables and retrieve the uncommon records from the first table. This example shows how we can use the ELT Minus Snap to build this query.
First, we build SELECT queries to read the target tables. To do so, we can use two ELT Select Snaps, in this example: Read Part A and Read Part B. Each of these Snaps is configured to output a SELECT * query to read the target table in the database. Additionally, these Snaps are also configured to show a preview of the SELECT query's execution as shown:
Read Part A Configuration | Read Part B Configuration |
---|---|
A preview of the outputs from the ELT Select Snaps is shown below:
Read Part A Output | Read Part B Output |
---|---|
Then, we connect the ELT Minus Snap to the output view of the ELT Select Snaps. The SELECT * queries in both of these Snaps form the inputs for the ELT Minus Snap. Upon execution, the ELT Minus Snap combines both incoming SELECT * queries and adds the MINUS operator.
A preview of the ELT Minus Snap's output is shown below:
We can also add an ELT Insert-Select Snap downstream and write the result of this query into another table.
Downloads
Important Steps to Successfully Reuse Pipelines
- Download and import the Pipeline into SnapLogic.
- Configure Snap accounts as applicable.
- Provide Pipeline parameters as applicable.
File | Modified |
---|
Release Snap Pack Version Type Updates Enhanced the ELT Load Snap’s capabilities to allow loading flat and nested data sets (from canonical and non-canonical formats) from your Parquet files to the target tables in Snowflake. Learn more about the usage of this feature at Load data from Parquet files and in the following example Pipelines: Enhanced the ELT Merge Into Snap to support defining the update expressions for updating tables in your target Databricks Lakehouse Platform instance using the WHEN NOT MATCHED BY SOURCE clause, besides other supported clauses and expression lists. Enhanced the ELT Merge Into and ELT SCD2 Snaps to display detailed pipeline execution statistics about the data loaded to target tables on a Databricks Lakehouse Platform instance. In addition to the individual counts of rows inserted, updated, and deleted that the ELT Merge Into Snap covers, the ELT SCD2 Snap also reports the count of source rows rejected. Upgraded the JDBC driver support for Snowflake to snowflake-jdbc-3.13.33.jar. Updated the ELT Snowflake Account to append the connection parameter Fixed a null pointer exception so no 5XX errors can occur if you download non-existent query details from the Pipeline Execution Statistics of an ELT (write-type) Snap. ELT Insert-Select, ELT Merge Into, and ELT SCD2 Snaps show the following statistics on execution. Records Added Records Updated Records Deleted The ELT Create View Snap enables you to create a new view when the view does not exist in the target database and/or schema or if the view already exists in the database and/or schema, and you choose to drop the existing view and re-create it. Enhancements The ELT Insert-Select Snap is more flexible and easier to use, especially if the number of columns in your source data set is very large. You can choose to update values only in a subset of columns in the target table. The ELT Execute Snap can retrieve and execute SQL queries from the upstream Snap's output when referenced in the SQL Statement Editor using the Expression language (with the Expression button enabled). The ELT Load Snap can infer the schema from the source files in Amazon S3, ADLS Gen2, Microsoft Azure Blob Storage, or Google Cloud Storage location and use it to create, overwrite, and append the target table in your Snowflake instance with the source data. The source files can be in the AVRO, CSV, JSON, ORC, or PARQUET format. Learn more at Automatic Schema Inference with ELT Load Snap. Target Table Name in the following Snaps supports retrieving editable views with the table names from the selected target schema: The pivot values in the ELT PivotSnap turns dynamic when you select Enable dynamic pivot values. The following field settings are added as part of this dynamic pivot values feature: Filter Predicate List: A field set to filter the predicate list of the pivot values. Pivot Values Filter: Condition required to filter the pivot values. Boolean Operator: Predicate condition type through AND or OR Boolean operators Sort Order: Sorting order of the pivot values. You can specify the type of Microsoft Azure external storage location (source)—an Azure Data Lake Gen2 or a Blob Storage—to access your source data using the Storage Integration type of authentication and load it to your target Snowflake instance. Enhanced the ELT Snap Pack to support the latest JDBC drivers across CDWs—Azure Synapse, BigQuery, DLP, Redshift, and Snowflake. See Configuring ELT Database Accounts or the respective Account page for the exact versions. Enhanced the ELT Pivot Snap to make the Value List field dynamic. Enhanced the ELT Snowflake Account with support for Key Pair Authentication. Enhanced the ELT SCD2 Snap: To include a new option Overwrite existing table in the Target Table action field. To display the final SQL query in its output preview upon Pipeline validation. Enhanced the end Snap SQL query in the ELT Insert Select Snap’s preview output to display the Fixed an issue with ELT Insert Select and ELT Merge Into Snaps where they cause the Pipeline to fail when the specified target table does not exist. After this fix, the Snaps create a new target table if it does not exist during Pipeline validation. The new table that is created will not be dropped in the event of a subsequent/downstream Snap failing during validation. Fixed the issue with ELT Load Snap where the Snap caused an SQL exception— The ELT Insert Select, ELT Merge Into, ELT Load, and the ELT SCD2 Snaps now run successfully even when the specified target table does not exist. These Snaps now create a new target table if it does not exist during Pipeline validation. The new table thus created will not be dropped in the event of a subsequent/downstream Snap failure during validation. 4.29patches16287 Fixed an issue with the ELT SCD2 Snap where the Snap was rounding off decimal values to the nearest integer—the value 57.601000000000 in the source table was written to the target table as 58.000000000. Enhanced the ELT Database Account to support OAuth2-based authentication on the target Snowflake database. Enhanced the ELT Select, ELT Insert Select, ELT SCD2, ELT Merge Into, and ELT Load Snaps to display suggestions on the Schema Name field based on the Default Database Name provided in the Snap Account configuration when the Database Name is not specified in the respective Snap. Improved usability of the suggestions features for these Snaps by making them case-insensitive. For example, typing default in the Schema Name field displays both default and DEFAULT, if they co-exist. You do not need to type DEFAULT to invoke and select the schema name DEFAULT from the suggestions list. Enhanced the ELT SCD2 Snap to address different feature requests and issues raised by multiple customers. These changes provide more flexibility in configuring your SCD2 operations using this Snap. Removed Check for nulls and duplicates in the source field and added two dropdown lists - Null Value Behavior and Invalid Row Handling. Made the following items in the Meaning field of the Target Table Temporal Fields fieldset mandatory while making the Invalid historical rows flag optional. Current row Historical row Enhanced the ELT Aggregate and ELT Window Functions Snaps to support the following functions across all supported CDWs: KURTOSIS MODE SKEW Enhanced the ELT Aggregate Snap to support the following GROUP BY features across all supported CDWs: Group by Cube Group by Grouping Sets Group by Rollup Automatic GROUP BY for all input columns. Fixed an issue with ELT Merge Into Snap where the Snap erroneously modified the target table column name when the column name contained the target table name. Fixed an issue in ELT SCD2 Snap where the Snap causes incorrect results with Snowflake targets, when: The Historical Row End Date value is provided. Nulls and Invalid rows are recognized, but one or more start dates in the source are null. Fixed the issue in ELT Transform Snap where the Output Schema of the Snap does not populate all the column names from its Input Schema. Fixed the issue with ELT Merge Into Snap where the Snap erroneously modified the target table column name when it contained the target table name, due to a misinterpretation of the target table name aliases. Updated ELT SCD2 Snap to address different feature requests and issues raised by multiple customers. These changes provide more flexibility in configuring your SCD2 operations using this Snap. Removed Check for nulls and duplicates in source field and added two dropdown lists - Null Value Behavior and Invalid Row Handling. Refer to the ELT SCD2 scenarios to learn more. Introduced a new Snap ELT Router to enable routing input SQL queries into multiple output views based on the given conditional expressions. Subquery Pushdown Optimization: SnapLogic now optimizes SQL queries before they are passed to the CDW to ensure the queries are performant and cost-efficient in the respective CDW. An SQL subquery means a query inside a query. Pushdown optimization refers to rewriting these incremental (nested) SQL queries produced in your ELT Pipeline to form a more optimal/performant version. Enhanced all Snaps in the ELT Snap Pack with the ability to push down optimized SQL subqueries along the ELT Pipelines. See Subquery Pushdown Optimization (SPDO) for details. Introduced the following new ELT Snaps: ELT Case Expression: Snap to return the action to perform on an event based on a list of events and respective expected actions. ELT Coalesce: Snap to return the first non-NULL value from a list of arguments. ELT Conditional Functions: Snap to perform unary and binary conditional operations on data. ELT Math Functions: Snap to perform mathematical—arithmetic, logarithmic, trigonometric, exponent, root, rounding, and truncation—operations on data. Enhanced all the expression-enabled fields in ELT Snaps to display suggestions from the Input Schema (emanating from the upstream Snaps) in addition to the existing standard SQL expressions and functions list. Enhanced the ELT Aggregate Snap to support: HAVING clause within GROUP BY clause, when the WHERE clause cannot be used. GROUP BY ROLLUP. New aggregate functions for DLP: ANY, SOME, KURTOSIS, and STDDEV. Enhanced the ELT Load Snap to support loading data into BigQuery targets from S3 buckets and Redshift CDW. These load operations use the BigQuery Data Transfer Service (DTS) client libraries and are carried out in asynchronous mode. Enhanced the ELT Database Account to support these operations. Enhanced the ELT Load and ELT Insert Select Snaps with a new fieldset Table Options List to support defining the Table Options for creating a new table in your target CDW. Enhanced the ELT Select Snap to support Common Table Expression (CTE)-based SQL queries that contain a WITH clause inside the SQL Query Editor field, when your target CDW is Azure Synapse. 427patches13923 Fixed the issue with ELT SCD2 Snap where the Snap did not equate null values in the corresponding cause-historization rows of both the source and target tables (with no other changes to data in the remaining fields) as the same and produced duplicate rows in the target table, as a result. After this fix, the Snap does not cause any new duplicate rows in the target table. Fixed the issue with ELT Load Snap where the Snap fails with the error Database encountered an error during Bulk Load process when you specify a CSV file to load data from, with the Load Action as Alter Table. The Snap now performs the specified ALTER TABLE actions—ADD/DROP columns—and loads the data into the target table accordingly (without the need to manually modify the source or target tables beforehand). Fixed the issue with ELT SCD2 Snap where the Snap failed when you define more than one TargetTable Natural Key in the Snap configuration to load SCD2 data into the target CDW instance. Fixed an issue with the ELT SCD2 Snap where the Snap failed to update the previous current rows in the target SCD2 table to historical rows when you define an End Date of Historical Row in Target Table Temporal Fields. Fixed an issue with the ELT SCD2 Snap where the Snap failed to insert new rows when you define the values that exist in the most recent historical rows of the target SCD2 table as the cause-historization values. Fixed the issue with ELT Transform Snap where the Snap does not omit the source columns marked for removal from the output view—using an empty Target Path for one or more columns selected in the Expression field of the Snap’s Mapping Table. See Using Empty Target Paths to Omit Rows from the Snap Output to understand how to perform this operation. 427patches13030 Fixed the following issues with the ELT SCD2 Snap: Where the Snap failed to get the right data type due to column name case mismatches between what is used in the Snap and what is actually used in the Azure Synapse tables (returned by the JDBC driver). You no longer need to type the column names in the exact case that Azure Synapse expects. The Snap failed with the error— The Snap failed with the error— Where the Snap failed due to lack of required access privileges on the target database (for example, create table rights to create temporary tables as needed). The Snap now runs the input SQL statement and the elaborate sub-queries instead of attempting to create a temporary table in such scenarios. Fixed the issue with the ELT SCD2 and ELT Load Snaps that fail to perform an add/drop operation involving multiple columns on a Redshift target table. Enhanced the ELT Aggregate Snap to support COUNT_IF aggregate function for Redshift and Azure Synapse target databases. Enhanced the ELT Database Account connecting to a Databricks Lakehouse Platform (DLP) to support two new options - Optimize Write and Auto Compact for creating/replacing a table using any of the ELT Insert Select, ELT Merge Into, ELT Load, and ELT SCD2 Snaps. Updated the Expressions and Functions Supported for ELT in the Snap and Account configuration sections. This list is common to all target CDWs supported. You can use these expressions to define your Snap or Account settings with the Expression symbol = enabled, where available. Enhanced the ELT Load Snap to ensure that the Snap uses the default S3 Folder name specified in the Snap's account to accurately resolve the defined File Name Pattern. Enhanced the ELT Select and ELT Execute Snaps to allow SQL comments inside the SQL Query Editor and SQL Statement Editor fields respectively. Enhanced the ELT Transform Snap to display the exact data types of fields listed in the Input Schema and Target Schema in case of Azure Synapse. Enhanced the ELT Merge Into Snap to to support MERGE INTO ALL option and automatic source table aliasing. Enhanced the ELT Load Snap by adding the Source File to Target Table Columns Map field set to enable mapping of columns between the source file and the target table. In case of Databricks Lakehouse Platform (DLP) the Snap is enhanced further to support delimiters other than the comma in the source CSV files. Enhanced the ELT Join Snap to support Left Anti and Left Semi join types in BigQuery though it does not natively support these join types. Enhanced the underlying load mechanism for ELT SCD2 Snap from Insert-and-Update mode to Merge-into mode to substantially improve the Snap's performance while working with large and very large volumes of data (upwards of 500M rows or 50GB size). Fixed an issue with ELT Transform Snap where it may display incorrect schema only in the previews (during Pipeline validation). This occurs especially when the incoming SQL statement (defined in the SQL Statement Editor of the upstream Snap) contains one or more of the WHERE, GROUP BY, HAVING , ORDER BY, LIMIT, LIMIT followed by OFFSET, and SAMPLE clauses. Here are a few Pipeline scenarios where this issue might surface: … > ELT Select (with one or more of these clauses used in the SQL Statement Editor) > ELT Transform > … … > ELT Select (with or without using the SQL Statement Editor) > ELT Filter > ELT Transform > … Fixed an issue where the ELT Load Snap connecting to a Databricks Lakehouse Platform (DLP) instance failed to perform the load operation. Ensure that you provide a valid DBFS Folder path in the Snap's account settings as the Snap requires this folder path. Latest Intermittent null-pointer exceptions in the ELT Load Snap on Databricks Lakehouse Platform (DLP). The query contains a LIMIT clause on a Snowflake, Redshift or Databricks Lakehouse Platform target instance: The SQL query created during Pipeline validation includes an additional LIMIT clause, for example: Enhanced the ELT Snap preview to support the following Snowflake data types: array, object, variant, and timestamp. The Snaps convert the values to hexadecimal (HEX) equivalents—the default setting for the session parameter BINARY_OUTPUT_FORMAT in Snowflake. See Session Parameters for Binary Values for more information. If this setting is different from hexadecimal (such as base64) in the Snowflake table, the Snaps still convert the values to hexadecimal equivalents for rendering them in the Snap preview. The ELT Database account is now mandatory for all Snaps in the ELT Snap Pack. Breaking change Starting with the 4.26 release, all Snaps in the ELT Snap Pack (except the ELT Copy Snap) require an account to connect to the respective target database. Your existing Pipelines that do not use an account may fail. We recommend you to associate an ELT Database Account to each of the ELT Snaps (except ELT Copy Snap) for your Pipelines. Updated the ELT SCD2 Snap to replace End date of historical row option in the Meaning field of Target Table SCD2 Fields field set with End Date of Current Row. Breaking change This may cause the existing Pipelines to fail as the End date of historical row option no longer exists. You need to make the following update in the ELT SCD2 Snap's settings across your Pipelines after upgrading your Snap Pack to this patch: No changes are needed to your existing Pipelines. Behavior Change The behavior of ELT Load Snap for Load Action during Pipeline validation across the supported databases is as follows: Append rows to existing table: Does not append the data from the source files into the target table. Overwrite existing table: Does not overwrite the data. Drop and Create table: Does not drop the target table even if it exists, but the Snap creates a new target table if a table does not exist. Alter table: Does not modify the schema of the target table. Updates the Snap Pack with the following features: 4.22 main6403 Stable Introduces the ELT Snap Pack that provides you with the Extract, Load, and Transform (ELT) capabilities. Use the following Snaps to build SQL queries that are executed in the Snowflake database:Snap Pack History
Date May 2024 main26341 Stable Updated and certified against the current SnapLogic Platform release. February 2024 436patches25953 Latest February 2024 main25112 Stable Updated and certified against the current SnapLogic Platform release. November 2023 435patches24461 Latest Fixed an issue with the ELT Merge Into Snap that caused inaccurate aliases and table identifiers in its generated SQL statement. November 2023 435patches23671 Latest and Stable November 2023 main23721 Stable August 2023 main22460 Stable application
in its JDBC URL with the value SnapLogic_iPaaS
.May 2023 N/A Stable May 2023 main21015 Stable Upgraded with the latest SnapLogic Platform release. February 2023 432patches20978 Latest Fixed an issue with the ELT SCD2 Snap where the COLLATE
column constraint (used in the new target table definition for Snowflake) resulted in an incorrect syntax internally, causing the pipeline to fail. The load operation succeeds with this fix.February 2023 main19844 Stable Upgraded with the latest SnapLogic Platform release. November 2022 431patches19240 Latest November 2022 main18944 Stable September 2022 430patches18196 Latest New Snap
August 2022 main17386 Stable Upgraded with the latest SnapLogic Platform release. 4.29-Patch 429patches16665 Latest CREATE TABLE...
or the DELETE
/DROP TABLE
statements to be run before the query that inserts/loads data into a new table in the Snowflake target CDW.[Simba][SparkJDBCDriver](500051) ERROR processing query/statement
when reading from a CSV file in S3 mount point on DBFS in the case of a DLP target instance.4.29-Patch Latest 4.29 main15993 Stable 4.28-Patch 428patches15638 Latest 4.28-Patch 428patches15290 Latest 4.28 main14627 Stable 4.27-Patch Latest 4.27-Patch 427patches13539 Latest 4.27-Patch Latest start_date does not exist—
while writing SCD2 data to a Redshift table column start_date that is specified as the Start Date of Current Row in the Target Table Temporal Field(s) field set.Reference 'END_DATE' is ambiguous—
while merging SCD2 updates into DLP tables.4.27 main12833 Stable
Eliminating duplicates with COUNT_IF aggregate function
Note: Selecting the Eliminate Duplicates checkbox while using COUNT_IF aggregate function does not eliminate duplicate records in case of Snowflake and BigQuery databases, as there is no native support for this feature. However, for Redshift, Azure Synapse and Databricks Lakehouse Platform (DLP), the duplicates are eliminated from the list of records when you select this checkbox for COUNT_IF function.4.26-Patch 426patches12534 Latest 4.26-Patch 426patches12021 Latest 4.26-Patch 426patches11646 Latest 4.26-Patch 426patches11323 4.26-Patch 426patches11262 Latest SELECT * FROM "STORE_DATA"."ORDERS" LIMIT 10 LIMIT 990
. SELECT * FROM "STORE_DATA"."ORDERS" LIMIT 10 OFFSET 4 LIMIT 990
. 4.26 main11181 Stable 4.25-Patch 425patches10017 Latest
4.25-Patch 425patches9725 Latest 4.25 main9554 Stable 4.24-Patch 424patches8793 Latest 4.24 main8556 Stable 4.23 main7430 Stable Introduces the following Snaps:
See Also
- No labels