ELT Transform
An account for the Snap
You must define an account for this Snap to communicate with your target CDW. See the account page specific to your target CDW for more information.
Overview
Use this Snap to build transformation-based SQL queries on the input tables. Transformations include renaming columns, selecting a subset of columns instead of all the columns, and applying row-based SQL expressions. Use the Input Schema and Output Schema lists displayed on both sides of the Mapping Table to drag and drop entities from the schemas into the respective columns of the Mapping Table. This Snap also allows you to preview the result of the output query, so you can validate the modified query before running the Pipeline.
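For a sense of what such a transformation-based query looks like, here is a minimal sketch. The table name "TEST_DATA"."ORDERS" and the column names are illustrative assumptions, not part of this article:

```sql
-- Illustrative transform query: renames a column, keeps a subset of columns,
-- and applies a row-based expression. Table and column names are hypothetical.
SELECT
    CUST_CODE        AS CUSTOMER_CODE,
    ORD_AMOUNT + 100 AS ORD_AMOUNT_ADJUSTED
FROM (
    SELECT * FROM "TEST_DATA"."ORDERS"
) AS src
```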
Prerequisites
None.
Limitations
- This Snap does not display the list of schema entities in the following scenarios:
- Until the first validation of the Pipeline.
- When the downstream Snaps are not configured to point to valid database tables/columns.
- The ELT Snap Pack does not support the Legacy SQL dialect of Google BigQuery. We recommend that you use only BigQuery's Standard SQL dialect in this Snap, as illustrated in the sketch below.
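For reference, a minimal sketch of the dialect difference; the project, dataset, and table names are hypothetical:

```sql
-- Legacy SQL (not supported by the ELT Snap Pack): square-bracket table references
SELECT ORD_AMOUNT FROM [my-project:my_dataset.ORDERS]

-- Standard SQL (use this dialect): backtick-quoted, dot-separated table references
SELECT ORD_AMOUNT FROM `my-project.my_dataset.ORDERS`
```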
Known Issue
None.
Snap Input and Output
Input/Output | Type of View | Number of Views | Examples of Upstream and Downstream Snaps | Description |
---|---|---|---|---|
Input | Document | | | The SQL query in which you want to add the transformations. |
Output | Document | | | The incoming SQL query with the specified transformations. |
Snap Settings
SQL Functions and Expressions for ELT
You can use the SQL Expressions and Functions supported for ELT to define your Snap or Account settings with the Expression symbol (=) enabled, where available. This list is common to all supported target CDWs. You can also use other expressions/functions that your target CDW supports.
Parameter Name | Data Type | Description | Default Value | Example |
---|---|---|---|---|
Label | String | Specify a name for the Snap. You can modify this to be more specific, especially if you have more than one of the same Snap in your pipeline. | ELT Transform | Map Data |
Get preview data | Checkbox | Select this checkbox to include a preview of the query's output. The Snap performs limited execution and generates a data preview during Pipeline validation. In the case of ELT Pipelines, only the SQL query flows through the Snaps, not the actual source data. Hence, the preview data for a Snap is the result of executing the SQL query that the Snap has generated in the Pipeline. The number of records displayed in the preview (upon validation) is the smaller of the following:
Rendering Complex Data Types in Databricks Lakehouse Platform: Based on the data types of the fields in the input schema, the Snap renders complex data types like map and struct as the object data type and array as the array data type. It renders all other incoming data types as-is, except that values in binary fields are displayed as a base64-encoded string with the string data type. | Not selected | Selected |
Pass through | Checkbox | Select this checkbox to include the columns from the upstream Snap's query projection along with the columns specified in the Mapping Table. | Not selected | Selected |
Input Schema | Use this field set to select or drag and drop fields available in the input schema coming from the upstream Snap into the Mapping Table and to define the expression needed for the transformation. This field set contains the following fields.
Rendering Complex Data Types in Databricks Lakehouse Platform: Based on the data types of the fields in the input schema, this section of the Snap renders the preview of complex data types like map and struct as the object data type and array as the array data type. It renders all other incoming data types as-is, except that values in binary fields are displayed as a base64-encoded string with the string data type. | |||
Select All | Hyperlink | Click this hyperlink to select (the checkboxes of) all the fields from the input schema for defining your transformation criteria in the Mapping Table. Drag and drop the selected fields to populate their field names in the Expression column. | N/A | N/A |
Deselect All | Hyperlink | Click this hyperlink to clear all the selected checkboxes so that you can restart selecting the fields needed to define your transformation criteria. | N/A | N/A |
(Search/Find) | String | Start entering a field name to filter the list by the entered keyword and select the fields needed for defining the transformation. | N/A | N/A |
All | Drop-down list | Change the selection in this drop-down list to filter the list of fields displayed in the Input Schema. The available options are:
| All | Selected |
Mapping Table | This field set enables you to specify the transformations that you want to perform on the columns/records in the source table. Specify each transformation in a separate row. Click + to add a new row. This field set consists of the following fields (see the sketch after the Snap Settings table for the query that a sample mapping produces):
The output query contains only those columns that are specified here unless the Pass through checkbox is also selected. | |||
Expression | String/Expression | Specify the column, or the transformation operation that you want to apply to that column. | N/A | EMPLOYEE, ORDER_NUMBER + 1 |
Target Path | String | Specify the name to be assigned to the column. If the Target Path field is empty for a specified expression, the respective column is omitted from the output. See the Example below. | N/A | EMPLOYEE, ORDER_NUMBER_NEW |
Output Schema | Use this field set to select or drag and drop fields available in the output schema coming from the downstream Snap into the Mapping Table and to define the expression needed for the transformation. Output schema fields are populated upon validation of the Pipeline, and this depends on the downstream Snap's configuration. This field set contains the following fields.
Rendering Complex Data Types in Databricks Lakehouse Platform: Based on the data types of the fields in the input schema, this section of the Snap renders the preview of complex data types like map and struct as the object data type and array as the array data type. It renders all other incoming data types as-is, except that values in binary fields are displayed as a base64-encoded string with the string data type. | |||
Select All | Hyperlink | Click this hyperlink to select (the checkboxes of) all the fields from the output schema for mapping with the corresponding input field expressions (your transformation criteria) in the Mapping Table. Drag and drop the selected fields to populate their field names in the Target Path column. | N/A | N/A |
Deselect All | Hyperlink | Click this hyperlink to clear all the selected checkboxes so that you can restart selecting the output fields needed to define your transformation criteria. | N/A | N/A |
(Search/Find) | String | Start entering a field name to filter the list by the entered keyword and select the fields to include in the transformation criteria. | N/A | N/A |
All | Drop-down list | Change the selection in this drop-down list to filter the list of entries displayed in the Output Schema. The available options are:
| All | Selected |
Input Preview | Display-only | This section of the Snap's Settings displays the preview (partial result) of the incoming data from the previous Snap in the Pipeline. | N/A | N/A |
Output Preview | Display-only | This section of the Snap's Settings displays the preview (partial result) of applying the transformation criteria defined in the Mapping table. | N/A | N/A |
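As a minimal sketch of how a Mapping Table entry translates into the generated query, assume an Expression of ORDER_NUMBER + 1 with a Target Path of ORDER_NUMBER_NEW, an upstream ELT Select query on a hypothetical "TEST_DATA"."ORDERS" table, and the Pass through checkbox selected:

```sql
-- Pass through keeps the upstream projection (*); the Mapping Table entry
-- adds the transformed column ORDER_NUMBER + 1 under the name ORDER_NUMBER_NEW.
SELECT
    *,
    ORDER_NUMBER + 1 AS ORDER_NUMBER_NEW
FROM (
    SELECT * FROM "TEST_DATA"."ORDERS"
) AS src
```

Without Pass through, only ORDER_NUMBER_NEW would appear in the projection.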
Troubleshooting
None.
Examples
Retrieving Specified Columns from a Table
We need a SELECT query with only the columns that we want to retrieve. This example shows how we can use the ELT Transform Snap to achieve this result.
First, we use the ELT Select Snap to build a query to retrieve all records from the target table.
Upon execution, this Snap builds the query as shown below:
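The exact query depends on your source table. As a minimal sketch, assuming a hypothetical "TEST_DATA"."ORDERS" table, the ELT Select Snap builds a query of this shape:

```sql
-- Query built by the upstream ELT Select Snap (table name is illustrative)
SELECT * FROM "TEST_DATA"."ORDERS"
```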
The table has several columns, but we want to retrieve only the CUST_CODE and ORD_AMOUNT columns. Therefore, we add the ELT Transform Snap and configure it as shown below:
Based on this configuration, the ELT Transform Snap builds a query as shown below:
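Continuing the sketch above with the same assumed table name, the transform query takes roughly this shape:

```sql
-- Only the columns specified in the Mapping Table appear in the projection
SELECT
    CUST_CODE,
    ORD_AMOUNT
FROM (
    SELECT * FROM "TEST_DATA"."ORDERS"
) AS src
```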
We can also add an ELT Insert-Select Snap downstream to write the result of this query into another table.
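A hedged sketch of such a downstream write; the target table name "TEST_DATA"."ORDERS_SUMMARY" is hypothetical:

```sql
-- The ELT Insert-Select Snap wraps the transform query in an INSERT statement
INSERT INTO "TEST_DATA"."ORDERS_SUMMARY" (CUST_CODE, ORD_AMOUNT)
SELECT
    CUST_CODE,
    ORD_AMOUNT
FROM (
    SELECT * FROM "TEST_DATA"."ORDERS"
) AS src
```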
Using Empty Target Paths to Omit Fields from the Snap Output
We can use the ELT Transform Snap to retrieve information from specific columns in a database table or the input schema instead of returning data from all the available columns. This example shows how we can use the ELT Transform Snap to achieve this result.
First, we configure the ELT Select Snap to build a query to retrieve all records from the "TEST_DATA"."ORG1" table.
Next, we use the ELT Transform Snap and configure the Expressions and Target Paths in the Mapping Table as shown below. Note that we intentionally leave the Target Path blank for the DEPT and LOCAL fields so that these columns are not retrieved in the query. We also ensure that Pass through is not selected, to avoid duplicate fields in the query when the same fields are present in the Mapping Table.
Upon validation, the ELT Transform Snap builds a query as shown below. Note that DEPT and LOCAL fields do not appear in the query.
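A minimal sketch of the resulting query, assuming the Mapping Table keeps two hypothetical fields, EMP_ID and EMP_NAME, from the "TEST_DATA"."ORG1" table:

```sql
-- DEPT and LOCAL have empty Target Paths, so they are omitted from the projection
SELECT
    EMP_ID,
    EMP_NAME
FROM (
    SELECT * FROM "TEST_DATA"."ORG1"
) AS src
```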
Downloads
Important Steps to Successfully Reuse Pipelines
- Download and import the Pipeline into SnapLogic.
- Configure Snap accounts as applicable.
- Provide Pipeline parameters as applicable.