ELT Coalesce

Overview

You can use this Snap to retrieve the first non-NULL expression among a set of arguments. The Snap returns NULL if all its arguments are NULL. If you expect a query to return NULL values for certain functions or columns, you can use this Snap to replace the NULLs with some other value. This is typically useful with aggregate functions that return NULL instead of zero when they have no rows to evaluate.
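The aggregate case can be tried outside the Snap. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for the target database; the orders table and amount column are hypothetical names chosen for illustration, and other databases may differ in detail.

```python
import sqlite3

# In-memory SQLite database, used only as a stand-in for a real target database.
conn = sqlite3.connect(":memory:")

# Hypothetical table with no rows, so the aggregate has nothing to evaluate.
conn.execute("CREATE TABLE orders (amount REAL)")

# SUM over zero rows yields NULL (None in Python), not zero.
raw_sum = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]

# COALESCE replaces that NULL with a fallback value, here 0.
safe_sum = conn.execute(
    "SELECT COALESCE(SUM(amount), 0) FROM orders"
).fetchone()[0]

print(raw_sum, safe_sum)  # prints: None 0
```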

COALESCE Arguments            Result
coalesce(NULL, 2, NULL)       2
coalesce(1, NULL, NULL)       1
coalesce(NULL, NULL, 3)       3
coalesce(NULL, NULL, NULL)    NULL
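The results in the table above can be reproduced with any SQL engine that supports COALESCE. As a quick check, the sketch below evaluates the same four expressions with Python's built-in sqlite3 module, which is an assumption made for convenience and is not one of the Snap's supported targets.

```python
import sqlite3

# In-memory SQLite database used as a convenient SQL evaluator.
conn = sqlite3.connect(":memory:")

# The four argument lists from the table above.
cases = [
    "COALESCE(NULL, 2, NULL)",
    "COALESCE(1, NULL, NULL)",
    "COALESCE(NULL, NULL, 3)",
    "COALESCE(NULL, NULL, NULL)",
]

# Evaluate each expression; SQL NULL is returned as Python None.
results = [conn.execute(f"SELECT {expr}").fetchone()[0] for expr in cases]

print(results)  # prints: [2, 1, 3, None]
```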

Snap Type

The ELT Coalesce Snap is a TRANSFORM-type Snap that returns the first non-NULL value among the specified arguments for each row of a dataset.

Prerequisites

  • Valid accounts and access permissions to connect to the following:

    • Source: AWS S3, Redshift, Azure Cloud Storage, or Google Cloud Storage

    • Target: Snowflake, Redshift, Azure Synapse, Databricks Lakehouse Platform, or BigQuery

Limitations

None.

Known Issues

None.

Snap Input and Output

Input

  • Type of View: Document

  • Number of Views: Min: 1, Max: 1

  • Examples of Upstream and Downstream Snaps: ELT Insert Select, ELT Merge Into, ELT Copy

  • Description: The SQL query that resolves to a document containing the list of arguments for the COALESCE operation.

Output

  • Type of View: Document

  • Number of Views: Min: 1, Max: 1

  • Examples of Upstream and Downstream Snaps: ELT Select, ELT Filter, ELT Join, ELT Insert Select

  • Description: The modified SQL query with the COALESCE operation included.

Snap Settings

  • Click the = (Expression) button in the Snap's configuration, if available, to define the corresponding field value using expression language and Pipeline parameters. 

  • Field names marked with an asterisk (*) in the table below are mandatory.

Field Name

Type

Description

Label*

String

Specify a unique label for the Snap.

Default Value: ELT Coalesce
Example: Coalesce for AVG values

Get preview data

Checkbox

Select this checkbox to include a preview of the query's output. The Snap performs limited execution and generates a data preview during Pipeline validation.

In ELT Pipelines, only the SQL query flows through the Snaps, not the actual source data. Hence, the preview data for a Snap is the result of executing the SQL query that the Snap has generated in the Pipeline.

The number of records displayed in the preview (upon validation) is the smaller of the following:

  • Number of records available upon execution of the SQL query generated by the Snap.

  • The value set in the Preview Data Count setting (default is 50 records).

Rendering Complex Data Types in Databricks Lakehouse Platform

Based on the data types of the fields in the input schema, the Snap renders complex data types such as map and struct as the object data type and array as the array data type. It renders all other incoming data types as-is, except binary fields, whose values are displayed as base64-encoded strings with the string data type.

Default Value: Not selected
Example: Selected

Pass through

Checkbox

Select this checkbox to include the original input data in the output document along with the results of the COALESCE operation.

Default Value: Not selected
Example: Selected

Coalesce Alias

String

Enter the column alias name for storing the results of the COALESCE operation. Leave this field blank to allow the Snap to automatically define an alias name for the column.

Default Value: None
Example: NewCoaleas

Coalesce Arguments

Specify the list of arguments, based on the input data schema, for performing the COALESCE operation. Specify each argument in a new row. Click the add-row icon to add a row.

This field set consists of the following fields:

  • Coalesce Argument

Coalesce Argument

String/Expression

Enter an argument for performing the COALESCE operation. Define only one argument in this field.

Default Value: None
Example: ORDER_CNT, ORDERS_AVG_AMT
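To show how these settings could combine into a query, here is a rough sketch. The helper build_coalesce_query, the ORDERS_SUMMARY table, the src subquery alias, and the fallback alias coalesce_result are all hypothetical; the SQL that the Snap actually generates is not shown in this document and may differ.

```python
def build_coalesce_query(upstream_sql, arguments, alias=None, pass_through=False):
    """Hypothetical helper: wrap an upstream query with a COALESCE column.

    This only illustrates the settings; it is not the Snap's implementation.
    """
    if not arguments:
        raise ValueError("at least one Coalesce Argument is required")

    # Placeholder fallback alias; the Snap's automatic alias name is not
    # documented here, so this value is an assumption.
    column_alias = alias or "coalesce_result"

    # One row in the Coalesce Arguments field set maps to one argument.
    coalesce_expr = f"COALESCE({', '.join(arguments)}) AS {column_alias}"

    # Pass through keeps the original input columns next to the new column.
    select_list = f"src.*, {coalesce_expr}" if pass_through else coalesce_expr

    return f"SELECT {select_list} FROM ({upstream_sql}) AS src"


query = build_coalesce_query(
    "SELECT ORDER_CNT, ORDERS_AVG_AMT FROM ORDERS_SUMMARY",  # hypothetical input
    ["ORDER_CNT", "ORDERS_AVG_AMT"],
    alias="NewCoaleas",
    pass_through=True,
)
print(query)
```

With Pass through cleared, the same call would select only the aliased COALESCE column in this sketch.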

Troubleshooting

Error         Reason         Resolution
None.

Examples

Applying COALESCE Operation on a Dataset

In this Pipeline example, we apply the COALESCE operation on a dataset using the ELT Coalesce Snap.

To begin, we use an ELT Select Snap to read the dataset from the COALESCE_SRC table in a source database (Redshift in this example). You can see this dataset in JSON format in this Snap's preview shown below.

[Images: ELT Select Snap and its Output preview]

We connect the ELT Coalesce Snap to the ELT Select Snap to apply the COALESCE operation on this dataset.

[Images: ELT Coalesce Snap and its Output preview]

In this Snap, we provide an alias name for the COALESCE operation (test_coalesce) and define the arguments, that is, the columns in the dataset to be considered for the operation, in the Coalesce Arguments field set. In the Snap Output (preview) image, we can see the results of the COALESCE operation on the selected arguments (I1, I2, and I3, in this case). The value returned for each row is the first non-NULL value among these columns.

We write these values to a different table, COALESCE_TC2, in the Redshift database using an ELT Insert Select Snap. Here, we select the Overwrite checkbox to make sure that the data is written to a fresh (empty) table.
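The flow of this Pipeline can be approximated in plain SQL. The sketch below uses Python's built-in sqlite3 module in place of Redshift; the table names, column names, and alias come from the example, while the sample rows and the choice to write only the test_coalesce column are assumptions made for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the Redshift database.
conn = sqlite3.connect(":memory:")

# Source table named as in the example; the rows are made-up sample data.
conn.execute("CREATE TABLE COALESCE_SRC (I1 INTEGER, I2 INTEGER, I3 INTEGER)")
conn.executemany(
    "INSERT INTO COALESCE_SRC VALUES (?, ?, ?)",
    [(None, 2, None), (1, None, None), (None, None, 3), (None, None, None)],
)

# Rough equivalent of Overwrite: drop any existing target so that the
# results land in a fresh table.
conn.execute("DROP TABLE IF EXISTS COALESCE_TC2")

# Select + Coalesce + Insert Select collapsed into a single statement.
conn.execute(
    "CREATE TABLE COALESCE_TC2 AS "
    "SELECT COALESCE(I1, I2, I3) AS test_coalesce FROM COALESCE_SRC"
)

rows = [
    r[0]
    for r in conn.execute("SELECT test_coalesce FROM COALESCE_TC2 ORDER BY rowid")
]
print(rows)  # prints: [2, 1, 3, None]
```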

Download this Pipeline

Downloads

File                      Modified
ELT_Coalesce_FEP1.slp     Feb 04, 2022 by Anand Vedam
