
Databricks - Delete




Overview

You can use this Snap to execute a Databricks SQL DELETE statement that removes the rows matching specific conditions. Use this Snap with caution: if you run it without specifying a WHERE condition for the DELETE statement, it empties the table.
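The difference between a conditional and an unconditional DELETE can be sketched as follows. This is a minimal illustration that uses Python's built-in sqlite3 module as a stand-in for Databricks; the cust_list table and its columns are hypothetical, and only the general SQL DELETE semantics are shown, not the Snap's internals.

```python
import sqlite3

# SQLite stands in for Databricks here; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cust_list (id INTEGER, last_login_date TEXT)")
conn.executemany(
    "INSERT INTO cust_list VALUES (?, ?)",
    [(1, "2009-06-30"), (2, "2015-03-12"), (3, "2021-11-05")],
)


def count_rows(connection):
    """Return the number of rows currently in cust_list."""
    return connection.execute("SELECT COUNT(*) FROM cust_list").fetchone()[0]


# With a WHERE condition, only the matching rows are removed.
conn.execute("DELETE FROM cust_list WHERE last_login_date < '2010-01-01'")
rows_after_conditional = count_rows(conn)

# Without a WHERE condition, every remaining row is removed.
conn.execute("DELETE FROM cust_list")
rows_after_unconditional = count_rows(conn)

print(rows_after_conditional, rows_after_unconditional)
```

Running the same condition in a SELECT COUNT(*) query first is a simple way to check how many rows a DELETE would affect.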

Snap Type

The Databricks - Delete Snap is a write-type Snap that deletes rows from a target Databricks Lakehouse Platform (DLP) table.

Prerequisites

  • Valid access credentials to a DLP instance.

  • Valid access to the external source data in one of the following: Azure Blob Storage, ADLS Gen2, DBFS, GCP, AWS S3, or another database (JDBC-compatible).

Support for Ultra Pipelines

Works in Ultra Pipelines

Limitations and Known Issues

None.

Snap Views

Input

  • Format: Document

  • Number of views: Min: 0, Max: 1

  • Examples of upstream Snaps: JSON Generator, Copy, Databricks - Select

  • Description: A JSON document containing the reference to the table and the rows to be deleted.

Output

  • Format: Document

  • Number of views: Min: 0, Max: 1

  • Examples of downstream Snaps: Databricks - Select, JSON Parser

  • Description: A JSON document containing the result of the delete operation on the target table.

Error

Error handling is a generic way to handle errors without losing data or failing the Snap execution. You can handle the errors that the Snap might encounter while running the Pipeline by choosing one of the following options from the When errors occur list under the Views tab. The available options are:

  • Stop Pipeline Execution: Stops the current pipeline execution when the Snap encounters an error.

  • Discard Error Data and Continue: Ignores the error, discards that record, and continues with the rest of the records.

  • Route Error Data to Error View: Routes the error data to an error view without stopping the Snap execution.

Learn more about Error handling in Pipelines.
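The three error-handling options above can be pictured with a small sketch. This is a conceptual model only, not SnapLogic's implementation: the OnError enum, the process function, and the require_table handler are hypothetical names introduced for illustration.

```python
from enum import Enum


class OnError(Enum):
    """Hypothetical stand-ins for the three 'When errors occur' options."""
    STOP = "Stop Pipeline Execution"
    DISCARD = "Discard Error Data and Continue"
    ROUTE = "Route Error Data to Error View"


def process(records, handler, on_error):
    """Apply handler to each record, treating failures according to on_error."""
    output, error_view = [], []
    for record in records:
        try:
            output.append(handler(record))
        except ValueError as exc:
            if on_error is OnError.STOP:
                raise  # the whole run stops at the first error
            if on_error is OnError.ROUTE:
                # keep the failed record, together with the reason, for later inspection
                error_view.append({"error": str(exc), "original": record})
            # OnError.DISCARD: drop the record and continue with the next one
    return output, error_view


def require_table(record):
    """Hypothetical handler that fails when the record names no table."""
    if not record.get("table"):
        raise ValueError("Missing property value")
    return {"table": record["table"], "status": "ok"}


records = [{"table": "Cust_List"}, {}, {"table": "company_employees"}]
discard_out, discard_err = process(records, require_table, OnError.DISCARD)
route_out, route_err = process(records, require_table, OnError.ROUTE)
```

In this model, discarding and routing both let the remaining records through; only routing keeps a trace of the record that failed.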

Snap Settings

  • Asterisk ( * ): Indicates a mandatory field.

  • Suggestion icon: Indicates a list that is dynamically populated based on the configuration.

  • Expression icon: Indicates whether the value is an expression (if enabled) or a static value (if disabled). Learn more about Using Expressions in SnapLogic.

  • Add icon: Indicates that you can add fields in the fieldset.

  • Remove icon: Indicates that you can remove fields from the fieldset.

Field Name

Field Type

Description

Label*

Default Value: Databricks - Delete
Example: Db_Del_Duplicates

String

The name for the Snap. You can modify this to be more specific, especially if you have more than one of the same Snap in your Pipeline.

Database name

Default Value: None.
Example: Cust_DB

String/Expression/Suggestion

Enter the name of the DLP database that contains the table from which the DELETE statement deletes rows.

Table name*

Default Value: None.
Example: Cust_List

String/Expression/Suggestion

Enter the name of the table from which the DELETE statement deletes rows.

Delete Condition (Truncates Table if empty)

Default Value
Example: last_login_date < '2010-01-01'

String/Expression/Suggestion

Specify the condition that the DELETE statement uses to select the rows to delete from the target table. If you leave this field empty, the Snap removes all rows from the table.

Snap Execution

Default Value
Example: Validate & Execute

Dropdown list

Select one of the three modes in which the Snap executes. Available options are:

  • Validate & Execute: Performs limited execution of the Snap, and generates a data preview during Pipeline validation. Subsequently, performs full execution of the Snap (unlimited records) during Pipeline runtime.

  • Execute only: Performs full execution of the Snap during Pipeline execution without generating preview data.

  • Disabled: Disables the Snap and all Snaps that are downstream from it.
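To see how the Database name, Table name, and Delete Condition settings relate to each other, the following sketch composes a DELETE statement from the three values. The build_delete_statement function and its allow_delete_all guard are hypothetical and written for illustration; how the Snap itself builds its SQL is not described in this article.

```python
def build_delete_statement(table, database=None, condition=None, allow_delete_all=False):
    """Compose a DELETE statement from the three settings (illustrative only).

    The values are interpolated as-is, without quoting or escaping, so this
    sketch is only suitable for trusted input.
    """
    if not table or not table.strip():
        raise ValueError("Table name is required")

    # Qualify the table with the database name when one is given.
    if database and database.strip():
        target = f"{database.strip()}.{table.strip()}"
    else:
        target = table.strip()

    if condition and condition.strip():
        return f"DELETE FROM {target} WHERE {condition.strip()}"

    # An empty condition would delete every row, so require an explicit opt-in.
    if not allow_delete_all:
        raise ValueError("Empty delete condition would delete every row")
    return f"DELETE FROM {target}"


statement = build_delete_statement(
    table="Cust_List",
    database="Cust_DB",
    condition="last_login_date < '2010-01-01'",
)
print(statement)
```

The explicit opt-in for an empty condition is a design choice of this sketch; the Snap itself applies no such guard and deletes all rows when the condition is empty.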

Troubleshooting

Error: Missing property value

  • Reason: You have not specified a value for the required field where this message appears.

  • Resolution: Ensure that you specify valid values for all required fields.

Examples

Deleting Employee Information from DLP table

Consider a scenario where we want to delete the records of certain employees from an intermediate data location that runs on DLP. We can achieve this through a Pipeline containing the Databricks - Delete Snap.

We configure the Snap in this Pipeline to delete the employee rows from the company_employees table in our DLP instance if their joining date is before January 1, 2010. The Snap uses an appropriate account.

Upon validation, the Pipeline deletes the rows satisfying the condition specified and returns the status of the operation in the Snap’s output.
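The effect of this configuration can be approximated outside SnapLogic. The sketch below again uses Python's sqlite3 module as a stand-in for DLP; the joining_date column name, the sample rows, and the shape of the returned status document are assumptions made for illustration, not the Snap's actual output format.

```python
import sqlite3


def delete_with_preview(conn, table, condition):
    """Count the matching rows, delete them, and return a status document.

    table and condition are interpolated as-is, so pass trusted values only.
    The deleted count assumes no other writer changes the table in between.
    """
    matching = conn.execute(
        f"SELECT COUNT(*) FROM {table} WHERE {condition}"
    ).fetchone()[0]
    conn.execute(f"DELETE FROM {table} WHERE {condition}")
    conn.commit()
    remaining = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    return {"table": table, "rows_deleted": matching, "rows_remaining": remaining}


# Hypothetical sample data; ISO-formatted dates compare correctly as text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE company_employees (name TEXT, joining_date TEXT)")
conn.executemany(
    "INSERT INTO company_employees VALUES (?, ?)",
    [("Asha", "2008-04-14"), ("Ben", "2009-12-31"), ("Chen", "2012-07-01")],
)

result = delete_with_preview(conn, "company_employees", "joining_date < '2010-01-01'")
print(result)
```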

Download this Pipeline

Downloads

  1. Download and import the Pipeline into SnapLogic.

  2. Configure Snap accounts as applicable.

  3. Provide Pipeline parameters as applicable.


Snap Pack History

Release: August 2024
Snap Pack Version: main27765
Type: Stable
Updates: Upgraded the org.json.json library from v20090211 to v20240303, which is fully backward compatible.

Release: May 2024
Snap Pack Version: 437patches27246
Type: Latest
Updates: Added Databricks - Run Job. This Snap executes a job, checks its status in Databricks, and, based on the job's status, completes or fails the pipeline.

Release: May 2024
Snap Pack Version: 437patches26400
Type: Latest
Updates: Fixed an invalid session handle issue with the Databricks Snap Pack that intermittently triggered an error message when the Snaps failed to connect with Databricks to execute the SQL statement.

Release: May 2024
Snap Pack Version: main26341
Type: Stable
Updates: Updated the Delete Condition (Truncates a Table if empty) field in the Databricks - Delete Snap to Delete condition (deletes all records from a table if left blank) to indicate that all entries will be deleted from the table when this field is blank, but no truncate operation is performed.

Release: February 2024
Snap Pack Version: main25112
Type: Stable
Updates: Updated and certified against the current SnapLogic Platform release.

Release: November 2023
Snap Pack Version: main23721
Type: Stable
Updates: Updated and certified against the current SnapLogic Platform release.

Release: August 2023
Snap Pack Version: main22460
Type: Stable
Updates: Updated and certified against the current SnapLogic Platform release.

Release: May 2023
Snap Pack Version: 433patches21630
Type: Latest
Updates: Enhanced the performance of the Databricks - Insert Snap to reduce the time that validation takes.

Release: May 2023
Snap Pack Version: main21015
Type: Stable
Updates: Upgraded with the latest SnapLogic Platform release.

Release: February 2023
Snap Pack Version: main19844
Type: Stable
Updates: Upgraded with the latest SnapLogic Platform release.

Release: November 2022
Snap Pack Version: main18944
Type: Stable
Updates: The Databricks - Insert Snap now creates the target table only from the table metadata of the second input view when the following conditions are met:

  • The Create table if not present checkbox is selected.

  • The target table does not exist.

  • The table metadata is provided in the second input view.

Release: September 2022
Snap Pack Version: 430patches18305
Type: Latest
Updates: The following fields are added to each Databricks Snap as part of this enhancement:

  • Number of Retries: The number of attempts the Snap should make to perform the selected operation when the Snap account connection fails or times out.

  • Retry Interval (seconds): The time interval in seconds between two consecutive retry attempts.

Release: September 2022
Snap Pack Version: 430patches17796
Type: Latest
Updates: The Manage Queued Queries property in the Databricks Snap Pack enables you to decide whether a given Snap should continue or cancel executing the queued Databricks SQL queries.

Release: August 2022
Snap Pack Version: main17386
Type: Stable
Updates: Upgraded with the latest SnapLogic Platform release.

Release: 4.29.2.0
Snap Pack Version: 42920rc17045
Type: Latest
Updates: A new Snap Pack for Databricks Lakehouse Platform (Databricks or DLP) introduces the following Snaps:

