In this article

1 Overview
- 1.1 Snap Type
- 1.2 Prerequisites
- 1.3 Support for Ultra Pipelines
- 1.4 Limitations
- 1.5 Known Issues
2 Snap Views
3 Snap Settings
4 Troubleshooting
5 Examples
- 5.1 Delete employee information from DLP table
- 5.2 Downloads
6 Snap Pack History

Overview

You can use this Snap to execute a Databricks SQL DELETE statement based on specific conditions. Use this Snap with caution: if you run it without specifying a WHERE condition for the DELETE statement, it deletes all the rows in the table.
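The following Python sketch illustrates why a blank condition is risky. It is an illustration only: the `build_delete_statement` helper is hypothetical, and the SQL that the Snap actually generates is not documented here. The table and database names are sample values.

```python
from typing import Optional


def build_delete_statement(table: str,
                           database: Optional[str] = None,
                           condition: Optional[str] = None) -> str:
    """Compose a DELETE statement (hypothetical helper, not the Snap's code).

    A missing or blank condition produces a statement without a WHERE
    clause, which removes every row in the table.
    """
    target = f"{database}.{table}" if database else table
    statement = f"DELETE FROM {target}"
    if condition and condition.strip():
        statement += f" WHERE {condition.strip()}"
    return statement


# With a condition: only the matching rows are deleted.
print(build_delete_statement("Cust_List", "Cust_DB",
                             "last_login_date < '2010-01-01'"))
# Without a condition: the statement targets all rows.
print(build_delete_statement("Cust_List", "Cust_DB"))
```

The second statement has no WHERE clause, so running it would empty the table.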

Snap Type

Databricks - Delete Snap is a write-type Snap that deletes rows from a target DLP table.

Prerequisites

Valid access credentials to a DLP instance with adequate access permissions to perform the action in context.
Valid access to the external source data in one of the following: Azure Blob Storage, ADLS Gen2, DBFS, GCP, AWS S3, or another database (JDBC-compatible).

Support for Ultra Pipelines

Does not support Ultra Pipelines.

Limitations

Snaps in the Databricks Snap Pack do not support array, map, and struct data types in their input and output documents.

Known Issues

When you add an input view to this Snap, ensure that you configure the Batch size as 1 in the Snap’s account configuration. For any other batch size, the Snap fails with the exception: Multi-batch parameter values are not supported for this query type.

Snap Views

Type	Format	Number of Views	Examples of Upstream and Downstream Snaps	Description
Input	Document	Min: 0, Max: 1	JSON Generator, Copy, Databricks - Select	A JSON document containing the reference to the table and rows to be deleted.
Output	Document	Min: 0, Max: 1	Databricks - Select, JSON Parser	A JSON document containing the result of the delete operation on the target table.

Error

Error handling is a generic way to handle errors without losing data or failing the Snap execution. You can handle the errors that the Snap might encounter while running the pipeline by choosing one of the following options from the When errors occur list under the Views tab. The available options are:

Stop Pipeline Execution: Stops the current pipeline execution when the Snap encounters an error.
Discard Error Data and Continue: Ignores the error, discards that record, and continues with the rest of the records.
Route Error Data to Error View: Routes the error data to an error view without stopping the Snap execution.

Learn more about Error handling in Pipelines.
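The three options can be pictured with a small Python sketch. This is a conceptual model only, not SnapLogic code: the `process` and `require_table` functions and the shape of the error document are assumptions made for illustration.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple

STOP = "Stop Pipeline Execution"
DISCARD = "Discard Error Data and Continue"
ROUTE = "Route Error Data to Error View"


def process(records: Iterable[Dict[str, Any]],
            handler: Callable[[Dict[str, Any]], Dict[str, Any]],
            when_errors_occur: str) -> Tuple[List[dict], List[dict]]:
    """Apply handler to each record; return (output documents, error documents)."""
    output: List[dict] = []
    errors: List[dict] = []
    for record in records:
        try:
            output.append(handler(record))
        except Exception as exc:
            if when_errors_occur == STOP:
                raise  # abort the whole run on the first error
            if when_errors_occur == ROUTE:
                # keep the failing record so that no data is lost
                errors.append({"error": str(exc), "original": record})
            # DISCARD: drop the failing record and continue
    return output, errors


def require_table(record: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical per-record step that fails when the table name is absent."""
    if "table" not in record:
        raise ValueError("Missing property value")
    return {"status": "ok", "table": record["table"]}


records = [{"table": "a"}, {}, {"table": "b"}]
print(process(records, require_table, ROUTE))
```

In this model, only the routing option preserves the failing record for later inspection.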

Snap Settings

Asterisk ( * ): Indicates a mandatory field.
Suggestion icon: Indicates a list that is dynamically populated based on the configuration.
Expression icon: Indicates whether the value is an expression (if enabled) or a static value (if disabled). Learn more about Using Expressions in SnapLogic.
Add icon: Indicates that you can add fields in the fieldset.
Remove icon: Indicates that you can remove fields from the fieldset.

Field Name	Field Type	Field Dependency	Description
Label* Default Value: Databricks - Delete Example: Db_Del_Duplicates	String	None	The name for the Snap. You can modify this to be more specific, especially if you have more than one of the same Snap in your Pipeline.
Use unity catalog Default state: Deselected	Checkbox	None	Select this checkbox to access data through the Unity catalog.
Catalog name Default value: hive_metastore Example: xyzcatalog	String/Expression/Suggestion	Appears when you select Use unity catalog	Specify the name of the catalog to use with the Unity catalog.
Database name Default Value: None. Example: Cust_DB	String/Expression/Suggestion	None	Enter the name of the DLP database that contains the table from which the DELETE statement deletes rows.
Table name* Default Value: None. Example: Cust_List	String/Expression/Suggestion	None	Enter the name of the table from which the DELETE statement deletes rows.
Delete condition (deletes all records from table if left blank) Default Value: N/A Example: last_login_date < '2010-01-01'	String/Expression	None	Specify a valid WHERE clause condition that filters the rows to delete from the target table. If you leave this field blank, the Snap deletes all the records from the table.
Number of Retries Minimum value: 0 Default value: 0 Example: 3	Integer/Expression	None	Specifies the maximum number of retry attempts when the Snap fails to write.
Retry Interval (seconds) Minimum value: 1 Default value: 1 Example: 3	Integer/Expression	None	Specifies the minimum number of seconds the Snap must wait before each retry attempt.
Manage Queued Queries Default value: Continue to execute queued queries when pipeline is stopped or if it fails. Example: Cancel queued queries when pipeline is stopped or if it fails	Dropdown list	None	Select an option to determine whether the Snap continues or cancels the execution of the queued Databricks SQL queries when the Pipeline is stopped or fails. If you select Cancel queued queries when pipeline is stopped or if it fails, the read queries under execution are cancelled, whereas the write queries under execution are not. Databricks internally determines which queries are safe to cancel and cancels those queries. Due to an issue with DLP, aborting an ELT Pipeline validation (with preview data enabled) aborts only those SQL statements that retrieve data using bind parameters, while all other static statements (that use values instead of bind parameters) persist. For example, `select * from a_table where id = 10` is not aborted, while `select * from test where id = ?` is aborted. To avoid this issue, ensure that you always configure your Snap settings to use bind parameters inside its SQL queries.
Snap Execution Default Value: Execute only Example: Validate & Execute	Dropdown list	None	Select one of the three modes in which the Snap executes. Available options are: Validate & Execute: Performs limited execution of the Snap, and generates a data preview during Pipeline validation. Subsequently, performs full execution of the Snap (unlimited records) during Pipeline runtime. Execute only: Performs full execution of the Snap during Pipeline execution without generating preview data. Disabled: Disables the Snap and all Snaps that are downstream from it.
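As a rough model of how Number of Retries and Retry Interval (seconds) interact, consider the Python sketch below. It is not the Snap's implementation: the `run_with_retries` function is hypothetical, and it assumes a fixed wait equal to the configured interval before each retry.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_retries(operation: Callable[[], T],
                     number_of_retries: int = 0,
                     retry_interval_seconds: float = 1,
                     sleep: Callable[[float], None] = time.sleep) -> T:
    """Call operation once, then retry up to number_of_retries times on failure."""
    attempt = 0
    while True:
        try:
            return operation()
        except Exception:
            if attempt >= number_of_retries:
                raise  # retries exhausted: surface the last error
            attempt += 1
            sleep(retry_interval_seconds)  # wait before the next attempt


# Demo: an operation that fails twice before succeeding.
calls = {"count": 0}


def flaky_delete() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("write failed")
    return "deleted"


waits = []  # record the waits instead of actually sleeping
result = run_with_retries(flaky_delete, number_of_retries=3,
                          retry_interval_seconds=3, sleep=waits.append)
print(result, waits)
```

With the default of 0 retries, the first failure is reported immediately.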

Troubleshooting

Error	Reason	Resolution
Missing property value	You have not specified a value for the required field where this message appears.	Ensure that you specify valid values for all required fields.

Examples

Delete employee information from DLP table

Consider the scenario where we want to delete information of certain employees from an intermediate data location that runs on DLP. We can achieve this through a Pipeline containing the Databricks - Delete Snap.

We configure the Snap to delete the rows of employees whose joining date is before January 1, 2010 from the company_employees table in our DLP instance. We also configure an appropriate account for the Snap to connect to the target DLP instance.

Upon validation, the Pipeline deletes the rows satisfying the condition specified and returns the status of the operation in the Snap’s output.
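The effect of this configuration can be reproduced locally with the Python sketch below. It uses an in-memory SQLite database purely as a stand-in for the DLP instance. The `joining_date` column name, the other columns, and the sample rows are assumptions, because the actual table schema is not shown here.

```python
import sqlite3

# In-memory SQLite database as a local stand-in for the DLP table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE company_employees (id INTEGER, name TEXT, joining_date TEXT)"
)
conn.executemany(
    "INSERT INTO company_employees VALUES (?, ?, ?)",
    [
        (1, "Employee A", "2008-03-15"),
        (2, "Employee B", "2012-07-01"),
        (3, "Employee C", "2009-12-31"),
    ],
)

# Delete employees who joined before Jan 01, 2010. The cutoff is passed
# as a bind parameter; ISO-formatted dates compare correctly as text.
cursor = conn.execute(
    "DELETE FROM company_employees WHERE joining_date < ?", ("2010-01-01",)
)
deleted = cursor.rowcount  # number of rows removed by the DELETE

remaining = [
    row[0] for row in conn.execute("SELECT id FROM company_employees ORDER BY id")
]
print(deleted, remaining)  # prints: 2 [2]
conn.close()
```

Only the rows that satisfy the condition are removed; the employee who joined in 2012 remains in the table.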

Download this Pipeline.

Downloads

Download and import the Pipeline into SnapLogic.
Configure Snap accounts as applicable.
Provide Pipeline parameters as applicable.

File	Modified
Databricks_Delete_FEP1.slp	Jul 14, 2022 by Anand Vedam

SnapLogic Documentation

Databricks - Delete