You can use this Snap to run one or more Databricks SQL statements on your target Databricks Lakehouse Platform (DLP) instance. You can run the following types of queries using this Snap:
Data Definition Language (DDL) queries
Data Manipulation Language (DML) queries
Data Control Language (DCL) queries
This Snap works only with single queries.
The Snap runs each statement as a single atomic unit, so that the changes made by a statement can be rolled back if that statement fails during execution.
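The per-statement behavior described above can be illustrated with a minimal sketch. It uses Python's built-in `sqlite3` module in autocommit mode as a stand-in engine; this is an assumption made purely for illustration and is not how the Snap talks to Databricks. The `orders` table and its rows are hypothetical.

```python
import sqlite3

# Stand-in engine (assumption): sqlite3 in autocommit mode, NOT Databricks.
# isolation_level=None means every statement is committed on its own.
conn = sqlite3.connect(":memory:", isolation_level=None)

statements = [
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)",
    "INSERT INTO orders VALUES (1, 'new'), (2, 'new')",
    # Fails on the duplicate id 1. The row with id 3 from this same
    # statement is backed out as well, because the statement is atomic.
    "INSERT INTO orders VALUES (3, 'new'), (1, 'dup')",
]

results = []
for sql in statements:
    try:
        conn.execute(sql)
        results.append((sql, "ok"))
    except sqlite3.Error as exc:
        results.append((sql, f"failed: {exc}"))

# Earlier statements stay committed; only the failed statement is undone.
rows = conn.execute("SELECT id, status FROM orders ORDER BY id").fetchall()
print(rows)  # [(1, 'new'), (2, 'new')]
```

The point of the sketch is only the shape of the guarantee: a failed statement leaves no partial changes behind, while the statements that succeeded before it remain in place.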
Databricks - Execute Snap is a write-type Snap that can read, fetch, and write data and tables into a target DLP instance.
Valid access credentials to a DLP instance with adequate access permissions to perform the action in context.
Valid access to the external source data in one of the following: Azure Blob Storage, ADLS Gen2, DBFS, GCP, AWS S3, or another database (JDBC-compatible).
Does not support Ultra Pipelines.
This Snap does not support multi-statement transaction rollback.
Each statement is auto-committed upon successful execution. If a statement fails, the Snap can roll back only the updates made by that failed statement. Statements that ran successfully earlier in the same Pipeline execution are not rolled back.
You cannot run Data Query Language (DQL) queries using this Snap, for example, SELECT and WITH query constructs.
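Because DQL statements are not supported, it can help to screen statements before placing them in the Snap. The following is a minimal sketch of such a check; `is_dql` is a hypothetical helper, not part of the Snap, and it inspects only the leading keyword after skipping whitespace and SQL comments.

```python
import re

# Leading keywords treated as DQL here; SELECT and WITH come from the
# limitation stated above. The helper itself is hypothetical.
DQL_KEYWORDS = ("SELECT", "WITH")


def is_dql(statement: str) -> bool:
    """Return True if the statement's first keyword is SELECT or WITH."""
    # Drop leading whitespace, "--" line comments, and /* ... */ block comments.
    stripped = re.sub(
        r"^(\s+|--[^\n]*\n?|/\*.*?\*/)+", "", statement, flags=re.DOTALL
    )
    match = re.match(r"[A-Za-z]+", stripped)
    return bool(match) and match.group(0).upper() in DQL_KEYWORDS


for sql in ["SELECT * FROM sales", "DROP TABLE IF EXISTS sales"]:
    print(sql, "->", "DQL, not supported" if is_dql(sql) else "not DQL")
```

The check is deliberately shallow: a statement such as `INSERT INTO t SELECT ...` is classified by its first keyword (`INSERT`), and a query wrapped in parentheses is not detected.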
None.
Type | Format | Number of Views | Examples of Upstream and Downstream Snaps | Description |
---|---|---|---|---|
Input | Document | | | Input document is not mandatory. The Snap can fetch and apply values for parameterized queries from an upstream Snap output. |
Output | Document | | | A JSON document containing each SQL statement along with its execution status (or result). |
Error | Error handling is a generic way to handle errors without losing data or failing the Snap execution. You can handle the errors that the Snap might encounter while running the Pipeline by choosing one of the following options from the When errors occur list under the Views tab. The available options are:
Learn more about Error handling in Pipelines. |
|
Field Name | Field Type | Description | ||
---|---|---|---|---|
Label* Default Value: Databricks - Execute | String | The name for the Snap. You can modify this to be more specific, especially if you have more than one of the same Snap in your Pipeline. | ||
SQL Statements* | Use this fieldset to define your SQL statements, one in each row. You can add as many SQL statements as you need. | |||
SQL statement* Default Value: None. | String/Expression | Specify the Databricks SQL statement you want the Snap to execute. We recommend that you add a single query in each SQL statement field. The SQL statement must follow the SQL syntax as stipulated in DLP. | ||
Number of Retries Default Value: 0 Minimum value: 0 | Integer | Specify the maximum number of retry attempts when the Snap fails to write. | ||
Retry Interval (Seconds) Default value: 1 Minimum value: 1 | Integer | Specify the minimum number of seconds the Snap must wait before each retry attempt. | ||
Use Result Query | Checkbox | Select this checkbox to write the SQL statement execution result to the Snap's output view for each successful execution. The output of the Snap is enclosed in the key. This option allows you to effectively track the SQL statement's execution by clearly indicating the successful execution and the number of records affected, if any, after the execution. For DDL statements: Because the Databricks JDBC driver does not return a result set for Data Definition Language (DDL) statements such as DROP, CREATE, and ALTER, the Snap displays a standard message. | ||
Manage Queued Queries Default Value: Continue to execute queued queries when pipeline is stopped or if it fails. | Dropdown list | Select either of the following options from the dropdown list to handle queued SQL queries: Continue to execute queued queries when pipeline is stopped or if it fails; Cancel queued queries when pipeline is stopped or if it fails. If you select Cancel queued queries when pipeline is stopped or if it fails, the read queries under execution are cancelled, whereas the write queries under execution are not cancelled. Databricks internally determines which queries are safe to cancel and cancels those queries. | ||
Snap Execution Default Value: Execute only | Dropdown list | Select one of the three modes in which the Snap executes. Available options are:
|
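The interaction between the Number of Retries and Retry Interval (Seconds) settings in the table above can be sketched as a simple retry loop. This is an illustration under stated assumptions, not the Snap's implementation: `run_with_retries` and `flaky_write` are hypothetical names, and the sketch retries on any exception, whereas the Snap's own retry conditions are not documented here.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_retries(
    action: Callable[[], T],
    retries: int = 0,               # mirrors Number of Retries (default 0)
    interval_seconds: float = 1.0,  # mirrors Retry Interval (Seconds) (default 1)
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run action once, then retry up to `retries` times, waiting between attempts."""
    attempts_used = 0
    while True:
        try:
            return action()
        except Exception:
            if attempts_used >= retries:
                raise  # retries exhausted: surface the last error
            attempts_used += 1
            sleep(interval_seconds)


# Usage: a hypothetical write that fails twice before succeeding.
calls = {"n": 0}


def flaky_write() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "ok"


waits = []  # record waits instead of actually sleeping
result = run_with_retries(flaky_write, retries=2, interval_seconds=1, sleep=waits.append)
print(result, calls["n"], waits)
```

With the default of 0 retries, the action runs exactly once and any failure is raised immediately.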
Error | Reason | Resolution |
---|---|---|
Missing property value | You have not specified a value for the required field where this message appears. | Ensure that you specify valid values for all required fields. |
Consider a scenario where the data in a DLP table becomes obsolete every few hours, so the table must be refreshed frequently. To do so, we can create the following Pipeline with only the Databricks - Execute Snap.
Configure the Snap (Pipeline) to run two Databricks SQL statements in a specific order: first, delete the existing table; then, create a new table with the same schema as the source file and populate it with the latest values. Ensure that the DLP account used with the Snap has the required permissions to perform the operations you specify in your SQL statements.
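The drop-then-recreate pattern described above can be sketched as follows. This is an illustration only, under several assumptions: it uses Python's built-in `sqlite3` as a stand-in engine rather than Databricks, a hypothetical `staging_prices` table stands in for the source file, and the `statement`/`status` keys in `report` are hypothetical rather than the Snap's actual output schema. The exact Databricks SQL used in this example Pipeline is not shown in this article.

```python
import sqlite3

# Stand-in engine (assumption): sqlite3 in autocommit mode, NOT Databricks.
conn = sqlite3.connect(":memory:", isolation_level=None)

# Hypothetical staging table standing in for the source file.
conn.execute("CREATE TABLE staging_prices (sku TEXT, price REAL)")
conn.execute("INSERT INTO staging_prices VALUES ('A1', 9.5), ('B2', 12.0)")

# Hypothetical target table holding obsolete data.
conn.execute("CREATE TABLE prices (sku TEXT, price REAL)")
conn.execute("INSERT INTO prices VALUES ('A1', 8.0)")

# The two statements, run in order: drop the old table, then recreate it
# with the source's schema and the latest values.
refresh_statements = [
    "DROP TABLE IF EXISTS prices",
    "CREATE TABLE prices AS SELECT sku, price FROM staging_prices",
]

report = []
for sql in refresh_statements:
    conn.execute(sql)
    report.append({"statement": sql, "status": "success"})  # hypothetical keys

refreshed = conn.execute("SELECT sku, price FROM prices ORDER BY sku").fetchall()
print(refreshed)  # [('A1', 9.5), ('B2', 12.0)]
```

Order matters here: because each statement is committed on its own, a failure in the second statement would leave the table dropped and not yet recreated.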
Upon successful validation, the Snap displays the output in the preview pane as follows. This output contains each SQL statement we passed and the respective result of its execution.
|