Overview

You can use this Snap to execute a job, check its status in Databricks, and, based on the job's status, complete or fail the pipeline. The Snap triggers the task and then checks its status periodically, at the interval you configure. The Snap stops after the job finishes. However, if the pipeline is canceled before the task finishes, the Snap sends a request to stop the task.
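The trigger-then-poll behavior described above can be sketched as a small loop. This is only an illustration, not the Snap's actual implementation: the state names (`RUNNING`, `SUCCESS`, `FAILED`, `CANCELED`) and the callables `get_status`, `cancel_job`, and `is_canceled` are hypothetical stand-ins for whatever the Snap does internally.

```python
import time
from typing import Callable

# Hypothetical state names, used only for this illustration.
RUNNING = "RUNNING"
SUCCESS = "SUCCESS"
FAILED = "FAILED"


def wait_for_job(
    get_status: Callable[[], str],
    cancel_job: Callable[[], None],
    is_canceled: Callable[[], bool],
    interval_seconds: float = 10,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll a job until it succeeds, fails, or the pipeline is canceled.

    Raises RuntimeError when the job fails, which stands in for
    "fail the pipeline" in this sketch.
    """
    while True:
        # If the pipeline was canceled, ask for the task to be stopped.
        if is_canceled():
            cancel_job()
            return "CANCELED"
        status = get_status()
        if status == SUCCESS:
            return status
        if status == FAILED:
            raise RuntimeError("Job failed")
        # Job is still running: wait for the configured interval.
        sleep(interval_seconds)


# Usage with stubbed callables (no Databricks connection involved).
statuses = iter([RUNNING, RUNNING, SUCCESS])
result = wait_for_job(
    get_status=lambda: next(statuses),
    cancel_job=lambda: None,
    is_canceled=lambda: False,
    interval_seconds=0,
)
print(result)
```

Passing the status check and the cancel request in as callables keeps the loop independent of any particular client library, so the sketch can be exercised without a cluster.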

Snap Type

The Databricks - Run Job Snap is a Write-type Snap.

Prerequisites

A valid client ID.
A valid account with the required permissions.

Support for Ultra Pipelines

Works in Ultra Pipelines.

Limitations and Known Issues

None.

Snap Views

Type	Format	Number of Views	Examples of Upstream and Downstream Snaps	Description
Input	Document	Min: 0, Max: 1	Mapper, Copy	Requires a valid task name, notebook path, and cluster-info.
Output	Document	Min: 1, Max: 1	Mapper, Filter	Executes the selected notebook.

Error

Error handling is a generic way to handle errors without losing data or failing the Snap execution. You can handle the errors that the Snap might encounter when running the pipeline by choosing one of the following options from the When errors occur list under the Views tab:

Stop Pipeline Execution: Stops the current pipeline execution if the Snap encounters an error.
Discard Error Data and Continue: Ignores the error, discards that record, and continues with the remaining records.
Route Error Data to Error View: Routes the error data to an error view without stopping the Snap execution.

Learn more about Error handling in Pipelines.
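The three options in the When errors occur list can be sketched as a per-record policy. This is a simplified illustration under stated assumptions: the policy names (`stop`, `discard`, `route`) and the function `run_with_error_policy` are hypothetical and do not correspond to any SnapLogic API.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple


def run_with_error_policy(
    records: Iterable[Any],
    process: Callable[[Any], Any],
    policy: str,
) -> Tuple[List[Any], List[Dict[str, Any]]]:
    """Process records one by one and apply an error policy.

    policy is one of (hypothetical names):
      "stop"    - re-raise the first error (Stop Pipeline Execution)
      "discard" - drop the failing record (Discard Error Data and Continue)
      "route"   - collect the failing record (Route Error Data to Error View)
    """
    if policy not in ("stop", "discard", "route"):
        raise ValueError(f"unknown policy: {policy}")
    output: List[Any] = []
    errors: List[Dict[str, Any]] = []
    for record in records:
        try:
            output.append(process(record))
        except Exception as exc:
            if policy == "stop":
                raise
            if policy == "route":
                errors.append({"record": record, "error": str(exc)})
            # "discard": ignore the failing record and continue.
    return output, errors


def double_non_negative(value: int) -> int:
    """Example processing step that rejects negative input."""
    if value < 0:
        raise ValueError(f"negative value: {value}")
    return value * 2


# Usage: the second record fails and is handled according to the policy.
print(run_with_error_policy([1, -2, 3], double_non_negative, "discard"))
print(run_with_error_policy([1, -2, 3], double_non_negative, "route"))
```

With `discard`, the failing record disappears silently; with `route`, it is kept alongside its error message so that a downstream step can inspect it.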

Snap Settings

Asterisk ( * ): Indicates a mandatory field.
Suggestion icon: Indicates a list that is dynamically populated based on the configuration.
Expression icon: Indicates the value is an expression (if enabled) or a static value (if disabled). Learn more about Using Expressions in SnapLogic.
Add icon: Indicates that you can add fields in the field set.
Remove icon: Indicates that you can remove fields from the field set.
Upload icon: Indicates that you can upload files.

Field Name		Field Type	Description
Label* (Default Value: Databricks - Run Job; Example: Run Job)		String	Specify the name for the Snap. You can modify this to be more specific, especially if you have more than one of the same Snap in your pipeline.
Task name* (Default Value: N/A; Example: Test username and password)		String/Expression	Specify the name of the task that runs the job.
Notebook path* (Default Value: N/A; Example: /Users/johndoe@snaplogic.com/notebook)		String/Expression/Suggestion	Specify the path of the saved notebook that runs in this job. A notebook is a web-based interface that lets you create, edit, and execute data science and data engineering workflows. Learn more about Databricks notebooks.
Cluster* (Default Value: N/A; Example: Code Ammonite - Shared Compute Cluster - V2)		String/Expression/Suggestion	Specify the cluster on which the job runs.
Parameter(s)	Use this field set to specify the parameters to run the job.
	Key* (Default Value: N/A; Example: Age)	String/Expression	Specify the parameter key.
	Value* (Default Value: N/A; Example: 35)	String/Expression	Specify the parameter value.
Interval check (seconds)* (Default Value: 10; Example: 15)		Integer/Expression	Specify the number of seconds to wait between checks of the task status.
Snap Execution (Default Value: Execute only; Example: Validate & Execute)		Dropdown list	Select one of the following three modes in which the Snap executes. Validate & Execute: Performs limited execution of the Snap and generates a data preview during pipeline validation, then performs full execution of the Snap (unlimited records) during pipeline runtime. Execute only: Performs full execution of the Snap during pipeline execution without generating preview data. Disabled: Disables the Snap and all Snaps that are downstream from it.
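To make the settings above more concrete, the following sketch shows how a task name, notebook path, cluster, and key-value parameters could be assembled into a one-time run request body in the shape used by the Databricks Jobs API (`runs/submit`). This article does not state which request the Snap actually sends, so the mapping is an assumption. The function `build_run_payload` and the cluster ID value are hypothetical, and the sketch takes a cluster ID rather than the cluster name shown in the Snap. It also uses a task key without spaces, on the assumption that the API may restrict the characters allowed there.

```python
import json
from typing import Any, Dict


def build_run_payload(
    task_name: str,
    notebook_path: str,
    cluster_id: str,
    parameters: Dict[str, Any],
) -> Dict[str, Any]:
    """Assemble a request body for a one-time notebook run.

    Assumption: the Snap settings map onto these fields; the article
    does not confirm this.
    """
    return {
        "run_name": task_name,
        "tasks": [
            {
                "task_key": task_name,
                "existing_cluster_id": cluster_id,
                "notebook_task": {
                    "notebook_path": notebook_path,
                    # Notebook parameters are passed as strings.
                    "base_parameters": {
                        key: str(value) for key, value in parameters.items()
                    },
                },
            }
        ],
    }


# Usage with placeholder values; the cluster ID is hypothetical.
payload = build_run_payload(
    task_name="run_notebook",
    notebook_path="/Users/johndoe@snaplogic.com/notebook",
    cluster_id="0123-456789-abcdefgh",
    parameters={"Age": 35},
)
print(json.dumps(payload, indent=2))
```

The sketch only builds and prints the request body; it does not contact Databricks.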

Example

Run Job on a Cluster

The following example pipeline demonstrates how to run a job specified in the notebook on a cluster.

Step 1: Configure the Databricks - Run Job Snap with the following settings:

a. Task name: Specify the task that the Databricks - Run Job Snap must perform.

b. Notebook path: Specify the path to the Databricks notebook that contains the code to be executed. This path indicates the location within the Databricks environment where the notebook is stored.

c. Cluster: Specify the cluster on which the job must be executed. The cluster configuration (including computational resources) is predefined and identified by this name and ID.

d. Interval check (seconds): Specify the interval (in seconds) at which the Snap checks the status of the running job. In this example, it checks every 10 seconds.

[Figures: Databricks - Run Job Configuration; Databricks - Run Job Output]

Step 2: Configure the Mapper Snap to store the result status of the Databricks - Run Job Snap. On validation, the Mapper Snap displays the job success message.

Snap Pack History

Release	Snap Pack Version	Date	Type	Updates
August 2024	main27765	21 Aug 2024	Stable	Upgraded the `org.json.json` library from v20090211 to v20240303, which is fully backward compatible.
May 2024	437patches27246	08 Aug 2024	Latest	Added Databricks - Run Job. This Snap executes a job, checks its status in Databricks, and, based on the job's status, completes or fails the pipeline.
May 2024	437patches26400	15 May 2024	Latest	Fixed an invalid session handle issue with the Databricks Snap Pack that intermittently triggered an error message when the Snaps failed to connect with Databricks to execute the SQL statement.
May 2024	main26341	08 May 2024	Stable	Updated the Delete Condition (Truncates a Table if empty) field in the Databricks - Delete Snap to Delete condition (deletes all records from a table if left blank) to indicate that all entries will be deleted from the table when this field is blank, but no truncate operation is performed.
February 2024	main25112	14 Feb 2024	Stable	Updated and certified against the current SnapLogic Platform release.
November 2023	main23721	08 Nov 2023	Stable	Updated and certified against the current SnapLogic Platform release.
August 2023	main22460	16 Aug 2023	Stable	Updated and certified against the current SnapLogic Platform release.
May 2023	433patches21630	28 Jun 2023	Latest	Enhanced the performance of the Databricks - Insert Snap to improve the amount of time it takes for validation.
May 2023	main21015	10 May 2023	Stable	Upgraded with the latest SnapLogic Platform release.
February 2023	main19844	09 Feb 2023	Stable	Upgraded with the latest SnapLogic Platform release.
November 2022	main18944	10 Nov 2022	Stable	The Databricks - Insert Snap now creates the target table only from the table metadata of the second input view when the following conditions are met: The Create table if not present checkbox is selected. The target table does not exist. The table metadata is provided in the second input view.
September 2022	430patches18305	29 Sep 2022	Latest	The name of the Databricks - Multi Execute Snap is simplified to Databricks - Execute Snap. The Use Result Query checkbox in the Databricks - Execute Snap enables you to include in the Snap's output the result of running (during validation) each SQL statement specified in the Snap. The Retry mechanism for the Databricks Snap Pack enables the following Databricks Snaps to repeatedly perform the selected operations for the specified number of times when the Snap account connection fails or times out: Databricks - Delete; Databricks - Insert; Databricks - Select; Databricks - Execute; Databricks - Bulk Load (when the Source Type is Input View); Databricks - Merge Into (when the Source Type is Input View). The following fields are added to each Databricks Snap as part of this enhancement. Number of Retries: The number of attempts the Snap should make to perform the selected operation when the Snap account connection fails or times out. Retry Interval (seconds): The time interval in seconds between two consecutive retry attempts.
September 2022	430patches17796	28 Sep 2022	Latest	The Manage Queued Queries property in the Databricks Snap Pack enables you to decide whether a given Snap should continue or cancel executing the queued Databricks SQL queries.
August 2022	main17386	11 Aug 2022	Stable	Upgraded with the latest SnapLogic Platform release.
4.29.2.0	42920rc17045	15 Jul 2022	Latest	A new Snap Pack for Databricks Lakehouse Platform (Databricks or DLP) introduces the following Snaps: Databricks - Select: Retrieves information from the target Databricks table. Databricks - Insert: Inserts new rows of data in the target Databricks table. Databricks - Delete: Deletes data from a target Databricks table. Databricks - Bulk Load: Loads millions of rows of data in the target table through a single load operation. Databricks - Unload: Unloads data from a target Databricks table through a single unload operation. Databricks - Merge Into: Updates millions of existing rows and inserts new rows in a target Databricks table through a single operation. Databricks - Multi Execute: Runs multiple SQL statements on the target Databricks instance.