Overview
You can use this account type to connect Databricks Snaps with data sources that use a Databricks account with Azure Data Lake Storage (ADLS) Gen2 as the source.
Prerequisites
A valid Databricks account.
Certified JDBC JAR File: databricks-jdbc-2.6.25-1.jar
Limitations and Known Issues
None.
Account Settings
Asterisk ( * ): Indicates a mandatory field.
Suggestion icon: Indicates a list that is dynamically populated based on the configuration.
Expression icon: Indicates whether the value is an expression (if enabled) or a static value (if disabled). Learn more about Using Expressions in SnapLogic.
Add icon: Indicates that you can add fields in the fieldset.
Remove icon: Indicates that you can remove fields from the fieldset.
Field Name | Field Type | Field Dependency | Description |
---|---|---|---|
Label* Default Value: N/A | String | None. | Specify a unique label for the account. |
Download JDBC Driver Automatically Default Value: Not Selected Example: Selected | Checkbox | None. | Select this checkbox to allow the Snap account to download the certified JDBC Driver for DLP. The following fields are disabled when this checkbox is selected.
To use a JDBC driver of your choice, clear this checkbox, upload the required JAR files to SLDB, and select them in the JDBC JAR(s) and/or ZIP(s): JDBC Driver field. Use of custom JDBC JAR version: You can use a JAR file version other than the recommended versions listed. Spark JDBC and Databricks JDBC: If you do not select this checkbox and use an older JDBC JAR file (older than version 2.6.25), ensure that you use:
|
JDBC URL* Default Value: N/A Example: jdbc:spark://adb-2409532680880038.18.azuredatabricks.net:443/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/2409532680880038/0326-212833-drier754;AuthMech=3; | String | None. | Enter the JDBC driver connection string for connecting to your DLP instance, using the following syntax: jdbc:spark://dbc-ede87531-a2ce.cloud.databricks.com:443/default;transportMode=http;ssl=1;httpPath= See Microsoft's JDBC and ODBC drivers and configuration parameters for more information. Avoid passing the password inside the JDBC URL: If you specify the password inside the JDBC URL, it is saved as is and is not encrypted. We recommend entering your password in the Password field instead, so that it is encrypted. |
Use Token Based Authentication Default value: Selected Example: Not selected | Checkbox | None. | Select this checkbox to use token-based authentication for connecting to the target database (DLP) instance. Activates the Token field. |
Token* Default value: N/A Example: <Encrypted> | String | Use Token Based Authentication checkbox is selected. | Enter the token value for accessing the target database/folder path. |
Database name* Default value: N/A Example: Default | String | None. | Enter the name of the database to use by default. This database is used if you do not specify one in the Databricks Select or Databricks Insert Snaps. |
Source/Target Location* Default value: N/A Example: Default | Dropdown list | None. | Select the source or target data warehouse into which the queries must be loaded, that is ADLS Gen2. This activates the following fields:
|
Azure storage account name* Default value: N/A Example: tonyblob | String | Source is ADLS Gen2. | Enter the name with which Azure Storage was created. The Bulk Load Snap automatically appends the '.blob.core.windows.net' domain to the value of this property. |
Azure Container* Default value: N/A Example: sl-bigdata-qa | String | Source is ADLS Gen2. | Enter the name of an existing Azure container. |
Azure folder* Default value: N/A Example: test-data | String | Source is ADLS Gen2. | Enter the name of an existing Azure folder to be used within the container for hosting files. |
Azure Auth Type Default value: Shared Access Signature Example: Shared Access Signature | Dropdown list | Source is ADLS Gen2. | Select the authorization type that you want to consider while setting up the account. Options available are:
|
SAS Token* Default value: N/A Example: ?sv=2020-08-05&st=2020-08-29T22%3A18%3A26Z&se=2020-08-30T02%3A23%3A26Z&sr=b&sp=rw&sip=198.1.2.60-198.1.2.70&spr=https&sig=A%1DEFGH1Ijk2Lm3noI3OlWTjEg2tYkboXr1P9ZUXDtkk%3D | String | Azure Auth Type is Shared Access Signature. | Enter the SAS token which is the part of the SAS URI associated with your Azure storage account. See Getting Started with SAS for details. |
Azure storage account key* Default value: N/A Example: ABCDEFGHIJKL1MNOPQRS | String | Azure Auth Type is Storage account key. | Enter the access key ID associated with your Azure storage account. |
Advanced Properties | Other parameters that you want to specify to configure the account. This field set consists of the following fields:
| ||
URL properties | Use this field set to define the account parameter's name and its corresponding value. Click + to add the parameters and the corresponding values. Add each URL property-value pair in a separate row. It consists of the following fields:
| ||
URL property name Default Value: N/A Example: queryTimeout | N/A | None | Specify the name of the parameter for the URL property. |
URL property value Default Value: N/A Example: 0 | N/A | None | Specify the value for the URL property parameter. |
Batch size* Default Value: N/A Example: 3 | Integer | None | Specify the number of queries that you want to execute at a time.
|
Fetch size* Default Value: 100 Example: 12 | Integer | None | Specify the number of rows a query must fetch for each execution. Large values could cause the server to run out of memory. |
Min pool size* Default Value: 3 Example: 0 | Integer | None | Specify the minimum number of idle connections that you want the pool to maintain at a time. |
Max pool size* Default Value: 15 Example: 0 | Integer | None | Specify the maximum number of connections that you want the pool to maintain at a time. |
Max life time* Default Value: 60 Example: 50 | Integer | None | Specify the maximum lifetime of a connection in the pool, in seconds.
Minimum value: 0 |
Idle Timeout* Default Value: 5 Example: 4 | Integer | None | Specify the maximum amount of time in seconds that a connection is allowed to sit idle in the pool. 0 (zero) indicates that idle connections are never removed from the pool. Minimum value: 0 |
Checkout timeout* Default Value: 10000 Example: 9000 | Integer | None | Specify the maximum time in milliseconds you want the system to wait for a connection to become available when the pool is exhausted. Minimum value: 0 |
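As a rough illustration of how the JDBC URL and URL properties settings relate, the following Python sketch assembles a connection string in the jdbc:spark:// form shown in the JDBC URL example. The function name build_jdbc_url and its parameters are hypothetical and are not part of SnapLogic or the Databricks driver. How the account combines URL properties with the JDBC URL internally is not described here, so appending them as name=value pairs is an assumption, as are the property names PWD and password used to reject password-like entries.

```python
def build_jdbc_url(host, http_path, database="default", port=443, url_properties=None):
    """Assemble a JDBC URL in the jdbc:spark:// form used in the example above.

    Hypothetical helper for illustration only; it does not call any
    SnapLogic or Databricks API.
    """
    # Assumed names of password-like properties; keep these out of the URL
    # because a password stored in the JDBC URL is not encrypted.
    forbidden = {"pwd", "password"}

    parts = [
        f"jdbc:spark://{host}:{port}/{database}",
        "transportMode=http",
        "ssl=1",
        f"httpPath={http_path}",
        "AuthMech=3",
    ]

    # Append each URL property as name=value (assumed format).
    for name, value in (url_properties or {}).items():
        if name.lower() in forbidden:
            raise ValueError(f"Do not pass '{name}' in the JDBC URL")
        parts.append(f"{name}={value}")

    # The example URL ends with a trailing semicolon.
    return ";".join(parts) + ";"


if __name__ == "__main__":
    print(
        build_jdbc_url(
            "adb-2409532680880038.18.azuredatabricks.net",
            "sql/protocolv1/o/2409532680880038/0326-212833-drier754",
            url_properties={"queryTimeout": 0},
        )
    )
```

Keeping the credential check in the helper means a password never reaches the URL string by accident; the token or password still belongs in the dedicated account field.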
Troubleshooting
Error | Reason | Resolution |
---|---|---|
Account validation failed. | The Pipeline ended before the batch could complete execution due to a connection error. | Verify that the Refresh token field is configured to handle the inputs properly. If you are not sure when the input data is available, configure this field as zero to keep the connection always open. |
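For accounts that use the Shared Access Signature auth type, an expired SAS token is one possible cause of connection failures that is easy to rule out before validating the account. The Python sketch below reads the standard se (expiry) and sp (permissions) parameters from a SAS token like the one in the SAS Token example. The function name sas_token_summary is hypothetical, and the sketch assumes the expiry uses the full YYYY-MM-DDThh:mm:ssZ form; tokens that use a shorter date format would need different parsing.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs


def sas_token_summary(token, now=None):
    """Return the permissions and expiry state encoded in a SAS token.

    Hypothetical helper for illustration only; it inspects the token text
    locally and does not contact Azure.
    """
    now = now or datetime.now(timezone.utc)

    # Drop the leading '?' and decode the query-string parameters.
    params = {name: values[0] for name, values in parse_qs(token.lstrip("?")).items()}
    if "se" not in params:
        raise ValueError("SAS token has no 'se' (expiry) parameter")

    # Assumes the full UTC timestamp form, for example 2020-08-30T02:23:26Z.
    expires = datetime.strptime(params["se"], "%Y-%m-%dT%H:%M:%SZ").replace(
        tzinfo=timezone.utc
    )
    return {
        "permissions": params.get("sp"),
        "expires": expires,
        "expired": now >= expires,
    }
```

Passing an explicit now value makes the check repeatable; without it, the function compares the expiry against the current UTC time.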