Databricks Account (Source: AWS S3)

Overview

You can use this account type to connect Databricks Snaps with data sources that use a Databricks account with AWS S3 as the source.

Prerequisites

  • A valid Databricks account.

  • Certified JDBC JAR File: databricks-jdbc-2.6.25-1.jar

Limitations and Known Issues

None.

Account Settings

  • Asterisk ( * ): Indicates a mandatory field.

  • Suggestion icon: Indicates a list that is dynamically populated based on the configuration.

  • Expression icon: Indicates whether the value is an expression (if enabled) or a static value (if disabled). Learn more about Using Expressions in SnapLogic.

  • Add icon: Indicates that you can add fields in the fieldset.

  • Remove icon: Indicates that you can remove fields from the fieldset.

Field Name

Field Type

Field Dependency

Description

Label*

 

Default Value: N/A
Example: STD DB Acc DeltaLake AWS S3

String

None.

Specify a unique label for the account.

 

Download JDBC Driver Automatically

Default Value: Not Selected

Example: Selected

Checkbox

None

Select this checkbox to allow the Snap account to download the certified JDBC Driver for the Databricks Lakehouse Platform (DLP). The following fields are disabled when this checkbox is selected:

  • JDBC JAR(s) and/or ZIP(s): JDBC Driver

  • JDBC driver class

To use a JDBC Driver of your choice, clear this checkbox, upload the required JAR files to SLDB, and select them in the JDBC JAR(s) and/or ZIP(s): JDBC Driver field.

Use of Custom JDBC JAR version

You can use a JAR file version other than the recommended versions listed here.

Spark JDBC and Databricks JDBC

If you do not select this checkbox and use an older JDBC JAR file (older than version 2.6.25), ensure that you use: 

  • The old format JDBC URL ( jdbc:spark:// ) instead of the new one ( jdbc:databricks:// )

    • For JDBC driver prior to version 2.6.25, the JDBC URL starts with jdbc:spark://

    • For JDBC driver version 2.6.25 or later, the JDBC URL starts with jdbc:databricks://

  • The older JDBC Driver Class com.simba.spark.jdbc.Driver instead of the new com.databricks.client.jdbc.Driver.
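The pairing of driver version, URL prefix, and driver class described above can be sketched as a small lookup. This is only an illustration: the helper name `jdbc_settings_for` is hypothetical and is not part of SnapLogic or the Databricks driver, and the version string format (for example `2.6.25-1`) is assumed from the certified JAR file name.

```python
def jdbc_settings_for(driver_version: str) -> dict:
    """Return the URL prefix and driver class that match a JDBC driver version.

    Hypothetical helper for illustration only.
    """
    # Drop a build suffix such as "-1", then compare the parts numerically
    # so that "2.6.9" is treated as older than "2.6.25".
    base = driver_version.split("-")[0]
    parts = tuple(int(p) for p in base.split(".")[:3])

    if parts >= (2, 6, 25):
        return {
            "url_prefix": "jdbc:databricks://",
            "driver_class": "com.databricks.client.jdbc.Driver",
        }
    return {
        "url_prefix": "jdbc:spark://",
        "driver_class": "com.simba.spark.jdbc.Driver",
    }


if __name__ == "__main__":
    print(jdbc_settings_for("2.6.25-1"))
    print(jdbc_settings_for("2.6.22"))
```

Comparing the version as a tuple of integers rather than as a string avoids misordering versions whose components have different digit counts.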

 

JDBC URL*

Default Value: N/A

Example: jdbc:spark://adb-2409532680880038.18.azuredatabricks.net:443/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/2409532680880038/0326-212833-drier754;AuthMech=3;

String

None

Enter the JDBC connection string for your DLP instance, using the syntax shown in the example below. See Microsoft's JDBC and ODBC drivers and configuration parameters for more information.

jdbc:spark://dbc-ede87531-a2ce.cloud.databricks.com:443/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/6968995337014351/0521-394181-guess934;AuthMech=3;UID=token;PWD=<personal-access-token>
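To make the structure of this connection string easier to see, the following sketch splits a JDBC URL of this shape into its scheme, host, port, default schema, and semicolon-separated properties. The function `parse_jdbc_url` is a hypothetical helper written for this illustration; it is not provided by SnapLogic or by the JDBC driver, and it assumes the URL follows the `scheme://host:port/schema;key=value;...` layout shown in the examples on this page.

```python
def parse_jdbc_url(url: str) -> dict:
    """Split a Databricks-style JDBC URL into its parts (illustrative only)."""
    scheme, sep, rest = url.partition("://")
    if not sep:
        raise ValueError("expected a URL of the form scheme://host:port/schema;...")

    # Everything before the first ';' is the host, port, and schema.
    location, *pairs = rest.split(";")
    host_port, _, schema = location.partition("/")
    host, _, port = host_port.partition(":")

    # The remaining segments are key=value connection properties.
    properties = {}
    for pair in pairs:
        if not pair:  # a trailing ';' produces an empty segment
            continue
        key, _, value = pair.partition("=")
        properties[key] = value

    return {
        "scheme": scheme,
        "host": host,
        "port": int(port) if port else None,
        "schema": schema,
        "properties": properties,
    }


if __name__ == "__main__":
    example = (
        "jdbc:spark://adb-2409532680880038.18.azuredatabricks.net:443/default;"
        "transportMode=http;ssl=1;"
        "httpPath=sql/protocolv1/o/2409532680880038/0326-212833-drier754;AuthMech=3;"
    )
    for name, value in parse_jdbc_url(example).items():
        print(name, "=", value)
```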

Avoid passing Password inside the JDBC URL

If you specify the password inside the JDBC URL, it is saved as plain text and is not encrypted. To ensure that your password is encrypted, we recommend passing it through the Password field instead.
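One way to follow this recommendation when you start from a URL that already embeds a password is to separate the `PWD` property from the rest of the string before configuring the account. The sketch below is an assumption-laden illustration: `split_password` is a hypothetical helper, and it assumes the password appears as a `PWD=...` property delimited by semicolons, as in the example URL above.

```python
from typing import Optional, Tuple


def split_password(url: str) -> Tuple[str, Optional[str]]:
    """Return (url_without_pwd, password) for a JDBC URL (illustrative only)."""
    head, *pairs = url.split(";")
    kept = []
    password = None
    for pair in pairs:
        key, _, value = pair.partition("=")
        if key.strip().upper() == "PWD":
            # Hold the password separately instead of leaving it in the URL.
            password = value
        else:
            kept.append(pair)
    return ";".join([head, *kept]), password


if __name__ == "__main__":
    url = (
        "jdbc:spark://dbc-ede87531-a2ce.cloud.databricks.com:443/default;"
        "transportMode=http;ssl=1;AuthMech=3;UID=token;PWD=<personal-access-token>"
    )
    clean_url, password = split_password(url)
    print(clean_url)
    print("password found:", password is not None)
```

The cleaned URL can then go in the JDBC URL field, and the extracted value in the encrypted credential field.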

 

Use Token Based Authentication

Default value: Selected

Example: Not selected

Checkbox

None

Select this checkbox to use token-based authentication for connecting to the target database (DLP) instance. Activates the Token field.

 

Token*

Default value: N/A

Example: <Encrypted>

String

Appears when the Use Token Based Authentication checkbox is selected.

Enter the token value for accessing the target database/folder path.

 

Database name*

Default value: N/A

Example: Default

String

None

Enter the name of the database to use by default. This database is used if you do not specify one in the Databricks Select or Databricks Insert Snaps.

 

Source/Target Location*

Dropdown

None

Select the source or target data warehouse into which the queries must be loaded. For this account type, select AWS S3. This activates the following fields:

  • S3 Bucket

  • S3 Folder

  • AWS Authorization type

  • S3 Access Key ID

  • S3 Secret Key

S3 Bucket*

String

None

Specify the name of the S3 bucket that you want to use for staging data to Databricks. 

Default Value: N/A

Example: sl-bucket-ca

S3 Folder*

String

None

Specify the relative path to a folder in the S3 bucket listed in the S3 Bucket field. This is used as a root folder for staging data to Databricks.

Default Value: N/A

Example: https://sl-bucket-ca.s3.<ca>.amazonaws/<sf>

Aws Authorization type

Dropdown

None

Select the authentication method to use for accessing the source data.

Available options are:

  • Source/Target Location Credentials. Select this option when you do not have a storage integration set up in your S3. Activates the Access Key and Secret Key fields for S3.

  • Source/Target Location Session Credentials. Select this option if you have session credentials to access the source location in S3. Activates the Session Access Key, Session Secret Key, and Session Token fields.

  • Storage Integration. Select this option when you want to use the storage integration to access the selected source location. Activates the Storage Integration Name field.

Default value: Source Location Credentials for S3 and Azure, Storage Integration for Google Cloud Storage.

Example: Storage Integration
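The relationship between each authorization type and the fields it activates can be summarized as a lookup table. The sketch below is illustrative only: the dictionary and the `missing_fields` helper are hypothetical, and the field names are taken from the option descriptions above (the corresponding field headings on this page use slightly different wording, such as S3 Access-key ID).

```python
# Fields activated by each authorization type, as described in the options above.
ACTIVATED_FIELDS = {
    "Source/Target Location Credentials": ["Access Key", "Secret Key"],
    "Source/Target Location Session Credentials": [
        "Session Access Key",
        "Session Secret Key",
        "Session Token",
    ],
    "Storage Integration": ["Storage Integration Name"],
}


def missing_fields(auth_type: str, provided: dict) -> list:
    """List the activated fields that have no value yet (illustrative only)."""
    required = ACTIVATED_FIELDS[auth_type]
    return [name for name in required if not provided.get(name)]


if __name__ == "__main__":
    settings = {"Session Access Key": "example-key"}
    print(missing_fields("Source/Target Location Session Credentials", settings))
```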

S3 Access-key ID*

String

None

Specify the S3 access key ID that you want to use for AWS authentication.

Default Value: N/A

Example: NAVRGGRV7EDCFVLKJH

S3 Secret key*

String

None

Specify the S3 secret key associated with the access key ID specified in the S3 Access-key ID field.

Default Value: N/A

Example: 2RGiLmL/6bCujkKLaRuUJHY9uSDEjNYr+ozHRtg

S3 AWS Token*

String

Appears when Source/Target Location Session Credentials is selected in Aws Authorization type

Specify the S3 AWS Token to connect to private and protected Amazon S3 buckets.

The temporary AWS Token is used when:

  • Data is staged in an S3 location.

  • Data is coming from the input view and the files are staged in an external staging location.

Default Value: None
Example: AQoDYXdzEJr

Troubleshooting

Error

Reason

Resolution

Account validation failed.

The Pipeline ended before the batch could complete execution due to a connection error.

Verify that the Refresh token field is configured to handle the inputs properly. If you are not sure when the input data will be available, set this field to zero to keep the connection always open.
