

Field Name

Field Type

Field Dependency

Description

Label*

Default Value: N/A
Example: STD DB Acc DeltaLake AWS ALD

String

None.

Specify a unique label for the account.

Download JDBC Driver Automatically

Default Value: Not Selected

Example: Selected

Checkbox

None

Select this checkbox to allow the Snap account to download the certified JDBC Driver for DLP. The following fields are disabled when this checkbox is selected:

  • JDBC JAR(s) and/or ZIP(s): JDBC Driver

  • JDBC driver class

To use a JDBC Driver of your choice, clear this checkbox, upload the required JAR files to SLDB, and select them in the JDBC JAR(s) and/or ZIP(s): JDBC Driver field.

Use of Custom JDBC JAR version

You can use a JAR file version other than the recommended versions listed.

Spark JDBC and Databricks JDBC

If you do not select this checkbox and use a JDBC JAR file older than version 2.6.25, ensure that you use:

  • The old-format JDBC URL (jdbc:spark://) instead of the new one (jdbc:databricks://):

    • For JDBC driver versions prior to 2.6.25, the JDBC URL starts with jdbc:spark://

    • For JDBC driver version 2.6.25 or later, the JDBC URL starts with jdbc:databricks://

  • The older JDBC driver class com.simba.spark.jdbc.Driver instead of the new com.databricks.client.jdbc.Driver.
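
The version rule above can be sketched as a small lookup. This is only an illustration of the 2.6.25 cutover described here, not part of the Snap; the function and key names are hypothetical.

```python
# Hypothetical helper: maps a JDBC driver version to the URL prefix and
# driver class described above. Names are illustrative, not a product API.
CUTOVER = (2, 6, 25)  # first version that uses the jdbc:databricks:// format


def parse_version(version: str) -> tuple:
    """Turn a dotted version such as '2.6.17' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def jdbc_settings_for(driver_version: str) -> dict:
    """Return the URL prefix and driver class matching the driver version."""
    if parse_version(driver_version) < CUTOVER:
        return {
            "url_prefix": "jdbc:spark://",
            "driver_class": "com.simba.spark.jdbc.Driver",
        }
    return {
        "url_prefix": "jdbc:databricks://",
        "driver_class": "com.databricks.client.jdbc.Driver",
    }
```

For example, jdbc_settings_for("2.6.17") selects the jdbc:spark:// prefix, while jdbc_settings_for("2.6.25") selects jdbc:databricks://.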

JDBC URL*

Default Value: N/A

Example: jdbc:spark://adb-2409532680880038.18.azuredatabricks.net:443/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/2409532680880038/0326-212833-drier754;AuthMech=3;

String

None

Enter the JDBC driver connection string for connecting to your DLP instance, using the syntax shown below. See Microsoft's JDBC and ODBC drivers and configuration parameters for more information.

jdbc:spark://dbc-ede87531-a2ce.cloud.databricks.com:443/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/6968995337014351/0521-394181-guess934;AuthMech=3;UID=token;PWD=<personal-access-token>

Avoid passing Password inside the JDBC URL

If you specify the password inside the JDBC URL, it is saved as is and is not encrypted. We recommend passing your password in the Password field instead, to ensure that it is encrypted.
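
As a rough pre-check before saving the account, the sketch below looks for a PWD property in a JDBC URL and returns a copy without it. It assumes the semicolon-separated name=value layout shown in the syntax above; the helper names are hypothetical and only the PWD key from that syntax is checked.

```python
# Illustrative only: detects and strips an inline PWD property from a JDBC URL.
SECRET_KEYS = {"PWD"}  # the credential key that appears in the URL syntax above


def inline_secret_keys(jdbc_url: str) -> list:
    """Return the names of secret-bearing properties embedded in the URL."""
    _, _, props = jdbc_url.partition(";")  # properties follow the first ';'
    found = []
    for pair in props.split(";"):
        name, sep, value = pair.partition("=")
        if sep and value and name.strip().upper() in SECRET_KEYS:
            found.append(name.strip())
    return found


def without_inline_secrets(jdbc_url: str) -> str:
    """Return the URL with secret-bearing properties removed."""
    base, _, props = jdbc_url.partition(";")
    kept = [
        pair
        for pair in props.split(";")
        if pair and pair.partition("=")[0].strip().upper() not in SECRET_KEYS
    ]
    return base + "".join(";" + pair for pair in kept)
```

If inline_secret_keys returns a non-empty list, remove the property from the URL and supply the secret through the encrypted account field instead.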

Use Token Based Authentication

Default value: Selected

Example: Not selected

Checkbox

None

Select this checkbox to use token-based authentication for connecting to the target database (DLP) instance. Selecting this checkbox activates the Token field.

Token*

Default value: N/A

Example: <Encrypted>

String

Appears when the Use Token Based Authentication checkbox is selected.

Enter the token value for accessing the target database/folder path.

Database name*

Default value: N/A

Example: Default

String

None

Enter the name of the database to use by default. This database is used if you do not specify one in the Databricks Select or Databricks Insert Snaps.


Source/Target Location*

Dropdown

None

Select the target data warehouse. If you want to load the queries from ADLS Blob Storage as the source, the selected data warehouse serves as the target, and vice versa. The available options are:

  • None: Select this option when you use read-only Snaps and do not need to write anything to the target data warehouse.

  • Amazon S3

  • Azure Blob Storage

  • Azure Data Lake Storage Gen 2

  • DBFS

  • Google Cloud Storage

  • JDBC

 This activates the following fields:

  • Azure storage account name

  • Azure Container

  • Azure Folder

  • Azure Auth Type

  • SAS Token

Azure storage account name*

Default value: N/A

Example: tonyblob

String

Appears when ADLS Blob Storage is selected as source

Enter the name with which Azure Storage was created. The Bulk Load Snap automatically appends the '.blob.core.windows.net' domain to the value of this property.

Azure Container*

Default value: N/A

Example: sl-bigdata-qa

String

Appears when ADLS Blob Storage is selected as source

Enter the name of an existing Azure container.

Azure folder*

Default value: N/A

Example: test-data

String

Appears when ADLS Blob Storage is selected as source

Enter the name of an existing Azure folder to be used within the container for hosting files.
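
To show how the storage account name, container, and folder fields relate, here is a minimal sketch that combines them into a blob location. It relies on the '.blob.core.windows.net' suffix mentioned above and on the standard Azure Blob Storage address layout; the function name is hypothetical and this is not the Snap's own code.

```python
# Illustrative only: combines the three Azure fields into one location string.
BLOB_DOMAIN = ".blob.core.windows.net"  # suffix appended to the account name


def blob_location(account_name: str, container: str, folder: str) -> str:
    """Build the HTTPS location of a folder inside an Azure container."""
    host = account_name + BLOB_DOMAIN
    return f"https://{host}/{container}/{folder.strip('/')}"


# Values taken from the examples in this table.
location = blob_location("tonyblob", "sl-bigdata-qa", "test-data")
```

Note that only the bare account name (tonyblob) goes into the account field, because the domain is appended automatically.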

Azure Auth Type

Default value: Shared Access Signature

Example: Shared Access Signature

Dropdown

Appears when ADLS Blob Storage is selected as source

Select the authorization type to use for the account. The available options are:

  • Storage account key

  • Shared Access Signature: Select when you want to enter the SAS Token associated with the Azure storage account.


SAS Token*

Default value: N/A

Example: ?sv=2020-08-05&st=2020-08-29T22%3A18%3A26Z&se=2020-08-30T02%3A23%3A26Z&sr=b&sp=rw&sip=198.1.2.60-198.1.2.70&spr=https&sig=A%1DEFGH1Ijk2Lm3noI3OlWTjEg2tYkboXr1P9ZUXDtkk%3D

String

Appears when you select Shared Access Signature in the Azure Auth Type.

Enter the SAS token, which is the part of the SAS URI associated with your Azure storage account. See Getting Started with SAS for details.
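
If you only have a full SAS URI, the token is its query string. The sketch below extracts it with Python's standard library and decodes its parameters, for example to check the expiry time (se). The helper names and the sample URI are hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def sas_token_from_uri(sas_uri: str) -> str:
    """Return the query-string part of a SAS URI, with a leading '?'."""
    query = urlsplit(sas_uri).query
    if not query:
        raise ValueError("URI has no query string, so it carries no SAS token")
    return "?" + query


def sas_fields(token: str) -> dict:
    """Decode a SAS token into its parameters (sv, st, se, sp, and so on)."""
    return {name: values[0] for name, values in parse_qs(token.lstrip("?")).items()}


# Made-up SAS URI for illustration; the signature is not a real one.
example_uri = (
    "https://tonyblob.blob.core.windows.net/sl-bigdata-qa/test-data/file.csv"
    "?sv=2020-08-05&se=2020-08-30T02%3A23%3A26Z&sp=rw&sig=abc%3D"
)
token = sas_token_from_uri(example_uri)
expiry = sas_fields(token)["se"]
```

Paste the returned token, still percent-encoded, into the SAS Token field; the decoded fields are only for inspection.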

Azure storage account key*

Default value: N/A

Example: ABCDEFGHIJKL1MNOPQRS

String

Appears when you select Storage account key in the Azure Auth Type.

Enter the access key associated with your Azure storage account.


Advanced Properties

Specify other parameters for configuring the account. This field set consists of the following fields:

  • URL Properties

  • Batch Size

  • Fetch Size

  • Min Pool Size

  • Max Pool Size

  • Max Life Time


URL properties

Use this field set to define the account parameter's name and its corresponding value. Click + to add the parameters and the corresponding values. Add each URL property-value pair in a separate row. It consists of the following fields:

  • URL property name

  • URL property value

URL property name

Default Value: N/A

Example: queryTimeout

N/A

None

Specify the name of the parameter for the URL property.

URL property value

Default Value: N/A

Example: 0

N/A

None

Specify the value for the URL property parameter.
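
One way to picture the URL property rows is as semicolon-separated name=value segments on the JDBC URL, which is the property syntax used in the JDBC URL examples in this table. Whether the Snap applies the rows exactly this way is an assumption; the function below is a hypothetical sketch.

```python
# Illustrative only: renders URL property rows as ';name=value' segments.
def append_url_properties(jdbc_url: str, properties: list) -> str:
    """Append (name, value) pairs to a JDBC URL, one segment per row."""
    base = jdbc_url.rstrip(";")  # avoid a doubled ';' when the URL ends with one
    return base + "".join(f";{name}={value}" for name, value in properties)


# One row from the examples above: queryTimeout with value 0.
rows = [("queryTimeout", "0")]
url = append_url_properties("jdbc:databricks://host:443/default;ssl=1;", rows)
```

Each row in the field set corresponds to one (name, value) pair in the list.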

Batch size*

Default Value: N/A

Example: 3

Integer

None

Specify the number of queries that you want to execute at a time.

  • If the Batch Size is one, the query is executed as is; that is, the Snap skips batching (non-batch execution).

  • If the Batch Size is greater than one, the Snap performs regular batch execution.
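
The two Batch Size cases can be sketched as follows. This is a simplified model of the behavior described above, not the Snap's implementation; run_queries and its callback parameters are hypothetical names.

```python
# Illustrative only: models non-batch execution (size 1) versus batching.
def run_queries(queries: list, batch_size: int, execute_one, execute_batch) -> None:
    """Execute queries one by one when batch_size is 1, otherwise in groups."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    if batch_size == 1:
        # Non-batch execution: each query is sent as is.
        for query in queries:
            execute_one(query)
        return
    # Regular batch execution: send up to batch_size queries at a time.
    for start in range(0, len(queries), batch_size):
        execute_batch(queries[start:start + batch_size])
```

With seven queries and a Batch Size of 3, this sketch issues three batches of sizes 3, 3, and 1.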

Fetch size*

Default Value: 100

Example: 12

Integer

None

Specify the number of rows a query must fetch for each execution.

Large values could cause the server to run out of memory.

Min pool size*

Default Value: 3

Example: 0

Integer

None

Specify the minimum number of idle connections that you want the pool to maintain at a time. 

Max pool size*

Default Value: 15

Example: 0

Integer

None

Specify the maximum number of connections that you want the pool to maintain at a time.

Max life time*

Default Value: 60

Example: 50

Integer

None

Specify the maximum lifetime of a connection in the pool, in seconds.

  • Ensure that the value you enter is a few seconds shorter than any database or infrastructure-imposed connection time limit.

  • 0 (zero) indicates an infinite lifetime, subject to the Idle Timeout value.

  • An in-use connection is never retired. Connections are removed only after they are closed.


Minimum value: 0
Maximum value: No limit

Idle Timeout*

Default Value: 5

Example: 4

Integer

None

Specify the maximum amount of time, in seconds, that a connection is allowed to sit idle in the pool.

0 (zero) indicates that idle connections are never removed from the pool.


Minimum value: 0
Maximum value: No limit

Checkout timeout*

Default Value: 10000

Example: 9000

Integer

None

Specify the maximum time, in milliseconds, that the system waits for a connection to become available when the pool is exhausted.


Minimum value: 0
Maximum value: No limit
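
The pool settings above can be collected into a small structure for sanity checks before configuring the account. The defaults mirror the values in this table. The class name, field names, and the rule that the minimum pool size should not exceed the maximum are assumptions made for illustration, not documented validation rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PoolSettings:
    """Hypothetical container for the connection pool fields described above."""
    min_pool_size: int = 3
    max_pool_size: int = 15
    max_life_time_s: int = 60       # 0 means infinite lifetime
    idle_timeout_s: int = 5         # 0 means idle connections are never removed
    checkout_timeout_ms: int = 10000

    def problems(self, db_connection_limit_s=None) -> list:
        """Return human-readable issues; an empty list means no issue found."""
        issues = []
        for name, value in vars(self).items():
            if value < 0:  # every field has a documented minimum of 0
                issues.append(f"{name} must be >= 0")
        # Assumption: a minimum larger than the maximum is a misconfiguration.
        if self.min_pool_size > self.max_pool_size:
            issues.append("min_pool_size exceeds max_pool_size")
        # Max life time should be shorter than any database-imposed limit.
        if db_connection_limit_s is not None and (
            self.max_life_time_s == 0 or self.max_life_time_s >= db_connection_limit_s
        ):
            issues.append("max_life_time_s should be shorter than the database limit")
        return issues
```

For example, PoolSettings().problems(db_connection_limit_s=60) reports the default 60-second lifetime as too long for a database that closes connections after 60 seconds.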
