Field Name

Field Type

Field Dependency

Description

Label*

Default Value: Azure Synapse SQL - Bulk Load
Example: Bulk_Load

String

N/A

Specify a unique name for the Snap.

Schema Name

Default Value: None
Example: SYS

String/Expression

N/A

Specify the database schema name. If you do not specify a schema, the suggestions for Table Name list the tables from all schemas. This field is suggestible and retrieves the available database schemas.

Table Name*

Default Value: None
Example: users

String/Expression

N/A

Specify the table name or select the table from the suggestion list to load the incoming data into.

Data Source

Default Value: Input View
Example: External storage

Dropdown list

N/A

Select one of the following sources from which to load the data:

  • Input view: The data is loaded from the upstream Snap.

  • External storage: The data is loaded from an external source.

When the Data Source is Input view, the Snap writes the incoming data to a temporary file on the Snaplex and then uploads it to the staged location (Azure storage) using the Azure API. After uploading the files to Azure storage, the Snap runs the COPY INTO command to load the data from the files into the table. The temporary files are deleted after the execution completes, whether it succeeds or fails. Therefore, if your Pipelines process large volumes of data, we recommend that you provision sufficient disk space on your Snaplex instance.
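The Input view flow described above can be sketched roughly as follows. This is a minimal illustration, not the Snap's implementation: `stage_and_load`, `upload`, and `run_copy_into` are hypothetical names, and the two callables stand in for the Azure storage upload and the COPY INTO execution.

```python
import csv
import os
import tempfile
from typing import Callable, Iterable, List, Mapping


def stage_and_load(
    rows: Iterable[Mapping[str, object]],
    columns: List[str],
    upload: Callable[[str], str],
    run_copy_into: Callable[[str], None],
) -> str:
    """Write rows to a temporary CSV, upload it, load it, then clean up.

    `upload` and `run_copy_into` are hypothetical callables (assumptions),
    not Snap or Azure APIs. Returns the temporary file path, which has
    already been deleted by the time the function returns.
    """
    fd, tmp_path = tempfile.mkstemp(suffix=".csv")
    try:
        # Step 1: write the incoming documents to a temporary local file.
        with os.fdopen(fd, "w", newline="") as handle:
            writer = csv.writer(handle, lineterminator="\r\n")
            for row in rows:
                writer.writerow([row.get(col) for col in columns])
        # Step 2: upload the file to the staged location.
        staged_url = upload(tmp_path)
        # Step 3: load the staged file into the target table.
        run_copy_into(staged_url)
    finally:
        # The temporary file is removed whether the load succeeded or failed.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
    return tmp_path
```

Because the file exists on local disk until the `finally` block runs, the free disk space on the node must cover the largest batch written this way.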

Create table if not present

Default Value: Deselected

Checkbox

Appears only when you select Input view for Data Source.

Select this checkbox to create the target table if it does not exist. Otherwise, the system displays the "table not found" error.

If a target table does not exist when the Snap tries to do the bulk load, and if you select this checkbox, the Snap creates the table with the columns and data types. If you want a table to be created with the same schema as a source table, you can include a second input view for this Snap. This view can be used to pass metadata about the table, effectively allowing you to replicate a table from one database to another.

The table metadata document that is read in by the second input view contains a dump of the JDBC DatabaseMetaData class. The document can be manipulated to affect the CREATE TABLE statement that is generated by this Snap. For example, to rename the name column to full_name, you can use a Mapper Snap that sets the path $.columns.name.COLUMN_NAME to full_name.
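To make the rename concrete, the following sketch applies the same change to a plain dictionary. The document layout shown here is a heavily trimmed assumption for illustration, and `rename_column` is a hypothetical helper; the real metadata document contains many more fields.

```python
import copy


def rename_column(metadata: dict, old_name: str, new_name: str) -> dict:
    """Return a copy of the metadata with one COLUMN_NAME value changed.

    Mirrors setting the path $.columns.<old_name>.COLUMN_NAME in a Mapper.
    The layout of `metadata` is an assumption made for this example.
    """
    updated = copy.deepcopy(metadata)
    updated["columns"][old_name]["COLUMN_NAME"] = new_name
    return updated


# Hypothetical, trimmed metadata document for a two-column source table.
table_metadata = {
    "columns": {
        "id": {"COLUMN_NAME": "id", "TYPE_NAME": "int"},
        "name": {"COLUMN_NAME": "name", "TYPE_NAME": "varchar"},
    }
}

renamed = rename_column(table_metadata, "name", "full_name")
```

Only the COLUMN_NAME value changes; the key under `columns` and the type information are left as they were in the source table's metadata.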

The Snap does not automatically fix errors encountered during table creation, because they may require user intervention to resolve correctly. For example, if the source table contains a column with a type that does not have a direct mapping in the target database, the Snap fails to execute. In such a case, add a Mapper Snap to change the metadata document and explicitly set the values required to produce a valid CREATE TABLE statement.

Purge files

Default Value: Deselected
Example: Selected

Checkbox

Appears only when you select External storage for Data Source.

Select this checkbox if you want to purge the data files automatically from the external storage after the data is loaded successfully.

  • The Snap automatically deletes the staged files when Data Source is Input view.

  • Use the combination of File Name Pattern and Purge files with caution, because it might delete data and the deletion is irreversible.

Table Column Settings

Use this field set to define columns to map the source data to the columns in the target table. The source data can be either from the input view or external storage.

Column Name

Default Value: N/A
Example: $first_name

String/Expression/Suggestion

N/A

Specify the column name in the target table.

Default Value

Default Value: N/A
Example: new

String/Expression

N/A

Specify the default value that replaces any null value in the input file.

Source Column Position

Default Value: None
Example: Second

String/Expression

Appears only when you select External storage for Data Source.

Specify the position of the column for the source file within a row.

Add Quotes

Default Value: Deselected

Checkbox

N/A

Select this checkbox to enclose the default value in quotes.
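As an illustration of how these four settings could combine, the sketch below renders a column list in the form accepted by the Azure Synapse COPY statement, `(Column_name [default Default_value] [Field_number])`. How the Snap actually assembles its SQL is not documented here; `column_entry` and `column_list` are hypothetical helpers, and the numeric positions are an assumption.

```python
from typing import Optional


def column_entry(name: str, default: Optional[str] = None,
                 position: Optional[int] = None,
                 add_quotes: bool = False) -> str:
    """Render one column as: name [default value] [field_number]."""
    parts = [name]
    if default is not None:
        if add_quotes:
            # Quote the default and escape embedded single quotes.
            value = "'" + default.replace("'", "''") + "'"
        else:
            value = default
        parts.append("default " + value)
    if position is not None:
        parts.append(str(position))
    return " ".join(parts)


def column_list(entries) -> str:
    """Join the rendered columns into a parenthesized column list."""
    return "(" + ", ".join(column_entry(**entry) for entry in entries) + ")"


# Hypothetical settings: a quoted string default and an unquoted numeric one.
clause = column_list([
    {"name": "first_name", "default": "new", "position": 1, "add_quotes": True},
    {"name": "age", "default": "0", "position": 2},
])
```

With Add Quotes selected, the string default is emitted as a quoted literal; without it, the value is emitted as written, which suits numeric defaults.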

File List

Appears only when you select External storage for Data Source.

Use this fieldset to specify the list of files that must be loaded to the target table.

File

String/Expression

Appears only when you select External storage for Data Source.

Specify the file to be loaded to the target table.

File Name Pattern

String/Expression

Appears only when you select External storage for Data Source.

Specify a string or expression that gives the absolute path pattern used to match files in the external location.

File Format Type

Dropdown list

Appears only when you select External storage for Data Source.

Choose one of the following file format types to load data into the target table or unload data from the target table:

  • CSV

  • PARQUET

  • ORC

When the file format type is PARQUET or ORC and you specify the FILE_FORMAT in the Copy Arguments fieldset, the Snap overrides the FILE_TYPE.

Copy Arguments

Use this fieldset to configure the list of arguments that you want the Snap to generate for the copy command.

With

Default Value: None
Example: ENCODING={'UTF8'|'UTF16'}

String/Expression

N/A

Specify the list of arguments to use when loading the data. For example, the argument MAXERRORS=1000 lets the Snap ignore up to 1000 record errors and continue the execution; the operation terminates once the number of errors exceeds 1000.
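The MAXERRORS threshold can be pictured with a small simulation. This is only a sketch of the counting rule described above, not how the database evaluates rejects; `load_with_max_errors` and `is_valid` are hypothetical names.

```python
def load_with_max_errors(records, is_valid, max_errors):
    """Count loaded and rejected records; fail once rejects exceed max_errors."""
    loaded, rejected = 0, 0
    for record in records:
        if is_valid(record):
            loaded += 1
            continue
        rejected += 1
        if rejected > max_errors:
            # Exceeding the limit terminates the whole operation.
            raise RuntimeError(
                f"rejected {rejected} records, limit is {max_errors}"
            )
    return loaded, rejected
```

With a limit of 2, a batch containing two bad records still completes, while a third bad record stops the load.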

When the Data Source is Input view, the Snap suggests a few copy arguments that you can configure. The Snap uses the following default values to prepare the external stage file:

  • MAXERRORS=0

  • FIELDQUOTE is the quote character (")

  • FIELDTERMINATOR is a comma (,)

  • ROWTERMINATOR is \r\n

Note

Changing the default values listed above may cause the bulk operation to fail.
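For illustration, the defaults above could be merged with user-supplied arguments and rendered as a WITH clause along these lines. The rendering is an assumption about the general shape of a COPY statement, not the Snap's actual output; `DEFAULT_COPY_ARGS` and `with_clause` are hypothetical names.

```python
# Defaults from the list above, written as SQL literal text (assumed form).
DEFAULT_COPY_ARGS = {
    "MAXERRORS": "0",
    "FIELDQUOTE": "'\"'",
    "FIELDTERMINATOR": "','",
    "ROWTERMINATOR": r"'\r\n'",
}


def with_clause(overrides=None):
    """Merge overrides into the defaults and render a WITH (...) clause."""
    args = dict(DEFAULT_COPY_ARGS)
    args.update(overrides or {})
    rendered = ", ".join(f"{key} = {value}" for key, value in args.items())
    return f"WITH ({rendered})"


# Example: keep the file-shape defaults but tolerate up to 1000 rejects.
clause_with_override = with_clause({"MAXERRORS": "1000"})
```

Overriding MAXERRORS leaves the staged file's format untouched, whereas overriding the quote or terminator values would no longer match the file the Snap prepared, which is why such changes can make the load fail.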

Parallel Transfer Options

Use this field set to configure parallel transfers, which enable faster loading of data.

Max Concurrency*

Default Value: 5
Example: 10

Integer

N/A

Specify the maximum number of parallel requests that are to be issued at a given time as part of a single parallel transfer.

Block Size (MB)*

Default Value: 2
Example: 6

Integer

N/A

Specify the size of each chunk of data to transfer at a time.
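The interaction between Max Concurrency and Block Size (MB) can be sketched as follows: the data is cut into fixed-size blocks, and at most a set number of blocks are in flight at once. This is a generic illustration using a thread pool, not the Azure client the Snap uses; `send_block` is a hypothetical callable that receives a block index and its bytes.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_blocks(data: bytes, block_size_mb: int):
    """Cut data into consecutive blocks of at most block_size_mb megabytes."""
    size = block_size_mb * 1024 * 1024
    return [data[i:i + size] for i in range(0, len(data), size)]


def parallel_transfer(data, send_block, max_concurrency=5, block_size_mb=2):
    """Send all blocks with at most max_concurrency requests at a time.

    `send_block(index, block)` is a hypothetical upload function (assumption).
    Results are returned in block order.
    """
    blocks = split_into_blocks(data, block_size_mb)
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        # pool.map preserves the order of the input blocks in its results.
        return list(pool.map(send_block, range(len(blocks)), blocks))
```

Larger blocks mean fewer requests per file, while higher concurrency means more requests in flight; both increase memory and network use on the node doing the transfer.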

Snap Execution

Default Value: Execute only
Example: Validate & Execute

Dropdown list

N/A

Select one of the three modes in which the Snap executes. Available options are:

  • Validate & Execute: Performs limited execution of the Snap, and generates a data preview during Pipeline validation. Subsequently, performs full execution of the Snap (unlimited records) during Pipeline runtime.

  • Execute only: Performs full execution of the Snap during Pipeline execution without generating preview data.

  • Disabled: Disables the Snap and all Snaps that are downstream from it.
