On this Page
Snap type: | Write | |||||||
---|---|---|---|---|---|---|---|---|
Description: | This Snap executes a Snowflake bulk load, writing data into an Amazon S3 bucket or a Microsoft Azure Storage Blob. The Snap creates temporary files in JCC when the staging location is internal and data source is Input view. These temporary files are removed automatically once the Pipeline completes execution.
| |||||||
Prerequisites: | You should have minimum permissions on the database to execute Snowflake Snaps. To understand if you already have them, you must retrieve the current set of permissions. The following commands enable you to retrieve those permissions. SHOW GRANTS ON DATABASE <database_name> SHOW GRANTS ON SCHEMA <schema_name> SHOW GRANTS TO USER <user_name>
The following commands enable minimum privileges in the Snowflake Console: grant usage on database <database_name> to role <role_name>; grant usage on schema <database_name>.<schema_name>; grant "CREATE TABLE" on database <database_name> to role <role_name>; grant "CREATE TABLE" on schema <database_name>.<schema_name>; For more information on Snowflake privileges, refer to Access Control Privileges. The below are mandatory when using an external staging location: When using an Amazon S3 bucket for storage:
When using a Microsoft Azure storage blob:
| |||||||
Internal SQL Commands | This Snap uses the following Snowflake commands internally: | |||||||
Support and limitations: |
| |||||||
Account: | This Snap uses account references created on the Accounts page of SnapLogic Manager to handle access to this endpoint. See Snowflake Account for information on setting up this type of account. | |||||||
Views: |
| |||||||
Settings | ||||||||
Label | Required. The name provided for the instance. You can modify this to be more specific, especially if you have more than one of the same Snap in your pipeline. | |||||||
Schema name | Required. The database schema name. In case it is not defined, then the suggestion for the Table Name will retrieve all tables names of all schemas. The property is suggestible and will retrieve available database schemas during suggest values. The values can be passed using the pipeline parameters but not the upstream parameter. Default value: [None] | |||||||
Table name | Required. The name of the table to execute bulk load operation on. The values can be passed using the pipeline parameters but not the upstream parameter. Default value: [None] | |||||||
Create table if not present | Whether the table should be automatically created if not already present. Default value: Not selected
This should not be used in production since there are no indexes or integrity constraints on any column and the default varchar() column is over 30k bytes. | |||||||
Data source | The source from where the data is loaded from. The options available are Input view and Staged files. When the option 'Input View' is selected, leave the Table Coumns field empty, and if the 'Staged files' option is selected, provide the column names for the Table Coumns to which the records are to be added. Default value: Input view | |||||||
Preserve case sensitivity | If selected, the case sensitivity of the column names will be preserved. Default value: Not selected | |||||||
Load empty strings | If selected, empty string values in the input documents are loaded as empty strings to the string-type fields. Otherwise, empty string values in the input documents are loaded as null. Null values are loaded as null regardless. Default value: Not selected | |||||||
Staging Location | The type of staging location that is used for data loading. The options available include:
Default value: Internal | |||||||
Target | An internal or external location where data can be loaded. If you selected External in the Staging Location field above, a staging area is created in S3 or Azure as required; else a staging area is created in Snowflake's internal location. The staging field accepts the following input:
Format: @<Schema>.<StageName>[/path]
Format: @~/[path]
Format: s3://[path]
The value for the expression has to be provided as a pipeline parameter and cannot be provided from the Upstream Snap for performance reasons when the property is used as an expression parameter. Default value: [None] | |||||||
Storage Integration | The pre-defined storage integration which is used to authenticate the external stages. The value for the expression has to be provided as a pipeline parameter and cannot be provided from the Upstream Snap for performance reasons when the property is used as an expression parameter. Default value: None | |||||||
Staged file list | List of the staged file(s) loaded to the target file. | |||||||
File name pattern | A regular expression pattern string, enclosed in single quotes, specifying the file names and /or path to match. Default value: [None] | |||||||
File format object | Specify an existing file format object to use for loading data into the table. The specified file format object determines the format type such as CSV, JSON,XMLand AVRO, or other format options for data files. Default value: [None] | |||||||
File Format type | Specify a predefined file format object to use for loading data into the table. The available file formats include CSV, JSON, XML and AVRO. | |||||||
File Format option | Specify the file format option. Separate multiple options by using blank spaces and commas. You can use various file format options including binary format which passes through in the same way as other file formats. See File Format Type Options for additional information. Before loading binary data into Snowflake, you must specify the binary encoding format, so that the Snap can decode the string type to binary types before loading into Snowflake. This can be done by specifying the following binary file format:
However, the file you upload and download must be in similar formats. For instance, if you load a file in HEX binary format, you should specify the HEX format for download as well. Example: BINARY_FORMAT=UTF-8 Default value: [None] When using external staging locations
| |||||||
Table Columns | Conditional. This specifies the table columns to use in the Snowflake COPY INTO query. This only applies when the Data source is 'Staged files'. This configuration is useful when the staged files contain a subset of the columns in the Snowflake table. For example, if the Snowflake table contains columns A, B, C and D and the staged files contain columns A and D then the Table Columns field would be have two entries with values A and D. The order of the entries should match the order of the data in the staged files. Default value: [None] If the Data source is selected with 'Input view', the snap would throw an error as displayed: | |||||||
Select Query | Activates when you select Staged files in the Data source field. Specify the The SELECT statement transform option enables querying the staged data files by either reordering the columns or loading a subset of table data from a staged file. For example, (OR)
We recommend you not to use temporary stage while loading your data. Default value: N/A | |||||||
Encryption type | Specifies the type of encryption to be used on the data. The available encryption options are:
Default value: No default value. The KMS Encryption option is available only for S3 Accounts (not for Azure Accounts) with Snowflake. If Staging Location is set to Internal, and when Data source is Input view, the Server Side Encryption and Server-Side KMS Encryption options are not supported for Snowflake snaps: This happens because Snowflake encrypts loading data in its internal staging area and does not allow the user to specify the type of encryption in the PUT API (see Snowflake PUT Command Documentation.) | |||||||
KMS key | The KMS key that you want to use for S3 encryption. For more information about the KMS key, see AWS KMS Overview and Using Server Side Encryption. Default value: No default value. This property applies only when you select Server-Side KMS Encryption in the Encryption Type field above. | |||||||
Additional Options | ||||||||
Buffer size (MB) | This property is required when bulk loading to Snowflake using AWS S3 as the external staging area. It specifies the data in MB to be loaded into the S3 bucket, at a time. Minimum value: 5 MB Maximum value: 5000 MB Default value: 10 MB S3 allows a maximum of 10000 parts to be uploaded so this property must be configured accordingly to optimize the bulk load. Refer to Upload Part for more information on uploading to S3. | |||||||
Manage Queued Queries | Select this property to decide whether the Snap should continue or cancel the execution of the queued Snowflake Execute SQL queries when you stop the pipeline. If you select Cancel queued queries when pipeline is stopped or if it fails, then the read queries under execution are cancelled, whereas the write type of queries under execution are not cancelled. Snowflake internally determines which queries are safe to be cancelled and cancels those queries. Default value: Continue to execute queued queries when pipeline is stopped or if it fails | |||||||
On Error | Specifies an action to be performed when errors are encountered in a file. The available actions include:
Default value: ABORT_STATEMENT | |||||||
Error Limit | Specifies to skip file when the number of errors in the file exceeds the specified error limit on when SKIP_FILE_number is selected for On Error. Default value: 0 | |||||||
Error percentage limit | Specifies to skip file when the percentage of errors in the file exceeds the specified percentage when SKIP_FILE_number% is selected for On Error. Default value: 0 | |||||||
Size limit | Specifies the maximum size (in bytes) of data to be loaded. At least one file is loaded regardless of the value specified for SIZE_LIMIT unless there is no file to be loaded. A null value indicates no size limit. Default value: 0 | |||||||
Purge | Specifies whether to purge the data files from the location automatically after the data is successfully loaded. Default value: Not Selected | |||||||
Return Failed Only | Specifies whether to return only files that have failed to load while loading. Default value: Not Selected | |||||||
Force | Specifies to load all files, regardless of whether they have been loaded previously and have not changed since they were loaded. Default value: Not Selected | |||||||
Validation mode | This mode is useful for visually verifying the data before unloading it. The available options are: NONE RETURN_n_ROWS RETURN_ERRORS RETURN_ALL_ERRORS Default value: NONE | |||||||
Rows to return | Specifies the number of rows not loaded into the corresponding table. Instead, the data is validated to be loaded and returns results based on the validation option specified. It can be one of the following values: RETURN_n_ROWS | RETURN_ERRORS | RETURN_ALL_ERRORS Default value: 0 | |||||||
Snap execution | Select one of the three modes in which the Snap executes. Available options are:
|
Instead of building multiple Snaps with inter dependent DML queries, it is recommended to use the Stored Procedure or the Multi Execute Snap.
In a scenario where the downstream Snap does depends on the data processed on an Upstream Database Bulk Load Snap, use the Script Snap to add delay for the data to be available.
For example, when performing a create, insert and a delete function sequentially on a pipeline, using a Script Snap helps in creating a delay between the insert and delete function or otherwise it may turn out that the delete function is triggered even before inserting the records on the table.
Examples
Loading Binary Data Into Snowflake
The following example Pipeline demonstrates how you can convert the staged data into binary data using the binary file format before loading into Snowflake database.
To begin with, we configure the Snowflake Execute Snap with this query: select * from "PUBLIC"."EMP2" limit 25 –
this query reads 25 records from the Emp2 table.
Then, we configure the Mapper Snap with the output from the upstream Snap by mapping the employee details to the target table. Note that the Bio column is binary data type and the Text column is varbinary data type. Upon validation, the Mapper Snap passes the output with the given mappings (employee details) in the table.
Next, we configure the Snowflake - Bulk Load Snap to load the records into Snowflake. We set the File format option as BINARY_FORMAT=UTF-8 to enable the Snap to encode the binary data before loading.
Upon validation, the Snap loads the database with 25 employee records.
Output Preview | Data in Snowflake |
---|---|
Finally, we connect the JSON Formatter Snap to the Snowflake - Bulk Load Snap to transform the binary data to JSON format, and finally write this output in S3 using the File Writer Snap.
Transforming Data Using Select Query Before Loading Into Snowflake
The following example Pipeline demonstrates how you can reorder the columns using the SELECT statement transform option before loading data into Snowflake database. We use the Snowflake - Bulk Load Snap to accomplish this task.
Prerequisite: You must create an internal or external stage in Snowflake before you transform your data. This stage is used for loading data from source files into the tables of Snowflake database.
To begin with, we create a stage using a query in the following format. Snowflake supports both internal (Snowflake) and external (Microsoft Azure and AWS S3) stages for this transformation.
"CREATE STAGE IF NOT EXISTS "+_stageName+" url='"+_s3Location+"' CREDENTIALS = (AWS_KEY_ID='string' AWS_SECRET_KEY='string')
"
This query creates an external stage in Snowflake pointing to S3 location with AWS credentials (Key ID and Secrete Key).
We recommend you not to use a temporary stage to prevent issues while loading and transforming your data.
Now, we add the Snowflake - Bulk Load Snap to the canvas and configure it to transform the data in the staged file SNAP7517_EXT_CSV.csv by providing the following query in the Select Query field:
"select t.$1,t.$4,t.$3,t.$4,t.$5,t.$6,t.$7 from @"+_stageName+" t"
You must provide the stage name along with schema name in the Select Query, else the Snap displays an error. For instance,
SELECT t.$1,t.$4,t.$3,t.$4,t.$5,t.$6,t.$7 from @mys3stage t", displays an error.
SELECT t.$1,t.$4,t.$3,t.$4,t.$5,t.$6,t.$7 from @<Schema Name>.<stagename> t", executes correctly.
Next, we connect a Snowflake Select Snap with the Snowflake - Bulk Load Snap to select the data from the Snowflake database. Upon validation we can view the transformed data in the output view.
Downloads
Release | Snap Pack Version | Date | Type | Updates |
---|---|---|---|---|
November 2024 | main29029 |
| Stable | Updated and certified against the current SnapLogic Platform release. |
August 2024 | 438patches28811 |
| Latest | Fixed an issue with the Snowflake Snaps where the Snaps intermittently displayed an error requesting to |
August 2024 | 438patches28667 |
| Latest | Added the Snowflake - Snowpipe Streaming Snap that enables data insertion into Snowflake using the Snowpipe Streaming API. It allows for continuous data ingestion into Snowflake tables as soon as data becomes available. |
August 2024 | main27765 | Stable | Upgraded the | |
May 2024 | 437patches27531 |
| Latest | Enhanced the Snowflake - Select Snap with Time Travel query support that enables you to access historical data at any point within a defined period. This includes restoring data-related objects, duplicating and backing up data, and analyzing data usage. |
May 2024 | 437patches27156 |
| Latest |
|
May 2024 | 437patches26821 |
| Latest |
|
May 2024 | 437patches26508 |
| Latest |
|
May 2024 | main26341 |
| Stable |
|
February 2024 | 436patches25630 |
| Latest |
|
February 2024 | main25112 |
| Stable | Updated and certified against the current SnapLogic Platform release. |
November 2023 | 435patches24865 |
| Latest | Fixed an issue across the Snowflake Snaps that populated all suggestions for the Schema and Table Names existing in the configured Snowflake Account. Now, the Snaps only populate suggestions related to the database configured in the Account. |
November 2023 | 435patches24110 |
| Latest | Added a lint warning to the Snowflake-Bulk Load Snap that recommends users to select the Purge checkbox when the Data source is input view and the Staging location is External. |
November 2023 | main23721 |
| Stable | The Snowflake Snap Pack is now bundled with the default Snowflake JDBC driver v3.14. |
August 2023 | 434patches23541 |
| Latest | Fixed an issue with the Snowflake-Bulk Load Snap where the Snap wrote irrelevant errors to the error view when both of the following conditions occurred:
Now, the Snap writes the correct errors to the error view. |
August 2023 | main22460 |
| Stable | The Snowflake - Execute Snap now includes a new Query type field. When Auto is selected, the Snap tries to determine the query type automatically. |
May 2023 | 433patches21890 |
| Latest |
|
May 2023 | 433patches21370 |
| Latest |
|
May 2023 | main21015 |
| Stable |
|
February 2023 | 432patches20906 |
| Latest |
|
February 2023 | 432patches20266 |
| Latest | Fixed an issue with the Snowflake - Bulk Load Snap that resulted in lowercase (or mixed case) column names when creating a new table under specific conditions. The new Create table with uppercase column names checkbox addresses this issue. |
February 2023 | 432patches20120 |
| Latest | The Snowflake Bulk Load, Bulk Upsert, and Unload Snaps now support expressions for the Staging location field. |
February 2023 | main19844 |
| Stable |
|
November 2022 | 431patches19581 |
| Latest |
|
November 2022 | 431patches19454 |
| Latest | The Snowflake Snap Pack supports geospatial data types. As the Snowflake Snap Pack requires using our custom Snowflake JDBC driver for full support of all data types, contact support@snaplogic.com for details. |
November 2022 | 431patches19220 |
| Latest | The Snowflake S3 OAuth2 Account now support expressions for external staging fields. |
November 2022 | 431patches19220
|
| Latest |
|
November 2022 | main18944 |
| Stable |
|
November 2022 | 430patches18911 |
| Latest | Because of performance issues, all Snowflake Snaps now ignore the Cancel queued queries when pipeline is stopped or if it fails option for Manage Queued Queries, even when selected. Snaps behave as though the default Continue to execute queued queries when the Pipeline is stopped or if it fails option were selected. |
October 2022 | 430patches18781 |
| Latest | The Snowflake Insert and Snowflake Bulk Upsert Snaps now do not fail with the The Snowflake Bulk Load Snap now works as expected when you configure On Error with SKIP_FILE_*error_percent_limit*% and set the Error Percent Limit to more than the percentage of rows with invalid data in the CSV file. |
October 2022 | 430patches18432 |
| Latest | The Snowflake Bulk Load Snap now has a Validation Errors Type dynamic field, which provides options for displaying validation errors. You can now choose Aggregate errors per row to display a summary view of errors. |
September 2022 | 430patches17962 |
| Latest | The Snowflake Bulk Load Snap now triggers the metadata query only once even for invalid input, thereby improving the performance of Snap. |
September 2022 | 430patches17894 |
| Latest | The Snowflake Select Snap now works as expected when the table name is dependent on an upstream input. |
August 2022 | 430patches17748 |
| Latest | Fixes in the Snowflake Bulk Load Snap:
|
August 2022 | 430patches17377 |
| Latest |
|
August 2022 | main17386 |
| Stable | The following Snowflake Accounts support Key Pair Authentication. |
4.29 Patch | 429patches16478 |
| Latest | Fixed an issue with Snowflake - Bulk Load Snap where Snap failed with 301 status code - SnapDataException. |
4.29 Patch | 429patches16458 |
| Latest |
|
4.29 | main15993 |
| Stable | Upgraded with the latest SnapLogic Platform release. |
4.28 Patch | 428patches15236 |
| Latest |
|
4.28 | main14627 |
| Stable |
|
4.27 | 427patches12999 |
| Latest | Enhanced the Snowflake SCD2 Snap to support Pipeline parameters for Natural key and Cause-historization fields. |
4.27 | main12833 |
| Stable |
|
4.26 Patch | 426patches11469 |
| Latest | Fixed an issue with Snowflake Insert and Snowflake Bulk Load Snaps where the schema names or database names containing underscore (_) caused the time out of Pipelines. |
4.26 | main11181 |
| Stable |
|
4.25 | 425patches10190 | Latest | Enhanced the Snowflake S3 Database and Snowflake S3 Dynamic accounts with a new field S3 AWS Token that allows you to connect to private and protected Amazon S3 buckets. | |
4.25 | main9554 |
| Stable | Upgraded with the latest SnapLogic Platform release. |
4.24 Patch | 424patches8905 |
| Latest | Enhanced the Snowflake - Bulk Load Snap to allow transforming data using a new field Select Query before loading data into the Snowflake database. This option enables you to query the staged data files by either reordering the columns or loading a subset of table data from a staged file. This Snap supports CSV and JSON file formats for this data transformation. |
4.24 | main8556 | Stable | Enhanced the Snowflake - Select Snap to return only the selected output fields or columns in the output schema (second output view) using the Fetch Output Fields In Schema check box. If the Output Fields field is empty all the columns are visible. | |
4.23 Patch | 423patches7905 |
| Latest | Fixed the performance issue in the Snowflake - Bulk Load Snap while using External Staging on Amazon S3. |
4.23 | main7430 |
| Stable |
|
4.22 Patch | 422patches7246 |
| Latest | Fixed an issue with the Snowflake Snaps that failed while displaying similar error in the Snowflake URL connection: |
4.22 Patch | 422patches6849 |
| Latest |
|
4.22 | main6403 |
| Stable | Updated with the latest SnapLogic Platform release. |
4.21 Patch | 421patches6272 |
| Latest | Fixes the issue where Snowflake SCD2 Snap generates two output documents despite no changes to Cause-historization fields with DATE, TIME and TIMESTAMP Snowflake data types, and with Ignore unchanged rows field selected. |
4.21 Patch | 421patches6144 |
| Latest |
|
4.21 Patch | db/snowflake8860 |
| Latest | Added a new field, Handle Timestamp and Date Time Data, to Snowflake Lookup Snap. This field enables you to decide whether the Snap should translate UTC time to your local time and the format of the Date Time data. |
4.21 Patch | MULTIPLE8841 |
| Latest | Fixed the connection issue in Database Snaps by detecting and closing open connections after the Snap execution ends. |
4.21 | snapsmrc542 |
| Stable | Upgraded with the latest SnapLogic Platform release. |
4.20 Patch | db/snowflake8800 |
| Latest |
|
4.20 Patch | db/snowflake8758 |
| Latest | Re-release of fixes from db/snowflake8687 for 4.20: Fixes the Snowflake Bulk Load snap where the Snap fails to load documents containing single quotes when the Load empty strings checkbox is not selected. |
4.20 | snapsmrc535 |
| Stable | Upgraded with the latest SnapLogic Platform release. |
4.19 Patch | db/snowflake8687 |
| Latest | Fixed the Snowflake Bulk Load snap where the Snap fails to load documents containing single quotes when the Load empty strings checkbox is not selected. |
4.19 Patch | db/snowflake8499 |
| Latest | Added the property Handle Timestamp and Date Time Data to Snowflake - Execute and Snowflake - Select Snaps. This property enables you to decide whether the Snap should translate UTC time to your local time. |
4.19 Patch | db/snowflake8412 |
| Latest | Fixed an issue with the Snowflake - Update Snap wherein the Snap is unable to perform operations when:
|
4.19 | snaprsmrc528 |
| Stable |
|
4.18 Patch | db/snowflake8044 |
| Latest | Fixed an issue with the Snowflake - Select Snap wherein the Snap converts the Snowflake-provided timestamp value to the local timezone of the account. |
4.18 Patch | db/snowflake8044 |
| Latest | Enhanced the Snap Pack to support AWS SDK 1.11.634 to fix the NullPointerException issue in the AWS SDK. This issue occurred in AWS-related Snaps that had HTTP or HTTPS proxy configured without a username and/or password. |
4.18 Patch | MULTIPLE7884 |
| Latest | Fixed an issue with the PostgreSQL grammar to better handle the single quote characters. |
4.18 Patch | db/snowflake7821 |
| Latest | Fixed an issue with the Snowflake - Execute Snap wherein the Snap is unable to support the '$' character in query syntax. |
4.18 Patch | MULTIPLE7778 |
| Latest | Updated the AWS SDK library version to default to Signature Version 4 Signing process for API requests across all regions. |
4.18 Patch | db/snowflake7739 |
| Latest |
|
4.18 | snapsmrc523 |
| Stable | Added the Use Result Query property to the Multi Execute Snap, which enables you to write results to an output view. |
4.17 | ALL7402 |
| Latest | Pushed automatic rebuild of the latest version of each Snap Pack to SnapLogic UAT and Elastic servers. |
4.17 Patch | db/snowflake7396 |
| Latest | Fixed an issue wherein bit data types in the Snowflake - Select table convert to true or false instead of 0 or 1. |
4.17 Patch | db/snowflake7334 |
| Latest | Added AWS Server-Side Encryption support for AWS S3 and AWS KMS (Key Management Service) for Snowflake Bulk Load, Snowflake Bulk Upsert, and Snowflake Unload Snaps. |
4.17 | snapsmrc515 |
| Latest |
|
4.16 Patch | db/snowflake6945 |
| Latest | Fixed an issue with the Snowflake Lookup Snap failing when Date datatype is used in JavaScript functions. |
4.16 Patch | db/snowflake6928 |
| Latest | Added support for file format options for input data from upstream Snaps, to the Snowflake Bulk Load Snap. |
4.16 Patch | db/snowflake6819 |
| Latest |
|
4.16 | snapsmrc508 |
| Stable |
|
4.15 | snapsmrc500 |
| Stable |
|
4.14 | snapsmrc490 |
| Stable | Upgraded with the latest SnapLogic Platform release. |
4.13 | snapsmrc486 |
| Stable | Upgraded with the latest SnapLogic Platform release. |
4.12 | snapsmrc480 |
| Stable | Upgraded with the latest SnapLogic Platform release. |
4.11 Patch | MULTIPLE4377 |
| Latest | Fixed a document call issue that was slowing down the Snowflake Bulk Load Snap. |
4.11 Patch | db/snowflake4283 |
| Latest | Snowflake Bulk Load - Fixed an issue by adding PUT command to the list of DDL command list for Snowflake. |
4.11 Patch | db/snowflake4273 |
| Latest | Snowflake Bulk Load - Resolved an issue with Snowflake Bulk Load Delimiter Consistency (comma and newline). |
4.11 | snapsmrc465 |
| Stable | Upgraded with the latest SnapLogic Platform release. |
4.10 Patch | snowflake4133 |
| Latest | Updated the Snowflake Bulk Load Snap with Preserve case sensitivity property to preserve the case sensitivity of column names. |
4.10 | snapsmrc414 |
| Stable |
|
4.9.0 Patch | snowflake3234 |
| Latest | Enhanced Snowflake - Execute Snap results to include additional details |
4.9.0 Patch | snowflake3125 |
| Latest | Addressed an issue in Snowflake Bulk Load where the comma character in a value is not escaped. |
4.9 | snapsmrc405 |
| Stable | JDBC Driver Class property added to enable the user to custom configure the JDBC driver in the Database and the Dynamic accounts. |
4.8.0 Patch | snowflake2760 |
| Latest | Potential fix for JDBC deadlock issue. |
4.8.0 Patch | snowflake2739 |
| Latest | Addressed an issue with the Snowflake schema not correctly represented in the Mapper Snap. |
4.8 | snapsmrc398 |
| Stable |
|