Snowflake - Bulk Load

Overview

You can use the Snowflake - Bulk Load Snap to load data from input documents or from files stored on external object stores, such as Amazon S3, Google Cloud Storage, and Azure Blob Storage, into the Snowflake data warehouse.

Snap Type

The Snowflake - Bulk Load Snap is a Write-type Snap that performs a bulk load operation.

Prerequisites

You must have minimum permissions on the database to execute Snowflake Snaps. To check whether you already have them, retrieve your current set of permissions using the following commands:

SHOW GRANTS ON DATABASE <database_name>;
SHOW GRANTS ON SCHEMA <schema_name>;
SHOW GRANTS TO USER <user_name>;
  • Your Snowflake account must be enabled for the private preview features required to create Iceberg tables.

  • An external volume must be created in a Snowflake worksheet or through the Snowflake - Execute Snap (a sketch follows this list). Learn more about creating an external volume.
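
For reference, the following is a minimal sketch of creating an external volume on Amazon S3 from a Snowflake worksheet. The volume name, bucket path, and IAM role ARN are placeholders for illustration, not values from this article:

CREATE OR REPLACE EXTERNAL VOLUME my_iceberg_volume
  STORAGE_LOCATIONS = (
    (
      NAME = 'my-s3-location'
      STORAGE_PROVIDER = 'S3'
      STORAGE_BASE_URL = 's3://my-bucket/iceberg/'
      STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::123456789012:role/my_snowflake_role'
    )
  );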

Security Prerequisites

You must have the following permissions in your Snowflake account to execute this Snap: 

  • Usage (DB and Schema): Privilege to use the database, role, and schema.

  • Create table: Privilege to create a temporary table within this schema.

The following commands enable minimum privileges in the Snowflake console:

grant usage on database <database_name> to role <role_name>;
grant usage on schema <database_name>.<schema_name> to role <role_name>;
grant create table on schema <database_name>.<schema_name> to role <role_name>;

Learn more about Snowflake privileges: Access Control Privileges.

Internal SQL Commands

This Snap uses the following Snowflake commands internally:

  • COPY INTO - Enables loading data from staged files to an existing table.

  • PUT - Enables staging files internally in a table stage or user stage.
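
Conceptually, the Snap's internal sequence resembles the following worksheet commands. The file path, table name, and file format below are placeholder assumptions for this sketch, not the Snap's actual generated SQL:

-- Stage a local file in the table stage for "employees" (placeholder names)
PUT file:///tmp/employees.csv @%employees;

-- Load the staged file into the existing table
COPY INTO employees
  FROM @%employees
  FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1);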

Requirements for External Storage Location

The following are mandatory when using an external staging location:

When using an Amazon S3 bucket for storage:

  • The Snowflake account must be configured with the S3 Access-key ID, S3 Secret key, S3 Bucket, and S3 Folder.

  • The Amazon S3 bucket to which Snowflake writes the output files must reside in the same region as your cluster.
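
As an illustration, loading from an external S3 location with explicit credentials might look like the following. The table name, bucket, folder, and credential placeholders are assumptions for this sketch:

COPY INTO employees
  FROM 's3://my-bucket/my-folder/'
  CREDENTIALS = (AWS_KEY_ID = '<s3_access_key_id>' AWS_SECRET_KEY = '<s3_secret_key>')
  FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1);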

When using a Microsoft Azure storage blob:

  • A working Snowflake Azure database account.

When using a Google Cloud Storage:

  • Provide permissions such as Public access and Access control to the Google Cloud Storage bucket on the Google Cloud Platform.

Support for Ultra Pipelines

Works in Ultra Pipelines. However, we recommend that you do not use this Snap in an Ultra Pipeline.

Limitations

  • The special character '~' is not supported in the temp directory name on Windows because it is reserved for the user's home directory.

  • Snowflake provides the option to use Cross Account IAM in the external staging. You can adopt cross-account access through this option.