eXtreme Snowflake Snap Pack

On this Page

Overview

Snowflake is a data warehousing service in the cloud. Starting with 4.16, the eXtreme Snowflake Snap Pack, enables SnapLogic eXtreme users to read and write directly to Snowflake. Previously, eXtreme users would have to build a standard-mode pipeline for reading and writing the data to an S3 bucket. This Snap Pack eliminates the need to implement a standard-mode pipeline, making the workflow easier so that you can write directly to Snowflake, instead of writing to the S3 bucket first and then to the Snowflake data warehouse. This feature saves execution time in the end-to-end pipeline by eliminating a step and saves money because users do not have to use their Groundplex resources to move the data from the S3 bucket to the Snowflake Data Warehouse. Users can now avoid moving large volumes of data around from cluster to cluster in their data centers. 

Support

The eXtreme Snowflake Snap Pack uses the Spark 2.4 connector to support bi-directional data movement within eXtreme Pipelines.

Purpose of Snaps

The Snowflake Select Snap reads data from a source database in the Snowflake data warehouse to be passed through an eXtreme Pipeline. After the data is processed, the eXtreme Pipeline can write the data to any location in one of the following data formats: JSON, Avro, Parquet, CSV, ORC. When data is passed through an eXtreme Pipeline, the Snowflake Insert Snap upstream takes data from an S3 bucket or a Snowflake data warehouse in any format (JSON, Avro, Parquet, CSV, ORC) and writes it to a database in the Snowflake data warehouse.


Example Pipeline

In this example, we are analyzing sales data by marital status in certain states. The sales data is stored in a database in the Snowflake data warehouse, where you want the processed data to remain for future analysis.

 Understanding the Pipeline Workflow

Snowflake Select: Reads the data from the Snowflake source database.

Router: Categorizes the sales by marital status (single, divorced, undeclared).

Union: Groups the data by into these categories.

Filter: Filters out the data for inputs 1 and 2 (single and divorced) by state.

Aggregate: Performs aggregate operation to obtain average for sales, quantity, and wholesales with the sum of all wholesales.

Snowflake Insert: Writes the resulting data back to the Snowflake data warehouse.

Download


  File Modified

File Snowflake_Basic_UseCase.slp

Feb 07, 2019 by John Brinckwirth


Snap Pack History

 Click to view/expand

4.25 (main12833)

  • No updates made.

4.25 (main11181)

  • No updates made.

4.25 (main9554)

  • No updates made.

4.24 (main8556)

  • Enhances the Snowflake Select - Spark 2.x SQL Snap by adding the Enable SQL query editor check box to enable you to enter the SQL query in a separate editor window.

4.23 (main7430)

  • Accounts support validation. Thus, you can click Validate in the account settings dialog to validate that your account is configured correctly. 
  • Enhances Snowflake Select/Insert Snaps to support Snap suggestions for the tables. You can click  to retrieve a list of available table names, based on your account configuration.

4.22 (main6403)

  • No updates made.

4.21 (snapsmrc542)

  • No updates made.

4.20 (snapsmrc535)

  • Added a new field, Table Insert Policy, in the Snowflake Insert Snap to handle duplicate records in the insert statement.

4.19 (snapsmrc528)

  • No updates made.

4.18 (snapsmrc523)

  • No updates made.

4.17 Patch ALL7402

  • Pushed automatic rebuild of the latest version of each Snap Pack to SnapLogic UAT and Elastic servers.

4.17 (snapsmrc515)

  • No updates made. Automatic rebuild with a platform release.

4.16 (snapsmrc508)

  • New Snap Pack. Extract and insert data from and to a database in your Snowflake Data Warehouse. Snaps in this Snap Pack are:
    • Snowflake Select - Spark 2.x SQL: Extracts data from a specified database in the Snowflake data warehouse.
    • Snowflake Insert - Spark 2.x SQL: Inserts data into a specified database in the Snowflake data warehouse.