To solve this big data problem, we have designed example Pipelines that run on the following platforms:

  • Snowflake
  • AWS Redshift
  • Microsoft Azure Synapse
  • Microsoft Azure Databricks Lakehouse Platform
  • Google BigQuery

Example Pipelines - Snowflake

Pipeline 1 (SQL to Snowflake)

This Pipeline does not require source tables, as they are created on Snowflake during runtime using SQL. Output target tables are also created on Snowflake. The Pipeline writes from the Snowflake database to the Snowflake database. Users need not have Snowflake accounts; however, they require SQL experience.

You can download the Pipeline from here.
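As a rough illustration of this pattern, the sketch below assembles Snowflake SQL that creates a source table on the fly at runtime and then derives a target table from it. The table names, generated columns, and row count are hypothetical placeholders, not values from the downloadable Pipeline.

```python
def sql_to_snowflake_statements(source_table, target_table):
    """Sketch of the SQL-to-Snowflake pattern: the source table is
    generated at runtime from SQL (no files or pre-existing tables),
    and the target table is derived from it on Snowflake.
    Table names and columns are illustrative placeholders."""
    create_source = (
        f"CREATE OR REPLACE TEMPORARY TABLE {source_table} AS "
        "SELECT seq4() AS id, uniform(1, 100, random()) AS amount "
        "FROM TABLE(GENERATOR(ROWCOUNT => 1000));"
    )
    create_target = (
        f"CREATE OR REPLACE TABLE {target_table} AS "
        f"SELECT id, amount * 10 AS amount_scaled FROM {source_table};"
    )
    return [create_source, create_target]
```

Because both tables are built from SQL alone, the Pipeline needs no staged files — which is why it suits demos and simple tasks.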

Pipeline 2 (S3 to SF)

This Pipeline requires the source data to be present in the S3 location; the source tables are created on Snowflake on the fly using the ELT Load Snap, and an output target table is created on Snowflake. The Pipeline converts CSV data to database tables, which can be used for a wide variety of complex tasks. It requires table schema setup and an AWS/Snowflake account. Users do not require SQL experience for this Pipeline.


You can download the Pipeline from here.
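The S3-to-Snowflake load step can be sketched as a Snowflake COPY INTO statement. The table, bucket, and storage-integration names below are placeholder assumptions, not values from the Pipeline.

```python
def snowflake_copy_from_s3(table, s3_uri, storage_integration):
    """Build a Snowflake COPY INTO statement that loads CSV files
    from an S3 location into an existing table (schema set up first).
    All identifiers here are illustrative placeholders."""
    return (
        f"COPY INTO {table} "
        f"FROM '{s3_uri}' "
        f"STORAGE_INTEGRATION = {storage_integration} "
        "FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1);"
    )
```

In the Pipeline itself the ELT Load Snap generates the equivalent load for you, which is why no SQL experience is needed.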

Pipeline 3 (Snowflake to Snowflake)

The Pipeline writes from the Snowflake database to the Snowflake database. However, it requires source tables to be present in the Snowflake database to run the Pipeline. An output target table is created on Snowflake. A Snowflake account is required. Users do not require SQL experience for this Pipeline. The Pipeline can be used for a wide variety of complex tasks.


You can download the Pipeline from here.

Example Pipelines - AWS Redshift

Pipeline 1 (SQL to Redshift)

This Pipeline does not require source tables or raw data, as they are created on AWS Redshift on the fly from SQL. Output target tables are also built on AWS Redshift. However, users require SQL experience. It can be used only for ELT demos or simple tasks.


You can download the Pipeline from here.

Pipeline 2 (S3 to RS)

This Pipeline does not require source tables, as they are created on AWS Redshift on the fly using the ELT Load Snap, and the output target tables are made on AWS Redshift. The Pipeline converts data from CSV to database tables and can be used for a wide variety of complex tasks. It requires table schema setup and an AWS Redshift account. Users do not require SQL experience for this Pipeline.


You can download the Pipeline from here
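Under the hood, loading CSV from S3 into Redshift amounts to a COPY command like the one sketched here. The table name, bucket path, and IAM role ARN are placeholder assumptions.

```python
def redshift_copy_from_s3(table, s3_uri, iam_role_arn):
    """Build an Amazon Redshift COPY command that loads CSV data
    from S3 into an existing table (schema must be set up first).
    The table, bucket, and role ARN are illustrative placeholders."""
    return (
        f"COPY {table} "
        f"FROM '{s3_uri}' "
        f"IAM_ROLE '{iam_role_arn}' "
        "FORMAT AS CSV "
        "IGNOREHEADER 1;"
    )
```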

Pipeline 3 (RS to RS)

The Pipeline writes from AWS Redshift to AWS Redshift. However, source tables must be present in AWS Redshift before execution. Users do not require SQL experience. The Pipeline can be used for a wide variety of complex tasks. 


You can download the Pipeline from here.
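A warehouse-to-warehouse Pipeline of this kind boils down to SQL that reads existing tables and materializes a new one. The aggregation below is a hypothetical example of such a step; the table and column names are not from the Pipeline.

```python
def redshift_table_to_table(source_table, target_table):
    """Sketch of a Redshift-to-Redshift step: read an existing
    source table and materialize a derived target table.
    Table and column names are illustrative placeholders."""
    return (
        f"CREATE TABLE {target_table} AS "
        "SELECT store_id, COUNT(*) AS return_count "
        f"FROM {source_table} "
        "GROUP BY store_id;"
    )
```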

Example Pipelines - Microsoft Azure Synapse

Pipeline 1 (SQL to SY)

This Pipeline does not require source tables or raw data, as they are created on Azure Synapse on the fly from SQL. Output target tables are also created on Azure Synapse. However, users require SQL experience. It can be used only for ELT demos or simple tasks.


You can download the Pipeline from here

Pipeline 2 (Microsoft Azure Data Lake Storage to Azure Synapse)

This Pipeline does not require source tables, as they are created on Azure Synapse on the fly using the ELT Load Snap, and the output target tables are made on Azure Synapse. The Pipeline converts data from CSV (ADLS Gen 2 location) to database tables and can be used for a wide variety of complex tasks. It requires table schema setup and an Azure Synapse account. SQL experience is not needed for this Pipeline.


You can download the Pipeline from here
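The ADLS Gen 2 load step corresponds to a Synapse (dedicated SQL pool) COPY INTO statement, sketched below. The table name and storage URL are placeholder assumptions.

```python
def synapse_copy_from_adls(table, adls_url, firstrow=2):
    """Build an Azure Synapse COPY INTO statement that loads CSV files
    from an ADLS Gen2 location into an existing table.
    FIRSTROW = 2 skips a header row; identifiers are placeholders."""
    return (
        f"COPY INTO {table} "
        f"FROM '{adls_url}' "
        f"WITH (FILE_TYPE = 'CSV', FIRSTROW = {firstrow});"
    )
```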

Pipeline 3 (SY to SY)

The Pipeline writes from Azure Synapse to Azure Synapse. However, it requires source tables to be present in Azure Synapse before execution. SQL experience is not needed for this Pipeline. The Pipeline can be used for a wide variety of complex tasks.


You can download the Pipeline from here

Example Pipelines - Microsoft Azure Databricks Lakehouse Platform

Pipeline 1 (SQL to DLP)

This Pipeline does not require source tables or raw data, as they are created on the Microsoft Azure Databricks Lakehouse Platform (DLP) on the fly from SQL. Output target tables are also built on the Microsoft Azure Databricks Lakehouse Platform. The Pipeline writes from the DLP database to the DLP database. However, users require SQL experience. It can be used only for ELT demos or simple tasks.


You can download the Pipeline from here

Pipeline 2 (DBFS to DLP)

This Pipeline does not require source tables, as they are created on the Databricks Lakehouse Platform on the fly using the ELT Load Snap, and the output target tables are created on the Databricks Lakehouse Platform. The Pipeline converts data from CSV (DBFS location) to database tables and can be used for a wide variety of complex tasks. It requires table schema setup and a DLP account. SQL experience is not needed for this Pipeline.


You can download the Pipeline from here
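The DBFS load step maps onto a Databricks SQL COPY INTO statement, sketched below. The table name and DBFS path are placeholder assumptions.

```python
def databricks_copy_from_dbfs(table, dbfs_path):
    """Build a Databricks SQL COPY INTO statement that loads CSV files
    from a DBFS location into a Delta table.
    The table and path are illustrative placeholders."""
    return (
        f"COPY INTO {table} "
        f"FROM '{dbfs_path}' "
        "FILEFORMAT = CSV "
        "FORMAT_OPTIONS ('header' = 'true');"
    )
```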

Pipeline 3 (DLP to DLP)

The Pipeline writes from the Microsoft Azure Databricks Lakehouse Platform to the Microsoft Azure Databricks Lakehouse Platform. However, source tables must be present in the Microsoft Azure Databricks Lakehouse Platform before execution. An output target table is created on the Microsoft Azure Databricks Lakehouse Platform. Users do not require SQL experience. The Pipeline can be used for a wide variety of complex tasks.


You can download the Pipeline from here

Example Pipelines - Google BigQuery

Pipeline 1 (SQL to BQ)

This Pipeline does not require source tables or raw data, as they are created on Google BigQuery on the fly from SQL. Output target tables are also created on Google BigQuery. However, users require SQL experience. It can be used only for ELT demos or simple tasks.


You can download the Pipeline from here

Pipeline 2 (S3 to BQ)

This Pipeline does not require source tables, as they are created on Google BigQuery on the fly using the ELT Load Snap, and the output target tables are created on Google BigQuery. The Pipeline converts data from CSV (S3 location) to database tables and can be used for a wide variety of complex tasks. It requires table schema setup and a Google BigQuery account. SQL experience is not needed for this Pipeline.


You can download the Pipeline from here
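For comparison, a BigQuery CSV load can be expressed as a GoogleSQL LOAD DATA statement. Note that native LOAD DATA reads Cloud Storage URIs; loading from S3 as this Pipeline does goes through the ELT Load Snap, so the statement below is a simplified sketch with placeholder names.

```python
def bigquery_load_csv(table, uri, skip_rows=1):
    """Build a GoogleSQL LOAD DATA statement that loads CSV files
    into a BigQuery table. The table and URI are illustrative
    placeholders (native LOAD DATA expects gs:// URIs)."""
    return (
        f"LOAD DATA INTO {table} "
        f"FROM FILES (format = 'CSV', skip_leading_rows = {skip_rows}, "
        f"uris = ['{uri}']);"
    )
```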

Pipeline 3 (BQ to BQ)

The Pipeline writes from Google BigQuery to Google BigQuery. However, source tables must be present in Google BigQuery before execution. SQL experience is not needed for this Pipeline. The Pipeline can be used for a wide variety of complex tasks.


You can download the Pipeline from here


Source Data Sets Details:

File Name                | Volume | Rows
STORE_RETURNS_DEMO_3.csv | 128K   | 1000
STORE_DEMO_3.csv         | 128K   | 1000
CUSTOMER_DEMO_3.csv      | 707K   | 5000
STORE_DEMO_3.csv         | 271K   | 1000