

Use Case: Analyze Customer Returns for a Retail Store Chain

In a real business scenario, once customers receive their placed order, they have the option to return it to the retail chain. This use case demonstrates how you can use the SnapLogic ELT Snap Pack to analyze customer returns for a retail store chain. The example walks through the process of finding customers who have returned items worth 20% more than the average customer's returns for a store in a given state for a given year.

Problem

For a given retail chain, we need data on customers whose merchandise returns exceed the average customer's returns for a given store per year by more than 20%. In the retail industry, such datasets are typically large, running into several terabytes. The challenge in dealing with such large datasets is grouping the data.
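The core of the analysis can be expressed as a single aggregation-and-join query; the ELT Snaps in the Pipelines below generate equivalent SQL for you. The following is only an illustrative sketch in the shape of TPC-DS Query 1: the table and column names follow the TPC-DS schema, and the year (2000) and state ('TN') are arbitrary example values.

-- Illustrative sketch only: customers whose yearly returns at a store exceed
-- 1.2x (20% above) the average customer's returns for that store.
WITH customer_total_return AS (
  SELECT
    sr_customer_sk     AS ctr_customer_sk,
    sr_store_sk        AS ctr_store_sk,
    SUM(sr_return_amt) AS ctr_total_return
  FROM store_returns
  JOIN date_dim ON sr_returned_date_sk = d_date_sk
  WHERE d_year = 2000                                -- the given year
  GROUP BY sr_customer_sk, sr_store_sk
)
SELECT c.c_customer_id
FROM customer_total_return ctr1
JOIN store    s ON ctr1.ctr_store_sk    = s.s_store_sk
JOIN customer c ON ctr1.ctr_customer_sk = c.c_customer_sk
WHERE s.s_state = 'TN'                               -- the given state
  AND ctr1.ctr_total_return > (
        SELECT AVG(ctr_total_return) * 1.2           -- 20% above the average
        FROM customer_total_return ctr2
        WHERE ctr1.ctr_store_sk = ctr2.ctr_store_sk
      )
ORDER BY c.c_customer_id;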

Solution

This use case is based on the TPC Benchmark DS (TPC-DS) specification, a widely used decision support industry benchmark that evaluates the performance of big data processing engines. TPC-DS models the Data Science functions of a retailer. The model captures the longer-running complex queries of Data Science and provides representative sample datasets: customers, store, store_returns, and date_dim.

There is one fact dataset, STORE_RETURNS, and three dimension datasets: DATE_DIM, STORES, and CUSTOMERS.
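For reference, a simplified sketch of the relevant schema is shown below. Only the columns the analysis needs are listed; the full TPC-DS table definitions contain many more columns, and the names here are assumed from the TPC-DS specification.

-- Simplified, assumed schema sketch (trimmed to the columns used in this use case).
CREATE TABLE store_returns (            -- fact table
  sr_returned_date_sk INTEGER,          -- FK to date_dim
  sr_customer_sk      INTEGER,          -- FK to customer
  sr_store_sk         INTEGER,          -- FK to store
  sr_return_amt       DECIMAL(7,2)
);

CREATE TABLE date_dim (
  d_date_sk INTEGER,
  d_year    INTEGER
);

CREATE TABLE store (
  s_store_sk INTEGER,
  s_state    CHAR(2)
);

CREATE TABLE customer (
  c_customer_sk INTEGER,
  c_customer_id CHAR(16)
);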

ELT Solution Pipeline

To solve this big data problem, we have designed Pipelines that run on Snowflake and on the Microsoft Azure Databricks Lakehouse Platform.

Example Pipelines - Snowflake


Pipeline 1

This Pipeline does not require source tables, as they are created on Snowflake on the fly from SQL. The output target tables are also created on Snowflake. The Pipeline writes from the Snowflake database to the Snowflake database. Users do not need Snowflake accounts; however, they require SQL experience.
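One way to read "source tables created on the fly from SQL" is a set of CREATE TABLE ... AS SELECT statements, for example against Snowflake's bundled TPC-DS sample data. The sketch below is an assumption for illustration only; the SNOWFLAKE_SAMPLE_DATA database, the TPCDS_SF10TCL schema, and the 1% sample are not part of the Pipeline's definition.

-- Assumed sketch: materialize working copies of the TPC-DS tables from
-- Snowflake's shared sample data (availability may vary by account).
CREATE OR REPLACE TABLE date_dim AS
  SELECT * FROM SNOWFLAKE_SAMPLE_DATA.TPCDS_SF10TCL.DATE_DIM;
CREATE OR REPLACE TABLE store AS
  SELECT * FROM SNOWFLAKE_SAMPLE_DATA.TPCDS_SF10TCL.STORE;
CREATE OR REPLACE TABLE customer AS
  SELECT * FROM SNOWFLAKE_SAMPLE_DATA.TPCDS_SF10TCL.CUSTOMER;
CREATE OR REPLACE TABLE store_returns AS
  SELECT * FROM SNOWFLAKE_SAMPLE_DATA.TPCDS_SF10TCL.STORE_RETURNS SAMPLE (1);  -- 1% sample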

You can download the Pipeline from here

Pipeline 2

The Pipeline writes from the Snowflake database to the Snowflake database. However, it requires the source tables to be present in the Snowflake database before the Pipeline can run.

An output target table is created on Snowflake. Users do not need AWS/Snowflake accounts or SQL experience.
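For illustration, the statement pushed down to write the output target table from the pre-existing source tables might look roughly like the sketch below; the target table name customer_returns_summary is hypothetical, and the SQL the Pipeline actually generates may differ.

-- Assumed sketch: persist a per-customer, per-store, per-year returns summary
-- as the output target table ("customer_returns_summary" is a made-up name).
CREATE OR REPLACE TABLE customer_returns_summary AS
SELECT
  sr_customer_sk,
  sr_store_sk,
  d_year,
  SUM(sr_return_amt) AS total_return_amt
FROM store_returns
JOIN date_dim ON sr_returned_date_sk = d_date_sk
GROUP BY sr_customer_sk, sr_store_sk, d_year;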

You can download the Pipeline from here

Pipeline 3

This Pipeline does not require source tables, as they are created on Snowflake on the fly using the ELT Load Snap, and the output target tables are also created on Snowflake. The Pipeline writes from the Snowflake database to the Snowflake database. It converts data from CSV files to database tables and can be used for a wide variety of complex tasks. It requires table schema setup and an AWS/Azure account. Users do not require SQL experience for this Pipeline.
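Conceptually, what the ELT Load Snap does here corresponds to staging the CSV files in AWS/Azure object storage and issuing a COPY INTO against a pre-created table, which is why the table schema setup and the cloud account are needed. The sketch below is an assumption; the stage name, file path, and file format options are hypothetical.

-- Assumed sketch of the CSV-to-table load performed via the ELT Load Snap.
CREATE OR REPLACE TABLE store_returns (     -- table schema must be set up first
  sr_returned_date_sk INTEGER,
  sr_customer_sk      INTEGER,
  sr_store_sk         INTEGER,
  sr_return_amt       DECIMAL(7,2)
);

COPY INTO store_returns
FROM @my_csv_stage/tpcds/store_returns/     -- hypothetical external stage and path
FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1);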



You can download the Pipeline from here

Example Pipelines - Microsoft Azure Databricks Lakehouse Platform

Pipeline 4

This Pipeline does not require source tables or raw data, as they are created on the Microsoft Azure Databricks Lakehouse Platform on the fly from SQL. Output target tables are also created on the Microsoft Azure Databricks Lakehouse Platform. The Pipeline writes from the Microsoft Azure Databricks Lakehouse Platform to the Microsoft Azure Databricks Lakehouse Platform. However, users require SQL experience. It can be used only for ELT demos or simple tasks.
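On Databricks, "created on the fly from SQL" can be sketched as small Delta tables populated directly from literal values; the rows below are made-up demo data and are not part of the Pipeline.

-- Assumed sketch (Databricks SQL): seed tiny Delta tables from literal values,
-- suitable only for demos or simple tasks.
CREATE OR REPLACE TABLE date_dim (d_date_sk INT, d_year INT) USING DELTA;
INSERT INTO date_dim VALUES (2451545, 2000), (2451911, 2001);

CREATE OR REPLACE TABLE store (s_store_sk INT, s_state STRING) USING DELTA;
INSERT INTO store VALUES (1, 'TN'), (2, 'CA');

CREATE OR REPLACE TABLE store_returns (
  sr_returned_date_sk INT, sr_customer_sk INT, sr_store_sk INT, sr_return_amt DECIMAL(7,2)
) USING DELTA;
INSERT INTO store_returns VALUES (2451545, 101, 1, 49.99), (2451545, 102, 1, 15.00);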

You can download the Pipeline from here

Pipeline 5

The Pipeline writes from the Microsoft Azure Databricks Lakehouse Platform to the Microsoft Azure Databricks Lakehouse Platform. However, it requires the source tables to be present on the Microsoft Azure Databricks Lakehouse Platform before execution. An output target table is created on the Microsoft Azure Databricks Lakehouse Platform. Users do not require SQL experience. The Pipeline can be used for a wide variety of complex tasks.

You can download the Pipeline from here