The Data Preparation Pipeline is designed as shown below:

This pipeline contains the following key Snaps:

	Snap Label	Snap Name	Description
1	ZipFile Read	ZipFile Read	Reads the Twitter dataset containing 1,600,000 tweets extracted using the Twitter API.
2	Select Fields	Mapper	Identifies all sentences in the dataset that have been tagged 'negative' or 'positive'.
3	Tokenizer	Tokenizer	Breaks each sentence into an array of words, of which two copies are made.
4	Common Words	Common Words	Computes the frequency of the 200 most common words in one copy of the array of words.
5	Write Common Words	File Writer	Writes the output of the Common Words Snap into a file in SLFS.
6	Bag of Words	Bag of Words	Converts the second copy of the array of words into a vector of word frequencies, whose rows are then shuffled to ensure that the unsorted dataset can be used in a variety of model-creation algorithms.
7	Write Processed Dataset	File Writer	Writes the processed dataset received from the Bag of Words Snap to SLFS.
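
The Tokenizer, Common Words, and Bag of Words steps listed above can be approximated outside SnapLogic with a few lines of Python. This is a minimal sketch, not the Snaps' actual implementation: the tokenization rule, the sample rows, and all function names are assumptions made for illustration.

```python
import random
import re
from collections import Counter

def tokenize(sentence):
    """Lowercase a sentence and split it into word tokens (assumed rule)."""
    return re.findall(r"[a-z0-9']+", sentence.lower())

def common_words(token_lists, limit=200):
    """Return the `limit` most frequent words across all token lists."""
    counts = Counter(word for tokens in token_lists for word in tokens)
    return [word for word, _ in counts.most_common(limit)]

def bag_of_words(tokens, vocabulary):
    """Count how often each vocabulary word occurs in one token list."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]

# Hypothetical sample rows: (sentiment label, sentence).
rows = [
    ("positive", "I love this phone"),
    ("negative", "I hate this weather"),
    ("positive", "love love love it"),
]

token_lists = [tokenize(text) for _, text in rows]
vocabulary = common_words(token_lists, limit=200)

# One word-frequency vector per sentence, paired with its label.
dataset = [
    (label, bag_of_words(tokens, vocabulary))
    for (label, _), tokens in zip(rows, token_lists)
]

# Shuffle the rows so the dataset is not ordered by label.
random.Random(0).shuffle(dataset)
```

In this sketch the vocabulary plays the role of the file written by the Write Common Words Snap, and `dataset` plays the role of the processed dataset.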

Key Data Preparation Snaps

ZipFile Read

This Snap reads the Twitter dataset, saved as a ZIP file.

The Pipeline then parses the data retrieved from the file as a CSV using a CSV Parser Snap:

Note
The property labeled field001 captures the sentiment that the people who originally tagged the dataset assigned to each sentence. Here, the value 0 implies a negative polarity, and 1 implies a positive polarity.
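
The mapping from field001 values to polarity labels can be sketched as follows. This is only an illustration under stated assumptions: the sample rows are invented, the file is assumed to have no header row, and the name field002 for the sentence column is hypothetical, since only field001 is named above.

```python
import csv
import io

# Hypothetical headerless sample; field002 is an assumed column name.
SAMPLE_CSV = "0,I hate this weather\n1,I love this phone\n"

# Polarity values as described in the note above.
POLARITY = {"0": "negative", "1": "positive"}

def read_labeled_rows(csv_text):
    """Parse CSV text and keep only rows tagged negative or positive."""
    reader = csv.DictReader(
        io.StringIO(csv_text), fieldnames=["field001", "field002"]
    )
    rows = []
    for record in reader:
        label = POLARITY.get(record["field001"])
        if label is None:
            continue  # skip rows carrying any other tag
        rows.append({"sentiment": label, "text": record["field002"]})
    return rows

labeled = read_labeled_rows(SAMPLE_CSV)
```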

...

The Build Model Pipeline is designed as shown below:

This Pipeline contains the following key Snaps:

	Snap Label	Snap Name	Description
1	Read Processed Dataset	File Reader	Reads the processed Twitter dataset created by the Data Preparation Pipeline.
2	AutoML	AutoML	Runs specified algorithms on the processed dataset and trains the model that offers the most reliable and accurate results.
3	Write Model	File Writer	Writes the model identified and trained by the AutoML Snap to a file in the SLFS.
4	Write Leaderboard	File Writer	Writes the leaderboard, a table that lists the top models built by the AutoML Snap in ranked order, along with metrics indicating the performance of each model.
5	Write Report	File Writer	Writes the report generated by the AutoML Snap to the SLFS. This report describes the performance of each of the top-five algorithms evaluated by the AutoML Snap.
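
The idea of evaluating several candidates and ranking them on a leaderboard can be illustrated with a small sketch. It does not reflect how the AutoML Snap works internally: the two candidate "models" are trivial hand-written rules, accuracy is an assumed metric, and the validation data is hypothetical.

```python
# Hypothetical validation set: (bag-of-words vector, label).
# Assumed vector layout: [count of "love", count of "hate"].
validation = [
    ([2, 0], "positive"),
    ([0, 1], "negative"),
    ([1, 0], "positive"),
    ([0, 2], "negative"),
]

def always_positive(vector):
    """Baseline candidate that ignores its input."""
    return "positive"

def compare_counts(vector):
    """Candidate that compares the two word counts."""
    return "positive" if vector[0] >= vector[1] else "negative"

def accuracy(model, samples):
    """Fraction of samples for which the model predicts the right label."""
    correct = sum(1 for vector, label in samples if model(vector) == label)
    return correct / len(samples)

candidates = {
    "always_positive": always_positive,
    "compare_counts": compare_counts,
}

# Leaderboard: candidates sorted by accuracy, best first.
leaderboard = sorted(
    ((name, accuracy(model, validation)) for name, model in candidates.items()),
    key=lambda entry: entry[1],
    reverse=True,
)
best_name, best_score = leaderboard[0]
```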

Key Build Model Snaps

Read Processed Dataset

...

Output	Description
Output0: Model	The model that the AutoML Snap determines offers the most accurate and reliable sentiment analysis.
Output1: Leaderboard	A document that contains the leaderboard: all the models built by this Snap, listed in ranked order along with metrics indicating the performance of each model.
Output2: Report	A document that contains an interactive report on up to the top 10 models.

The Twitter Sentiment Analysis Pipeline

This Pipeline takes the input sentence sent through the web UI and uses the model created by the Pipelines discussed above to predict the sentiment of the input sentence.

This Pipeline contains the following key Snaps:

	Snap Label	Snap Name	Description
1	Sample Request	JSON Generator	Provides a sample request for the purpose of this use case.
2	Extract Params	Mapper	Isolates the input text from the rest of the properties associated with the input sentence.
3	Tokenizer	Tokenizer	Breaks the input text into an array of tokens.
4	Read Common Words	File Reader	Reads the array of common words that you saved in the Data Preparation Pipeline.
5	Bag of Words	Bag of Words	Creates a vector made out of the words that are present in both the input sentence and the list of common words.
6	Read Model	File Reader	Reads the model that you saved from the Build Model Pipeline.
7	Predictor	Predictor	Determines the polarity of the input sentence using the Bag of Words input vector and the model. It also outputs the confidence levels in its predictions.
8	Prepare Response	Mapper	Prepares a response that will be picked up by the Ultra Task and sent back to the web application UI.
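
The flow from input sentence to response can be sketched end to end as follows. This is a simplified illustration, not the Predictor Snap's actual behavior: the request property name `text`, the three-word vocabulary, and the linear model with a logistic confidence score are all assumptions standing in for the saved common-words file and the saved model.

```python
import math
import re
from collections import Counter

# Hypothetical stand-ins for the two files read by the Pipeline.
common_words = ["love", "hate", "this"]
model_weights = {"love": 1.5, "hate": -1.5, "this": 0.0}  # assumed linear model

def tokenize(text):
    """Lowercase the text and split it into word tokens (assumed rule)."""
    return re.findall(r"[a-z0-9']+", text.lower())

def bag_of_words(tokens, vocabulary):
    """Count only the words that appear in the common-words vocabulary."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]

def predict(vector, vocabulary, weights):
    """Return (polarity, confidence) from a weighted sum of word counts."""
    score = sum(count * weights[word] for count, word in zip(vector, vocabulary))
    p_positive = 1.0 / (1.0 + math.exp(-score))
    if p_positive >= 0.5:
        return "positive", p_positive
    return "negative", 1.0 - p_positive

request = {"text": "I love this phone"}  # hypothetical request shape
vector = bag_of_words(tokenize(request["text"]), common_words)
prediction, confidence = predict(vector, common_words, model_weights)
response = {
    "text": request["text"],
    "prediction": prediction,
    "confidence": confidence,
}
```

Words in the input that are not in the common-words list, such as "phone" here, contribute nothing to the vector, which matches the Bag of Words description above.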

Key Sentiment Analysis Snaps

Sample Request

This example uses a JSON Generator Snap to provide a sample request. When you create and run the web application, however, the Pipeline receives the input sentence through the open input view of the Filter Snap.

Note
The $token property indicates that the data coming into the Pipeline is from the web application. That is why the Pipeline includes the Filter Snap, which checks for the string "snaplogic_ml_showcase" and filters out all inputs that do not contain this string.
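
The token check described in the note can be sketched as a simple predicate. The request shape and the exact comparison (equality on a `token` property) are assumptions made for illustration; the Filter Snap's actual expression is not shown here.

```python
EXPECTED_TOKEN = "snaplogic_ml_showcase"

def accept(request):
    """Keep only requests whose token matches the expected string."""
    return request.get("token") == EXPECTED_TOKEN

# Hypothetical incoming requests.
requests = [
    {"token": "snaplogic_ml_showcase", "text": "I love this phone"},
    {"token": "something_else", "text": "I hate this weather"},
    {"text": "no token at all"},
]

accepted = [request for request in requests if accept(request)]
```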

...
