SnapLogic Pipeline Configuration as a REST API

In this article

Overview

This use case demonstrates how you can use SnapLogic Data Science to configure SnapLogic Pipelines as REST APIs to receive requests from external applications. Currently, we support the following two types of REST APIs that are called tasks in SnapLogic:

The table below shows the key differences between the two types of tasks.


Triggered TaskUltra Task
ObjectiveTrigger the API using a URLReal-time API that has a very low response time
LatencyHighVery low
Resource ConsumptionStart a new Pipeline execution for each requestThe Pipeline is active all the time
Number of DocumentsAny number of input/output documents1 input document for 1 output document

This document explains how you can use the SnapLogic platform to run the Pipelines as APIs and is structured as follows:

  • Data Science API Pipeline: Describes the Pipeline you must build to run as an API.
  • Request format: Describes the request format to send API requests.
  • Configuring and managing Ultra Tasks: Describes how to schedule and manage Ultra Tasks.
  • Testing Ultra Tasks: Describes how to test the Ultra Tasks.

Data Science API Pipeline

To build Data Science-driven applications, Data Science API is key. Data Science API provides an interface that enables external applications to use the Data Science model. The model can be trained using the Trainer (Classification), Trainer (Regression), 721453390, or other Data Science libraries in Python using the Remote Python Script Snap. Once the model is ready, you can use a Pipeline with a Predictor Snap to host the model and provide REST API.

The image below shows an example Pipeline that hosts the Data Science model and provides an API in the Telco Customer Churn Prediction demo. As you can see in the screenshot, the Pipeline that hosts a model must have one open input view and one open output view to be compatible with an Ultra Task. This way, the Pipeline receives the input data from the request and sends the output data in the API’s response

The table below briefly describes what each Snap does in this Pipeline.

Snap LabelDescription
Reads the model from the SnapLogic File System (SLFS).
JSON ParserParses the model, which is saved in JSON format.
FilterAuthenticates the request by comparing $content.token to _token, which is the token set as Pipeline parameter. The "_" refers to a Pipeline parameter, while "$" refers to an input document.
Extract ParamsThis is a Mapper Snap. It extracts $content.params.gender and other parameters from the request.
Predictor (Classification)Loads the model from the JSON Parser Snap and uses that model to offer predictions based on the input received from the Extract Params Snap.
Prepare ResponseThis is a Mapper Snap, which inserts the prediction from the Predictor Snap into $content, which forms the response body. It also adds headers to support Cross-Origin Resource Sharing (CORS).

Request Format

The recommended method for sending Data Science API requests is the POST method. You can send an input document in JSON format as part of the request body. The code block below contains an example of the request body used for the Telco Customer Churn Prediction API. The request body is accessed as $content. In this case, $content.token is "snaplogic_ml_showcase", while $content.params.gender is "Female".

Configuring and Managing Ultra Tasks

This section talks about configuring and managing Ultra Tasks.

To configure an ultra Task:

  1. In the Designer toolbar, click Create Task.

  2. Select Ultra in Run policy and remove the Bearer Token to remove authentication.

    We recommend that you use a Snap to perform authentication inside the Pipeline. To authenticate within the pipeline, send the static token snaplogic_ml_showcase using the JSON Generator Snap through the REST Post Snap to send a request to the Ultra Task.
  3. Use a Filter Snap in Data Science API Pipeline to compare your authentication details against a static token, as shown in the below screenshot. This step verifies the static token passed from the ultra task.
  4. Optional. Specify Instances. This is the number of API instances to be run in parallel across nodes in the Snaplex.

    We recommend that you use at least two API instances to provide fault tolerance. You can find more information about Ultra Tasks here.

To manage Ultra Tasks:

  1. Click Create in the Create Task dialog and click Show tasks in this project in Manager to go to the Manager page.
  2. Click the task name to edit the task settings.



  3. Click  next to the task, and then click Details. The task detail popup appears, complete with the HTTP Endpoint URL.



  4. Click Details to see the task statistics. You can see that there are 7 API instances that have started so far.


However, only 2 API instances are running as shown in the following image from the Dashboard page. The API instance stops, and a new one is created if the Pipeline fails.

Testing Ultra Tasks

There are many tools that you can use to test your Ultra Tasks (APIs). However, if you would like to have a regression/automated test, we recommend that you build a Pipeline to do this, so you can schedule it to run on a daily/weekly basis.

Pipeline

Use the JSON Generator Snap to generate a sample request and use the REST Post Snap to send a request to the Ultra Task. You can download this Pipeline here.

Ultra Task API Tester

The Ultra Task API Tester is available on our SnapLogic Machine Learning Showcase.