Handwritten Digit Recognition using Convolutional Neural Networks


Problem Scenario

Digital documents have replaced paper for many reasons, yet we still see a lot of paper in daily life. Machines cannot understand what has been written on those physical pages, and converting handwritten characters to digital text has long been a difficult problem. Unless we can convert physical documents to digital ones, we cannot process them efficiently with computers.

Description

Researchers in the Machine Learning field have been trying to solve this problem for many years. Many state-of-the-art Machine Learning algorithms can accurately recognise handwritten characters. In the past few years, the Convolutional Neural Network (CNN) has been widely used and has shown successful results in various computer vision tasks. It also shows promising results in handwritten character recognition.

In this use case, we train a CNN model on the MNIST dataset, which consists of 70,000 images of handwritten digits. Each image is 28 pixels by 28 pixels and contains a single handwritten digit. We train the model on 60,000 images and hold out 10,000 images for testing.

The live demo is available at our Machine Learning Showcase.

Objectives

  1. Use the Remote Python Script Snap from the ML Core Snap Pack to deploy a Python script that trains a CNN model on the MNIST dataset.
  2. Test the model with a sample.
  3. Use the Remote Python Script Snap from the ML Core Snap Pack to deploy a Python script that hosts the model, and schedule an Ultra Task to provide an API.
  4. Test the API.
  5. Develop a demo with HTML and JavaScript.

Pipelines

We are going to build 3 pipelines: Model Building, Model Testing, and Model Hosting.

Model Building

The Remote Python Script Snap downloads the MNIST dataset (we use the Keras library to get the dataset), trains the CNN model, and evaluates it. Then, we format the model into JSON with the JSON Formatter Snap and save it on the SnapLogic File System (SLFS) using the File Writer Snap.

Python Script

Below is a piece of the Python script from the Remote Python Script Snap used in this pipeline. The code is adapted from the official Keras example.

There are 3 main functions: snaplogic_init, snaplogic_process, and snaplogic_final. The first function, snaplogic_init, is executed before any documents are consumed from the upstream snap. The second function, snaplogic_process, is called once for each incoming document. The last function, snaplogic_final, runs after all incoming documents have been consumed by snaplogic_process. In this case, the Remote Python Script Snap does not have an input view, so snaplogic_process is never executed.
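As a rough illustration, the skeleton below shows how the three functions fit together. The signatures and return values here are assumptions based on the description above, not the exact code from the pipeline.

def snaplogic_init():
    # Runs once, before any documents arrive from the upstream snap.
    # In this use case, we create a new session here.
    session = {}
    return session

def snaplogic_process(session, document):
    # Runs once for each incoming document. This snap has no input view,
    # so this function is never called in this pipeline.
    return None

def snaplogic_final(session):
    # Runs after all incoming documents have been consumed. The dataset
    # download, model training, and evaluation described below happen here.
    return {"status": "model trained"}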

In snaplogic_init, we create a new session. Then, we download the dataset and build the CNN model in snaplogic_final. The dataset can be obtained directly from Keras. The raw data shape is (N, 28, 28); we need to reshape it to (N, 28, 28, X) in order to use the Conv2D layer. Since the images in this dataset contain one color channel (grayscale), X is 1. We also scale the color intensity to the range [0, 1]. Lastly, we apply one-hot encoding to the targets (y_train, y_test).
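The data preparation can be sketched as follows. This follows the official Keras example; the import paths assume a standalone TensorFlow/Keras installation, which may differ from the environment bundled with the Snap.

from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical

# Load MNIST through Keras: 60,000 training and 10,000 test images.
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Reshape (N, 28, 28) -> (N, 28, 28, 1) for Conv2D, and scale the
# grayscale intensities to the range [0, 1].
x_train = x_train.reshape(-1, 28, 28, 1).astype("float32") / 255.0
x_test = x_test.reshape(-1, 28, 28, 1).astype("float32") / 255.0

# One-hot encode the digit labels (0-9) into 10-dimensional vectors.
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)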

Our CNN model starts with two Conv2D layers with (3, 3) kernels; the first has 32 filters and the second has 64. These are followed by MaxPooling2D, Dropout, Flatten, Dense, one more Dropout, and a final Dense layer. We train the model for 12 epochs with a batch size of 128. The model achieves 99.06% accuracy on the 10,000 test samples. We use SnapLogicUtil.model_to_text to serialize the model.
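The sketch below matches this description. Hyperparameters not stated above (the dropout rates, the width of the first Dense layer, and the optimizer) are taken from the official Keras example and may differ from the actual pipeline script.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense

# Two Conv2D layers (32 and 64 filters, 3x3 kernels), followed by
# pooling, dropout, and two Dense layers ending in a 10-way softmax.
model = Sequential([
    Conv2D(32, (3, 3), activation="relu", input_shape=(28, 28, 1)),
    Conv2D(64, (3, 3), activation="relu"),
    MaxPooling2D(pool_size=(2, 2)),
    Dropout(0.25),
    Flatten(),
    Dense(128, activation="relu"),
    Dropout(0.5),
    Dense(10, activation="softmax"),
])

model.compile(loss="categorical_crossentropy", optimizer="adadelta",
              metrics=["accuracy"])

# Train for 12 epochs with a batch size of 128, then evaluate on the
# 10,000 test samples. The pipeline then serializes the trained model
# with SnapLogicUtil.model_to_text.
model.fit(x_train, y_train, batch_size=128, epochs=12,
          validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=0)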


Model Testing

In this pipeline, the File Reader Snap reads the CNN model from SLFS. The JSON Generator Snap contains one handwritten image; the correct label is "1".
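For illustration, the test document might look like the sketch below. The field name and pixel layout here are hypothetical; the actual JSON Generator Snap contents may differ.

# Hypothetical test document: a 28x28 grid of grayscale intensities in
# [0, 1]. The field name "img" is an assumption for illustration only.
test_document = {
    "img": [[0.0] * 28 for _ in range(28)],
}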

The screenshot below shows the handwritten image in the JSON Generator Snap.

The prediction from the Remote Python Script Snap is shown below. It can be seen that the digit in the input image has been correctly identified.

Model Hosting

This pipeline is scheduled as an Ultra Task to provide a REST API that external applications can access. The core snaps are File Reader, JSON Parser, and Remote Python Script, the same as in the Model Testing pipeline. The rest handle authentication, parameter extraction, and Cross-Origin Resource Sharing (CORS).

Scheduling Ultra Task

To build an API from this pipeline, create a Task. You can use either a Triggered Task or an Ultra Task. A Triggered Task is suitable for batch processing, since it starts a new pipeline instance for each request. An Ultra Task is better for providing a low-latency REST API to external applications. In this case, we use an Ultra Task. You do not need to specify the bearer token here, because we use the Router Snap to perform authentication inside the pipeline. To see the task details, go to the Manager by clicking Show tasks in this project in Manager, as shown in the screenshot below (right).

Testing

After scheduling the Ultra Task, you can test it. The screenshot below shows a sample request and response. Based on the image provided, the pipeline returns "1" as the first prediction.
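If you want to test the API outside the browser, a request can be sent with Python's requests library, as sketched below. The URL, token, and payload field names are placeholders; match them to your Ultra Task and to whatever the pipeline's parameter-extraction snaps expect.

import requests

# Placeholder endpoint and token; replace with your Ultra Task URL and
# the token checked by the Router Snap inside the pipeline.
URL = "https://example.snaplogic.io/api/1/rest/feed-master/.../mnist"
TOKEN = "YOUR_TOKEN"

# Hypothetical payload layout: the image as a 28x28 grid of intensities.
payload = {"token": TOKEN, "params": {"img": [[0.0] * 28 for _ in range(28)]}}

response = requests.post(URL, json=payload)
print(response.json())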

Demo

Once the API is ready, it is time to build an application that demonstrates the power of our handwritten digit recognition model. Below is the video demo. The live demo is available at our Machine Learning Showcase; feel free to try it and let us know your feedback.

HTML Code

In this demo, we have four main components: the canvas, the CLEAR button, the READ button, and the result label. You can use a mouse or touch screen to write a digit on the canvas, and clear the canvas with the CLEAR button. When you are ready, click the READ button to send the request to the API. Once it completes, the result is displayed.

Javascript

The JavaScript code is shown below. You will need to specify the URL of your API and the token in the guess function. The function requestSnapLogic(url, token, params, action) sends a request to your Ultra Task. The action parameter is a callback function that is executed once the result is ready.
