Overview

The Type Inspector Snap analyzes the data types in a dataset. This Snap is useful when your data contains multiple data types and you want to identify the type of every input value. With the Snap's Aggregate option, you can also get a count of each data type present in the input.
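As a rough illustration of the idea, the following Python sketch replaces every value in a document with the name of its type. It is not the Snap's implementation: the `inspect_types` helper and the sample document are hypothetical, and the sketch reports Python type names, whereas the Snap reports Java class names such as BigInteger.

```python
def inspect_types(document):
    """Return a copy of the document with each value replaced by its type name."""
    return {field: type(value).__name__ for field, value in document.items()}


# Hypothetical sample document with mixed data types.
sample = {"id": 42, "name": "Alice", "score": 9.5, "active": True, "notes": None}

print(inspect_types(sample))
```

The field names are kept and only the values change, which mirrors how the Snap's default output is described in the example later on this page.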

Input and Output

  • Expected input: A document stream.
  • Expected output: A document that contains information about data types.
  • Expected upstream Snaps: Any Snap that has a document output stream. For example, Mapper, Categorical to Numerical, etc.
  • Expected downstream Snaps: Any Snap that has a document input stream. For example, Unique, JSON Formatter, etc.

Prerequisites

None

Configuring Accounts

Accounts are not used with this Snap.

Configuring Views

Input

This Snap has exactly one document input view.

Output

This Snap has exactly one document output view.

Error

This Snap has at most one document error view.

Troubleshooting

None

Limitations and Known Issues

None

Modes

  • Ultra Pipelines: Works in Ultra Pipelines only when the Aggregate property is not selected.


Snap Settings


Label

The name for the Snap. You can modify this to be more specific, especially if you have more than one of the same Snap in your pipeline.

Full class name

Select to output the full (fully qualified) class name of each data type instead of the short class name.

Default value: Not selected.

For example, java.math.BigInteger is the full class name of BigInteger.
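The difference between a short and a full class name can be sketched in Python as an analogy. The `type_name` helper and its `full_class_name` flag are hypothetical names chosen for this illustration. The sketch uses Python's `fractions.Fraction` in place of a Java class such as java.math.BigInteger, so the names it prints are Python names, not the ones the Snap emits.

```python
from fractions import Fraction


def type_name(value, full_class_name=False):
    """Return the short type name, or the module-qualified name if requested."""
    cls = type(value)
    if full_class_name:
        # Module-qualified name, analogous to a fully qualified Java class name.
        return f"{cls.__module__}.{cls.__qualname__}"
    return cls.__name__


value = Fraction(1, 3)
print(type_name(value))                        # short name
print(type_name(value, full_class_name=True))  # module-qualified name
```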

Aggregate

Select to display the aggregated count of each data type for each field.

Default value: Not selected
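A minimal sketch of per-field aggregation, assuming the count is kept separately for every field and type name. The `aggregate_types` helper, the sample documents, and the nested-dictionary output layout are all hypothetical. The Snap's actual output structure may differ, and it reports Java class names rather than Python type names.

```python
from collections import Counter, defaultdict


def aggregate_types(documents):
    """Count how often each type name occurs in each field across all documents."""
    counts = defaultdict(Counter)
    for document in documents:
        for field, value in document.items():
            counts[field][type(value).__name__] += 1
    # Convert to plain dictionaries for easier printing and comparison.
    return {field: dict(counter) for field, counter in counts.items()}


# Hypothetical input in which the same field holds values of different types.
documents = [
    {"id": 1, "price": 9.99},
    {"id": 2, "price": "n/a"},
    {"id": "3", "price": 12.5},
]

print(aggregate_types(documents))
```

In this sketch the counts are complete only after every input document has been read, so the result is a single summary rather than one output per input document.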




Example


Analyzing Data Types

This pipeline demonstrates how to analyze the different data types in the input using the Type Inspector Snap. It also shows how to get the full class names along with the count of each data type.

An input dataset containing different data types is passed to two Type Inspector Snaps. The first Type Inspector Snap uses the default configuration and reports the data types that are present in the dataset. The second Type Inspector Snap lists the data types by their full class names, along with the aggregated count of each data type.

Download this pipeline.

The input is a JSON document generated by the JSON Generator Snap and contains fields of various data types. The output data preview is as follows:

The dataset from the JSON Generator Snap is sent to two Type Inspector Snaps. The first Type Inspector Snap is configured as follows:

The second Type Inspector Snap is configured as follows:

The first Type Inspector Snap replaces the values with their types and the output preview is as follows:

For the second Type Inspector Snap, based on its configuration, the output contains the full class names of the data types in aggregated form, showing the number of values of each type:
