The input is a CSV document generated by the CSV Generator Snap. The document contains one categorical field ($class) and four numeric fields ($sepal_length, $sepal_width, $petal_length, and $petal_width), and some values are missing. Because the output of the Profile Snap depends on the data types in the input, a Type Converter Snap is added downstream of the CSV Generator Snap to ensure that all data types are correct. The Type Converter Snap is configured to convert data types automatically based on each field's values. Below is a preview of the output from the Type Converter Snap, which serves as the input for the Profile Snap:
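To illustrate what value-based type conversion means, here is a minimal Python sketch. It is not the Type Converter Snap's implementation; the `auto_convert` helper and the sample rows are hypothetical, and the rule that empty cells become `None` is an assumption made for this example.

```python
import csv
import io


def auto_convert(value):
    """Convert a CSV string to int, float, or None; otherwise keep the string.

    Hypothetical helper: empty cells are treated as missing values (assumption).
    """
    if value is None or value.strip() == "":
        return None
    text = value.strip()
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            continue
    return text


# Hypothetical sample rows using the field names from this example.
SAMPLE_CSV = """sepal_length,sepal_width,petal_length,petal_width,class
5.1,3.5,1.4,0.2,Iris-setosa
4.9,,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,
"""

rows = [
    {field: auto_convert(value) for field, value in record.items()}
    for record in csv.DictReader(io.StringIO(SAMPLE_CSV))
]
```

After conversion, numeric cells hold Python numbers, the categorical field stays a string, and missing cells are `None`, so a later profiling step can tell numeric fields from categorical ones.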
The Profile Snap is configured as shown below:
The Profile Snap computes statistics of the input documents based on its configuration, as shown in the output preview below:
The statistics that the Snap computes depend on the type of each field.
Because $class is a categorical field, its statistics include:
Because $sepal_length, $petal_length, $petal_width, and $sepal_width are numeric fields, their statistics include:
See the Input and Output section above for a description of these fields.
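The idea of type-dependent statistics can be sketched in plain Python as follows. This is an illustration only: the `profile_field` function is hypothetical, and the statistics it returns (count, missing, min, max, mean, distinct, most common) are common choices picked for the example, not the Snap's actual output fields.

```python
from collections import Counter
import statistics


def profile_field(values):
    """Summarize one field; the statistic names are illustrative assumptions."""
    present = [v for v in values if v is not None]
    summary = {"count": len(present), "missing": len(values) - len(present)}
    is_numeric = bool(present) and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    )
    if is_numeric:
        # Numeric fields get range and central-tendency statistics.
        summary.update(
            type="numeric",
            min=min(present),
            max=max(present),
            mean=statistics.fmean(present),
        )
    else:
        # Categorical fields get frequency-based statistics.
        counts = Counter(present)
        summary.update(
            type="categorical",
            distinct=len(counts),
            most_common=counts.most_common(1)[0][0] if counts else None,
        )
    return summary


numeric_summary = profile_field([1.0, 2.0, None, 3.0])
categorical_summary = profile_field(["a", "b", "a", None])
```

Missing values are counted separately and excluded from the other statistics, which is one reasonable convention; the Snap's own handling may differ.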
The second output preview of the Profile Snap displays the report. The report is passed to a Document to Binary Snap and then to a File Writer Snap where you can save and download the report in HTML format. The preview of the report in HTML from the File Writer Snap is as follows:
Because the Profile Snap does not work in Ultra Pipelines, you can write the output from the Profile Snap to a file using the JSON Formatter and File Writer Snaps. In the Ultra Pipeline, use a File Reader Snap to read this profile. The File Writer Snap in this example is configured as shown below:
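The write-once, read-later pattern described here can be sketched outside SnapLogic in plain Python. This is only an analogy to the JSON Formatter, File Writer, and File Reader Snaps; the `save_profile` and `load_profile` helpers, the file name, and the profile contents are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


def save_profile(profile, path):
    """Serialize a profile dictionary to a JSON file (hypothetical helper)."""
    Path(path).write_text(json.dumps(profile, indent=2), encoding="utf-8")


def load_profile(path):
    """Read a previously saved profile back into a dictionary."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


# Hypothetical profile content; the real Snap output has its own structure.
profile = {"class": {"type": "categorical", "distinct": 3}}

profile_path = Path(tempfile.gettempdir()) / "profile_example.json"
save_profile(profile, profile_path)  # done once, outside the Ultra Pipeline
restored = load_profile(profile_path)  # done later, by the consumer
```

Separating the two steps means the consumer only needs read access to the stored file and never has to recompute the profile.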
Download this pipeline