Profile

Overview

This is a Transform type Snap that computes statistics of the incoming data. Each field can be either numeric or categorical; use the Type Converter Snap to change the data type as needed. The resulting statistics are helpful for analyzing the data in datasets.

The Snap also has an optional second output view. When enabled, this view outputs an HTML file that is a graphical visualization of the first output. If you enable the Value distribution property, the value distribution of each class is also included in the output. 

Output without value distribution:

Input and Output

Expected input: The dataset as a document stream.

Expected output: One required output view and one optional output view:

  • First Output: Statistical details of the dataset. Computation is different based on the type of fields.
    • For categorical fields, the following is computed:
      • popular: The most popular value
      • total: The total number of documents in the dataset
      • unique values: The number of unique values
      • missing values: The number of whitespaces, null values, and missing values 
      • value distribution: The distribution of values. It is presented with value-frequency pairs. This is not shown if the Value distribution property is not selected.
    • For numerical fields, the following is computed:
      • mean: Average value
      • min: Minimum value
      • max: Maximum value
      • sd: Standard deviation
      • popular: The bin with the most data points (if binning is enabled) or the most popular value (if binning is disabled).
      • total: The total number of documents in the dataset
      • unique values: The number of unique values
      • missing values: The number of missing values 
      • value distribution: The distribution of bins (if number of bins is greater than 0) or values (if number of bins is 0). It is presented with bin/value-frequency pairs. This is not shown if the Value distribution property is not selected.
  • Second Output: Optional. HTML-version of the output from the first output view.
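
The statistics listed above can be illustrated with a minimal Python sketch. This is not the Snap's implementation: the helper names `is_missing` and `profile_field` are hypothetical, the rule for detecting missing values (None or whitespace-only strings) is an assumption based on the description above, and the standard deviation shown is the population form, which may differ from what the Snap reports.

```python
import math
from collections import Counter

def is_missing(value):
    """Treat None and empty or whitespace-only strings as missing (an assumption)."""
    return value is None or (isinstance(value, str) and value.strip() == "")

def profile_field(documents, field):
    """Compute Profile-style statistics for one field of a list of flat documents."""
    values = [doc.get(field) for doc in documents]
    present = [v for v in values if not is_missing(v)]
    counts = Counter(present)
    stats = {
        "total": len(values),
        "unique values": len(counts),
        "missing values": len(values) - len(present),
        "popular": counts.most_common(1)[0][0] if counts else None,
    }
    # Numeric fields additionally get mean, min, max, and standard deviation.
    numeric = present and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    )
    if numeric:
        mean = sum(present) / len(present)
        # Population variance (divide by N); the Snap's choice is not documented here.
        variance = sum((v - mean) ** 2 for v in present) / len(present)
        stats.update(mean=mean, min=min(present), max=max(present), sd=math.sqrt(variance))
    return stats

# Hypothetical sample documents with one categorical and one numeric field.
docs = [
    {"class": "a", "width": 1.0},
    {"class": "b", "width": 3.0},
    {"class": "a", "width": None},
    {"class": " ", "width": 2.0},
]
class_stats = profile_field(docs, "class")
width_stats = profile_field(docs, "width")
```

In this sketch the categorical field yields only popular, total, unique values, and missing values, while the numeric field also carries mean, min, max, and sd.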

Expected upstream Snaps: Any Snap that generates documents. For example, CSV Generator, JSON Generator, or a combination of File Reader and JSON Parser.

Expected downstream Snaps: Any Snap that accepts a document input. For example, Mapper.

Prerequisites

The input document must not have a nested structure.
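
As an illustration of what "no nested structure" means, the following Python sketch flattens a nested document into a single level of keys. The `flatten` helper and the underscore separator are hypothetical choices for this example; in a pipeline, the equivalent reshaping would be done in an upstream Snap before the data reaches the Profile Snap.

```python
def flatten(document, parent_key="", separator="_"):
    """Flatten nested dicts into a single-level dict with joined keys."""
    flat = {}
    for key, value in document.items():
        full_key = f"{parent_key}{separator}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested objects and merge their flattened keys.
            flat.update(flatten(value, full_key, separator))
        else:
            flat[full_key] = value
    return flat

# Hypothetical nested document.
nested = {"id": 7, "size": {"sepal": {"length": 5.1}, "petal": 1.4}}
flat = flatten(nested)
```

This sketch handles nested objects only; lists inside a document are left as-is and would need their own handling.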

Configuring Accounts

Accounts are not used with this Snap.

Configuring Views

Input: This Snap has exactly one document input view.
Output: This Snap has at most two document output views.
Error: This Snap has at most one document error view.

Troubleshooting

None

Limitations and Known Issues

None

Modes


Snap Settings


Label

Required. The name for the Snap. Modify this to be more specific, especially if there is more than one of the same Snap in the pipeline.
Value distribution

If selected, the Snap includes the value distribution of the fields in the output.

Default value: Selected

Top values limit

Required. This property limits the number of value-frequency pairs in the value distribution. It applies to categorical fields and, if binning is disabled, to numeric fields as well. For example, if the value in this property is 2, the Snap lists the two most popular values in the dataset along with the number of documents with those values.

Default value: 100


Configure this as 0 to include all values.
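
The effect of this limit can be sketched in Python as follows. The function name `value_distribution` and the sample values are hypothetical, and this is only an illustration of the described behavior, not the Snap's code.

```python
from collections import Counter

def value_distribution(values, top_values_limit=100):
    """Return value-frequency pairs, most frequent first, capped by the limit."""
    counts = Counter(values)
    # A limit of 0 means "include all values", as described above.
    limit = None if top_values_limit == 0 else top_values_limit
    return counts.most_common(limit)

# Hypothetical categorical values.
classes = ["setosa", "virginica", "setosa", "versicolor", "setosa", "virginica"]
top_two = value_distribution(classes, top_values_limit=2)
everything = value_distribution(classes, top_values_limit=0)
```

With a limit of 2, only the two most frequent values and their counts are kept; with a limit of 0, every distinct value appears.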

Number of bins

Required. This property is applicable only to the numerical fields in the dataset. It specifies the number of bins. Binning is a method of splitting the data space into N equally sized ranges, where N is the number of bins.

Default value: 10


Configure this as 0 to disable binning.
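
Equal-width binning can be sketched in Python as shown here. The function `bin_counts` is hypothetical, and it assumes the data space spans the minimum to the maximum observed value with the maximum falling into the last bin; the Snap's exact boundary handling is not documented on this page.

```python
def bin_counts(values, number_of_bins=10):
    """Split [min, max] into equally sized ranges and count the values in each."""
    if number_of_bins <= 0:
        raise ValueError("this sketch only covers the case where binning is enabled")
    low, high = min(values), max(values)
    width = (high - low) / number_of_bins
    counts = [0] * number_of_bins
    for value in values:
        if width == 0:
            index = 0  # all values are identical, so everything lands in the first bin
        else:
            # Clamp so that the maximum value falls into the last bin.
            index = min(int((value - low) / width), number_of_bins - 1)
        counts[index] += 1
    edges = [low + i * width for i in range(number_of_bins + 1)]
    return [((edges[i], edges[i + 1]), counts[i]) for i in range(number_of_bins)]

# Hypothetical numeric values.
lengths = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
bins = bin_counts(lengths, number_of_bins=4)
```

Each entry pairs a bin range with its frequency, which mirrors the bin-frequency pairs described for the value distribution of numeric fields.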

Maximum memory %

Required. The maximum portion of the node's memory, as a percentage, that is used to buffer the incoming dataset. If this percentage is exceeded, the dataset is written to a temporary local file. This configuration is useful in handling large datasets without over-utilizing the node memory. The minimum default memory to be utilized by the Snap is set at 100 MB.

Default value: 10
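
The buffering behavior described above resembles a spill-to-disk buffer. The Python sketch below is only an analogy: the function names are hypothetical, and treating the 100 MB figure as a lower bound on the buffer size is an assumption about how the two settings interact. It uses the standard library's `tempfile.SpooledTemporaryFile`, which keeps data in memory until a size limit is exceeded and then moves it to a temporary file.

```python
import tempfile

MINIMUM_BUFFER_BYTES = 100 * 1024 * 1024  # the 100 MB figure mentioned above

def buffer_limit_bytes(node_memory_bytes, maximum_memory_percent=10):
    """Compute the in-memory buffer limit (assumes 100 MB acts as a floor)."""
    share = node_memory_bytes * maximum_memory_percent // 100
    return max(share, MINIMUM_BUFFER_BYTES)

def open_buffer(node_memory_bytes, maximum_memory_percent=10):
    """Open a buffer that stays in memory up to the limit, then spills to disk."""
    limit = buffer_limit_bytes(node_memory_bytes, maximum_memory_percent)
    return tempfile.SpooledTemporaryFile(max_size=limit)

# Hypothetical node with 8 GiB of memory and the default setting of 10 percent.
limit_large_node = buffer_limit_bytes(8 * 1024**3, 10)
# Hypothetical node with 512 MiB of memory: 10 percent is below the assumed floor.
limit_small_node = buffer_limit_bytes(512 * 1024**2, 10)

with open_buffer(8 * 1024**3) as buffer:
    buffer.write(b'{"class": "setosa"}\n')
    buffer.seek(0)
    first_line = buffer.readline()
```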

Temporary Files

During execution, data processing on Snaplex nodes occurs principally in memory as streaming and is unencrypted. When a dataset exceeds the available compute memory, the Snap writes Pipeline data to local storage, unencrypted, to optimize performance. These temporary files are deleted when the Snap/Pipeline execution completes. You can configure the location of the temporary data in the Global properties table of the Snaplex's node properties, which can also help avoid Pipeline errors due to the unavailability of space. For more information, see Temporary Folder in Configuration Options.


Examples


Computing Statistics

This example demonstrates how the Profile Snap is used to compute data statistics.

Download this pipeline

Understanding the pipeline

The input is a CSV document generated by the CSV Generator Snap. This document contains categorical ($class) as well as numeric ($sepal_length, $sepal_width, $petal_length, and $petal_width) fields. Some values are missing. Since the output from the Profile Snap depends upon the data types in the input, a Type Converter Snap is added downstream of the CSV Generator Snap so that all data types are correct. The Type Converter Snap is configured to automatically convert data types based on the field's values. Below is a preview of the output from the Type Converter Snap, which serves as the input for the Profile Snap:

The Profile Snap is configured as shown below:

Statistics of the input documents are computed by the Profile Snap based on its configuration. This is shown in the output preview below:

The statistics computed depend on the type of each field.

Because $class is a categorical field, its statistics include:

  • $popular
  • $uniqueValues
  • $missingValues
  • $total
  • $valueDistribution

Because $sepal_length, $petal_length, $petal_width, and $sepal_width are numeric fields, their statistics include:

  • $mean
  • $min
  • $max
  • $stddev
  • $popular
  • $uniqueValues
  • $missingValues
  • $total
  • $valueDistribution

See the Input and Output section above for a description of these fields.

The second output preview of the Profile Snap displays the report. The report is passed to a Document to Binary Snap and then to a File Writer Snap where you can save and download the report in HTML format. The preview of the report in HTML from the File Writer Snap is as follows:


Since the Profile Snap does not work in Ultra Pipelines, you may write the output from the Profile Snap into a file using the JSON Formatter and File Writer Snaps. In the Ultra Pipeline, use a File Reader Snap to read this profile. The File Writer Snap in this example is configured as shown below:

Additional Example

The following use case demonstrates a real-world scenario for using this Snap:

Downloads

Important steps to successfully reuse Pipelines

  1. Download and import the Pipeline into SnapLogic.
  2. Configure Snap accounts as applicable.
  3. Provide Pipeline parameters as applicable.


Release 

Snap Pack Version

Date

Type

  Updates

May 2024

main26341

Stable

Updated and certified against the current SnapLogic Platform release.

February 2024

436patches25953

Latest

Upgraded Apache CXF from version 3.4.2 to 3.6.3 to prevent vulnerability issues.

February 2024

main25112

Stable

Updated and certified against the current SnapLogic Platform release.

November 2023

435patches24944

Latest

Fixed an issue with Workday REST Snap where the input schema of the Snap did not populate in the upstream Mapper Snap.

November 2023

435patches24309

Latest

Added the WorkdayQL Snap in the Workday Snap Pack, which connects with the Workday Query Language (WQL) endpoints.

November 2023

main23721

08 Nov 2023 

Stable

Updated and certified against the current SnapLogic Platform release.

August 2023

main22460

Stable

Updated and certified against the current SnapLogic Platform release.

May 2023

main21015

Stable

Upgraded with the latest SnapLogic platform release.

February 2023

432patches20313

Latest

Fixed an issue with the Workday REST Snap where it failed with a null pointer exception when the input document was null.

February 2023

main19844

Stable

  • The Workday REST Snap supports Pass through and Pagination fields.

    • Pass through: Enables the Snap to pass the input data to the output document.

    • Enable Pagination: Enables the Snap to return the response in multiple pages based on the limit and offset query parameters. The maximum limit value is 100. Deselect this checkbox to download only one page of records.

January 2023

431patches19450

Latest

Introduced the Workday REST Snap to connect to Workday REST APIs. This Snap supports the following new accounts:

November 2022

main18944

Stable

Upgraded with the latest SnapLogic platform release.

October 2022

430patches18358

Latest

The Workday Read, Workday Write, and Workday Cancel Snaps are now showing the Services in the suggestions list using the Public Web Service API, which the Snaps failed to use previously.

August 2022

430patches17587

Latest

The performance of the Workday Read Snap is improved to reduce the execution time.

August 2022

main17386

 

Stable

Upgraded with the latest SnapLogic platform release.

4.29

main15993

Stable

Added support for the latest version of Workday APIs (certified to be compatible with version 37.0).

4.28 Patch

428patches14290

Latest

Enhanced the Workday Cancel Snap with the retry mechanism fields.

  • Number of Retries: Specifies the number of attempts the Snap should make to perform the selected operation in case of connection failure or timeout.

  • Retry Interval (seconds): The time interval in seconds between retry attempts.

4.28

main14627

Stable

  • Enhanced the Workday Read Snap with the following:

    • Added the Pass-through on no lookup match checkbox, which allows the input document to pass through to the output view when no records match an input document.

    • Parameterization of the Page Number and Page Size fields using Pipeline parameters. You can define and use the parameters in these fields using the Expression Enabler (blue star) icon to pass values during runtime.

4.27

main12833

Stable

Enhanced the Workday Read Snap with the new field Pool Size that controls the maximum number of threads in the pool. This field is available only when you select the Multi-threaded checkbox.

4.26 Patch

426patches11525

 

Latest

Enhanced the Workday Read Snap with the new field Pool Size that controls the maximum number of threads in the pool. This field is available only when you select the Multi-threaded checkbox.

4.26

main11181

 

Stable

Upgraded with the latest SnapLogic platform release.

4.25

main9554

 

Stable

Upgraded with the latest SnapLogic platform release.

4.24

main8556

Stable

Upgraded with the latest SnapLogic platform release.

4.23

main7430

 

Stable

Upgraded with the latest SnapLogic platform release.

4.22

main6403

 

Stable

Removes support for the Workday WSDL Account. You must switch Snaps in Pipelines that use this account to either the Workday Account or the Workday Dynamic Account type.

4.21

snapsmrc542

 

Stable

Upgraded with the latest SnapLogic platform release.

4.20 Patch 

workday8817

 

Latest

Fixes the Workday Write Snap that ignores the Import Synchronized checkbox selection when importing object data. The Snap now:

  • Waits for the current import request to complete, whether successfully or with error, before initiating the next import request.

  • Provides accurate real-time status of the request to downstream Snaps.   

Existing Pipelines using this Snap might experience longer execution times because of the synchronous behavior. However, you no longer need to use an additional Snap to capture and pass the request status to downstream Snaps.

4.20 Patch 

workday8761

 

Latest

Fixes the Workday Read Snap that fails to validate when configured in the New Form UI.  

4.20

snapsmrc535

 

Stable

Upgraded with the latest SnapLogic platform release.

4.19

snaprsmrc528

 

Stable

The Workday Write Snap includes a Validate Only Load checkbox, which enables users to upload–and then manually validate–data before importing it into Workday.

4.18 Patch 

workday7837

 

Latest

Fixed an issue with the Workday Read Snap wherein the Snap is unable to log SOAP calls.

4.18

snapsmrc523

 

Stable

Upgraded with the latest SnapLogic platform release.

4.17

ALL7402

 

Latest

Pushed automatic rebuild of the latest version of each Snap Pack to SnapLogic UAT and Elastic servers.

4.17

snapsmrc515

 

Latest

Added the Snap Execution field to all Standard-mode Snaps. In some Snaps, this field replaces the existing Execute during preview checkbox.

4.16 Patch

 workday6973

 

Latest

Fixed an issue with paging variable toggle in Workday Read Snap to produce correct preview data.

4.16 Patch 

workday6862

 

Latest

Fixed an issue with the Workday Read Snap not logging SOAP calls.

4.16

snapsmrc508

 

Stable

Upgraded with the latest SnapLogic platform release.

4.15 Patch

workday6727

 

Latest

Fixed an issue with the Soap Execute Snap failing for Workday when the Trust All certificate is enabled while using SSL authentication.

4.15 Patch

workday6292

 

Latest

Fixed an issue with the Workday Read Snap giving inconsistent results. Reverting the multithreading update resolved the issue. 

4.15

snapsmrc500

 

Stable

Upgraded with the latest SnapLogic platform release.

4.14

snapsmrc490

 

Stable

Upgraded with the latest SnapLogic platform release.

4.13 Patch 

workday5247

 

Latest

Fixed the Workday Read Snap that took an extended time to read from the Workday application. 

4.13

snapsmrc486

 

Stable

Upgraded with the latest SnapLogic platform release.

4.12

snapsmrc480

 

Stable

Upgraded with the latest SnapLogic platform release.

4.11

snapsmrc465

 

Stable

Upgraded with the latest SnapLogic platform release.

4.10

snapsmrc414

 

Stable

Upgraded with the latest SnapLogic platform release.

4.9

snapsmrc405

 

Stable

  • Updated the Snap with Import Synchronized, Import Batch Size and Import Batch Node to enhance the performance for Import objects supporting bulk operations. 

  • Updated the Workday Read Snap with the Page Number and the Page Size properties.

4.8

snapsmrc398

 

Stable

  • Updated the Read Snap with the Simplified output property.

  • Info tab added to accounts.

4.7

snapsmrc382

 

Stable

Upgraded with the latest SnapLogic platform release.

4.6

snapsmrc362

 

Stable

  • Resolved an issue in Workday Read Snap that caused errors when the timeout field was set.

  • Resolved an issue in Snaps with Workday Dynamic account that caused an error when a pipeline parameter provided an empty value to the password field.

4.5.1

snapsmrc344

 

Stable

Resolved issues in Workday Snaps to ensure appropriate errors are routed to the error views.

4.4.1

 

Stable

Upgraded with the latest SnapLogic Platform release.

4.4

 

Stable

Resolved an exception in Workday Read.

4.3.2

 

Stable

  • Addressed the following issues:

    • Defect: Workday Read Input Using As Of Date/Time in Response_Filter

    • Defect: Workday Snap Does Not Show All Services
