Aggregate

Overview

Use this Snap to apply aggregate functions to input data, optionally grouping the data by one or more fields (GROUP-BY). An aggregate function performs a calculation on a set of values and returns a single scalar value.

The following are the commonly used SQL Aggregate functions:

  • AVG – calculates the average of a set of values.

  • COUNT – counts rows in a specified table or view.

  • MIN – gets the minimum value in a set of values.

  • MAX – gets the maximum value in a set of values.

  • SUM – calculates the sum of values.

  • CONCAT – concatenates all values, separated by a pipe (|).

  • UNIQUE_CONCAT – concatenates all unique values, separated by a pipe (|).
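
As a rough illustration of what each function returns, the following Python sketch applies the seven functions to one small list of values. The sample values and variable names are made up for this example. The code only mimics the documented behavior and is not how the Snap is implemented; in particular, the order in which unique values are concatenated is an assumption.

```python
# Hypothetical sample values for one field across five input documents.
values = [10, 20, 20, 30, 40]

results = {
    "AVG": sum(values) / len(values),
    "COUNT": len(values),
    "MIN": min(values),
    "MAX": max(values),
    "SUM": sum(values),
    # CONCAT joins every value with a pipe (|).
    "CONCAT": "|".join(str(v) for v in values),
    # UNIQUE_CONCAT joins each distinct value once.
    # dict.fromkeys keeps first-seen order (an assumption about ordering).
    "UNIQUE_CONCAT": "|".join(str(v) for v in dict.fromkeys(values)),
}

for name, value in results.items():
    print(f"{name}: {value}")
```

Each entry in `results` is a single scalar value computed from the whole set, which is what distinguishes an aggregate function from a per-document transformation.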

Snap Type

Aggregate Snap is a TRANSFORM-type Snap that transforms, parses, cleans, and formats data from binary data to document data.

Prerequisites

None.

Support for Ultra Pipelines

Does not support Ultra Pipelines.

Limitations and Known Issues

None.

Snap Views

Input

  • Format: Document

  • Number of Views: Min: 1, Max: 1

  • Examples of Upstream and Downstream Snaps: JSON, Mapper

  • Description: Each document should contain the values referenced in the Aggregate fields and GROUP-BY fields field sets. If not, the input data is sent to the error view.

Output

  • Format: Document

  • Number of Views: Min: 1, Max: 1

  • Examples of Upstream and Downstream Snaps: JSON, Mapper

  • Description: If processed successfully, each document contains the mapped data, which includes key-value entries for each GROUP-BY field name and its value, and a key-value entry for the Result field and its value.

Error

Error handling is a generic way to handle errors without losing data or failing the Snap execution. You can handle the errors that the Snap might encounter while running the Pipeline by choosing one of the following options from the When errors occur list under the Views tab. The available options are:

  • Stop Pipeline Execution: Stops the current Pipeline execution when the Snap encounters an error.

  • Discard Error Data and Continue: Ignores the error, discards that record, and continues with the rest of the records.

  • Route Error Data to Error View: Routes the error data to an error view without stopping the Snap execution.

Learn more about Error handling in Pipelines.

Snap Settings

  • Asterisk (*): Indicates a mandatory field.

  • Suggestion icon: Indicates a list that is dynamically populated based on the configuration.

  • Expression icon: Indicates whether the value is an expression (if enabled) or a static value (if disabled). Learn more about Using Expressions in SnapLogic.

  • Add icon: Indicates that you can add fields in the field set.

  • Remove icon: Indicates that you can remove fields from the field set.

Field Name

Type

Description

Label*

Default value: Aggregate
Example: Aggregate_Avg

String

Specify the name for the Snap. You can modify this to be more specific, especially if you have more than one of the same Snap in your Pipeline.

 

Aggregate fields*

 

 

 

Use this field set to define the type of Aggregate function to perform on the field and the key name to be used in the output. This field set contains the following fields:

  • Function

  • Field

  • Result field

Function*

Default value: SUM
Example: AVG

String

Select the function to apply to the aggregate field value in the input data. The available functions and their supported data types are:

  • COUNT – all types.

  • SUM – Number (Numeric string is converted to a number).

  • MIN – Number, String, DateTime, LocalDateTime, LocalDate, LocalTime.

  • MAX – Number, String, DateTime, LocalDateTime, LocalDate, LocalTime.

  • AVG – Number, DateTime, LocalDateTime, LocalDate, LocalTime (Numeric string is converted to a number).

  • CONCAT – Concatenates all records separated by a pipe (|).

  • UNIQUE_CONCAT – Concatenates all unique records separated by a pipe (|).

When you select the AVG function, the Snap rounds numeric values that have more than 16 digits. The AVG function handles numeric values as follows:

  • If the raw value has fewer than 16 digits, with or without decimals, the output value is displayed without any change.

    • Raw Value: 1234567890/1234567890.1234

    • Output Value: 1234567890/1234567890.1234

  • If the raw value has 16 or more digits, with or without decimals, the output value is rounded to 16 digits, including decimals.

    • Raw Value: 123456789012345.67

    • Output Value: 123456789012345.7

  • If the raw value has more than 16 digits before the decimal point, with or without decimals, the output value is rounded to 16 digits and expressed in exponential notation.

    • Raw Value: 1234567890123456789/1234567890123456789.123

    • Output Value: 1.234567890123457E+18

  • If the raw value is a non-terminating decimal, the output value is truncated to 16 digits.

    • Raw Value: 1.3333333333333333333333

    • Output Value: 1.333333333333333
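
The sample values above are consistent with formatting a double-precision number to 16 significant digits. The following Python sketch reproduces them with the `.16G` format specifier. This is an assumption used for illustration, not a statement about how the Snap formats numbers internally, and the helper name `to_16_digits` is hypothetical.

```python
def to_16_digits(value):
    """Format a number with at most 16 significant digits (illustrative only)."""
    # With precision 16, 'G' switches to exponential notation when the decimal
    # exponent is 16 or more (or below -4) and drops trailing zeros otherwise.
    return format(float(value), ".16G")

# Raw values taken from the list above.
samples = [
    1234567890,
    1234567890.1234,
    123456789012345.67,
    1234567890123456789,
    1.3333333333333333333333,
]

for raw in samples:
    print(to_16_digits(raw))
```

Values that fit in 16 significant digits pass through unchanged, while longer values lose their least significant digits.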

Field*

Default value: [None]
Example:  $Total or DateTime.parse($service)

String/Expression

Specify a JSON path to the field on which the Aggregate function should be applied such as $items.item. Learn more: JSONPath - XPath for JSON.

 

Result field*

Default value:  [None]
Example:  Revenue

String

Specify the field name to be used for mapped data in the output. This value is the aggregate computed result corresponding to the GROUP-BY field values.

GROUP-BY fields*

 

 

 

 

 

 

 

Use this field set to define field paths and names. If you leave this field blank, the Snap produces only one output document. This field set contains the following fields:

  • Field

  • Output field

Field

Default value: [None]
Example:  $.Product.Name

String/Expression

Specify a JSON path for the GROUP-BY field.

 

Output field

Default value:  [None]
Example:  ProductName

String

Specify the GROUP-BY field name to be used in the output map data. If left blank, the Field path is used instead.

Integer mode

Default value: Deselected
Example: Selected

 

Checkbox

Select this checkbox if you want the Snap to produce integer results rounded half up.

The input data can contain a mix of integers and floating-point numbers; the Snap maintains intermediate results as floating-point numbers. The value of this field is ignored for the COUNT aggregate function.
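
Rounding half up sends a fractional part of exactly .5 to the next higher integer, which differs from the round-half-to-even rule that some languages use by default. The following Python sketch shows the difference on a small average. The input values and the helper name `round_half_up` are hypothetical, only positive values are shown, and the code illustrates the rounding rule rather than the Snap's implementation.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_half_up(value):
    """Round to the nearest integer, with ties (x.5) going up for positive values."""
    # Convert through str so that 2.5 becomes exactly Decimal("2.5").
    return int(Decimal(str(value)).quantize(Decimal("1"), rounding=ROUND_HALF_UP))

# Hypothetical mixed integer and floating-point inputs.
values = [2, 3, 2.5, 2.5]

# The intermediate result stays in floating point; rounding happens once, at the end.
average = sum(values) / len(values)

print(average)                 # 2.5
print(round_half_up(average))  # 3
print(round(average))          # 2, because Python's built-in round() sends ties to even
```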

Sorted streams*

Default value: Unsorted
Example: Ascending

Dropdown list

Select an option to specify whether the input documents are sorted.

When you indicate that the input is sorted, the Snap verifies the sort order as it processes each document, performs the aggregation efficiently, and displays an error if the records are not sorted.

The available options are:

  • Unsorted: The input documents are not sorted. When not sorted, the Snap uses a memory-intensive method that tracks the value of the aggregation for every unique value of the GROUP-BY fields.

  • Ascending: The input documents are sorted in ascending order.

  • Descending: The input documents are sorted in descending order.

If the input data stream contains many documents, presort the input using the Sort Snap; this uses less memory and improves performance. If the data is not presorted, the Snap consumes memory equal to the size of the input data stream.
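
The memory difference between the two approaches can be sketched in Python. This is a minimal illustration under stated assumptions, not the Snap's implementation: the function names and the `product` and `total` field names are hypothetical. The unsorted variant keeps one running total per distinct key, while the sorted variant holds only the current group and raises an error when it sees a key out of order.

```python
from itertools import groupby

def sum_unsorted(docs, key, field):
    """Keep one running total per distinct key; memory grows with the number of groups."""
    totals = {}
    for doc in docs:
        totals[doc[key]] = totals.get(doc[key], 0) + doc[field]
    return totals

def sum_sorted_ascending(docs, key, field):
    """Stream over ascending input, holding only the current group's total."""
    previous = None
    for group_key, group in groupby(docs, key=lambda d: d[key]):
        # A key smaller than the previous one means the input was not sorted.
        if previous is not None and group_key < previous:
            raise ValueError(f"Input is not sorted: {group_key!r} after {previous!r}")
        previous = group_key
        yield group_key, sum(d[field] for d in group)

# Hypothetical input documents, already sorted by "product".
docs = [
    {"product": "A", "total": 5},
    {"product": "A", "total": 7},
    {"product": "B", "total": 3},
]

print(sum_unsorted(docs, "product", "total"))
print(list(sum_sorted_ascending(docs, "product", "total")))
```

Because the sorted variant emits each group as soon as the key changes, it never needs to hold more than one group's state at a time.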

Snap Execution

Default value: Validate & Execute
Example: Disabled

 

 

Dropdown list

Select one of the three modes in which the Snap executes. Available options are:

  • Validate & Execute: Performs limited execution of the Snap, and generates a data preview during Pipeline validation. Subsequently, performs full execution of the Snap (unlimited records) during Pipeline runtime.

  • Execute only: Performs full execution of the Snap during Pipeline execution without generating preview data.

  • Disabled: Disables the Snap and all Snaps that are downstream from it.

Temporary Files

During execution, data processing on Snaplex nodes occurs principally in memory as streaming and is unencrypted. When a dataset exceeds the available compute memory, the Snap writes unencrypted Pipeline data to local storage to optimize performance. These temporary files are deleted when the Snap/Pipeline execution completes. You can configure the location of the temporary data in the Global properties table of the Snaplex's node properties, which can also help avoid Pipeline errors caused by insufficient space. For more information, see Temporary Folder in Configuration Options.

Troubleshooting

Error

Reason

Resolution

EXPRESSIONS_DETECTED

Expressions have been detected in Aggregate.

Remove all expressions.

ERR_UNSUPPORTED_AGGR_FUNCTION

The selected aggregate function is not supported.

Select a valid aggregate function.

INPUT_VARIABLES_NOT_SUBSTITUTED

One or more variables in the JSON file are not mapped.

Either remove the unsubstituted variables that start with $ or map them using an upstream Snap.

ERR_EDITOR_KEYWORDS_NOT_ESCAPED

Reserved characters are detected in the JSON key-value pair.

Escape the reserved characters using # and [[ ]]. For example, replace ‘##’ with #[[##]]# to escape it.

ERR_MAPDB_VALUE_NULL

MapDB intermittently retrieves null for a non-null value.

Use Sort Snap to sort the input data stream.

Examples

Concatenating Unique String Values

The following example Pipeline shows how to use the Aggregate Snap to concatenate unique string values:

Snap Configuration

Output
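
In plain code, the same kind of aggregation can be sketched as follows. The input documents, the `Category` and `Name` field names, and the `Names` result field are all hypothetical; the example Pipeline's actual data is not reproduced here, and the ordering of the concatenated values is an assumption.

```python
# Hypothetical input documents (not the data used by the example Pipeline).
docs = [
    {"Category": "Fruit", "Name": "Apple"},
    {"Category": "Fruit", "Name": "Pear"},
    {"Category": "Fruit", "Name": "Apple"},
    {"Category": "Dairy", "Name": "Milk"},
]

unique_names = {}
for doc in docs:
    # A dict used as an ordered set: each name is kept once per category.
    unique_names.setdefault(doc["Category"], {})[doc["Name"]] = None

# One output document per GROUP-BY value, with the unique names joined by a pipe (|).
output = [
    {"Category": category, "Names": "|".join(names)}
    for category, names in unique_names.items()
]
print(output)
```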

 

Counting the Occurrences of a Given Product Name

The following example Pipeline shows how to use the Aggregate Snap to count the occurrences of a given product name.

Snap Configuration

Output
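
A comparable count can be sketched in Python with `collections.Counter`. The input documents, the `ProductName` field, and the `Count` result field are hypothetical and stand in for the example Pipeline's data, which is not reproduced here.

```python
from collections import Counter

# Hypothetical input documents (not the data used by the example Pipeline).
docs = [
    {"ProductName": "Widget"},
    {"ProductName": "Gadget"},
    {"ProductName": "Widget"},
]

# Count how many documents carry each product name.
counts = Counter(doc["ProductName"] for doc in docs)

# One output document per GROUP-BY value, with the count in the result field.
output = [{"ProductName": name, "Count": n} for name, n in counts.items()]
print(output)
```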

 

Downloads

  1. Download and import the Pipeline into SnapLogic.

  2. Configure Snap accounts as applicable.

  3. Provide Pipeline parameters as applicable.

 

  File Modified

File Aggregate - Count.slp

Apr 06, 2017 by Diane Miller

File Aggregate - Concat Unique.slp

Apr 06, 2017 by Diane Miller

Snap Pack History