The Snap groups input documents by the field values into batches of output documents. Each batch is an output document with a list of input Map data as a value at the location specified by the Target field property. Input documents with the same group-by field values are grouped into the same output document.
The Snap expects input documents with the same group-by field values to be contiguous; whenever the group-by field values change, the Snap produces a new output document. Therefore, to group all input documents with the same group-by field values into one output document, place a Sort Snap upstream of the Group By Fields Snap so that the input document stream is sorted by the group-by field values.
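The contiguity requirement can be illustrated with a minimal Python sketch. This is not the Snap's implementation: the function name `group_by_fields`, the sample documents, and the exact shape of the output document (only the target key) are assumptions made for illustration.

```python
from itertools import groupby


def group_by_fields(docs, fields, target="group"):
    """Yield one output document per run of contiguous docs with equal field values.

    Hypothetical helper: like the behavior described above, it starts a new
    batch every time the group-by values change, even if the same values
    reappear later in the stream.
    """
    def key(doc):
        return tuple(doc[f] for f in fields)

    for _, run in groupby(docs, key=key):
        yield {target: list(run)}


# Sample input (invented for this sketch): OrderNumber 1 is not contiguous.
docs = [
    {"OrderNumber": 1, "item": "a"},
    {"OrderNumber": 2, "item": "b"},
    {"OrderNumber": 1, "item": "c"},
]

# Without sorting, OrderNumber 1 ends up in two separate batches.
unsorted_out = list(group_by_fields(docs, ["OrderNumber"]))

# Sorting first (the role of the Sort Snap) yields one batch per OrderNumber.
sorted_docs = sorted(docs, key=lambda d: d["OrderNumber"])
sorted_out = list(group_by_fields(sorted_docs, ["OrderNumber"]))

print(len(unsorted_out), len(sorted_out))
```

With the unsorted stream the sketch emits three batches; after sorting it emits two, one per distinct order number.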
Expected upstream Snaps: Any Snap with a document output view
Expected downstream Snaps: Any Snap with a document input view
Expected input: A document with Map data
Expected output: A document with a list of input Map data as a value at the location specified by the Target field
Prerequisites:
All input documents should be of Map data type and contain values specified by the Fields property.
This Snap has one document output view by default. A second output view can be configured to get statistics of the input data.
Error
This Snap has at most one document error view and produces zero or more documents in the view. The error view contains error, reason, resolution and stack trace.
Settings
Label
Required. The name for the Snap. You can modify this to be more specific, especially if you have more than one of the same Snap in your pipeline.
Fields
Required. The fields to group by.
Example: $OrderNumber
Default value: [None]
Memory Sensitivity
Required. Indicates the Snap's behavior towards memory changes. Choose one of the available options:
None: If selected, it groups input documents by the field values into batches of output documents.
Dynamic: If selected, groups may be split into multiple parts, depending on memory availability. The reference group size against which each group is scaled is determined statistically from the groups already processed (mean group size + one standard deviation).
Default value: [None]
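The statistic mentioned for the Dynamic option (mean group size plus one standard deviation) can be sketched as follows. The function name `reference_group_size` is hypothetical, and the choice of the population standard deviation is an assumption; the documentation does not say which variant the Snap uses.

```python
from statistics import mean, pstdev


def reference_group_size(processed_sizes):
    """Return mean + one standard deviation of the group sizes seen so far.

    Assumption: population standard deviation (pstdev). The Snap may use a
    different estimator internally.
    """
    if not processed_sizes:
        return 0.0
    return mean(processed_sizes) + pstdev(processed_sizes)


# Sizes of groups already processed (invented sample values).
sizes = [10, 20, 30]
print(round(reference_group_size(sizes), 3))
```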
Min. Part Size
Activated when Memory Sensitivity is set to Dynamic.
Enter the minimum size of each part that the Snap produces when it splits a larger group into multiple parts.
This limit does not apply to the last part of a multi-part group, or to a single-part group that is smaller than the minimum part size specified here.
Example: 100
Default value: 10
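A rough sketch of how a minimum part size could constrain splitting is shown below. The function `split_group` and its `desired_part_size` parameter (standing in for whatever size the memory heuristics suggest) are hypothetical; the Snap's actual splitting logic is not documented here.

```python
def split_group(docs, desired_part_size, min_part_size=10):
    """Split a group into parts no smaller than min_part_size.

    Hypothetical helper: the last part, or a group that is smaller than
    min_part_size to begin with, may fall below the minimum.
    """
    size = max(desired_part_size, min_part_size, 1)
    return [docs[i:i + size] for i in range(0, len(docs), size)]


# A 25-document group where memory pressure suggests parts of 4 documents:
# the minimum of 10 wins, and only the final part is smaller.
parts = split_group(list(range(25)), desired_part_size=4, min_part_size=10)
print([len(p) for p in parts])

# A group smaller than the minimum stays in a single part.
small = split_group(list(range(3)), desired_part_size=4, min_part_size=10)
print([len(p) for p in small])
```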
Target field
Required. The field name to use as a key in the output document, or a JSON path at which the list of input Map data is placed.
Example: batch
Default value: group
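The difference between a plain key and a JSON path as the target can be sketched like this. The helper `set_at_path` is hypothetical and handles only simple dotted paths; whether the Snap creates intermediate objects in the same way is an assumption.

```python
def set_at_path(doc, path, value):
    """Place value at a plain key ("batch") or a simple dotted path ("$result.batch").

    Hypothetical, simplified path handling: no array indexes or filters.
    Intermediate objects are created as needed (an assumption).
    """
    keys = path.lstrip("$").strip(".").split(".")
    node = doc
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value
    return doc


batch = [{"OrderNumber": 1}, {"OrderNumber": 1}]

# Plain field name: the list becomes a top-level value.
flat = set_at_path({}, "batch", batch)

# JSON path: the list is nested under "result".
nested = set_at_path({}, "$result.batch", batch)

print(flat)
print(nested)
```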
Minimum memory (MB)
If the available memory is less than this property value while processing input documents, the Snap stops fetching the next input document until more memory is available. This feature is disabled if this property value is 0.
Example: 500
Default value: 750
Out-of-memory timeout (minutes)
If the Snap pauses longer than this property value while waiting for more memory to become available, it throws an exception to prevent the system from running out of memory.
Example: 30
Default value: 20
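The interaction between Minimum memory (MB) and Out-of-memory timeout (minutes) can be pictured as a wait loop with a deadline. The sketch below is illustrative only: `wait_for_memory`, the polling interval, and the `available_mb` callback are hypothetical names, and `MemoryError` stands in for whatever exception the Snap actually throws.

```python
import time


def wait_for_memory(available_mb, minimum_mb=750, timeout_minutes=20,
                    poll_seconds=1.0, clock=time.monotonic, sleep=time.sleep):
    """Block until available_mb() reaches minimum_mb, or raise after the timeout.

    available_mb is a caller-supplied function returning free memory in MB
    (hypothetical). A minimum of 0 disables the check, as described above.
    """
    if minimum_mb == 0:
        return
    deadline = clock() + timeout_minutes * 60
    while available_mb() < minimum_mb:
        if clock() >= deadline:
            raise MemoryError("timed out waiting for available memory")
        sleep(poll_seconds)


# Simulated readings: memory is low once, then recovers.
readings = iter([100, 800])
wait_for_memory(lambda: next(readings), minimum_mb=750, sleep=lambda s: None)
print("memory available, fetching next document")
```

Injecting `clock` and `sleep` keeps the sketch testable without real delays; a production loop would read actual memory figures instead of the simulated `readings`.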
Snap Execution
Select one of the three modes in which the Snap executes. Available options are:
Validate & Execute: Performs limited execution of the Snap, and generates a data preview during Pipeline validation. Subsequently, performs full execution of the Snap (unlimited records) during Pipeline runtime.
Execute only: Performs full execution of the Snap during Pipeline execution without generating preview data.
Disabled: Disables the Snap and all Snaps that are downstream from it.
Default value: Execute only
Example: Validate & Execute
Examples
Input and output documents batched by the group name and fields
In this pipeline, the Group By Fields Snap groups input documents that share the same group-by field values into batches of output documents.
The JSON Generator Snap passes the values to be batched into groups by fields.
The Sort Snap sorts the input documents in ascending order. The respective output preview:
The Group By Fields Snap groups the documents by group name and fields.
The output preview from the Group By Fields Snap grouped by the order number and the group fields:
The output preview in the table format:
Input and output documents by code
Assume an input stream of nine documents as follows:
Fixed an issue with the CSV Parser Snap that introduced unexpected characters into the records and output data because of incorrect handling of the delimiter.
November 2024
main29029
Stable
Updated and certified against the current SnapLogic Platform release.
August 2024
438patches28073
Latest
Fixed an issue with the JSON Generator and XML Generator Snaps that caused unexpected output displaying '__at__' and '__h__' instead of '@' and '-' respectively because the Snap could not update them to their original values after the Velocity library upgrade.
August 2024
438patches27959
Latest
Fixed an issue with the Sort Snap where it could not sort files larger than 52 MB. This fix also applies to the Join Snap.
August 2024
main27765
Stable
Upgraded the org.json.json library from v20090211 to v20240303, which is fully backward compatible.
May 2024
437patches26643
Latest
Fixed an issue with the Sort Snap that displayed an error when estimating the size of the input document provided by the upstream S3 Browser Snap.
Fixed an issue with the Parquet Formatter Snap that was unable to route errors to the error view.
May 2024
437patches26453
Latest
Added expression support to the Skip lines field in the CSV Parser Snap to enable passing pipeline parameters and upstream values.
Fixed an issue with the XML Parser Snap that caused an error when using the Splitter option in the Snap settings.
May 2024
main26341
Stable
Added Parquet Parser and Parquet Formatter Snaps to the Transform Snap Pack:
Parquet Parser: Reads the binary Parquet data and writes document data to the output.
Parquet Formatter: Reads the document data and writes it to the output in binary Parquet format.
Enhanced the JSON Splitter Snap to capture metadata and lineage information from the input document.
February 2024
436patches25564
Latest
Fixed an issue with the JSON Formatter Snap that generated incorrect schema.
February 2024
436patches25292
Latest
Fixed an out-of-memory error issue with the Aggregate Snap. This Snap no longer performs the presort for the input documents.
If the input documents are unsorted and GROUP-BY fields are used, you must use the Sort Snap upstream of the Aggregate Snap to presort the input document stream and set the Sorted stream field to Ascending or Descending to prevent the out-of-memory error. However, if the total size of input documents is expected to be relatively small compared to the available memory, then the Sort Snap is not required upstream.
Updated and certified against the current SnapLogic Platform release.
November 2023
435patches24802
Latest
Fixed an issue with the Excel Parser Snap that caused a null pointer exception when the input data was an Excel file that did not contain a StylesTable.
November 2023
435patches24481
Latest
Fixed an issue with the Aggregate Snap where the Snap was unable to produce the desired number of output documents when the input was unsorted and the GROUP-BY fields field set was used.
November 2023
435patches24094
Latest
Fixed a deserialization issue for a unique function in the Aggregate Snap.
November 2023
main23721
Stable
Updated and certified against the current SnapLogic Platform release.
August 2023
434patches23076
Latest
Fixed an issue with the Binary to Document Snap where an empty input document with Ignore Empty Stream selected caused the Snap to stop executing.
August 2023
434patches23034
Latest
Fixed an issue with the Transform Snap Pack that caused an error when the input file was a binary JSON file that contained a string value of more than 20,000,000 characters.
Fixed a memory issue with the Aggregate Snap that occurred when using GROUP-BY fields.
August 2023
434patches22705
Latest
Fixed an issue with the JSON Splitter Snap that caused the pipeline to terminate with excessive memory usage on the Snaplex node after the 4.33 GA upgrade. The Snap now consumes less memory.
August 2023
main22460
Stable
Updated and certified against the current SnapLogic Platform release.
May 2023
433patches22431
Latest
Fixed an issue with the Excel Multi Sheet Formatter Snap that caused it to produce binary output data when there was no input document and Ignore empty stream was selected.
Introduced the following new Snaps:
GeoJSON Parser: Parses geospatial data from binary data input and outputs the contents as a GeoJSON document downstream.
WKT Parser: Parses geospatial data from binary data input and outputs the contents as a WKT (Well Known Text) document downstream.
May 2023
433patches21779
Latest
The Decrypt Field and Encrypt Field Snaps now support CTR (Counter mode) for the AES (Advanced Encryption Standard) block cipher algorithm.
May 2023
433patches21586
Latest
The Decrypt Field Snap now supports the decryption of various encrypted fields on providing a valid decryption key.
Fixed an issue with the AutoPrep Snap where dates could potentially be rendered in a currency format because currency format options were displayed for the DOB column.
May 2023
433patches21196
Latest
Enhanced the In-Memory Lookup Snap with the following new fields to improve memory management and help reduce the possibility of out-of-memory failures:
Minimum memory (MB)
Out-of-memory timeout (minutes)
These new fields replace the Maximum memory % field.
May 2023
main21015
Stable
Upgraded with the latest SnapLogic Platform release.
February 2023
432patches20535
Latest
Fixed an issue with the Encrypt Field Snap, where the Snap failed to support an RSA public key to encrypt a message or field. Now the Snap supports the RSA public key to encrypt the message.
The Pipeline Execution Statistics of the Join Snap now has a status message that displays the parameters - Free disk space, Available memory, and Average document size.
The internal sort buffer size is reduced to a minimum of 10MB when the available memory in the node becomes lower than 500MB to avoid the out-of-memory crash.
The internal sort buffer size is restored to its original size when the available memory becomes larger than 2GB.
Improved the readability of the error message for the out-of-disk-space-on-node error. The updated error message provides clearer information and guidance:
Reason: Insufficient free disk space available to stage sort data into temporary files.
Resolution: Increase the amount of free disk space and try again.
February 2023
432patches20250
Latest
Fixed an issue with the JSON Splitter Snap that was causing errors when using multiple repeated dots in the JSON Path.
The Sort Snap includes the following improvements:
The Maximum memory % field is revised to Maximum memory.
The Maximum memory unit field (a new dropdown list) enables you to choose a unit, either percentage (%) or MB, for better memory control.
February 2023
432patches20151
Stable/Latest
Fixed an issue that occurred with the JSON Splitter Snap when used in an Ultra pipeline. The request was acknowledged before it was processed by the downstream Snaps, which caused a 400 Bad Request response.
February 2023
432patches20062
Stable/Latest
Fixed the behavior of the JSON Splitter Snap for some use cases where its behavior was not backward compatible with the 4.31 GA version. These cases involved certain uses of either the Include scalar parents feature or the Include Paths feature.
February 2023
432patches19974
Stable/Latest
Fixed the "Json Splitter expects a list" error by restoring the JSON Splitter Snap's previous behavior of handling the case where the document element referenced by the JSON Path to Split field is an object instead of a list or array.
Review your pipelines where this error occurred to check your assumptions about the input to the JSON Splitter and whether the value referenced by the JSON Path to Split field will always be a list. If the input is provided by an XML-based or SOAP-based Snap like the Workday or NetSuite Snaps, a result set or child collection that’s an array when there's more than one result or child will be an object when there's only one result or child. In these cases, we recommend using a Mapper Snap and the sl.ensureArray() function to ensure that the value being split by the JSON Splitter is always an array (even for the single element cases).
February 2023
432patches19918
Stable/Latest
Fixed an issue with the CSV Formatter Snap where the Unicode character delimiters using [0-9a-f] did not work.
Fixed an issue with the JSON Splitter Snap that was generating null values for empty input data.
February 2023
main19844
Stable
Upgraded with the latest SnapLogic Platform release.
The Transform Join Snap no longer fails with a null pointer exception when you configure the Sorted streams field with Ascending.
November 2022
431patches19359
Latest
The JSON Splitter Snap includes memory improvements and a new Exclude List from Output Documents checkbox. This checkbox enables you to prevent the list that is split from getting included in output documents, and this also improves memory usage.
The Mapper Snap now has a Sorted checkbox in the Input Schema and Target Schema panels, which allows you to sort the input and target schemas. When unchecked, the Snap unsorts the input and the target schema.
October 2022
430patches18800
Latest
The Sort and Join Snaps now have improved memory management, allowing used memory to be released when the Snap stops processing.