MongoDB - Group

Overview

The MongoDB Group Snap groups input documents by the specified expressions and generates one output document for each distinct group. Each output document contains an _id field that holds the distinct group-by key.

Input and Output

Expected input

  • A document stream that contains information to construct a query condition or grouping condition.

Expected output

  • A document stream that contains documents that are a result of the specified grouping or query condition.

Expected upstream Snaps

  • A Snap that generates documents. For example, Mapper, JSON Generator, CSV Generator.

Expected downstream Snaps

  • A Snap that accepts documents. For example, Mapper, Filter, and JSON Formatter.

Prerequisites

None.

Configuring Accounts

This Snap uses account references created on the Accounts page of SnapLogic Manager to handle access to this endpoint. See MongoDB Account for information on setting up this type of account.

Configuring Views

Input

This Snap has at most one document input view. If the input view is defined, its values are used to evaluate the expressions in the conditions.

Output

This Snap has exactly one document output view.

Error

This Snap has at most one document error view and produces zero or more documents in the view.

Troubleshooting

None.

Limitations and Known Issues

None.

Modes

Snap Settings


Label

Required. The name for the Snap. Modify this to be more specific, especially if the Pipeline contains more than one instance of the same Snap.

Database name

The database that contains the documents. If you do not specify a database, then the Snap uses the MongoDB account database.

Example: assets

Default value: N/A

Collection name

Required. The name of the MongoDB collection on which to run the grouping.

Example: users

Default value: N/A

Query Condition

An expression that represents the query parameter. If no query condition is defined, MongoDB retrieves all the documents in the collection. When the expression evaluates to an object, only strict mode is supported. When the expression evaluates to a JSON string, both strict mode and mongo shell mode are supported. See the MongoDB Extended JSON documentation for more information.

Example: {num:1}

Default value: N/A
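The difference between the two modes can be illustrated with a short Python sketch. Strict mode is plain JSON, so a standard JSON parser accepts it, whereas mongo shell mode uses helpers such as ObjectId(...) that are not valid JSON. The ObjectId value and the helper name is_plain_json are made up for illustration; this does not show how the Snap itself parses the field.

```python
import json

# Strict mode: type information is carried by plain JSON objects such as {"$oid": ...}.
strict_query = '{"_id": {"$oid": "5d505646cf6d4fe581014ab2"}}'

# mongo shell mode: uses shell helpers such as ObjectId(...), which are not valid JSON.
shell_query = '{"_id": ObjectId("5d505646cf6d4fe581014ab2")}'


def is_plain_json(text):
    """Return True if a standard JSON parser accepts the text."""
    try:
        json.loads(text)
        return True
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False


print(is_plain_json(strict_query))  # True
print(is_plain_json(shell_query))   # False
```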

Group Condition

Required. The condition used to group documents and create a single document for each distinct group. See the MongoDB documentation for details on group conditions.

Example: {"_id": "$Churn", "num_customer": {$sum: 1}, "total": {$sum: "$TotalCharges"}}

Default value: N/A

Sort Condition

The condition used to order the documents in the result set. To use multiple sort orders, enter comma-separated sort conditions. See the MongoDB documentation for details on sort conditions.

Example: {age: -1}

Default value: N/A
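One plausible way to picture how the Query Condition, Group Condition, and Sort Condition fields relate is as stages of a MongoDB aggregation pipeline. The sketch below only builds the stage list as plain Python dictionaries; the mapping of fields to the $match, $group, and $sort stages, the stage order, and the function name build_pipeline are assumptions for illustration, not a description of the Snap's internals.

```python
import json


def build_pipeline(group, query=None, sort=None):
    """Assemble aggregation stages from the three Snap fields (assumed mapping)."""
    stages = []
    if query:  # Query Condition is optional: without it, all documents are grouped
        stages.append({"$match": query})
    stages.append({"$group": group})  # Group Condition is required
    if sort:  # Sort Condition is optional
        stages.append({"$sort": sort})
    return stages


pipeline = build_pipeline(
    group={"_id": "$Churn", "num_customer": {"$sum": 1}, "total": {"$sum": "$TotalCharges"}},
    query={"num": 1},
    sort={"num_customer": -1},
)
print(json.dumps(pipeline, indent=2))
```

Leaving out query or sort simply drops the corresponding stage, which mirrors the fact that only the Group Condition is required.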

Batch Size

Required. The number of documents to return in a batch. See the MongoDB documentation for more details on how MongoDB batches documents.

If n is the batch size, the Snap behaves as follows:

  • If n > 1, the Snap gets all the documents in the collection that match the query condition, returning n documents at a time or as many as fit in the batch buffer.
  • If n = 1, the Snap gets only one document in the collection that matches the query condition.
  • If n = 0, the Snap gets all the documents in the collection that match the query condition, returning as many documents at a time as fit in the batch buffer.
  • If n < 0, the Snap gets |n| documents in the collection that match the query condition (for example, if n = -10, the Snap gets 10 documents).

In MongoDB 3.4, the batch buffer size is 16 MB. In previous versions, the batch buffer size is 4 MB. The initial batch always returns a maximum of 101 documents. See the MongoDB documentation for details.
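The batch size rules above can be summarized in a small Python sketch. It models only how many documents the Snap ends up getting for a given n, not the batching itself; the function name documents_fetched and the figure of 200 matching documents are hypothetical.

```python
def documents_fetched(n, matching):
    """Number of documents the Snap gets for batch size n, given `matching`
    documents that satisfy the query condition (per the rules listed above)."""
    if n == 1:
        return min(1, matching)       # only one document
    if n < 0:
        return min(abs(n), matching)  # |n| documents
    return matching                   # n == 0 or n > 1: all matching documents


for n in (50, 1, 0, -10):
    print(n, documents_fetched(n, matching=200))
# 50 200
# 1 1
# 0 200
# -10 10
```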

Example: 0

Default value: 0 

Timezone Offset

The timezone offset to be applied to the time fields. By default, the Snap follows UTC (00:00 offset). 

Hours Offset

The number of hours to be offset. For example, if you specify a value of -2 in Hours Offset and 30 in Minutes Offset, then the timezone is offset by -2:30 hours. 

Example:  -7

Default value: 0 

Minutes Offset

The number of minutes to be offset. For example, if you specify a value of -2 in Hours Offset and 30 in Minutes Offset, then the timezone is offset by -2:30 hours. 

Example: 1

Default value: 0
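The following Python sketch shows the arithmetic behind the example above, where -2 in Hours Offset and 30 in Minutes Offset yield an offset of -2:30. It assumes that the minutes take the sign of the hours field, which is what the example implies; the function name combined_offset and the sample timestamp are hypothetical, and how the Snap applies the offset to time fields internally is not shown here.

```python
from datetime import datetime, timedelta, timezone


def combined_offset(hours, minutes):
    """Combine the two fields so that -2 hours and 30 minutes gives -2:30.

    Assumption: the minutes take the sign of the hours field.
    """
    sign = -1 if hours < 0 else 1
    return timedelta(hours=hours, minutes=sign * abs(minutes))


utc_time = datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)
offset = combined_offset(-2, 30)
local_time = utc_time.astimezone(timezone(offset))
print(local_time.isoformat())  # 2024-05-01T09:30:00-02:30
```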

Ignore empty result

If selected, no document is written to the output view when the group operation does not produce any result. If this property is not selected and the Pass through property is selected, the input document is passed through to the output view.

Default value: Not selected

Group result

Select to group results in one single field named result, instead of an array.

Default value: Not selected

Pass through

If selected, the input document is passed through to the output view under the key 'original'.

Default value: Selected

Number of retries

Specify the maximum number of attempts to be made to receive a response. The request is terminated if the attempts do not result in a response.

  • If the Number of retries value is set to 0 (the default value), retries are disabled: any failure during the database operation immediately fails the Pipeline without a retry attempt.

  • If the Snap fails on all retries, it routes the last occurred exception to the error view.

Default Value: 0
Example: 4

Retry interval (seconds)

Specify the time interval between two retry requests.


Default Value: 1
Example: 5
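The interaction between Number of retries and Retry interval (seconds) can be sketched as a simple retry loop in Python. This is a minimal illustration under stated assumptions, not the Snap's implementation: it assumes the first attempt is not counted as a retry, and the names run_with_retries and flaky_group are hypothetical.

```python
import time


def run_with_retries(operation, retries=0, interval_seconds=1, sleep=time.sleep):
    """Call operation(); on failure, retry up to `retries` more times,
    waiting `interval_seconds` between attempts. With retries=0 the first
    failure is raised immediately."""
    last_error = None
    for attempt in range(retries + 1):
        try:
            return operation()
        except Exception as error:
            last_error = error
            if attempt < retries:
                sleep(interval_seconds)
    # All attempts failed: surface the last exception (the Snap routes it to the error view).
    raise last_error


calls = {"count": 0}


def flaky_group():
    """Simulated operation that fails twice and then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated connection failure")
    return [{"_id": "Yes"}, {"_id": "No"}]


result = run_with_retries(flaky_group, retries=4, interval_seconds=0)
print(calls["count"], result)  # 3 [{'_id': 'Yes'}, {'_id': 'No'}]
```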

Snap Execution

Select one of the following three modes in which the Snap executes:

  • Validate & Execute: Performs limited execution of the Snap, and generates a data preview during Pipeline validation. Subsequently, performs full execution of the Snap (unlimited records) during Pipeline runtime.

  • Execute only: Performs full execution of the Snap during Pipeline execution without generating preview data.

  • Disabled: Disables the Snap and all Snaps that are downstream from it.

Default Value: Execute only

Example: Validate & Execute

Example


This Pipeline demonstrates how the MongoDB Group Snap groups the customers of a telecommunication company by churn status. For each group, the Pipeline returns the number of customers, the average monthly charges, and the total charges from the Telco customer churn dataset.

Download the Pipeline.

Understanding the Pipeline

In this example, we use the dataset of a telecommunication company. The dataset contains 21 fields. Each document in the dataset represents a customer record and contains data about the customer's demographics, service subscriptions, and the Churn field, which indicates whether the customer is an existing customer or has quit. The Snap reads the dataset from the database location specified in the Snap configuration. The Snap configuration is as follows:

A preview of the input dataset is as follows:

The query that we use in the Group Condition field is:

{"_id": "$Churn", "num_customer": {$sum: 1}, "avg_monthly": {$avg: "$MonthlyCharges"}, "total": {$sum: "$TotalCharges"}}

The Snap runs this query on the dataset and returns, for each churn value, the number of customers, the average monthly charges, and the total charges. The output preview of the Snap is as follows:

The Snap groups the dataset into two groups, identified by the _id values Yes (customers who have quit) and No (existing customers). For each group, it returns the number of customers, the average monthly charges, and the total charges.
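To make the result of the group condition concrete, the following Python sketch emulates the same grouping on three made-up customer records. It is a plain-Python illustration of what $sum and $avg compute per group, not the Snap's implementation; the sample values and the function name group_by_churn are hypothetical.

```python
from collections import defaultdict

# Made-up sample records; only the fields used by the group condition are shown.
customers = [
    {"Churn": "Yes", "MonthlyCharges": 70.0, "TotalCharges": 140.0},
    {"Churn": "No", "MonthlyCharges": 50.0, "TotalCharges": 600.0},
    {"Churn": "No", "MonthlyCharges": 30.0, "TotalCharges": 300.0},
]


def group_by_churn(docs):
    """Emulate {"_id": "$Churn", "num_customer": {$sum: 1},
    "avg_monthly": {$avg: "$MonthlyCharges"}, "total": {$sum: "$TotalCharges"}}."""
    buckets = defaultdict(list)
    for doc in docs:
        buckets[doc["Churn"]].append(doc)
    results = []
    for key, members in buckets.items():
        results.append({
            "_id": key,
            "num_customer": len(members),
            "avg_monthly": sum(d["MonthlyCharges"] for d in members) / len(members),
            "total": sum(d["TotalCharges"] for d in members),
        })
    return results


for row in group_by_churn(customers):
    print(row)
# {'_id': 'Yes', 'num_customer': 1, 'avg_monthly': 70.0, 'total': 140.0}
# {'_id': 'No', 'num_customer': 2, 'avg_monthly': 40.0, 'total': 900.0}
```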


Downloads

Important steps to successfully reuse Pipelines

  1. Download and import the pipeline into the SnapLogic application.
  2. Configure Snap accounts as applicable.
  3. Provide pipeline parameters as applicable.

File: MongoDBGroupExample.slp

Modified: Aug 08, 2019 by Vidya Patil

Snap Pack History

Release  Snap Pack Version  Date  Type  Updates

November 2024  main29029  Stable

Enhanced the MongoDB Replica and Mongo ReplicaSet Dynamic Accounts to define read preference options when querying data. The default option is Primary, so you cannot allocate read load to the secondary node. Note that the Secondary preferred mode is not supported for the MongoDB Execute Snap.

August 2024  main27765  Stable

Updated and certified against the current SnapLogic Platform release.

May 2024  437patches27343  Latest

The MongoDB - Atlas Vector Search Snap now supports the following:

  • Suggestions for the Search index field that enables the Snap to populate the associated indices in the list.

  • The input schema displays the mandatory vector field and optional filter suggestion (if the Search index contains a filter type query) in alignment with the fields expected by the Snap.

May 2024  437patches26832  Latest

  • Fixed the inconsistency in ObjectId and Date representation in the output preview between MongoDB - Execute and MongoDB - Find Snaps.

  • Enhanced the MongoDB Execute Snap with the Timezone Offset field set that enables you to apply the timezone offset on the date fields.

May 2024  437patches26721  Latest

  • Added the Number of retries and Retry interval (seconds) fields to the MongoDB Delete, Update, Find, Group, Insert, and Atlas Vector Search Snaps, which enable you to handle retries during a connection failure.
  • Fixed an issue with the MongoDB - Execute Snap, where data was missing when the database server restarted and the error view was enabled.
  • Fixed an issue with the MongoDB - Execute Snap where the log file missed the retry attempts information.
May 2024  main26341  Stable

  • Enhanced the MongoDB Update Snap with the Array Filters field, which enables you to use array filters in the update operation. Additionally, the Update Query field is modified into a text box for visibility and usability of input queries.
  • Upgraded Spring dependencies to the latest supported Java 11 version for MongoDB Snap Pack.

February 2024  436patches26244  Latest

Added the following Snap to the MongoDB Snap Pack:

  • MongoDB - Atlas Vector Search: Performs advanced vector-based queries, such as Similarity searches, Approximate Nearest Neighbor (ANN) queries, and Range queries on vector data stored in MongoDB Atlas.

February 2024  436patches25893  Latest

Added MongoDB Execute Snap to the MongoDB Snap Pack.

February 2024  main25112  Stable

Updated and certified against the current SnapLogic Platform release.

November 2023  main23721  Stable

Updated and certified against the current SnapLogic Platform release.

August 2023  main22460  Stable

Updated and certified against the current SnapLogic Platform release.

May 2023  main21015  Stable

Upgraded with the latest SnapLogic Platform release.

February 2023  main19844  Stable

Upgraded with the latest SnapLogic Platform release.

November 2022  main18944  Stable

Upgraded with the latest SnapLogic Platform release.

September 2022  430patches18223  Latest

The MongoDB Update Snap in a low-latency feed Ultra Pipeline now correctly acknowledges the requests.

August 2022  430patches17472  Latest

The MongoDB Account with Encryption type set to TLS/SSL no longer fails with the "URL cannot be null" error.

August 2022  main17386  Stable

Upgraded with the latest SnapLogic Platform release.

4.29 Patch  429patches15807  Latest

Updated the expected output for the MongoDB - Update Snap, which changed because of the upgrade of the Spring Core framework version.

4.29  main15993  Stable

Upgraded with the latest SnapLogic Platform release.

4.28  main14627  Stable

Upgraded with the latest SnapLogic Platform release.

4.27  main12833  Stable

Upgraded with the latest SnapLogic Platform release.

4.26  main11181  Stable

Upgraded with the latest SnapLogic Platform release.

4.25  main9554  Stable

Upgraded with the latest SnapLogic Platform release.

4.24  main8556  Stable

Fixed an issue in the MongoDB accounts when connecting to Atlas Free Tier and Shared Cluster databases by using the Use cursor timeout checkbox in the MongoDB cursor properties. If selected, this option enables the server to close a cursor automatically after a period of inactivity. For existing accounts that do not have this field, the value of this checkbox is false, which is backward compatible.

4.23  main7430  Stable

Upgraded with the latest SnapLogic Platform release.

4.22  main6403  Stable

Upgraded with the latest SnapLogic Platform release.

4.21 Patch  421patches6272  Latest

Fixes the issue where Snowflake SCD2 Snap generates two output documents despite no changes to Cause-historization fields with DATE, TIME and TIMESTAMP Snowflake data types, and with Ignore unchanged rows field selected.

4.21 Patch  421patches6144  Latest

Fixes the following issues with DB Snaps:

  • The connection thread waits indefinitely causing the subsequent connection requests to become unresponsive.
  • Connection leaks occur during Pipeline execution.

4.21 Patch  MULTIPLE8841  Latest

Fixes the connection issue in Database Snaps by detecting and closing open connections after the Snap execution ends.

4.21  snapsmrc542  Stable

Upgraded with the latest SnapLogic Platform release.

4.20  snapsmrc535  Stable

Upgraded with the latest SnapLogic Platform release.

4.19  snaprsmrc528  Stable

Upgraded with the latest SnapLogic Platform release.

4.18  snapsmrc523  Stable

  • Added the following fields to the MongoDB Find Snap: Projection Condition, Sort Condition, Offset, Limit, and Group result.
  • Added a new Snap, MongoDB Group, which enables you to group input documents by a specified expression and output one document for each distinct grouping to the next stage.
  • Added new fields to the MongoDB Update Snap, Update operation and Exclude list, which enable you to set the update operation and exclude a list of JSON properties before sending the updated documents to MongoDB.

4.17  ALL7402  Latest

Pushed automatic rebuild of the latest version of each Snap Pack to SnapLogic UAT and Elastic servers.

4.17 Patch  db/mongo7331  Latest

  • Fixed an issue with the MongoDB - Update Snap wherein the Snap converts all non-updated integer and float data types to string data type.
  • Fixed a Null Pointer Exception for old MongoDB accounts that did not have driver jars.

4.17  snapsmrc515  Latest

Added the Snap Execution field to all Standard-mode Snaps. In some Snaps, this field replaces the existing Execute during preview check box.

4.16  snapsmrc508  Stable

Upgraded with the latest SnapLogic Platform release.

4.15 Patch  db/mongo6783  Latest

Fixed connection timeout issue with MongoDB.

4.15 Patch  db/mongo6465  Latest

Fixed an issue wherein REST calls to Ultra tasks returned an error message.

4.15  snapsmrc500  Stable

Upgraded with the latest SnapLogic Platform release.

4.14 Patch  db/mongo5666  Latest

Fixed the Update, Delete, and Find Snaps to populate input view schema for a given table, similar to the Insert Snap.

4.14  snapsmrc490  Stable

Upgraded with the latest SnapLogic Platform release.

4.13 Patch  mongo5537  Latest

Fixed the Update, Delete, and Find Snaps to populate input view schema for a given table, similar to the Insert Snap.

4.13  snapsmrc486  Stable

Upgraded with the latest SnapLogic Platform release.

4.12  snapsmrc480  Stable

Added the SSL certification properties to all the MongoDB Accounts to ensure the validation of the certificate.

4.11  snapsmrc465  Stable

Upgraded with the latest SnapLogic Platform release.

4.10 Patch  mongo3978  Latest

Resolved an issue where the NumberFormatException was not handled properly for some of the valid Number Types like "NaN"/ "+Infinity"/ "-Infinity".

4.10  snapsmrc414  Stable

Upgraded with the latest SnapLogic Platform release.

4.9.0 Patch  mongodb3259  Latest

  • Addressed an issue in MongoDB Update where Upsert Date failed with "Can't find a codec for class org.joda.time".
  • Fixed the evaluation of the Collection name expression in the MongoDB Insert Snap.

4.9  snapsmrc405  Stable

  • Query Condition property is now an expression that evaluates to an object or JSON string.
  • Updated the Snap with Database name property to support the users defined in an authentication database.

4.8.0 Patch  mongodb2735  Latest

Added SSL encryption type to all MongoDB accounts and Replica set Accounts and removed the MongoDB SSL account.

4.8  snapsmrc398  Stable

  • The MongoDB Delete Snap and MongoDB Update Snap were introduced in this release.
  • Enhanced the MongoDB Snap account with SSL Account type.
  • Updated the Batch Size property in MongoDB Find with the default value of 0.
  • Enhanced the MongoDB Find Snap documentation with an example.
  • Info tab added to accounts.
  • Database accounts now invalidate connection pools if account properties are modified and login attempts fail.

4.7 Patch  mongo2375  Latest

  • Updated the MongoDB Java driver to 3.0.4 and added exception handling to each record processing.
  • MongoDB SSL Account removed and replaced with the new configuration for all the Accounts (Encryption type Property).

4.7 Patch  mongo2338  Latest

Added an account for MongoDB SSL connection without certificate validation.

4.7 Patch  mongo2200  Latest

Fixed an issue for database Select Snaps regarding Limit rows not supporting an empty string from a pipeline parameter.

4.7  snapsmrc382  Stable

Updated the Snap account with the LDAP Authentication type.

4.6  snapsmrc362  Stable

Resolved an issue in MongoDB Insert Snap that processed and inserted all numeric fields as strings.

4.5.1  snapsmrc344  Stable

  • Resolved an issue where MongoDB Insert failed with an empty input view.
  • Resolved an issue with MongoDB Insert that changed a numeric type field in MongoDB to String.
  • Resolved an issue that caused Snap execution failures when accessing MongoDB using a Replica Set account.

4.3.2  Stable

Resolved an issue with MongoDB Find returning nothing when nothing was found.

4.3  Stable

  • Resolved an issue with an incorrect resolution displaying during account validation if the username was blank.
  • Resolved an issue in the MongoDB Find Snap with nested arrays.
  • Resolved an issue in MongoDB Find with data not being usable by other Snaps.

4.2.2  Stable

  • Username and Password are no longer required fields when creating a MongoDB account because it is possible to configure an instance where that information is not required.
  • MongoDB Account now supports Mongo Java Driver 3.0.2.
  • MongoDB Find
    • Resolved an issue with MongoDB Find not properly supporting the expression language.
    • Resolved an issue with MongoDB Find returning "Current context not an ARRAY but OBJECT" for a deep nested ObjectId object.
    • Resolved an issue with MongoDB Find when data had built-in datatype.
    • Resolved a null pointer exception in MongoDB Find.
    • Resolved an issue with MongoDB Find not routing failed documents to the error view.
  • Resolved an issue with MongoDB error handling when the maximum number of documents is reached.
  • Improved the error message presented when MongoDB database could not reach the JCC.

4.2.1  Stable

  • MongoDB - Find: Resolved the error "Failure: java.util.HashMap cannot be cast to java.lang.String" that occurred when the query condition was passed without single quotes.
  • MongoDB - Find: Resolved a failure with queries using operators.
  • Resolved an issue where the MongoDB driver and account did not support the current version of MongoDB.
  • Resolved a failure in MongoDB Insert with a custom _id.
