
Snap type:

Transform


Description:

This Snap eliminates duplicate documents in a document stream, such as duplicate rows in a CSV file. To understand how the Unique Snap behaves, keep in mind that SnapLogic streams data, processing one document at a time in each Snap.

The Unique Snap compares each incoming document only with the document that immediately precedes it, and it does NOT sort the documents before making that comparison.

If the data being passed were in the following order:


 
The Snap would recognize that the duplicates arrive in adjacent pairs, and the result would be just the 10 de-duplicated rows:

 
If, however, the data were in a different order:

The result would be exactly the same as the input, because the duplicate rows are not adjacent in the flow and are therefore not identified.
 
To eliminate such duplicates, place a Sort Snap before the Unique Snap in the flow. Sorting makes the duplicates adjacent, so the Unique Snap can remove them.
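
The adjacency rule described above can be sketched in a few lines of Python. This is a minimal illustration and not the Snap's implementation: the function name `unique_adjacent` and the sample rows are hypothetical, and sorting with Python's built-in `sorted` merely stands in for a Sort Snap.

```python
from typing import Iterable, Iterator


def unique_adjacent(documents: Iterable[tuple]) -> Iterator[tuple]:
    """Yield a document unless it equals the one immediately before it."""
    first = True
    previous = None
    for doc in documents:
        if first or doc != previous:
            yield doc
        previous = doc
        first = False


# Hypothetical sample rows, not taken from the original page.
paired = [("1", "Ann"), ("1", "Ann"), ("2", "Bob"), ("2", "Bob")]
interleaved = [("1", "Ann"), ("2", "Bob"), ("1", "Ann"), ("2", "Bob")]

# Duplicates are adjacent, so they are removed.
paired_result = list(unique_adjacent(paired))

# Duplicates are not adjacent, so the output equals the input.
interleaved_result = list(unique_adjacent(interleaved))

# Sorting first makes the duplicates adjacent again.
sorted_result = list(unique_adjacent(sorted(interleaved)))
```

Because only the previous document is remembered, the sketch needs constant memory regardless of stream length, which is consistent with processing one document at a time.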

  • Expected upstream Snaps:  Any Reader followed by any file Parser.
  • Expected downstream Snaps:  Any file Formatter followed by a Writer.
  • Expected input:  Document input, likely with duplicate data.
  • Expected output:  Unique document data.
Prerequisites:

[None]


Support and limitations:

Does not work in Ultra Pipelines.

Account: 

Accounts are not used with this Snap.


Views:


Input: This Snap has exactly one document input view.
Output: This Snap has exactly one document output view.
Error: This Snap has at most one document error view and produces zero or more documents in the view.


Settings

Label


Required. The name for the Snap. You can modify this to be more specific, especially if you have more than one of the same Snap in your pipeline.

Minimum memory (MB)


Default value: 500
Example: 750

If the available memory falls below this value while the Snap is processing input documents, the Snap stops fetching the next input document until more memory is available. Set this value to 0 to disable the feature.

Minimum free disk space (MB)


Default value: 500
Example: 750

If the free disk space falls below this value, the Snap stops processing input documents until more free disk space is available. Set this value to 0 to disable the feature.

Out-of-resource timeout (minutes)


Default value: 30
Example: 20

If the Snap pauses for longer than this value while waiting for more memory to become available, it throws an exception to prevent the system from running out of memory or disk space.
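
The interaction between a minimum-resource setting and the timeout can be pictured with the following sketch. It is an assumption-laden illustration only: the function `wait_for_resource`, its parameters, and the polling approach are hypothetical and do not describe how the Snap is actually implemented.

```python
import time
from typing import Callable


def wait_for_resource(
    available_mb: Callable[[], float],
    minimum_mb: float,
    timeout_minutes: float,
    poll_seconds: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Pause until available_mb() reaches minimum_mb, or raise on timeout."""
    if minimum_mb == 0:
        return  # a value of 0 disables the check
    deadline = clock() + timeout_minutes * 60
    while available_mb() < minimum_mb:
        if clock() >= deadline:
            raise TimeoutError("resource did not become available in time")
        sleep(poll_seconds)


# Hypothetical readings: memory recovers on the third check.
readings = iter([300, 400, 800])
wait_for_resource(
    lambda: next(readings),
    minimum_mb=500,
    timeout_minutes=30,
    sleep=lambda seconds: None,  # skip real waiting in this demonstration
)
```

Passing the clock and sleep functions as parameters is a design choice made here only so the sketch can be exercised without real delays.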

Example


A simple pipeline for this Snap would include:

File Reader + CSV Parser + Unique + CSV Formatter + File Writer
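
As a rough analogy outside SnapLogic, the same five stages can be mimicked with Python's standard library. Everything here is an illustrative assumption: the in-memory CSV text is hypothetical sample data, `io.StringIO` stands in for the files handled by the File Reader and File Writer, and `itertools.groupby` stands in for the Unique Snap because it also collapses only consecutive equal items.

```python
import csv
import io
from itertools import groupby

# File Reader: hypothetical CSV content held in memory instead of a file.
source = io.StringIO("id,name\n1,Ann\n1,Ann\n2,Bob\n2,Bob\n")

# CSV Parser: turn the text into a stream of rows.
rows = csv.reader(source)

# Unique: keep one row from each run of adjacent, identical rows.
unique_rows = (row for row, _group in groupby(rows))

# CSV Formatter + File Writer: serialize the remaining rows.
target = io.StringIO()
csv.writer(target, lineterminator="\n").writerows(unique_rows)
output_text = target.getvalue()
```

Each stage consumes the previous one lazily, so rows flow through one at a time rather than being loaded all at once.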