Join

Overview

Snap type

Transform

Description

This Snap joins two or more data streams. It supports inner, left outer, and outer joins. If the input data streams are sorted (ascending or descending), it operates as a streaming Snap with highly optimized performance. If the data streams are not sorted, either place a Sort Snap in front of the Join Snap or select Unsorted for the Sorted streams property. All documents in the same input view must have the same set of fields; otherwise, the field names in the output documents may be assigned incorrectly.

  • Expected upstream Snaps. Sort, Mapper, CSV Parser, JSON Parser, XML Parser, or any Snap with a document output view.
  • Expected downstream Snaps. CSV Formatter, JSON Formatter, XML Formatter, or any Snap with a document input view.
  • Expected input. All documents in the same stream should have the same set of fields, regardless of whether their values are null.
  • Expected output. Data joined from input document streams. Field names in the left input data are passed to the output data as is. For field names in the right input document streams, if a field name conflicts with a field name in the left input data, it is prefixed with its input view name. If there is no conflict, the field names in the right input documents are used in the output data without any modification.
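The naming rule for output fields can be sketched in Python as follows. This is only an illustration of the rule described above, not the Snap's implementation. The function name `merge_field_names` and the sample documents are hypothetical, and the underscore separator between the view name and the field name is an assumption inferred from the `right_id` style names in the examples on this page.

```python
def merge_field_names(left_doc, right_doc, right_view):
    """Combine one left and one right document using the naming rule above."""
    out = dict(left_doc)  # left field names pass through unchanged
    for name, value in right_doc.items():
        # Prefix the right field only when its name collides with a left field.
        key = f"{right_view}_{name}" if name in left_doc else name
        out[key] = value
    return out


left = {"id": 1, "field1": "a"}
right = {"id": 1, "city": "Oslo"}
joined = merge_field_names(left, right, "right")
print(joined)  # {'id': 1, 'field1': 'a', 'right_id': 1, 'city': 'Oslo'}
```

Only `id` is prefixed here, because `city` does not conflict with any left field name.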
Prerequisites

All documents in the same stream should have the same set of fields.

Known Issue, support and limitations

Known Issue: When the upstream Snaps of the Join Snap include Copy, Router, Aggregate, or similar Snaps, the data flow of one branch in the Pipeline is likely to be blocked until another branch finishes streaming its documents. The Join Snap might hang if any of its upstream Snaps is on a blocked branch.

Workaround: Set Sorted streams to Unsorted in the Join Snap so that it buffers all documents from all input views internally. This unblocks the document flow of all the upstream branches. The internal sorters sort the input documents from the input views into the local temporary stage.

Limited support in Ultra Task Pipelines:

Works in Ultra only if exactly one of the input views of the Join Snap is connected to the unlinked input view. All other input views of the Join Snap must reach the end of their input documents.

Examples
Allowed: One of the input views of the Join Snap is fed by an upstream File Reader Snap.

Not Allowed: You make a copy of the unlinked input stream and connect both of the resulting output views to a Join Snap.

Account

Accounts are not used with this Snap.

Views
Input

This Snap has two or more document input views.

The input data schema from the Snaps upstream of the Join Snap must be consistent within each input view to produce the expected joined output data. Otherwise, the Snap might output unexpected joined data. See the examples below for more information.

Workaround:

You can insert a Mapper Snap to add missing fields with null values to fix the inconsistent input schema.
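The effect of that Mapper step can be sketched in Python. This is a hypothetical illustration of the normalization, not Mapper configuration: the function name `fill_missing_fields` and the sample documents are assumptions, and the sketch keeps only the listed fields.

```python
def fill_missing_fields(docs, fields):
    """Return copies of docs in which every name in `fields` is present.

    Missing fields are added with a null (None) value. Fields that are
    not listed in `fields` are dropped in this simplified sketch.
    """
    return [{name: doc.get(name) for name in fields} for doc in docs]


docs = [{"id": 1, "field1": "a"}, {"id": 2, "field2": "b"}]
fixed = fill_missing_fields(docs, ["id", "field1", "field2"])
print(fixed)
# [{'id': 1, 'field1': 'a', 'field2': None}, {'id': 2, 'field1': None, 'field2': 'b'}]
```

After this step, every document in the view carries the same key set, which is the condition the Join Snap expects.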

Output

This Snap has exactly one document output view.

Error

This Snap has at most one document error view and produces zero or more documents in the view. The error view behavior of this Snap is different from other Snaps. Error view documents do not contain error, reason, resolution, and stack trace. The left input documents are passed to the error view without any modification if there are no matches in the inner join operation and the Unmatched data to error view property is selected. When the error view is not open (that is, Stop pipeline execution is selected for the error view in the View pane), the input document written to the error view will not stop the pipeline and the pipeline execution will continue.

 

Settings

Label

Required. The name for the Snap. You can modify this to be more specific, especially if you have more than one of the same Snap in your pipeline.

Join Type


 

Required. The type of join to execute. The options available include:

  • Inner
  • Left outer
  • Outer
  • Merge

Default value: Inner

If you select Merge, the documents from the input views are merged into one document. You do not have to specify any other join properties when merging documents.


The rows in the left table are merged with the rows in the right table in the merge operation. If the right table has fewer rows than the left table, null is added in the output document for the remaining rows.
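A rough Python sketch of this row-by-row merge is shown below. It rests on assumptions that this page does not confirm: that rows are paired by position, and that "null is added" means the right-side fields are filled with null once the right input runs out. The function name `merge_streams` and the `right_fields` parameter are hypothetical.

```python
def merge_streams(left_docs, right_docs, right_fields):
    """Pair left and right documents by position and merge each pair."""
    merged = []
    for i, left in enumerate(left_docs):
        if i < len(right_docs):
            right = right_docs[i]
        else:
            # Assumed behavior: pad the missing right row with nulls.
            right = {name: None for name in right_fields}
        merged.append({**left, **right})
    return merged


left = [{"input": 42}, {"input": 43}]
right = [{"output": 84}]
print(merge_streams(left, right, ["output"]))
# [{'input': 42, 'output': 84}, {'input': 43, 'output': None}]
```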

Join paths


JSON paths to use for left and right sides of the join. 

Each row in the table defines a relationship between a left field and one of the right fields.
If there are N input views, N-1 rows are required to define each join path relationship. Therefore, if there are M relationships, M*(N-1) rows are required to define all of them. For example, with 4 input views and 3 join paths, 9 rows (3 x (4-1)) are required.
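The row count can be checked with a short calculation. The helper name `join_path_rows` is hypothetical and only restates the M*(N-1) formula above.

```python
def join_path_rows(input_views, relationships):
    """Rows needed in the Join paths table: M * (N - 1)."""
    if input_views < 2:
        raise ValueError("a join needs at least two input views")
    return relationships * (input_views - 1)


print(join_path_rows(4, 3))  # 9
print(join_path_rows(2, 1))  # 1
```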

To use a partial set of join path relationships, use multiple Join Snaps.

Default value: [None]

Left path

Required. The JSON path to a value in a document of the first input view. One of the suggested field names should be selected. This property does not support expressions.
Example: $customer_id
Default value: [None]

Right input view

Required. The name of the right input view, which is the second or any subsequent input view.
You may use Suggest to select the right input view names.
Example: input1
Default value: [None] 

Right path

Required. The JSON path to a value in a document of the second or any subsequent input view. One of the suggested field names should be selected. This property does not support expressions.
Example: $customer_id
Default value: [None]

Sorted streams

Required. How the input data is sorted. Options available are Ascending, Descending, or Unsorted. If Unsorted is selected, the Snap sorts the input data streams before it starts the join operation.

Default value: Ascending
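To illustrate why sorted inputs allow a streaming join, here is a minimal sort-merge inner join in Python. It is a sketch of the general technique under stated assumptions, not the Snap's actual algorithm: both inputs are plain lists already sorted ascending by their join keys, join values are non-null, and the field-name prefixing described earlier is left out. The function name `sorted_inner_join` and the sample data are hypothetical.

```python
def sorted_inner_join(left_docs, right_docs, left_key, right_key):
    """Inner-join two lists that are already sorted ascending by join key."""
    out = []
    i = j = 0
    while i < len(left_docs) and j < len(right_docs):
        lk, rk = left_docs[i][left_key], right_docs[j][right_key]
        if lk < rk:
            i += 1  # left value has no partner yet, advance left
        elif lk > rk:
            j += 1  # right value has no partner, advance right
        else:
            # Find the group of right documents that share this key.
            group_end = j
            while group_end < len(right_docs) and right_docs[group_end][right_key] == lk:
                group_end += 1
            # Pair every left document with that key against the whole group.
            while i < len(left_docs) and left_docs[i][left_key] == lk:
                for right in right_docs[j:group_end]:
                    out.append({**left_docs[i], **right})
                i += 1
            j = group_end
    return out


left = [{"customer_id": 1, "name": "Ann"}, {"customer_id": 2, "name": "Bo"}]
right = [
    {"customer_id": 2, "total": 10},
    {"customer_id": 2, "total": 5},
    {"customer_id": 3, "total": 7},
]
result = sorted_inner_join(left, right, "customer_id", "customer_id")
print(result)
# [{'customer_id': 2, 'name': 'Bo', 'total': 10}, {'customer_id': 2, 'name': 'Bo', 'total': 5}]
```

Each input is read once from front to back, and only the current group of right documents with equal keys has to be held at a time.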

Null greater


If selected, null values are considered greater than non-null values. In conjunction with Sorted streams:
  • If selected and Sorted streams is Ascending, nulls appear at the end of the list.
  • If selected and Sorted streams is Descending, nulls appear at the beginning of the list.
  • If not selected and Sorted streams is Ascending, nulls appear at the beginning of the list.
  • If not selected and Sorted streams is Descending, nulls appear at the end of the list.

Default value: Not selected
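The four combinations above can be reproduced with a small Python sketch. The function name `sort_with_nulls` and its parameters are hypothetical; the code only models the ordering rule, not how the Snap sorts internally.

```python
def sort_with_nulls(values, descending=False, null_greater=False):
    """Sort values, placing nulls (None) according to the Null greater rule."""
    non_null = sorted((v for v in values if v is not None), reverse=descending)
    nulls = [v for v in values if v is None]
    # Nulls land at the end when they rank highest in an ascending sort
    # or lowest in a descending sort.
    nulls_last = null_greater != descending
    return non_null + nulls if nulls_last else nulls + non_null


values = [3, None, 1]
print(sort_with_nulls(values, descending=False, null_greater=True))   # [1, 3, None]
print(sort_with_nulls(values, descending=True, null_greater=True))    # [None, 3, 1]
print(sort_with_nulls(values, descending=False, null_greater=False))  # [None, 1, 3]
print(sort_with_nulls(values, descending=True, null_greater=False))   # [3, 1, None]
```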

Unmatched data to error view


If selected, unmatched left input documents are passed to the error view. This applies only when the Join type is Inner.

Default value: Not selected (false)
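The routing of unmatched left documents can be sketched as follows. This Python illustration assumes a single join key shared by both inputs and uses a dictionary lookup for brevity, which is not necessarily how the Snap matches documents. The function name `inner_join_with_errors` and the sample data are hypothetical.

```python
def inner_join_with_errors(left_docs, right_docs, key):
    """Inner join that also collects left documents without a match."""
    right_by_key = {}
    for right in right_docs:
        right_by_key.setdefault(right[key], []).append(right)

    output, errors = [], []
    for left in left_docs:
        matches = right_by_key.get(left[key], [])
        if not matches:
            errors.append(left)  # passed through without modification
        for right in matches:
            output.append({**left, **right})
    return output, errors


left = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bo"}]
right = [{"id": 1, "total": 10}]
output, errors = inner_join_with_errors(left, right, "id")
print(output)  # [{'id': 1, 'name': 'Ann', 'total': 10}]
print(errors)  # [{'id': 2, 'name': 'Bo'}]
```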

Null-safe access


If selected, the Snap will ignore missing data when accessing the join path. For example, a join path is '$id', but the 'id' key does not exist in the input data. In this case, the Snap will assume its value is null and continue. If unselected, the Snap will write an error to the error view for missing data and stop the execution.

Default value: Not selected (false)
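The difference between the two settings can be sketched in Python. The function name `read_join_value` is hypothetical, the sketch handles only a top-level key such as `id` for the path `$id`, and raising an exception stands in for the Snap writing an error and stopping.

```python
def read_join_value(doc, key, null_safe):
    """Read the join value for `key`, honoring the Null-safe access setting."""
    if key in doc:
        return doc[key]
    if null_safe:
        return None  # treat the missing key as a null value and continue
    raise KeyError(f"join path '${key}' not found in input document")


doc = {"name": "Ann"}
print(read_join_value(doc, "id", null_safe=True))  # None
```

With `null_safe=False`, the same call raises `KeyError` instead of returning a value.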

Available Memory Threshold (%)

The Snap keeps all the Right input view documents with the same join-path values in memory until the join operation is done for the specific join-path values. When the Right input view has more than 10,000 input documents with the same join-path values, the Snap checks whether the available memory is below the threshold value specified in this property. If so, it starts storing input data in local temporary files to prevent the node from running out of memory.

The Snap may fail if there isn't sufficient free local disk space in the node.
Snap instances that existed before this property was introduced execute with a default value of 20% until the property value is updated.
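The decision rule described for this property can be restated as a small Python predicate. This is a paraphrase of the text above, not the Snap's code: the function name `should_spill_to_disk` and its parameters are hypothetical, and how the Snap actually measures available memory is not covered here.

```python
def should_spill_to_disk(group_size, available_memory_pct, threshold_pct):
    """True when the Snap would start writing input data to temporary files.

    group_size: Right input documents sharing the same join-path values.
    available_memory_pct: currently available memory, in percent.
    threshold_pct: value of Available Memory Threshold (%).
    """
    return group_size > 10_000 and available_memory_pct < threshold_pct


print(should_spill_to_disk(10_001, 15, 20))  # True
print(should_spill_to_disk(10_000, 15, 20))  # False
print(should_spill_to_disk(50_000, 35, 20))  # False
```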

Snap Execution

Select one of the three modes in which the Snap executes. Available options are:

  • Validate & Execute: Performs limited execution of the Snap, and generates a data preview during Pipeline validation. Subsequently, performs full execution of the Snap (unlimited records) during Pipeline runtime.
  • Execute only: Performs full execution of the Snap during Pipeline execution without generating preview data.
  • Disabled: Disables the Snap and all Snaps that are downstream from it.

Temporary Files

During execution, data processing on Snaplex nodes occurs principally in memory as streaming and is unencrypted. When a dataset exceeds the available compute memory, the Snap writes Pipeline data, unencrypted, to local storage to optimize performance. These temporary files are deleted when the Snap/Pipeline execution completes. You can configure the location of the temporary data in the Global properties table of the Snaplex's node properties, which can also help avoid Pipeline errors caused by insufficient space. For more information, see Temporary Folder in Configuration Options.

Examples

Providing Consistent Input Schema to Get Correct Joined Output

This example Pipeline demonstrates how to get the expected joined output from two inputs by providing a consistent input schema. We use the Join Snap to accomplish this task.

First, we provide input documents with consistent input schema using JSON Generator Snaps as shown below. 

Left Input Schema | Right Input Schema

Next, we connect the Join Snap to the upstream Snaps and configure it as shown below. 

Upon validation, the Snap displays the following joined output as a result of providing a consistent input schema. The key names in the right view are the same as in the left view; hence, the Join Snap prefixes them with the right view label in the output data: right_id, right_field1, and right_field2.

Download this Pipeline.

Inconsistent Joined Output Data as a Result of Inconsistent Input Schema

This example Pipeline demonstrates how the Join Snap generates inconsistent joined output when the input documents have an inconsistent schema.

First, we provide input documents with inconsistent input schema using JSON Generator Snaps. 
The complete key set of input documents is {“id”, “field1”, “field2”}. Note that the field2 entry is missing in the first left input document, the field1 entry is missing in the second left input document, and so on. These missing entries cause unexpected results in the joined output data.

Left Input Schema | Right Input Schema

Next, we connect the Join Snap to the upstream Snaps to join the left and right input documents. To that end, we configure the Snap as shown below.


Upon validation, the Snap displays an inconsistent output result because the input documents contain incomplete key sets. The value right_c appears in the column field1, and the values right_d and right_h appear in the column field2, whereas they should be under the right_field1 and right_field2 columns respectively.

Download this Pipeline.


Merging Documents 

You can use the Merge Join type to merge documents.

In this example:

  • $input comes into input0 and contains a value of 42
  • $output comes into input1 and contains a value of 84

  • Join type: Merge
  • Join Path:
    • Left path: (expression toggle on) $input
    • Right input view: input1
    • Right path: (expression toggle on) $output
  • Sorted streams: Ascending

Merge Result:

 

If the Join type is changed to Outer, the result is:

 

If the Join type is changed to Left Outer, the result is:


If the Join type is changed to Inner, no results are returned because there are no shared records.

If a $day field with the value today is added to both inputs, set the paths to $day and the Join type to Inner. The result looks like this:


 

Snap Pack History

Release | Snap Pack Version | Date | Type | Updates
4.26 Patch 426patches12086

Latest
Fixed an issue with the Join Snap, where it exhausted the memory while buffering millions of objects.
4.26 Patch 426patches11725
Latest
Fixed an issue with the Join Snap where the upstream document flow of the right view is blocked by the left view, which hung the Join Snap.
4.26 main11181
 
Stable
  • Enhanced the JSON Splitter Snap with a new field Show Null Values for Include Paths that enables the Snap to show key-value entries of the null values for the objects added to the Include Paths field in the output document.
  • Enhanced the Join Snap with a new field Available Memory Threshold (%) that enables the Snap to keep all the Right-input view documents with the same join-path values in memory until the join operation is done for the specific join-path values.
4.25 Patch 425patches10663
 
Latest

Fixed an issue in the CSV Formatter Snap, where even if the Ignore empty stream checkbox is not enabled, the Snap did not produce an empty binary stream output in case there is no input document.

4.25 Patch 425patches10152
 
Latest
  • Replaced the Strict XSD output field with Map input to first repeating element in XSD field in the XML Formatter Snap. If selected, the Snap ignores the root element from the XSD file.

  • Enhanced the CSV Parser Snap with a new checkbox Preserve Surrounding Spaces that enables you to preserve the surrounding spaces for the values that are non-quoted.
4.25 Patch 425patches9815
 
Latest

Fixed a ClassCastException error in the Avro Parser Snap and handling of the map, fixed, enum, and bytes data types in the Avro Formatter and Avro Parser Snaps.

4.25 Patch 425patches9749
 
Latest

Enhanced XML Parser Snap to recognize input headers when defining inbound schema. 

4.25 Patch 425patches9638 - Latest

Reverts the Join Snap to the 4.24 release behavior. This is in response to an issue encountered in the Join Snap in the 4.25 release version (main9554), which can result in incorrect outputs from all Join Types. 425patches9638 is the default version for both stable and latest Transform Snap Pack versions for orgs that are on the 4.25 release version. No action is required by customers to receive this update and no impact is anticipated.

4.25 main9554
 
Stable

Enhanced the Group By N Snap with the following settings: 

  • Memory Sensitivity: The Snap's response to the memory changes. 
  • Group Size: The maximum number of input documents to be grouped into a single output document.
  • Min Group Size: The minimum number of input documents to be grouped into a single output document.
4.24 Patch 424patches8938
 
Latest
  • Fixed the timestamp issue in the JSON Formatter Snap that changed the time zone offset to include colon by default after upgrading to 4.24.

  • Fixed the null pointer exception at runtime in the Fixed Width Parser Snap by setting the Trim column data field to false for empty columns.
  • Enhances the Group By N Snap to process the records efficiently by adding the Flush Timeout field that enables the Snap to flush a partial group of records if the time specified in this field passes with no new input.
4.24 main8556

Stable
Upgraded with the latest SnapLogic Platform release.
4.23 Patch 423patches7958
  
Latest

Fixes an issue in the JSON Splitter Snap by logging an error when the matcher does not find a pattern.

4.23 Patch 423patches7898
 
Latest
  • Fixes an issue in the In-Memory Lookup Snap to correctly handle the Join path in the format like $['join path'].

  • Fixes an issue in the XSLT Snap, wherein null binary header values are now converted to blank strings when injecting them as parameters in the stylesheet.
4.23 Patch 423patches7792
 
Latest

Fixes an issue in the XML Formatter Snap when it fails to convert input JSON data, with JSON property having a special character as its prefix, to XML format by sorting the elements.

4.23 Patch 423patches7753
Latest

Fixes an issue with the JSON Splitter Snap's behavior in Ultra Pipelines that prevents processed requests to be acknowledged and removed from the FeedMaster queue, resulting in retries of requests that are already processed successfully.

4.23 main7430
 
Stable

Enhances the JSON Formatter Snap to render groups output from upstream (Group by) Snaps with one document per group and a new line per group element. You can now select the Format each document and JSON lines check boxes simultaneously.

4.22 Patch 422patches6395
Latest

Fixes the JSON Splitter Snap data corruption issue by copying the data in JSON Splitter Snap before sending it to other downstream Snaps.

4.22 Patch 422patches6505
Latest

Fixes the XML Generator Snap issue reflecting empty tags and extra space by removing the extra space in the XML output.

4.22 main6403

Latest
Upgraded with the latest SnapLogic Platform release.
4.21 Patch 421patches5901
Latest

Enhances the JSON Generator Snap to include pass-through functionality where the Snap embeds the upstream input document under the original field of the output document along with other records.

4.21 Patch 421patches5848
 
Latest
  • Fixes the Sort Snap that fails while performing sorting, displaying a NoClassDefFoundError.  
  • Fixes the Excel Formatter Snap that fails to create an empty worksheet when you do not select the Ignore Empty Stream checkbox.
4.21 snapsmrc542 - Stable

Adds support in the Mapper Snap to display schemas with complex nesting. For example, if Snaps downstream from the

4.20 Patch transform8792 - Latest

Resolves the NoClassDefFoundError in the Join Snap on Windows Snaplex instances.

4.20 Patch transform8788 - Latest

Resolves the NullPointerException in the Join Snap on Windows Snaplex instances.

4.20 Patchtransform8760-Latest
  • Fixes an issue with the CSV Parser Snap wherein the Snap fails when it is configured in the new Form UI if the Contains headers and Validate headers fields are not selected but empty rows exist in the Column names field.
  • Fixes the JSON Formatter Snap to filter the input data based on the value specified in the Content field in the Snap settings when the JSON lines field is also selected. Previously, the Snap was writing the entire input data to the output file.

The JSON Formatter Snap output now includes only those fields from the input file that are specified in the Content field under Settings. 

If your Pipelines use the JSON Formatter Snap with the JSON lines field selected, they may fail to execute correctly if the Content field mentions a specific object or field but the downstream Snap is expecting the entire data. Hence, for backward compatibility, you must review the entries in the Content field based on the desired output, as follows: 

  • Enter to include all the fields in the output.
    OR
  • Enter or select the specific fields to restrict the output to. 

The behavior of the Snap when the JSON lines field is not selected is correct and remains unchanged.  

The Example: Filtering Output Data Using the JSON Formatter Snap's Content Setting further illustrates the corrected vs the old behavior.

  • Fixes an issue in the XML Generator Snap due to which the custom XML data in the Edit XML field got ignored when inbound schema and root XML element were also provided, and instead, the output was generated from upstream data. For Pipelines with a standalone XML Generator Snap, a "Premature end of file" error was displayed. 

XML Generator Snap behavior might break existing Pipelines

The XML Generator Snap now gives precedence to any custom XML data that is provided over data coming from upstream Snaps, to generate the output. 

Existing Pipelines using the XML Generator Snap may fail in the following scenarios. Use the resolution provided to update the Snap settings based on the XML data you want to pass to downstream Snaps.

Breaking Scenario

Resolution

Downstream Snaps expect XML output from upstream Snaps, but custom XML data exists in the XML Generator Snap.

  • To use custom XML data, ensure that the Inbound schema and XML root element are specified and custom XML data is entered using the Edit XML option.
  • To use data from upstream Snaps, ensure that the Inbound schema and XML root element fields are blank and no custom XML data exists in the XML editor.


In the XML Generator Snap settings, the XML root element and Inbound schema are specified but no custom XML data is provided. This will generate a validation error. 
In the XML Generator Snap settings, custom XML data is provided and Validate XML checkbox is selected, but XML root element and Inbound schema are not specified. This will generate a validation error.
4.20 Patchtransform8738-Latest
  • Fixes the error that occurs while transforming an XML document into a string or while using the Excel Formatter Snap by retaining only one version of the Saxon library – net.sf.Saxon: Saxon-HE:9.6.0-10 across the Transform Snap Pack.
  • Fixes null handling in the Aggregate Snap whereby changing the order of entries in the Aggregate fields fieldset results in different outputs when the input documents contain one or more null values.
  • Fixes the issue with the Excel Parser Snap whereby the Snap does not parse the date and time correctly for custom date formats.
4.20 snapsmrc535 - Stable
Upgraded with the latest SnapLogic Platform release.
4.19 Patchtransform8280-Latest

Fixed an issue with the Excel Parser Snap wherein the Snap incorrectly outputs Unformatted General Number format. 

4.19snaprsmrc528-Stable

The output of the AVG function in the Aggregate Snap now rounds up all numeric values that have more than 16 digits.

4.18 Patch transform8199-Latest

Fixed an issue with the Excel Multi Sheet Formatter Snap wherein the Snap fails to create sheets in the expected order.

4.18 Patchtransform7994-Latest

Added a field, Round dates, to the Excel Parser Snap which enables you to round numeric excel data values to the closest second.

4.18 Patchtransform7780-Latest
  • Fixed an issue with the Excel Parser Snap by upgrading Apache POI to version 3.14, wherein the Snap is unable to parse an excel file with custom namespaces.
  • Fixed an issue with the Excel Parser Snap wherein the Snap converts real numbers to two decimal places when formatted for currency. A new property Cell formatting now supports unformatted outputs.
4.18 Patchtransform7741-Latest

Fixed an issue with the Sort and Join Snaps wherein the platform removes all temp files at the end of Pipeline execution.

4.18 Patchtransform7711-Latest

Fixed an issue with the XML Parser Snap wherein a class cast exception occurs when the Snap is configured with a Splitter and Namespace Context.

4.18snapsmrc523-Stable
  • Enhanced the sort feature of the Sort Snap to support specifying sort order at the field level. Added two new fields, Sort Path and Sort Order, to the Sort paths property; renamed the Sort order property to Sort order (Global).
  • Enhanced the Group Size property of the Group by N Snap to be expression enabled and removed the upper bound threshold (of 10000).
4.17 Patch Transform7431-Latest

Added a new field, Ignore empty stream, to the Avro Formatter Snap that writes an empty array in the output in case of no input documents.

4.17 PatchTransform7417-Latest

Added a new field, Format as canonical XML, to the XSLT Snap that enables canonical XML formatting.

4.17 PatchALL7402-Latest

Pushed automatic rebuild of the latest version of each Snap Pack to SnapLogic UAT and Elastic servers.

4.17 snapsmrc515-Stable
  • Added the Derive schema from the sample size menu option to the JSON Formatter Snap, whereby you select the sampling size of the schema from the data source.
  • Added the capability to select either document (previously supported) or binary data (new) for your input and output views to the Mapper Snap.
  • Added the Snap Execution field to all Standard-mode Snaps. In some Snaps, this field replaces the existing Execute during the preview check box.
4.16 Patchtransform7093-Latest

Fixed an issue with the XSLT Snap failure by enhancing the Saxon version.

4.16 Patchtransform6962-Latest
  • Added a new property, Escape special characters, to the XML Generator Snap that enables you to escape XML special characters in template variable values.
  • Added a new property, Header size error policy, to the CSV Formatter and CSV Parser Snaps. The property enables you to handle header size errors.
4.16 Patchtransform6869-Latest

Fixed an issue with the XML Parser Snap wherein XSD with annotations were incorrectly interpreted.

4.16 snapsmrc508 - Stable
Upgraded with the latest SnapLogic Platform release.
4.15 Patchtransform6736-Latest

Fixed an issue wherein the XML Generator Snap was unable to escape some of the special characters.

4.15 Patchtransform6680-Latest

Fixed an issue with the Excel Parser Snap wherein headers were not parsed properly and header columns were also maintained in an incorrect order.

4.15 Patchtransform6402-Latest

Fixed an issue with an object type in the Fixed Width Formatter Snap.

4.15 Patchtransform6386-Latest

Fixed an issue with the Fixed Width Parser Snap wherein Turkish characters caused incorrect parsing of data on Windows plex.

4.15 Patchtransform6321-Latest

Fixed an issue with the XML Parser Snap wherein the Snap failed to get data types from XSD.

4.15 Patchtransform6265-Latest

Fixed an issue with the XML Parser wherein XML validation was failing if the XSD contained xsd:include.

4.15snapsmrc500-Stable

Added a new property, Binary header properties, to the Document to binary Snap that enables you to specify the properties to add to the binary document's header.

4.14 Patchtransform6098-Latest

Fixed an issue wherein the XML Parser Snap was not maintaining the datatype mentioned in the XSD file.

4.14 Patchtransform6080-Latest

Fixed an issue with the Avro Formatter Snap wherein the Snap was failing for a complex JSON input.

4.14 PatchMULTIPLE5732-Latest

Fixed an issue with S3 file reads getting aborted intermittently because of incomplete consumption of input stream.

4.14 Patchtransform5684-Latest

Fixed the JSON Parser Snap that causes the File Reader Snap to fail to read S3 file intermittently with an AbortException error.

4.14 snapsmrc490 - Stable
Upgraded with the latest SnapLogic Platform release.
4.13 Patchtransform5411-Latest

Fixed an issue in the CSV Formatter Snap where the output showed extra values that were not provided in the input.

4.13 snapsmrc486 - Stable
Upgraded with the latest SnapLogic Platform release.
4.12 Patch transform5005-Latest

Fixed an issue with the Excel Parser Snap that drops columns when the value is null at the end of the row.

4.12 Patchtransform4913-Latest

Fixed an issue in the Excel Formatter Snap, wherein opening an output stream prematurely causes the Box Write Snap to fail. Excel Formatter now awaits the first input document, before opening an output stream.

4.12 Patchtransform4747-Latest

Fixed an issue that caused the Join Snap to go out of memory.

4.12snapsmrc480-Stable
  • Improved performance for both Aggregate and Join Snaps.

  • Added Select All and Deselect All buttons to the Mapper table

4.11 Patchtransform4558-Latest

Enforced UTF-8 character encoding for the Fixed Width Formatter Snap.

4.11 Patchtransform4343-Latest

Enhanced enum labels on the Binary to Document and the Document to Binary Snaps for the Encode or Decode property with DOCUMENT and NONE options.

4.11 Patch transform4361-Latest

Fixed a sorting issue with the Join and Sort Snaps where the end of the file was not detected correctly.

4.11 Patchtransform4281-Latest

Added support on the Kryo serialization for UUID and other types to the Sort, Join and In-Memory Lookup Snaps.

4.11snapsmrc465--
  • Updated the JSON Formatter Snap with Binary Header and Content properties to allow the Snap to pass through binary header data.
  • Enhanced the In-memory Lookup Snap for Performance Optimization.
  • Enhanced the Sort Snap for Performance Optimization.
  • Updated the Excel Parser Snap with the Insert null columns property to insert 'null' for missing columns at the end.
4.10 Patchtransform4058--

Addressed an issue with the Excel Parser Snap that failed with an out-of-memory error when using large input data (e.g., 191 MB).

4.10 Patchtransform3956--

Conditional Snap: fixed an issue with the "Null-safe access" Snap Setting not being respected for return values.

4.10 snapsmrc414--
  • CSV Parser Snap updated with Ignore empty stream property to support passing the empty data.
  • XML Formatter Snap updated with Max schema levels property to support the outbound schema XSD containing import statements.
  • Addressed spelling errors in messages across the Snap Pack.
4.9.0 Patchtransform3343--

Join Snap - All input documents from all input views should be consumed before the end of Snap execution.

4.9.0 Patchtransform3281--

Made all four output views in Diff snap as mandatory.

4.9.0 Patchtransform3264--

Made all four output views in Diff snap as mandatory.

4.9.0 Patchtransform3220--

CSV Parser Snap - A new Snap property Ignore empty data with true default is added

4.9.0 Patchtransform3019--

Addressed an issue with the transform2989 build.

4.9.0 Patchtransform2989--

Addressed an issue with Excel Parser not displaying the most recent cached value for vlookups containing missing external references.

4.9.0
--

Introduced Encrypt Field and Decrypt Field Snaps.

4.8.0 Patch transform2956--

[CSVParser] Fixed an issue where an empty Quote Character config field was defaulting to the unicode quote character U+0000 (null). This caused issues if the input CSV had U+0000 characters in it.

4.8.0 Patchtransform2848--
  • Addressed an issue with Excel Multi sheet Formatter creating unreadable data on the output view when there is no input document.
  • Addressed an issue with CSV Formatter Snap failing on empty input data.
  • Addressed an issue with an upstream Script Snap throwing a NoClassDefFoundError in the Sort Snap.
4.8.0 Patchtransform2768--

Addressed an issue with CSV Parser causing a spike in CPU usage.

4.8.0 Patchtransform2736--

Addressed an issue with Excel Formatter dropping the first record when Ignore empty stream is selected.

4.8.0
--
  • Updated the CSV Formatter Snap with Newline property. This lets you select newline characters as a line break.
  • CSV Formatter Snap: Snap-aware error handling policy enabled for Spark mode. This ensures the error handling specified on the Snap is used.
  • Mapper: Snap-aware error handling policy enabled for Spark mode. This ensures the error handling specified on the Snap is used.
4.7.0 Patchtransform2549--

Addressed an issue with Excel Formatter altering decimal numbers to text.

4.7.0 Patchtransform2344--

Resolved an issue with validation of pipelines taking more time than executing a pipeline when a large amount of data is used.

4.7.0 Patchtransform2335--

Resolved an issue with XML Parser failing with error: 'Maximum attribute size (524288) exceeded'.

4.7.0 Patch transform2206--

Resolved an issue with JSON Generator failing with an "Invalid UTF-8 middle byte 0x70" error on Windows.

4.7.0
--
  • Updated Sort Snap with the property Maximum memory %. (Also released as a patch to 4.6.0)
  • Updated JSON Formatter Snap with the JSON lines field, an option that outputs each document fully in a single line followed by a newline. 
  • Updated the JSON Splitter Snap with  Json Path property. 
  • Updated the Excel Parser Snap with the new field, Header row.
4.6.0 Patchtransform2018

Resolved an issue where Excel Parser did not reliably set header row when "contains headers" is checked

4.6.0 Patchtransform1905

Resolved an issue in Sort Snap where buffer size should not be fixed for optimal performance

4.6.0 Patchtransform1901

  • In-Memory Lookup Snap: Resolved an issue where the internal lookup table size was fixed at 1GB.
  • New property "Maximum memory %" allows users to use larger memory for optimal performance.
4.6.0 Patchtransform1871

Resolved a performance issue in Excel Multi Sheet Formatter.

4.6.0


  • The following Snaps now support error view in Spark mode: Aggregate, CSV Parser, JSON Parser, Join, Unique.
  • Added an option to enable output readability. The newly added option is Pretty-print.
  • Resolved an issue in Excel Formatter Snap that formatted decimal numbers as text in the output XLS file.
  • Resolved an issue in XML Generator Snap that expired the internal cache after 50 minutes.  
  • Resolved an issue in Join Snap that caused unexpected failures while performing join or duplicated column values.
  • Resolved an issue in Sort Snap that caused unexpected results when provided with multiple sort paths.
  • You can now expand/collapse all nodes of a schema tree by holding the Shift key while clicking on the plus (+) sign.
  • Schemas with fewer than 1000 entries will now auto-expand when searching/filtering.
4.5.1


  • XML Formatter is enhanced to transform an XML output to canonical XML. For more information, see XML Formatter. 
  • Join Snap is updated to support a new join type, Merge. For more information, see Join.
  • Fixed an error in XML Parser Snap that caused the Snap to fail for a valid specification, a hierarchy with an XPath expression.
  • Fixed an error in CSV Parser Snap that caused the skip lines option to not take effect in Spark mode.
  • Fixed an error in Join Snap that caused unintended values in the output document if one of the inputs has zero documents. 
  • Fixed an error that displayed an incorrect resolution for a failure when a Join Snap has unconnected input views.
  • Fixed an error in Sort Snap that occurred because the upstream Snap was producing Long instead of Big integer. 
  • Fixed an error in Sort Snap to ensure appropriate error handling. 
  • Fixed an error in Join Snap, in Ultra mode, that caused execution failure due to lineage information.
  • The version of the Join Snap deprecated in 4.4.1 is no longer available in the catalog.
  • CSV Parser Snap is updated to ensure appropriate parsing of input data. The Snaps that worked prior to 4.4.1 were failing in the 4.4.2 version. This has been rectified. 
4.5.0


  • Spark Mode enabled for Join, JSON Splitter, JSON Formatter, JSON Parser.
  • Resolved an exception in the Join Snap.
  • Resolved an issue with the Mapper Snap that occurred while evaluating an expression and reporting its error.
  • Resolved an issue with CSV Formatter adding extra data to output when used in Spark mode.
  • Resolved an issue with CSV Parser Snap that occurred when the second input view did not contain a header.
4.4.1


  • NEW! Excel Multi Sheet Formatter
  • NEW! In-Memory Lookup
  • Deprecated: The Join Snap has been deprecated and is labeled as such. Existing pipelines using the deprecated Join Snap will continue to function as-is without any issues, including during pipeline execution and pipeline deployment across orgs. This Snap will continue to be supported for two Platform Level (x.x) Sprints, which is approximately six months. After that point, no bug fixes will be applied to the Snap. We recommend moving to the new Snaps (Join and In-Memory Lookup), which together have significant performance and efficiency benefits.
  • Name change: The Multi-Join Snap has been renamed to Join.
  • Aggregate: Resolved an issue with Sum and Avg Functions not returning correct output in Spark mode.
  • CSV Parser: A second optional input view was enabled to add the ability to associate an external schema file.
  • CSV Formatter: A new property, Quote mode, specifies how the quote character should be used in formatting the CSV data.
  • JSON Formatter: Resolved an issue with the Ignore Empty Streams option not working.
  • JSON Parser: Resolved an issue with error handling when an empty file is passed in an Ultra task.
  • Resolved an issue with XSLT Snap incorrectly completing successfully when a truncated XML document was passed.
  • ZipFile Read:
    • Resolved an issue with Content-length, content-location showing differently for different protocols.
    • Resolved an issue with spaces in file and folder names displaying as "%20".
4.4


  • NEW! Pivot Snap
  • Resolved an issue with JSON Splitter modifying documents that were already sent out.
  • Spark support added to the Aggregate, CSV Formatter, CSV Parser, Mapper, Sort, and Unique Snaps.
4.3.2


  • NEW! Group By Fields and Group By N Snaps added.
  • Behavior change: CSV Formatter: Expression support added to Delimiter and Quote Character properties. You will need to toggle on the expression button to use an expression in this field.
  • Behavior change: CSV Parser: Expression support added to Delimiter, Quote Character, and Escape Character properties. You will need to toggle on the expression button to use an expression in these fields.
  • Resolved an issue with Encrypt and Compress showing 352 bytes of data when CSV Formatter has Ignore empty stream enabled.
  • Resolved an issue with Sequence Snap failing if trying to read parameters from upstream.
  • Resolved an issue with Sort Snap not failing with warning/error when incorrect (non-existent) field is referenced in the Sort path.
  • Resolved an issue with CSV Formatter failing with error Input length = 1. 
  • (Data) removed from the Mapper Snap name.
  • Resolved an issue with Excel Parser Snap ignoring a column if it contained null values when selecting the Contains headers option.
  • Resolved an issue with Excel Formatter truncating leading zeros in string-type values.
  • Resolved an issue with Binary to Document Snap not routing the Content into the error view.
  • The Multi Join Snap has been pulled from the Snap catalog when in SnapReduce mode as it is not yet supported.
4.3.1


  • Addressed the following defects:
    • Defect: XML Generator failing with a null pointer exception when simultaneous triggered tasks are executed.
    • Defect: JSON Formatter Snap fails to close the JSON binary stream with the closing bracket ']' when the user stops the pipeline.
4.3.0


  • Excel Parser now has an Evaluate formulas option. Select this option if the cell formulas are to be evaluated and results displayed instead of raw formulas.
  • Multi Join Snap
    • Enhanced to make the Left and Right path schema-aware. 
    • Resolved an issue when attempting a join where one or both inputs are empty.
4.2.2


  • NEW! Multi Join Snap. This streaming Snap supports joins of two or more sorted data streams.
  • Sequence Snap now supports pass through.
  • Resolved an issue where the Excel Parser Snap always returned the header from the first sheet for all sheets if the Excel file format was the older .xls format. Additionally, improved the error message when the given sheet does not exist.
  • Sort Snap: Null greater option added to let you indicate if null values should be treated as greater than non-null values when sorted.
  • Resolved an issue with the Aggregate Snap failing with a null pointer exception.
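
To make the Null greater option of the Sort Snap concrete, here is a minimal Python sketch of sorting documents so that null values rank either above or below all non-null values. The function sort_documents and its null_greater parameter are hypothetical names chosen for illustration; this is not the Snap's implementation, and it assumes the non-null values are mutually comparable.

```python
def sort_documents(documents, field, null_greater=True):
    """Sort dicts by field; None sorts last when null_greater is True, first otherwise."""

    def key(doc):
        value = doc.get(field)
        is_null = value is None
        # The first tuple element places nulls after (True) or before (False) non-nulls.
        rank = is_null if null_greater else not is_null
        # Substitute 0 for None so that a value is never compared against None.
        return (rank, 0 if is_null else value)

    return sorted(documents, key=key)


if __name__ == "__main__":
    # Hypothetical sample documents, for illustration only.
    rows = [{"n": 3}, {"n": None}, {"n": 1}]
    print(sort_documents(rows, "n", null_greater=True))
    print(sort_documents(rows, "n", null_greater=False))
```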

August 7, 2015 (2015.25/4.2.1)

  • Resolved an issue in Excel Parser where Headers were not displayed properly if a few columns were blank.
  • Resolved a null pointer exception in the Aggregate Snap. The resolution included the addition of a new field, Sorted stream, where you designate whether or not the incoming data is sorted, as presorted data will allow the Snap to run more efficiently. The default value is Ascending for newly placed Aggregate Snaps; existing Aggregate Snaps will default to Unsorted for compatibility with previous functionality.
  • NEW! Transcoder Snap. Resolves an issue where preview of Snaps was not able to handle special characters.
  • Resolved an issue with XML Formatter including "Metadata" in the output.

June 27, 2015 (2015.22)

  • Sort Snap: performance improvements for sorting large amounts of records.
  • XML Formatter:
    • resolved an error where documents could not be converted to XML since the last Snap update.
    • resolved an exception found within this Snap.
  • JSON Splitter: resolved an issue with the Snap not honoring the Null-safe access option
  • Excel Parser: resolved an issue where headers were not displayed properly if a few columns were blank.

June 6, 2015 (2015.20)

  • CSV Parser: You can now select the character set in which your incoming CSV data is encoded. 
  • CSV Formatter: You can now select the character set in which to encode your CSV data. You can also choose to write the BOM (byte order mark) for the character set selected.
  • NEW!: Avro Parser and Formatter
  • Excel Parser Snap: resolved failure due to file size
  • XML Formatter:
    • resolved "Cannot read attribute: element has children or text" error
  • An additional property, Strict XSD Output, was added to extract the root element name and the wrapper name from the XSD file.

May 2, 2015

  • Excel Parser: exception resolved
  • Join: bug fixes for certain datasets in excess of 1 million rows
  • JSON Generator: enhanced error handling
  • Mapper (Data): bug fixes
  • Sequence Snap: enhanced error handling
  • Sort: bug fixes for certain datasets in excess of 1 million rows
  • XML Parser: enhancements supporting empty documents

March 2015

  • CSV Formatter: Exception resolved when the quote character and delimiter were the same.
  • XML Formatter: The datetime output now outputs as UTC date/time
  • XML Parser: Examples added to the documentation.

January 2015

  • Binary to Document Snap: The optional property Ignore empty stream was added.
  • CSV Formatter: Performance enhancements
  • CSV Parser:
    • Quote character is no longer a required field.
    • Optional Header size error policy field added. This setting lets you decide how to handle the error. In the newly provided csv, the last record is too big. Existing pipelines with this Snap will default to Both as this was the previous behavior.
  • Excel Parser: Now writes bad records to the document error view if configured.
  • XML Parser:
    • The splitting of an XML document functionality has been improved. You can provide the splitter expression (a path to the element you want to split an XML on) in the XML Parser Snap and it will split the document (regardless of the size of the original XML).
    • Performance improvements.

December 20, 2014



  • Join: a new version of the Join Snap has been created. The existing version will continue to work, but will be marked as Deprecated. Any future enhancements will be in the newer Join. The new Join will prefix top level attributes from the right with right_ and top level attributes from the left with left_.
  • CSV Parser & Formatter have been enhanced to utilize Unicode for delimiters.
  • Fixed Width Formatter: Now supports Null-safe access
  • Updates to XML Parser and XML Formatter
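
The left_ and right_ prefixing described above for the new Join can be pictured with a small Python sketch. It only shows how the top-level attributes of one matched pair of documents would be named in the output; how documents are matched is out of scope here. The function merge_with_prefixes and the sample documents are hypothetical and do not represent the Snap's code.

```python
def merge_with_prefixes(left_doc, right_doc):
    """Combine two matched documents, prefixing top-level keys with left_ or right_."""
    merged = {}
    for key, value in left_doc.items():
        # Only top-level attribute names change; nested values are kept as they are.
        merged["left_" + key] = value
    for key, value in right_doc.items():
        merged["right_" + key] = value
    return merged


if __name__ == "__main__":
    # Hypothetical matched pair, for illustration only.
    left = {"id": 1, "address": {"city": "Oslo"}}
    right = {"id": 1, "total": 42}
    print(merge_with_prefixes(left, right))
```

Prefixing both sides means that fields sharing a name, such as id in this sample, cannot overwrite one another in the joined document.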

November 2014

Aggregate now provides 2 new functions: concat and unique_concat


October 18, 2014

  • XML Generator
    • XSD support added
    • Pass-through support added by default. The Snap will pass-through the input data if an input view is provided. This is not configurable.
  • Optional Ignore empty stream setting added to XML Formatter and Excel Formatter
  • Addressed an issue where Conditional Snap was unable to process null Return Value 

Fall 2014

  • Mapper Snap
    • The Data Snap has been replaced by the Mapper Snap
    • The functionality of the Mapper table was enhanced with the following:
      • Mapper enhanced to support Structure-mapping.
      • Performance enhancements now load the Mapper table sooner.
      • You can now map items by dragging an item from one schema to the other.
      • The Mapping root option now lets you process an array more easily by setting this option to the top node of that array.
      • Mapper field highlight visibly indicates the mapping between input and output schemas.
    • Data preview of input and output data.
  • Moved XML & FixedWidth Snaps to Transform Snap Pack

August 2014

  • Fixed Width Parser enhanced to be able to skip any row based on a pattern.
  • CSV Parser now validates that the header field matches the declaration if Contains header is selected.

July/Summer 2014 

  • Binary to Document (Beta), Document to Binary (Beta)

June 30, 2014

  • Addressed the following issues:
    • JSON Generator not streaming.
    • CSV Parser with empty skip lines field leads to null pointer exception during validation.
    • CSV Formatter does not parse headers with space.
    • XML Formatter: Snap fails converting document into XML.


May 2014

  • NEW! CSV Generator Snap
  • NEW! Sequence Snap
  • NEW! Constant Snap
  • Conditional Snap updated with Null-safe access.
  • The Data Snap no longer needs an input. With no input view specified, it generates a downstream flow of one row.

April 2014

  • Aggregate Snap updated to handle aggregating Strings, Date, Time, and DateTime.
  • Excel Parser Snap enhanced to determine column names in a spreadsheet.

March 2014

  • NEW! Unique
  • JSON Splitter Snap was enhanced with an Include parent option, which includes the hierarchy of the list specified by the JSON path by adding it to each document.
  • Snaps like JSON Parser that have a JSON path expression now support pipeline parameter substitution in that expression.
  • Excel Parser Snap property labels updated for clarity. "Skip Lines" labels are replaced with Start Row and End Row.

January 2014

  • NEW! Microsoft Excel Formatter & Parser
  • UTF-16LE & UTF-16BE Unicode support was added to the JSON Formatter & Parser and the CSV Formatter & Parser Snaps.

December 2013

  • NEW! Fixed Width Formatter and Parser
  • CSV Parser now adds "field00x" to column names for tables with empty column names when Contains headers is false.

November 2013

  • NEW! Conditional
  • NEW! Diff

August 2013

  • NEW! Aggregate: The Aggregate Snap applies aggregate functions on input data with Group By support.
  • NEW! JSON Generator: Generates JSON as a document for the next Snap in the pipeline.
  • Join: The Join Snap was enhanced to support Outer joins.

July 2013

JSON Splitter: The JSON Splitter Snap splits a list of values into separate documents. 
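
The splitting behavior described above can be sketched in Python as follows. This is a simplified illustration only: the real Snap addresses the list with a JSON path, whereas this sketch assumes the list sits under a single top-level key. The function split_list and the "value" wrapper key are hypothetical.

```python
def split_list(document, list_field):
    """Return one output document per element of the list stored under list_field."""
    values = document.get(list_field, [])
    # Wrap non-dict elements so every output is a document (dict);
    # the "value" key is a hypothetical choice for this sketch.
    return [v if isinstance(v, dict) else {"value": v} for v in values]


if __name__ == "__main__":
    # Hypothetical input document, for illustration only.
    order = {"order": 7, "items": [{"sku": "A"}, {"sku": "B"}]}
    for doc in split_list(order, "items"):
        print(doc)
```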


Initial release

  • Transform structure of an input schema
    • Source schema introspection (Suggest)
    • Target schema introspection
  • Join Snap for merging 2 data streams
  • XSLT Snap