In-Memory Lookup


Snap type:

Transform

Description:

This Snap joins left and right input data streams. It uses the right input data as an in-memory lookup table.

  • Expected upstream Snaps: Any Snap with a document output view, such as CSV Parser, JSON Parser, Mapper.
  • Expected downstream Snaps: Any Snap with a document input view, such as CSV Formatter, JSON Formatter, Mapper.
  • Expected input:   
    • Input data streams may be unsorted.
    • The right input document stream is loaded in memory as a lookup table while the left input document stream is not stored in the Snap.
    • The JOIN operation starts when the right input document stream ends.
    • All input document data should be of a flat map data type.
  • Expected output: Each left input document is joined with the right input data if a match is found. Otherwise, the left input document is written to the output view unjoined. If the Single document output property is selected, only one document is written to the output view for each left input document, regardless of the number of matches found in the in-memory lookup table.
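
The behavior described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the Snap's implementation: the function name `lookup_join` and the sample field names are hypothetical, documents are modeled as flat dicts, and column-name collisions are ignored.

```python
from collections import defaultdict


def lookup_join(left_docs, right_docs, left_key, right_key):
    """Yield joined documents (hypothetical helper, not the Snap's API)."""
    # Load the entire right stream into memory before reading the left stream.
    table = defaultdict(list)
    for right in right_docs:
        table[right[right_key]].append(right)

    # Stream the left documents one at a time; none of them is stored.
    for left in left_docs:
        matches = table.get(left[left_key])
        if not matches:
            # No match: pass the left document through without a join.
            yield dict(left)
            continue
        for right in matches:
            yield {**left, **right}


left = [{"customer_id": 1, "item": "pen"}, {"customer_id": 3, "item": "ink"}]
right = [{"customer_id": 1, "name": "Ann"}, {"customer_id": 2, "name": "Bob"}]
joined = list(lookup_join(left, right, "customer_id", "customer_id"))
```

Because the table is keyed on the right join field, neither input needs to be sorted, but the whole right stream must fit in memory.
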
Prerequisites:

Enough free memory should be available to load all right input data to the in-memory lookup table.

Support and limitations:

Supported in Ultra Tasks only when the Single document output field is selected. With that setting, exactly one document is written to the output view for each input document, which Ultra Pipelines require.

Account: 

Accounts are not used with this Snap.

Views:
Input: This Snap has exactly two document input views. You may want to edit the right input view label in the Views section of the Snap, because that label is used as a prefix during the JOIN operation when the same column name exists in the left input data.
Output: This Snap has exactly one document output view.
Error: This Snap has at most one document error view and produces zero or more documents in the view.
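
To illustrate why the right input view label matters, the following Python sketch prefixes right-side column names that collide with left-side names. The exact form of the prefixed name is not stated here, so the `<label>_<column>` scheme, the function name `merge_with_prefix`, and the sample data are all assumptions made for illustration.

```python
def merge_with_prefix(left, right, right_label):
    """Merge two flat dicts, prefixing colliding right-side names.

    Hypothetical helper; the naming scheme is an assumption.
    """
    merged = dict(left)
    for name, value in right.items():
        if name in merged:
            # ASSUMPTION: "<label>_<name>" is illustrative only;
            # the Snap's real prefix format may differ.
            merged[f"{right_label}_{name}"] = value
        else:
            merged[name] = value
    return merged


order = {"id": 7, "name": "order-7"}
customer = {"cust": 7, "name": "Ann"}
merged = merge_with_prefix(order, customer, "customers")
```

A descriptive label such as `customers` makes the prefixed field easier to recognize downstream than a generic default label would.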

Settings

Label

Required. The name for the Snap. You can modify this to be more specific, especially if you have more than one of the same Snap in your pipeline.

Join paths

Required. Field names to use for left and right sides of join. Each row in the table defines a relationship between a left field and a right field. The Snap supports flat map data only for all input documents, and a structured JSON path like '$customer.address' is not supported.

Left path

Required. Field name in the left input data. It can be selected from the suggested list. This property does not support expressions.

Example: customer_id, $customer_id, _leftFieldName (for pipeline parameter with the expression button enabled)

Default value:  None, the expression toggle enabled

Right path

Required. Field name in the right input data. Suggestions for the Right path are not available yet, except for pipeline parameters. This property does not support expressions.

Example: customer_id, $customer_id, _rightFieldName (for pipeline parameter with the expression button enabled)

Default value: None, the expression button enabled
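
When the Join paths table has more than one row, a natural reading is that a left and a right document match only if every row's pair of fields is equal. The Python sketch below shows that reading as a composite key. It assumes that `customer_id` and `$customer_id` name the same flat field (as the examples above suggest) and that all rows must match; the helper names `field_name` and `make_key` are hypothetical.

```python
def field_name(path):
    """Strip an optional leading '$' so both spellings name the same field."""
    return path[1:] if path.startswith("$") else path


def make_key(doc, paths):
    """Build a hashable composite key from one side's join paths."""
    return tuple(doc.get(field_name(p)) for p in paths)


# Each row pairs a left path with a right path (sample values only).
join_paths = [("$customer_id", "cust_id"), ("region", "$region")]
left_paths = [left_path for left_path, _ in join_paths]
right_paths = [right_path for _, right_path in join_paths]

left_doc = {"customer_id": 42, "region": "EU", "item": "pen"}
right_doc = {"cust_id": 42, "region": "EU", "name": "Ann"}
keys_match = make_key(left_doc, left_paths) == make_key(right_doc, right_paths)
```

A tuple key works as a dictionary key, so the same function can index the in-memory lookup table on the right side and probe it from the left side.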

Single document output

If selected, exactly one document is written to the output view for each left input document. If more than one row in the lookup table matches the left input data, the first matching row is joined with it. If there is no match, the left input data is written to the output view unchanged. Leave this property selected if the pipeline is executed in Ultra Tasks mode.

If unselected, each matching row is joined with the left input data, so the number of output documents may exceed the number of left input documents.

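
The effect of this setting on the output count can be sketched as follows. The function `join_outputs` and the sample documents are hypothetical; the sketch assumes that `matches` lists the lookup rows in the order they were loaded.

```python
def join_outputs(left, matches, single_document_output=True):
    """Return the output documents for one left document (illustrative only)."""
    if not matches:
        # No match: the left document is written as-is in either mode.
        return [dict(left)]
    # Selected: join only the first match. Unselected: join every match.
    chosen = matches[:1] if single_document_output else matches
    return [{**left, **match} for match in chosen]


left = {"customer_id": 1, "item": "pen"}
matches = [{"phone": "111"}, {"phone": "222"}]
one = join_outputs(left, matches, single_document_output=True)
many = join_outputs(left, matches, single_document_output=False)
unmatched = join_outputs(left, [], single_document_output=False)
```

With the setting selected, the output count always equals the left input count, which is the one-document-in, one-document-out behavior that Ultra Tasks depend on.
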
Minimum memory (MB)

If the available memory is less than this value while the Snap builds the in-memory lookup table from the right input documents, the Snap stops fetching right input documents until more memory is available. This feature is disabled if this value is 0.

This Snap loads all right input documents into the in-memory lookup table before it starts to perform the JOIN operation. Therefore, if the input data in the right input view exceeds the available memory, it may cause an out-of-memory failure. This field helps reduce the possibility of out-of-memory failures.

Default value: 200 MB
Example: 500 MB

Out-of-memory timeout (minutes)

If the Snap pauses longer than this value while waiting for more memory to become available, it throws an exception to prevent the system from running out of memory.

Default value: 30 minutes
Example: 10 minutes
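
The interaction between Minimum memory (MB) and Out-of-memory timeout (minutes) can be sketched in Python. This is an illustration under assumptions, not the Snap's code: `load_lookup_table` is a hypothetical name, the caller supplies `free_memory_mb` as a probe because how the Snap measures free memory is not described here, and the polling interval and the use of `MemoryError` are arbitrary choices.

```python
import time


def load_lookup_table(right_docs, free_memory_mb, min_memory_mb=200,
                      timeout_minutes=30, poll_seconds=1.0,
                      clock=time.monotonic, sleep=time.sleep):
    """Load right documents, pausing while free memory is below the minimum."""
    table = []
    for doc in right_docs:
        if min_memory_mb > 0:  # a value of 0 disables the check
            wait_started = clock()
            while free_memory_mb() < min_memory_mb:
                if clock() - wait_started > timeout_minutes * 60:
                    # Give up instead of waiting indefinitely.
                    raise MemoryError("timed out waiting for free memory")
                sleep(poll_seconds)
        table.append(doc)
    return table


# Usage with a stub probe that always reports plenty of free memory.
table = load_lookup_table([{"id": 1}, {"id": 2}], free_memory_mb=lambda: 1024)
```

The pause only helps if memory can actually be freed elsewhere on the node; the timeout bounds how long the pipeline waits before failing.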

Snap Execution

Select one of the three modes in which the Snap executes. Available options are:

  • Validate & Execute: Performs limited execution of the Snap, and generates a data preview during Pipeline validation. Subsequently, performs full execution of the Snap (unlimited records) during Pipeline runtime.

  • Execute only: Performs full execution of the Snap during Pipeline execution without generating preview data.

  • Disabled: Disables the Snap and all Snaps that are downstream from it.

Default Value: Execute only
Example: Validate & Execute

Examples


Example pipeline: Views page with inputs renamed

[Image: In-Memory Lookup overview]
[Image: Views page with inputs renamed]
[Image: Left input data]
[Image: Right input data]
[Image: Output data with Single document output selected]
[Image: Output data with Single document output not selected]

Release | Snap Pack Version | Date | Type | Updates

November 2024 | 439patches29078 | Latest

Fixed an issue with the CSV Parser Snap that introduced unexpected characters into the records and output data because of incorrect handling of the delimiter.

November 2024 | main29029 | Stable

Updated and certified against the current SnapLogic Platform release.

August 2024 | 438patches28073 | Latest

Fixed an issue with the JSON Generator and XML Generator Snaps that caused unexpected output displaying '__at__' and '__h__' instead of '@' and '-' respectively because the Snap could not update them to their original values after the Velocity library upgrade.

August 2024 | 438patches27959 | Latest

Fixed an issue with the Sort Snap where the Snap could not sort files larger than 52 MB. This fix also applies to the Join Snap.

August 2024 | main27765 | Stable

Upgraded the org.json.json library from v20090211 to v20240303, which is fully backward compatible.

May 2024 | 437patches26643 | Latest
  • Fixed an issue with the Sort Snap that displayed an error when estimating the size of the input document provided by the upstream S3 Browser Snap.
  • Fixed an issue with the Parquet Formatter Snap that was unable to route errors to the error view.
May 2024 | 437patches26453 | Latest
  • Added expression support to the Skip lines field in the CSV Parser Snap to enable passing pipeline parameters and upstream values. 

  • Fixed an issue with the XML Parser Snap that caused an error when using the Splitter option in the Snap settings. 

May 2024 | main26341 | Stable
  • Added Parquet Parser and Parquet Formatter Snaps to the Transform Snap Pack:
    • Parquet Parser: Reads the binary Parquet data and writes document data to the output.
    • Parquet Formatter: Reads the document data and writes it to the output in binary Parquet format.
  • Enhanced the JSON Splitter Snap to capture metadata and lineage information from the input document.

February 2024 | 436patches25564 | Latest

Fixed an issue with the JSON Formatter Snap that generated incorrect schema.

February 2024 | 436patches25292 | Latest

Fixed an out-of-memory error issue with the Aggregate Snap. This Snap no longer performs the presort for the input documents.

If the input documents are unsorted and GROUP-BY fields are used, you must use the Sort Snap upstream of the Aggregate Snap to presort the input document stream, and set the Sorted stream field to Ascending or Descending to prevent the out-of-memory error. However, if the total size of the input documents is expected to be small relative to the available memory, the Sort Snap is not required upstream.

Learn more about presorting unsorted input documents to be processed by the Aggregate Snap.

February 2024 | main25112 | Stable

Updated and certified against the current SnapLogic Platform release.

November 2023 | 435patches24802 | Latest

Fixed an issue with the Excel Parser Snap that caused a null pointer exception when the input data was an Excel file that did not contain a StylesTable.

November 2023 | 435patches24481 | Latest

Fixed an issue with the Aggregate Snap where the Snap was unable to produce the desired number of output documents when the input was unsorted and the GROUP-BY fields field set was used.

November 2023 | 435patches24094 | Latest

Fixed a deserialization issue for a unique function in the Aggregate Snap.

November 2023 | main23721 | Stable

Updated and certified against the current SnapLogic Platform release.

August 2023 | 434patches23076 | Latest

Fixed an issue with the Binary to Document Snap where an empty input document with Ignore Empty Stream selected caused the Snap to stop executing.

August 2023 | 434patches23034 | Latest
  • Fixed an issue with the Transform Snap Pack that caused an error when the input file was a binary JSON file that contained a string value of more than 20,000,000 characters.
  • Fixed a memory issue with the Aggregate Snap that occurred when using GROUP-BY fields.

August 2023 | 434patches22705 | Latest

Fixed an issue with the JSON Splitter Snap that caused the pipeline to terminate with excessive memory usage on the Snaplex node after the 4.33 GA upgrade. The Snap now consumes less memory.

August 2023 | main22460 | Stable

Updated and certified against the current SnapLogic Platform release.

May 2023 | 433patches22431 | Latest
  • Fixed an issue with the Excel Multi Sheet Formatter Snap that caused it to produce binary output data when there was no input document and Ignore empty stream was selected.
  • Introduced the following new Snaps:
    • GeoJSON Parser: Parses geospatial data from binary data input and outputs the contents as a GeoJSON document downstream.

    • WKT Parser: Parses geospatial data from binary data input and outputs the contents as a WKT (Well Known Text) document downstream.

May 2023 | 433patches21779 | Latest

The Decrypt Field and Encrypt Field Snaps now support CTR (Counter mode) for the AES (Advanced Encryption Standard) block cipher algorithm.

May 2023 | 433patches21586 | Latest

The Decrypt Field Snap now supports the decryption of various encrypted fields on providing a valid decryption key.

May 2023 | 433patches21461 | Latest

The following Transform Snaps include new fields to improve memory management: Aggregate, Group By Fields, Group By N, Join, Sort, Unique.

May 2023 | 433patches21336 | Latest

Fixed an issue with the AutoPrep Snap where dates could potentially be rendered in a currency format because currency format options were displayed for the DOB column.

May 2023 | 433patches21196 | Latest

Enhanced the In-Memory Lookup Snap with the following new fields to improve memory management and help reduce the possibility of out-of-memory failures:

  • Minimum memory (MB)

  • Out-of-memory timeout (minutes)

These new fields replace the Maximum memory % field.

May 2023 | main21015 | Stable

Upgraded with the latest SnapLogic Platform release.

February 2023 | 432patches20535 | Latest

Fixed an issue with the Encrypt Field Snap, where the Snap failed to support an RSA public key to encrypt a message or field. Now the Snap supports the RSA public key to encrypt the message.

February 2023 | 432patches20446 | Latest

The Join Snap is enhanced with the following:

  • The Pipeline Execution Statistics of the Join Snap now has a status message that displays the parameters - Free disk space, Available memory, and Average document size.

  • The internal sort buffer size is reduced to a minimum of 10MB when the available memory in the node becomes lower than 500MB to avoid the out-of-memory crash.

  • The internal sort buffer size is restored to its original size when the available memory becomes larger than 2GB.

  • We have improved the readability of the error message for the out of disk space on node error. The updated error message now provides clearer information and guidance for users, as shown below:
    Reason: Insufficient free disk space available to stage sort data into temporary files.
    Resolution:  Increase the amount of free disk space and try again.

February 2023 | 432patches20250 | Latest
  • Fixed an issue with the JSON Splitter Snap that was causing errors when using multiple repeated dots in the JSON Path.
  • The Sort Snap includes the following improvements:

    • The Maximum memory % field is revised to Maximum memory.

    • The new Maximum memory unit dropdown list lets you choose the unit, percentage (%) or MB, for better memory control.

February 2023 | 432patches20151 | Stable/Latest

Fixed an issue that occurred with the JSON Splitter Snap when used in an Ultra pipeline. The request was acknowledged before it was processed by the downstream Snaps, which caused a 400 Bad Request response.

February 2023 | 432patches20062 | Stable/Latest

Fixed the behavior of the JSON Splitter Snap for some use cases where its behavior was not backward compatible with the 4.31 GA version. These cases involved certain uses of either the Include scalar parents feature or the Include Paths feature.

February 2023 | 432patches19974 | Stable/Latest

Fixed the "Json Splitter expects a list" error by restoring the JSON Splitter Snap's previous behavior of handling the case where the document element referenced by the JSON Path to Split field is an object instead of a list or array.

Review your pipelines where this error occurred to check your assumptions about the input to the JSON Splitter and whether the value referenced by the JSON Path to Split field will always be a list. If the input is provided by an XML-based or SOAP-based Snap like the Workday or NetSuite Snaps, a result set or child collection that’s an array when there's more than one result or child will be an object when there's only one result or child. In these cases, we recommend using a Mapper Snap and the sl.ensureArray() function to ensure that the value being split by the JSON Splitter is always an array (even for the single element cases).

February 2023 | 432patches19918 | Stable/Latest
  • Fixed an issue with the CSV Formatter Snap where the Unicode character delimiters using [0-9a-f] did not work.

  • Fixed an issue with the JSON Splitter Snap that was generating null values for empty input data.

February 2023 | main19844 | Stable

Upgraded with the latest SnapLogic Platform release.

November 2022 | 431patches19441 | Stable

The Encrypt Field Snap supports decryption of encrypted output in Snowflake Snaps.

November 2022 | 431patches19385 | Latest

The Join Snap no longer fails with a null pointer exception when you set the Sorted streams field to Ascending.

November 2022 | 431patches19359 | Latest

The JSON Splitter Snap includes memory improvements and a new Exclude List from Output Documents checkbox. This checkbox enables you to prevent the list that is split from getting included in output documents, and this also improves memory usage.

November 2022 | main18944 | Stable
  • The Excel Formatter and Excel Multi Sheet Formatter Snaps now include a Convert formula strings to formulas checkbox.
  • The Mapper Snap now has a Sorted checkbox in the Input Schema and Target Schema panels, which allows you to sort the input and target schemas. When unchecked, the Snap unsorts the input and the target schema.

October 2022 | 430patches18800 | Latest

The Sort and Join Snaps now have improved memory management, allowing used memory to be released when the Snap stops processing.

October 2022 | 430patches18610 | Latest

The CSV Formatter and CSV Parser Snaps now support shorter values of Unicode characters.

 
October 2022 | 430patches18454 | Latest
  • The AutoPrep feature now includes the following new transformation options that enable you to:
    • Change the field data type from Data type menu.
    • Format dates and date Strings.
    • Rename a field.
    • Mask sensitive data using an MD5, SHA-1, SHA-256, or SHA-512 algorithm.
  • The data in the Preview data pane is now formatted to be easier to read, and the buttons have been changed to improve usability.

The CSV Parser Snap now parses data with empty values in the columns when using a multi-character delimiter.

September 2022 | 430patches18119 | Latest

The Transcoder Snap used in a low-latency feed Ultra Pipeline now acknowledges the requests correctly.

September 2022 | 430patches17802 | Latest

The Avro Parser Snap now displays the decimal number correctly in the output view if the column’s logical type is defined as a decimal.

September 2022 | 430patches17737 | Stable/Latest

AutoPrep now enables you to handle empty or null values.

September 2022 | 430patches17643 | Latest

The CSV Parser and CSV Formatter Snaps now support either \ or \\ for a single backslash delimiter; both previously failed.

September 2022 | 430patches17589 | Latest

The CSV Formatter Snap does not hang when running in specific situations involving multibyte characters in a long field. If you notice the CSV Formatter Snap is hung, we recommend that you update to the 430patches17589 version and restart your Snaplex.

August 2022 | main17386 | Stable
  • New Snap Application: The Auto Prep Snap provides a data preparation application where you can flatten structured data, include and exclude data fields, and change data types before forwarding the data for further processing.

  • The Hide whitespace option in the CSV Generator and JSON Generator Snaps allows you to hide the rendering of whitespace as symbols (dot or underscore) in the output that you may have in the CSV or JSON input documents.

  • The Render whitespace checkbox in the Mapper Snap enables or disables the rendering of whitespace in the input document. When a value in the Expression field has blank spaces (leading, trailing, or spaces in the middle of a string), the spaces are rendered as symbols (dot “.” or underscore “_”) in the output on selecting this checkbox.

  • The Excel Parser Snap includes the Custom Locale dropdown list that allows selecting a user-defined locale for formatting numbers.

  • The Selected fields in the Pivot Snap allow you to define fields to be unpivoted so that the remaining fields are automatically pivoted.

  • The XML Generator Snap includes examples on how to escape single (') and double (") quotes when used with elements or attributes.

4.29 Patch | 429patches16990 | Latest
  • Fixed an issue with the Aggregate Snap where the Snap failed to validate (after first successful validation) while using a field that may contain a date for MIN and MAX functions. The Snap now supports DATE-type fields.
  • Enhanced the Pivot Snap with the Treat selected fields as static checkbox that enables the Snap to treat the selected fields as static to preserve the structure of the selected fields while all other fields are pivoted.

4.29 Patch | 429patches16923 | Latest
  • Enhanced the CSV Formatter and CSV Parser Snaps to support multiple characters or strings as delimiters.
  • Fixed an issue with the Join Snap where the Snap displayed an incorrect error if the Left path or Right path field was expression-enabled or contained anything other than a field name (for example, $first + '2'), regardless of whether the Sorted streams field was set to Unsorted. The Snap now runs properly in these cases, whether the streams are sorted or unsorted, and displays a more informative error if a problem occurs during execution.

  • Fixed an issue with CSV Parser Snap where the Snap failed when more than six characters are used as delimiters. Now, the Snap executes properly when you use more than six characters as delimiters.

4.29 Patch | 429patches16521 | Latest
  • Removed the default value for the Root element field in the XML Formatter Snap.
  • Fixed an issue with the Transcoder Snap where the Input character-set field was not displaying the suggestions properly.
4.29 Patch | 429patches16026