Overview

Info

This feature is in private beta. Email support@snaplogic.com for an invitation to gain early access.

SnapLogic® Cache Pipelines is a subscription feature that lets you cache reference data in memory for efficient retrieval in your pipelines. A cache pipeline generates the reference information, which is stored in memory for the running pipeline, and Expression Language lookup functions retrieve it. Cache Pipelines are supported on all Snaplex types and work with any endpoint.

Looking up values in an external system is costly when the same table is queried repeatedly for the same reference fields. An in-memory data store lets Snaps in your pipeline look up keys and values at runtime, which reduces external lookups and improves pipeline performance.

To use the feature, you create a generic pipeline (called a cache pipeline in this article) and specify it in your processing pipeline (called a main pipeline in this article). The main pipeline can then access the output documents of the cache pipeline using the retrieve expression language functions.

If you frequently need data from an external system, such as the department for a set of user IDs, Cache Pipelines simplifies the process. You specify a cache pipeline in your main pipeline, and the main pipeline reads the cached data instead of querying the external system each time.
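The idea behind the department example can be sketched outside SnapLogic. The following Python is an illustration only, not SnapLogic expression language; the field names `user_id` and `department` and the sample records are assumptions made for the sketch. It loads the reference documents into memory once and then enriches each incoming document without another external query.

```python
# Hypothetical reference documents, standing in for a cache pipeline's flat output.
reference_docs = [
    {"user_id": 101, "department": "Finance"},
    {"user_id": 102, "department": "Engineering"},
]

# Build the in-memory store once, keyed by the lookup field.
department_by_user = {doc["user_id"]: doc["department"] for doc in reference_docs}


def enrich(doc):
    """Return a copy of doc with its department added from memory (None if unknown)."""
    return {**doc, "department": department_by_user.get(doc["user_id"])}


# Documents flowing through the main pipeline (also hypothetical).
main_docs = [{"user_id": 102}, {"user_id": 101}, {"user_id": 999}]
enriched = [enrich(d) for d in main_docs]
```

Each lookup is a dictionary access, so the cost of reaching the external system is paid once when the reference data is loaded rather than once per document.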

Key Components

The Cache Pipelines feature adds a property to the Edit Pipeline dialog box, where you specify a pipeline to run and an alias for referencing its output documents. The alias is then used in the retrieve expression language function to access the cached data. You can also define the cache pipeline's path with an expression based on pipeline parameters.

Key Features:

  • Define a pipeline path and a corresponding cache lookup alias.

  • Use the retrieve expression language functions to look up records based on a cache alias and a filter object.

  • Store up to 100,000 records in a pipeline-local, in-memory lookup.

  • Parameter values from the main pipeline override the cache pipeline's parameter values.
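As a rough model of the alias-plus-filter lookup and the record limit listed above, consider the Python sketch below. It is not the SnapLogic API: the class `PipelineCache` and its `load` and `retrieve` methods are hypothetical names, and raising an error when the limit is exceeded is an assumption, since this article does not describe what the product does in that case.

```python
MAX_CACHED_RECORDS = 100_000  # record limit stated in this article


class PipelineCache:
    """Hypothetical model of a pipeline-local, in-memory lookup keyed by alias."""

    def __init__(self):
        self._by_alias = {}

    def load(self, alias, records):
        """Store the records for an alias, replacing any earlier data for it."""
        records = list(records)
        if len(records) > MAX_CACHED_RECORDS:
            # Assumption: the sketch rejects oversized caches outright.
            raise ValueError(f"cache '{alias}' exceeds {MAX_CACHED_RECORDS} records")
        self._by_alias[alias] = records

    def retrieve(self, alias, filter_obj):
        """Return every cached record whose fields equal all entries in filter_obj."""
        return [
            record
            for record in self._by_alias.get(alias, [])
            if all(record.get(key) == value for key, value in filter_obj.items())
        ]
```

For example, after `cache.load("departments", docs)`, the call `cache.retrieve("departments", {"user_id": 102})` returns the cached documents whose `user_id` is 102.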

Workflow

  1. Create a main pipeline.

  2. Add a new pipeline, or use an existing pipeline, that contains the data you want to cache for reference in the main pipeline.

  3. In the main pipeline, set the pipeline path to the pipeline from Step 2 and specify an alias for the cached data.

  4. Add a Mapper that retrieves data from the cache specified in Step 3.

  5. Run the main pipeline.
Prerequisites

Your Org must have a subscription to the Cache Pipelines feature.

Usage Guidelines

  • A cache pipeline must have the following input and output views:

    • 0 unconnected input views (Binary or Document)

    • 1 unconnected Document output view

    • 0 unconnected Binary output views

  • The output document must have a flat structure. None of the primary key values can contain objects or arrays.

  • Pipeline parameters are passed from the main pipeline to the cache pipelines it specifies. They are also passed through the Pipeline Execute Snap to the child pipeline and to the child pipeline's cache pipeline.

  • If only the alias or only the path is defined, that cache pipeline is skipped.

  • Cache pipeline paths can be defined using expressions that reference pipeline parameters, but not expression library functions.

  • If the same alias is defined for multiple cache pipelines, the last pipeline defined populates the data for that alias.
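Two of the guidelines above, skipping incomplete definitions and letting the last definition of an alias win, can be pictured with a short Python sketch. The function name `resolve_cache_definitions`, the `alias` and `path` dictionary keys, and the sample paths are all hypothetical; this models the stated rules and is not how SnapLogic implements them.

```python
def resolve_cache_definitions(definitions):
    """Map each alias to the pipeline path that would populate it.

    Entries missing either the alias or the path are skipped, and a later
    entry for the same alias replaces an earlier one.
    """
    resolved = {}
    for definition in definitions:
        alias = definition.get("alias")
        path = definition.get("path")
        if not alias or not path:
            continue  # only one of alias/path is defined: skip this entry
        resolved[alias] = path  # last definition of an alias wins
    return resolved


# Hypothetical definitions, in the order they appear in the main pipeline.
definitions = [
    {"alias": "departments", "path": "shared/dept_v1"},
    {"alias": "regions"},          # no path: skipped
    {"path": "shared/orphan"},     # no alias: skipped
    {"alias": "departments", "path": "shared/dept_v2"},
]
resolved = resolve_cache_definitions(definitions)
```

Here `resolved` maps `departments` to `shared/dept_v2` only, because the two incomplete entries are ignored and the second `departments` entry overrides the first.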

Limitations

  • If the last Snap in your cache pipeline produces an output document that contains an array or object, you must restructure your data or remove those fields to produce a flat document structure.

  • If an Ultra Task has a cache pipeline, updates to the cache pipeline do not force the Ultra Task to restart.

  • Dynamic validation is limited to 50 documents.

  • Pipelines with cache pipelines must run to completion. Accordingly, we do not recommend using messaging service Snaps (such as JMS or Kafka) or the Ultra Polling design.

  • The following features are untested in combination with Cache Pipelines:

    • Error Pipelines

    • Resumable Pipelines

    • ELT Snap Packs
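For the first limitation in this list, one possible way to restructure a nested document is to flatten it into dotted keys. The Python below is a sketch of that idea only. The helper name `flatten` and the dotted-key naming scheme are assumptions, and in a real pipeline you would do the equivalent reshaping with Snaps before the cache pipeline's output view.

```python
def flatten(doc, prefix=""):
    """Flatten nested objects and arrays into a single-level dict with dotted keys.

    Empty objects and arrays contribute no keys to the result.
    """
    flat = {}
    items = doc.items() if isinstance(doc, dict) else enumerate(doc)
    for key, value in items:
        name = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, (dict, list)):
            flat.update(flatten(value, name))  # recurse into nested structures
        else:
            flat[name] = value
    return flat


# Hypothetical nested document and its flat equivalent.
nested = {"id": 1, "dept": {"name": "Ops", "tags": ["a", "b"]}}
flat_doc = flatten(nested)
```

After this call, `flat_doc` contains the keys `id`, `dept.name`, `dept.tags.0`, and `dept.tags.1`, each with a scalar value.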