Cache Pipelines (Private Beta)

Overview

SnapLogic® Cache Pipelines is a subscription feature that caches reference data in memory so that your pipelines can retrieve it efficiently. Because the reference data is held in memory, pipelines make fewer costly lookups against external systems, which improves pipeline performance.

Cache Pipelines helps when you frequently need data from an external system, for example, to look up department information for user IDs. You associate a cache pipeline with your main pipeline, and the main pipeline then reads the cached data rather than querying the external system repeatedly.

Key Components

In the Edit Pipeline dialog box, you specify a cache pipeline and an alias for referencing its output documents. You then use this alias in the retrieve expression language function to access the cached data. You can also define the cache pipeline's path with an expression based on the pipeline parameters.

Key Features:

  • Define the cache pipeline path and the cache lookup alias.

  • Look up records with the retrieve expression language function by passing the cache alias and a filter object.

  • Store up to 100,000 records in memory for lookup within the pipeline.

  • Parameter values from the main pipeline override the cache pipeline's parameters.
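
The lookup model described above (an alias, a filter object, and an in-memory record limit) can be pictured with a short Python sketch. This is not SnapLogic's implementation or the syntax of the retrieve function. The class `AliasCache`, its `retrieve` method, the sample field names, and the choice to return the first matching record are all assumptions made for illustration.

```python
from typing import Any, Dict, List, Optional

MAX_RECORDS = 100_000  # record limit stated in the feature list


class AliasCache:
    """Hypothetical in-memory store for the documents behind one cache alias."""

    def __init__(self, alias: str, documents: List[Dict[str, Any]]) -> None:
        if len(documents) > MAX_RECORDS:
            raise ValueError(f"cache '{alias}' exceeds {MAX_RECORDS} records")
        self.alias = alias
        self._documents = list(documents)

    def retrieve(self, filter_obj: Dict[str, Any]) -> Optional[Dict[str, Any]]:
        """Return the first document matching every key/value in filter_obj.

        Assumption: a filter object is a set of field/value pairs that must
        all match. Returns None when no document matches.
        """
        for doc in self._documents:
            if all(k in doc and doc[k] == v for k, v in filter_obj.items()):
                return doc
        return None


# Sample reference data (invented for this sketch).
departments = AliasCache(
    "departments",
    [
        {"user_id": 101, "department": "Finance"},
        {"user_id": 102, "department": "Engineering"},
    ],
)

match = departments.retrieve({"user_id": 102})
missing = departments.retrieve({"user_id": 999})
```

Here `match` holds the Engineering record and `missing` is `None`. The point is that each lookup reads from memory and does not call an external system.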

Prerequisites

Your Org must have a subscription to the Cache Pipelines feature.

Usage Guidelines

  • Cache Pipelines must have the following input and output views:

    • 0 unconnected input views (Binary or Document)

    • 1 unconnected Document output view

    • 0 unconnected Binary output views

  • The output document must have a flat structure. Values for primary keys cannot contain objects or arrays.

  • Pipeline parameters are passed to cache pipelines from the main pipelines that specify them. Parameters also pass through the Pipeline Execute Snap to the child pipeline and to the child pipeline’s cache pipeline.

  • If only the alias or only the path is defined (but not both), that cache pipeline is skipped.

  • Cache pipelines can be defined using expression properties that reference pipeline parameters, but not expression library functions.

  • If the same alias is defined for multiple cache pipelines, the last-defined pipeline populates the data for that alias.
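
Three of the guidelines above (incomplete definitions are skipped, the last definition of an alias wins, and main-pipeline parameter values take precedence) can be sketched in Python. The function names, the dictionary layout, and the sample paths are hypothetical. The key-by-key override is an assumption about how parameter precedence works, not documented behavior.

```python
from typing import Dict, List, Optional


def resolve_cache_definitions(
    definitions: List[Dict[str, Optional[str]]],
) -> Dict[str, str]:
    """Map each alias to the cache pipeline path that populates it."""
    resolved: Dict[str, str] = {}
    for definition in definitions:
        alias = definition.get("alias")
        path = definition.get("path")
        if not alias or not path:
            # Only one of alias/path is defined: skip this entry.
            continue
        # A later definition of the same alias replaces the earlier one.
        resolved[alias] = path
    return resolved


def effective_parameters(
    cache_defaults: Dict[str, str], main_values: Dict[str, str]
) -> Dict[str, str]:
    """Values from the main pipeline override the cache pipeline's values."""
    return {**cache_defaults, **main_values}


# Sample definitions (paths invented for this sketch).
definitions = [
    {"alias": "departments", "path": "shared/dept_lookup_v1"},
    {"alias": "regions", "path": None},  # skipped: no path
    {"alias": "departments", "path": "shared/dept_lookup_v2"},  # wins
]

resolved = resolve_cache_definitions(definitions)
params = effective_parameters({"env": "dev", "limit": "100"}, {"env": "prod"})
```

With these inputs, `resolved` contains only the `departments` alias, pointing at the second path, and `params` carries `env` from the main pipeline and `limit` from the cache pipeline.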

Limitations

  • If the last Snap in your cache pipeline has an output document that contains an array or object as part of its structure, you need to restructure your data or remove those fields to produce a flat document structure.

  • If an Ultra Task has a cache pipeline, updates to the cache pipeline do not force an Ultra Task to restart.

  • Dynamic validation is limited to 50 documents.

  • Pipelines with Cache Pipelines need to run to completion. Accordingly, we do not recommend using messenger service Snaps (such as JMS or Kafka) or the Ultra Polling design.

  • The following features are untested in combination with Cache Pipelines:

    • Error Pipelines

    • Resumable Pipelines

    • ELT Snap Packs
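
For the first limitation in this list, the following Python sketch shows what "flat" means for an output document and one way nested data could be restructured. In a pipeline you would do this with Snaps, not Python. The helper names `is_flat` and `flatten`, the underscore naming scheme, and the sample document are assumptions made for illustration.

```python
from typing import Any, Dict


def is_flat(doc: Dict[str, Any]) -> bool:
    """True when no top-level value is an object (dict) or array (list)."""
    return not any(isinstance(value, (dict, list)) for value in doc.values())


def flatten(doc: Dict[str, Any], parent: str = "", sep: str = "_") -> Dict[str, Any]:
    """Collapse nested objects and arrays into top-level scalar fields.

    Nested keys are joined with `sep`; array items use their index.
    Empty objects and arrays produce no fields.
    """
    flat: Dict[str, Any] = {}
    for key, value in doc.items():
        name = f"{parent}{sep}{key}" if parent else str(key)
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        elif isinstance(value, list):
            for index, item in enumerate(value):
                # Reuse the dict branch by wrapping each item under its index.
                flat.update(flatten({str(index): item}, name, sep))
        else:
            flat[name] = value
    return flat


# Sample nested document (invented for this sketch).
nested = {
    "user_id": 101,
    "department": {"name": "Finance", "codes": ["F1", "F2"]},
}

flat_doc = flatten(nested)
```

`is_flat(nested)` is `False`, while `flat_doc` has the fields `user_id`, `department_name`, `department_codes_0`, and `department_codes_1`, all with scalar values. Removing the nested fields entirely is the other option the limitation mentions.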