In this article

Table of Contents

Overview

You can use this Snap to run a MERGE INTO SQL statement based on the updates available in the source data files. In other words, this Snap performs a bulk UPSERT (UPDATE + INSERT) operation that updates existing rows of a target DLP table and adds new rows to it. The source of your data can be a file in a cloud storage location, an input view from an upstream Snap, or a table that can be accessed through a JDBC connection. The source data file can be in CSV, JSON, ORC, PARQUET, or TEXT format.

This Snap uses the following Databricks commands internally:

  • COPY INTO - Loads data from staged files into an existing table.

  • CREATE TABLE [USING] - Loads data from external sources such as JDBC.

  • CREATE TABLE - Creates a table (in this case, a temporary table).

  • MERGE INTO - Inserts new rows, updates existing rows, and deletes rows that meet a condition.

...

Snap Type

Databricks - Merge Into Snap is a write-type Snap that inserts and updates data in a DLP instance.

Prerequisites

  • Valid access credentials to a DLP instance with adequate access permissions to perform the action in context.

  • Valid access to the external source data in one of the following: Azure Blob Storage, ADLS Gen2, DBFS, GCP, AWS S3, or another database (JDBC-compatible).

Support for Ultra Pipelines

Does not support Ultra Pipelines

Limitations

Snaps in the Databricks Snap Pack do not support array, map, and struct data types in their input and output documents.

Known Issues

None.

Snap Views

Input

  • Format: Document

  • Number of Views: Min: 0, Max: 2

  • Examples of Upstream Snaps: Mapper, Copy, JSON Generator, Databricks - Select

  • Description: This Snap can read from two input documents at a time:

    • One JSON document for the incoming data from the preceding Snap in the Pipeline.

    • Another JSON document that serves as the data source when Source Type is selected as Input View.

Output

  • Format: Document

  • Number of Views: Min: 0, Max: 1

  • Examples of Downstream Snaps: Databricks - Select, Databricks - Unload

  • Description: A JSON document containing the bulk load request details and the result of the bulk load operation.

    Error

    Error handling is a generic way to handle errors without losing data or failing the Snap execution. You can handle the errors that the Snap might encounter while running the Pipeline by choosing one of the following options from the When errors occur list under the Views tab. The available options are:

    • Stop Pipeline Execution: Stops the current pipeline execution when the Snap encounters an error.

    • Discard Error Data and Continue: Ignores the error, discards that record, and continues with the rest of the records.

    • Route Error Data to Error View: Routes the error data to an error view without stopping the Snap execution.

    Learn more about Error handling in Pipelines.

    Snap Settings

    Info
    • Asterisk (*): Indicates a mandatory field.

    • Suggestion icon: Indicates a list that is dynamically populated based on the configuration.

    • Expression icon: Indicates whether the value is an expression (if enabled) or a static value (if disabled). Learn more about Using Expressions in SnapLogic.

    • Add icon: Indicates that you can add fields in the fieldset.

    • Remove icon: Indicates that you can remove fields from the fieldset.

    Field Name

    Field Type

    Field Dependency

    Description

    Label*

    Default Value: Databricks - Merge Into
    Example: Db_MergeInto_FromS3

    String

    None.

    The name for the Snap. You can modify this to be more specific, especially if you have more than one of the same Snap in your Pipeline.

    Database name

    Default Value: None.
    Example: cust_db

    String/Expression/Suggestion

    None.

    Enter the name of the database in which the target table exists. Leave this blank if you want to use the database name specified in the Database Name field in the account settings.

    Table Name*

    Default Value: None.
    Example: cust_records

    String/Expression/Suggestion

    None.

    Enter the name of the table in which you want to perform the MERGE INTO operation. 

    Target Table Alias*

    Default Value: None.
    Example: trgt_tbl

    String

    None.

    Enter an alias name for the target table to use in the MERGE INTO operation.

    Input Source Alias*

    Default Value: None.
    Example: src_tbl

    String

    None.

    Enter an alias name for the source table/data to use in the MERGE INTO operation.

    ON Condition*

    Default Value: None.
    Example: src_tbl.id = trgt_tbl.id

    String/Expression

    None.

    Specify the condition on which the Snap should update the target table with the data from the source table/files.

    Merge-into Statements

    You can use this fieldset to specify the conditions that activate the MERGE INTO operation and the additional conditions that must be met. Specify each condition in a separate row.

    This field set contains the following fields:

    • When Clause

    • Condition

    • Action

    Info

    The Snap allows the following combinations of actions:

    • INSERT

    • UPDATE

    • DELETE

    • UPDATE AND DELETE

    • UPDATE AND INSERT

    • DELETE AND INSERT

    • UPDATE, DELETE AND INSERT

    When Clause

    Default Value: None.
    Example: WHEN MATCHED

    String/Expression/Suggestion

    None.

    Specify the matching condition based on the outcome of the ON Condition. Alternatively, select a clause from the suggestion list.

    Available options are:

    • WHEN MATCHED: Applies the specified condition and action when the source data matches with the target.

    • WHEN NOT MATCHED: Applies the specified condition and action when the source data does not match with the target.

    DLP supports the following MERGE INTO operations:

    • WHEN MATCHED: UPDATE or DELETE

    • WHEN NOT MATCHED: INSERT
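The pairing of clauses and actions listed above can be checked before a row is added to the fieldset. The following is a minimal sketch, not the Snap's own validation logic; the names `ALLOWED_ACTIONS` and `is_valid_row` are hypothetical and exist only for illustration.

```python
# Allowed actions per WHEN clause, taken from the list above.
# The dictionary and function names are hypothetical, for illustration only.
ALLOWED_ACTIONS = {
    "WHEN MATCHED": {"UPDATE", "DELETE"},
    "WHEN NOT MATCHED": {"INSERT"},
}


def is_valid_row(when_clause: str, action: str) -> bool:
    """Return True if the action is permitted for the given WHEN clause."""
    allowed = ALLOWED_ACTIONS.get(when_clause.strip().upper(), set())
    return action.strip().upper() in allowed


if __name__ == "__main__":
    print(is_valid_row("WHEN MATCHED", "DELETE"))      # permitted pairing
    print(is_valid_row("WHEN NOT MATCHED", "UPDATE"))  # not permitted
```

A check like this only catches mismatched rows early; the database still has the final say on whether a statement is valid.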

    Condition

    Default Value: None.
    Example: net-value > 5000

    String

    None.

    Specify additional criteria, if needed. The Snap does not perform the action associated with this row unless the condition is fulfilled. The condition can reference both the source and target tables, only the source table, only the target table, or no table at all.

    This additional condition allows the Snap to identify whether to perform the UPDATE or the DELETE action, because both actions correspond to the WHEN MATCHED clause.

    You can also use Pipeline parameters in this field to bind values. However, you must be careful to avoid SQL injection.
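One way to reduce the SQL injection risk is to validate a parameter value before it reaches the condition text. The sketch below assumes the parameter is expected to be a plain number; the helper names and the column name `net_value` are hypothetical, and this is not how the Snap itself binds values.

```python
import re

# Accept only an optionally signed integer or decimal literal.
_NUMERIC = re.compile(r"-?\d+(\.\d+)?")


def safe_numeric_literal(raw: str) -> str:
    """Return the value unchanged if it is a plain number; otherwise raise ValueError."""
    value = raw.strip()
    if not _NUMERIC.fullmatch(value):
        raise ValueError(f"Rejected non-numeric parameter value: {raw!r}")
    return value


def build_condition(column: str, threshold: str) -> str:
    """Build a condition such as 'net_value > 5000' from a validated threshold."""
    return f"{column} > {safe_numeric_literal(threshold)}"


if __name__ == "__main__":
    print(build_condition("net_value", "5000"))
    try:
        build_condition("net_value", "5000 OR 1=1")
    except ValueError as err:
        print(err)
```

Allow-listing the expected shape of a value is generally safer than trying to strip dangerous characters from it.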

    Action

    Default Value: INSERT
    Example: DELETE

    Dropdown list

    None.

    Choose the action to apply on the condition.

    Available options are:

    • INSERT

    • UPDATE

    • DELETE
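To show how the fields above fit together, here is a rough sketch that assembles a MERGE INTO statement from a target table, the two aliases, the ON condition, and the Merge-into Statements rows. The SQL that the Snap actually generates is not shown in this document, so treat the output as an approximation of Databricks MERGE INTO syntax. The function name `build_merge`, the source name `staged_updates`, and the use of `UPDATE SET *` and `INSERT *` as the action bodies are assumptions.

```python
from typing import Iterable, Tuple

# SQL fragment emitted for each action; an assumption made for illustration.
ACTION_SQL = {"UPDATE": "UPDATE SET *", "DELETE": "DELETE", "INSERT": "INSERT *"}


def build_merge(
    target: str,
    target_alias: str,
    source: str,
    source_alias: str,
    on_condition: str,
    rows: Iterable[Tuple[str, str, str]],
) -> str:
    """Assemble a MERGE INTO statement from (when_clause, condition, action) rows."""
    lines = [
        f"MERGE INTO {target} {target_alias}",
        f"USING {source} {source_alias}",
        f"ON {on_condition}",
    ]
    for when_clause, condition, action in rows:
        # The optional Condition field becomes an AND on the WHEN clause.
        extra = f" AND {condition}" if condition else ""
        lines.append(f"{when_clause}{extra} THEN {ACTION_SQL[action]}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(
        build_merge(
            "cust_db.cust_records",
            "trgt_tbl",
            "staged_updates",  # hypothetical name for the staged source data
            "src_tbl",
            "src_tbl.id = trgt_tbl.id",
            [("WHEN MATCHED", "", "UPDATE"), ("WHEN NOT MATCHED", "", "INSERT")],
        )
    )
```

With the sample values, the two rows correspond to the UPDATE AND INSERT combination, which is the usual upsert.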

    Source Type

    Default Value: Cloud Storage File
    Example: Input View

    Dropdown list

    None.

    Select the type of source from which you want to update the data in your DLP instance. The available options are:

    • Cloud Storage File. A file from a cloud location like AWS S3, Azure, or GCS. You can configure a series of options for the MERGE INTO operation as described in this document.

    • Input View. A JSON file coming from the preceding Snap’s output. You need to specify only the Load action.

    • JDBC. A table in another database that can be connected to using a JDBC connector. You can specify the Source table name to load the data from or the Target Table Columns to replace the existing target table with a new one.

    Source table name

    String

    Source Type is JDBC.

    Enter the source table name. If you do not specify a value in this field, the Snap uses the default values (database) configured in the Snap’s account for the JDBC Account type.

    File format type

    Default Value: CSV
    Example: PARQUET

    Dropdown list

    Source Type is Cloud Storage file.

    Select the file format of the source data file. It can be CSV, JSON, ORC, PARQUET, or TEXT.

    File Format Option List

    Source Type is Cloud Storage file.

    You can use this field set to choose the file format options to associate with the MERGE INTO operation, based on your source file format. Choose one file format option in each row.

    File format option

    Default Value: None.
    Example: cust_ID

    String/Expression/Suggestion

    Source Type is Cloud Storage file.

    Select a file format option from the available options and set appropriate values to suit your MERGE INTO needs, without affecting the syntax displayed in this field.

    Files provider

    Default Value: File list
    Example: pattern

    Dropdown list

    Source Type is Cloud Storage file.

    Select how you want to specify the list of source files: File list or pattern. The fields that follow change based on your selection: the File list fieldset appears for File list, and the File pattern field appears for pattern.

    File list

    Source Type is Cloud Storage file and Files provider is File list.

    You can use this field set to specify the file paths to be used for the MERGE INTO operation. Choose one file path in each row.

    File

    Default Value: None.
    Example: cust_data.csv

    String

    Source Type is Cloud Storage file and Files provider is File list.

    Enter the path of the file to be used for the MERGE INTO operation.

    File pattern

    Default Value: None.
    Example: folder1/*.csv

    String/Expression

    Source Type is Cloud Storage file and Files provider is pattern.

    Enter the regex pattern to use to match the file name and/or absolute path. You can specify this as a regular expression pattern string, enclosed in single quotes. Learn more: Examples of COPY INTO (Delta Lake on Databricks) for DLP.
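The two Files provider choices map naturally onto the FILES and PATTERN clauses of the Databricks COPY INTO command. The sketch below illustrates that mapping only; whether the Snap builds its COPY INTO statement this way is an assumption, and the function name `file_selection_clause` is hypothetical.

```python
from typing import Sequence


def file_selection_clause(
    files_provider: str, files: Sequence[str] = (), pattern: str = ""
) -> str:
    """Return the COPY INTO clause that selects source files.

    Assumes paths and patterns contain no single quotes.
    """
    if files_provider == "File list":
        if not files:
            raise ValueError("File list requires at least one file path")
        quoted = ", ".join(f"'{path}'" for path in files)
        return f"FILES = ({quoted})"
    if files_provider == "pattern":
        if not pattern:
            raise ValueError("pattern requires a non-empty File pattern")
        return f"PATTERN = '{pattern}'"
    raise ValueError(f"Unknown Files provider: {files_provider!r}")


if __name__ == "__main__":
    print(file_selection_clause("File list", ["cust_data.csv"]))
    print(file_selection_clause("pattern", pattern="folder1/*.csv"))
```

The sample values reuse the examples given for the File and File pattern fields.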

    Encryption type

    Default Value: None.
    Example: Server-Side KMS Encryption

    String

    Source Type is Cloud Storage file.

    Select the encryption type to use for decrypting the source data and/or files staged in the S3 buckets.

    Info

    Server-side encryption is available only for S3 accounts.

    KMS key

    Default Value: None.
    Example: MF96D-M9N47-XKV7X-C3GCQ-G5349

    String/Expression

    Source Type is Cloud Storage file and Encryption type is Server-Side KMS Encryption.

    Enter the AWS Key Management Service (KMS) ID or ARN to use to decrypt the encrypted files from the S3 location. If your source files are in S3, see Loading encrypted files from Amazon S3 for more details.

    Snap Execution

    Default Value: Execute only
    Example: Validate & Execute

    Dropdown list

    None.

    Select one of the three modes in which the Snap executes. Available options are:

    • Validate & Execute: Performs limited execution of the Snap, and generates a data preview during Pipeline validation. Subsequently, performs full execution of the Snap (unlimited records) during Pipeline runtime.

    • Execute only: Performs full execution of the Snap during Pipeline execution without generating preview data.

    • Disabled: Disables the Snap and all Snaps that are downstream from it.

    Troubleshooting

    Error

    Reason

    Resolution

    Account validation failed.

    The Pipeline ended before the batch could complete execution due to a connection error.

    Verify that the Refresh token field is configured to handle the inputs properly. If you are not sure when the input data is available, configure this field as zero to keep the connection always open.

    Missing property value

    You have not specified a value for the required field where this message appears.

    Ensure that you specify valid values for all required fields.

    Examples

    Updating Employee List in the DLP instance using a CSV file

    Consider the scenario where we have 100 rows of incremental data about employees stored as a CSV file in an S3 location and we need to load this data (insert and update the employee records, as appropriate) into our DLP instance.

    To achieve this, we can use the Databricks - Merge Into Snap.

    ...

    We configure this Snap and its account as follows:

    Snap Account Configuration: [screenshot]

    Snap Configuration: [screenshot]

    The Snap checks for matching IDs in the target table and, for each ID not found there, inserts the corresponding row from the CSV file. We have configured the Snap to perform no action when a matching ID is found. Depending on the incremental data in the CSV file, we can add another Merge-into statement with the WHEN MATCHED clause and the UPDATE action to update the matching records with the new column values.

    After successful validation, the Snap displays the target table name and the number of rows newly inserted in this run.

    ...

    This means that the target table had data for 5 records before running this Pipeline and 95 records were loaded into the table by this Pipeline.
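The arithmetic in this example can be reproduced with a small in-memory simulation. This is only a sketch of the insert-only merge behavior described above, assuming that the 5 existing target rows share their IDs with 5 of the 100 CSV rows; the function name and sample data are hypothetical.

```python
from typing import Dict


def merge_insert_only(target: Dict[int, dict], source: Dict[int, dict]) -> int:
    """Insert source rows whose ID is absent from target; return the number inserted."""
    inserted = 0
    for row_id, row in source.items():
        if row_id not in target:
            # WHEN NOT MATCHED: INSERT
            target[row_id] = row
            inserted += 1
        # WHEN MATCHED: no action is configured, so the existing row is left as is.
    return inserted


if __name__ == "__main__":
    # 5 employee records already in the target table (hypothetical sample data).
    target = {i: {"name": f"employee_{i}"} for i in range(1, 6)}
    # 100 rows of incremental data, as in the CSV file of this example.
    source = {i: {"name": f"employee_{i}_new"} for i in range(1, 101)}

    inserted = merge_insert_only(target, source)
    print(inserted, len(target))  # 95 100
```

Because no WHEN MATCHED row is configured, the 5 existing records keep their original values.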

    Download this Pipeline. 

    Downloads

    Info
    1. Download and import the Pipeline into SnapLogic.

    2. Configure Snap accounts as applicable.

    3. Provide Pipeline parameters as applicable.

    Attachments

    Snap Pack History

    Databricks Snap Pack History (excerpt from the Databricks Snap Pack page)