BigQuery Upsert (Streaming)


Overview

This Snap enables you to perform bulk update or insert (upsert) operations into a BigQuery table from existing tables or any input data stream.

The upsert operation updates an existing row if the specified key value already exists in the target table and inserts a new row if it does not.
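The update-or-insert rule described above can be illustrated with a small in-memory sketch. This is not the Snap's implementation: the `upsert` function, the list-of-dictionaries table, and the sample rows are all hypothetical and only mirror the described behavior.

```python
def upsert(target, incoming, key_columns):
    """Update rows whose key already exists in target; insert the rest."""
    # Index the existing rows by their key values for quick lookup.
    index = {tuple(row[k] for k in key_columns): row for row in target}
    updated = inserted = 0
    for record in incoming:
        key = tuple(record[k] for k in key_columns)
        if key in index:
            index[key].update(record)  # key exists: update the row in place
            updated += 1
        else:
            new_row = dict(record)
            target.append(new_row)     # key is new: insert a row
            index[key] = new_row
            inserted += 1
    return {"updated": updated, "inserted": inserted}


# Hypothetical sample data: one existing row, one update, one new row.
table = [{"Id": 1, "Name": "Acme", "Total": 10}]
stats = upsert(
    table,
    [
        {"Id": 1, "Name": "Acme", "Total": 25},
        {"Id": 2, "Name": "Globex", "Total": 5},
    ],
    key_columns=["Id", "Name"],
)
print(stats)
```

The first incoming record matches an existing key and overwrites its Total; the second has no match and is appended.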

Overview of settings with example values

Snap Type

This Snap is a Write-type Snap that performs a bulk upsert operation.

Prerequisites

Write access for the Google BigQuery Account is required.

Support for Ultra Pipelines

Does not work in Ultra Pipelines.

Limitations and Known Issues

None.

Snap Views

Type

Format

Number of Views

Examples of Upstream and Downstream Snaps

Description

Input 

Document

  • Min: 1

  • Max: 1

  • CSV Parser

  • JSON Parser

  • JSON Generator

This Snap has exactly one document input view. 

Input can come from any Snap that passes a document to its output view, such as Structure or JSON Generator. You can also pass Pipeline parameters for values such as the project ID, dataset ID, and table ID.

Output

Document

  • Min: 0

  • Max: 1

  • Mapper

  • Google BigQuery Execute

The Snap writes its output as documents. The output contains the data from the incoming documents that was loaded into the destination table, along with the load statistics after the operation completes.

The output view contains information about the bulk load details in the temporary table to better understand the flow. This also helps with error handling.

The output view also lists the number of rows that were updated, modified, or inserted in the target table.

Error

Error handling is a generic way to handle errors without losing data or failing the Snap execution. You can handle the errors that the Snap might encounter when running the pipeline by choosing one of the following options from the When errors occur list under the Views tab:

  • Stop Pipeline Execution: Stops the current pipeline execution if the Snap encounters an error.

  • Discard Error Data and Continue: Ignores the error, discards that record, and continues with the remaining records.

  • Route Error Data to Error View: Routes the error data to an error view without stopping the Snap execution.

Learn more about Handling Errors with an Error Pipeline.
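As a rough illustration of how the three options differ, the sketch below applies each policy to a stream of records. The `process` function, the policy names `stop`, `discard`, and `route`, and the `require_id` handler are hypothetical stand-ins, not SnapLogic APIs.

```python
def process(records, handler, on_error="stop"):
    """Apply handler to each record using one of three error policies."""
    if on_error not in ("stop", "discard", "route"):
        raise ValueError(f"unknown policy: {on_error}")
    outputs, errors = [], []
    for record in records:
        try:
            outputs.append(handler(record))
        except Exception as exc:
            if on_error == "stop":
                raise  # Stop Pipeline Execution
            if on_error == "route":
                # Route Error Data to Error View
                errors.append({"record": record, "error": str(exc)})
            # "discard": drop the failing record and continue
    return outputs, errors


def require_id(record):
    # Hypothetical handler that rejects records without an Id field.
    if "Id" not in record:
        raise ValueError("Id is missing")
    return record


records = [{"Id": 1}, {"Name": "no id"}, {"Id": 3}]
routed_out, routed_err = process(records, require_id, on_error="route")
discarded_out, discarded_err = process(records, require_id, on_error="discard")
```

With `route`, the failing record is kept alongside its error message; with `discard`, it is dropped and nothing is recorded; with `stop`, the exception propagates and processing ends.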

Snap Settings

  • Asterisk ( * ): Indicates a mandatory field.

  • Suggestion icon: Indicates a list that is dynamically populated based on the configuration.

  • Expression icon: Indicates that the value is an expression (if enabled) or a static value (if disabled). Learn more about Using Expressions in SnapLogic.

  • Add icon: Indicates that you can add fields in the field set.

  • Remove icon: Indicates that you can remove fields from the field set.

  • Upload icon: Indicates that you can upload files.

Field Name

Field Type

Description

Label

Default Value: BigQuery Bulk Upsert (Streaming)
Example: GBQ Load Employee Tables

String

Specify the name for the Snap. You can modify this to be more specific, especially if you have more than one of the same Snap in your pipeline.

Project ID

Default Value: N/A
Example: test-project-12345

String/Expression/Suggestion

Specify the project ID in which the dataset resides.

Dataset ID

 

Default Value: N/A

Example: dataset_12345

String/Expression/Suggestion

Specify the dataset ID of the destination.

Table ID

 

Default Value: N/A
Example: table-12345

String/Expression/Suggestion

Specify the table ID of the table you are creating.

Batch size

Default value: 1000

String

The number of records batched per request. If the input has 10,000 records and the batch size is set to 100, the total number of requests would be 100.
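The arithmetic above can be checked with a short calculation. The `request_count` helper is hypothetical, and it assumes every batch is sent only when full (apart from a final partial batch); batches flushed early by a timeout would increase the count.

```python
def request_count(total_records, batch_size):
    """Number of requests needed when records are sent in fixed-size batches."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    # Ceiling division: a final partial batch still needs its own request.
    return (total_records + batch_size - 1) // batch_size


requests_small_batches = request_count(10_000, 100)     # example from the text
requests_default_batches = request_count(10_000, 1000)  # default batch size
requests_with_remainder = request_count(1_050, 100)     # last batch holds 50
```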

Batch timeout (milliseconds)

Default value: 2000

String

The time, in milliseconds, after which a batch is processed even if it contains fewer records than the specified batch size.

Set the batch timeout value with care. When this limit is reached, the batch is flushed whether or not all the records in the batch were loaded.
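The interaction between batch size and batch timeout can be sketched as a buffer that flushes on whichever limit is reached first. The `Batcher` class below is a hypothetical model, not the Snap's internal code; it takes an injectable clock so the timeout can be demonstrated without waiting.

```python
import time


class Batcher:
    """Buffer records and flush when the batch is full or has timed out."""

    def __init__(self, batch_size=1000, timeout_ms=2000, clock=time.monotonic):
        self.batch_size = batch_size
        self.timeout_s = timeout_ms / 1000.0
        self.clock = clock
        self.pending = []
        self.started = None  # time the first record of the batch arrived
        self.flushed = []    # batches that have been sent

    def add(self, record):
        if not self.pending:
            self.started = self.clock()
        self.pending.append(record)
        self._maybe_flush()

    def tick(self):
        """Call periodically so a timeout is noticed when no record arrives."""
        self._maybe_flush()

    def _maybe_flush(self):
        if not self.pending:
            return
        full = len(self.pending) >= self.batch_size
        timed_out = self.clock() - self.started >= self.timeout_s
        if full or timed_out:
            self.flushed.append(self.pending)
            self.pending = []


# Simulated clock (in seconds) so the example runs instantly.
now = [0.0]
batcher = Batcher(batch_size=3, timeout_ms=2000, clock=lambda: now[0])
for record in ("a", "b", "c"):
    batcher.add(record)  # the third record fills the batch
batcher.add("d")         # starts a new, partial batch
now[0] = 2.5             # 2.5 seconds pass with no new records
batcher.tick()           # the timeout flushes the partial batch
```

In this model, a short timeout combined with slow input produces many small batches, which is one reason to choose the timeout with care.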

Batch retry count

Default value: 0

String

The number of times the server should try to load a failed batch.

Batch retry delay (milliseconds)

Default value: 500

String

The delay, in milliseconds, between retries.
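The two retry settings can be pictured as a simple loop: attempt the load once, then retry up to the configured count, pausing between attempts. The `load_with_retry` function and the `flaky_loader` below are hypothetical; the sketch assumes a fixed delay between attempts, which the field descriptions suggest but do not state in detail.

```python
import time


def load_with_retry(load_batch, batch, retry_count=0, retry_delay_ms=500,
                    sleep=time.sleep):
    """Try to load a batch, retrying up to retry_count times after a failure."""
    attempts = 0
    while True:
        attempts += 1
        try:
            return load_batch(batch), attempts
        except Exception:
            if attempts > retry_count:
                raise  # retries exhausted: surface the error
            sleep(retry_delay_ms / 1000.0)  # wait before the next attempt


calls = {"n": 0}


def flaky_loader(batch):
    # Hypothetical loader that fails twice and then succeeds.
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return len(batch)


delays = []  # record the requested delays instead of actually sleeping
loaded, attempts = load_with_retry(
    flaky_loader, [1, 2, 3], retry_count=2, retry_delay_ms=500,
    sleep=delays.append,
)
```

With the default retry count of 0, the first failure is raised immediately and no delay is applied.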

Snap Execution

 

Default Value: Validate & Execute
Example: Execute only

Dropdown list

Select one of the three modes in which the Snap executes. Available options are:

  • Validate & Execute: Performs limited execution of the Snap, and generates a data preview during Pipeline validation. Subsequently, performs full execution of the Snap (unlimited records) during Pipeline runtime.

  • Execute only: Performs full execution of the Snap during Pipeline execution without generating preview data.

  • Disabled: Disables the Snap and all Snaps that are downstream from it.

Troubleshooting

Error: Key column name is required.
Reason: No key column(s) specified for checking for existing entries.
Resolution: Please enter one or more key column names.

Error: Key column name is not present in target table.
Reason: Incorrect key column(s) specified for checking for existing entries.
Resolution: Please select one or more key column names from the suggestion box.

Error: All columns in target table are key columns.
Reason: The merge will fail as all columns in the target table are key columns.
Resolution: Please select one or more (but not all) key column names from the suggestion box.
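The three errors listed above correspond to three checks on the key columns, which can be sketched as follows. The `validate_key_columns` function is hypothetical and only reuses the error messages from this section; the order in which the Snap performs its own checks is an assumption.

```python
def validate_key_columns(key_columns, table_columns):
    """Raise ValueError with the matching message if the key columns are unusable."""
    if not key_columns:
        raise ValueError("Key column name is required.")
    if any(column not in table_columns for column in key_columns):
        raise ValueError("Key column name is not present in target table.")
    if set(key_columns) == set(table_columns):
        # Nothing would be left to update when a key matches.
        raise ValueError("All columns in target table are key columns.")


def check(key_columns, table_columns):
    """Return the error message, or None if the key columns are valid."""
    try:
        validate_key_columns(key_columns, table_columns)
    except ValueError as exc:
        return str(exc)
    return None


table_columns = ["Id", "Name", "Total"]  # hypothetical target table schema
results = [
    check([], table_columns),
    check(["Id", "Region"], table_columns),
    check(["Id", "Name", "Total"], table_columns),
    check(["Id", "Name"], table_columns),
]
```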

Examples

Prerequisite: Write access for the Google BigQuery Account is required.

Upsert customer data from Salesforce to a Google BigQuery table

This example demonstrates how to update or insert (upsert) records in a Google BigQuery table.

Pipeline showing the Snaps in this example

First, we configure the Salesforce Read Snap with the required details to read customer account data from Salesforce.

In this example, we selected Output Fields for Total, Id, and Name.

Salesforce Read Snap with output fields selected

Upon validation, the Snap prepares the output to pass to the BigQuery Bulk Upsert Snap.

Next, we configure the BigQuery Bulk Upsert Snap to use unique identifiers to update the existing records.

To upsert data based on the Id and Name key columns, we enter Id and Name in the Key column fields.

BigQuery Upsert Snap with key columns configured

Upon execution, this Snap updates existing records or inserts new records in the Google BigQuery table.

The output shows that 5 records were updated successfully.

Google BigQuery output in JSON format


In this example, we updated the Total for each record (based on the unique identifiers Id and Name selected under Key columns).

The data is updated in the Google BigQuery table, as shown in the BigQuery console.

Updated table data in Google BigQuery
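To make the role of the key columns concrete, the sketch below builds a BigQuery MERGE statement that matches on Id and Name and updates Total. The SQL that the Snap actually generates is not documented here, so treat this as an assumption about the general technique; the `build_merge_sql` function and the table names are hypothetical.

```python
def build_merge_sql(target, staging, columns, key_columns):
    """Build a MERGE statement that upserts staging rows into the target table."""
    non_keys = [c for c in columns if c not in key_columns]
    if not non_keys:
        # With only key columns there is nothing for UPDATE SET to change.
        raise ValueError("At least one non-key column is required.")
    on_clause = " AND ".join(f"T.`{c}` = S.`{c}`" for c in key_columns)
    set_clause = ", ".join(f"`{c}` = S.`{c}`" for c in non_keys)
    column_list = ", ".join(f"`{c}`" for c in columns)
    value_list = ", ".join(f"S.`{c}`" for c in columns)
    return (
        f"MERGE `{target}` T\n"
        f"USING `{staging}` S\n"
        f"ON {on_clause}\n"
        f"WHEN MATCHED THEN UPDATE SET {set_clause}\n"
        f"WHEN NOT MATCHED THEN INSERT ({column_list}) VALUES ({value_list})"
    )


# Hypothetical table names; the columns come from the example above.
sql = build_merge_sql(
    target="my-project.my_dataset.accounts",
    staging="my-project.my_dataset.accounts_staging",
    columns=["Id", "Name", "Total"],
    key_columns=["Id", "Name"],
)
print(sql)
```

The guard on non-key columns reflects why a merge cannot proceed when every column in the target table is a key column.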

Download this pipeline

Downloads

  1. Download and import the Pipeline into SnapLogic.

  2. Configure Snap accounts, as applicable.

  3. Provide Pipeline parameters, as applicable.


Snap Pack History

Release | Snap Pack Version | Type | Updates
November 2024 | 439patches29574 | Latest | Fixed an issue with the BigQuery Upsert (Streaming) Snap that displayed a 404 Not Found error because of a region mismatch by ensuring the region is specified correctly in the dataset.
November 2024 | 439patches29499 | Latest
November 2024 | main29029 | Stable | Updated and certified against the current SnapLogic Platform release.
August 2024 | 438patches28058 | Latest | Fixed an issue with the BigQuery Table Data List Snap that displayed a null pointer exception when the table source schema contained a nested Array list schema.
August 2024 | main27765 | Stable | Updated and certified against the current SnapLogic Platform release.
May 2024 | main26341 | Stable | Updated and certified against the current SnapLogic Platform release.
February 2024 | main25112 | Stable | Updated and certified against the current SnapLogic Platform release.
November 2023 | main23721 | Stable | Updated and certified against the current SnapLogic Platform release.
August 2023 | main22460 | Stable | Updated and certified against the current SnapLogic Platform release.
May 2023 | 433patches22057 | Latest | Introduced the Google BigQuery Upsert (Streaming) Snap, which enables you to perform bulk update/insert operations into a BigQuery table from existing tables or any input data stream.
May 2023 | 433patches21955 | Latest | Fixed an issue with the GBQ-Google Service Account that caused an input stream to remain open.
May 2023 | main21015 | Stable | Updated and certified to be compatible with the August SnapLogic Platform release.
February 2023 | 432patches20962 | Latest | Fixed an intermittent null pointer exception that occurred in the BigQuery Write Snap.
February 2023 | 432patches20298 | Latest
February 2023 | 432patches19840 | Latest | Fixed an issue with the Google BigQuery Bulk Load (Streaming) Snap that caused the Table not found message to display even when Create table if not present was selected.
February 2023 | main19844 | Stable
November 2022 | 431patches19301 | Latest | The Google BigQuery Bulk Load (Streaming) Snap works as expected, with no active Timer threads remaining when the Pipeline execution fails.
November 2022 | main18944 | Stable | Upgraded with the latest SnapLogic Platform release.
August 2022 | main17386 | Stable
4.29 | main15993 | Stable | Upgraded with the latest SnapLogic Platform release.
4.28 Patch | 428patches15459 | Latest
  • Fixed an issue with the Google BigQuery Execute Snap, where the Snap displayed a 404 Job not found error when calling a procedure.
  • Fixed an issue with the Google BigQuery Bulk Load (Cloud Storage) Snap where the Snap failed because the access token expired when it had to wait longer to execute. With this fix, the Snap is reloaded to get a refreshed access token.
4.28 Patch | 428patches14743 | Latest
  • Fixed an issue with the Google BigQuery Execute Snap, where the Snap displayed an error when the input data contained a table having the record type column and its value was null.
  • Fixed an issue with the Google BigQuery Write Snap that occurred when the input data contained complex data type columns (such as nested fields) and the Create table if not present checkbox was selected.
4.28 | main14627 | Stable | Upgraded with the latest SnapLogic Platform release.
4.27 Patch | 427patches13752 | Latest | Upgraded the Google BigQuery driver to version 1.119.0 to support time partition intervals by MONTH and YEAR.
4.27 Patch | 427patches13615 | Latest | Fixed the table truncate 404 error with the Google BigQuery Load (Streaming) Snap by supporting the retry functionality. The Snap now waits in case of an error and retries before loading the data.
4.27 Patch | 427patches12691 | Latest | Fixed an issue with the Google BigQuery Bulk Load (Cloud Storage) Snap, where the Snap failed with an exception for big query tables. The CreateDisposition is now set conditionally on the basis of the setting in the Create table if not present checkbox.
4.27 | main12833 | Stable | Enhanced the Google BigQuery Bulk Load (Cloud Storage) Snap with the following batching and retry properties to process input records:
  • Batching: Processes the input records in batches.
  • Batch Size: The number of records batched per request.
  • Batch Timeout (milliseconds): Time in milliseconds after which the batch, if not empty, is processed even if it contains fewer records than the given batch size.
4.26 | main11181 | Stable | Upgraded with the latest SnapLogic Platform release.
4.25 | main9554 | Stable | Upgraded with the latest SnapLogic Platform release.
4.24 | main8556 | Stable | Upgraded with the latest SnapLogic Platform release.
4.23 | main7430 | Stable
4.22 | main6403 | Stable | Upgraded with the latest SnapLogic Platform release.
4.21 | snapsmrc542 | Stable | Upgraded with the latest SnapLogic Platform release.
4.20 Patch | google/bigquery8773 | Latest | Fixed the NPE issue with stored procedures and DROP TABLE queries in the Google BigQuery Execute Snap.
4.20 | snapsmrc535 | Stable | Upgraded with the latest SnapLogic Platform release.
4.19 | snaprsmrc528 | Stable | Upgraded with the latest SnapLogic Platform release.
4.18 | snapsmrc523 | Stable | Upgraded with the latest SnapLogic Platform release.
4.17 | ALL7402 | Latest | Pushed automatic rebuild of the latest version of each Snap Pack to SnapLogic UAT and Elastic servers.
4.17 | snapsmrc515 | Stable | Added the Snap Execution field to all Standard-mode Snaps. In some Snaps, this field replaces the existing Execute during preview check box.
4.16 | snapsmrc508 | Stable | Upgraded with the latest SnapLogic Platform release.
4.15 | snapsmrc500 | Stable | Upgraded with the latest SnapLogic Platform release.
4.14 | snapsmrc490 | Stable | Upgraded with the latest SnapLogic Platform release.
4.13 | snapsmrc486 | Stable | Upgraded with the latest SnapLogic Platform release.
4.12 | snapsmrc480 | Stable | Added a new property Schema auto detect in the Google BigQuery Bulk Load (Cloud Storage) Snap to support CSV and JSON files where one or more columns in the source file may not contain any values.
4.11 | snapsmrc465 | Stable
  • Added new Snap: Google BigQuery Bulk Load (Cloud Storage).
  • Added new Snap: Google BigQuery Bulk Load (Streaming).
  • Updated the Google BigQuery Write Snap with a new Create table if not present property.
4.10 Patch | google/bigquery4046 | Latest | Addressed an issue when authenticating with Dynamic OAuth accounts.
4.10 | snapsmrc414 | Stable | Upgraded with the latest SnapLogic Platform release.
4.9 | snapsmrc405 | Stable | Upgraded with the latest SnapLogic Platform release.
4.8 Patch | bigquery2952 | Latest | Supports refreshing OAuth access tokens during long-running pipeline executions. Fixed an issue with writing small batch sizes and when querying empty dataset tables.
4.8.0 Patch | bigquery2813 | Latest | Reload OAuth account from Platform when the access token expires during pipeline execution.
4.8 | snapsmrc398 | Stable | Upgraded with the latest SnapLogic Platform release.
4.7 | snapsmrc382 | Stable | Upgraded with the latest SnapLogic Platform release.
4.6 | snapsmrc362 | Stable | Upgraded with the latest SnapLogic Platform release.
4.5.1 | snapsmrc344 | Stable | Upgraded with the latest SnapLogic Platform release.
4.4.1 | NA | Stable | Upgraded with the latest SnapLogic Platform release.
4.4 | NA | Stable | Upgraded with the latest SnapLogic Platform release.
4.3.2 | NA | Stable
  • Resolved the following issues with the Google BigQuery Execute Snap:
    • Throwing binary data in the stack trace and two error messages.
    • Improved error handling for suggestions.
    • Improved error handling on bad queries.
    • Suggestion bubble missing for Destination table ID.
  • Resolved an issue with Auto refresh token not working in the Google BigQuery account.
