BigQuery Upsert (Streaming)
Overview
This Snap enables you to perform bulk update or insert (upsert) operations into a BigQuery table from existing tables or any input data stream.
The upsert operation updates existing rows if the specified value exists in the target table and inserts a new row if the specified value does not exist in the target table.
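The update-or-insert behavior described above can be illustrated with a minimal in-memory sketch. It is not how the Snap is implemented; the function name `upsert_rows`, the `key_columns` parameter, and the sample rows are hypothetical and exist only to show the semantics.

```python
from typing import Any, Dict, List, Sequence, Tuple

Row = Dict[str, Any]


def upsert_rows(
    target: List[Row], incoming: List[Row], key_columns: Sequence[str]
) -> Tuple[int, int]:
    """Update rows in `target` whose key matches; insert the rest.

    Returns (updated_count, inserted_count). Hypothetical helper for
    illustration only.
    """
    # Index the existing rows by the values of their key columns.
    index = {tuple(row[k] for k in key_columns): row for row in target}
    updated = inserted = 0
    for row in incoming:
        key = tuple(row[k] for k in key_columns)
        if key in index:
            index[key].update(row)  # key exists: update the row in place
            updated += 1
        else:
            new_row = dict(row)
            target.append(new_row)  # key missing: insert a new row
            index[key] = new_row
            inserted += 1
    return updated, inserted


# Sample data (made up for this sketch).
target = [{"Id": "001", "Name": "Acme", "Total": 10}]
incoming = [
    {"Id": "001", "Name": "Acme", "Total": 25},   # matches an existing row
    {"Id": "002", "Name": "Globex", "Total": 7},  # no match, gets inserted
]
counts = upsert_rows(target, incoming, key_columns=["Id", "Name"])
print(counts)
```

In this sketch the first incoming row overwrites the existing `Total`, and the second is appended as a new row.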
Snap Type
This Snap is a Write-type Snap that performs a bulk upsert operation.
Prerequisites
Write access for the Google BigQuery Account is required.
Support for Ultra Pipelines
Does not work in Ultra Pipelines.
Limitations and Known Issues
None.
Snap Views
Type | Format | Number of Views | Examples of Upstream and Downstream Snaps | Description |
---|---|---|---|---|
Input | Document | | | This Snap has exactly one document input view. Input can come from any Snap that passes a document to its output view, such as Structure or JSON Generator. Pipeline parameters can also be passed for the project ID, dataset ID, table ID, and so on. |
Output | Document | | | The output is in document format. It contains the data from the incoming document that was loaded into the destination table, along with load statistics once the operation completes. The output view also includes details of the bulk load into the temporary table, which helps you follow the flow and handle errors, and lists the number of rows that were updated, modified, or inserted in the target table. |
Error | Error handling is a generic way to handle errors without losing data or failing the Snap execution. You can handle the errors that the Snap might encounter when running the pipeline by choosing an option from the When errors occur list under the Views tab. Learn more about Handling Errors with an Error Pipeline. |
Snap Settings
Asterisk ( * ): Indicates a mandatory field.
Suggestion icon: Indicates a list that is dynamically populated based on the configuration.
Expression icon: Indicates the value is an expression (if enabled) or a static value (if disabled). Learn more about Using Expressions in SnapLogic.
Add icon: Indicates that you can add fields in the field set.
Remove icon: Indicates that you can remove fields from the field set.
Upload icon: Indicates that you can upload files.
Field Name | Field Type | Description |
---|---|---|
Label Default Value: BigQuery Bulk Upsert (Streaming) | String | Specify the name for the Snap. You can modify this to be more specific, especially if you have more than one of the same Snap in your pipeline. |
Project ID Default Value: N/A | String/Expression/Suggestion | Specify the ID of the project in which the dataset resides. |
Dataset ID Default Value: N/A Example: dataset-12345 | String/Expression/Suggestion | Specify the dataset ID of the destination. |
Table ID Default Value: N/A | String/Expression/Suggestion | Specify the table ID of the table you are creating. |
Batch size Default Value: 1000 | String | The number of records batched per request. If the input has 10,000 records and the batch size is set to 100, the total number of requests is 100. |
Batch timeout (milliseconds) Default Value: 2000 | String | The time in milliseconds after which the batch is processed, even if it contains fewer records than the specified batch size. Set this value with care: when the limit is reached, the batch is flushed whether or not all the records in the batch were loaded. |
Batch retry count Default Value: 0 | String | The number of times the server should try to load a failed batch. |
Batch retry delay (milliseconds) Default Value: 500 | String | The time delay between each retry. |
Snap Execution Default Value: Validate & Execute | Dropdown list | Select one of the three modes in which the Snap executes. |
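The arithmetic in the Batch size description, and the way the two retry settings might interact, can be sketched as follows. This is an illustration, not the Snap's implementation: `request_count`, `send_with_retries`, and the `send` callback are hypothetical names, and the assumption that a failed batch is retried up to Batch retry count times with a fixed Batch retry delay between attempts is inferred from the field descriptions.

```python
import time
from typing import Any, Callable, Sequence


def request_count(total_records: int, batch_size: int) -> int:
    """Number of requests needed when records are sent in fixed-size batches."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    # Integer ceiling division: a partially filled last batch is still a request.
    return (total_records + batch_size - 1) // batch_size


def send_with_retries(
    batch: Sequence[Any],
    send: Callable[[Sequence[Any]], None],
    retry_count: int = 0,
    retry_delay_ms: int = 500,
) -> bool:
    """Try to send a batch once, then up to `retry_count` more times.

    `send` is a hypothetical callback that raises an exception on failure.
    Returns True if any attempt succeeded, False otherwise.
    """
    attempts = 1 + retry_count
    for attempt in range(attempts):
        try:
            send(batch)
            return True
        except Exception:
            # Wait before the next attempt, but not after the final one.
            if attempt < attempts - 1:
                time.sleep(retry_delay_ms / 1000.0)
    return False


# 10,000 records with a batch size of 100, as in the table above.
print(request_count(10_000, 100))
```

Batch timeout is not modeled here; per the table, it would flush a partially filled batch once the timer expires.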
Troubleshooting
Error | Reason | Resolution |
---|---|---|
Key column name is required. | No key column(s) specified for checking for existing entries. | Please enter one or more key column names. |
Key column name is not present in target table. | Incorrect key column(s) specified for checking for existing entries. | Please select one or more key column names from the suggestion box. |
All columns in target table are key columns. | The merge will fail as all columns in the target table are key columns. | Please select one or more (but not all) key column names from the suggestion box. |
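All three errors above concern the key columns. The checks behind them, and the kind of BigQuery `MERGE` statement that an upsert on key columns corresponds to, can be sketched as below. This rests on assumptions: the source does not show the SQL the Snap issues, and `build_merge_sql`, the staging table name, and the sample columns are hypothetical.

```python
from typing import Sequence


def build_merge_sql(
    target_table: str,
    staging_table: str,
    columns: Sequence[str],
    key_columns: Sequence[str],
) -> str:
    """Validate key columns, then build a MERGE statement (illustrative only)."""
    if not key_columns:
        raise ValueError("Key column name is required.")
    if any(k not in columns for k in key_columns):
        raise ValueError("Key column name is not present in target table.")
    # Columns that are not keys are the ones an update can change.
    non_keys = [c for c in columns if c not in key_columns]
    if not non_keys:
        raise ValueError("All columns in target table are key columns.")

    on_clause = " AND ".join(f"T.{k} = S.{k}" for k in key_columns)
    set_clause = ", ".join(f"{c} = S.{c}" for c in non_keys)
    col_list = ", ".join(columns)
    val_list = ", ".join(f"S.{c}" for c in columns)
    return (
        f"MERGE `{target_table}` T\n"
        f"USING `{staging_table}` S\n"
        f"ON {on_clause}\n"
        f"WHEN MATCHED THEN UPDATE SET {set_clause}\n"
        f"WHEN NOT MATCHED THEN INSERT ({col_list}) VALUES ({val_list})"
    )


# Hypothetical table names; the columns mirror the example in this article.
sql = build_merge_sql(
    "my-project.my_dataset.customers",
    "my-project.my_dataset.customers_staging",
    columns=["Id", "Name", "Total"],
    key_columns=["Id", "Name"],
)
print(sql)
```

With no non-key columns there would be nothing for the `UPDATE SET` clause to change, which is one plausible reading of why the third error is raised.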
Examples
Prerequisite: Write access for the Google BigQuery Account is required.
Upsert customer data from Salesforce to a Google BigQuery table
This example demonstrates how to update or insert (upsert) records in a Google BigQuery table.
First, we configure the Salesforce Read Snap with the required details to read customer account data from Salesforce.
In this example, we selected Output Fields for Total, Id, and Name.
Upon validation, the Snap prepares the output to pass to the BigQuery Bulk Upsert Snap.
Next, we configure the BigQuery Bulk Upsert Snap to use unique identifiers to update the existing records.
To upsert data based on the Id and Name key columns, we enter Id and Name in the Key column fields.
Upon execution, this Snap updates existing records and inserts new records in the Google BigQuery table.
The output shows that 5 records were updated successfully.
In this example, we updated the Total for each record (based on the unique identifiers Id and Name selected under Key columns).
The data is updated in the Google BigQuery table, as shown in the BigQuery console.
Downloads
Download and import the Pipeline into SnapLogic.
Configure Snap accounts, as applicable.
Provide Pipeline parameters, as applicable.
© 2017-2024 SnapLogic, Inc.