Snap Type:

Write

Description:

This Snap performs a bulk load operation into a Google BigQuery database. Depending upon the Snap's configuration, it does so either by using data from incoming documents or by using existing files in the Google Cloud Storage bucket. The Snap supports all three file types supported by Google BigQuery - CSV, JSON, and AVRO.

When using incoming documents:

In the case where data from incoming documents is being loaded, the data is first uploaded to a temporary file on Google Cloud Storage and from that temporary file the data is loaded into the destination table. The user can choose to either retain or delete this temporary file after the Snap terminates.

When using existing files from Google Cloud Storage:

When existing files from Google Cloud Storage are used, the data is loaded directly from the specified files into the destination table; no temporary files are created for this operation. However, the user can choose to either retain or delete these existing files after the Snap terminates.
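
For orientation only, a load from existing files can be pictured as a single BigQuery load job that points at Cloud Storage URIs. The following Python sketch builds such a job body using field names from the BigQuery REST API load configuration (sourceUris, destinationTable, sourceFormat, createDisposition, fieldDelimiter, skipLeadingRows). The function name build_load_config and its parameters are hypothetical, and this is not necessarily how the Snap builds its request internally.

```python
import json

# Maps the Snap's File format choices to BigQuery load job sourceFormat values.
SOURCE_FORMATS = {"CSV": "CSV", "JSON": "NEWLINE_DELIMITED_JSON", "AVRO": "AVRO"}


def build_load_config(project_id, dataset_id, table_id, source_uris,
                      file_format="CSV", has_header=False, delimiter=",",
                      create_if_missing=False):
    """Build a load job body for files that already exist in Cloud Storage.

    Hypothetical helper for illustration; not part of the Snap.
    """
    load = {
        "sourceUris": list(source_uris),
        "destinationTable": {
            "projectId": project_id,
            "datasetId": dataset_id,
            "tableId": table_id,
        },
        "sourceFormat": SOURCE_FORMATS[file_format],
        # The destination table must already exist unless creation is requested.
        "createDisposition": "CREATE_IF_NEEDED" if create_if_missing else "CREATE_NEVER",
    }
    if file_format == "CSV":
        # Delimiter and header handling only apply to CSV sources.
        load["fieldDelimiter"] = delimiter
        load["skipLeadingRows"] = 1 if has_header else 0
    return {"configuration": {"load": load}}


job = build_load_config(
    "case16370", "babynames", "csvtablenew",
    ["gs://gcs_existingbucket/exisitng_file.csv"],
    file_format="CSV", has_header=True,
)
print(json.dumps(job, indent=2))
```

The project, dataset, table, and file path are the sample values used elsewhere on this page.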

ETL Transformations & Data Flow

The Google BigQuery Bulk Load (Cloud Storage) Snap performs a bulk load of the input records into the specified database. If the data is being loaded from incoming documents, it is sent to a temporary file in Cloud Storage and from there to the destination table. By default, the temporary file is retained after the Snap's execution; however, the user can configure the Snap to delete it.

Input & Output

Input: Any Snap that provides a document output view, such as Structure or JSON Generator. Pipeline parameters can also be passed, but only for fields such as bucket ID, project ID, and table ID. When existing files from the Google Cloud Storage bucket are used, the input data is taken from the source files.
Output: The output is in document view format. It displays the statistics for the completed bulk load operation.

Modes

Ultra Pipelines: Works in Ultra Pipelines.

Consider the following points when using this Snap:

Make sure the schema of the incoming documents or existing files matches the schema of the destination table if CSV format is used. Refer to the Troubleshooting section for a workaround in this case.
Make sure when using existing files that the destination table exists.
Make sure when using incoming documents that the selected file type for the temporary file supports the data types in the incoming document. For example, CSV file format does not support arrays/lists, AVRO file format does not support DATE TIME data type, and so on.
The JSON format works for any data type and any complex schema.

Snaps in the Google BigQuery Snap Pack:

Write datetime values to the database tables, always in UTC format.
Convert any non-UTC values in the incoming data to UTC before writing them.
Treat datetime values that do not mention a time zone as UTC.

So, ensure that you include the time zone in all the datetime values that you load into Google BigQuery tables using this Snap.

For example: "2020-08-29T18:38:07.370 America/Los_Angeles", "2020-09-11T10:05:14.000-07:00", "2020-09-11T17:05:14.000Z"
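
If you want to check upstream how an offset value maps to UTC, the following Python sketch converts an ISO 8601 string that carries a UTC offset into the equivalent UTC string. It is a minimal illustration under stated assumptions: the function name to_utc_string is hypothetical, it rejects values without a time zone instead of guessing, and it does not handle named zones such as America/Los_Angeles.

```python
from datetime import datetime, timezone


def to_utc_string(value: str) -> str:
    """Convert an ISO 8601 datetime string with a UTC offset to a UTC string ending in Z."""
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        # Refuse to guess: a value without a zone would be treated as UTC on load.
        raise ValueError(f"no time zone in {value!r}")
    utc = parsed.astimezone(timezone.utc)
    # Keep millisecond precision, matching the examples above.
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{utc.microsecond // 1000:03d}Z"


print(to_utc_string("2020-09-11T10:05:14.000-07:00"))  # 2020-09-11T17:05:14.000Z
```

The second and third sample values above denote the same instant, which is what the conversion shows.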

Prerequisites:

Write access to the Google BigQuery Account and Read & Write access from/to the Google Cloud Storage account is required.

Limitations and Known Issues

When using incoming documents with arrays/lists, CSV cannot be selected as the file format for the temporary file. This will cause an error. Select JSON or AVRO instead.
When using incoming documents with date time data types, AVRO cannot be selected as the file format for the temporary file. This will cause an error. Select CSV or JSON.
When uploading incoming documents via a CSV file or uploading an existing CSV file from Google Cloud Storage, make sure the CSV file contains all the destination table columns in the same order as per the table. Otherwise, use the workarounds in the Troubleshooting section to handle a CSV file that does not contain all the table columns in the same order.

When uploading existing files from Google Cloud Storage, enabling the Create table if not present property throws an exception. To avoid this, make sure the destination table exists.

Known Issue

Copying data into a table that you re-create immediately after deleting it may cause a loss of data. This behavior is expected because of the way tables are cached and the internal table ID is propagated throughout the system. We recommend that you avoid deleting and immediately re-creating tables in BigQuery.

Workaround

Truncate the table instead of deleting it.
Add a randomly generated prefix to the table name each time you create the table, so that it has a new name every time.
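
As a sketch of the second workaround, the following Python snippet generates a table name with a random prefix. The function name fresh_table_name and the naming pattern (a letter, eight hexadecimal characters, and an underscore) are assumptions for illustration; any scheme that yields a new name on each run serves the same purpose.

```python
import uuid


def fresh_table_name(base_name: str) -> str:
    """Return base_name with a random prefix so each created table gets a new name."""
    # uuid4 is random, so two calls are extremely unlikely to collide.
    prefix = uuid.uuid4().hex[:8]
    return f"t{prefix}_{base_name}"


print(fresh_table_name("table1234"))
```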

Configurations:

Account & Access

This Snap uses account references created on the Accounts page of SnapLogic Manager to handle access to this endpoint. See Google BigQuery Account for information on setting up this type of account.

Views

Input	This Snap has at most one document input view.
Output	This Snap has exactly one document output view.
Error	This Snap has at most one document error view and produces zero or more documents in the view.

Troubleshooting:

Mismatch in the order or number of columns between the destination table and the incoming documents or existing Google Cloud Storage files in CSV format:

Incoming documents: Use JSON/AVRO format for the temp file in the Cloud Storage Bulk Load Snap.
Existing files from Google Cloud Storage: Read the file using the File Reader Snap followed by the CSV Parser Snap, then use the Google BigQuery Bulk Load (Cloud Storage) Snap and select Upload incoming documents as the Upload type and JSON or AVRO as the File format.
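
The reason the JSON workaround helps can be sketched in a few lines of Python: once each CSV row becomes a document keyed by column name, the position of a column in the file no longer matters. This only illustrates the idea behind parsing the CSV and loading it as JSON; the function name csv_to_ndjson and the sample data are assumptions.

```python
import csv
import io
import json


def csv_to_ndjson(csv_text: str) -> str:
    """Turn CSV text with a header row into newline-delimited JSON documents."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Each row is keyed by column name, so column order in the file is irrelevant.
    # Values stay as strings here; no type conversion is attempted.
    return "\n".join(json.dumps(row) for row in reader)


# Columns appear in a different order than a hypothetical (name, count) table.
sample = "count,name\n42,Emma\n17,Noah\n"
print(csv_to_ndjson(sample))
```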

Settings

Label

Specify a name for the Snap. You can modify this to be more specific, especially if you have more than one of the same Snap in your pipeline.

Project ID

Required. Project ID of the project containing the table. This is a suggestible field and can be populated based on the Account settings.

Example: project1234

Default value: [None]

Dataset ID

Required. Dataset ID of the dataset containing the table. This is a suggestible field and all the datasets in the specified project will be populated.

Example: dataset1234

Default value: [None]

Table ID

Required. Name of the target table into which the records are to be loaded. This is a suggestible field and all the tables in the datasets will be listed.

Example: table1234

Default value: [None]

Create table if not present

Specifies that the table should be created if not already present in the database.

Default value: Not selected

When selecting Upload existing files from Google Cloud Storage in Upload type

Ensure that this property is not selected when you choose the Upload existing files from Google Cloud Storage option in the Upload type property; otherwise, an exception is thrown and the Snap does not execute.

Bucket name

Name of the Google Cloud Storage bucket to be used for the operation. This is a suggestible field and will list all the buckets within the given account.

Example: project1234

Default value: [None]

Upload type

This is a drop-down menu with two options: Upload incoming documents and Upload existing files from Google Cloud Storage. It specifies the data source for the Snap: either incoming documents or existing files in the Google Cloud Storage bucket. To load data from existing files in the Google Cloud Storage bucket, select Upload existing files from Google Cloud Storage.

Default value: Upload existing files from Google Cloud Storage.

Based on the selection here, only one of the following two sections has to be configured: Properties for uploading incoming documents or Properties for uploading existing files from Google Cloud Storage.

Properties for uploading incoming documents

This sub-section has to be configured if the option Upload incoming documents was selected in the Upload type property.

File format

Select the preferred file format for the temporary file. This is a drop-down list that has three options: CSV, JSON, and AVRO.

Default value: CSV

File formats and their limitations

Select the file format based on the data types it supports. For example, if the incoming document contains arrays or lists, selecting CSV in the File format property throws an exception and the Snap does not execute; select AVRO or JSON instead. Similarly, do not use the AVRO file format for Date Time data types.

If the incoming documents do not contain all the table columns in the same order as the destination table then do not use CSV.
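
The two data type rules above can be expressed as a small pre-check. The following Python sketch inspects a sample document and returns the file formats that the rules on this page allow. The function names are hypothetical, and the check covers only the two limitations stated here (arrays or lists with CSV, datetime values with AVRO), not column order or any other constraint.

```python
from datetime import datetime


def contains_type(value, types):
    """Return True if value, or anything nested inside it, is an instance of types."""
    if isinstance(value, types):
        return True
    if isinstance(value, dict):
        return any(contains_type(v, types) for v in value.values())
    if isinstance(value, (list, tuple)):
        return any(contains_type(v, types) for v in value)
    return False


def allowed_formats(document: dict) -> list:
    """List the temporary file formats compatible with the document's data types."""
    formats = ["CSV", "JSON", "AVRO"]
    if contains_type(document, (list, tuple)):
        formats.remove("CSV")   # CSV does not support arrays/lists
    if contains_type(document, datetime):
        formats.remove("AVRO")  # AVRO does not support datetime values
    return formats


print(allowed_formats({"name": "Emma", "tags": ["a", "b"]}))  # ['JSON', 'AVRO']
```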

Temp file name

The name of the temporary file that is created on the Google Cloud Storage bucket. If a filename is not provided, then a system generated file name is used.

Default value: [None]

Preserve temp file

Specifies whether the temporary file created for the load operation is retained or deleted after the Snap's execution. When selected, the temporary file is retained; when not selected, it is deleted.

Default value: Selected

Properties for uploading existing files from Google Cloud Storage

This sub-section has to be configured if the option "Upload existing files from Google Cloud Storage" was selected in the Upload type property.

File paths

Multiple files can be added as needed. When the pipeline is executed, the output contains a corresponding number of records. The Snap groups the added files into categories (CSV with header and delimiter, CSV with delimiter but without header, JSON, and AVRO), and the output preview maintains this distinction by showing the results separately for each file type.

File format

This is a drop-down list that has three options: CSV, JSON, and AVRO.

Default value: CSV

File path

The file's location in the Google Cloud Storage bucket.

Example: gs://gcs_existingbucket/exisitng_file.csv.

Default value: [None]

CSV file contains headers

Specifies that the CSV file contains headers. Use this option to enable the Snap to differentiate between the headers and records.

Default value: Not selected

CSV delimiter

Specifies the delimiter for the CSV file. This is needed only for CSV file types.

Example: | (pipe)

Default value: , (comma)

Custom delimiters

All custom delimiters supported by BigQuery are supported as well.
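
To make the grouping of File paths entries more concrete, the following Python sketch groups file entries by format and, for CSV, by header setting and delimiter. It is only a guess at the kind of grouping described above, not the Snap's actual logic; the function name group_files, the dictionary keys, and every path except the sample path from this page are hypothetical.

```python
from collections import defaultdict


def group_files(entries):
    """Group file entries by (format, has_header, delimiter); non-CSV ignores the last two."""
    groups = defaultdict(list)
    for entry in entries:
        if entry["format"] == "CSV":
            key = ("CSV", bool(entry.get("has_header")), entry.get("delimiter", ","))
        else:
            key = (entry["format"], None, None)
        groups[key].append(entry["path"])
    return dict(groups)


entries = [
    {"format": "CSV", "path": "gs://gcs_existingbucket/exisitng_file.csv",
     "has_header": True, "delimiter": ","},
    {"format": "CSV", "path": "gs://gcs_existingbucket/no_header.csv",
     "has_header": False, "delimiter": "|"},
    {"format": "AVRO", "path": "gs://gcs_existingbucket/data.avro"},
]
for key, paths in group_files(entries).items():
    print(key, paths)
```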

Delete files upon exit

Similar to the Preserve temp file property. If this option is enabled, the files from which the data is loaded into the destination table are deleted after the Snap's execution.

Default value: Not selected

Snap execution

Select one of the three modes in which the Snap executes. Available options are:

Validate & Execute: Performs limited execution of the Snap, and generates a data preview during Pipeline validation. Subsequently, performs full execution of the Snap (unlimited records) during Pipeline runtime.
Execute only: Performs full execution of the Snap during Pipeline execution without generating preview data.
Disabled: Disables the Snap and all Snaps that are downstream from it.

Writing numeric values into Google BigQuery tables

Google BigQuery tables support columns with NUMERIC data type to allow storing big decimal numbers (up to 38 digits with nine decimal places). But Snaps in Google BigQuery Snap Pack that load data into tables cannot create numeric columns. When the Create table if not present check box is selected, the Snaps create the required table schema, but map big decimals to a FLOAT64 column. So, to store the data into numeric columns using these Snaps, we recommend the following actions:

Create the required schema, beforehand, with numeric columns in Google BigQuery.
Pass the number as a string.

The Google API converts this string into a number with full precision and saves it in the numeric column.

Example:

Value Passed Through Snap	Value Stored in BigQuery	Remarks
`"12345678901234567890123456789.123456789"`	12345678901234567890123456789.123456789	As per this issue logged in Google Issue Tracker, if you send the values as strings, the values are never converted to floating-point form, so this works as expected.
`12345678901234567890123456789.123456789`	123456789012345678000000000000	Big decimal values sent as non-string values lose precision.
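
The precision loss in the second row can be reproduced outside the Snap. The following Python sketch serializes the same value once as a string and once as a floating-point number; it only demonstrates why the string form is safer and makes no claim about how the Snap or the Google API handles the value internally.

```python
import json
from decimal import Decimal

value = Decimal("12345678901234567890123456789.123456789")

# Passing the number as a string keeps every digit.
as_string = json.dumps({"amount": str(value)})

# Converting to a float keeps only about 15 to 17 significant digits.
as_float = json.dumps({"amount": float(value)})

print(as_string)
print(as_float)

# Round-tripping the string form recovers the exact value; the float form does not.
print(Decimal(json.loads(as_string)["amount"]) == value)
print(Decimal(repr(json.loads(as_float)["amount"])) == value)
```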

Examples

Basic Use Case - 1 (Upload incoming documents)

In this example, incoming documents are used for the bulk load. It will cover two scenarios:

Temporary file is preserved.
Temporary file is deleted.

The following pipeline is executed:

The CSV Generator Snap provides the input data to the Snap. A preview of the sample data from the incoming document is shown below:

Scenario - 1 - Temporary File Preserved

The following image is the configuration of the Google BigQuery Bulk Load (Cloud Storage) Snap:

The Preserve temp file checkbox is selected and the temporary file name is specified as tempfile.csv. Upon execution, the records from the incoming CSV file are loaded into the temporary file in the gcs_cloud_1 bucket. From there, they are loaded into the destination table csvtablenew in the babynames dataset within the case16370 project.

The following is a preview of the data output from the Google BigQuery Bulk Load (Cloud Storage) Snap:

Scenario - 2 - Temporary File Deleted

The following image is the configuration of the Google BigQuery Bulk Load (Cloud Storage) Snap:

The Preserve temp file checkbox is not selected and the temporary file name is specified as tempfile.csv. Upon execution, the records from the incoming CSV file are loaded into the temporary file in the gcs_cloud_1 bucket. From there, they are loaded into the destination table csvtablenew in the babynames dataset within the case16370 project. The temporary file is then deleted.

The following is a preview of the data output from the Google BigQuery Bulk Load (Cloud Storage) Snap:

Basic Use Case - 2 (Upload Existing Files)

In this example, existing files in Google Cloud Storage are used for the bulk load. It will cover two scenarios:

Cloud Storage file is preserved.
Cloud Storage file is deleted.

The following pipeline is executed:

Scenario - 1 - Cloud Storage File Preserved

The following image is the configuration of the Google BigQuery Bulk Load (Cloud Storage) Snap:

An AVRO file is chosen as the input file and the Delete files upon exit checkbox is not selected. Upon the Snap's execution, the data from the input file is loaded into the table named avrotable in the babynames dataset within the case16370 project. The AVRO file is not deleted.

The following is a preview of the Snap's output:

Scenario - 2 - Cloud Storage File Deleted

The following image is the configuration of the Google BigQuery Bulk Load (Cloud Storage) Snap:

The following is a preview of the Snap's output:

The exported pipeline is available in the Downloads section below.

Typical Snap Configurations

The key configuration of the Snap lies in how the values are passed to the Snap. This can be done in the following ways:

Without expressions
The values are passed to the Snap directly.
With expressions
- Using pipeline parameters
  The values are passed as pipeline parameters. The parameters to be applied have to be selected by enabling the corresponding checkbox under the Capture column.

Advanced Use Case

The following describes a pipeline with broader business logic involving multiple ETL transformations. It shows how the Bulk Load functionality can typically be used in an enterprise environment.

This pipeline reads social data and writes it into an existing file in the Google Cloud Storage. Bulk load of this social data is then performed from the existing file in the Google Cloud Storage to the destination table.

Extract: Social data matching certain criteria is extracted by the Query Snap.
Transform: The extracted data from the Query Snap is transformed by the CSV Formatter Snap.
Load: The incoming data from the CSV Formatter Snap is written into a CSV file in Cloud Storage by the File Writer Snap.
Extract & Load: The Google BigQuery Bulk Load (Cloud Storage) Snap extracts the data from the existing CSV file in Cloud Storage and performs bulk load of this extracted data into the destination table.
Extract: The Google BigQuery Execute Snap extracts the data inserted into the destination table by the Google BigQuery Bulk Load (Cloud Storage) Snap.

The exported pipeline is available in the Downloads section below.

Downloads

No files shared here yet.

Snap Pack History

Release	Snap Pack Version	Date	Type	Updates
February 2025	440patches29960	13 Feb 2025	Latest	Fixed an issue with the BigQuery Upsert (Streaming) Snap where an HTTP 404 error occurred when retrieving job status for multi-regional datasets in BigQuery. This occurred because the location was not explicitly specified in the request. The job request now includes the location information, ensuring successful job polling regardless of the region (for example, the US or EU).
February 2025	main29887	12 Feb 2025	Stable	The Google BigQuery Bulk Load (Cloud Storage) Snap now displays a lint warning when the bulk load process completes successfully but the temporary file deletion fails. Enhanced the following Google BigQuery Snaps with implicit retry functionality to improve the reliability of CRUD operations by retrying after errors: BigQuery Write, BigQuery Execute, and BigQuery Bulk Load (Cloud Storage).
November 2024	439patches29574	24 Jan 2025	Latest	Fixed an issue with BigQuery Upsert (Streaming) Snap that displayed a `404 Not Found` error because of the region mismatch by ensuring the region is specified correctly in the dataset.
November 2024	439patches29499	10 Jan 2025	Latest	The BigQuery Bulk Load (Cloud Storage) Snap now displays a lint warning when the bulk load process completes successfully but the temporary file deletion fails. Enhanced the following Google BigQuery Snaps with implicit retry functionality to improve the reliability of CRUD operations in BigQuery and handle all retriable BigQuery errors: BigQuery Write, BigQuery Table Delete, BigQuery Table Data List, BigQuery Table Create, BigQuery Execute, BigQuery Dataset List, BigQuery Dataset Delete, BigQuery Dataset Create, BigQuery Table List, and BigQuery Bulk Load (Cloud Storage).
November 2024	main29029	13 Nov 2024	Stable	Updated and certified against the current SnapLogic Platform release.
August 2024	438patches28058	13 Sep 2024	Latest	Fixed an issue with the BigQuery Table Data List Snap that displayed a `null pointer exception` when the table source schema contained a nested `Array list` schema.
August 2024	main27765	21 Aug 2024	Stable	Updated and certified against the current SnapLogic Platform release.
May 2024	main26341	08 May 2024	Stable	Updated and certified against the current SnapLogic Platform release.
February 2024	main25112	14 Feb 2024	Stable	Updated and certified against the current SnapLogic Platform release.
November 2023	main23721	08 Nov 2023	Stable	Updated and certified against the current SnapLogic Platform release.
August 2023	main22460	16 Aug 2023	Stable	Updated and certified against the current SnapLogic Platform release.
May 2023	433patches22057	11 Aug 2023	Latest	Introduced the Google BigQuery Upsert (Streaming) Snap, which enables you to perform bulk update/insert operations into a BigQuery table from existing tables or any input data stream.
May 2023	433patches21955	24 Jul 2023	Latest	Fixed an issue with the GBQ-Google Service Account that caused an input stream to remain open.
May 2023	main21015	10 May 2023	Stable	Updated and certified to be compatible with the August SnapLogic Platform release.
February 2023	432patches20962	09 May 2023	Latest	Fixed an intermittent null pointer exception that occurred in the BigQuery Write Snap.
February 2023	432patches20298	28 Mar 2023	Latest	Disabled retries on truncated tables for the BigQuery Write and BigQuery Bulk Load (Streaming) Snaps. Added better error messages for error conditions in the BigQuery Table Data List Snap. Fixed an issue with the BigQuery Table Create Snap that caused an error to display when fields were separated with a comma and a space. Fixed an issue with the sorting of the Partitioning time dropdown in the BigQuery Table Create Snap.
February 2023	432patches19840	01 Mar 2023	Latest	Fixed an issue with the Google BigQuery Bulk Load (Streaming) Snap that caused the `Table not found` message to display even when Create table if not present was selected.
February 2023	main19844	09 Feb 2023	Stable	Introduced the following new Snaps: Google BigQuery Table Create Snap—enables you to create tables that support clustering and partitioning. Google BigQuery Table Data List Snap—enables you to read table data from a BigQuery dataset and lists the contents of the table in rows in the output.
November 2022	431patches19301	12 Jan 2023	Latest	The Google BigQuery Bulk Load (Streaming) Snap works as expected, with no active Timer threads remaining when the Pipeline execution fails.
November 2022	main18944	10 Nov 2022	Stable	Upgraded with the latest SnapLogic Platform release.
August 2022	main17386	11 Aug 2022	Stable	The Location field in the BigQuery Execute Snap lists all the locations in the suggestions. Introduced the following Snaps and accounts to manage datasets in Google BigQuery (GBQ): BigQuery Dataset Create: Creates datasets in the GBQ. BigQuery Dataset Delete: Removes datasets from the GBQ. BigQuery Dataset List: Lists datasets from the GBQ. BigQuery Table Delete: Removes tables from the GBQ. BigQuery Table List: Lists tables that are read from the GBQ dataset. Google Service Account JSON: Accesses the resources on GBQ using a JSON Key.
4.29	main15993	14 May 2022	Stable	Upgraded with the latest SnapLogic Platform release.
4.28 Patch	428patches15459	12 Apr 2022	Latest	Fixed an issue with Google BigQuery Execute Snap, where the Snap displayed `404 Job not found` error when calling a procedure. Fixed an issue with the Google BigQuery Bulk Load (Cloud Storage) Snap where the Snap failed, because the access token expired when it had to wait longer to execute. With this fix, the Snap is reloaded to get refreshed access token.
4.28 Patch	428patches14743	23 Feb 2022	Latest	Fixed an issue with the Google BigQuery Execute Snap, where the Snap displayed an error when the input data contained a table having the record type column and its value was null. Fixed an issue with the Google BigQuery Write Snap, when the input data contained complex data type columns (such as nested fields) and Create table if not present checkbox was selected.
4.28	main14627	12 Feb 2022	Stable	Upgraded with the latest SnapLogic Platform release.
4.27 Patch	427patches13752	12 Jan 2022	Latest	Upgraded Google BigQuery driver to 1.119.0 version to support time partition intervals by MONTH and YEAR.
4.27 Patch	427patches13615	16 Dec 2021	Latest	Fixed the table truncate 404 error with the Google BigQuery Load (Streaming) Snap by supporting the retry functionality. The Snap now waits in case of an error and retries before loading the data.
4.27 Patch	427patches12691	23 Nov 2021	Latest	Fixed an issue with the Google BigQuery Bulk Load (Cloud Storage) Snap, where the Snap failed with an exception for big query tables. The `CreateDisposition` is now set conditionally on the basis of the setting in the Create table if not present checkbox.
4.27	main12833	13 Nov 2021	Stable	Enhanced the Google BigQuery Bulk Load (Cloud Storage) Snap with the following batching and retry properties to process input records: Batching: Processes the input records in batches. Batch Size: The number of records batched per request. Batch Timeout (milliseconds): Time in milliseconds to elapse following which the batch, if not empty, will be processed even though it might be lesser than the given batch size.
4.26	main11181	14 Aug 2021	Stable	Upgraded with the latest SnapLogic Platform release.
4.25	main9554	08 May 2021	Stable	Upgraded with the latest SnapLogic Platform release.
4.24	main8556	13 Feb 2021	Stable	Upgraded with the latest SnapLogic Platform release.
4.23	main7430	14 Nov 2020	Stable	Enhanced the Google BigQuery Execute Snap and Google BigQuery accounts to enable you to choose the SQL dialect to use in the Query field. See Interpreting the SQL Query Dialect for more information. Fixed the precision loss in Google BigQuery Execute Snap output that strips millisecond values while retrieving TIMESTAMP values from Google BigQuery tables.
4.22	main6403	12 Sep 2020	Stable	Upgraded with the latest SnapLogic Platform release.
4.21	snapsmrc542	09 May 2020	Stable	Upgraded with the latest SnapLogic Platform release.
4.20 Patch	google/bigquery8773	19 Mar 2020	Latest	Fixed the NPE issue with stored procedures and DROP TABLE queries in the Google BigQuery Execute Snap.
4.20	snapsmrc535	08 Feb 2020	Stable	Upgraded with the latest SnapLogic Platform release.
4.19	snaprsmrc528	14 Nov 2019	Stable	Upgraded with the latest SnapLogic Platform release.
4.18	snapsmrc523	10 Aug 2019	Stable	Upgraded with the latest SnapLogic Platform release.
4.17	ALL7402	11 Jun 2019	Latest	Pushed automatic rebuild of the latest version of each Snap Pack to SnapLogic UAT and Elastic servers.
4.17	snapsmrc515	11 Jun 2019	Stable	Added the Snap Execution field to all Standard-mode Snaps. In some Snaps, this field replaces the existing Execute during preview check box.
4.16	snapsmrc508	16 Feb 2019	Stable	Upgraded with the latest SnapLogic Platform release.
4.15	snapsmrc500	15 Dec 2018	Stable	Upgraded with the latest SnapLogic Platform release.
4.14	snapsmrc490	11 Aug 2018	Stable	Upgraded with the latest SnapLogic Platform release.
4.13	snapsmrc486	12 May 2018	Stable	Upgraded with the latest SnapLogic Platform release.
4.12	snapsmrc480	17 Feb 2018	Stable	Added a new property Schema auto detect in the Google BigQuery Bulk Load (Cloud Storage) Snap to support CSV and JSON files where one or more columns in the source file may not contain any values.
4.11	snapsmrc465	11 Nov 2017	Stable	Added new Snap: Google BigQuery Bulk Load (Cloud Storage) Added new Snap: Google BigQuery Bulk Load (Streaming). Updated Google Big Query Write Snap with a new Create table if not present property.
4.10 Patch	google/bigquery4046	05 Oct 2017	Latest	Addressed an issue when authenticating with Dynamic OAuth accounts.
4.10	snapsmrc414	12 Aug 2017	Stable	Upgraded with the latest SnapLogic Platform release.
4.9	snapsmrc405	13 May 2017	Stable	Upgraded with the latest SnapLogic Platform release.
4.8 Patch	bigquery2952	27 Apr 2017	Latest	Supports refreshing OAuth access tokens during long-running pipeline executions. Fixed an issue with writing small batch sizes and when querying empty dataset tables.
4.8.0 Patch	bigquery2813	05 Apr 2017	Latest	Reload OAuth account from Platform when the access token expires during pipeline execution.
4.8	snapsmrc398	11 Feb 2017	Stable	Upgraded with the latest SnapLogic Platform release.
4.7	snapsmrc382	23 Nov 2016	Stable	Upgraded with the latest SnapLogic Platform release.
4.6	snapsmrc362	13 Aug 2016	Stable	Upgraded with the latest SnapLogic Platform release.
4.5.1	snapsmrc344	18 May 2016	Stable	Upgraded with the latest SnapLogic Platform release.
4.4.1	NA	18 Mar 2016	Stable	Upgraded with the latest SnapLogic Platform release.
4.4	NA	13 Feb 2016	Stable	Upgraded with the latest SnapLogic Platform release.
4.3.2	NA	15 Jan 2016	Stable	Resolved the following issues with the Google BigQuery Execute Snap: throwing binary data in stacktrace and two error messages; improve error handling for suggestions; improve error handling on bad queries; suggestion bubble missing for Destination table ID. Resolved an issue with Auto refresh token not working in Google BigQuery account.