Skip to end of banner
Go to start of banner

Hive - Execute

Skip to end of metadata
Go to start of metadata

You are viewing an old version of this page. View the current version.

Compare with Current View Page History

« Previous Version 29 Current »

In this article

Overview

You can use the Hive - Execute Snap to execute arbitrary SQL. This Snap enables you to execute simple DML (SELECT, INSERT, UPDATE, DELETE) type statements. For the comprehensive scripting functionality offered by the various databases, you should use the stored procedure functionality offered by their chosen database in the Stored Procedure Snap.

This Snap works only with single queries.

You can drop your database when working with queries, so be cautious.

Snap Type

The Hive Execute Snap is a Write-type Snap.

Prerequisites

None.

Limitations and Known Issues

The Hive Snap Pack does not validate with Apache Hive JDBC v1.2.1 JARS or earlier because of a defect in Hive. HDP 2.6.3 and HDP 2.6.1 run on Apache Hive JDBC v1.2.1 JARs. 
"Method not supported" error is displayed when validating Apache Hive JDBC v1.2.1 or earlier. To validate Snaps that must work with HDP 2.6.3 and HDP 2.6.1, use JDBC v2.0.0 JARs.

Support for Ultra Pipelines

Works in Ultra Pipelines. However, we recommend not using this Snap in an Ultra Pipeline.

Snap Views

Type

Format

Number of Views

Examples of Upstream and Downstream Snaps

Description

Input 

Document

 

  • Min: 0

  • Max: 1

Hive Insert

If the input view is defined, then the WHERE clause can substitute incoming values for a particular expression.

Valid JSON Paths that are defined in the where clause for queries/statements will be substituted with values from an incoming document.

Output

Document

  • Min: 0

  • Max: 1

  • File Reader

  • Mapper

The status message of the executed query.

Documents will be written to the error view if the document is missing a value to be substituted into the query/statement.

Error

Error handling is a generic way to handle errors without losing data or failing the Snap execution. You can handle the errors that the Snap might encounter while running the Pipeline by choosing one of the following options from the When errors occur list under the Views tab:

  • Stop Pipeline Execution: Stops the current pipeline execution if the Snap encounters an error.

  • Discard Error Data and Continue: Ignores the error, discards that record, and continues with the remaining records.

  • Route Error Data to Error View: Routes the error data to an error view without stopping the Snap execution.

Learn more about Error handling in Pipelines.

Snap Settings

  • Asterisk (*): Indicates a mandatory field.

  • Suggestion icon ((blue star)): Indicates a list that is dynamically populated based on the configuration.

  • Expression icon ((blue star)): Indicates whether the value is an expression (if enabled) or a static value (if disabled). Learn more about Using Expressions in SnapLogic.

  • Add icon ((blue star)): Indicates that you can add fields in the field set.

  • Remove icon ((blue star)): Indicates that you can remove fields from the field set.

Field Name

Field Type

Description

Label*

 

Default ValueHive - Execute
Example: Hive - Execute

String

Specify a unique name for the Snap. You can modify this to be more specific, especially if you have more than one of the same Snap in your pipeline.

Hive Configuration Properties 

Use this field set to configure property-value pairs to set for the SQL statement. The values are used in 'set x=y' statements executed immediately before the SQL.

Property

Default Value: N/A
Example: name

String

Specify the name of the configuration property.

Value

Default Value: N/A
Example: jack

String

Specify the value of the configuration property.

SQL Statement*

Default Value: N/A
Exampleselect * from people LIMIT 10 or select * from people where name = $people.name

String/Expression

Specify the SQL statement to execute on the server. Document value substitution will be performed on literals starting with $ (such as $people.name will be substituted with its value in the incoming document).

The Snap does not allow to inject SQL, such as select * from people where $columName = 'abc'. Only values can be substituted since prepared statements are used for execution, which result, for example, in select * from people where address = ?

  • We recommend you to add a single query in the SQL Statement field.

  • The '$' sign and identifier characters, such as double quotes (“), single quotes ('), or back quotes (`), are reserved characters and should not be used in comments or for purposes other than their originally intended purpose.

  • If a select query is executed, the query's results are merged into the incoming document and any existing keys will have their values overwritten. On the other hand, the original document is written if there are no results from the query.

Query type

 

Default Value: Auto
Example: Read

Dropdown list/Expression

Select the type of query for your SQL statement (Read or Write).

When Auto is selected, the Snap tries to determine the query type automatically.
If the execution result of the query is not as expected, you can change the query type to Read or Write.

Pass through

Default Value: Selected

Checkbox

Select this checkbox if you want the input document to be passed through to the output view under the key 'original'. This property applies only to the Execute Snaps with a SELECT statement.

Ignore empty result

Default Value: Deselected

Checkbox

Select if you want no document written to the output view when the SELECT operation does not produce any result. If this property is not selected and the Pass through property is selected, the input document will be passed through to the output view.

Number of Retries*

 

Default Value0
Example5

Integer

Specify the maximum number of reconnections in case of a connection failure or timeout.

Retry Interval (seconds)*

 

Default Value1
Example11

Integer

Specify the time interval in seconds between connection attempts.

Snap execution

 

Default ValueExecute only
ExampleValidate & Execute

Dropdown list

Select one of the three modes in which the Snap executes. Available options are:

  • Validate & Execute: Performs limited execution of the Snap, and generates a data preview during Pipeline validation. Subsequently, performs full execution of the Snap (unlimited records) during Pipeline runtime.
  • Execute only: Performs full execution of the Snap during Pipeline execution without generating preview data.
  • Disabled: Disables the Snap and all Snaps that are downstream from it.

Example

This example demonstrates creating the DROP, CREATE, INSERT, and SELECT queries in the Hive database using the Hive-Execute Snap authenticated against the Cloudera Data Platform through the Generic Hive Database account.

Prerequisites:

  • A valid Generic Hive Database Account.

Configure the Generic Hive Database account as follows:

Configure the Hive - Execute Snaps as shown in the table below to create queries in the following order and validate the pipeline afterward:

DROP query

The following query drops Doc_demo table from the database.

DROP TABLE Doc_demo

CREATE query:

The following query creates a table:

CREATE TABLE IF NOT EXISTS Doc_demo(ID INT,NAME STRING,AGE INT,ADDRESS STRING,SALARY INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY ','

INSERT query:

The following query inserts the records of the members into the Doc_demo table:

INSERT INTO Doc_demo
VALUES (1, 'Ramesh', 32, 'Ahmedabad', 2000.00 ), (2, 'Khilan', 25, 'Delhi', 1500.00 ) ,(3, 'kaushik', 23, 'Kota', 2000.00 ), (4, 'Chaitali', 25, 'Mumbai', 6500.00 ),(5, 'Hardik', 27, 'Bhopal', 8500.00 ),
(6, 'Komal', 22, 'MP', 4500.00 )

SELECT query:

The following query retrieves all records from Doc_demo table:

Select * from Doc_demo

Snap Pack History

 Click here to expand...

Release 

Snap Pack Version

Date

Type

  Updates

May 2024

main26341

08 May 2024 

Stable

Updated and certified against the current SnapLogic Platform release.

February 2024

main25112

Stable

Updated and certified against the current SnapLogic Platform release.

November 2023

main23721

Stable

Updated and certified against the current SnapLogic Platform release.

August 2023

main22460

Stable

The Hive-Execute Snap now includes a new Query type field. When Auto is selected, the Snap tries to determine the query type automatically.

May 2023

main21015

Stable

The Hive Snap Pack is Cloudera-certified for Cloudera Data Warehouse (CDW). You can use the Hive Execute Snap to work with CDW clusters through a Generic Hive Database account.

February 2023

main19844

09 Feb 2023 

Stable

Upgraded with the latest SnapLogic Platform release.

November 2022

main18944

10 Nov 2022 

Stable

Upgraded with the latest SnapLogic Platform release.

August 2022

main17386

11 Aug 2022  

Stable

Upgraded with the latest SnapLogic Platform release.

4.29

main15993

14 May 2022 

Stable

Upgraded with the latest SnapLogic Platform release.

4.28

main14627

12 Feb 2022 

Stable

Upgraded with the latest SnapLogic Platform release.

4.27

main12833

13 Nov 2021 

Stable

Upgraded with the latest SnapLogic Platform release.

4.26

main11181

14 Aug 2021 

Stable

Upgraded with the latest SnapLogic Platform release.

4.25

main9554

08 May 2021 

Stable

Upgraded with the latest SnapLogic Platform release.

4.24 Patch

424patches8867

11 Mar 2021 

Latest

Fixes the missing library error in Hive Snap Pack when running Hadoop Pipelines in JDK11 runtime.

4.24

main8556

13 Feb 2021

Stable

Upgraded with the latest SnapLogic Platform release.

4.23

main7430

14 Nov 2020 

Stable

Upgraded with the latest SnapLogic Platform release.

4.22

main6403

12 Sep 2020 

Stable

Upgraded with the latest SnapLogic Platform release.

4.21 Patch

421patches6272

27 Jul 2020 

Latest

Fixes the issue where Snowflake SCD2 Snap generates two output documents despite no changes to Cause-historization fields with DATE, TIME and TIMESTAMP Snowflake data types, and with Ignore unchanged rows field selected.

4.21 Patch

421patches6144

02 Jul 2020 

Latest

Fixes the following issues with DB Snaps:

  • The connection thread waits indefinitely causing the subsequent connection requests to become unresponsive.

  • Connection leaks occur during Pipeline execution.

4.21 Patch

421patches5851

08 Jun 2020 

Latest

Fixes the Hive Execute Snap that fails with a java.lang.NullPointerException error.

4.21 Patch

MULTIPLE8841

19 May 2020 

Latest

Fixes the connection issue in Database Snaps by detecting and closing open connections after the Snap execution ends.

4.21

snapsmrc542

09 May 2020 

Stable

Upgraded with the latest SnapLogic Platform release.

4.20

snapsmrc535

08 Feb 2020 

Stable

Upgraded with the latest SnapLogic Platform release.

4.19

snaprsmrc528

14 Nov 2019 

Stable

Upgraded with the latest SnapLogic Platform release.

4.18

snapsmrc523

10 Aug 2019 

Stable

Upgraded with the latest SnapLogic Platform release.

4.17

ALL7402

11 Jun 2019 

Latest

Pushed automatic rebuild of the latest version of each Snap Pack to SnapLogic UAT and Elastic servers.

4.17

snapsmrc515

11 Jun 2019 

Stable

  • Certified and tested the Snap Pack against CDH 6.1.

  • Fixes an issue with the Hive Execute Snap wherein the Snap would send the input document to the output view even if the Pass through field is not selected in the Snap configuration. With this fix, the Snap sends the input document to the output view, under the key original, only if you select the Pass through field.

  • Adds the Snap Execution field to all Standard-mode Snaps. In some Snaps, this field replaces the existing Execute during preview check box.

  • Adds a new authentication method, User ID and Password with SSL, for Hive SSL Accounts which allows SSL connections for valid user name and password credentials.

4.16

snapsmrc508

16 Feb 2019 

Stable

Upgraded with the latest SnapLogic Platform release.

4.15 Patch 

db/hive6330

05 Dec 2018 

Latest

Replaced Max idle time and Idle connection test period properties with Max life time and Idle Timeout properties respectively, in the Account configuration. The new properties fix the connection release issues that were occurring due to default/restricted DB Account settings.

4.15

snapsmrc500

15 Dec 2018 

Stable

Added Hive HA support for Zookeeper.

4.14

snapsmrc490

11 Aug 2018 

Stable

Added a new account type: Generic Hive Database Account, this enables connecting to different types of clusters using JDBC URL.

4.13 Patch 

db/hive5269

07 Jun 2018 

Latest

Fixes the Hive Execute Snap that stores account passwords in plain text in the log file. 

4.13

snapsmrc486

12 May 2018 

Stable

Upgraded with the latest SnapLogic Platform release.

4.12

snapsmrc480

17 Feb 2018 

Stable

Upgraded with the latest SnapLogic Platform release.

4.11

snapsmrc465

11 Nov 2017 

Stable

Upgraded with the latest SnapLogic Platform release.

4.10

snapsmrc414

12 Aug 2017 

Stable

Upgraded with the latest SnapLogic Platform release.

4.9 Patch

hive3068

01 Jun 2017 

Latest

  • Fixes an issue regarding connection not closed after login failure; Expose autocommit for "Select into" statement in PostgreSQL Execute Snap and Redshift Execute Snap

4.9

snapsmrc405

13 May 2017 

Stable

  • Hive - Execute Snap is tested on Cloudera Version 5.8.

  • Hive - Execute Snap(Kerberos) now works on Groundplex.

4.8 Patch

hive2752

27 Mar 2017 

Latest

Potential fix for JDBC deadlock issue.

4.8

snapsmrc398

11 Feb 2017 

Stable

  • Info tab added to accounts.

  • Database accounts now invalidate connection pools if account properties are modified and login attempts fail.

4.7.0 Patch

hive2469

17 Jan 2017 

Latest

Addresses an issue with ClouderaHiveJDBCDriver(500168) Unable to connect to server: GSS initiate failed, Fixes by changing the connection pooling to Hikari and added privileged user to all getConnect() request.

4.7.0 Patch

hive2199

28 Nov 2016 

Latest

Fixes an issue for database Select Snaps regarding Limit rows not supporting an empty string from a pipeline parameter.

4.7

snapsmrc382

23 Nov 2016 

Stable

  • The editor box for the SQL statement property in certain database Snaps can now be resized to make it easier to read the contents. This setting is in the Execute Snaps for Cassandra, Hive, JDBC, Oracle, MySQL, SQL Server, PostgreSQL, SAP HANA, Vertica, and Teradata.

  • Enabled the Hive account with Kerberos authentication (Hive with Kerberos works only on Hive JDBC4 driver 2.5.12 and above).

4.6 Patch

hive1958

05 Oct 2016 

Latest

Resolved a performance issue with Hive Execute and JDBC Execute Snaps when running Hive Queries.

4.6

snapsmrc362

13 Aug 2016 

Stable

Snap Pack introduced in 4.6.0. This includes only a Hive Execute Snap that executes DML and DDL statements with Kerberos enabled. It does not include Snaps for load, select, insert, delete, execute or others at this time. Tested only on Cloudera CDH 5.3 & 5.5, Hortonworks HDP 2.3.4.

Related Content:

  • No labels