Hive - Execute

In this article

Overview

You can use the Hive - Execute Snap to execute arbitrary SQL. This Snap enables you to execute simple DML (SELECT, INSERT, UPDATE, DELETE) type statements. For the comprehensive scripting functionality offered by the various databases, you should use the stored procedure functionality offered by their chosen database in the Stored Procedure Snap.

This Snap works only with single queries.

You can drop your database when working with queries, so be cautious.

Snap Type

The Hive Execute Snap is a Write-type Snap.

Prerequisites

None.

Limitations and Known Issues

The Hive Snap Pack does not validate with Apache Hive JDBC v1.2.1 JARS or earlier because of a defect in Hive. HDP 2.6.3 and HDP 2.6.1 run on Apache Hive JDBC v1.2.1 JARs. 
"Method not supported" error is displayed when validating Apache Hive JDBC v1.2.1 or earlier. To validate Snaps that must work with HDP 2.6.3 and HDP 2.6.1, use JDBC v2.0.0 JARs.

Support for Ultra Pipelines

Works in Ultra Pipelines. However, we recommend not using this Snap in an Ultra Pipeline.

Snap Views

Type

Format

Number of Views

Examples of Upstream and Downstream Snaps

Description

Type

Format

Number of Views

Examples of Upstream and Downstream Snaps

Description

Input 

Document

 

  • Min: 0

  • Max: 1

Hive Insert

If the input view is defined, then the WHERE clause can substitute incoming values for a particular expression.

Valid JSON Paths that are defined in the where clause for queries/statements will be substituted with values from an incoming document.

Output

Document

  • Min: 0

  • Max: 1

  • File Reader

  • Mapper

The status message of the executed query.

Documents will be written to the error view if the document is missing a value to be substituted into the query/statement.

Error

Error handling is a generic way to handle errors without losing data or failing the Snap execution. You can handle the errors that the Snap might encounter while running the Pipeline by choosing one of the following options from the When errors occur list under the Views tab:

  • Stop Pipeline Execution: Stops the current pipeline execution if the Snap encounters an error.

  • Discard Error Data and Continue: Ignores the error, discards that record, and continues with the remaining records.

  • Route Error Data to Error View: Routes the error data to an error view without stopping the Snap execution.

Learn more about Error handling in Pipelines.

Snap Settings

  • Asterisk (*): Indicates a mandatory field.

  • Suggestion icon (): Indicates a list that is dynamically populated based on the configuration.

  • Expression icon (): Indicates whether the value is an expression (if enabled) or a static value (if disabled). Learn more about Using Expressions in SnapLogic.

  • Add icon (): Indicates that you can add fields in the field set.

  • Remove icon (): Indicates that you can remove fields from the field set.

Field Name

Field Type

Description

Label*

 

Default Value: Hive - Execute
Example: Hive - Execute

String

Specify a unique name for the Snap. You can modify this to be more specific, especially if you have more than one of the same Snap in your pipeline.

Hive Configuration Properties 

Use this field set to configure property-value pairs to set for the SQL statement. The values are used in 'set x=y' statements executed immediately before the SQL.

Property

 

Default Value: N/A
Example: name

String

Specify the name of the configuration property.

Value

 

Default Value: N/A
Example: jack

String

Specify the value of the configuration property.

SQL Statement*

 

Default Value: N/A
Example: select * from people LIMIT 10 or select * from people where name = $people.name

String/Expression

 

Specify the SQL statement to execute on the server. Document value substitution will be performed on literals starting with $ (such as $people.name will be substituted with its value in the incoming document).

The Snap does not allow to inject SQL, such as select * from people where $columName = 'abc'. Only values can be substituted since prepared statements are used for execution, which result, for example, in select * from people where address = ?

Query type

 

Default Value: Auto
Example: Read

Dropdown list/Expression

Select the type of query for your SQL statement (Read or Write).

When Auto is selected, the Snap tries to determine the query type automatically.
If the execution result of the query is not as expected, you can change the query type to Read or Write.

Pass through

 

Default Value: Selected

Checkbox

Select this checkbox if you want the input document to be passed through to the output view under the key 'original'. This property applies only to the Execute Snaps with a SELECT statement.

Ignore empty result

 

Default Value: Deselected

Checkbox

Select if you want no document written to the output view when the SELECT operation does not produce any result. If this property is not selected and the Pass through property is selected, the input document will be passed through to the output view.

Number of Retries*

 

Default Value: 0
Example: 5

Integer

Specify the maximum number of retry attempts the Snap must make in case there is a network failure and is unable to read the target file. The request is terminated if the attempts do not result in a response.

  • If the Number of retries value is set to 0 (the default value), the retry option is disabled, and the Snap does not initiate a retry. The pipeline will not attempt to retry the operation in case of a failure—any failure encountered during the database operation will immediately result in the pipeline failing without any retry attempts to recover from the errors.

  • If the Number of retries value is greater than 0, the Snap initiates a download of the target file into a temporary local file. If any error occurs during the download, the Snap waits for the time specified in the Retry interval and then attempts to download the file again from the beginning. After the download is successful, the Snap streams the data from the temporary file to the downstream pipeline. All temporary local files are deleted when they are no longer needed.

Retry Interval (seconds)*

 

Default Value: 1
Example: 11

Integer

Specify the time interval in seconds between connection attempts.

 

Default Value: Execute only
Example: Validate & Execute

Dropdown list

 

Example

This example demonstrates creating the DROP, CREATE, INSERT, and SELECT queries in the Hive database using the Hive-Execute Snap authenticated against the Cloudera Data Platform through the Generic Hive Database account.

Prerequisites:

  • A valid Generic Hive Database Account.

Configure the Generic Hive Database account as follows:

Configure the Hive - Execute Snaps as shown in the table below to create queries in the following order and validate the pipeline afterward:

Query

Configuration

Output

Query

Configuration

Output

DROP query

The following query drops Doc_demo table from the database.

DROP TABLE Doc_demo

 

CREATE query:

The following query creates a table:

CREATE TABLE IF NOT EXISTS Doc_demo(ID INT,NAME STRING,AGE INT,ADDRESS STRING,SALARY INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY ','

INSERT query:

The following query inserts the records of the members into the Doc_demo table:

INSERT INTO Doc_demo
VALUES (1, 'Ramesh', 32, 'Ahmedabad', 2000.00 ), (2, 'Khilan', 25, 'Delhi', 1500.00 ) ,(3, 'kaushik', 23, 'Kota', 2000.00 ), (4, 'Chaitali', 25, 'Mumbai', 6500.00 ),(5, 'Hardik', 27, 'Bhopal', 8500.00 ),
(6, 'Komal', 22, 'MP', 4500.00 )

SELECT query:

The following query retrieves all records from Doc_demo table:

Select * from Doc_demo

Snap Pack History

Related Content:

Â