
Snap type:

Write


Description:

This Snap exports data from Teradata and directly loads it into Hadoop (HDFS). 

Note
This Snap does not support kerberized Hadoop environments.


Note

The Snap executes the SQL statement you provide as-is; a destructive statement can drop database objects, so use caution.


Valid JSON paths defined in the where clause of queries/statements are substituted with values from the incoming document. If the document is missing a value to be substituted into the query/statement, it is written to the error view.

If a select query is executed, the query's results are merged into the incoming document, overwriting the values of any existing keys. If the query returns no results, the original document is written unchanged.
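The merge behavior described above can be sketched as follows. This is an illustrative Python sketch of the documented semantics, not the Snap's actual implementation:

```python
def merge_result(incoming, rows):
    """Sketch of the documented merge semantics (illustrative only).

    If the select query returned rows, merge each row into a copy of the
    incoming document, overwriting existing keys; otherwise pass the
    original document through unchanged.
    """
    if not rows:
        return [incoming]          # no results: original document is written
    merged = []
    for row in rows:
        doc = dict(incoming)       # keep the incoming fields
        doc.update(row)            # query results overwrite existing keys
        merged.append(doc)
    return merged
```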


  • Expected upstream Snaps: Any Snap that provides a document output view, such as Structure or a JSON Generator Snap.
  • Expected downstream Snaps: A Snap monitoring the exit code to check result status.
  • Expected input: None.
  • Expected output: A single document containing the console output and return status from the external Teradata application for each input document.


Prerequisites:

Teradata Connector for Hadoop (v1.5.1). See Account for information on the necessary jar file.


Support and limitations:

Works in Ultra Pipelines.
Account: 

This Snap uses account references created on the Accounts page of SnapLogic Manager to handle access to this endpoint. One additional jar must be added to the JDBC Driver jars on the Teradata account page: the teradata-connector jar from the Teradata Connector for Hadoop (TDCH) package. You can find it by installing the package on one system and looking in the /usr/lib/tdch/1.5/lib directory. It is safest to also use the terajdbc4.jar and tdgssconfig.jar files from the same directory unless you have a specific need for a different version of the jars. See Configuring Teradata Database Accounts for information on setting up this type of account.


Views:


Input

This Snap allows zero or one input views. If the input view is defined, then the where clause can substitute incoming values for a given expression.

Output

This Snap has exactly one output view and produces documents in the view. The output fields of a single view are:

    • OUT - The console output from the subprocess

    • OUTPUT SUMMARY - The count of the input, output, and skipped records

    • err - The console error output from the subprocess

    • TERADATA STATUS - The exit code of the subprocess

    • CLASSPATH - The classpath used by the subprocess. This identifies the location of all jar files.

    • ENVIRONMENT - The full set of environment variables seen by the subprocess. This identifies the location of the Hadoop configuration files.
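For orientation, a successful run might produce a document shaped like the following. The field names come from the list above; all values here are hypothetical:

```python
# Hypothetical output document; only the six field names come from the docs.
output_doc = {
    "OUT": "INFO tool.ConnectorExportTool: job completed",   # console output
    "OUTPUT SUMMARY": {"input": 100, "output": 100, "skipped": 0},
    "err": "",                       # console error output, empty on success
    "TERADATA STATUS": 0,            # exit code; 0 means the TDCH job succeeded
    "CLASSPATH": "/opt/jars/teradata-connector-1.5.1.jar",   # assumed path
    "ENVIRONMENT": {"HADOOP_CONF_DIR": "/etc/hadoop/conf"},  # assumed value
}
```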


Error

This Snap has at most one error view and produces zero or more documents in the view.
Only internal exceptions are written to the error view. Subprocess failures (a non-zero exit code) are written to the output view so that the user can see the full console output and environment. It is important to check the status field in the output view.
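Because subprocess failures still arrive on the output view, a downstream check on the status field might look like this. This is a hedged sketch of the kind of check a downstream Snap or script could perform, not a SnapLogic API:

```python
def check_status(doc):
    """Fail loudly when the TDCH subprocess exited with a non-zero code.

    The full console output remains in the document for debugging.
    """
    status = doc.get("TERADATA STATUS")
    if status != 0:
        raise RuntimeError(
            f"TDCH export failed with exit code {status}: {doc.get('err', '')}"
        )
    return doc
```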

 

  

Settings

Label


Required. The name for the Snap. You can modify this to be more specific, especially if you have more than one of the same Snap in your pipeline.


SQL Statement



Required. SQL statement to execute on the Teradata server. Document value substitution is performed on literals starting with $ (for example, $people.name is replaced with its value from the incoming document).

The Snap does not allow SQL to be injected, such as select * from people where $columnName = 'abc'.
Only values can be substituted, because the statement is executed as a prepared statement, resulting in, for example, select * from people where address = ?

Example: select * from people LIMIT 10 or select * from people where name = $people.name

Default value: [None]
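The value-only substitution described above can be sketched as follows. The helper name and regex are assumptions for illustration; the Snap's actual parsing is not documented here:

```python
import re

def to_prepared(statement, document):
    """Sketch of value-only substitution into a prepared statement.

    Each $json.path literal is replaced with a '?' placeholder, and its
    value from the incoming document is collected as a bind parameter.
    """
    params = []

    def lookup(match):
        value = document
        for key in match.group(1).split('.'):
            value = value[key]   # a missing key would route the doc to the error view
        params.append(value)
        return '?'

    sql = re.sub(r'\$([\w.]+)', lookup, statement)
    return sql, params
```

For example, `select * from people where name = $people.name` with `{"people": {"name": "Alice"}}` becomes `select * from people where name = ?` with `"Alice"` bound as the single parameter.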

Number of retries

See the field description in Teradata Execute.

Retry interval

See the field description in Teradata Execute.
HDFS destination  

 

Directory

Required. The HDFS directory where the output files will be written. This directory must not already exist. 

Example: people

Default value: hdfs://<hostname>:<port>/   

TDCH conversion properties


Separator


Required. Field separator in text file output. The available options are Comma, Tab, Pipe (|).

Default value: Comma 

Number of mappers


The number of mappers to use to export table data from Teradata. The degree of parallelism for these TDCH jobs is defined by the number of mappers (a Snap configuration) used by the MapReduce job. The number of mappers also defines the number of files created in the HDFS location.

More mappers lead to faster execution; however, the number of mappers is limited by the number of nodes in the cluster and the available bandwidth.

Default value: 2 
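Since each mapper writes its own output file, the standard MapReduce map-output naming convention makes the relationship easy to see. This sketch assumes the conventional part-file names; verify against your HDFS destination directory:

```python
num_mappers = 2  # the Snap's default

# Each mapper writes one part file into the HDFS destination directory,
# following the conventional MapReduce map-output naming scheme.
part_files = [f"part-m-{i:05d}" for i in range(num_mappers)]
```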

Snap Execution

See the field description in SOAP Execute.

Example


In this example pipeline, the Teradata Export to HDFS Snap executes a SQL query and publishes the results to an HDFS directory.

   

The Teradata Export to HDFS Snap selects the documents from the table "ADW_SNAPL"."Channel" and publishes them to the HDFS destination directory path.



Successful execution of the pipeline displays the below output:  

 
See Also

Teradata Snap Pack