
Snap type:

Write


Description:

This Snap exports data from Teradata and directly loads it into Hadoop (HDFS). 

Note
This Snap does not support kerberized Hadoop environments.


Note

The Snap executes the SQL statement you provide as-is; a destructive statement can drop database objects, so use caution.


Valid JSON paths defined in the where clause of queries/statements are substituted with values from the incoming document. If the document is missing a value to be substituted into the query/statement, it is written to the error view.

If a select query is executed, the query's results are merged into the incoming document, overwriting the values of any existing keys. If the query returns no results, the original document is written unchanged.
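The merge behavior described above can be sketched as follows. This is an illustrative Python sketch of the documented semantics, not the Snap's actual implementation:

```python
def merge_result(incoming, rows):
    """Sketch of the documented merge semantics (illustrative only).

    If the select query returned rows, merge each row into a copy of the
    incoming document, overwriting existing keys; otherwise pass the
    original document through unchanged.
    """
    if not rows:
        return [incoming]          # no results: original document is written
    merged = []
    for row in rows:
        doc = dict(incoming)       # keep the incoming fields
        doc.update(row)            # query results overwrite existing keys
        merged.append(doc)
    return merged
```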


  • Expected upstream Snaps: Any Snap that provides a document output view, such as Structure or a JSON Generator Snap.
  • Expected downstream Snaps: A Snap monitoring the exit code to check result status.
  • Expected input: None.
  • Expected output: A single document containing the console output and return status from the external Teradata application for each input document.


Prerequisites:

Teradata Connector for Hadoop (v1.5.1). See Account for information on the necessary jar file.


Support and limitations:

Works in Ultra Pipelines.
Account: 

This Snap uses account references created on the Accounts page of SnapLogic Manager to handle access to this endpoint. One additional jar must be added to the JDBC Driver jars on the Teradata account page: the teradata-connector jar from the Teradata Connector for Hadoop (TDCH) package. You can find it by installing the package on one system and looking in the /usr/lib/tdch/1.5/lib directory. It is safest to also use the terajdbc4.jar and tdgssconfig.jar files from the same directory unless you have a specific need for a different version of the jars. See Configuring Teradata Database Accounts for information on setting up this type of account.


Views:


Input

This Snap allows zero or one input views. If the input view is defined, then the where clause can substitute incoming values for a given expression.

Output

This Snap has exactly one output view and produces documents in the view. The output fields of a single view are:

    • OUT - The console output from the subprocess

    • OUTPUT SUMMARY - The count of the input, output, and skipped records

    • err - The console error output from the subprocess

    • TERADATA STATUS - The exit code of the subprocess

    • CLASSPATH - The classpath used by the subprocess. This identifies the location of all jar files.

    • ENVIRONMENT - The full set of environment variables seen by the subprocess. This identifies the location of the Hadoop configuration files.
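For orientation, a successful run might produce a document shaped like the following. The field names come from the list above; all values here are hypothetical:

```python
# Hypothetical output document; only the six field names come from the docs.
output_doc = {
    "OUT": "INFO tool.ConnectorExportTool: job completed",   # console output
    "OUTPUT SUMMARY": {"input": 100, "output": 100, "skipped": 0},
    "err": "",                       # console error output, empty on success
    "TERADATA STATUS": 0,            # exit code; 0 means the TDCH job succeeded
    "CLASSPATH": "/opt/jars/teradata-connector-1.5.1.jar",   # assumed path
    "ENVIRONMENT": {"HADOOP_CONF_DIR": "/etc/hadoop/conf"},  # assumed value
}
```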


Error

This Snap has at most one error view and produces zero or more documents in the view.
Only internal exceptions are written to the error view. Subprocess failures (a non-zero exit code) are written to the output view so that the user can see the full console output and environment. It is important to check the status field in the output view.
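Because subprocess failures still arrive on the output view, a downstream check on the status field might look like this. This is a hedged sketch of the kind of check a downstream Snap or script could perform, not a SnapLogic API:

```python
def check_status(doc):
    """Fail loudly when the TDCH subprocess exited with a non-zero code.

    The full console output remains in the document for debugging.
    """
    status = doc.get("TERADATA STATUS")
    if status != 0:
        raise RuntimeError(
            f"TDCH export failed with exit code {status}: {doc.get('err', '')}"
        )
    return doc
```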

 

  

Settings

Label


Required. The name for the Snap. You can modify this to be more specific, especially if you have more than one of the same Snap in your pipeline.


SQL Statement



Required. SQL statement to execute on the Teradata server. Document value substitution is performed on literals starting with $ (for example, $people.name is replaced with its value from the incoming document).

The Snap does not allow SQL to be injected, such as select * from people where $columnName = 'abc'.
Only values can be substituted, because the statement is executed as a prepared statement, resulting in, for example, select * from people where address = ?

Example: select * from people LIMIT 10 or select * from people where name = $people.name

Default value: [None]
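The value-only substitution described above can be sketched as follows. The helper name and regex are assumptions for illustration; the Snap's actual parsing is not documented here:

```python
import re

def to_prepared(statement, document):
    """Sketch of value-only substitution into a prepared statement.

    Each $json.path literal is replaced with a '?' placeholder, and its
    value from the incoming document is collected as a bind parameter.
    """
    params = []

    def lookup(match):
        value = document
        for key in match.group(1).split('.'):
            value = value[key]   # a missing key would route the doc to the error view
        params.append(value)
        return '?'

    sql = re.sub(r'\$([\w.]+)', lookup, statement)
    return sql, params
```

For example, `select * from people where name = $people.name` with `{"people": {"name": "Alice"}}` becomes `select * from people where name = ?` with `"Alice"` bound as the single parameter.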

Number of retries

See the field description in Teradata Execute.

Retry interval

See the field description in Teradata Execute.
HDFS destination  

 

Directory

Required. The HDFS directory where the output files will be written. This directory must not already exist. 

Example: people

Default value: hdfs://<hostname>:<port>/   

TDCH conversion properties


Separator


Required. Field separator in text file output. The available options are Comma, Tab, Pipe (|).

Default value: Comma 

Number of mappers


The number of mappers to use to export table data from Teradata. The degree of parallelism for these TDCH jobs is defined by the number of mappers (a Snap configuration) used by the MapReduce job. The number of mappers also defines the number of files created in the HDFS location.

More mappers lead to faster execution; however, the number of mappers is limited by the number of nodes in the cluster and the available bandwidth.

Default value: 2 
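Since each mapper writes its own output file, the standard MapReduce map-output naming convention makes the relationship easy to see. This sketch assumes the conventional part-file names; verify against your HDFS destination directory:

```python
num_mappers = 2  # the Snap's default

# Each mapper writes one part file into the HDFS destination directory,
# following the conventional MapReduce map-output naming scheme.
part_files = [f"part-m-{i:05d}" for i in range(num_mappers)]
```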

Snap Execution

See the field description in SOAP Execute.

Example


In this example pipeline, the Teradata Export to HDFS Snap executes a SQL query and publishes the results to an HDFS directory.

   

The Teradata Export to HDFS Snap selects the documents from the table "ADW_SNAPL"."Channel" and publishes them to the HDFS destination directory path.



Successful execution of the pipeline displays the below output:  

 
See Also

Teradata Snap Pack