
Snap type:

Format


Description:

This Snap formats incoming documents from upstream Snaps into the RC (Record Columnar) file format, which stores data in a column-oriented layout so that aggregate queries can be answered faster.

  • Expected upstream Snaps: The upstream Snap should output table-oriented data with columns and rows.
  • Expected downstream Snaps: The RC File Formatter Snap outputs binary data, so the downstream Snap must be a data output Snap (for example, File Writer or HDFS Writer).


Prerequisites:

[None]


Support and limitations:

Works in Ultra Task Pipelines.


Account:

Accounts are not used with this Snap. 


Views:


Input: This Snap has at most one document input view.
Output: This Snap has at most one document output view.
Error: This Snap has at most one document error view and produces zero or more documents in the view.



Settings



Label


Required. The name for the Snap. You can modify this to be more specific, especially if you have more than one of the same Snap in your pipeline.

Hive Metastore URL



The URI of the Hive Metastore, such as thrift://localhost:9083.

Example: thrift://hive.metastore.com:9083

Default value: [None]
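
The URI format shown above (thrift scheme, host, and port) can be checked before it is entered in the Snap. The following is a minimal sketch in Python, not part of the Snap itself; the helper name check_metastore_uri is hypothetical, and it assumes the value always has the form thrift://host:port as in the examples.

```python
from urllib.parse import urlparse


def check_metastore_uri(uri: str) -> tuple[str, int]:
    """Return (host, port) if uri looks like thrift://host:port.

    Hypothetical helper for illustration only; the Snap performs
    its own validation, which may differ.
    """
    parsed = urlparse(uri.strip())
    if parsed.scheme != "thrift":
        raise ValueError(f"expected a thrift:// URI, got: {uri!r}")
    # parsed.port raises ValueError itself if the port is not numeric.
    if not parsed.hostname or parsed.port is None:
        raise ValueError(f"URI must include a host and a port: {uri!r}")
    return parsed.hostname, parsed.port


if __name__ == "__main__":
    print(check_metastore_uri("thrift://hive.metastore.com:9083"))
```

A check like this would also flag stray trailing characters after the port number, which are easy to introduce when pasting the value.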


Database


The database that holds the schema for the outgoing RC file data.

Example: hive_db
 

Default value:  [None]


Table


The table whose schema should be used for formatting the outgoing RC file data.

Example: hive_tbl
 

Default value:  [None]


Column paths


Required. The paths where the column values appear in the input document. Each entry specifies a column name, a column path, and a column type.

Example:
    Column Name: Fun
    Column Path: $column_from_input_data
    Column Type: string
 

Default value:  [None]
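
To illustrate how a column path relates an input document to a row of column values, here is a minimal sketch in Python. It is not the Snap's implementation: the functions resolve_path and document_to_row are hypothetical, and the sketch assumes a simplified path syntax in which $ is followed by dot-separated field names.

```python
def resolve_path(document: dict, path: str):
    """Follow a simplified '$a.b' path through nested dicts.

    Returns None when a field is missing. Illustrative only; real
    path expressions may support more syntax than this sketch does.
    """
    if not path.startswith("$"):
        raise ValueError(f"column path must start with '$': {path!r}")
    current = document
    for key in path[1:].split("."):
        if key == "":
            continue  # tolerate a leading dot, as in '$.field'
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


# Mirrors the example setting above: name, path, and type per column.
COLUMNS = [
    {"name": "Fun", "path": "$column_from_input_data", "type": "string"},
]


def document_to_row(document: dict, columns: list) -> list:
    """Build one row by resolving each configured column path in order."""
    return [resolve_path(document, col["path"]) for col in columns]


if __name__ == "__main__":
    doc = {"column_from_input_data": "hello"}
    print(document_to_row(doc, COLUMNS))
```

In this sketch, each input document yields one row, and the order of the configured columns determines the order of values in that row.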



Troubleshooting