Snap type: | Parse | |||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Description: | This Snap reads CSV binary data from its input view, parses it, and writes the parsed data as documents to its output view. | |||||||||||||
Prerequisites: | [None] | |||||||||||||
Support and limitations: | Works in Ultra Task Pipelines. | |||||||||||||
Account: | Accounts are not used with this Snap. | |||||||||||||
Views: |
| |||||||||||||
Settings | ||||||||||||||
Label | Required. The name for the Snap. You can modify this to be more specific, especially if you have more than one of the same Snap in your pipeline. | |||||||||||||
Quote character | The Quote character property specifies the character to be used for a quote. As of 4.3.2, this property can be an expression, which is evaluated with the values from the Pipeline parameters.
Default value: " | |||||||||||||
Delimiter | Required. The Delimiter property specifies the character to be used as a delimiter in parsing the CSV data. If a tab is used as the delimiter, enter "\t" instead of pressing the Tab key. Any Unicode character is also supported. As of 4.3.2, this property can be an expression, which is evaluated with the values from the pipeline parameters. | |||||||||||||
Escape character | The escape character used when parsing rows. Only single characters are supported. As of 4.3.2, this property can be an expression, which is evaluated with the values from the pipeline parameters. Leave this property empty if no escape character is used in the input CSV data. Default value: \ | |||||||||||||
Skip lines | Required. The Skip lines property specifies the number of lines in the input data to be skipped before the Snap starts parsing.
Example: 0 | |||||||||||||
Contains header | The Contains header property specifies whether the input data contains the CSV header or not. | |||||||||||||
Column names | The Column names property is a composite table property, which contains the Columns property in it. This property is ignored if the second input view is used for the CSV metadata. | |||||||||||||
Header | Specifies the list of headers to be used as a CSV header in formatting when the Contains header property is deselected. Example:
Default value: [None] | |||||||||||||
Validate headers | This option specifies whether or not the headers from the input data should be validated against the Column names table property. If this option is checked, the Snap throws an exception when they do not match exactly. | |||||||||||||
Header size error policy | Defines how to handle header size errors, that is, records that do not match the header columns in the CSV file. This error condition occurs if a record in the input has fewer or more columns than the header. The available options are:
Default value: Both Example: Trim record to fit header | |||||||||||||
Character set | This setting lets you select the character set in which input CSV data is encoded. The supported selections are:
Default value: Auto BOM detect. | |||||||||||||
Ignore empty data | Set this property to false to produce an empty output document when the input CSV data is empty (either an empty binary stream or a binary stream with CSV headers only). This is useful if the downstream Snaps should be executed whether or not the input CSV data is empty. Default value: True | |||||||||||||
Preserve Surrounding Spaces | Select this checkbox to preserve the surrounding spaces for the values that are non-quoted.
For example, if you are using data with a delimiter as follows:
| |||||||||||||
|
|
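As a rough illustration of how the Delimiter, Quote character, Escape character, Skip lines, and Contains header settings interact, the following Python sketch mimics them with the standard `csv` module. It is not the Snap's implementation; the function name `parse_csv` and its parameter names are hypothetical and were chosen only to mirror the setting labels above.

```python
import csv
import io


def parse_csv(raw, delimiter=",", quote_char='"', escape_char="\\",
              skip_lines=0, contains_header=True, column_names=None,
              encoding="utf-8"):
    """Parse CSV bytes into a list of dicts, one dict per record.

    Hypothetical helper: parameter names mirror the Snap settings,
    but the behavior is that of Python's csv module, not the Snap.
    """
    stream = io.StringIO(raw.decode(encoding), newline="")

    # Skip lines: discard leading lines before parsing starts.
    for _ in range(skip_lines):
        stream.readline()

    reader = csv.reader(
        stream,
        delimiter=delimiter,              # use "\t" for tab-delimited data
        quotechar=quote_char,
        escapechar=escape_char or None,   # empty means no escape character
    )

    if contains_header:
        # Contains header: the first remaining line supplies the field names.
        header = next(reader, None)
        if header is None:
            return []
    else:
        # Otherwise the names come from the caller (the Column names table).
        header = list(column_names or [])

    return [dict(zip(header, row)) for row in reader]
```

For example, calling `parse_csv(data, delimiter="\t", skip_lines=1)` on tab-delimited bytes drops the first line, reads the next line as the header, and returns one dictionary per remaining record.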
Note |
---|
You must either select Contains header or specify a Column name in order for validation on the pipeline to work. |
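The Header size error policy setting in the Settings table can be pictured with a small Python sketch. The policy names `"trim"` and `"fail"` are hypothetical stand-ins: `"trim"` corresponds to the "Trim record to fit header" option named above, while `"fail"` and the padding of short records with `None` are assumptions made for illustration, not documented Snap behavior.

```python
def apply_header_size_policy(header, record, policy="trim"):
    """Map one parsed record onto the header under a size-mismatch policy.

    Hypothetical helper for illustration only.
    """
    if len(record) == len(header):
        return dict(zip(header, record))

    if policy == "fail":
        # Assumed behavior: reject any record whose size differs from the header.
        raise ValueError(
            f"record has {len(record)} values, header has {len(header)}"
        )
    if policy != "trim":
        raise ValueError(f"unknown policy: {policy!r}")

    # Trim record to fit header: drop values beyond the header length.
    fitted = list(record[:len(header)])
    # Assumption: records shorter than the header are padded with None.
    fitted += [None] * (len(header) - len(fitted))
    return dict(zip(header, fitted))
```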
Examples
Use Case
Using the CSV Parser Snap Schema Capability
A feature of the CSV Parser that customers sometimes request is the ability to define the fields (and their data types) for incoming CSV files. To do this, add a second input view to the CSV Parser Snap and supply the definition of the fields and their data types through that view.
For example, if you have input data in a CSV file as follows, with no header line:
You can create a definition of the CSV data in another CSV file as follows:
Note that the data types are optional and are defined on the second line of the definition file. The parser supports the 'string', 'integer', 'float' and 'boolean' types. String is the default data type; any field with an empty data type is treated as a string.
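The two-line definition format described here (field names on the first line, optional data types on the second) can be sketched in Python as follows. This only illustrates the idea and is not how the Snap is implemented; the function names, the sample field names, and the rule that only the text `true` converts to a boolean true are all assumptions.

```python
import csv
import io


def _to_bool(value):
    # Assumption: only the text "true" (any case) maps to True.
    return value.strip().lower() == "true"


# The four data types named in the documentation.
CONVERTERS = {"string": str, "integer": int, "float": float, "boolean": _to_bool}


def read_schema(schema_text):
    """Return (name, type) pairs from a two-line CSV definition."""
    rows = list(csv.reader(io.StringIO(schema_text)))
    names = rows[0]
    types = rows[1] if len(rows) > 1 else []
    # An empty or missing data type falls back to the default, string.
    types = [t.strip().lower() or "string" for t in types]
    types += ["string"] * (len(names) - len(types))
    return list(zip(names, types))


def apply_schema(data_text, schema):
    """Parse header-less CSV text and convert each value per the schema."""
    documents = []
    for row in csv.reader(io.StringIO(data_text)):
        documents.append(
            {name: CONVERTERS[type_name](value)
             for (name, type_name), value in zip(schema, row)}
        )
    return documents


# Hypothetical sample: the second field has no type, so it stays a string.
schema = read_schema("id,name,score,active\ninteger,,float,boolean\n")
documents = apply_schema("1,Ada,9.5,true\n2,Bob,7,false\n", schema)
```

In this sketch the definition plays the role of the second input view: it is read once, then applied to every record of the header-less data.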
The configuration of the pipeline for this use is as follows:
where the Read Snaps are File Readers.
The CSV Parser is configured as follows:
with the View settings as:
The resulting data in the SnapLogic pipeline data flow looks like this: