Overview
You can use this Snap to read data from various sources (such as SLDB, HTTP, S3, SFTP, and HDFS) and produce a binary data stream at the output.
We plan to introduce additional S3 features exclusively in the Amazon S3 Snaps; Binary Snaps with S3 support will not receive these updates. We therefore recommend that you use the Amazon S3 Snap Pack for all S3 operations in your pipelines. Binary Snaps are retained as is to maintain backward compatibility, but we will no longer provide S3 support for the Binary Snaps. Learn more: Migration from Binary Snaps to Amazon S3 Snaps.
Snap Type
The File Reader Snap is a Read type Snap.
Prerequisites
IAM Roles for Amazon EC2
The 'IAM_CREDENTIAL_FOR_S3' feature lets an EC2 Groundplex access S3 files without an Access-key ID and Secret key in the AWS S3 account in the Snap. The IAM credential stored in the EC2 metadata is used to gain access rights to the S3 buckets. To enable this feature, set the Global properties (Key-Value parameters) and restart the JCC. This feature is supported in the EC2-type Groundplex only. Learn more.
Support for Ultra Pipelines
Works in Ultra Pipelines.
Limitations
For most file protocols, the Snap behaves the same in both Snaplex and Groundplex. However, the HDFS protocol works only in the Groundplex. The Hadoop cluster must be open to the Groundplex server instance without any authentication.
When reading a file over HTTP, the File Reader Snap displays an error if the number of bytes consumed does not match the Content-Length header value present in the response.
- Do not use sldb as a file system or storage. File Assets are intended only for specialized files that a pipeline uses to reference specific data, such as accounts, expressions, or JAR files. Use a Cloud storage provider to store production data. File Assets should not be used as a file source or as a destination in production pipelines. When you configure the File Reader Snap, set the file path to a cloud provider or an external file system.
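The Content-Length limitation above can be pictured with a minimal Python sketch. This is not the Snap's implementation; the function name `read_with_length_check` and the exception `ContentLengthMismatch` are hypothetical, and an in-memory stream stands in for a real HTTP response body.

```python
import io


class ContentLengthMismatch(Exception):
    """Raised when the bytes consumed differ from the declared Content-Length."""


def read_with_length_check(stream, headers, chunk_size=8192):
    """Read the whole stream and compare the byte count with Content-Length.

    `stream` is any object with a read(n) method; `headers` is a dict of
    response headers. Both are stand-ins for a real HTTP response.
    """
    declared = headers.get("Content-Length")
    chunks = []
    consumed = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        chunks.append(chunk)
        consumed += len(chunk)
    # Only check when the server actually sent a Content-Length header.
    if declared is not None and consumed != int(declared):
        raise ContentLengthMismatch(
            f"expected {declared} bytes, consumed {consumed}"
        )
    return b"".join(chunks)


# A body whose length matches the header is returned unchanged.
body = read_with_length_check(io.BytesIO(b"hello"), {"Content-Length": "5"})
```

A truncated body (for example, 4 bytes against a declared length of 5) raises `ContentLengthMismatch` in this sketch, which mirrors the error condition described above.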
Known Issues
- This Snap fails for an SMB file path with the error: unable to create new native thread.
- This Snap Pack does not natively support SHA1-based algorithms to connect to SFTP endpoints. With the August 2023 GA release, you can leverage the properties specified in the Configuration settings for Snaps to add support for algorithms that are disabled on your Snaplex.
- If the Snap encounters a file with the same name as your Project Space, it can result in an error when you attempt to use that file's name within the Mapper Snap. For instance, if your Project Space is named "servicenow/to_snowflake" and the file being read is named "servicenow_to_snowflake_demo.json," you may encounter issues. Consider using the complete file path instead of just the file name as a workaround.
Snap Views
View | Format | Description |
---|---|---|
Input | Document | Upstream Snap is optional. Any Snap with a document output view can be connected upstream. Input may contain value(s) to evaluate the JavaScript expression in the File property. |
Output | Document | Binary data read from the source specified in the File property with header information about the binary stream. An example of the output preview on the File property value of "http://www.facebook.com" is as follows: |
Error | | Error handling is a generic way to handle errors without losing data or failing the Snap execution. You can handle the errors that the Snap might encounter when running the Pipeline by choosing one of the following options from the When errors occur list under the Views tab: Learn more about Error handling in Pipelines. |
Snap Settings
Field | Field Type | Description |
---|---|---|
Label* Default Value: File Reader | String | |
File* Default Value: N/A | String/Expression | Specify the URL of a regular file. The URL must begin with a file protocol. The supported file protocols are:
You can also upload a file using the Upload icon. You can preview the uploaded file using the Preview icon. Learn more about Previewing File. This Snap supports S3 Virtual Private Cloud (VPC) endpoints. For example:
This Snap supports Oracle Object Storage endpoints when used with pre-authenticated requests. For example:
To create a pre-authenticated request, refer to the instructions in the following Oracle article: |
Prevent URL encoding Default value: Deselected | Checkbox | When selected, the Snap does not automatically URL-encode the file path (including the query string, if it exists). Select this checkbox to use the file path value as is. When deselected, the following are some of the common characters that the Snap automatically encodes:
And these are some of the characters that the Snap does not automatically encode: |
Enable staging Default value: Deselected | Checkbox | If selected, the Snap downloads the source file into a local temporary file. When the download is completed, it streams the data from the temporary file to the output view. This property prevents the Snap from being blocked by a slow downstream pipeline. The local disk should have free space at least as large as the expected file size. |
Number of retries Default Value: 0 | Integer/Expression | Specify the maximum number of times the Snap retries when a network failure prevents it from reading the target file. Minimum value: 0 |
Retry interval (seconds) Default Value: 1 | Integer/Expression | Specify the minimum number of seconds that the Snap waits before attempting recovery from a network failure. Minimum value: 1 |
Advanced properties | Use this field set to define specific settings for polling files. Click to add a new row for defining an advanced property. This field set contains the following fields: |
Properties | Dropdown list | |
Values Default Value: N/A Example: https://myaccount.blob.core.windows.net/sascontainer/sasblob.txt?sv=2015-04-05&st=2015-04- | String/Expression | Specify the value for the SAS URI. |
Snap Execution | Dropdown list | Select one of the following three modes in which the Snap executes: |
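The Number of retries and Retry interval (seconds) settings can be read as a simple retry loop. The following Python sketch illustrates that reading only; it is not the Snap's code. The function `read_with_retries`, its parameters, and the use of `OSError` as the "network failure" signal are assumptions made for illustration.

```python
import time


def read_with_retries(read_once, retries=0, interval_seconds=1, sleep=time.sleep):
    """Call read_once(), retrying on OSError up to `retries` extra times.

    `read_once` is a hypothetical zero-argument callable that returns the
    file contents or raises OSError on a network failure. `sleep` is
    injectable so the wait can be observed without actually waiting.
    """
    if retries < 0:
        raise ValueError("retries must be at least 0")
    if interval_seconds < 1:
        raise ValueError("interval_seconds must be at least 1")
    attempt = 0
    while True:
        try:
            return read_once()
        except OSError:
            # Give up once the configured number of retries is used.
            if attempt >= retries:
                raise
            attempt += 1
            sleep(interval_seconds)
```

With `retries=0` (the default in the table above), the first failure is raised immediately, so the read is attempted exactly once.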
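To make the effect of the Prevent URL encoding checkbox concrete, here is a small Python sketch. It uses the standard library's `urllib.parse.quote` as a stand-in; the exact set of characters that the Snap encodes is not shown on this page, so the `safe` set below, the function name `effective_path`, and the sample path are assumptions.

```python
from urllib.parse import quote


def effective_path(file_path, prevent_url_encoding=False):
    """Return the path as it would be used under each checkbox state.

    Assumption: automatic encoding is approximated with percent-encoding
    that leaves "/" and ":" untouched. The Snap's real rules may differ.
    """
    if prevent_url_encoding:
        # Checkbox selected: the value is used as is.
        return file_path
    # Checkbox deselected: special characters are percent-encoded.
    return quote(file_path, safe="/:")


# Hypothetical path containing a space and a "#" character.
path = "sftp://host/dir/my file#1.csv"
encoded = effective_path(path)
as_is = effective_path(path, prevent_url_encoding=True)
```

In this sketch the space becomes `%20` and `#` becomes `%23` when encoding is left on, while the as-is variant returns the input unchanged.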
Preview File
To preview a file, in the File field, click the Preview icon.
The Preview Type contains the following options:
- Hex: Displays the preview data in hexadecimal format.
- Text: Displays the preview data in text format.
- Render text with whitespace: Renders whitespaces as dots "." and tabs as underscores "_" in the preview data.
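The three preview types can be illustrated with a short Python sketch. This is only an approximation of what the preview dialog shows: the function name `render_preview`, the option strings, and the UTF-8 decoding are assumptions, and only spaces and tabs are substituted in the whitespace mode.

```python
def render_preview(data: bytes, preview_type: str = "text") -> str:
    """Render raw bytes in one of three hypothetical preview modes."""
    if preview_type == "hex":
        # Two hexadecimal digits per byte, separated by spaces.
        return data.hex(" ")
    text = data.decode("utf-8", errors="replace")
    if preview_type == "text":
        return text
    if preview_type == "whitespace":
        # Spaces become dots and tabs become underscores.
        return text.replace(" ", ".").replace("\t", "_")
    raise ValueError(f"unknown preview type: {preview_type}")


sample = b"a b\tc"
hex_view = render_preview(sample, "hex")
whitespace_view = render_preview(sample, "whitespace")
```

The whitespace mode is useful for spotting a tab-delimited file that looks space-delimited in the plain text view.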
Examples
HDFS
For hdfs:// file access, use a SnapLogic on-premises Groundplex and make sure that its instance is within the Hadoop cluster and that SSH authentication has already been established. You can access HDFS files in the same way as other file protocols in File Reader and File Writer Snaps. There is no need to use any account in the Snap.
An example for HDFS is:
hdfs://ec2-54-198-212-134.compute-1.amazonaws.com:8020/user/john/input/sample.csv
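As a quick way to see how the HDFS example above breaks down into protocol, host, port, and file path, the following Python sketch splits the URL with the standard library. The helper name `describe_file_url` is hypothetical and the sketch does not connect to any cluster.

```python
from urllib.parse import urlsplit


def describe_file_url(url: str) -> dict:
    """Split a file URL into the parts a reader would need to connect."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "path": parts.path,
    }


info = describe_file_url(
    "hdfs://ec2-54-198-212-134.compute-1.amazonaws.com:8020/user/john/input/sample.csv"
)
```

Here `info["port"]` is the integer 8020 and `info["path"]` is `/user/john/input/sample.csv`.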
SFTP File Read
An example pipeline for an SFTP file read is shown below:
Sample for AWS S3 Support
Troubleshooting
Error | Reason | Resolution |
---|---|---|
Response code: 400, unable to import the file < Request from elastic.snaplogic.com returned an error. | The name of the file that is being read by the Snap cannot be the same as the Project Space name. | Provide the complete path of the file (instead of only the file name) in this format: “ For example: |
Response code: 400, unable to import expression library: Request from elastic.snaplogic.com returned an error. | Path names at root level are not allowed. | Provide the complete path of the file (instead of only the file name) in this format: “ For example: |
Related Links