Snap type:
Read
Description:
This Snap reads binary data from various sources such as SLDB, HTTP, S3, SFTP, HDFS, and produces a binary data stream at the output. Unlike the File Reader Snap, this Snap can read more than one file in the given directory and its subdirectories recursively.
Overview
You can use this Read-type Snap to read binary data from various sources, such as SLDB, HTTP, S3, SFTP, and HDFS, and produce a binary data stream at the output. Unlike the File Reader Snap, this Snap can read more than one file in the given directory and, recursively, in its subdirectories.
Snap Type
Multi File Reader is a READ-type Snap.
Prerequisites
Support for Ultra Pipelines
Works in Ultra Pipelines.
Limitations
- For most file protocols, the Snap behaves the same way in both Snaplex and Groundplex. However, the HDFS protocol works only in a Groundplex. The Hadoop cluster must be open to the Groundplex server instance without any authentication.
- Do not use sldb as a file system or storage. File Assets are intended only for specialized files that a pipeline uses to reference specific data, such as accounts, expressions, or JAR files. Use a cloud storage provider to store production data. File Assets should not be used as a file source or as a destination in production pipelines. When you configure the Multi File Reader, set the file path to a cloud provider or external file system.
Known Issues
Snap Views
Type | Format | Number of Views | Examples of Upstream and Downstream Snaps | Description |
---|---|---|---|---|
Input | Document | | N/A | N/A |
Output | Binary | | | Binary data read from the source specified in the Selected Files property. |
Error | Error handling is a generic way to handle errors without losing data or failing the Snap execution. You can handle the errors that the Snap might encounter while running the Pipeline by choosing one of the options from the When errors occur list under the Views tab. Learn more about Error handling in Pipelines. |
Account
This Snap uses account references created on the Accounts page of SnapLogic Manager to handle access to this endpoint. This Snap supports a Basic Auth account, an AWS S3 account, an SSH Auth account, an SMB account, an Azure Storage account, a Google Storage account, or no account. See Configuring Binary Accounts for information on setting up accounts that work with this Snap.
Account types supported by each protocol are as follows:
Protocol | Account types |
---|---|
sldb | no account |
s3 | AWS S3 |
ftp | Basic Auth |
sftp | Basic Auth, SSH Auth |
ftps | Basic Auth |
hdfs | no account |
http | no account |
https | no account |
smb | SMB |
wasb | Azure Storage |
wasbs | Azure Storage |
gs | Google Storage |
Note |
---|
The FTPS file protocol works only in explicit mode. The implicit mode is not supported. |
Required settings for account types are as follows:
Account Type | Settings |
---|---|
Basic Auth | Username, Password |
AWS S3 | Access-key ID, Secret key |
SSH Auth | Username, Private key, Key Passphrase |
SMB | Domain, Username, Password |
Azure Storage | Account name, Primary access key |
Google Storage | Approval prompt, Application scope, Auto-refresh token (Read-only properties are Access token, Refresh token, Access token expiration, OAuth2 Endpoint, OAuth2 token and Access type.) |
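The two tables above can be read together as a lookup from protocol to account type to required settings. The following Python sketch only restates those tables as data; the names `ACCOUNT_TYPES_BY_PROTOCOL`, `REQUIRED_SETTINGS`, and `account_options` are illustrative and are not part of any SnapLogic API.

```python
# Illustrative restatement of the two tables above; not a SnapLogic API.
ACCOUNT_TYPES_BY_PROTOCOL = {
    "sldb": [],
    "s3": ["AWS S3"],
    "ftp": ["Basic Auth"],
    "sftp": ["Basic Auth", "SSH Auth"],
    "ftps": ["Basic Auth"],
    "hdfs": [],
    "http": [],
    "https": [],
    "smb": ["SMB"],
    "wasb": ["Azure Storage"],
    "wasbs": ["Azure Storage"],
    "gs": ["Google Storage"],
}

REQUIRED_SETTINGS = {
    "Basic Auth": ["Username", "Password"],
    "AWS S3": ["Access-key ID", "Secret key"],
    "SSH Auth": ["Username", "Private key", "Key Passphrase"],
    "SMB": ["Domain", "Username", "Password"],
    "Azure Storage": ["Account name", "Primary access key"],
    "Google Storage": ["Approval prompt", "Application scope", "Auto-refresh token"],
}


def account_options(url):
    """Return {account type: required settings} for the protocol of `url`.

    An empty dict means the protocol is used with no account.
    """
    protocol, separator, _ = url.partition("://")
    protocol = protocol.lower()
    if not separator or protocol not in ACCOUNT_TYPES_BY_PROTOCOL:
        raise ValueError(f"unsupported or missing protocol in {url!r}")
    return {
        account_type: REQUIRED_SETTINGS[account_type]
        for account_type in ACCOUNT_TYPES_BY_PROTOCOL[protocol]
    }
```

For example, `account_options("sftp://host.example.com/data")` lists both Basic Auth and SSH Auth with their settings, while an `sldb:///` URL yields an empty result because no account is used. The host name here is a placeholder.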
Input | This Snap has no input views. |
---|---|
Output | This Snap has exactly one binary output view that provides the binary data stream read from the specified sources. |
Error | This Snap has at most one document error view and produces zero or more documents in the view. |
Settings
Label
Selected Files
A table property, which consists of three columns: Folder/File, Wildcard Regex, and Include Subfolders. A user can specify one or more data sources by clicking the + button.
Note: All selected files must be under the same protocol.
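The same-protocol rule can be checked before a Pipeline is configured. The sketch below is a minimal illustration in Python, assuming each Folder/File entry follows the [protocol]://... form; the function name `common_protocol` and the sample URLs are hypothetical and not part of the Snap.

```python
def common_protocol(urls):
    """Return the single protocol shared by all `urls`, or raise ValueError.

    Each URL is expected to look like [protocol]://[host][:port]/[path].
    """
    protocols = set()
    for url in urls:
        protocol, separator, _ = url.partition("://")
        if not separator or not protocol:
            raise ValueError(f"missing file protocol in {url!r}")
        protocols.add(protocol.lower())
    if len(protocols) != 1:
        raise ValueError(
            f"expected exactly one protocol, found {sorted(protocols)}"
        )
    return protocols.pop()
```

Calling it with two `sftp://` entries returns `"sftp"`; mixing an `sftp://` entry with an `s3:///` entry raises an error, which mirrors the note above.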
Snap Settings
Field Name | Field Type | Description |
---|---|---|
Label* Default Value: Multi File Reader | String | Specify the name for the Snap. You can modify this to be more specific, especially if you have more than one of the same Snap in your Pipeline. |
Selected Files | Use this field set to define data sources. |
Folder/File Default Value: [None] | String/Expression | Specify the URL for the data source, which can be a directory or a file. The URL must begin with a supported file protocol (see the protocols listed in the Account section) and follow this syntax:
[protocol]://[host][:port]/[path]
The "://" separator divides the file protocol from the rest of the URL, and the host name and the port number go between "://" and "/". If the port number is omitted, the default port for the protocol is used. The host name and port number are omitted in the sldb and s3 protocols. The path must be absolute for all protocols except sldb. For sldb files, the Snap can access only files in the same project directory or the shared project directory, and cannot access files in other projects. For the sldb, http, and https protocols, enter the URL of a regular file. Folders are not supported for these protocols.
If this property points to a regular file, the Wildcard and Include Subfolders properties are ignored.
Note: Support for the WASB (Windows Azure Storage Blob) and WASBS protocols (wasb:/// and wasbs:///) was added to the Binary Snaps in the SnapLogic 4.3.2 release.
Example:
s3:///<S3_bucket_name>@s3.<region_name>.amazonaws.com/<path>
For region names and their details, see AWS Regions and Endpoints. |
Wildcard Default Value: [None] | String/Expression | Specify the wildcard pattern if the URL in the Folder/File property is for a directory. All files matching the wildcard pattern are selected. This property is not supported for the sldb, http, and https protocols. The asterisk pattern character ("*", also called "star") and the question mark ("?") are supported. The "*" character matches zero or more characters. The "?" character matches exactly one character.
Example:
- *.*
- *.csv
- *.json
- *.??? (matches all files with three-character extensions) |
Include Subfolders Default Value: Not selected | Checkbox | Select to search subfolders for the specified Wildcard if Folder/File is set to a directory. If you select this checkbox and the Folder/File property is a folder, all files in the subfolders matching the given wildcard pattern are selected. This checkbox is not supported for the sldb, http, and https protocols. |
Number of retries Default Value: 0 Minimum value: 0 Example: 3 | Integer/Expression | Specify the maximum number of retry attempts the Snap must make in case there is a network failure and the Snap is unable to read the target file. If the value is larger than 0, the Snap first downloads the target file to a temporary local file. If any error occurs during the download, the Snap waits for the time specified in the Retry interval and attempts to download the file again from the beginning. When the download is successful, the Snap starts to stream the data from the temporary file to the downstream pipeline. All temporary local files are deleted when they are no longer needed. |
Retry interval (seconds) Default Value: 1 Minimum value: 1 Example: 3 | Integer/Expression | Specify the minimum number of seconds for which the Snap must wait before attempting recovery from a network failure. |
Advanced Properties | Use this field set to define additional properties. |
SAS URI | Dropdown list | |
Values | String/Expression | Specify the value for the property. |
Snap Execution | Dropdown list | Select one of the three modes in which the Snap executes. |
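The behavior described for Number of retries and Retry interval (download to a temporary local file, restart from the beginning on failure, then stream from the temporary file) can be sketched as follows. This is a simplified Python illustration of that description, not the Snap's implementation. The `open_source` parameter is a hypothetical zero-argument callable that returns a readable binary file object, and treating `OSError` as the network failure is an assumption.

```python
import shutil
import tempfile
import time


def read_with_retries(open_source, retries=0, retry_interval=1.0,
                      chunk_size=64 * 1024):
    """Yield the source's bytes in chunks, retrying failed downloads.

    open_source: hypothetical callable returning a readable binary stream.
    """
    if retries <= 0:
        # No retries requested: stream directly from the source.
        with open_source() as source:
            while chunk := source.read(chunk_size):
                yield chunk
        return

    with tempfile.TemporaryFile() as temp_file:
        attempt = 0
        while True:
            try:
                # Every attempt restarts the download from the beginning.
                temp_file.seek(0)
                temp_file.truncate()
                with open_source() as source:
                    shutil.copyfileobj(source, temp_file)
                break
            except OSError:  # assumption: network failures surface as OSError
                if attempt >= retries:
                    raise
                attempt += 1
                time.sleep(retry_interval)

        # Download succeeded: stream from the temporary file.
        temp_file.seek(0)
        while chunk := temp_file.read(chunk_size):
            yield chunk
    # The temporary file is deleted when the with-block exits.
```

Buffering the whole file first means downstream consumers never see a partial stream from a failed attempt, at the cost of local disk space and a delayed first byte.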
Note |
---|
The Pipeline validation (achieved by pressing "Retry") imposes a 5-minute timeout. If there are a large number of files to be read by the Snap as a result of Wildcard and Include subfolders settings, the Snap validation may fail due to this 5-minute timeout limit. |
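To estimate how many files a given Wildcard and Include Subfolders combination selects, the matching rules described for those fields can be approximated offline. The Python sketch below implements only the documented "*" (zero or more characters) and "?" (exactly one character) semantics. It assumes the pattern is matched against the file name only and that paths use "/" separators; both are assumptions, and the names `wildcard_to_regex` and `select_files` are hypothetical.

```python
import re
from pathlib import PurePosixPath


def wildcard_to_regex(pattern):
    """Translate a wildcard pattern into a compiled regular expression."""
    parts = []
    for char in pattern:
        if char == "*":
            parts.append(".*")   # zero or more characters
        elif char == "?":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(char))
    return re.compile("".join(parts), re.DOTALL)


def select_files(paths, folder, wildcard, include_subfolders=False):
    """Return the entries of `paths` under `folder` whose names match."""
    regex = wildcard_to_regex(wildcard)
    root = PurePosixPath(folder)
    selected = []
    for raw_path in paths:
        path = PurePosixPath(raw_path)
        try:
            relative = path.relative_to(root)
        except ValueError:
            continue  # not under the folder at all
        depth = len(relative.parts)
        if depth == 0 or (depth > 1 and not include_subfolders):
            continue
        if regex.fullmatch(path.name):
            selected.append(raw_path)
    return selected
```

A large result from such a dry run is a hint that validation might run into the timeout described in the note above.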
Output Fields for the Different Protocols
The output fields that the Multi File Reader Snap generates depend on the protocol you select. The following table lists the output fields for the different protocols supported by the Snap:
Protocol | Output Fields |
---|---|
S3 | |
SLDB | |
Snap Pack History