
Snap type:

Read

Description:

This Snap reads binary data from various sources such as SLDB, HTTP, S3, SFTP, HDFS, and produces a binary data stream at the output. Unlike the File Reader Snap, this Snap can read more than one file in the given directory and its subdirectories recursively.

  • Expected upstream Snaps: The Snap has no input view and no Snap can be connected upstream.
  • Expected downstream Snaps: Any Snap with a binary input view can be connected downstream, such as File Writer, CSV Parser, JSON Parser, or XML Parser.
  • Expected input: The Snap has no input view.
  • Expected output:


    Overview

    You can use this Read-type Snap to read binary data from various sources, such as SLDB, HTTP, S3, SFTP, and HDFS, and produce a binary data stream at the output. Unlike the File Reader Snap, this Snap can read more than one file in the given directory and, recursively, in its subdirectories.


    Snap Type

    Multi File Reader is a READ-type Snap.

    Prerequisites

    The prerequisites for this Snap are maintained as shared excerpts: EC2Prerequisite (File Reader page) and FTPS_Prerequisite (Directory Browser page).

    Support for Ultra Pipelines

    Works in Ultra Pipelines

    Limitations

    • For most file protocols, the Snap behaves the same way in both Snaplex and Groundplex. However, the HDFS protocol works only in a Groundplex, and the Hadoop cluster must be open to the Groundplex server instance without any authentication.
    • Do not use SLDB as a file system or storage. File Assets are intended only for specialized files that a Pipeline references, such as accounts, expressions, or JAR files, and should not be used as a file source or destination in production Pipelines. Use a cloud storage provider to store production data. When you configure the Multi File Reader Snap, set the file path to a cloud provider or an external file system.

    Known Issues

    Known issues for this Snap are maintained as a shared excerpt: KI (ZipFile Write page).


    Snap Views

    Input
    • Format: Document
    • Number of Views: Min: 0, Max: 1
    • Examples of Upstream and Downstream Snaps: N/A
    • Description: N/A

    Output
    • Format: Binary
    • Number of Views: Min: 1, Max: 1
    • Examples of Upstream and Downstream Snaps: File Writer, CSV Parser, JSON Parser, XML Parser
    • Description: Binary data read from the source specified in the Selected Files property. The binary data can be previewed at the output of the Snap.

    Error

    Error handling is a generic way to handle errors without losing data or failing the Snap execution. You can handle the errors that the Snap might encounter while running the Pipeline by choosing one of the following options from the When errors occur list under the Views tab. The available options are:

    • Stop Pipeline Execution: Stops the current pipeline execution when the Snap encounters an error.

    • Discard Error Data and Continue: Ignores the error, discards that record, and continues with the rest of the records.

    • Route Error Data to Error View: Routes the error data to an error view without stopping the Snap execution.

    Learn more about Error handling in Pipelines.

    Account

    This Snap uses account references created on the Accounts page of SnapLogic Manager to handle access to this endpoint. This Snap supports a Basic auth account, an AWS S3 auth account, SSH Auth account, SMB account, or no account. See Configuring Binary Accounts for information on setting up accounts that work with this Snap.

    Account types supported by each protocol are as follows:

    Protocol    Account types
    sldb        no account
    s3          AWS S3
    ftp         Basic Auth
    sftp        Basic Auth, SSH Auth
    ftps        Basic Auth
    hdfs        no account
    http        no account
    https       no account
    smb         SMB
    wasb        Azure Storage
    wasbs       Azure Storage
    gs          Google Storage

    Note

    The FTPS file protocol works only in explicit mode. The implicit mode is not supported.

    Required settings for account types are as follows:

    Account type      Settings
    Basic Auth        Username, Password
    AWS S3            Access-key ID, Secret key
    SSH Auth          Username, Private key, Key Passphrase
    SMB               Domain, Username, Password
    Azure Storage     Account name, Primary access key
    Google Storage    Approval prompt, Application scope, Auto-refresh token
                      (Read-only properties are Access token, Refresh token, Access token expiration, OAuth2 Endpoint, OAuth2 token, and Access type.)
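
    The two account tables above can be read together as a lookup from protocol to the account types it accepts and the settings each account type needs. The following is a minimal sketch of that lookup in Python; the dictionary names and the account_options helper are hypothetical and are not part of SnapLogic. "no account" is represented as an empty list.

```python
# Protocol-to-account mapping transcribed from the tables in this section.
ACCOUNT_TYPES_BY_PROTOCOL = {
    "sldb": [],
    "s3": ["AWS S3"],
    "ftp": ["Basic Auth"],
    "sftp": ["Basic Auth", "SSH Auth"],
    "ftps": ["Basic Auth"],
    "hdfs": [],
    "http": [],
    "https": [],
    "smb": ["SMB"],
    "wasb": ["Azure Storage"],
    "wasbs": ["Azure Storage"],
    "gs": ["Google Storage"],
}

# Required settings per account type, also transcribed from this section.
REQUIRED_SETTINGS_BY_ACCOUNT = {
    "Basic Auth": ["Username", "Password"],
    "AWS S3": ["Access-key ID", "Secret key"],
    "SSH Auth": ["Username", "Private key", "Key Passphrase"],
    "SMB": ["Domain", "Username", "Password"],
    "Azure Storage": ["Account name", "Primary access key"],
    "Google Storage": ["Approval prompt", "Application scope", "Auto-refresh token"],
}


def account_options(protocol):
    """Hypothetical helper: map each usable account type to its required settings."""
    account_types = ACCOUNT_TYPES_BY_PROTOCOL[protocol.lower()]
    return {name: REQUIRED_SETTINGS_BY_ACCOUNT[name] for name in account_types}
```

    For example, account_options("sftp") lists both Basic Auth and SSH Auth with their settings, while account_options("http") returns an empty mapping because that protocol uses no account.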
    Views:
    • Input: This Snap has no input views.
    • Output: This Snap has exactly one binary output view that provides the binary data stream read from the specified sources.
    • Error: This Snap has at most one document error view and produces zero or more documents in the view.

    Settings

    Label

    Required. The name for the Snap. You can modify this to be more specific, especially if you have more than one of the same Snap in your pipeline.

    Selected Files 

    A table property that consists of three columns: Folder/File, Wildcard, and Include Subfolders. You can specify one or more data sources by clicking the + button.
    Note: All selected files must be under the same protocol.


    Snap Settings

    Info
    • Asterisk (*): Indicates a mandatory field.

    • Suggestion icon: Indicates a list that is dynamically populated based on the configuration.

    • Expression icon: Indicates whether the value is an expression (if enabled) or a static value (if disabled). Learn more about Using Expressions in SnapLogic.

    • Add icon: Indicates that you can add fields in the fieldset.

    • Remove icon: Indicates that you can remove fields from the fieldset.


    Field Name, Field Type, Description

    Label*

    Default Value: Multi File Reader
    Example: Multi File Reader

    String

    Specify the name for the Snap. You can modify this to be more specific, especially if you have more than one of the same Snap in your Pipeline.


    Selected Files

    Use this field set to define data sources.

    Note
    All selected files must be under the same protocol.


    Folder/File

    Default Value: [None]

    Example
    • s3:///<S3_bucket_name>@s3.<region_name>.amazonaws.com/<path>
      For region names and their details, see AWS Regions and Endpoints.
    • sftp://ftp.snaplogic.com:22/dir/filename
    • sftp://ftp.snaplogic.com:22/dir/
    • smb://smb.Snaplogic.com:445/test_files/csv/input.csv
    • sldb:///usr/john/json_files/asset.json
    • _filename (A key/value pair with the "filename" key should be defined as a Pipeline parameter.)
    • If a Pipeline is created in a project other than the shared project and you want to read the "asset.json" file from the same project, enter "asset.json" or "sldb:///asset.json".
    • If a Pipeline is created in a project other than the shared project and you want to read the "asset.json" file from the shared project, enter "shared/asset.json" or "sldb:///shared/asset.json".
    • If a Pipeline is created in the shared project and you want to read the "asset.json" file from the shared project, enter "asset.json" or "sldb:///asset.json".
    • If an account is not used within the Snap, then use s3://yourAcccessKeyID:yourSecretKey@s3/yourBucketName/folder1/folder2/
    • If the Snap is executed in a Windows Groundplex and needs to access the D: drive, then use file:///D:/testFolder/
    • To read files in the 'testDir' folder in the 'Snaplogic' container, use wasb:///Snaplogic/testDir/sample.csv
    • If the bucket name is 'testBucket', then use gs:///testBucket/testDir/

    String/Expression

    Specify the URL for the data source, which can be a directory or a file. It should begin with a file protocol. The supported file protocols are:

    • http:
    • https:
    • s3:
    • sftp:
    • ftp:
    • ftps:
    • hdfs:
    • sldb:
    • smb:
    • wasb:
    • wasbs:
    • gs:

    The File property should have the syntax: [protocol]://[host][:port]/[path]

    Note
    This Snap supports S3 Virtual Private Cloud (VPC) endpoint. For example, s3://my-bucket@bucket.vpce-028b7814794578709-vu0vvauy.s3.us-west-2.vpce.amazonaws.com

    Info
    "://" is a separator between the file protocol and the rest of the URL. The host name and the port number should be between "://" and "/". If the port number is omitted, a default port for the protocol is used. The host name and port number are omitted in the sldb and s3 protocols.

    Info
    • Ensure that the file name, folder name, or file path does not contain the '?' character. It is not fully supported, and the Snap might fail when it is present.
    • The File property should be an absolute path for all protocols except sldb. For sldb files, the Snap can access only files in the same project directory or the shared project directory, and cannot access files in other projects.
    • For the sldb, http, and https protocols, enter the URL for a regular file. Folders are not supported for these protocols.
      If this property is a regular file, the Wildcard and Include Subfolders properties are ignored.

    Note
    In the SnapLogic 4.3.2 release, WASB (Windows Azure Storage Blob) or WASBS protocol (wasb:/// or wasbs:///) support has been added to the Binary Snaps.
    In the WASB and WASBS file URL, the top directory should be the name of the Azure Storage container.
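
    The [protocol]://[host][:port]/[path] syntax for the Folder/File property can be illustrated with Python's standard urllib.parse module. This is only a sketch of how such a URL decomposes, not how the Snap parses it. The describe_source function and the two set names are hypothetical. Relative sldb entries such as "asset.json" carry no protocol and are reported as not listed.

```python
from urllib.parse import urlsplit

# Protocols listed for the Folder/File property in this section.
LISTED_PROTOCOLS = {
    "http", "https", "s3", "sftp", "ftp", "ftps",
    "hdfs", "sldb", "smb", "wasb", "wasbs", "gs",
}
# Protocols for which only a regular file, not a folder, may be entered.
FILE_ONLY_PROTOCOLS = {"sldb", "http", "https"}


def describe_source(url):
    """Hypothetical helper: split a Folder/File URL into its parts."""
    parts = urlsplit(url)
    protocol = parts.scheme.lower()
    return {
        "protocol": protocol,
        "listed": protocol in LISTED_PROTOCOLS,
        "host": parts.hostname,  # None when the host is omitted (for example, sldb)
        "port": parts.port,      # None means the protocol's default port applies
        "path": parts.path,
        "folders_supported": protocol not in FILE_ONLY_PROTOCOLS,
    }


if __name__ == "__main__":
    print(describe_source("sftp://ftp.snaplogic.com:22/dir/filename"))
    print(describe_source("sldb:///shared/asset.json"))
```

    In the sldb case the host and port come back as None, which matches the statement that the host name and port number are omitted for that protocol.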

    Wildcard

    Default Value: [None]
    Example
    • *.*
    • *.csv
    • *.json
    • *.??? (matches all files with three-character extensions)

    String/Expression

    Specify the wildcard pattern if the URL in the Folder/File property is for a directory. All files matching the wildcard pattern are selected. This property is not supported for the sldb, http, and https protocols. The asterisk pattern character ("*", also called "star") and the question mark ("?") are supported. The "*" character matches zero or more characters. The "?" character matches exactly one character.
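
    The "*" and "?" rules for the Wildcard property can be tried out with Python's fnmatch module, which follows the same two rules. This is an approximation for illustration only: fnmatch also accepts bracket expressions such as [abc], which this page does not mention for the Snap. The select helper and the file names are made up.

```python
from fnmatch import fnmatchcase

# Made-up file names used only to illustrate the patterns.
NAMES = ["orders.csv", "orders.json", "notes.txt", "README"]


def select(names, pattern):
    """Return the names that match a wildcard pattern, compared case-sensitively."""
    return [name for name in names if fnmatchcase(name, pattern)]


if __name__ == "__main__":
    for pattern in ("*.*", "*.csv", "*.json", "*.???"):
        print(pattern, select(NAMES, pattern))
```

    Here "*.???" selects orders.csv and notes.txt but not orders.json, because each "?" must match exactly one character.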

    Include Subfolders

    Default Value: Not selected

    Checkbox

    Select to search subfolders for the specified Wildcard if Folder/File is set to a directory. If you select this checkbox and the Folder/File property is a folder, all files in the subfolders matching the given wildcard pattern are selected. This checkbox is not supported for the sldb, http, and https protocols.
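
    The way Folder/File, Wildcard, and Include Subfolders combine can be sketched against a local directory tree. This is a stand-in only: the Snap resolves files over the protocols listed earlier, and the select_files function below is hypothetical, written with Python's standard library.

```python
import os
from fnmatch import fnmatchcase


def select_files(folder, wildcard, include_subfolders=False):
    """Return sorted paths under folder whose file names match wildcard."""
    selected = []
    # os.walk visits the top-level folder first, then its subfolders.
    for root, _dirs, files in os.walk(folder):
        for name in files:
            if fnmatchcase(name, wildcard):
                selected.append(os.path.join(root, name))
        if not include_subfolders:
            break  # stop after the top-level folder
    return sorted(selected)
```

    With include_subfolders left at False, only files directly inside the folder are considered. Setting it to True applies the same wildcard to every nested folder as well.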

    Number of retries

    Default Value: 0
    Example: 3
    Minimum value: 0

    Integer/Expression

    Specify the maximum number of retry attempts the Snap must make if a network failure prevents it from reading the target file.

    If the value is larger than 0, the Snap first downloads the target file to a temporary local file. If any error occurs during the download, the Snap waits for the time specified in the Retry interval and attempts to download the file again from the beginning. When the download is successful, the Snap starts to stream the data from the temporary file to the downstream Pipeline. All temporary local files are deleted when they are no longer needed.

    Info
    • Ensure that the local drive has sufficient free disk space to store the temporary local file.
    • The retry operation is applied for each file the Snap downloads.

    Retry interval (seconds)

    Default Value: 1
    Example: 3
    Minimum value: 1

    Integer/Expression

    Specify the minimum number of seconds for which the Snap must wait before attempting recovery from a network failure.
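
    The interplay between Number of retries and Retry interval can be sketched as follows. This is a minimal illustration of the described behavior (download to a temporary file, restart from the beginning on failure, then stream), not the Snap's implementation. The read_with_retries function and its open_source argument, a zero-argument callable that returns a binary file-like object, are assumptions made for the example.

```python
import shutil
import tempfile
import time


def read_with_retries(open_source, retries=0, retry_interval=1.0, chunk_size=65536):
    """Yield the source's bytes in chunks, retrying the download on OSError."""
    if retries <= 0:
        # No retries requested: stream directly from the source.
        with open_source() as source:
            yield from iter(lambda: source.read(chunk_size), b"")
        return

    attempt = 0
    with tempfile.TemporaryFile() as tmp:
        while True:
            try:
                # Every attempt restarts the download from the beginning.
                tmp.seek(0)
                tmp.truncate()
                with open_source() as source:
                    shutil.copyfileobj(source, tmp)
                break
            except OSError:
                if attempt >= retries:
                    raise
                attempt += 1
                time.sleep(retry_interval)
        # Download succeeded: stream from the temporary file, which is
        # deleted automatically when this block exits.
        tmp.seek(0)
        yield from iter(lambda: tmp.read(chunk_size), b"")
```

    Because the whole file is staged locally before streaming begins, this approach needs enough free disk space for the largest file read, which is the reason for the disk-space caution above.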


    Advanced Properties

    Use this field set to define additional properties.

    SAS URI

    Dropdown list

    Note
    If the SAS URI value is provided in the Snap settings, then the settings provided in the account (if any account is attached) are ignored.

    Values

    String/Expression

    Specify the value for the property.

    Snap Execution

    Dropdown list

    Select one of the three modes in which the Snap executes. Available options are:

    • Validate & Execute: Performs limited execution of the Snap, and generates a data preview during Pipeline validation. Subsequently, performs full execution of the Snap (unlimited records) during Pipeline runtime.
    • Execute only: Performs full execution of the Snap during Pipeline execution without generating preview data.
    • Disabled: Disables the Snap and all Snaps that are downstream from it.


    Note
    The Pipeline validation (achieved by pressing "Retry") imposes a 5-minute timeout. If there are a large number of files to be read by the Snap as a result of Wildcard and Include subfolders settings, the Snap validation may fail due to this 5-minute timeout limit.

    Output Fields for the Different Protocols

    The output fields that the Multi File Reader Snap generates depend on the protocol you select. The following table lists the output fields for the different protocols supported by the Snap:


    Protocol
    Output Fields
    S3
    • content-type
    • content-length
    • last-modified: _snaptype_datetime
    • etag
    • accept-ranges
    • content-location
    • content-disposition
    SLDB
    • content-type
    • date
    • x-amz-meta-md5
    • content-length
    • server
    • x-amz-server-side-encryption
    • x-amz-meta-length
    • x-amz-meta-create_time
    • last-modified: _snaptype_datetime
    • x-amz-meta-file_id
    • x-amz-meta-ttl
    • content-disposition
    • x-amz-meta-owner
    • x-amz-meta-expire_time
    • etag
    • x-amz-request-id
    • x-amz-meta-mimetype
    • x-amz-id-2
    • accept-ranges
    • content-location
    WASB, SMB, SFTP, GStorage
    • content-type
    • content-location
    • content-disposition


    Example

    Snap Pack History

    See the Binary Snap Pack page for the Snap Pack history.