Snap type:

Read

Description:

Use the File Poller Snap to poll (survey at regular intervals) a specific directory for files whose names match a specific pattern. The Snap polls the target directory at the interval specified in the Polling interval property until the timeout specified in the Polling timeout property is reached. Once polling is done, the Snap lists all files whose names match the specified pattern.

Note

This Snap polls the target directory only; subdirectories, if any, are ignored. Use the Directory Browser Snap if you want to poll files in the directory and all subdirectories, as well as to poll a directory only once.
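
The polling behavior described above can be sketched in a few lines of Python. This is only an illustration for a local directory, not the Snap's implementation: the function name poll_directory, its parameters, and the rule for stopping (no further poll once the next one would fall after the timeout) are assumptions. Python's fnmatch also covers only part of the glob syntax the Snap accepts (no { } groups).

```python
import fnmatch
import os
import time
from typing import Callable, Iterator, List


def poll_directory(
    directory: str,
    pattern: str,
    interval_s: float,
    timeout_s: float,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[List[str]]:
    """Yield the sorted list of matching file paths found on each poll.

    Assumed conventions: timeout_s < 0 polls indefinitely and
    timeout_s == 0 polls exactly once.
    """
    start = clock()
    while True:
        with os.scandir(directory) as entries:
            matches = sorted(
                entry.path
                for entry in entries
                # is_file() skips subdirectories, so the scan is not recursive.
                if entry.is_file() and fnmatch.fnmatchcase(entry.name, pattern)
            )
        yield matches
        if timeout_s == 0:
            return  # poll once
        # Assumed stop rule: end when the next poll would exceed the timeout.
        if timeout_s > 0 and clock() - start + interval_s > timeout_s:
            return
        sleep(interval_s)
```

For example, iterating over poll_directory("/tmp/inbox", "*.csv", 30, 1800) would list the matching files every 30 seconds for up to 30 minutes; the directory name here is a placeholder.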


The File Poller Snap can be used in situations where an operation must be triggered when a specific file is found in the target directory. The pipeline can be configured with additional Snaps to process the Snap's output and perform tasks, such as deleting the matched file, before the Polling interval value is reached.

  • Expected input: An optional document to evaluate expressions in the Directory and/or File filter properties. Note that each input document will trigger the execution of the Snap.
  • Expected output: A full path in each document as a value for a key "path". If there are multiple files matching the filter, the same number of documents will be provided in the output view after each interval.
  • Expected upstream Snaps: Any Snap with a document output view, such as Mapper, JSON Generator.
  • Expected downstream Snaps: Any Snap with a document input view, such as File Reader, Mapper, JSON Formatter.


Example output:
[
        {
                "path" :  "sftp://sftp.smart.com/home/voo/test1.csv"
        },
        {
                "path" :  "sftp://sftp.smart.com/home/voo/test2.csv"
        }
]
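
Outside of SnapLogic, output documents of this shape could be processed as in the following Python sketch, which extracts the file name from each "path" value. The helper name file_names is hypothetical; the sample data is the output shown above.

```python
import json
from posixpath import basename
from urllib.parse import urlsplit

# Sample output documents, copied from the example above.
output_documents = json.loads("""
[
    {"path": "sftp://sftp.smart.com/home/voo/test1.csv"},
    {"path": "sftp://sftp.smart.com/home/voo/test2.csv"}
]
""")


def file_names(documents):
    """Return the last path segment of each document's "path" value."""
    return [basename(urlsplit(doc["path"]).path) for doc in documents]


print(file_names(output_documents))
```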

Prerequisites:

IAM Roles for Amazon EC2

The 'IAM_CREDENTIAL_FOR_S3' feature lets a Snap running in an EC2 Groundplex access S3 files without an Access-key ID and Secret key in the Snap's AWS S3 account. The IAM credential stored in the EC2 metadata is used to gain access rights to the S3 buckets. To enable this feature, add the following line to global.properties and restart the JCC (node):
jcc.jvm_options = -DIAM_CREDENTIAL_FOR_S3=TRUE

Note that this feature is supported only in an EC2-type Groundplex.

For more information on IAM Roles, see http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html

Support and limitations:

Modes

Limitation

  • For S3 folders, the Snap currently supports polling the target directory for a maximum of 10,000 files. If the directory contains more than 10,000 files, the Snap does not provide any output.

Account:

This Snap uses account references created on the Accounts page of SnapLogic Manager to handle access to this endpoint. This Snap supports several account types, as listed in the table below, or no account. See Binary Account for information on setting up these types of accounts. Account types supported by each protocol are as follows:


Protocol    Account types
sldb        no account
s3          AWS S3
ftp         Basic Auth
sftp        Basic Auth, SSH Auth
ftps        Basic Auth
hdfs        no account
webhdfs     no account
smb         SMB
wasb        Azure Storage
wasbs       Azure Storage
gs          Google Storage
file        Local file system


Required settings for account types are as follows:

Account Type      Settings
Basic Auth        Username, Password
AWS S3            Access-key ID, Secret key
SSH Auth          Username, Private key
SMB               Domain, Username, Password
Azure Storage     Account name, Primary access key
Google Storage    Approval prompt, Application scope, Auto-refresh token
                  (Read-only properties are Access token, Refresh token, Access token expiration, OAuth2 Endpoint, OAuth2 token and Access type.)


Views:


Input     This Snap has at most one document input view.
Output    This Snap has exactly one document output view.
Error     This Snap has at most one document error view.


Settings

Label

Required. The name for the Snap. You can modify this to be more specific, especially if you have more than one of the same Snap in your pipeline.

Directory 


The URL of the directory in which to search for files. The expected syntax is:

  [protocol]://[host][:port]/[path]

The supported file protocols are:

  • s3:
  • file:
  • ftp:
  • ftps:
  • sftp:
  • hdfs:
  • webhdfs:
  • sldb:
  • smb:
  • wasb:
  • wasbs:
  • gs:

Example

    • s3:///[bucket_name]/[dir_path]

    • sftp://ftp.snaplogic.com:22/home/test/dir

    • ftp://ftp.snaplogic.com/test/csv

    • $directory 

    • _directory (A key-value pair with "directory" key should be defined as a pipeline parameter. Ensure that the '=' button is enabled when using parameters.)

    • file:///D:/testFolder/  (if the Snap is executed in the Windows Groundplex and needs to access the D: drive)

    • wasb:///Snaplogic/testDir/ or wasbs:///Snaplogic/testDir/

    • gs:///testBucket/testDir/ 

Default value:  [None]


Note
  • The protocol and the rest of the URL should be separated by "://". The host name and the port number should be between "://" and "/".
  • Not all file protocols support "//"; for those that do not, use "///" instead. This applies, for example, when polling files in SLDB and S3 (see the examples shown above).
  • This property should be an absolute path for all protocols except SLDB. For SLDB, the Snap can access only the same project directory or the shared project directory, and cannot access other project directories.
  • If you want this property to refer to the SLDB project (or shared project) directory to which this Snap's pipeline belongs, enter "sldb:///" or leave it blank.
  • If the pipeline is created in a project other than the shared project and you want this property to refer to the shared project, enter "shared" or "sldb:///shared".
  • If the port number is omitted, a default port for the protocol is used. The hostname and port number are omitted in the SLDB and S3 protocols.
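
To illustrate the [protocol]://[host][:port]/[path] syntax, the following Python sketch splits a Directory value into its parts with the standard library. It is not how the Snap parses the property; the names DirectoryUrl and parse_directory are hypothetical, and values without "://" (such as "shared" or an expression like $directory) are simply rejected here.

```python
from typing import NamedTuple, Optional
from urllib.parse import urlsplit


class DirectoryUrl(NamedTuple):
    protocol: str
    host: Optional[str]   # None when the URL uses "///" (no host)
    port: Optional[int]   # None when the port is omitted
    path: str


def parse_directory(url: str) -> DirectoryUrl:
    """Split a directory URL into protocol, host, port and path."""
    if "://" not in url:
        raise ValueError('expected "://" between the protocol and the rest of the URL')
    parts = urlsplit(url)
    return DirectoryUrl(parts.scheme, parts.hostname, parts.port, parts.path)


print(parse_directory("sftp://ftp.snaplogic.com:22/home/test/dir"))
print(parse_directory("gs:///testBucket/testDir/"))
```

In the second call the host and port come back as None, which reflects the "///" form used when the host name is omitted.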


File filter

Required. A GLOB pattern to be applied to select one or more files in the directory. The File filter property can be a JavaScript expression which will be evaluated with values from the input view document.
Example:

  • *.txt
  • ab????xx.csv

Default value: [None]

Glob Pattern Interpretation Rules

Use glob patterns in this filter to select one or more files in the directory. For example:

  • *.java Matches file names ending in .java.
  • *.* Matches file names containing a dot.
  • *.{java,class} Matches file names ending with .java or .class.
  • foo.? Matches file names starting with foo. and a single character extension.

The following rules are used to interpret glob patterns:

  • The * character matches zero or more characters of a name component without crossing directory boundaries.
  • The ? character matches exactly one character of a name component.
  • The backslash character (\) is used to escape characters that would otherwise be interpreted as special characters. For example, the expression \\ matches a single backslash, and "\{" matches a left brace.
  • The [ ] characters are a bracket expression that matches a single character of a name component out of a set of characters. For example, [abc] matches 'a', 'b', or 'c'. The hyphen (-) may be used to specify a range; so, [a-z] specifies a range that matches from 'a' to 'z' (inclusive). These forms can be mixed; so, [abce-g] matches 'a', 'b', 'c', 'e', 'f' or 'g'. If the character after the '[' is an '!', then it is used for negation; so, [!a-c] matches any character except 'a', 'b', or 'c'.
  • Within a bracket expression, the *, ?, and \ characters match themselves. The (-) character matches itself if it is the first character within the brackets, or the first character after the '!', if negating.
  • The { } characters are a group of subpatterns, where the group matches if any subpattern in the group matches. The ',' character is used to separate subpatterns. Groups cannot be nested.
  • Leading period / dot characters in file names are treated as regular characters in match operations. For example, the '*' glob pattern matches file name '.login'.
  • Some special characters are not supported. A partial list of unsupported special characters: #, ^, â, ê, î, ç, ¿, SPACE.
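
The rules above can be tried out with a small Python translator from glob patterns to regular expressions. This is a simplified sketch of the listed rules, not the matcher the Snap uses; the function names glob_to_regex and glob_match are hypothetical, and the unsupported-character rule is not modeled.

```python
import re


def glob_to_regex(pattern: str) -> str:
    """Translate a glob pattern into an equivalent regular expression."""
    out = []
    in_group = False
    i, n = 0, len(pattern)
    while i < n:
        c = pattern[i]
        i += 1
        if c == "\\":
            # A backslash escapes the next character.
            if i == n:
                raise ValueError("pattern ends with a dangling backslash")
            out.append(re.escape(pattern[i]))
            i += 1
        elif c == "*":
            out.append("[^/]*")   # zero or more characters, same directory
        elif c == "?":
            out.append("[^/]")    # exactly one character
        elif c == "[":
            negate = i < n and pattern[i] == "!"
            start = i + 1 if negate else i
            end = pattern.find("]", start)
            if end <= start:
                raise ValueError("empty or unterminated bracket expression")
            body = pattern[start:end]
            # Inside brackets *, ? and \ are literal; escape regex specials.
            for special in ("\\", "^", "["):
                body = body.replace(special, "\\" + special)
            out.append("[" + ("^" if negate else "") + body + "]")
            i = end + 1
        elif c == "{":
            if in_group:
                raise ValueError("groups cannot be nested")
            in_group = True
            out.append("(?:")
        elif c == "}" and in_group:
            in_group = False
            out.append(")")
        elif c == "," and in_group:
            out.append("|")       # separates subpatterns in a group
        else:
            out.append(re.escape(c))
    if in_group:
        raise ValueError("unterminated group")
    return "".join(out)


def glob_match(pattern: str, name: str) -> bool:
    """Return True if the whole file name matches the glob pattern."""
    return re.fullmatch(glob_to_regex(pattern), name) is not None
```

For instance, glob_match("*.{java,class}", "Main.class") and glob_match("*", ".login") both return True, while glob_match("foo.?", "foo.cc") returns False.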



Polling interval in seconds

Required. The time gap between poll requests (in seconds).

Example: 10

Default value: 30

Polling timeout

Required. The period of time after which file polling must end. Specify the number here and select its time unit in the next property.

Example:

  • -1 (to poll indefinitely)
  • 0 (to poll once)
  • 60 (its unit shown in the next property)

Default value: 30

Note

Configure this property based on the expected number of files in the target directory. If there are many files and this property's value is small, the Snap may complete the operation and stop before the file is found. 


Polling-timeout unit

Unit for the polling timeout. Allowed values are SECONDS, MINUTES and HOURS.

Example: SECONDS

Default value: MINUTES
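
As an illustration of how the Polling timeout value and its unit combine, the following Python sketch converts the pair to seconds. The function name timeout_in_seconds is hypothetical, and rejecting negative values other than -1 is an assumption; the documentation above only defines -1 (poll indefinitely) and 0 (poll once).

```python
import math

# Seconds per allowed unit value.
UNIT_SECONDS = {"SECONDS": 1, "MINUTES": 60, "HOURS": 3600}


def timeout_in_seconds(value: int, unit: str = "MINUTES") -> float:
    """Convert a Polling timeout value and unit to seconds.

    Returns math.inf for -1 (poll indefinitely) and 0.0 for 0 (poll once).
    """
    if unit not in UNIT_SECONDS:
        raise ValueError("unit must be SECONDS, MINUTES or HOURS")
    if value == -1:
        return math.inf
    if value < 0:
        # Assumption: other negative values are treated as invalid.
        raise ValueError("timeout must be -1, 0 or a positive number")
    return float(value * UNIT_SECONDS[unit])


# The defaults (30, MINUTES) correspond to 1800 seconds.
print(timeout_in_seconds(30))
```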

Examples

