...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
Snap type:
...
Read
...
Description:
This Snap polls the target directory and looks for file names matching the specified pattern. It continues polling at the intervals specified in the Polling interval property until the timeout (specified in the Polling timeout property) is reached. Once polling is done, the Snap lists all files whose names match the specified pattern.
Note |
---|
This Snap polls the target directory only; subdirectories, if any, are ignored. Use the Directory Browser Snap if you want to poll files in the directory and all subdirectories, as well as to poll a directory only once. |
The File Poller Snap can be used in situations where an operation must be triggered when a specific file is found in the target directory. The pipeline can be configured with additional Snaps to process the Snap's output and delete the matched file before the Polling interval value is reached.
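As a rough illustration of the polling behavior described above, the following Python sketch polls a local directory at a fixed interval until a timeout and produces one document per matching file. The function name `poll_directory` and its parameters are hypothetical and are not part of the Snap. The sketch only approximates the described behavior for a local file system and does not reproduce the Snap's implementation.

```python
import fnmatch
import os
import time


def poll_directory(directory, pattern, interval_s, timeout_s, sleep=time.sleep):
    """Yield one list of {"path": ...} documents per poll until the timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        # Only the target directory itself is inspected; subdirectories are ignored.
        matches = sorted(
            entry.name
            for entry in os.scandir(directory)
            if entry.is_file() and fnmatch.fnmatchcase(entry.name, pattern)
        )
        yield [{"path": os.path.join(directory, name)} for name in matches]
        # Stop when the next poll would start at or after the deadline.
        if time.monotonic() + interval_s >= deadline:
            break
        sleep(interval_s)
```

Each yielded list corresponds to the documents written to the output view after one interval. A timeout of 0 in this sketch results in a single poll.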
- Expected upstream Snaps: Any Snap with a document output view, such as Mapper, JSON Generator.
- Expected downstream Snaps: Any Snap with a document input view, such as File Reader, Mapper, JSON Formatter.
- Expected input: An optional document to evaluate expressions in the Directory and/or File filter properties. Note that each input document will trigger the execution of the Snap.
- Expected output: A full path in each document as a value for a key "path". If there are multiple files matching the filter, the same number of documents will be provided in the output view after each interval.
Code Block |
---|
[
{
"path" : "sftp://sftp.smart.com/home/voo/test1.csv"
},
{
"path" : "sftp://sftp.smart.com/home/voo/test2.csv"
}
] |
...
IAM Roles for Amazon EC2
The 'IAM_CREDENTIAL_FOR_S3' feature lets the Snap access S3 files from an EC2 Groundplex without an Access-key ID and Secret key in the AWS S3 account. The IAM credential stored in the EC2 metadata is used to gain access rights to the S3 buckets. To enable this feature, add the following line to global.properties and restart the JCC (node):
jcc.jvm_options = -DIAM_CREDENTIAL_FOR_S3=TRUE
Please note this feature is supported in the EC2-type Groundplex only.
For more information on IAM Roles, see http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html
...
Modes
...
Limitation
- For S3 folders, the Snap currently supports polling the target directory for a maximum of 10,000 files. If there are more than that, the Snap does not provide any output.
...
This Snap uses account references created on the Accounts page of SnapLogic Manager to handle access to this endpoint. This Snap supports several account types, as listed in the table below, or no account. See Binary Account for information on setting up these types of accounts. Account types supported by each protocol are as follows:
Protocol | Account types |
---|---|
sldb | no account |
s3 | AWS S3 |
ftp | Basic Auth |
sftp | Basic Auth, SSH Auth |
ftps | Basic Auth |
hdfs | no account |
webhdfs | no account |
smb | SMB |
wasb | Azure Storage |
wasbs | Azure Storage |
gs | Google Storage |
file | Local file system |
Required settings for account types are as follows:
Account Type | Settings |
---|---|
Basic Auth | Username, Password |
AWS S3 | Access-key ID, Secret key |
SSH Auth | Username, Private key |
SMB | Domain, Username, Password |
Azure Storage | Account name, Primary access key |
Google Storage | Approval prompt, Application scope, Auto-refresh token (Read-only properties are Access token, Refresh token, Access token expiration, OAuth2 Endpoint, OAuth2 token and Access type.) |
...
Input | This Snap has at most one document input view. |
---|---|
Output | This Snap has exactly one document output view. |
Error | This Snap has at most one document error view. |
...
Settings
...
Label
...
Directory
This property is a URL path to the directory where files will be searched. The expected syntax is:
[protocol]://[host][:port]/[path]
The supported file protocols are:
- s3:
- file:
- ftp:
- ftps:
- sftp:
- hdfs:
- webhdfs:
- sldb:
- smb:
- wasb:
- wasbs:
- gs:
Example:
- s3:///[bucket_name]/[dir_path]
- sftp://ftp.snaplogic.com:22/home/test/dir
- ftp://ftp.snaplogic.com/test/csv
- $directory
- _directory (A key-value pair with "directory" key should be defined as a pipeline parameter. Ensure that the '=' button is enabled when using parameters.)
- file:///D:/testFolder/ (if the Snap is executed in the Windows Groundplex and needs to access the D: drive)
- wasb:///Snaplogic/testDir/ or wasbs:///Snaplogic/testDir/
- gs:///testBucket/testDir/
Default value: [None]
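To make the `[protocol]://[host][:port]/[path]` syntax concrete, here is a small Python sketch that splits a Directory value into its parts with the standard library. The helper name `split_directory_url` is hypothetical. This only illustrates the URL shape and says nothing about how the Snap itself parses the property.

```python
from urllib.parse import urlsplit


def split_directory_url(url):
    """Split a directory URL into protocol, host, port, and path."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,  # None when the host is omitted, as in s3:///...
        "port": parts.port,      # None when no port is given
        "path": parts.path,
    }


print(split_directory_url("sftp://ftp.snaplogic.com:22/home/test/dir"))
print(split_directory_url("s3:///bucket_name/dir_path"))
```

For the forms with three slashes (s3, wasb, wasbs, gs, file), the host part is empty and the bucket, container, or drive appears as the first segment of the path.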
...
File filter
Required. A GLOB pattern to be applied to select one or more files in the directory. The File filter property can be a JavaScript expression which will be evaluated with values from the input view document.
Example:
- *.txt
- ab????xx.csv
Default value: [None]
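The two example patterns can be tried out with Python's `fnmatch` module, which implements a similar (though not necessarily identical) glob syntax: `*` matches any run of characters and `?` matches exactly one. The file names and the `select` helper below are made up for illustration.

```python
from fnmatch import fnmatchcase


def select(names, pattern):
    """Return the names that match a glob pattern (case-sensitive)."""
    return [name for name in names if fnmatchcase(name, pattern)]


names = ["notes.txt", "ab1234xx.csv", "ab12xx.csv", "data.csv"]
print(select(names, "*.txt"))         # names ending in .txt
print(select(names, "ab????xx.csv"))  # "ab", any four characters, then "xx.csv"
```

Here `ab12xx.csv` is not selected by the second pattern because only two characters sit between `ab` and `xx.csv`.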
...
Expand | ||
---|---|---|
| ||
Use glob patterns in this filter to select one or more files in the directory. For example:
The following rules are used to interpret glob patterns:
|
...
Polling interval in seconds
...
Required. The time-gap between each poll request (in seconds).
Example: 10
Default value: 30
...
Polling timeout
Required. A period of time after which file polling must end. You specify the number here and select the time unit in the next property.
Example:
- -1 (to poll indefinitely)
- 0 (to poll once)
- 60 (its unit shown in the next property)
Default value: 30
Note |
---|
Configure this property based on the expected number of files in the target directory. If there are many files and this property's value is small, the Snap may complete the operation and stop before the file is found. |
...
Unit for the polling timeout. Allowed values are SECONDS, MINUTES and HOURS.
Example: SECONDS
Default value: MINUTES
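The way the timeout value and its unit combine can be sketched in Python as follows. The function name `polling_window` and the use of `None` for indefinite polling are choices made for this illustration only; the defaults mirror the documented default values of 30 and MINUTES.

```python
UNIT_SECONDS = {"SECONDS": 1, "MINUTES": 60, "HOURS": 3600}


def polling_window(timeout=30, unit="MINUTES"):
    """Return the polling window in seconds, or None for indefinite polling."""
    if timeout == -1:
        return None  # -1 means poll indefinitely
    # 0 yields a window of 0 seconds, which corresponds to polling once.
    return timeout * UNIT_SECONDS[unit]


print(polling_window())               # default: 30 MINUTES
print(polling_window(60, "SECONDS"))
print(polling_window(-1))
```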
...
Select this check box to instruct the Snap to provide an output only when there is a change in the contents of the polled directory. When selected, the Snap provides an output during its initial run if it finds matching documents. However, it provides polling results in the next run only if the polled directory has newer files that match the pattern specified.
Default value: Selected
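One way to picture the "output only on change" behavior is to remember which paths earlier polls have already reported. The Python sketch below does this with a set. The helper name `changed_output` is hypothetical, and the text above does not state whether the Snap emits only the new files or the full listing when a change is detected; this sketch assumes only the new ones.

```python
def changed_output(seen, current_paths):
    """Return documents for paths not reported by earlier polls; update `seen`."""
    new_paths = [path for path in current_paths if path not in seen]
    seen.update(new_paths)
    return [{"path": path} for path in new_paths]


seen = set()
first = changed_output(seen, ["/in/a.csv", "/in/b.csv"])                # initial run
second = changed_output(seen, ["/in/a.csv", "/in/b.csv"])               # nothing new
third = changed_output(seen, ["/in/a.csv", "/in/b.csv", "/in/c.csv"])   # one new file
```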
...
Specifies the maximum number of retry attempts in case of a network failure.
Example: 3
Minimum value: 0
Default value: 0
...
Specifies the minimum number of seconds for which the Snap must wait before attempting recovery from a network failure.
Example: 3
Minimum value: 1
Default value: 1
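The interplay of the retry count and the retry interval can be sketched in Python as below. The helper `with_retries` is hypothetical, and treating `OSError` as the stand-in for a network failure is an assumption of this sketch, not a statement about the Snap's internals.

```python
import time


def with_retries(operation, retries=0, retry_interval_s=1, sleep=time.sleep):
    """Call operation(); on OSError, wait and retry up to `retries` times."""
    attempt = 0
    while True:
        try:
            return operation()
        except OSError:
            if attempt >= retries:
                raise  # retries exhausted: surface the failure
            attempt += 1
            sleep(retry_interval_s)
```

With the default of 0 retries, the first failure is raised immediately. With a value of 3, the operation is attempted up to four times in total.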
...
Examples
Writing out the List of Files in a Specific Directory Using the File Poller
...
Overview
You can use this Snap to poll the target directory and find file names matching the specified pattern.
...
Snap Type
The File Poller Snap is a Read-type Snap.
Prerequisites
Support for Ultra Pipelines
Works in Ultra Pipelines.
Limitations
For S3 folders, the Snap currently supports polling the target directory for a maximum of 10,000 files. If there are more than that, the Snap does not provide any output.
Known Issues
The Snap is expected to fail if there is no account selected. However, the Snap may execute successfully without any account if all the following conditions exist:
- The Snap is executed in an EC2-instance Snaplex where your pipeline runs with an IAM role.
- The S3 bucket accessed by the Snap includes the necessary permissions for use with the specific IAM role.
- The following global property is set as a node property in the plex:
jcc.jvm_options = -DIAM_CREDENTIAL_FOR_S3=TRUE
Behavior Change
The File Poller Snap now honors the value specified in the Polling timeout field instead of polling indefinitely when a polling operation performs poorly. To guard against polling operations that never return, the polling is done in a separate thread. When the execution time exceeds the value specified in the Polling timeout, a timeout exception is written to the log to prevent the polling from getting stuck, and the Snap continues polling depending on the Polling timeout:
- If the Polling timeout value is greater than 0, the Snap polls until the end of the polling window.
- If it is 0, the Snap stops polling.
- If it is -1, the Snap continues polling.
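The idea of running a poll in a separate thread and logging a timeout instead of hanging can be sketched in Python with `concurrent.futures`. This is an illustration of the general technique only; the function `poll_once_with_limit`, the logger name, and the choice to return `None` on timeout are assumptions and do not describe the Snap's actual code.

```python
import logging
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

log = logging.getLogger("file_poller_sketch")


def poll_once_with_limit(poll, limit_s):
    """Run poll() in a worker thread; return its result, or None on timeout."""
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(poll)
    try:
        return future.result(timeout=limit_s)
    except FutureTimeout:
        # Record the timeout and let the caller carry on with the next poll.
        log.warning("poll exceeded %s seconds", limit_s)
        return None
    finally:
        # Do not block on a slow worker; it finishes in the background.
        executor.shutdown(wait=False)
```

A caller would invoke this once per polling interval and decide, based on the Polling timeout value, whether another poll is due.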
Supported Protocols
Account types supported by each protocol are as follows:
Protocol | Account types |
---|---|
sldb | no account |
s3 | AWS S3 |
ftp | Basic Auth |
sftp | Basic Auth, SSH Auth |
ftps | Basic Auth |
hdfs | no account |
smb | SMB |
wasb | Azure Storage |
wasbs | Azure Storage |
gs | Google Storage |
file | Local file system |
Note |
---|
The FTPS file protocol works only in explicit mode. The implicit mode is not supported. |
Required settings for account types are as follows:
Account Type | Settings |
---|---|
Basic Auth | Username, Password |
AWS S3 | Access-key ID, Secret key |
SSH Auth | Username, Private key, Key Passphrase |
SMB | Domain, Username, Password |
Azure Storage | Account name, Primary access key |
Google Storage | Approval prompt, Application scope, Auto-refresh token |
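The two tables above can be combined into a simple lookup from protocol to the account types it accepts and the settings each one needs. The Python sketch below encodes them as dictionaries; the names `ACCOUNT_TYPES`, `REQUIRED_SETTINGS`, and `account_options` are hypothetical, and the `file` protocol is left out because no settings are listed for it.

```python
ACCOUNT_TYPES = {
    "sldb": [], "hdfs": [],  # no account
    "s3": ["AWS S3"],
    "ftp": ["Basic Auth"],
    "ftps": ["Basic Auth"],
    "sftp": ["Basic Auth", "SSH Auth"],
    "smb": ["SMB"],
    "wasb": ["Azure Storage"],
    "wasbs": ["Azure Storage"],
    "gs": ["Google Storage"],
}

REQUIRED_SETTINGS = {
    "Basic Auth": ["Username", "Password"],
    "AWS S3": ["Access-key ID", "Secret key"],
    "SSH Auth": ["Username", "Private key", "Key Passphrase"],
    "SMB": ["Domain", "Username", "Password"],
    "Azure Storage": ["Account name", "Primary access key"],
    "Google Storage": ["Approval prompt", "Application scope", "Auto-refresh token"],
}


def account_options(protocol):
    """Map each account type usable with `protocol` to its required settings."""
    return {name: REQUIRED_SETTINGS[name] for name in ACCOUNT_TYPES[protocol]}


print(account_options("sftp"))
```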
Snap Views
Type | Format | Number of Views | Examples of Upstream and Downstream Snaps | Description |
---|---|---|---|---|
Input | Document | At most one | Mapper, JSON Generator | An optional document to evaluate expressions in the Directory and/or File filter properties. Note that each input document will trigger the execution of the Snap. |
Output | Document | Exactly one | File Reader, Mapper, JSON Formatter | A full path in each document as a value for a key "path". If multiple files match the filter, the same number of documents will be provided in the output view after each interval. |
Error | Error handling is a generic way to handle errors without losing data or failing the Snap execution. You can handle the errors that the Snap might encounter when running the pipeline by choosing one of the following options from the When errors occur list under the Views tab:
Learn more about Error handling in Pipelines. |
Snap Settings
Field Name | Field Type | Description |
---|---|---|
Label* | String | Specify the name for the Snap. You can modify this to be more specific, especially if you have more than one of the same Snap in your pipeline. |
Directory (Default value: N/A) | String/Expression | Specify the URL path to the directory where files will be searched, in the format [protocol]://[host][:port]/[path]. The supported file protocols are listed under Supported Protocols. |
File filter* (Default value: [None]) | String/Expression | Specify a GLOB pattern to be applied to select one or more files in the directory. The File filter property can be a JavaScript expression which will be evaluated with values from the input view document. |
Polling interval in seconds* | Integer | Specify the time gap between each poll request (in seconds). |
Polling timeout* (Default value: 30) | Integer | Specify a period of time after which file polling must end. If the Polling timeout is set to -1, the Snap polls indefinitely; if it is set to 0, the Snap polls once; if it is greater than 0, the Snap polls until the end of the polling window. |
Polling-timeout unit | Dropdown list | Specify the unit for the polling timeout. |
Only Output on Change (Default value: Selected) | Checkbox | Select this check box to instruct the Snap to provide an output only when there is a change in the contents of the polled directory. When selected, the Snap provides an output during its initial run if it finds matching documents. However, it provides polling results in the next run only if the polled directory has newer files that match the pattern specified. |
Number of retries (Minimum value: 0) | Integer | Specify the maximum number of retry attempts that the Snap must make in case there is a network failure, and the Snap is unable to read the target file. If the value is larger than 0, the Snap first downloads the target file into a temporary local file. If any error occurs during the download, the Snap waits for the time specified in the Retry interval and attempts to download the file again from the beginning. When the download is successful, the Snap streams the data from the temporary file to the downstream Pipeline. All temporary local files are deleted when they are no longer needed. Ensure that the local drive has sufficient free disk space to store the temporary local file. |
Retry interval (seconds) (Minimum value: 1, Default value: 1) | Integer | Specify the minimum number of seconds for which the Snap must wait before attempting recovery from a network failure. |
Advanced properties | | Use this field set to define specific settings for polling files. |
Properties | Dropdown list | Choose either of the following options: |
Values | String/Expression | |
Snap execution (Default value: Validate & Execute) | Dropdown list | Select one of the following three modes in which the Snap executes: |
Troubleshooting
Error | Reason | Resolution |
---|---|---|
| The library that we use for SFTP connections no longer supports deprecated signature protocols by default. This changed with the 4.33 GA release. | Add the algorithm to the
Learn more: Configuration Options |
| If you have set the Polling Timeout value to a few seconds, it results in the S3 request getting canceled. | Increase the value of Polling Timeout (in seconds) for the Snap to work successfully. We recommend that you set the Polling Timeout value to the default value of 30 minutes or more to fetch all the data from S3. |
Examples
...
Write a List of Files in a Specific Directory
This example pipeline demonstrates how to list the files in a specific directory. After the File Poller Snap lists the files, you write the output to a file and run the File Poller Snap again to check whether the new file was created as expected. To ensure that the File Poller Snap doesn't pick up any existing files, you use an unusual extension for this example.
...
Download this pipeline.
...
title | Understanding the pipeline |
---|
File Poller
...
Configure the File Poller Snap to poll a directory for all files with the extension ".JSON2".
...
Connect JSON Formatter and File Writer Snaps to process the output and write it to a file. Use the JSON Formatter Snap with the default settings. In the File Writer Snap, use the Date.now() function to give the file a name, so a new file is created every time you run the pipeline.
File Writer
...
When used in production, the output from the File Poller Snap can be used to trigger specific tasks as needed. In this example, you write it to a file. As expected, the file contains no output, as there is no file in the target directory with the extension ".JSON2".
...
...
Next, add the second File Poller Snap and configure it exactly as the first one. Once again, you add a JSON Formatter and File Writer Snap with the same settings as for the previous pair. But this time, the file created is not blank: it lists the file that you created using the first three Snaps in the pipeline.
...
Download this pipeline.
...
Poll a Directory Using a Trigger Task from ServiceNow
In this example, you call a Trigger Task from ServiceNow to poll a directory for files of a specific type.
...
Download this pipeline
Expand | ||
---|---|---|
| ||
To make this example work, you must perform the following tasks:
|
...
Create the File Poller Pipeline
You design a pipeline containing the following Snaps: CSV Parser, Mapper, and File Poller. You will note that the File Poller has open input and output views. This is because it receives data from the Trigger Task associated with it and returns processed data back to the same Trigger Task.
- CSV Parser: Use the CSV Parser Snap with the default settings.
- Mapper: Configure the Mapper Snap to receive the parsed CSV data and map the message in the CSV document to the $msg variable.
- File Poller: Configure the File Poller Snap to poll the /QA/Documentation/File Poller/ directory for all files that match the pattern contained in the $msg variable, which you use as a file filter parameter.
...
...
Create a Trigger Task for the File Poller Pipeline
Save the pipeline, click the (Create Task) button, and configure the Trigger Task. Click Update to complete setting up the task; then navigate to the Manager to view the Trigger Task's properties. Copy the Cloud URL and authorization bearer token, and navigate to ServiceNow to set up the API call.
...
Create a REST call in ServiceNow
Create a REST call in ServiceNow by appending the authorization token to the Cloud URL that you copied in the previous step.
For details on how to set up REST calls using ServiceNow, see ServiceNow documentation. While configuring the REST call, ensure that:
Click Test to check the REST call. For a successful execution, the pipeline returns a list of files whose extension matches the value in the Content field.
Download this pipeline.
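Appending the authorization token to the Cloud URL, as described in the ServiceNow step, can be sketched in Python as follows. The query parameter name `bearer_token`, the helper `with_bearer_token`, and the placeholder URL and token are assumptions made for this illustration; use the exact URL and token shown for your Trigger Task in Manager.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_bearer_token(cloud_url, token):
    """Return cloud_url with the token appended as a query parameter."""
    parts = urlsplit(cloud_url)
    # Keep any existing query parameters and add the token at the end.
    query = parse_qsl(parts.query) + [("bearer_token", token)]
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder values for illustration only.
print(with_bearer_token("https://example.invalid/api/task", "abc123"))
```

The resulting URL is what you would paste into the endpoint field of the REST call; no request is sent by this sketch.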
Downloads
...
Snap Pack History