Snap type: | Read |
---|
Description: | This Snap polls a directory looking for files matching the specified pattern.
- Expected upstream Snaps: Any Snap with a document output view, such as Mapper or JSON Generator.
- Expected downstream Snaps: Any Snap with a document input view, such as File Reader, Mapper, JSON Formatter.
- Expected input: An optional document to evaluate expressions in the Directory and/or File filter properties. Please note that each input document will trigger the execution of the Snap.
- Expected output: A full path in each document as a value for a key "path". If there are multiple files matching the filter, the same number of documents will be provided in the output view after each interval.
Code Block |
---|
[
{
"path" : "sftp://sftp.smart.com/home/voo/test1.csv"
},
{
"path" : "sftp://sftp.smart.com/home/voo/test2.csv"
}
] |
|
---|
Prerequisites: | IAM Roles for Amazon EC2: The 'IAM_CREDENTIAL_FOR_S3' feature lets a Snap running on an EC2 Groundplex access S3 files without an Access-key ID and Secret key in the AWS S3 account. The IAM credential stored in the EC2 metadata is used to gain access rights to the S3 buckets. To enable this feature, add the following line to global.properties and restart the JCC (node):
jcc.jvm_options = -DIAM_CREDENTIAL_FOR_S3=TRUE
Please note that this feature is supported only in EC2-type Groundplexes. For more information on IAM roles, see http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html |
---|
Support and limitations: | - Ultra pipelines: May work in Ultra Pipelines.
- Spark mode: Not supported in Spark mode.
|
---|
Account: | This Snap uses account references created on the Accounts page of SnapLogic Manager to handle access to this endpoint. This Snap supports several account types, as listed in the table below, or no account. See Binary Account for information on setting up these types of accounts. Account types supported by each protocol are as follows:
Protocol | Account types |
---|
sldb | no account |
s3 | AWS S3 |
ftp | Basic Auth |
sftp | Basic Auth, SSH Auth |
ftps | Basic Auth |
hdfs | no account |
webhdfs | no account |
smb | SMB |
wasb | Azure Storage |
wasbs | Azure Storage |
gs | Google Storage |
file | Local file system |
Required settings for account types are as follows:
Account Type | Settings |
---|
Basic Auth | Username, Password |
AWS S3 | Access-key ID, Secret key |
SSH Auth | Username, Private key |
SMB | Domain, Username, Password |
Azure Storage | Account name, Primary access key |
Google Storage | Approval prompt, Application scope, Auto-refresh token (Read-only properties are Access token, Refresh token, Access token expiration, OAuth2 Endpoint, OAuth2 token and Access type.) |
|
---|
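The protocol-to-account mapping above can be expressed as a simple lookup. The following Python sketch is illustrative only and is not part of the Snap: the `ACCOUNT_TYPES` dictionary and the `allowed_account_types` helper are hypothetical names, and the dictionary is transcribed from the protocol table in this section.

```python
from urllib.parse import urlsplit

# Transcribed from the protocol table above; an empty list means "no account".
ACCOUNT_TYPES = {
    "sldb": [],
    "s3": ["AWS S3"],
    "ftp": ["Basic Auth"],
    "sftp": ["Basic Auth", "SSH Auth"],
    "ftps": ["Basic Auth"],
    "hdfs": [],
    "webhdfs": [],
    "smb": ["SMB"],
    "wasb": ["Azure Storage"],
    "wasbs": ["Azure Storage"],
    "gs": ["Google Storage"],
    "file": ["Local file system"],
}


def allowed_account_types(directory_url):
    """Return the account types listed for the protocol of a Directory URL."""
    protocol = urlsplit(directory_url).scheme
    if protocol not in ACCOUNT_TYPES:
        raise ValueError(f"unsupported protocol: {protocol!r}")
    return ACCOUNT_TYPES[protocol]
```

For example, `allowed_account_types("sftp://sftp.smart.com/home/voo/")` returns the two account types listed for sftp, while a call with `"sldb:///"` returns an empty list.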
Views: |
Input | This Snap has at most one document input view. |
---|
Output | This Snap has exactly one document output view. |
---|
Error | This Snap has at most one document error view and produces zero or more documents in the view. If the Snap fails during the operation, an error document is sent to the error view containing the fields error, reason, resolution, and stacktrace. |
---|
|
---|
Settings |
---|
Label Required. | The name for the Snap. You can modify this to be more specific, especially if you have more than one of the same Snap in your pipeline. |
---|
Directory | This property is a URL path to the directory where files will be searched. The supported file protocols are:
- s3:
- file:
- ftp:
- ftps:
- sftp:
- hdfs:
- webhdfs:
- sldb:
- smb:
- wasb:
- wasbs:
- gs:
The property can be a JavaScript expression, which is evaluated with values from the input view document and the pipeline parameters. The property should have the syntax: [protocol]://[host][:port]/[path]
Note that "://" separates the file protocol from the rest of the URL, and the host name and port number go between "://" and "/". If the port number is omitted, the default port for the protocol is used. The host name and port number are omitted in the sldb and s3 protocols.
This property should be an absolute path for all protocols except sldb. For sldb, the Snap can access only its own project directory or the shared project directory, and cannot access other project directories. You may leave this property blank to indicate the sldb project to which the pipeline belongs.
Example:
- If you want this property to refer to the sldb project (or shared project) directory to which the pipeline of this Snap belongs, enter "sldb:///" or leave it blank.
- If the pipeline is created in a project other than the shared project and you want this property to refer to the shared project, enter "shared" or "sldb:///shared".
- s3:///[bucket_name]/[dir_path]
- sftp://ftp.snaplogic.com:22/home/test/dir
- ftp://ftp.snaplogic.com/test/csv
- $directory (The value of $directory is obtained from the input document, which should have an entry with the "directory" key. You must press the '=' button.)
- _directory (A key/value pair with the "directory" key should be defined as a pipeline parameter. You must press the '=' button.)
- file:///D:/testFolder/ (if the Snap is executed in a Windows Groundplex and needs to access the D: drive)
- wasb:///Snaplogic/testDir/ or wasbs:///Snaplogic/testDir/ (if the name of the container is 'Snaplogic')
- gs:///testBucket/testDir/ (if the bucket name is 'testBucket')
Default value: [None] |
---|
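To make the [protocol]://[host][:port]/[path] syntax concrete, here is a minimal Python sketch that splits a Directory value into its parts with the standard library. It is not the Snap's own parser. The `split_directory_url` helper is a hypothetical name, and the port is reported as `None` when omitted because the Snap's per-protocol default ports are not listed in this document.

```python
from urllib.parse import urlsplit


def split_directory_url(url):
    """Split a Directory URL into protocol, host, port and path.

    Illustrative only: host is "" and port is None when they are
    omitted, as in the sldb and s3 forms.
    """
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "host": parts.hostname or "",
        "port": parts.port,
        "path": parts.path,
    }


print(split_directory_url("sftp://ftp.snaplogic.com:22/home/test/dir"))
print(split_directory_url("s3:///[bucket_name]/[dir_path]"))
```

In the s3 and sldb forms the three slashes leave the host empty, so everything after "://" is treated as the path.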
File filter Required. | A GLOB pattern to be applied to select one or more files in the directory. The File filter property can be a JavaScript expression which will be evaluated with values from the input view document. Example: Default value: [None] |
---|
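As a rough illustration of how a GLOB pattern selects files and how each match becomes an output document with a "path" key, consider the Python sketch below. It rests on assumptions: Python's `fnmatch` rules may differ in detail from the GLOB dialect the Snap uses, and the `matching_documents` helper, the `*.csv` pattern and the `notes.txt` file name are hypothetical.

```python
from fnmatch import fnmatchcase


def matching_documents(directory_url, file_names, file_filter):
    """Build one {"path": ...} document per file name matching the pattern."""
    # Make sure the directory ends with exactly the separator we join on.
    base = directory_url if directory_url.endswith("/") else directory_url + "/"
    return [
        {"path": base + name}
        for name in file_names
        if fnmatchcase(name, file_filter)  # case-sensitive glob match
    ]


docs = matching_documents(
    "sftp://sftp.smart.com/home/voo",
    ["test1.csv", "test2.csv", "notes.txt"],
    "*.csv",
)
print(docs)
```

With these inputs the two .csv files match and `notes.txt` is skipped, which mirrors the two-document output shown in the Description.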
Polling interval in seconds Required. | The interval, in seconds, between searches of the directory. Example: 10 Default value: 30 |
---|
Polling timeout Required. | The time after which polling ends. Its unit is selected in the next property. Example:
- -1 (to poll indefinitely)
- 0 (to poll once)
- 60 (its unit shown in the next property)
Default value: 30 |
---|
Polling-timeout unit | Unit for polling timeout. Allowed values are SECONDS, MINUTES and HOURS. Example: SECONDS Default value: MINUTES |
---|
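The interaction of the three polling properties can be sketched as a small Python generator. This is an illustration under stated assumptions, not the Snap's implementation: the `poll` function and its `list_files`, `sleep` and `clock` parameters are hypothetical, and the exact moment at which the Snap stops relative to the timeout is assumed here (no further pass is started if it would begin after the deadline).

```python
import time

# Seconds per unit, for the three allowed Polling-timeout unit values.
UNIT_SECONDS = {"SECONDS": 1, "MINUTES": 60, "HOURS": 3600}


def poll(list_files, interval=30, timeout=30, unit="MINUTES",
         sleep=time.sleep, clock=time.monotonic):
    """Yield one batch of matching paths per polling pass.

    timeout == -1 polls indefinitely, timeout == 0 polls once, and a
    positive timeout is multiplied by the selected unit.
    """
    deadline = None if timeout < 0 else clock() + timeout * UNIT_SECONDS[unit]
    while True:
        yield list_files()
        if timeout == 0:
            return  # a single pass was requested
        if deadline is not None and clock() + interval > deadline:
            return  # the next pass would start after the timeout
        sleep(interval)
```

Passing `sleep` and `clock` as parameters keeps the sketch easy to exercise with a fake clock; real use would rely on the defaults.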