
Snap Type: Read
Description:

This Snap reads any type of data from an S3 bucket. If a value is given in the Version ID property, the Snap reads that specific version of the S3 file object. The suggest feature can be used to view and select S3 buckets, subdirectories, files, and version IDs.

ETL Transformations & Data Flow

This Snap extracts data from S3 buckets. 

Input & Output

  • Expected input: An upstream Snap is optional. Any Snap with a document output view, such as Mapper or File Writer, can be connected upstream. Input documents supply the key-value pairs used to evaluate expression properties in the S3 File Reader Snap. Each input document, if any, triggers one read operation.
  • Expected output: Binary data read from the S3 location specified in the File property, with header information about the binary stream. Any Snap with a binary input view, such as CSV Parser, JSON Parser, or XML Parser, can be connected downstream. The binary data and header information can be previewed at the output of the Snap. An example header:

{
	"content-length": "96258",
	"last-modified": {
		"_snaptype_datetime": "2014-06-26T23:27:01.000 UTC"
	},
	"content-disposition": "attachment; filename=\"leads.csv\"",
	"content-location": "s3:///mr_test/leads.csv",
	"content-type": "text/csv",
	"etag": "730145bec198288e9f428193fde851b7"
}
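
To illustrate how a downstream consumer might interpret the header shown above, here is a minimal Python sketch. It assumes the header has already been loaded into a dictionary with the same keys as the sample; the function name parse_header and the returned field names are hypothetical and not part of the Snap.

```python
import re
from datetime import datetime, timezone


def parse_header(header: dict) -> dict:
    """Extract typed values from a header shaped like the sample output."""
    # The sample reports content-length as a string, so convert it.
    size = int(header["content-length"])

    # last-modified is wrapped in a "_snaptype_datetime" object.
    raw = header["last-modified"]["_snaptype_datetime"]
    modified = datetime.strptime(raw, "%Y-%m-%dT%H:%M:%S.%f UTC")
    modified = modified.replace(tzinfo=timezone.utc)

    # The file name is embedded in the content-disposition value.
    match = re.search(r'filename="([^"]*)"', header.get("content-disposition", ""))
    filename = match.group(1) if match else None

    return {"size": size, "modified": modified, "filename": filename}
```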



Prerequisites

IAM Roles for Amazon EC2

The IAM_CREDENTIAL_FOR_S3 feature lets Groundplex nodes hosted in the EC2 environment access S3 files without an Access-key ID and Secret key in the AWS S3 account used by the Snap. The IAM credential stored in the EC2 instance metadata provides the access rights to the S3 buckets.

Steps to enable the IAM_CREDENTIAL_FOR_S3 feature:
  1. Open Manager.
  2. Open the Snaplexes tab of the project that contains the EC2-based Groundplex.
  3. Click the Groundplex to open its Properties.
  4. Open the Node Properties tab.
  5. Click + to add a new row in the Global properties section.
  6. Enter jvm_options in Key and -DIAM_CREDENTIAL_FOR_S3=TRUE in Value.
  7. Restart the JCC (node).


Note
  • IAM roles are supported only on Groundplex nodes hosted in the EC2 environment.
  • The IAM role stored on the EC2 instance requires List, Read, and Write permissions.
  • S3 account validation is not supported when the IAM role property is enabled.

For more information on IAM Roles, see IAM Roles for Amazon EC2.
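
As a rough illustration of the List, Read, and Write permissions mentioned in the note above, the following Python sketch builds an IAM policy document for a single bucket. The mapping of those permissions to the actions s3:ListBucket, s3:GetObject, and s3:PutObject is an assumption, not something this page specifies, and the function name build_s3_policy is hypothetical. Check the required actions against your own security requirements.

```python
import json


def build_s3_policy(bucket: str) -> str:
    """Return a JSON policy granting list, read, and write access to one bucket."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Assumed mapping for "List": applies to the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                # Assumed mapping for "Read" and "Write": applies to objects.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }
    return json.dumps(policy, indent=2)
```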

Support and Limitations

Works in Ultra Pipelines.

Configurations

Account & Access

This Snap uses account references created on the Accounts page of SnapLogic Manager to handle access to this endpoint. See Configuring Binary Accounts for information on setting up accounts that work with this Snap.

Required settings for account types are as follows: 

  • AWS S3 - Access-key ID, Secret key, Security token.
  • S3 Dynamic - Access-key ID, Secret key, Security token, Server-side encryption.

Views

Input: This Snap has at most one document input view.
Output: This Snap has exactly one binary output view.
Error: This Snap has at most one document error view and produces zero or more documents in the view.


Troubleshooting: This section describes typical issues you may encounter while using this Snap and how to work around them:

Settings

Label

Required. The name for the Snap. You can modify this to be more specific, especially if you have more than one of the same Snap in your pipeline.
File

Required. Specifies the URL of the S3 file from which the binary data is read. It must start with "s3:///". The suggest feature can be used to view the list of buckets, subdirectories, and files. Bucket names are suggested if the property is empty or "s3:///". Once a bucket is selected, the suggest feature lists the subdirectories and files immediately below the bucket. Names of subdirectories end with a forward slash ("/"). The suggest feature is not supported if the properties in the S3 Dynamic account are parameterized.

Note: Prerequisite

The provided account must have 'read' access to the specified S3 bucket in order to read the file successfully.

Using Expressions:

This property can be an expression with the "=" button pressed.

For example, if the File property is "s3:///mybucket/out_" + Date.now() + ".csv" then the evaluated filename is s3:///mybucket/out_2013-11-13T00:22:31.880Z.csv.
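
For readers who want to reproduce the evaluated filename outside the Snap, the following Python sketch builds the same kind of timestamped path. It only mimics the format of the evaluated value shown above; it is not how the expression language evaluates Date.now(), and the function name timestamped_s3_path is hypothetical.

```python
from datetime import datetime, timezone


def timestamped_s3_path(bucket: str, prefix: str, now: datetime, ext: str = ".csv") -> str:
    """Build a path such as s3:///mybucket/out_<UTC timestamp>.csv.

    `now` must be a timezone-aware datetime.
    """
    utc = now.astimezone(timezone.utc)
    # Millisecond precision followed by "Z", matching the example above.
    stamp = utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{utc.microsecond // 1000:03d}Z"
    return f"s3:///{bucket}/{prefix}{stamp}{ext}"
```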

Syntax:

s3:///<S3_bucket_name>@s3.<region_name>.amazonaws.com/<path>

For region names and their details, see AWS Regions and Endpoints.

Note: Region Name

The region name is optional only if the region is us-east-1. In all other cases, specify the region name using the syntax above. For example, mybucket@s3.eu-west-1.amazonaws.com.

Examples

  • s3:///mybucket@s3.eu-west-1.amazonaws.com/test.json
  • s3:///rds-sql-test-staging@s3.us-gov-east-1.amazonaws.com/test.csv
  • _filename (a key-value pair with the key "filename" must be defined as a pipeline parameter)
  • $filename (a key-value pair with the key "filename" must be defined in the input document)

Default value:  s3:///
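
The URL syntax described for the File property can be checked programmatically. The following Python sketch splits a literal URL into its bucket, region, and path, and falls back to us-east-1 when no region is given, as the region note describes. The regular expression and the names S3Location and parse_s3_url are illustrative assumptions; the Snap's own parsing rules may be stricter or looser.

```python
import re
from typing import NamedTuple


class S3Location(NamedTuple):
    bucket: str
    region: str
    key: str


# s3:///<S3_bucket_name>[@s3.<region_name>.amazonaws.com]/<path>
_PATTERN = re.compile(
    r"^s3:///(?P<bucket>[^/@]+)"
    r"(?:@s3\.(?P<region>[a-z0-9-]+)\.amazonaws\.com)?"
    r"/(?P<key>.*)$"
)


def parse_s3_url(url: str) -> S3Location:
    """Split an S3 File Reader style URL into bucket, region, and path."""
    m = _PATTERN.match(url)
    if m is None:
        raise ValueError(f"not a valid S3 file URL: {url!r}")
    # The region may be omitted only for us-east-1, so use that as the default.
    return S3Location(m["bucket"], m["region"] or "us-east-1", m["key"])
```

For example, parse_s3_url("s3:///mybucket@s3.eu-west-1.amazonaws.com/test.json") yields the bucket mybucket, the region eu-west-1, and the path test.json. Expression values such as _filename or $filename are resolved by the pipeline and are outside the scope of this sketch.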

Version ID

Enter or select the S3 file version ID. If the property is empty, the Snap reads the latest version. The suggest feature can be used to view the list of version IDs for the S3 file in the File property. The suggest feature is not supported if the properties in the S3 Dynamic account are parameterized. Each line in the suggested list also includes the last modified date and the file size to help select a version. When the property value is entered manually, only the version ID is needed. The Snap ignores the last modified date and size information of a version when it reads the file. If versioning is not enabled on the S3 bucket, no version ID is suggested. The following versions are omitted from the suggested list because their files cannot be downloaded:

  • Versions of files that existed before versioning was enabled. These versions do not have a version ID assigned to them.
  • Version IDs with the 'Deleted Marker' resource type.

Example:  xvcnB8gPi37l3hbOzlsRFxjVwQ.numQz

Default value:  [None]
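
The omission rules for the suggested list can be sketched as a simple filter. The Python example below assumes each version is represented as a dictionary with the hypothetical keys version_id and is_delete_marker, and it assumes that a version without an assigned ID appears as None, an empty string, or the literal string "null". None of these representations come from this page.

```python
def downloadable_versions(versions):
    """Keep only the versions that could appear in the suggested list."""
    kept = []
    for version in versions:
        # Assumption: a missing ID shows up as None, "", or "null".
        has_id = version.get("version_id") not in (None, "", "null")
        # Delete markers cannot be downloaded, so they are skipped.
        is_marker = version.get("is_delete_marker", False)
        if has_id and not is_marker:
            kept.append(version)
    return kept
```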

Version ID suggestion interval

Enter the time interval for the Version ID suggestion. Enter two rows to provide a start date and an end date. If only one row is provided, the interval runs from that date until now. If empty, all version IDs are suggested. This property can be useful when an S3 file has many versions. It is used for the Version ID suggestion only, not during Snap preview or execution.
Year

Enter the year as a 4-digit integer.

Example:  2017

Default value:  [None]

Month

Enter the month as an integer.

Examples:   9, 09, 12, and so on.

Default value:  [None]

Date

Enter the day of the month.

Examples:   28, 09, 12, and so on.

Default value:  [None]

Zone

Enter or select a time zone ID from the suggested list. Leave empty for UTC. Only zone IDs in the suggested list are supported.

Examples:   US/Pacific

Default value:  [None]
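
The interval rules (no rows, one row, or two rows) can be expressed as a small date-window calculation. The Python sketch below is illustrative only: the function names are hypothetical, and treating the end date as inclusive of the whole day is an assumption that this page does not confirm.

```python
from datetime import datetime, timedelta, timezone, tzinfo
from typing import List, Optional, Tuple

Row = Tuple[int, int, int]  # (year, month, day of the month)


def suggestion_window(rows: List[Row], tz: tzinfo = timezone.utc,
                      now: Optional[datetime] = None):
    """Return (start, end) for the suggestion interval, or None for "all versions"."""
    if not rows:
        return None  # empty property: every version ID is suggested
    start = datetime(*rows[0], tzinfo=tz)
    if len(rows) == 1:
        # One row: from that date until now.
        return start, (now or datetime.now(tz))
    # Two rows: assumed to include the whole end date, so the window
    # closes at the start of the following day.
    end = datetime(*rows[1], tzinfo=tz) + timedelta(days=1)
    return start, end


def in_window(last_modified: datetime, window) -> bool:
    """Check whether a version's last modified time falls inside the window."""
    if window is None:
        return True
    start, end = window
    return start <= last_modified < end
```

A zone ID such as US/Pacific could be passed as tz=zoneinfo.ZoneInfo("US/Pacific") on Python 3.9 or later, provided the time zone database is installed on the machine.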

Enable staging

If selected, the Snap downloads the source file into a local temporary file. When the download is complete, it streams the data from the temporary file to the output view. This property prevents the Snap from being blocked by a slow downstream pipeline. The local disk must have at least as much free space as the expected file size.

Default value: Not selected


Note

Some Snaps may take a long time to process large amounts of data. This, in turn, could lead to connection timeouts, causing the pipeline to fail. Selecting this property saves the data on your local disk, enabling you to avoid such timeouts.
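
The staging behavior described above can be sketched in a few lines of Python. This is a conceptual illustration, not the Snap's implementation: the function name stage_then_stream and the chunk size are hypothetical, and any readable binary stream stands in for the S3 download.

```python
import shutil
import tempfile
from typing import BinaryIO, Iterator


def stage_then_stream(source: BinaryIO, chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Copy the whole source to a temporary file, then yield it in chunks."""
    with tempfile.TemporaryFile() as staged:
        # The full download finishes here, before any chunk is handed
        # downstream, so a slow consumer cannot stall the remote read.
        shutil.copyfileobj(source, staged)
        staged.seek(0)
        while True:
            chunk = staged.read(chunk_size)
            if not chunk:
                break
            yield chunk
    # Leaving the with-block deletes the temporary file.
```

The trade-off is the one the property description names: the remote connection is held open only for the duration of the download, at the cost of local disk space equal to the file size.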


Number of retries

Specifies the maximum number of retry attempts that the Snap must make in case there is a network failure, and the Snap is unable to read the target file.

If the value is larger than 0, the Snap overrides the Enable staging value to true and downloads the S3 file to a temporary local file. If any error occurs during the download, the Snap waits for the time specified in the Retry interval and attempts to download the file again from the beginning. When the download is successful, the Snap starts to stream the data from the temporary file to the downstream Pipeline. All temporary local files are deleted when they are no longer needed.


Info

Ensure that the local drive has sufficient free disk space to store the temporary local file.

Example:  3

Minimum value: 0

Default value: 0
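
The retry behavior can be illustrated with the following Python sketch, which downloads to a temporary file and restarts from the beginning after each failure. It is a simplified model under stated assumptions: the names download_with_retries and open_source are hypothetical, network failures are represented by OSError, and the sketch always stages to a temporary file, whereas the Snap forces staging only when the number of retries is larger than 0.

```python
import tempfile
import time
from typing import BinaryIO, Callable


def download_with_retries(open_source: Callable[[], BinaryIO],
                          retries: int = 0,
                          retry_interval: float = 1.0,
                          chunk_size: int = 64 * 1024,
                          sleep: Callable[[float], None] = time.sleep) -> BinaryIO:
    """Stage a download in a temporary file, retrying from scratch on failure."""
    attempt = 0
    while True:
        staged = tempfile.TemporaryFile()
        try:
            with open_source() as source:
                while chunk := source.read(chunk_size):
                    staged.write(chunk)
            staged.seek(0)
            return staged  # caller streams from here and closes it when done
        except OSError:
            staged.close()  # discard the partial download
            if attempt >= retries:
                raise  # retry budget exhausted
            attempt += 1
            sleep(retry_interval)  # wait before starting over
```

With retries=3 and retry_interval=3, a source that fails twice and then succeeds results in two waits of 3 seconds each before the data becomes available.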


Retry interval (seconds)

Specifies the minimum number of seconds for which the Snap must wait before attempting recovery from a network failure.

Example:  3

Minimum value: 1

Default value: 1

Get Object Tags

Select this check box to include object tags in the header of the output binary data. See Object Tagging for more information on object tags.

You must have the S3:GetObjectTagging permission to be able to use this feature.

Default value: Not selected


Examples


Basic Use Case

The following pipeline describes how the Snap functions as a standalone Snap in a pipeline:










The Snap is configured with the following parameters:

A preview of the output from executing this pipeline is shown below:

The exported pipeline is available in the Downloads section below.

Typical Snap Configurations

Key configuration of the Snap lies in how the values are passed. Values can be passed:

  • Without Expressions: Values are passed directly in the Snap.

         

  • With Expressions:
    • Using pipeline parameters: Values are passed as pipeline parameters:

                    


A few examples of how the Snap's suggestion works:


  • Sample S3 File Reader Snap with bucket names suggested
  • Sample S3 File Reader Snap with subdirectories and files suggested
  • Sample S3 File Reader Snap with all version IDs suggested without interval
  • Sample S3 File Reader Snap with Version ID suggestion interval and time zone IDs suggested
  • Sample S3 File Reader Snap with version IDs suggested with interval from 11/21/2017 to 11/30/2017 (as given above)

Downloads


See Also

Binary Snap Pack