
Overview

The S3 File Writer Snap is a Write-type Snap that reads a binary data stream from its input view and writes it to an S3 file destination. If you provide values for File permissions, the Snap applies those permissions to the file.

Info
  • The Snap can send an MD5 checksum with each upload so that Amazon S3 can check the data for integrity and corruption. If the checksum that Amazon S3 calculates during the upload does not match the value that the Snap sent in the request, S3 does not store the object and the Snap displays an error.
  • The Snap supports the AWS S3 cloud service and also works in an AWS GovCloud setup.
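
The checksum comparison described above can be illustrated with a short Python sketch. It is not the Snap's implementation; it only shows how an MD5 digest is commonly encoded for an upload request (base64 of the raw digest) and how a mismatch would be detected. The function names are hypothetical.

```python
import base64
import hashlib


def content_md5(data: bytes) -> str:
    """Return the base64-encoded MD5 digest of data."""
    digest = hashlib.md5(data).digest()
    return base64.b64encode(digest).decode("ascii")


def upload_is_intact(data_sent: bytes, data_received: bytes) -> bool:
    """Compare the sender's checksum with one recomputed on the received bytes."""
    # Hypothetical stand-in for the check that the storage service performs.
    return content_md5(data_sent) == content_md5(data_received)


if __name__ == "__main__":
    payload = b"example payload"
    print(content_md5(payload))
    print(upload_is_intact(payload, payload))
```

A single changed byte produces a different digest, which is why a mismatch causes the upload to be rejected.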


Prerequisites

None.

Supported Features

Works in Ultra Task Pipelines.

Limitations and Known Issues

None.

Snap Views

Input

View Format: Document

Number of Views:
  • Min: 0
  • Max: 1

Examples of Upstream and Downstream Snaps:
  • CSV Formatter
  • JSON Formatter
  • XML Formatter
  • File Reader

Description: Any binary data stream.

Output

View Format: Document

Number of Views:
  • Min: 0
  • Max: 1

Examples of Upstream and Downstream Snaps:
  • File Reader
  • Mapper

Description: If an output view is open and the file write action is successful, the output view provides a document with the file name, the result, and the original data. For example:

{
    "filename": "s3:///mybucket/qatest/user_manual.json",
    "result": "overwritten",
    "original": {
        "content-type": "application/json"
    }
}

Error

View Format: Document

Number of Views:
  • Min: 1
  • Max: 1

Examples of Upstream and Downstream Snaps: N/A

Description: The error view contains the error, reason, resolution, and stack trace. For more information, see Handling Errors with an Error Pipeline.


Error handling is a generic way to handle errors without losing data or failing the Snap execution. You can handle the errors that the Snap might encounter when running the pipeline by choosing one of the following options from the When errors occur list under the Views tab:

  • Stop pipeline Execution: Stops the current pipeline execution if the Snap encounters an error.

  • Discard Error Data and Continue: Ignores the error, discards that record, and continues with the remaining records.

  • Route Error Data to Error View: Routes the error data to an error view without stopping the Snap execution.

Learn more about Error handling in Pipelines.


Snap Settings

Info
  • Asterisk ( * ): Indicates a mandatory field.

  • Suggestion icon: Indicates a list that is dynamically populated based on the configuration.

  • Expression icon: Indicates the value is an expression (if enabled) or a static value (if disabled). Learn more about Using Expressions in SnapLogic.

  • Add icon: Indicates that you can add fields in the fieldset.

  • Remove icon: Indicates that you can remove fields from the fieldset.

  • Upload icon: Indicates that you can upload files.


Field Name    Field Type    Description

Label*


Default Value: S3 File Writer
Example: S3 File Writer

String

Specify the name for the Snap. You can modify this to be more specific, especially if you have more than one of the same Snap in your pipeline.


File name*


Default Value: s3:///

String/Expression/Suggestion

Specify the URL of the S3 file to which the binary data is to be written. This Snap also supports S3 Virtual Private Cloud (VPC) endpoints. For example, s3://my-bucket@bucket.vpce-028b7814794578709-vu0vvauy.s3.us-west-2.vpce.amazonaws.com

Info

The file name must start with s3:///. You can use the suggest feature to view the list of buckets, subdirectories, and files. Bucket names are suggested if the property is empty or s3:///. After you select a bucket, the suggest feature lists the subdirectories and files immediately below the bucket. Names of subdirectories end with a forward slash ("/"). The suggest feature is not supported if the properties in the S3 Dynamic account are parameters.

Prerequisite: The provided account must have Write access to the specified S3 bucket in order to write the file successfully.

Using Expressions:

This property can be an expression when the Expression enabler is turned on.

For example, if the File property is "s3:///mybucket/out_" + Date.now() + ".csv" then the evaluated filename is s3:///mybucket/out_2013-11-13T00:22:31.880Z.csv.
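
The expression above is written in the SnapLogic expression language. As a rough illustration only, the following Python sketch builds the same kind of timestamped file name. The helper name and the exact timestamp formatting are assumptions, chosen to reproduce the sample value shown above.

```python
from datetime import datetime, timezone


def timestamped_file_name(prefix: str, suffix: str, now: datetime) -> str:
    """Concatenate prefix, an ISO-8601 UTC timestamp, and suffix."""
    utc = now.astimezone(timezone.utc)
    # Millisecond precision with a trailing "Z", as in the sample file name.
    stamp = utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{utc.microsecond // 1000:03d}Z"
    return prefix + stamp + suffix


example = timestamped_file_name(
    "s3:///mybucket/out_",
    ".csv",
    datetime(2013, 11, 13, 0, 22, 31, 880000, tzinfo=timezone.utc),
)
print(example)  # s3:///mybucket/out_2013-11-13T00:22:31.880Z.csv
```

Because the timestamp changes on every evaluation, each Pipeline run writes to a new file instead of overwriting the previous one.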

Syntax:

s3:///<S3_bucket_name>@s3.<region_name>.amazonaws.com/<path>

For region names and their details, see AWS Regions and Endpoints.

Note: Region Name

Region name is optional only if the region is us-east-1. In all other cases the region name must be specified based on the syntax above. For example, mybucket@eu-west-1.
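
A minimal Python sketch of the syntax above, assuming that the short form s3:///<bucket>/<path> is what you use when the region is omitted for us-east-1. The function name and parameters are hypothetical and are not part of the Snap.

```python
from typing import Optional


def s3_file_url(bucket: str, path: str, region: Optional[str] = None) -> str:
    """Build a file name of the form s3:///<bucket>@s3.<region>.amazonaws.com/<path>."""
    if not bucket:
        raise ValueError("bucket name must not be empty")
    path = path.lstrip("/")
    if region is None or region == "us-east-1":
        # Assumption: the region part can be left out for us-east-1.
        return f"s3:///{bucket}/{path}"
    return f"s3:///{bucket}@s3.{region}.amazonaws.com/{path}"


print(s3_file_url("mybucket", "reports/out.csv", "eu-west-1"))
print(s3_file_url("mybucket", "reports/out.csv"))
```

Keeping the construction in one place makes it harder to forget the region for buckets outside us-east-1.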


Suggest fully-qualified file names


Default Value: Deselected

Checkbox

Select this checkbox to include the region and authority of the S3 bucket in the associated paths that appear in the Suggestion list.

Info

We recommend that you use fully-qualified suggestions from the Suggestions list if you are using an instance in AWS GovCloud.


File action*


Default Value: OVERWRITE
Example: IGNORE


Dropdown list

Specify the action to perform if the file already exists. The available options are:

  • OVERWRITE: The Snap attempts to write the file without checking whether the file exists, for better performance, and the "fileAction" field is "overwritten" in the output view data.
  • IGNORE: The Snap does not overwrite the file and does nothing but write the status and file name to its output view.
  • ERROR: The error is displayed in the Pipeline Run Log. If an error view is defined, the error is written there as well.
Note

Even though it is listed, Append is not supported for the S3 file protocol.


Write empty file

Default Value: Deselected

Checkbox

Select this checkbox to write an empty file when the incoming binary document has empty data. If there is no incoming document at the input view of the Snap, no file is written regardless of the value of the property.


Write header file

Default Value: Deselected

Checkbox

Select this checkbox to write a header file whose name is generated by appending ".header" to the value of the File name property. The header file provides metadata about the file, such as content disposition or content type. The same header information is also included in the output view data under the key "original", as shown in the output example in the Snap Views section above.

The binary data stream in the input view may contain header information about the binary data in the form of a document with key-value pair data.

Info

If the header has no keys other than Content-Type or Content-Encoding, the .header file is not written.
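
The header-file rule above can be sketched in Python as follows. This is an illustration under stated assumptions, not the Snap's code: the function names are hypothetical, and the sketch assumes that header keys are compared case-insensitively.

```python
def header_file_name(file_name: str) -> str:
    """Name of the header file: the File name value with ".header" appended."""
    return file_name + ".header"


def should_write_header_file(header: dict) -> bool:
    """Return True when the header holds keys other than Content-Type or Content-Encoding."""
    ignored = {"content-type", "content-encoding"}
    # Assumption: key names are matched without regard to case.
    return any(key.lower() not in ignored for key in header)


print(header_file_name("s3:///mybucket/qatest/user_manual.json"))
print(should_write_header_file({"content-type": "application/json"}))
print(should_write_header_file({"content-disposition": "attachment"}))
```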


Validate after write

Default Value: Deselected

Checkbox

Select this checkbox to allow the Snap to check if the file exists after writing the file.


Number of retries


Minimum value: 0

Default Value: 0
Example: 3

Integer/Expression

Specify the maximum number of retry attempts the Snap must make when it fails to write. If the value is larger than 0, the Snap first stores the input data in a temporary local file before writing to the target file.

Info

Ensure that the local drive has free disk space at least as large as the expected target file size.

If the value is larger than 0, the Snap first downloads the target file into a temporary local file. If any error occurs during the download, the Snap waits for the time specified in the Retry interval and attempts to download the file again from the beginning. When the download is successful, the Snap streams the data from the temporary file to the downstream Pipeline. All temporary local files are deleted when they are no longer needed.


Retry interval (seconds)

Minimum value: 1

Default Value: 1
Example: 3

Integer/Expression

Specify the minimum number of seconds for which the Snap must wait before attempting recovery from a network failure.
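
The interplay of Number of retries and Retry interval can be sketched as a small Python loop. This is a simplified model, not the Snap's implementation: write_target is a hypothetical callable standing in for the S3 write, and the sketch assumes that a failed write surfaces as an OSError.

```python
import tempfile
import time
from typing import Callable


def write_with_retries(
    data: bytes,
    write_target: Callable[[bytes], None],
    number_of_retries: int = 0,
    retry_interval_seconds: float = 1.0,
) -> int:
    """Write data through write_target and return the number of attempts used."""
    if number_of_retries < 0:
        raise ValueError("number_of_retries must be 0 or larger")
    if number_of_retries == 0:
        # No retries requested: write directly, without a local copy.
        write_target(data)
        return 1
    # With retries enabled, keep the input in a temporary local file first,
    # so that every attempt can start again from the beginning.
    with tempfile.TemporaryFile() as buffer:
        buffer.write(data)
        attempts = 0
        while True:
            attempts += 1
            buffer.seek(0)
            try:
                write_target(buffer.read())
                return attempts
            except OSError:
                if attempts > number_of_retries:
                    raise
                time.sleep(retry_interval_seconds)
```

The temporary file is deleted automatically when the with block exits, which mirrors the cleanup of temporary local files described above.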

Buffer size (MB)

Default Value: 10 MB

Integer

Specify the data (in MB) to load into the S3 bucket at a time.

Note
  • The minimum data size you can upload is 6 MB.
  • The maximum data size you can upload is limited to 10000 times the buffer size.
  • To upload S3 files that are more than 100 GB in size, we recommend that you set the Buffer size to 100 MB or more. Also, set the Maximum upload threads to 10; otherwise, many parallel HTTP connections may be opened, leading to the following error: "AmazonClientException: Unable to execute HTTP request: Timeout waiting for connection from pool".

Refer to Upload Part for more information on uploading to S3.
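
The size limit in the note above is simple arithmetic, sketched here in Python. The function names are hypothetical, and the sketch assumes 1 GB = 1024 MB.

```python
import math

# From the note above: the maximum upload size is 10000 times the buffer size.
MAX_PARTS = 10000


def max_upload_size_mb(buffer_size_mb: int) -> int:
    """Largest file size (in MB) that fits with the given buffer size."""
    return MAX_PARTS * buffer_size_mb


def smallest_sufficient_buffer_mb(file_size_mb: float) -> int:
    """Smallest whole-MB buffer size whose limit covers the given file size."""
    return math.ceil(file_size_mb / MAX_PARTS)


print(max_upload_size_mb(10))                    # 100000
print(smallest_sufficient_buffer_mb(100 * 1024)) # 11
```

With the default buffer size of 10 MB, the limit is 100000 MB, which is slightly less than 100 GB. This is consistent with the recommendation to raise the buffer size for files larger than 100 GB.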

    Maximum upload threads

    Default Value: 10
    Example: 3

    Integer

    Specify the maximum number of threads to be used for the concurrent multipart upload. The minimum value allowed is 1.

    Note

    To upload S3 files that are more than 100 GB in size, we recommend that you set the Maximum upload threads to 10. Also, set the Buffer size to 100 MB or more; otherwise, many parallel HTTP connections may be opened, leading to the following error: "AmazonClientException: Unable to execute HTTP request: Timeout waiting for connection from pool".

    AWS Canned ACL

    Default Value: None
    Example: PublicRead

    Dropdown list

    The predefined ACL grants (from AWS) to use when writing a file to S3. Choose a Canned ACL from the available options:

    • None
    • Private
    • PublicRead
    • PublicReadWrite
    • AuthenticatedRead
    • LogDeliveryWrite
    • BucketOwnerRead
    • BucketOwnerFullControl
    • AwsExecRead

    Watch the video for more information about AWS Canned ACL.

    Learn more about AWS Canned ACLs.

    Access Control List

    Use this field set to define the Access Control List (ACL) for the specified S3 file. This field set contains the following fields:

    • Grantee
    • Read
    • View permissions
    • Full control

    Note

    The account that you use for the Snap must have the required permissions to set the ACLs for the S3 object it is writing to.

    Grantee

    String/Expression/Suggestion

    Select a grantee from the suggested list or enter a valid email address or a canonical ID associated with an AWS account. You can obtain the canonical AWS ID from the Security Credentials page of the AWS console.

    Info

    A grantee can be an AWS account or one of the predefined Amazon S3 groups. The following note is an excerpt from AWS document:

    "An email grantee is a grantee identified by their email address and authenticated by an Amazon system. email grants are internally converted to the canonical user representation when creating the ACL. If the grantee changes their email address, it will not affect existing Amazon S3 permissions. Adding a grantee by email address only works if exactly one Amazon account corresponds to the specified email address. If multiple Amazon accounts are associated with the email address, an AmbiguousGrantByEmail error message is returned. This happens rarely, but usually occurs if a user created an Amazon account in the past, forgotten the password, and created another Amazon account using the same email address. If this occurs, the user should contact Amazon customer service to have the accounts merged. Alternatively, grant user access specifying the canonical user representation."

    Examples:

    • "Everyone"
    • "Authenticated user"
    • An email address used to create an AWS account
    • A canonical AWS ID, for example, "1700891f3927e316dc4c9e18c789b32131880f48d3e03ac110aaf695b212573e"

    Warning

    The "Everyone" option allows anyone in the world (authenticated or anonymous) to access the file. Even when the bucket is protected with permissions, if ACL > Grantee in the Snap is set to "Everyone", any user (authenticated or anonymous) can access the file. We strongly recommend that you do not use this option because it is unsafe.

    Note: Email Address Grantee Access

    You can use email addresses to specify a grantee only in the following regions:

    • US East (N. Virginia)
    • US West (N. California)
    • US West (Oregon)
    • Asia Pacific (Singapore)
    • Asia Pacific (Sydney)
    • Asia Pacific (Tokyo)
    • EU (Ireland)
    • South America (São Paulo)

    For more information, see Specifying Grantee.

    Read

    Checkbox

    Grants permission to read the file.

    View permissions

    Checkbox

    Grants permission to read the ACL.

    Full control

    Checkbox

    Grants full control to the file.

    User-defined object metadata

    Use this field set to define key-value pairs for user-defined object metadata of an S3 object. For more information about user-defined object metadata, see Using Metadata.

    This field set contains the following fields:

    • Key
    • Value
    Key

    String/Expression

    Specify the key name of the object metadata.

    Info

    The key names of the object metadata are case-insensitive.  AWS S3 converts them to lower-case and prefixes them with “x-amz-meta-” when displayed in the AWS S3 web console.

    When the S3 File Reader Snap reads an S3 file, this metadata is shown in the header of the output binary data, and the key names are displayed in lower-case without the prefix “x-amz-meta-”.
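
The key handling described above can be modelled with a short Python sketch. It only mirrors the naming rules stated in this section (lower-case keys, the "x-amz-meta-" prefix in the console, no prefix in the S3 File Reader header). The function names are hypothetical.

```python
META_PREFIX = "x-amz-meta-"


def as_console_keys(metadata: dict) -> dict:
    """Key names as displayed in the AWS S3 web console: lower-case with the prefix."""
    return {META_PREFIX + key.lower(): value for key, value in metadata.items()}


def as_reader_header_keys(stored: dict) -> dict:
    """Key names as shown in the S3 File Reader header: lower-case, prefix removed."""
    result = {}
    for key, value in stored.items():
        key = key.lower()
        if key.startswith(META_PREFIX):
            key = key[len(META_PREFIX):]
        result[key] = value
    return result


stored = as_console_keys({"Department": "finance"})
print(stored)                         # {'x-amz-meta-department': 'finance'}
print(as_reader_header_keys(stored))  # {'department': 'finance'}
```

Because the keys are case-insensitive, two keys that differ only in case collapse into one, so avoid defining both.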


    Value

    String/Expression

    Specify the value for the key entered above.

    Object tags

    Use this field set to define key-value pairs for object tags of an S3 object. Object tags enable you to categorize existing and new objects using key-value combinations. For details about the object tags, see Object Tagging. This field set contains the following fields:

    • Key
    • Value

    Key

    String/Expression

    Specify the key name of the object tag.

    Info

    The key names of object tags are case-sensitive. When the S3 File Reader Snap reads an S3 file, these object tags are displayed in the header of the output binary data. If a key name of an object tag is the same as another in the header, it is prefixed with “tag_”.

    See example Providing User-defined Object Metadata and Object Tags using the S3 File Writer Snap below for more information.

    Value

    String/Expression

    Specify the value for the key entered above.
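
The "tag_" prefix rule for object tags can be sketched in Python as follows. This is an assumed model of how tags might be merged into a header, based only on the description in the Object tags section. The function name is hypothetical.

```python
def merge_tags_into_header(header: dict, tags: dict) -> dict:
    """Add object tags to a header, prefixing a tag key with "tag_" on a name clash."""
    merged = dict(header)
    for key, value in tags.items():
        # Tag keys are case-sensitive, so they are compared exactly.
        target = "tag_" + key if key in merged else key
        merged[target] = value
    return merged


header = {"content-type": "application/json", "project": "alpha"}
tags = {"project": "beta", "Stage": "test"}
print(merge_tags_into_header(header, tags))
```

In this sample, the tag "project" clashes with an existing header key and becomes "tag_project", while "Stage" is added unchanged.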

    Snap Execution

    Dropdown list

    Optional Configuration

    IAM Roles for Amazon EC2

    To access S3 files from Groundplex nodes hosted in the EC2 environment without specifying the Access-key ID and Secret key in the AWS S3 account configured for the Snap, enable the IAM_CREDENTIAL_FOR_S3 feature. When you enable this feature, the IAM credential stored in the EC2 metadata is used to access S3 buckets.

    To enable the IAM_CREDENTIAL_FOR_S3 feature:

    1. Open Manager.
    2. Open the Snaplexes tab of the project that contains the EC2-based Groundplex.
    3. Click the Groundplex to open its Properties.
    4. Open the Node Properties tab.
    5. Add a new row in the Global properties section.
    6. Specify jcc.jvm_options as the Key and -DIAM_CREDENTIAL_FOR_S3=TRUE as the Value.
    7. Restart the JCC (node) to apply the changes.

    For more information about IAM Roles, refer to IAM Roles for Amazon EC2.

    Examples


    Providing User-defined Object Metadata and Object Tags using the S3 File Writer Snap

    This example is a basic use case for the S3 File Writer Snap. It also demonstrates how you can configure the Snap with custom object metadata and object tags to classify the data. 

    In the sample Pipeline, the S3 File Writer Snap is configured as follows with the User-defined object metadata and Object tags:

    The following is a preview of the output data from the S3 File Writer Snap:

    When the S3 File Reader Snap reads this data, it picks up the user-defined object metadata and object tags we defined, as shown below:

    Typical Configuration

    Key configuration of the Snap lies in how the values are passed. Values can be passed to the Snap:

    Without Expressions

    For example, the File name is passed directly to the Snap in the image below.

    With Expressions

    The File name is passed using Pipeline Parameters:

    Download the Pipeline.

    Downloads



    See Also

    Binary Snap Pack