
Key Components

There are three key components involved with the Redshift Bulk Snaps.

  • EC2 Instance

  • Redshift Cluster

  • S3

These components can reside in the same AWS account or in different accounts. When you execute the Pipeline, these components perform the following operations:

  • EC2: Archives input data and writes the data into the specified S3 bucket/folder.

    ABC.csv.gz -> s3://swat-3032/datalake/raw

  • Redshift Cluster: Copies the data from S3 to a Redshift temporary table using the COPY command.

    COPY "public"."swat3032_update_temp_table_XYZ" ("id", "name", "price")
    FROM 's3://swat-3032/datalake/raw/Redshift_load_temp/ABC.csv.gz'
    CREDENTIALS '...'

  • S3: Loads the data from the temporary table into the target table. For the Upsert Snap, this is an UPDATE followed by an INSERT operation.

For more information about the operations performed by each component, you can inspect the queries in the Redshift console.

Configuring Redshift Cross Account with IAM Role

The following flow chart illustrates the cross-account roles that you should configure for each key component.

Redshift Cross Account IAM Role Account Setup

  1. If all your components are in the same AWS account, you must use a Redshift IAM Account, which means you do not need a Redshift Cross Account IAM Role account. (A minimal trust policy for the role associated with the cluster in this same-account case is sketched after this list.)

  2. If your components are in different AWS accounts, you must use a Redshift Cross Account IAM Role account for your setup to be successful.
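
For reference, any role that you associate with the Redshift cluster (for example, to let it read from S3 in the same account) needs a trust policy that allows the Redshift service to assume it. The following is a minimal sketch of such a trust policy; the role itself and its permissions are assumptions about your own AWS setup:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "redshift.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}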

Here are the typical combinations for your Redshift Cross Account IAM Role account configuration when the components are not all in the same AWS account:

  • Configuration 1: S3, EC2, and Redshift are each in a different AWS account.

  • Configuration 2: Redshift is in one AWS account; EC2 and S3 are in another account.

  • Configuration 3: Redshift and S3 are in one AWS account; EC2 is in another account.

The legend indicates the values that you can use in your account configuration.

Configuration 1: S3 in its own AWS Account

When the components are in three different AWS accounts, the configuration centers on accessing S3: you need both Read (read from S3) and Write (write to S3) permissions for the bucket. EC2 must be able to write to S3, and Redshift must be able to read from it.
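
The cross-account access itself is granted through trust policies: a role in the S3 account must trust the account (or a specific role) that needs to assume it. A minimal sketch of such a trust policy, using a hypothetical account ID 111111111111 for the account that assumes the role:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::111111111111:root"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}

In this configuration, the role that Redshift assumes would carry the Read policy shown below, and the role that EC2 assumes would carry the Write policy.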

Configuration 2: Redshift in its own AWS Account

When Redshift is in one account and EC2 and S3 are in a different account, you need to define a role with an S3 write policy. Define another role in the same account that can assume that role, and assign this second role to the EC2 instance.
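
The role assigned to the EC2 instance establishes its half of the assume relationship with a permissions policy that allows sts:AssumeRole on the write role. A minimal sketch, with a hypothetical account ID and role name; the instance role also needs the usual trust policy for the ec2.amazonaws.com service principal so that it can be attached to the instance:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "sts:AssumeRole",
            "Resource": "arn:aws:iam::222222222222:role/s3-write-role"
        }
    ]
}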

Configuration 3: EC2 in its own AWS Account

When Redshift and S3 are in the same account and EC2 is in another account, you can define a role (role1) with S3 read permission, and then define another role (role2) in the same account to assume that role (role1). Assign role2 to the Redshift Cluster, and role1 to the S3 read field.

Alternatively, leave the S3 read field blank and ensure that the role assigned to the Redshift Cluster has S3 read permission.
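
The assume relationship between the two roles is defined on role1's side with a trust policy that names role2 as the principal. A minimal sketch, with a hypothetical account ID and role names:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::333333333333:role/role2"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}

role2 itself needs a trust policy for the redshift.amazonaws.com service principal so that it can be associated with the Redshift Cluster.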

Read and Write Policies

Read Policy

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket",
                "s3:GetBucketLocation"
            ],
            "Resource": "arn:aws:s3:::bucket"
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject"
            ],
            "Resource": "arn:aws:s3:::bucket/*"
        }
    ]
}
Write Policy

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:DeleteObject"
            ],
            "Resource": "arn:aws:s3:::bucket/*"
        }
    ]
}