December 11, 2023     11 min read

Let's Try - AWS S3 Access Grants

Introduction

AWS S3 Access Grants provide a modern, scalable approach to managing granular access to Amazon S3 data. The feature helps organisations that want to adhere to the principle of least privilege but struggle with the complexity of managing access across large-scale data environments. Let's delve into why S3 Access Grants are a game-changer and how they surpass traditional methods of managing access to S3 data.

Modern data access management is hard

Let's imagine we are building a new e-commerce platform called ShopFast. We have a team of developers in charge of assigning access to a couple of S3 buckets that contain the company's data.

The organization has several departments, each with different access requirements to the data. The following table outlines the access requirements for each department:

Department         | Accessible S3 Resources                  | Access Type
-------------------|------------------------------------------|------------
Marketing          | s3://shopfast-data/products/             | Read
Marketing          | s3://shopfast-data/feedback/             | Read
Sales              | s3://shopfast-data/transactions/         | Read/Write
Sales              | s3://shopfast-data/users/                | Read/Write
Customer Support   | s3://shopfast-data/users/                | Read
Customer Support   | s3://shopfast-data/feedback/             | Read/Write
Product Management | s3://shopfast-data/products/             | Read/Write
Product Management | s3://shopfast-internal/leads/            | Read
HR                 | s3://shopfast-internal/employee-records/ | Read/Write
HR                 | s3://shopfast-internal/benefits/         | Read/Write

Some departments only need to read data, so we expect a mix of read-only and read-write access to certain locations in buckets for different teams.

I've chosen this example to highlight the complexity most organizations eventually find themselves facing over time. As the number of users and data grows, the number of teams who want to get at that data also grows. This complexity is compounded by the fact that the data is often spread across multiple buckets and prefixes.

Access Pattern Plan

So now that we understand the problem, let's look at some of the ways you could solve it.

Should I care about S3 Access Grants?

It's a fair question. If you're not dealing with multifaceted access, or perhaps you just have a very good idea of what your access requirements are, then you might not need to care about S3 Access Grants. However, if you are a growing organization, inevitably you will be asked to expand access to your data, and you will need to find a way to do it.

What I'd like to cover in this section are the foundational data access problems and the current solutions offered by AWS. Then we can look at how S3 Access Grants solve those problems.

Scalability

When I use the term scalability, I'm referring to the ability to manage access to large-scale data environments. This includes the number of IAM principals, the number of S3 buckets, and the number of S3 prefixes.

In our example, we have a small but complex environment, but imagine if we had 100 departments, each with its own access requirements. The number of IAM principals could easily grow into the hundreds, and the number of S3 buckets and prefixes into the thousands.

The direct comparison here is between S3 Access Grants and bucket policies/IAM permission policies. The following outlines the limitations of each method:

  • IAM Permission Policies

    • 6,144-character size limit per IAM managed policy (roughly 6 KB; inline policy limits differ)
  • S3 Bucket Policies

    • 20 KB S3 bucket policies size limit
  • S3 Access Grants

    • 1 S3 Access Grants instance per region
    • 1,000 S3 Access Grants locations
    • 100,000 grants per S3 Access Grants instance

Below is an example of how an S3 bucket policy might be used to grant each department's IAM role access to specific prefixes in a bucket.

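{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ShopFast-Marketing-ReadAccess",
            "Effect": "Allow",
            "Principal": {
                "AWS": [
                    "arn:aws:iam::123456789012:role/ShopFast-Marketing"
                ]
            },
            "Action": [
                "s3:GetObject"
            ],
            "Resource": [
                "arn:aws:s3:::shopfast-data/products/*",
                "arn:aws:s3:::shopfast-data/feedback/*"
            ]
        },
        {
            "Sid": "ShopFast-Marketing-ListAccess",
            "Effect": "Allow",
            "Principal": { "AWS": "arn:aws:iam::123456789012:role/ShopFast-Marketing" },
            "Action": "s3:ListBucket",
            "Resource": "arn:aws:s3:::shopfast-data",
            "Condition": {
                "StringLike": { "s3:prefix": ["products/*", "feedback/*"] }
            }
        },
        {
            "Sid": "ShopFast-ProductManagement-ReadWriteAccess",
            "Effect": "Allow",
            "Principal": {
                "AWS": [
                    "arn:aws:iam::123456789012:role/ShopFast-ProductManagement"
                ]
            },
            "Action": [
                "s3:GetObject",
                "s3:PutObject"
            ],
            "Resource": [
                "arn:aws:s3:::shopfast-data/products/*"
            ]
        },
        {
            "Sid": "ShopFast-ProductManagement-ListAccess",
            "Effect": "Allow",
            "Principal": { "AWS": "arn:aws:iam::123456789012:role/ShopFast-ProductManagement" },
            "Action": "s3:ListBucket",
            "Resource": "arn:aws:s3:::shopfast-data",
            "Condition": {
                "StringLike": { "s3:prefix": ["products/*"] }
            }
        }
    ]
}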

Note: The second half of the Product Management requirements (read access to leads/) is missing because it lives in a different bucket (shopfast-internal). We would need a second policy on that bucket to grant access to that prefix.

Policy size is a genuine concern for organizations with large-scale data environments. Once you hit the limit there is no way to work around it other than moving the permissions from bucket policies to IAM policies, and those have their own size limits that you need to be aware of.
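To get a feel for how quickly that limit approaches, here is a rough Python sketch that keeps adding simplified read-only statements (s3:GetObject on two prefixes) for made-up departments until the serialized policy would exceed 20 KB. The department names, the helper functions, and the choice to measure the policy as compact JSON are all assumptions for illustration; AWS may count size slightly differently.

```python
import json

BUCKET_POLICY_LIMIT_BYTES = 20 * 1024  # the 20 KB bucket policy limit quoted above


def department_statement(account_id, bucket, department, prefixes):
    """Build one simplified read-only statement for a hypothetical department role."""
    return {
        "Sid": f"ShopFast-{department}-ReadAccess",
        "Effect": "Allow",
        "Principal": {"AWS": [f"arn:aws:iam::{account_id}:role/ShopFast-{department}"]},
        "Action": ["s3:GetObject"],
        "Resource": [f"arn:aws:s3:::{bucket}/{prefix}/*" for prefix in prefixes],
    }


def policy_size_bytes(statements):
    """Approximate policy size: UTF-8 bytes of the JSON without optional whitespace."""
    policy = {"Version": "2012-10-17", "Statement": statements}
    return len(json.dumps(policy, separators=(",", ":")).encode("utf-8"))


def departments_that_fit(account_id, bucket, prefixes, limit=BUCKET_POLICY_LIMIT_BYTES):
    """Count how many department statements fit before the limit is exceeded."""
    statements = []
    while True:
        name = f"Dept{len(statements) + 1}"  # made-up department names
        candidate = statements + [department_statement(account_id, bucket, name, prefixes)]
        if policy_size_bytes(candidate) > limit:
            return len(statements)
        statements = candidate


if __name__ == "__main__":
    fits = departments_that_fit("123456789012", "shopfast-data", ["products", "feedback"])
    print(f"About {fits} simple two-prefix statements fit in one bucket policy")
```

Real statements with list permissions, conditions, or more prefixes are larger, so the practical ceiling is lower than whatever this prints.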

Enter S3 Access Grants, which let you create locations that map to S3 buckets and then assign grants on top of those locations. The AWS documentation explains the resource model well, but I've included a diagram below to help visualize it.

S3 Access Grants Overview

You can create up to 100,000 grants, where a grant is a combination of a principal, a location, and a permission. All of this configuration lives above the S3 resources themselves, so viewing what is configured does not require you to query hundreds of bucket policies.
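One way to picture the principal/location/permission combination is as a small in-memory lookup. The Python sketch below is a toy model only, not how AWS evaluates grants: the Grant class, the find_grant helper, the simple prefix matching, and the rule that READWRITE also covers READ and WRITE requests are all assumptions made for the example.

```python
from dataclasses import dataclass

# Which requested permissions each granted level satisfies (assumed for this toy model).
SATISFIES = {
    "READ": {"READ"},
    "WRITE": {"WRITE"},
    "READWRITE": {"READ", "WRITE", "READWRITE"},
}


@dataclass(frozen=True)
class Grant:
    principal: str   # e.g. an IAM role ARN
    scope: str       # location plus sub-prefix, e.g. "s3://shopfast-data/users*"
    permission: str  # "READ", "WRITE" or "READWRITE"


def find_grant(grants, principal, target, permission):
    """Return the first grant that covers the request, or None if nothing matches."""
    for grant in grants:
        prefix = grant.scope.rstrip("*")
        if (
            grant.principal == principal
            and target.rstrip("*").startswith(prefix)
            and permission in SATISFIES[grant.permission]
        ):
            return grant
    return None


if __name__ == "__main__":
    support = "arn:aws:iam::123456789012:role/ShopFast-CustomerSupport"
    grants = [
        Grant(support, "s3://shopfast-data/users*", "READ"),
        Grant(support, "s3://shopfast-data/feedback*", "READWRITE"),
    ]
    print(find_grant(grants, support, "s3://shopfast-data/users*", "READ"))
    print(find_grant(grants, support, "s3://shopfast-data/users*", "READWRITE"))
```

The point of the model is that every grant lives in one flat list keyed by who, where, and what, rather than being spread across per-bucket policy documents.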

Granular Control

Following on from the scalability discussion, the typical recommendation if you hit the policy size limit is to move to S3 Access Points. These are resources that sit in front of a bucket and allow you to grant access to specific prefixes.

However, S3 Access Points have their own limitations, compared here with S3 Access Grants:

  • S3 Access Points

    • 10,000 access points per Region
    • Access point can only be associated with 1 bucket
  • S3 Access Grants

    • 1 S3 Access Grants instance per region
    • 1 Access grant location to 1 bucket
    • 100,000 grants per S3 Access Grants instance

S3 Access Points are a middle ground between bucket policies and S3 Access Grants. The way you manage the policy documents attached to the access points is still very much the same as bucket policies.

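{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ShopFast-Marketing-ReadAccess",
            "Effect": "Allow",
            "Principal": {
                "AWS": [
                    "arn:aws:iam::123456789012:role/ShopFast-Marketing"
                ]
            },
            "Action": [
                "s3:GetObject"
            ],
            "Resource": [
                "arn:aws:s3:us-west-2:123456789012:accesspoint/shopfast-data-access-point-1/object/products/*",
                "arn:aws:s3:us-west-2:123456789012:accesspoint/shopfast-data-access-point-1/object/feedback/*"
            ]
        },
        {
            "Sid": "ShopFast-Marketing-ListAccess",
            "Effect": "Allow",
            "Principal": { "AWS": "arn:aws:iam::123456789012:role/ShopFast-Marketing" },
            "Action": "s3:ListBucket",
            "Resource": "arn:aws:s3:us-west-2:123456789012:accesspoint/shopfast-data-access-point-1",
            "Condition": {
                "StringLike": { "s3:prefix": ["products/*", "feedback/*"] }
            }
        },
        {
            "Sid": "ShopFast-ProductManagement-ReadWriteAccess",
            "Effect": "Allow",
            "Principal": {
                "AWS": [
                    "arn:aws:iam::123456789012:role/ShopFast-ProductManagement"
                ]
            },
            "Action": [
                "s3:GetObject",
                "s3:PutObject"
            ],
            "Resource": [
                "arn:aws:s3:us-west-2:123456789012:accesspoint/shopfast-data-access-point-1/object/products/*"
            ]
        },
        {
            "Sid": "ShopFast-ProductManagement-ListAccess",
            "Effect": "Allow",
            "Principal": { "AWS": "arn:aws:iam::123456789012:role/ShopFast-ProductManagement" },
            "Action": "s3:ListBucket",
            "Resource": "arn:aws:s3:us-west-2:123456789012:accesspoint/shopfast-data-access-point-1",
            "Condition": {
                "StringLike": { "s3:prefix": ["products/*"] }
            }
        }
    ]
}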

The S3 bucket then needs a policy that delegates access control to the access point:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": { 
                "AWS": [
                    "arn:aws:iam::123456789012:role/ShopFast-Marketing",
                    "arn:aws:iam::123456789012:role/ShopFast-ProductManagement"
                ]
            },
            "Action": [
                "s3:GetBucketLocation",
                "s3:ListBucket",
                "s3:GetObject",
                "s3:PutObject"
            ],
            "Resource": [ "arn:aws:s3:::shopfast-data", "arn:aws:s3:::shopfast-data/*"],
            "Condition": {
                "StringEquals": { "s3:DataAccessPointAccount": "123456789012" }
            }
        }
    ]
}

The rationale is that when you hit the policy size limit, you can simply create more access points and split the policy document across them.

The drawback to this approach, however, is that all access to the bucket must go through the access points. Your applications and users need to know the names or ARNs of the access points and use them to reach the data.

S3 Access Grants, on the other hand, have a very flexible mechanism for using grants: all the application (or person) needs in order to request access is the S3 path and the access type they want (READ, WRITE, or READWRITE).

# The target is quoted so the shell does not try to expand the trailing *
aws s3control get-data-access \
    --account-id 123456789012 \
    --target "s3://shopfast-data/users*" \
    --permission READ \
    --privilege Default
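The same request can be made from application code. Below is a minimal Python sketch that assumes a boto3 release recent enough to include the s3control get_data_access operation. The helper names credentials_for and download_with_grant are made up for this example, and the account ID, bucket, and key are the placeholder values used elsewhere in this post.

```python
def credentials_for(s3control_client, account_id, target, permission="READ"):
    """Ask S3 Access Grants for temporary credentials scoped to `target`.

    Returns keyword arguments that can be passed to boto3.client(...).
    """
    response = s3control_client.get_data_access(
        AccountId=account_id,
        Target=target,
        Permission=permission,
        Privilege="Default",
    )
    creds = response["Credentials"]
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }


def download_with_grant(account_id, bucket, key, destination):
    """Hypothetical usage: fetch one object using grant-vended credentials."""
    import boto3  # third-party dependency: pip install boto3

    s3control = boto3.client("s3control")
    # Request access to the exact object path; a matching grant must cover it.
    client_kwargs = credentials_for(s3control, account_id, f"s3://{bucket}/{key}")
    s3 = boto3.client("s3", **client_kwargs)
    s3.download_file(bucket, key, destination)
```

Calling download_with_grant("123456789012", "shopfast-data", "users/user_list", "user_list_downloaded") would only succeed for a caller that holds a matching READ grant. Note that nothing in the code names a bucket policy or access point.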

Compare this to the S3 Access Point approach, where the name shopfast-data-access-point-1 needs to be known up front. If you run out of access point policy space and create a second access point called shopfast-data-access-point-2, any apps whose permissions live in the new policy document must be updated to use the new access point, while the existing apps continue to use the old one.

aws s3api get-object \
    --bucket arn:aws:s3:us-west-2:123456789012:accesspoint/shopfast-data-access-point-1 \
    --key users/user_list \
    user_list_downloaded

You can see how this could quickly become a nightmare to manage if you didn't have a registry that maps permissions to access points.
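To make that bookkeeping concrete, here is a hedged Python sketch of such a registry. It packs policy statements, in order, into numbered access point policies and records which access point ends up carrying each statement. The build_registry function, the 20 KB per-policy limit constant, and the use of compact JSON to measure size are assumptions for illustration; the naming scheme follows the shopfast-data-access-point-N names used above.

```python
import json

ACCESS_POINT_POLICY_LIMIT_BYTES = 20 * 1024  # assumed size limit per access point policy


def serialized_size(statements):
    """Approximate policy size: UTF-8 bytes of the JSON without optional whitespace."""
    policy = {"Version": "2012-10-17", "Statement": statements}
    return len(json.dumps(policy, separators=(",", ":")).encode("utf-8"))


def build_registry(statements, limit_bytes=ACCESS_POINT_POLICY_LIMIT_BYTES,
                   name_prefix="shopfast-data-access-point"):
    """Pack statements into numbered access point policies, preserving order.

    Returns (policies, registry): policies maps access point name -> statements,
    registry maps each statement's Sid -> the access point that carries it.
    """
    policies = {}
    registry = {}
    current = None
    for statement in statements:
        if serialized_size([statement]) > limit_bytes:
            raise ValueError(f"statement {statement['Sid']} alone exceeds the limit")
        # Start a new access point when the next statement would not fit.
        if current is None or serialized_size(policies[current] + [statement]) > limit_bytes:
            current = f"{name_prefix}-{len(policies) + 1}"
            policies[current] = []
        policies[current].append(statement)
        registry[statement["Sid"]] = current
    return policies, registry
```

Every application then has to consult something like registry["ShopFast-Marketing-ReadAccess"] to learn which access point to call, and that lookup changes whenever statements are reshuffled.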

Federated Access Control

A standout feature of S3 Access Grants is the ability to integrate with IAM and federated identity providers like Azure AD. This marks a significant shift from traditional AWS approaches, placing the focus on organizational identity and access management. By leveraging patterns that likely already exist in most organizations, such as AD group membership, teams can manage S3 access more efficiently. This mirrors the approach other AWS services, such as Amazon Managed Grafana, have started to take.

This new model contrasts with the previous reliance on SSO identities and Customer Managed Policies. It represents a customer-focused strategy, transferring the administrative responsibilities from AWS-specific roles to mainstream IT teams. Watch this space for more insights on federated IdP integration with S3 Access Grants in future posts.

Setting Up S3 Access Grants with Terraform

Hopefully, at this point, you're starting to come around to the idea of S3 Access Grants. But maybe you are curious how you would go about setting them up in your organization. I've gone ahead and implemented the example from the beginning of this post for you to try out.

For this guide I wanted to focus on how your organization could enable S3 Access Grants using Terraform. The same principles apply to other infrastructure-as-code tools, with the exception of CloudFormation, which does not yet support S3 Access Grants as of 11 December 2023.

I've created the following repository to help you get started and learn how S3 Access Grants work practically: https://github.com/t04glovern/terraform-aws-s3-access-grants

Clone the repository and follow the instructions in the README to get started.

git clone https://github.com/t04glovern/terraform-aws-s3-access-grants
cd terraform-aws-s3-access-grants

# Deploy the infrastructure
terraform init
terraform apply

Once the infrastructure is deployed, you can test out the access grants by assuming the IAM roles created for each department.

For example, let's test that the Customer Support department cannot WRITE to the users prefix in the shopfast-data bucket but can READ it.

export AWS_DEFAULT_REGION=ap-southeast-2
export AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
export AWS_ROLE_TO_ASSUME=arn:aws:iam::$AWS_ACCOUNT_ID:role/ShopFast-CustomerSupport
export SHOPFAST_DATA_BUCKET=$(terraform output -raw shopfast_data_bucket)

# Sets the AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN environment variables
CREDENTIALS_JSON=$(aws sts assume-role --role-arn "$AWS_ROLE_TO_ASSUME" --role-session-name ShopFastRole)
export AWS_ACCESS_KEY_ID=$(echo "$CREDENTIALS_JSON" | jq -r '.Credentials.AccessKeyId')
export AWS_SECRET_ACCESS_KEY=$(echo "$CREDENTIALS_JSON" | jq -r '.Credentials.SecretAccessKey')
export AWS_SESSION_TOKEN=$(echo "$CREDENTIALS_JSON" | jq -r '.Credentials.SessionToken')

# Request READWRITE access; the target is quoted so the shell does not expand the trailing *
aws s3control get-data-access \
    --account-id "$AWS_ACCOUNT_ID" \
    --target "s3://$SHOPFAST_DATA_BUCKET/users*" \
    --permission READWRITE \
    --privilege Default

This should return an error indicating that the user does not have the correct permissions to access the data.

# An error occurred (AccessDenied) when calling the GetDataAccess operation: You do not have READWRITE permissions to the requested S3 Prefix: s3://terraform-20231210044558274900000002/users*

If you try the same command but with the READ permission, you should get a successful response.

CREDENTIALS_JSON=$(aws s3control get-data-access \
    --account-id "$AWS_ACCOUNT_ID" \
    --target "s3://$SHOPFAST_DATA_BUCKET/users*" \
    --permission READ \
    --privilege Default)

echo "$CREDENTIALS_JSON"
# {
#     "Credentials": {
#         "AccessKeyId": "ASIAZZZZZZZZZZZZZZZZ",
#         "SecretAccessKey": "RA+YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY",
#         "SessionToken": "IQoJb3JpZ2luXXXXXXXXXXXXXXXXXXXXXXXXXX",
#         "Expiration": "2023-12-06T16:16:19+00:00"
#     },
#     "MatchedGrantTarget": "s3://terraform-20231210044558274900000002/users*"
# }

These credentials can then be used to access the data in the bucket.

export AWS_ACCESS_KEY_ID=$(echo "$CREDENTIALS_JSON" | jq -r '.Credentials.AccessKeyId')
export AWS_SECRET_ACCESS_KEY=$(echo "$CREDENTIALS_JSON" | jq -r '.Credentials.SecretAccessKey')
export AWS_SESSION_TOKEN=$(echo "$CREDENTIALS_JSON" | jq -r '.Credentials.SessionToken')

aws sts get-caller-identity
# {
#     "UserId": "ASIAZZZZZZZZZZZZZZZZ:access-grants-ade86c3c-4781-4c65-8beb-4639bb72f5e6",
#     "Account": "012345678912",
#     "Arn": "arn:aws:sts::012345678912:assumed-role/terraform-20231210044558274900000002/access-grants-ade86c3c-4781-4c65-8beb-4639bb72f5e6"
# }

Download the data from the bucket to confirm!

aws s3api get-object --bucket $SHOPFAST_DATA_BUCKET --key users/user_list user_list_downloaded
# {
#     "AcceptRanges": "bytes",
#     "LastModified": "2023-12-10T05:00:37+00:00",
#     "ContentLength": 32,
#     "ETag": "\"4aa99f977fb1e5ba4d846e408f6a90ba\"",
#     "ContentType": "application/octet-stream",
#     "ServerSideEncryption": "AES256",
#     "Metadata": {}
# }

Feel free to try out the other departments and their access patterns to get a feel for how S3 Access Grants work.

When you are done, you can destroy the infrastructure.

terraform destroy

Conclusion

I think S3 Access Grants are a game-changer for organizations that are struggling to manage access to large-scale data environments. Even if you don't plan on using the federated identity piece of the puzzle, the ability to define S3 access policies above the S3 resources themselves opens a lot of doors for centralized governance.

I hope this post has helped you understand the benefits of S3 Access Grants and how they compare to some of the methods you might currently be using. I'm keen to hear your thoughts and whether you've given them a shot at your organization.

If you have any issues or feedback please feel free to reach out to me on Twitter @nathangloverAUS or leave a comment below!

DevOpStar by Nathan Glover | 2024