Overview

This guide provides a comprehensive overview of Amazon Simple Storage Service (S3), a scalable object storage service. You will learn about the different S3 storage classes and how to create and manage S3 buckets using Terraform. This service is ideal for storing everything from application assets and backups to large datasets for analysis.

Before you begin, you will need:

  • An active University of Oregon AWS account.
  • Terraform installed on your local machine.
  • The AWS CLI installed and configured with your UO account credentials.
  • Familiarity with basic AWS concepts like IAM roles and policies.

S3 Storage Classes

Amazon S3 provides a variety of storage classes designed for different use cases. Each class offers different levels of durability, availability, and performance at different price points.

S3 Standard

This is the default storage class. It’s a general-purpose class for frequently accessed data, such as website assets, mobile applications, and active datasets. It offers high durability, availability, and performance. Use this for data that you need to access quickly and regularly.

S3 Intelligent-Tiering

This class is designed to optimize storage costs by automatically moving data to the most cost-effective access tier, without performance impact or operational overhead. It works by monitoring access patterns and moving objects that have not been accessed for a period of time to a lower-cost tier. This is a good choice if you have data with unknown or changing access patterns.
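If you manage buckets with Terraform, the optional archive tiers of Intelligent-Tiering can be enabled per bucket with the aws_s3_bucket_intelligent_tiering_configuration resource. The following is a minimal sketch; the bucket name and day thresholds are illustrative assumptions, and objects are only auto-tiered once they are stored in the INTELLIGENT_TIERING storage class (for example, by setting it at upload time or with a lifecycle rule).

# tiering.tf - Sketch: enable the optional archive tiers for Intelligent-Tiering
resource "aws_s3_bucket" "tiering_example" {
  bucket = "uo-myapp-tiering-example-bucket" # Illustrative name; must be globally unique
}

resource "aws_s3_bucket_intelligent_tiering_configuration" "entire_bucket" {
  bucket = aws_s3_bucket.tiering_example.id
  name   = "EntireBucket"

  # Objects not accessed for 90 days move to the Archive Access tier
  tiering {
    access_tier = "ARCHIVE_ACCESS"
    days        = 90
  }

  # Objects not accessed for 180 days move to the Deep Archive Access tier
  tiering {
    access_tier = "DEEP_ARCHIVE_ACCESS"
    days        = 180
  }
}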

S3 Standard-Infrequent Access (S3 Standard-IA)

This class is for data that is accessed less frequently, but requires rapid access when needed. It has a lower per-GB storage price and a per-GB retrieval fee, making it a good choice for long-term storage, backups, and disaster recovery files. Consider this for data you need to keep for months or years, but don’t need to access on a daily basis.
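A hedged sketch of how this might look in Terraform is shown below; the bucket name and prefix are illustrative assumptions. Note that objects must be at least 30 days old before they can transition to Standard-IA, so the rule uses a 30-day threshold.

# standard-ia.tf - Sketch: move backups to S3 Standard-IA after 30 days
resource "aws_s3_bucket" "backup_example" {
  bucket = "uo-myapp-backup-example-bucket" # Illustrative name; must be globally unique
}

resource "aws_s3_bucket_lifecycle_configuration" "backup_rule" {
  bucket = aws_s3_bucket.backup_example.id

  rule {
    id     = "to-standard-ia-after-30-days"
    status = "Enabled"

    filter {
      prefix = "backups/" # Illustrative prefix; only these objects transition
    }

    transition {
      days          = 30
      storage_class = "STANDARD_IA"
    }
  }
}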

S3 Glacier Instant Retrieval

This class provides the lowest-cost storage for data that is rarely accessed but requires millisecond retrieval. It’s a good fit for data archives that you need to access immediately, such as medical images or news media assets. Use this when you need to keep data for a long time, but might need to get it back in a hurry.

S3 Glacier Flexible Retrieval

This class is for archival data that does not require immediate access. It offers flexible retrieval options, from a few minutes to several hours. It is a low-cost option for data that can be retrieved asynchronously. This is suitable for archives where you can wait a bit to get your data back.

S3 Glacier Deep Archive

This is the lowest-cost storage class and is designed for long-term retention of data that is accessed very rarely. Retrieval time is within 12 hours. This is ideal for regulatory compliance archives or digital media preservation. Use this for data you rarely, if ever, expect to access again but must keep for legal or compliance reasons.

Terraform Examples

These examples demonstrate how to create S3 buckets for different use cases.
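All of the examples assume a working AWS provider configuration. The following is a minimal sketch; the region and provider version are assumptions, so adjust them to match your UO account and environment.

# providers.tf - Minimal provider setup assumed by the examples below
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # Assumed provider version
    }
  }
}

provider "aws" {
  region = "us-west-2" # Assumed region; change to the region your account uses
}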

Example 1: Private S3 Bucket

The following Terraform code creates a private S3 bucket with a public access block, which is a security best practice.

# main.tf - Example for creating a private S3 bucket
resource "aws_s3_bucket" "document_storage" {
  bucket = "uo-myapp-example-bucket" # Must be a globally unique name
  tags = {
    Name        = "UO Myapp Example Bucket"
    Environment = "Production"
  }
}

resource "aws_s3_bucket_public_access_block" "main" {
  bucket                  = aws_s3_bucket.document_storage.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
  • resource "aws_s3_bucket" "document_storage": This block declares a new S3 bucket resource. We’ve given it the logical name document_storage for reference within our Terraform code.
  • bucket = "uo-myapp-example-bucket": This sets the globally unique name for the S3 bucket. You will need to change this to a unique name for your own bucket.
  • tags: These are key-value pairs that you can use to organize and manage your AWS resources.
  • resource "aws_s3_bucket_public_access_block" "main": This is a critical security resource that ensures your bucket remains private. It blocks all public access to the bucket and its objects, which is the recommended setting for most use cases.
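To confirm the bucket is usable, you can upload a small test object with Terraform. The following is a minimal sketch using the aws_s3_object resource; the object key and content are illustrative assumptions.

# Sketch: upload a test object to the private bucket from Example 1
resource "aws_s3_object" "test_object" {
  bucket  = aws_s3_bucket.document_storage.id
  key     = "test/hello.txt"        # Illustrative object key
  content = "Hello from Terraform"  # Inline content instead of a local file
}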

Example 2: S3 Bucket with Glacier Instant Retrieval

This example creates an S3 bucket and configures a lifecycle rule to transition objects to the S3 Glacier Instant Retrieval storage class after 30 days. This is a cost-effective way to store long-term data that still requires immediate access.

# main.tf - Example for creating an S3 bucket with a lifecycle rule
resource "aws_s3_bucket" "archival_storage" {
  bucket = "uo-myapp-archival-bucket" # Must be a globally unique name
  tags = {
    Name        = "UO Myapp Archival Bucket"
    Environment = "Production"
  }
}

resource "aws_s3_bucket_lifecycle_configuration" "archival_rule" {
  bucket = aws_s3_bucket.archival_storage.id

  rule {
    id     = "archive-after-30-days"
    status = "Enabled"

    filter {
      prefix = "documents/"
    }

    transition {
      days          = 30
      storage_class = "GLACIER_IR"
    }
  }
}
  • resource "aws_s3_bucket" "archival_storage": This creates the S3 bucket for archival storage.
  • resource "aws_s3_bucket_lifecycle_configuration" "archival_rule": This resource defines the lifecycle rules for the bucket.
  • rule: This block defines a specific lifecycle rule. In this case, it’s named archive-after-30-days.
  • filter: This specifies that the rule should only apply to objects with the documents/ prefix.
  • transition: This block defines the transition action. It specifies that objects should be moved to the GLACIER_IR storage class after 30 days.

Example 3: S3 Bucket with Glacier Deep Archive

This example is for long-term retention of data that is accessed very rarely. It transitions objects to the S3 Glacier Deep Archive storage class immediately upon upload.

# main.tf - Example for creating an S3 bucket with a lifecycle rule for deep archive
resource "aws_s3_bucket" "deep_archive_storage" {
  bucket = "uo-myapp-deep-archive-bucket" # Must be a globally unique name
  tags = {
    Name        = "UO Myapp Deep Archive Bucket"
    Environment = "Production"
  }
}

resource "aws_s3_bucket_lifecycle_configuration" "deep_archive_rule" {
  bucket = aws_s3_bucket.deep_archive_storage.id

  rule {
    id     = "deep-archive-immediately"
    status = "Enabled"

    filter {} # Empty filter applies the rule to all objects in the bucket

    transition {
      days          = 0
      storage_class = "DEEP_ARCHIVE"
    }
  }
}
  • resource "aws_s3_bucket" "deep_archive_storage": This creates the S3 bucket for deep archival storage.
  • resource "aws_s3_bucket_lifecycle_configuration" "deep_archive_rule": This resource defines the lifecycle rules for the bucket.
  • rule: This block defines a specific lifecycle rule. In this case, it’s named deep-archive-immediately.
  • filter {}: This empty filter means the rule will apply to all objects in the bucket.
  • transition: This block defines the transition action. It specifies that objects should be moved to the DEEP_ARCHIVE storage class immediately (0 days).

Now that you have a private S3 bucket, you can start using it to store your application data. Here are some next steps you might consider: