Overview

This guide provides a comprehensive overview of Amazon Simple Storage Service (S3), a scalable object storage service. You will learn about the different S3 storage classes and how to create and manage S3 buckets using Terraform. This service is ideal for storing everything from application assets and backups to large datasets for analysis.

Before you begin, you will need:

  • An active University of Oregon AWS account.
  • Terraform installed on your local machine.
  • The AWS CLI installed and configured with your UO account credentials.
  • Familiarity with basic AWS concepts like IAM roles and policies.

S3 Storage Classes

Amazon S3 provides a variety of storage classes designed for different use cases. Each class offers different levels of durability, availability, and performance at different price points.

S3 Standard

This is the default storage class. It’s a general-purpose class for frequently accessed data, such as website assets, mobile applications, and active datasets. It offers high durability, availability, and performance. Use this for data that you need to access quickly and regularly.

S3 Intelligent-Tiering

This class is designed to optimize storage costs by automatically moving data to the most cost-effective access tier, without performance impact or operational overhead. It works by monitoring access patterns and moving objects that have not been accessed for a period of time to a lower-cost tier. This is a good choice if you have data with unknown or changing access patterns.
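If you manage buckets with Terraform, the optional archive tiers of Intelligent-Tiering can be enabled per bucket with the aws_s3_bucket_intelligent_tiering_configuration resource. The following is a minimal sketch; the bucket name and day thresholds are illustrative assumptions, and objects are only auto-tiered once they are stored in the INTELLIGENT_TIERING storage class (for example, by setting it at upload time or with a lifecycle rule).

# tiering.tf - Sketch: enable the optional archive tiers for Intelligent-Tiering
resource "aws_s3_bucket" "tiering_example" {
  bucket = "uo-myapp-tiering-example-bucket" # Illustrative name; must be globally unique
}

resource "aws_s3_bucket_intelligent_tiering_configuration" "entire_bucket" {
  bucket = aws_s3_bucket.tiering_example.id
  name   = "EntireBucket"

  # Objects not accessed for 90 days move to the Archive Access tier
  tiering {
    access_tier = "ARCHIVE_ACCESS"
    days        = 90
  }

  # Objects not accessed for 180 days move to the Deep Archive Access tier
  tiering {
    access_tier = "DEEP_ARCHIVE_ACCESS"
    days        = 180
  }
}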

S3 Standard-Infrequent Access (S3 Standard-IA)

This class is for data that is accessed less frequently, but requires rapid access when needed. It has a lower per-GB storage price and a per-GB retrieval fee, making it a good choice for long-term storage, backups, and disaster recovery files. Consider this for data you need to keep for months or years, but don’t need to access on a daily basis.
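A hedged sketch of how this might look in Terraform is shown below; the bucket name and prefix are illustrative assumptions. Note that objects must be at least 30 days old before they can transition to Standard-IA, so the rule uses a 30-day threshold.

# standard-ia.tf - Sketch: move backups to S3 Standard-IA after 30 days
resource "aws_s3_bucket" "backup_example" {
  bucket = "uo-myapp-backup-example-bucket" # Illustrative name; must be globally unique
}

resource "aws_s3_bucket_lifecycle_configuration" "backup_rule" {
  bucket = aws_s3_bucket.backup_example.id

  rule {
    id     = "to-standard-ia-after-30-days"
    status = "Enabled"

    filter {
      prefix = "backups/" # Illustrative prefix; only these objects transition
    }

    transition {
      days          = 30
      storage_class = "STANDARD_IA"
    }
  }
}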

S3 Glacier Instant Retrieval

This class provides the lowest-cost storage for data that is rarely accessed but requires millisecond retrieval. It’s a good fit for data archives that you need to access immediately, such as medical images or news media assets. Use this when you need to keep data for a long time, but might need to get it back in a hurry.

S3 Glacier Flexible Retrieval

This class is for archival data that does not require immediate access. It offers flexible retrieval options, from a few minutes to several hours. It is a low-cost option for data that can be retrieved asynchronously. This is suitable for archives where you can wait a bit to get your data back.

S3 Glacier Deep Archive

This is the lowest-cost storage class and is designed for long-term retention of data that is accessed very rarely. Retrieval time is within 12 hours. This is ideal for regulatory compliance archives or digital media preservation. Use this for data you rarely, if ever, expect to access again but must keep for legal or compliance reasons.

Terraform Examples

These examples demonstrate how to create S3 buckets for different use cases.
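All of the examples assume a working AWS provider configuration. The following is a minimal sketch; the region and provider version are assumptions, so adjust them to match your UO account and environment.

# providers.tf - Minimal provider setup assumed by the examples below
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # Assumed provider version
    }
  }
}

provider "aws" {
  region = "us-west-2" # Assumed region; change to the region your account uses
}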

Example 1: Private S3 Bucket

The following Terraform code creates a private S3 bucket with a public access block, which is a security best practice.

# main.tf - Example for creating a private S3 bucket
resource "aws_s3_bucket" "document_storage" {
  bucket = "uo-myapp-example-bucket" # Must be a globally unique name
  tags = {
    Name        = "UO Myapp Example Bucket"
    Environment = "Production"
  }
}

resource "aws_s3_bucket_public_access_block" "main" {
  bucket                  = aws_s3_bucket.document_storage.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
  • resource "aws_s3_bucket" "document_storage": This block declares a new S3 bucket resource. We’ve given it the logical name document_storage for reference within our Terraform code.
  • bucket = "uo-myapp-example-bucket": This sets the globally unique name for the S3 bucket. You will need to change this to a unique name for your own bucket.
  • tags: These are key-value pairs that you can use to organize and manage your AWS resources.
  • resource "aws_s3_bucket_public_access_block" "main": This is a critical security resource that ensures your bucket remains private. It blocks all public access to the bucket and its objects, which is the recommended setting for most use cases.
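To confirm the bucket is usable, you can upload a small test object with Terraform. The following is a minimal sketch using the aws_s3_object resource; the object key and content are illustrative assumptions.

# Sketch: upload a test object to the private bucket from Example 1
resource "aws_s3_object" "test_object" {
  bucket  = aws_s3_bucket.document_storage.id
  key     = "test/hello.txt"        # Illustrative object key
  content = "Hello from Terraform"  # Inline content instead of a local file
}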

Example 2: S3 Bucket with Glacier Instant Retrieval

This example creates an S3 bucket and configures a lifecycle rule to transition objects to the S3 Glacier Instant Retrieval storage class after 30 days. This is a cost-effective way to store long-term data that still requires immediate access.

# main.tf - Example for creating an S3 bucket with a lifecycle rule
resource "aws_s3_bucket" "archival_storage" {
  bucket = "uo-myapp-archival-bucket" # Must be a globally unique name
  tags = {
    Name        = "UO Myapp Archival Bucket"
    Environment = "Production"
  }
}

resource "aws_s3_bucket_lifecycle_configuration" "archival_rule" {
  bucket = aws_s3_bucket.archival_storage.id

  rule {
    id     = "archive-after-30-days"
    status = "Enabled"

    filter {
      prefix = "documents/"
    }

    transition {
      days          = 30
      storage_class = "GLACIER_IR"
    }
  }
}
  • resource "aws_s3_bucket" "archival_storage": This creates the S3 bucket for archival storage.
  • resource "aws_s3_bucket_lifecycle_configuration" "archival_rule": This resource defines the lifecycle rules for the bucket.
  • rule: This block defines a specific lifecycle rule. In this case, it’s named archive-after-30-days.
  • filter: This specifies that the rule should only apply to objects with the documents/ prefix.
  • transition: This block defines the transition action. It specifies that objects should be moved to the GLACIER_IR storage class after 30 days.

Example 3: S3 Bucket with Glacier Deep Archive

This example is for long-term retention of data that is accessed very rarely. It transitions objects to the S3 Glacier Deep Archive storage class immediately upon upload.

# main.tf - Example for creating an S3 bucket with a lifecycle rule for deep archive
resource "aws_s3_bucket" "deep_archive_storage" {
  bucket = "uo-myapp-deep-archive-bucket" # Must be a globally unique name
  tags = {
    Name        = "UO Myapp Deep Archive Bucket"
    Environment = "Production"
  }
}

resource "aws_s3_bucket_lifecycle_configuration" "deep_archive_rule" {
  bucket = aws_s3_bucket.deep_archive_storage.id

  rule {
    id     = "deep-archive-immediately"
    status = "Enabled"

    filter {} # Empty filter applies the rule to all objects in the bucket

    transition {
      days          = 0
      storage_class = "DEEP_ARCHIVE"
    }
  }
}
  • resource "aws_s3_bucket" "deep_archive_storage": This creates the S3 bucket for deep archival storage.
  • resource "aws_s3_bucket_lifecycle_configuration" "deep_archive_rule": This resource defines the lifecycle rules for the bucket.
  • rule: This block defines a specific lifecycle rule. In this case, it’s named deep-archive-immediately.
  • filter {}: This empty filter means the rule will apply to all objects in the bucket.
  • transition: This block defines the transition action. It specifies that objects should be moved to the DEEP_ARCHIVE storage class immediately (0 days).

Now that you have a private S3 bucket, you can start using it to store your application data. Here are some next steps you might consider: