Skip to content

S3 Part I

S3 Buckets in AWSπŸ”—

Amazon S3 (Simple Storage Service) is an object storage service that provides industry-leading scalability, data availability, security, and performance. Unlike a computer’s hard drive (block storage) or a shared network drive (file storage), S3 stores data as objects within buckets.

It is essentially "storage for the internet," designed to make web-scale computing easier for developers.


1. Core ConceptsπŸ”—

To understand S3, you only need to know three primary components:

  • Buckets: Fundamental containers for data. Bucket names must be globally unique across all of AWS.
  • Objects: The actual files you upload (up to 5 TB each). An object consists of the Data (the file itself) and Metadata (name-value pairs that describe the object).
  • Keys: The unique identifier for an object within a bucket. For example, in the path my-bucket/photos/vacation.jpg, the key is photos/vacation.jpg.

2. Key Features (The "Why")πŸ”—

AWS S3 is famous for its "11 Nines" of durability, but it offers several other critical technical advantages:

  • Durability: Designed for 99.999999999% durability. AWS achieves this by automatically replicating your data across at least three physical Availability Zones (AZs) within a region.
  • Consistency: S3 provides strong read-after-write consistency for all applications. If you upload a new object or overwrite an existing one, any subsequent read request will immediately return the latest version.
  • Security: Data is private by default. You control access using IAM Policies (user-level), Bucket Policies (bucket-level), and S3 Block Public Access (an account-level "kill switch" to prevent accidental data exposure).
  • Scalability: You can store virtually unlimited amounts of data and serve millions of requests per second without ever "provisioning" hardware.

3. Storage Classes & Cost OptimizationπŸ”—

Not all data is accessed the same way. S3 offers several "tiers" to help you save money based on how often you need your files:

Storage Class Best Use Case Retrieval Time
S3 Standard Frequently accessed data (active apps, gaming). Milliseconds
S3 Intelligent-Tiering Data with unknown or changing access patterns. Milliseconds (auto-saves $)
S3 Standard-IA Long-lived data, accessed less than once a month. Milliseconds
S3 Express One Zone High-performance AI/ML training & real-time analytics. Single-digit ms
S3 Glacier Archival data (backups). Minutes to Hours
S3 Glacier Deep Archive Long-term digital preservation (compliance). 12 - 48 Hours

Pro Tip: Use S3 Lifecycle Policies to automate this. For example, you can set a rule to move objects to Glacier after 90 days and delete them after 7 years.


4. Modern S3 (The 2026 Landscape)πŸ”—

As of 2026, S3 has evolved from simple "storage" into a foundational engine for AI and Data Lakes:

  • S3 Tables: Fully managed Apache Iceberg tables that allow you to treat S3 data like a high-performance database for analytics.
  • S3 Vectors: Native support for storing and querying vector embeddings, enabling high-speed semantic search for Generative AI applications directly within S3.
  • Event-Driven Workflows: S3 can trigger AWS Lambda functions automatically whenever a file is uploaded (e.g., to resize an image or scan a file for viruses).

How does AWS manage S3?πŸ”—

alt text

AWS S3 Durability and Scalability?πŸ”—

AWS ensures the durability and scalability of S3 through a combination of physical redundancy, automated self-healing software, and a massive distributed architecture that treats "scale as an advantage."


1. How S3 Ensures Durability (The "11 Nines")πŸ”—

Durability is the probability that your data remains intact and uncorrupted over time. S3 is designed for \(99.999999999\%\) durability.

Multi-AZ RedundancyπŸ”—

By default (in the Standard class), S3 stores your data redundantly across at least three Availability Zones (AZs) within an AWS Region. Each AZ is a physically distinct group of data centers with independent power and cooling.

  • The Impact: Even if an entire data center is destroyed by a disaster, your data remains available and intact in the other two AZs.

Integrity Checking & Self-HealingπŸ”—

AWS doesn't just "store" your file; it actively monitors it.

  • Checksum Verification: S3 uses checksums to detect data corruption during transit and while at rest.
  • Background Auditors: Continuous "auditor" microservices scan the entire S3 fleet. If they detect a bit has "flipped" or a disk is failing, S3 automatically triggers a repair system to reconstruct the data from the other redundant copies before you even notice.

Erasure CodingπŸ”—

For archival tiers like Glacier, S3 uses Erasure Coding (similar to RAID but at a massive cloud scale). Files are split into data and parity fragments and spread across many different devices. This allows the system to reconstruct the original file even if multiple hardware components fail simultaneously.

To understand how S3 maintains its "11 Nines" of durability, it helps to look at Checksums as the "detective" and Erasure Coding as the "healer."


1. Checksum: The Integrity DetectiveπŸ”—

A checksum is a mathematical fingerprint of a file. When you upload a file to S3, the system calculates a value (like an MD5 or SHA-256 hash) based on the file's binary content.

The Example:πŸ”—

  1. Upload: You upload a photo. S3 calculates its checksum: a1b2c3d4.
  2. Storage: S3 stores the photo and the checksum.
  3. The "Bit Rot" Event: Five years later, a cosmic ray hits a hard drive and flips a single 0 to a 1 in your photo's data.
  4. Detection: S3’s background "scrubber" service reads the file and recalculates the checksum. It gets z9y8x7w6.
  5. Action: Since a1b2c3d4 \(\neq\) z9y8x7w6, S3 flags that specific copy as corrupt and immediately replaces it with a healthy copy from another Availability Zone.

2. Erasure Coding: The Digital HealerπŸ”—

Erasure coding is a method of data protection that breaks data into fragments, expands them with redundant data (parity), and stores them across different locations. It is far more efficient than simple "mirroring" (copying the whole file multiple times).

The Example (\(n+m\)):πŸ”—

Imagine S3 uses a simplified 4+2 Erasure Coding scheme.

  • Data Fragments (\(n=4\)): Your 100MB file is split into four 25MB chunks (\(D1, D2, D3, D4\)).
  • Parity Fragments (\(m=2\)): S3 uses advanced math (Reed-Solomon algorithms) to create two extra 25MB "parity" chunks (\(P1, P2\)).
  • Distribution: These 6 total chunks are spread across 6 different server racks or drives.

The "Failure" Scenario:πŸ”—

Suppose a massive hardware failure destroys the drives containing D2 and P1.

  • The Math: Because we have 4 out of the 6 original pieces (\(D1, D3, D4,\) and \(P2\)), the Reed-Solomon formula can mathematically reconstruct the missing \(D2\) piece perfectly.
  • Result: You experience zero data loss even though 33% of the storage hardware failed.

Comparison at a GlanceπŸ”—

Feature Checksum Erasure Coding
Primary Goal To detect if data has changed. To reconstruct data if it is lost.
Analogy A tamper-evident seal on a jar. A puzzle where you only need 70% of the pieces to see the whole image.
Math Type Hashing (e.g., CRC32C, SHA-256). Polynomial equations (Reed-Solomon).
---

2. How S3 Ensures ScalabilityπŸ”—

Scalability in S3 is "elastic," meaning it grows and shrinks automatically without you needing to provision servers or manage partitions.

Request ParallelismπŸ”—

S3 scale is built on the principle that "Scale is to your advantage." Because S3 is used by millions of customers, the aggregate load is so massive and decorrelated that a sudden spike from one user is easily absorbed by the sheer size of the shared infrastructure.

  • Performance Scaling: You can achieve at least 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second per prefix in a bucket. There are no limits to the number of prefixes in a bucket.

Massive Throughput with S3 Express One ZoneπŸ”—

Introduced for high-performance needs (like AI/ML), S3 Express One Zone uses a different architecture called "Directory Buckets."

  • It supports up to 2 million transactions per second (TPS).
  • It uses a specialized high-speed request path (partially rewritten in Rust for memory safety and speed) to provide single-digit millisecond latency.

Distributed IndexingπŸ”—

S3 uses a distributed hash table to manage object keys. When you request an object, S3 doesn't "search" for it; it uses the key to immediately identify which of the thousands of storage nodes holds that specific data fragment.


Summary ComparisonπŸ”—

Feature Mechanism for Durability Mechanism for Scalability
Physical 3+ Availability Zones (standard) Hundreds of exabytes of hardware fleet
Software Background Auditors & Self-Healing Distributed Hash Tables & Rust-optimized paths
Logic Erasure Coding & Checksums Prefix-based partitioning & Auto-scaling