Chapter 3: Storage and Retrieval

What Actually Happens When You Hit “Save”
If Chapter 2 was about how we model data, Chapter 3 is about the uncomfortable truth:
Your beautiful data model eventually turns into… bytes on disk.
And how those bytes are stored determines:
Performance
Scalability
Cost
And whether your system melts under load or not
The Simplest Database You Can Imagine
Let’s start dumb. Like really dumb.
You store data in a file:
set user123 = "Frank"
set user124 = "Rhoie"
Append-only. No indexes. Just vibes.
Problem:
Reads are slow → you scan everything
Updates create duplicates
Lookups are painful
Congrats, you’ve built a log-based storage system.
And surprisingly…
👉 This idea is still used in real systems today.
Enter Indexes: Because Scanning Is Not a Strategy
An index is just a data structure that helps you find things faster.
Think:
Without index → read the whole book
With index → flip to the right page
But here’s the catch:
Every index is a trade-off.
Faster reads
Slower writes
More storage
Pick your poison.
Hash Indexes (Key-Value on Steroids)
The simplest useful index:
Key → position in file
Stored in memory (usually)
How it works:
Append new data to a file
Update hash map:
key → file offset
Pros:
Blazing fast lookups
Simple
Cons:
All keys must fit in memory
Doesn’t support range queries
👉 Great for:
Caches
Session stores
Simple KV systems
The Real Problem: Disk Is Slow (and Weird)
Memory is fast. Disk is slow.
But more importantly:
Disk prefers sequential reads/writes, not random ones.
So modern storage engines optimize for:
Appending data (sequential)
Avoiding random access
This leads us to…
SSTables & LSM Trees (Write Fast, Clean Later)
This is where things get interesting.
Core idea:
Writes go to memory first
Periodically flushed to disk as sorted files
These files are called:
👉 SSTables (Sorted String Tables)
How LSM Trees Work
Write goes to in-memory structure (memtable)
When full → flushed to disk (SSTable)
Background process merges files (compaction)
Why this is genius:
✔ Writes are fast (sequential)
✔ Disk is used efficiently
✔ Great for write-heavy systems
But there’s always a catch:
❌ Reads may check multiple files
❌ Compaction is expensive
❌ Higher read amplification
Real-world systems:
Cassandra
HBase
RocksDB
👉 If your system writes a lot (logs, events, analytics), LSM trees are your friend.
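Those three steps can be sketched in miniature. In this toy (`TinyLSM` is an invented name; real engines flush to disk files and compact in the background, which this skips), a plain list stands in for an on-disk SSTable:

```python
import bisect

class TinyLSM:
    """Toy LSM tree: writes hit an in-memory memtable; when it fills, it is
    flushed as an immutable sorted run (a list standing in for an SSTable).
    Reads check the memtable first, then the runs newest-first."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}
        self.memtable_limit = memtable_limit
        self.sstables = []               # sorted (key, value) runs, oldest first

    def set(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Sequential write: dump the whole memtable as one sorted run.
        self.sstables.append(sorted(self.memtable.items()))
        self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        # Read amplification: may probe several runs, newest first.
        for run in reversed(self.sstables):
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None
```

Notice where the cost went: `set` never touches old data, but `get` may probe every run — that’s the read amplification compaction exists to reduce.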
B-Trees (The Classic Workhorse)
If LSM Trees are the new cool kid, B-Trees are the veteran.
Core idea:
Keep data sorted and balanced on disk
Minimize disk reads
Instead of scanning files, B-Trees:
Jump directly to the right location
Why B-Trees dominate:
✔ Efficient reads
✔ Efficient range queries
✔ Predictable performance
Downsides:
❌ Writes are slower (random writes)
❌ More complex disk operations
Real-world systems:
PostgreSQL
MySQL (InnoDB)
Most traditional databases
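A full B-Tree is too much code for a blog post, but the lookup logic at its heart is just binary search over sorted pages. A drastic simplification (one `SortedPage` standing in for a leaf level; real B-Trees stack these into a tree so any key is a few page reads away):

```python
import bisect

class SortedPage:
    """One sorted array of (key, value) pairs -- the essence of a B-tree
    page. Point lookups binary-search; range queries search then scan."""

    def __init__(self, items):
        self.items = sorted(items)
        self.keys = [k for k, _ in self.items]

    def get(self, key):
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            return self.items[i][1]
        return None

    def range(self, lo, hi):
        # Sorted order makes ranges cheap: find the start, scan to the end.
        i = bisect.bisect_left(self.keys, lo)
        j = bisect.bisect_right(self.keys, hi)
        return self.items[i:j]
```

That `range` method is why B-Trees win at range queries: once you land on the first key, the rest are physically next to each other.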
B-Trees vs LSM Trees (The Cage Match)
Let’s not overcomplicate it:
| Feature | B-Tree | LSM Tree |
| --- | --- | --- |
| Writes | Slower | Faster |
| Reads | Faster | Slower |
| Range Queries | Excellent | Good |
| Storage Efficiency | Moderate | High |
| Use Case | OLTP | Write-heavy / analytics |
The blunt truth:
Need fast reads & consistency → B-Tree
Need high write throughput → LSM
There is no winner. Only context.
Secondary Indexes (Because One Index Is Never Enough)
Primary index = how data is stored
Secondary index = extra lookup paths
Example:
Primary key: user_id
Secondary index: email
Trade-offs:
✔ Faster queries
❌ More storage
❌ Slower writes (must update all indexes)
Again… nothing is free.
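You can see the write tax in a toy version (invented `UserTable`, plain dicts standing in for on-disk indexes):

```python
class UserTable:
    """Toy table with a primary index (user_id -> row) and a secondary
    index (email -> user_id). Every insert must update both."""

    def __init__(self):
        self.rows = {}       # primary: user_id -> row
        self.by_email = {}   # secondary: email -> user_id

    def insert(self, user_id, name, email):
        self.rows[user_id] = {"user_id": user_id, "name": name, "email": email}
        self.by_email[email] = user_id   # extra write per secondary index

    def get(self, user_id):
        return self.rows.get(user_id)

    def get_by_email(self, email):
        user_id = self.by_email.get(email)
        return self.rows.get(user_id) if user_id is not None else None
```

One insert, two index updates. Add five secondary indexes and every write does six updates.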
Full-Text Search (Databases Aren’t Enough)
Searching text ≠ simple lookup.
You need:
Tokenization
Ranking
Fuzzy matching
That’s why systems like:
Elasticsearch
Solr
exist.
👉 These are specialized indexes, not general databases.
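The core data structure behind all of them is the inverted index: term → documents containing it. A bare-bones sketch (crude tokenizer, AND-only search — real engines add ranking, stemming, and fuzzy matching on top):

```python
import re
from collections import defaultdict

def tokenize(text):
    # Crude tokenizer: lowercase, split on anything that isn't a letter.
    return [t for t in re.split(r"[^a-z]+", text.lower()) if t]

class InvertedIndex:
    """Term -> set of document ids. The skeleton of full-text search."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        for term in tokenize(text):
            self.postings[term].add(doc_id)

    def search(self, query):
        # AND semantics: only docs containing every query term.
        terms = tokenize(query)
        if not terms:
            return set()
        result = set(self.postings[terms[0]])
        for term in terms[1:]:
            result &= self.postings[term]
        return result
```

It’s “just” another index — but one keyed by words inside the value, which is exactly what a B-Tree on a text column can’t give you.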
OLTP vs OLAP (Stop Mixing Them)
This is where many systems fail.
OLTP (Online Transaction Processing)
Small reads/writes
Real-time
Example: user login, checkout
OLAP (Analytics)
Large scans
Aggregations
Example: dashboards, reports
The mistake:
Using one database for both.
👉 That’s how you get:
Slow dashboards
Angry users
Late-night debugging sessions
Column Stores (Analytics Done Right)
Instead of storing rows:
user_id | name | age
Store columns:
user_id: [1,2,3]
name: ["Frank","Rhoie","..."]
age: [30,28,...]
Why this works:
✔ Reads only needed columns
✔ Better compression
✔ Faster aggregations
Used by:
BigQuery
Redshift
Snowflake
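The difference is easy to feel in code. Same data, two layouts (toy in-memory version; real column stores add compression and vectorized execution on top):

```python
# Row-oriented layout: every row carries every column.
rows = [
    {"user_id": 1, "name": "Frank", "age": 30},
    {"user_id": 2, "name": "Rhoie", "age": 28},
    {"user_id": 3, "name": "Ana",   "age": 35},
]

# Column-oriented layout: one array per column.
columns = {
    "user_id": [1, 2, 3],
    "name":    ["Frank", "Rhoie", "Ana"],
    "age":     [30, 28, 35],
}

# AVG(age) over rows drags every field of every row through memory...
row_avg = sum(r["age"] for r in rows) / len(rows)

# ...but over columns it reads exactly one contiguous array.
col_avg = sum(columns["age"]) / len(columns["age"])

assert row_avg == col_avg == 31.0
```

Same answer, wildly different I/O: the columnar query never touches `name` or `user_id` at all — and a column of similar values compresses far better than interleaved rows.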
The Big Picture
Here’s what Chapter 3 really teaches:
Storage engines are about trade-offs between reads, writes, and space.
Everything boils down to:
Sequential vs random access
Memory vs disk
Write speed vs read speed
Practical Takeaways (Engineer Mode ON)
1. You’re always choosing trade-offs
No storage engine is “best”.
2. Understand your workload first
Before picking a database, ask:
Read-heavy or write-heavy?
Point queries or scans?
Real-time or analytics?
3. Indexes are not free
Every index:
Slows writes
Uses memory
Adds complexity
4. Separate OLTP and OLAP early
Seriously. Do it.
5. Don’t ignore the storage engine
Most devs stop at:
“We use Postgres.”
But the real magic is:
B-Trees
WAL logs
Caching
Query planning
That’s where performance lives.
Final Thoughts
Chapter 3 pulls back the curtain and says:
“Your database is not magic. It’s just clever data structures.”
Once you understand that:
You debug faster
You design better systems
You stop blaming the database for everything 😄