Chapter 2: Data Models and Query Languages

“The limits of my language mean the limits of my world.” — Ludwig Wittgenstein
The Real Backbone of Every System
If Chapter 1 was about how systems behave, Chapter 2 is about something more dangerous:
how you think about your data.
Because once you pick a data model…
you’re kind of stuck with it.
Why Data Models Matter More Than You Think
Most engineers treat databases like a tool:
“Just pick Postgres… or Mongo… or whatever works.”
But Martin Kleppmann makes a subtle point:
👉 Data models shape your thinking, not just your storage.
Every system is layered:
Real-world concepts (users, payments, events)
Application objects (classes, structs, JSON)
Database representation (tables, documents, graphs)
Storage (bytes, disk, memory)
Each layer hides the complexity of the one below it.
And here’s the catch:
Every abstraction makes some things easy… and others painful.
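To make the layering concrete, here is a minimal Python sketch that follows one record down the stack and back up. The `User` class and its fields are invented for illustration; JSON stands in for the database representation, which is an assumption, not something the chapter prescribes.

```python
import json
from dataclasses import dataclass, asdict

# Layer 2: an application object (hypothetical example type).
@dataclass
class User:
    name: str
    skills: list

user = User(name="Frank", skills=["React", "AI"])

# Layer 3: a database-style representation (here, a JSON document).
document = json.dumps(asdict(user), sort_keys=True)

# Layer 4: storage, i.e. the bytes that actually reach disk or the network.
stored = document.encode("utf-8")

# Reading goes back up the stack, one layer at a time.
restored = User(**json.loads(stored.decode("utf-8")))
```

Each hop is easy here because the object is a simple tree. The painful cases show up when the shape of the object and the shape of the storage stop matching.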
The Three Core Data Models
Let’s strip it down. Almost everything you’ll use falls into three camps:
1. Relational Model (SQL)
The OG. Still undefeated in many ways.
Core idea:
Data lives in tables (relations)
Rows = records
Columns = attributes
Relationships handled via joins
Strengths:
Handles complex relationships (many-to-many) extremely well
Mature ecosystem
Query optimizers do the hard work for you
Weaknesses:
Rigid schema (schema-on-write)
Mapping objects → tables can feel awkward (the famous ORM headache)
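A rough sketch of the many-to-many strength, using Python's built-in `sqlite3` module and an in-memory database. The table names (`users`, `skills`, `user_skills`) and the sample rows are assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE skills (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- Join table: one row per (user, skill) pair models many-to-many.
    CREATE TABLE user_skills (
        user_id  INTEGER NOT NULL REFERENCES users(id),
        skill_id INTEGER NOT NULL REFERENCES skills(id),
        PRIMARY KEY (user_id, skill_id)
    );
    INSERT INTO users  VALUES (1, 'Frank'), (2, 'Ada');
    INSERT INTO skills VALUES (1, 'React'), (2, 'AI');
    INSERT INTO user_skills VALUES (1, 1), (1, 2), (2, 2);
""")

def users_with_skill(skill_name):
    """Return the names of users who have the given skill, sorted."""
    rows = conn.execute(
        """
        SELECT u.name
        FROM users u
        JOIN user_skills us ON us.user_id = u.id
        JOIN skills s       ON s.id = us.skill_id
        WHERE s.name = ?
        ORDER BY u.name
        """,
        (skill_name,),
    ).fetchall()
    return [name for (name,) in rows]
```

Calling `users_with_skill("AI")` walks the join table in either direction without any data being duplicated. That is the part the relational model gets almost for free.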
2. Document Model (NoSQL / JSON)
Think MongoDB, Firestore, etc.
Core idea:
Store data as self-contained documents (usually JSON)
{
  "user": "Frank",
  "skills": ["React", "AI"],
  "experience": {
    "years": 8
  }
}
Strengths:
Flexible schema (schema-on-read)
Maps naturally to application code
Great for one-to-many / tree-like data
Weaknesses:
Weak support for joins
Many-to-many relationships get messy fast
You might push complexity into your app instead of the DB
👉 Translation:
You avoided SQL… but now you are the query engine. Congrats.
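Here is what "you are the query engine" can look like in practice. This is only a sketch: plain Python dicts stand in for documents, no real document database is involved, and the `company_id` field and `companies` collection are hypothetical additions to the example document.

```python
# Plain dicts stand in for documents; no real document database is used.
users = [
    {"_id": 1, "user": "Frank", "skills": ["React", "AI"],
     "experience": {"years": 8}, "company_id": 10},
    {"_id": 2, "user": "Ada", "skills": ["AI"],
     "experience": {"years": 12}, "company_id": 11},
]
companies = [
    {"_id": 10, "name": "Acme"},
    {"_id": 11, "name": "Initech"},
]

# One-to-many data is local to the document: a single lookup gets it all.
frank_skills = users[0]["skills"]

def users_with_company(users, companies):
    """Application-side join: resolve each company_id by hand."""
    by_id = {c["_id"]: c for c in companies}
    return [
        {"user": u["user"], "company": by_id[u["company_id"]]["name"]}
        for u in users
    ]
```

The nested `skills` list costs nothing to read. The cross-document reference is where the hand-written join code starts to pile up.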
3. Graph Model
The underrated weapon.
Core idea:
Data = nodes (entities) + edges (relationships)
Perfect for:
Social networks
Recommendation systems
Fraud detection
Anything with deep relationships
Strengths:
Relationships are first-class citizens
Traversals are natural and efficient
Extremely flexible schema
Weaknesses:
Not ideal for simple CRUD apps
Requires a mindset shift
Graphs shine when:
“Everything is connected to everything.”
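A toy version of the nodes-plus-edges idea, sketched in plain Python. The edge list and the `follows` label are made up, and the linear scan in `neighbors` is a simplification: a real graph database would index adjacency and not scan every edge.

```python
from collections import deque

# Edges as (source, label, target) triples; the data is made up.
edges = [
    ("alice", "follows", "bob"),
    ("bob", "follows", "carol"),
    ("carol", "follows", "dave"),
    ("alice", "follows", "erin"),
]

def neighbors(node, label):
    """Targets of all edges with this label that leave the given node."""
    return [dst for src, lbl, dst in edges if src == node and lbl == label]

def reachable(start, label, max_hops):
    """Breadth-first traversal: nodes within max_hops edges of start."""
    seen = {start}
    found = []
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand past the hop limit
        for nxt in neighbors(node, label):
            if nxt not in seen:
                seen.add(nxt)
                found.append(nxt)
                queue.append((nxt, hops + 1))
    return found
```

"Friends of friends" is just `reachable("alice", "follows", 2)`. Expressing the same variable-depth question with joins means one more self-join per hop, or a recursive query.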
The Real Trade-Off (No BS Version)
Let’s simplify the decision-making:
Use case → Best model
Structured data, transactions → Relational
Nested data, fast iteration → Document
Highly connected data → Graph
And here’s the truth Kleppmann hints at:
There is no “best” model, only the one that fits the purpose.
Schema: Strict vs Flexible
This is where developers start fights on Twitter.
Schema-on-Write (Relational)
Define structure before storing
Enforced by the database
✔ Safe
❌ Less flexible
Schema-on-Read (Document)
Store anything
Interpret later
✔ Flexible
❌ Easy to shoot yourself in the foot
Kleppmann’s point, paraphrased:
“Schemaless” is misleading. The schema is still there; it is just implicit and enforced somewhere else (usually the code that reads the data).
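A small sketch of the two approaches side by side, assuming SQLite (via Python's `sqlite3`) for the schema-on-write half and plain dicts for the schema-on-read half. The table, fields, and helper names are invented for the example.

```python
import sqlite3

# Schema-on-write: the database rejects rows that break the declared shape.
# SQLite enforces NOT NULL, though it is lenient about column types by default.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT NOT NULL, years INTEGER NOT NULL)")

def insert_user(name, years):
    """Return True if the database accepted the row, False if it refused."""
    try:
        conn.execute("INSERT INTO users VALUES (?, ?)", (name, years))
        return True
    except sqlite3.IntegrityError:
        return False

# Schema-on-read: anything can be stored, so the reader carries the schema.
documents = [
    {"user": "Frank", "experience": {"years": 8}},
    {"user": "Ada"},  # older document written before "experience" existed
]

def years_of_experience(doc):
    """The implicit schema lives here: handle every shape ever written."""
    return doc.get("experience", {}).get("years", 0)
```

In the first half the bad row never gets in. In the second half it gets in fine, and every reader has to know what to do about it.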
Query Languages: Declarative Wins
Another underrated concept.
Imperative (how to do it)
for (...) {
  if (...) {
    ...
  }
}
Declarative (what you want)
SELECT * FROM animals WHERE family = 'Sharks';
Declarative queries:
Let the database optimize execution
Are easier to parallelize, because they state the result and not the steps
Reduce application complexity
That’s why SQL is still everywhere.
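The same "find the sharks" question, sketched both ways in Python. The sample rows are made up, and `sqlite3` with an in-memory database stands in for whatever SQL engine you actually run.

```python
import sqlite3

animals = [
    ("Great white", "Sharks"),
    ("Hammerhead", "Sharks"),
    ("Clownfish", "Pomacentridae"),
]

def sharks_imperative(rows):
    """Imperative: spell out the loop, the test, and the result list."""
    result = []
    for name, family in rows:
        if family == "Sharks":
            result.append(name)
    return result

def sharks_declarative(rows):
    """Declarative: state the condition; the engine chooses how to run it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE animals (name TEXT, family TEXT)")
    conn.executemany("INSERT INTO animals VALUES (?, ?)", rows)
    found = conn.execute(
        "SELECT name FROM animals WHERE family = 'Sharks' ORDER BY name"
    ).fetchall()
    conn.close()
    return [name for (name,) in found]
```

Both return the same names. The difference is who owns the "how": in the loop it is you, in the query it is the engine, which is free to add an index or change its scan strategy later without the query changing.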
Beyond SQL: Modern Query Approaches
Chapter 2 also explores different query styles:
SQL → relational
MongoDB aggregation → document pipelines
Cypher → graph queries
SPARQL → RDF graph queries
Datalog → rule-based logic (very powerful, slightly mind-bending)
Datalog in particular is interesting:
You define rules, not just queries—and they can be reused and composed
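To give a feel for rules that build on other rules, here is a tiny Python imitation of Datalog-style evaluation. It is not a Datalog engine, just a naive fixed-point loop over two rules written in the docstring; the `parent` facts and function names are invented for the example.

```python
# Facts: parent(child, parent). The names are invented for illustration.
parent = {("carol", "bob"), ("bob", "alice"), ("alice", "zoe")}

def ancestors(parent_facts):
    """Naive fixed-point evaluation of two Datalog-style rules:

        ancestor(X, Y) :- parent(X, Y).
        ancestor(X, Z) :- parent(X, Y), ancestor(Y, Z).
    """
    derived = set(parent_facts)  # rule 1: every parent is an ancestor
    while True:
        # Rule 2, applied to everything derived so far.
        new = {
            (x, z)
            for (x, y) in parent_facts
            for (y2, z) in derived
            if y == y2
        } - derived
        if not new:
            return derived  # nothing new can be derived: fixed point
        derived |= new

def ancestors_of(person, parent_facts):
    """A second 'rule' built by reusing the first one."""
    return sorted(z for (x, z) in ancestors(parent_facts) if x == person)
```

The recursive rule is the mind-bending part: `ancestor` is defined in terms of itself, and the loop keeps applying it until no new facts appear.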
The Object-Relational Mismatch (a.k.a. Why ORMs Exist)
Your app uses objects.
Your DB uses tables.
That mismatch creates:
Boilerplate
Complexity
Performance trade-offs
This is why:
ORMs exist
And also why people complain about ORMs 😅
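A hand-rolled sketch of the translation layer an ORM automates, using `sqlite3` and a dataclass. The `User` class, the two tables, and the `save`/`load` helpers are all hypothetical; a real ORM does far more than this.

```python
import sqlite3
from dataclasses import dataclass, field

@dataclass
class User:  # hypothetical application object
    id: int
    name: str
    skills: list = field(default_factory=list)

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE skills (user_id INTEGER NOT NULL, pos INTEGER NOT NULL,
                         name TEXT NOT NULL);
""")

def save(user):
    """One object becomes rows in two tables."""
    conn.execute("INSERT INTO users VALUES (?, ?)", (user.id, user.name))
    conn.executemany(
        "INSERT INTO skills VALUES (?, ?, ?)",
        [(user.id, i, s) for i, s in enumerate(user.skills)],
    )

def load(user_id):
    """Two queries are needed to rebuild the same object."""
    (name,) = conn.execute(
        "SELECT name FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    rows = conn.execute(
        "SELECT name FROM skills WHERE user_id = ? ORDER BY pos",
        (user_id,),
    ).fetchall()
    return User(id=user_id, name=name, skills=[s for (s,) in rows])

frank = User(1, "Frank", ["React", "AI"])
save(frank)
restored = load(1)
```

That is one class with one nested list. Multiply it by every object in your app and both the appeal of ORMs and the complaints about them make sense.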
The Convergence Trend
Here’s the plot twist:
Databases are starting to look… the same.
Relational DBs now support JSON
Document DBs are adding joins
Hybrid systems are emerging
Kleppmann calls this out clearly:
👉 The future is likely a mix of models, not one winner
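One way to see the convergence for yourself: a relational engine querying inside a JSON document. This sketch assumes your SQLite build includes the JSON functions (`json_extract`), which are built in by default since SQLite 3.38 and commonly enabled in earlier builds. The `profiles` table and sample documents are made up.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE profiles (id INTEGER PRIMARY KEY, doc TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO profiles (doc) VALUES (?)",
    [
        (json.dumps({"user": "Frank", "experience": {"years": 8}}),),
        (json.dumps({"user": "Ada", "experience": {"years": 12}}),),
    ],
)

def names_with_min_years(min_years):
    """Filter on a field nested inside the JSON column, using plain SQL."""
    rows = conn.execute(
        """
        SELECT json_extract(doc, '$.user')
        FROM profiles
        WHERE json_extract(doc, '$.experience.years') >= ?
        ORDER BY 1
        """,
        (min_years,),
    ).fetchall()
    return [name for (name,) in rows]
```

The document keeps its flexible shape, and you still get a declarative query over it.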
Practical Takeaways (No Fluff)
If you remember nothing else, remember this:
1. Your data model is a long-term decision
Changing it later = pain.
2. Model your relationships first
Few relationships → Document
Many relationships → Relational or Graph
3. Don’t blindly follow trends
“Use Mongo” or “Use Postgres” is not architecture.
4. Complexity moves somewhere
Not in the DB? → It’s in your app
Not in your app? → It’s in the DB
Pick your poison wisely.
Final Thoughts
Chapter 2 is basically Kleppmann saying:
“Your database choice is not a tooling decision.
It’s a thinking framework.”
And once you see it that way, you stop asking:
“What database should I use?”
And start asking:
“What shape does my data actually have?”