Chapter 2: Data Models and Query Languages

“The limits of my language mean the limits of my world.” — Ludwig Wittgenstein
The Real Backbone of Every System
If Chapter 1 was about how systems behave, Chapter 2 is about something more dangerous:
how you think about your data.
Because once you pick a data model…
you’re kind of stuck with it.
Why Data Models Matter More Than You Think
Most engineers treat databases like a tool:
“Just pick Postgres… or Mongo… or whatever works.”
But Martin Kleppmann makes a subtle point:
👉 Data models shape your thinking, not just your storage.
Every system is layered:
Real-world concepts (users, payments, events)
Application objects (classes, structs, JSON)
Database representation (tables, documents, graphs)
Storage (bytes, disk, memory)
Each layer hides the complexity of the one below it behind a cleaner model.
And here’s the catch:
Every abstraction makes some things easy… and others painful.
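To make the layering a bit more concrete, here is a minimal Python sketch that walks one concept down through three of the layers and back up again. The `User` class and its fields are made up for this example, and JSON stands in for whatever representation a real database would use.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class User:
    # Application-object layer: a hypothetical class for illustration.
    name: str
    skills: list


# Real-world concept -> application object
user = User(name="Frank", skills=["React", "AI"])

# Application object -> database representation (here, a JSON document)
document = json.dumps(asdict(user))

# Database representation -> storage (ultimately just bytes)
stored = document.encode("utf-8")

# Reading goes back up the stack, one layer at a time.
restored = User(**json.loads(stored.decode("utf-8")))
print(restored == user)
```

Each step throws away something the layer above cared about (the class, then the field structure), which is where the "easy here, painful there" trade-off comes from.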
The Three Core Data Models
Let’s strip it down. Almost everything you’ll use falls into three camps:
1. Relational Model (SQL)
The OG. Still undefeated in many ways.
Core idea:
Data lives in tables (relations)
Rows = records
Columns = attributes
Relationships handled via joins
Strengths:
Handles complex relationships (many-to-many) extremely well
Mature ecosystem
Query optimizers do the hard work for you
Weaknesses:
Rigid schema (schema-on-write)
Mapping objects → tables can feel awkward (the famous ORM headache)
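Here is a small sketch of the relational idea, using Python's built-in `sqlite3` module so it runs anywhere. The table and column names (`users`, `skills`, `user_skills`) are assumptions chosen for this example, not something from the book.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE skills (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- Join table: one row per (user, skill) pair models many-to-many.
    CREATE TABLE user_skills (
        user_id  INTEGER NOT NULL REFERENCES users(id),
        skill_id INTEGER NOT NULL REFERENCES skills(id),
        PRIMARY KEY (user_id, skill_id)
    );
    INSERT INTO users  VALUES (1, 'Frank'), (2, 'Ada');
    INSERT INTO skills VALUES (1, 'React'), (2, 'AI');
    INSERT INTO user_skills VALUES (1, 1), (1, 2), (2, 2);
""")

# Who knows 'AI'? The joins express the relationship; the database
# decides how to execute them.
rows = conn.execute("""
    SELECT u.name
    FROM users u
    JOIN user_skills us ON us.user_id = u.id
    JOIN skills s       ON s.id = us.skill_id
    WHERE s.name = 'AI'
    ORDER BY u.name
""").fetchall()

ai_users = [name for (name,) in rows]
print(ai_users)
```

Notice that the skill "AI" is stored once and referenced by ID. That normalization is what keeps many-to-many data manageable.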
2. Document Model (NoSQL / JSON)
Think MongoDB, Firestore, etc.
Core idea:
Store data as self-contained documents (usually JSON)
{
  "user": "Frank",
  "skills": ["React", "AI"],
  "experience": {
    "years": 8
  }
}
Strengths:
Flexible schema (schema-on-read)
Maps naturally to application code
Great for one-to-many / tree-like data
Weaknesses:
Weak support for joins
Many-to-many relationships get messy fast
You might push complexity into your app instead of the DB
👉 Translation:
You avoided SQL… but now you are the query engine. Congrats.
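To see what "you are the query engine" looks like in practice, here is a sketch with plain Python dicts standing in for documents. The collections, the `_id` fields, and the `company_id` reference are all hypothetical; a real document database would give you a query API on top of this.

```python
# Documents as plain dicts; the collection names and fields are hypothetical.
users = [
    {"_id": 1, "user": "Frank", "skills": ["React", "AI"], "company_id": 10},
    {"_id": 2, "user": "Ada", "skills": ["AI"], "company_id": 20},
]
companies = [
    {"_id": 10, "name": "Acme"},
    {"_id": 20, "name": "Initech"},
]

# One-to-many data is easy: everything about Frank is in one document.
frank = next(u for u in users if u["user"] == "Frank")
print(frank["skills"])

# A cross-document reference needs a join, and here the application does it:
# build a lookup table, then resolve each reference by hand.
company_by_id = {c["_id"]: c for c in companies}
employers = {u["user"]: company_by_id[u["company_id"]]["name"] for u in users}
print(employers)
```

The first lookup is the document model at its best. The second is the part that a relational database would have done for you with a join.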
3. Graph Model
The underrated weapon.
Core idea:
Data = nodes (entities) + edges (relationships)
Perfect for:
Social networks
Recommendation systems
Fraud detection
Anything with deep relationships
Strengths:
Relationships are first-class citizens
Traversals are natural and efficient
Extremely flexible schema
Weaknesses:
Not ideal for simple CRUD apps
Requires a mindset shift
Graphs shine when:
“Everything is connected to everything.”
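A rough sketch of why traversals feel natural in this model: store edges, build an adjacency list, and walk it. The tiny social graph and the `reachable_within` helper below are invented for illustration. A real graph database would expose this through a query language such as Cypher instead of hand-written search code.

```python
from collections import deque

# Edges of a tiny, made-up social graph: (person, follows).
edges = [
    ("alice", "bob"),
    ("bob", "carol"),
    ("carol", "dave"),
    ("alice", "erin"),
]

# Adjacency list: node -> set of directly connected nodes.
graph = {}
for src, dst in edges:
    graph.setdefault(src, set()).add(dst)
    graph.setdefault(dst, set())


def reachable_within(start, max_hops):
    """Breadth-first search: nodes reachable from start in at most max_hops edges."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, hops + 1))
    return seen - {start}


print(sorted(reachable_within("alice", 2)))
```

Changing "friends of friends" to "three hops out" is a one-argument change here. In SQL the same question usually means another self-join or a recursive query.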
The Real Trade-Off (No BS Version)
Let’s simplify the decision-making:
Use case → Best model
Structured data, transactions → Relational
Nested data, fast iteration → Document
Highly connected data → Graph
And here’s the truth Kleppmann hints at:
There is no “best” model—only fit-for-purpose
Schema: Strict vs Flexible
This is where developers start fights on Twitter.
Schema-on-Write (Relational)
Define structure before storing
Enforced by the database
✔ Safe
❌ Less flexible
Schema-on-Read (Document)
Store anything
Interpret later
✔ Flexible
❌ Easy to shoot yourself in the foot
Kleppmann's point, paraphrased:
Calling it “schemaless” is misleading. The schema is still there; it is just implicit, and it lives in the code that reads the data.
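The difference is easy to see side by side. This sketch uses `sqlite3` for the schema-on-write half and plain dicts for the schema-on-read half. The `users` table, the sample documents, and the `first_name` helper are assumptions made up for this example.

```python
import sqlite3

# Schema-on-write: the database rejects rows that break the declared structure.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, first_name TEXT NOT NULL)")
try:
    conn.execute("INSERT INTO users (id, first_name) VALUES (1, NULL)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)

# Schema-on-read: anything can be stored, so the reading code carries the schema.
# Older documents only have "name"; newer ones have "first_name".
documents = [
    {"name": "Frank Example"},
    {"first_name": "Ada"},
]


def first_name(doc):
    # The implicit schema lives here, in application code.
    if "first_name" in doc:
        return doc["first_name"]
    return doc["name"].split(" ")[0]


names = [first_name(d) for d in documents]
print(names)
```

Neither side is free. The first needs a migration when the structure changes; the second needs every reader to know about every shape the data has ever had.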
Query Languages: Declarative Wins
Another underrated concept.
Imperative (how to do it)
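// Assumes `animals` is an in-memory array of { name, family } objects.
const sharks = [];
for (const animal of animals) {
  if (animal.family === "Sharks") {
    sharks.push(animal);
  }
}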
Declarative (what you want)
SELECT * FROM animals WHERE family = 'Sharks';
Declarative queries:
Let the database optimize execution
Hide implementation details, so the database can get faster (or run in parallel) without you rewriting the query
Reduce application complexity
That’s why SQL is still everywhere.
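One way to see the payoff, sketched with `sqlite3`: the query text stays the same while the database changes how it runs it. The `animals` table contents, the index name `idx_animals_family`, and the `plan` helper are assumptions for this example, and the exact wording of the plan output varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE animals (name TEXT, family TEXT)")
conn.executemany(
    "INSERT INTO animals VALUES (?, ?)",
    [("Great white", "Sharks"), ("Hammerhead", "Sharks"), ("Lion", "Felidae")],
)

QUERY = "SELECT name FROM animals WHERE family = 'Sharks' ORDER BY name"


def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is a human-readable step.
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


before = conn.execute(QUERY).fetchall()
plan_before = plan(QUERY)

# Add an index. The query text does not change at all.
conn.execute("CREATE INDEX idx_animals_family ON animals (family)")

after = conn.execute(QUERY).fetchall()
plan_after = plan(QUERY)

print(before == after)
print(plan_before)
print(plan_after)
```

The results are identical, but the second plan should mention the new index. With the imperative loop, getting the same speedup would mean rewriting the loop itself.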
Beyond SQL: Modern Query Approaches
Chapter 2 also explores different query styles:
SQL → relational
MongoDB aggregation → document pipelines
Cypher → graph queries
SPARQL → RDF graph queries
Datalog → rule-based logic (very powerful, slightly mind-bending)
Datalog in particular is interesting:
You define rules, not just queries—and they can be reused and composed
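Python is not Datalog, but the flavor of "rules that build on rules" can be sketched with sets and a loop that runs until nothing new is derived. Everything here (the `within` facts, the `within_recursive` and `born_in_region` functions) is a hand-rolled illustration, not how a real Datalog engine is implemented.

```python
# Facts: within(child, parent). A small illustrative place hierarchy.
within = {
    ("Idaho", "United States"),
    ("United States", "North America"),
    ("Boise", "Idaho"),
}


def within_recursive(facts):
    """Evaluate two Datalog-style rules until no new facts appear:

    within_recursive(X, Y) :- within(X, Y).
    within_recursive(X, Z) :- within(X, Y), within_recursive(Y, Z).
    """
    derived = set(facts)  # rule 1
    while True:
        new = {
            (x, z)
            for (x, y) in facts
            for (y2, z) in derived
            if y == y2
        } - derived  # rule 2
        if not new:
            return derived
        derived |= new


def born_in_region(born_in, region, closure):
    # A second rule that reuses the first one's output.
    return {p for (p, place) in born_in if place == region or (place, region) in closure}


closure = within_recursive(within)
people = {("Lucy", "Idaho"), ("Alain", "Beaune")}
print(sorted(born_in_region(people, "North America", closure)))
```

The second function never mentions how containment is computed. It just builds on the derived facts, which is the reuse-and-compose idea in miniature.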
The Object-Relational Mismatch (a.k.a. Why ORMs Exist)
Your app uses objects.
Your DB uses tables.
That mismatch creates:
Boilerplate
Complexity
Performance trade-offs
This is why:
ORMs exist
And also why people complain about ORMs 😅
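Here is roughly what that translation layer looks like when you write it by hand. The `Profile` and `Position` classes, the two tables, and the `save` / `load` helpers are hypothetical names for this sketch. An ORM generates code along these lines for you, which is both its appeal and the source of the complaints.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class Position:
    title: str
    company: str


@dataclass
class Profile:
    # Hypothetical application object: one user with a nested list.
    user_id: int
    name: str
    positions: list = field(default_factory=list)


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE positions (
        user_id INTEGER NOT NULL REFERENCES users(id),
        title TEXT NOT NULL,
        company TEXT NOT NULL
    );
""")


def save(profile):
    # One object becomes rows in two tables.
    conn.execute("INSERT INTO users VALUES (?, ?)", (profile.user_id, profile.name))
    conn.executemany(
        "INSERT INTO positions VALUES (?, ?, ?)",
        [(profile.user_id, p.title, p.company) for p in profile.positions],
    )


def load(user_id):
    # Reassembling the object takes two queries (or a join plus grouping).
    (name,) = conn.execute("SELECT name FROM users WHERE id = ?", (user_id,)).fetchone()
    rows = conn.execute(
        "SELECT title, company FROM positions WHERE user_id = ? ORDER BY rowid", (user_id,)
    )
    return Profile(user_id, name, [Position(t, c) for t, c in rows])


original = Profile(1, "Frank", [Position("Engineer", "Acme"), Position("Lead", "Initech")])
save(original)
print(load(1) == original)
```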
The Convergence Trend
Here’s the plot twist:
Databases are starting to look… the same.
Relational DBs now support JSON
Document DBs are adding joins
Hybrid systems are emerging
Kleppmann calls this out clearly:
👉 The future is likely a mix of models, not one winner
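A quick taste of that convergence, sketched with `sqlite3`: a relational table holding a JSON document and querying inside it. This assumes your SQLite build includes the JSON functions (they are built in by default in recent versions); the `profiles` table and `doc` column are made up for this example.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE profiles (id INTEGER PRIMARY KEY, doc TEXT NOT NULL)")

doc = {"user": "Frank", "skills": ["React", "AI"], "experience": {"years": 8}}
conn.execute("INSERT INTO profiles VALUES (1, ?)", (json.dumps(doc),))

# json_extract reaches inside the stored document with a path expression.
row = conn.execute(
    "SELECT json_extract(doc, '$.user'), json_extract(doc, '$.experience.years') "
    "FROM profiles WHERE json_extract(doc, '$.experience.years') >= 5"
).fetchone()
print(row)
```

Same document as in the document-model section, but sitting in a table and filtered with SQL. That is the hybrid approach in its simplest form.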
Practical Takeaways (No Fluff)
If you remember nothing else, remember this:
1. Your data model is a long-term decision
Changing it later = pain.
2. Model your relationships first
Few relationships → Document
Many relationships → Relational or Graph
3. Don’t blindly follow trends
“Use Mongo” or “Use Postgres” is not architecture.
4. Complexity moves somewhere
Not in the DB? → It’s in your app
Not in your app? → It’s in the DB
Pick your poison wisely.
Final Thoughts
Chapter 2 is basically Kleppmann saying:
“Your database choice is not a tooling decision.
It’s a thinking framework.”
And once you see it that way, you stop asking:
“What database should I use?”
And start asking:
“What shape does my data actually have?”