Chapter 2: Data Models and Query Languages

Frank Mendez

“The limits of my language mean the limits of my world.” — Ludwig Wittgenstein

The Real Backbone of Every System

If Chapter 1 was about how systems behave, Chapter 2 is about something more dangerous:
how you think about your data.

Because once you pick a data model…
you’re kind of stuck with it.


Why Data Models Matter More Than You Think

Most engineers treat databases like a tool:

“Just pick Postgres… or Mongo… or whatever works.”

But Martin Kleppmann makes a subtle point:

👉 Data models shape your thinking, not just your storage.

Every system is layered:

  • Real-world concepts (users, payments, events)

  • Application objects (classes, structs, JSON)

  • Database representation (tables, documents, graphs)

  • Storage (bytes, disk, memory)

Each layer hides the complexity of the layer below it.

And here’s the catch:

Every abstraction makes some things easy… and others painful.


The Three Core Data Models

Let’s strip it down. Almost everything you’ll use falls into three camps:


1. Relational Model (SQL)

The OG. Still undefeated in many ways.

Core idea:

  • Data lives in tables (relations)

  • Rows = records

  • Columns = attributes

  • Relationships handled via joins

Strengths:

  • Handles complex relationships (many-to-many) extremely well

  • Mature ecosystem

  • Query optimizers do the hard work for you

Weaknesses:

  • Rigid schema (schema-on-write)

  • Mapping objects → tables can feel awkward (the famous ORM headache)
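To make the "relationships handled via joins" point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The users, skills, and user_skills tables and the users_with_skill helper are made up for illustration; the point is only that a many-to-many relationship becomes a join table, and the database resolves it for you.

```python
import sqlite3

# Hypothetical schema: users, skills, and a join table linking the two.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE skills (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE user_skills (
        user_id  INTEGER NOT NULL REFERENCES users(id),
        skill_id INTEGER NOT NULL REFERENCES skills(id),
        PRIMARY KEY (user_id, skill_id)
    );
    INSERT INTO users  VALUES (1, 'Frank'), (2, 'Ada');
    INSERT INTO skills VALUES (1, 'React'), (2, 'AI');
    INSERT INTO user_skills VALUES (1, 1), (1, 2), (2, 2);
""")

def users_with_skill(skill_name: str) -> list[str]:
    """Return the names of all users who have the given skill."""
    rows = conn.execute(
        """
        SELECT u.name
        FROM users u
        JOIN user_skills us ON us.user_id = u.id
        JOIN skills s       ON s.id = us.skill_id
        WHERE s.name = ?
        ORDER BY u.name
        """,
        (skill_name,),
    ).fetchall()
    return [name for (name,) in rows]

print(users_with_skill("AI"))  # ['Ada', 'Frank']
```

Notice that the query never says how to find the matching rows. That part is left to the query optimizer.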


2. Document Model (NoSQL / JSON)

Think MongoDB, Firestore, etc.

Core idea:

  • Store data as self-contained documents (usually JSON)

{
  "user": "Frank",
  "skills": ["React", "AI"],
  "experience": {
    "years": 8
  }
}

Strengths:

  • Flexible schema (schema-on-read)

  • Maps naturally to application code

  • Great for one-to-many / tree-like data

Weaknesses:

  • Weak support for joins

  • Many-to-many relationships get messy fast

  • You might push complexity into your app instead of the DB

👉 Translation:
You avoided SQL… but now you are the query engine. Congrats.
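Here is a rough sketch of what "being the query engine" can look like. The documents and the users_with_skill_names helper are hypothetical, and plain Python dicts stand in for whatever a document store would return. Once skills live in their own collection, the join happens in your code.

```python
# Hypothetical documents, shaped like results from two collections.
users = [
    {"_id": "u1", "user": "Frank", "skill_ids": ["s1", "s2"]},
    {"_id": "u2", "user": "Ada", "skill_ids": ["s2"]},
]
skills = [
    {"_id": "s1", "name": "React"},
    {"_id": "s2", "name": "AI"},
]

def users_with_skill_names(users, skills):
    """Application-side join: resolve every skill_id by hand."""
    # Build the lookup index a database would otherwise maintain for us.
    skills_by_id = {s["_id"]: s["name"] for s in skills}
    return [
        {"user": u["user"], "skills": [skills_by_id[i] for i in u["skill_ids"]]}
        for u in users
    ]

joined = users_with_skill_names(users, skills)
print(joined)
```

This works fine at small scale. The cost shows up later, when every new access pattern needs another hand-written lookup like this one.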


3. Graph Model

The underrated weapon.

Core idea:

  • Data = nodes (entities) + edges (relationships)

Perfect for:

  • Social networks

  • Recommendation systems

  • Fraud detection

  • Anything with deep relationships

Strengths:

  • Relationships are first-class citizens

  • Traversals are natural and efficient

  • Extremely flexible schema

Weaknesses:

  • Not ideal for simple CRUD apps

  • Requires a mindset shift

Graphs shine when:

“Everything is connected to everything.”
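A tiny sketch of why traversals feel natural in this model. The follows graph and the reachable_within helper are invented for illustration, and an in-memory adjacency list is only a stand-in for a real graph database. The question "who is within N hops of this person?" becomes a short breadth-first walk over edges.

```python
from collections import deque

# Hypothetical "follows" edges in a tiny social graph.
follows = {
    "frank": ["ada", "linus"],
    "ada": ["grace"],
    "linus": ["grace"],
    "grace": [],
}

def reachable_within(graph, start, max_hops):
    """Breadth-first traversal: everyone reachable in at most max_hops edges."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, hops + 1))
    return seen - {start}

print(sorted(reachable_within(follows, "frank", 1)))  # ['ada', 'linus']
print(sorted(reachable_within(follows, "frank", 2)))  # ['ada', 'grace', 'linus']
```

Expressing the same variable-depth question with a fixed number of joins is where the relational model starts to feel awkward.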


The Real Trade-Off (No BS Version)

Let’s simplify the decision-making:

Use Case                         Best Model
Structured data, transactions    Relational
Nested data, fast iteration      Document
Highly connected data            Graph

And here’s the truth Kleppmann hints at:

There is no “best” model, only the one that fits the purpose.


Schema: Strict vs Flexible

This is where developers start fights on Twitter.

Schema-on-Write (Relational)

  • Define structure before storing

  • Enforced by the database

✔ Safe
❌ Less flexible


Schema-on-Read (Document)

  • Store anything

  • Interpret later

✔ Flexible
❌ Easy to shoot yourself in the foot

Kleppmann's point, paraphrased:

“Schemaless” is misleading. The schema still exists; it is just implicit and lives somewhere else (usually your code).
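A small sketch of the schema living in your code. The two document shapes and the first_name helper are hypothetical; they imagine an app that once stored a full name and later switched to separate fields, without migrating the old documents.

```python
# Hypothetical user documents written at different times, with different shapes.
old_doc = {"name": "Ada Lovelace"}
new_doc = {"first_name": "Ada", "last_name": "Lovelace"}

def first_name(doc: dict) -> str:
    """The implicit schema: the reader must handle every shape ever written."""
    if "first_name" in doc:
        return doc["first_name"]
    # Older documents only carry a full name, so derive the first name on read.
    return doc["name"].split(" ")[0]

print(first_name(old_doc), first_name(new_doc))
```

With schema-on-write, the equivalent change would be a migration that runs once in the database. Here the branch stays in the code for as long as old documents exist.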


Query Languages: Declarative Wins

Another underrated concept.

Imperative (how to do it)

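// Assumes `animals` is an array of objects with a `family` field.
const sharks = [];
for (const animal of animals) {
  if (animal.family === "Sharks") {
    sharks.push(animal);
  }
}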

Declarative (what you want)

SELECT * FROM animals WHERE family = 'Sharks';

Declarative queries:

  • Let the database optimize execution

  • Are easier to run in parallel, because they describe the result and not the algorithm

  • Reduce application complexity

That’s why SQL is still everywhere.
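To see the two styles side by side, here is a minimal sketch in Python with the built-in sqlite3 module. The animals data and both helper functions are made up for illustration. Both produce the same answer; only the imperative version commits to a specific way of getting there.

```python
import sqlite3

# Hypothetical data set, reusing the animals/Sharks example.
animals = [
    {"name": "Great white", "family": "Sharks"},
    {"name": "Clownfish", "family": "Pomacentridae"},
    {"name": "Hammerhead", "family": "Sharks"},
]

def sharks_imperative(rows):
    """Imperative: spell out the loop, the test, and the accumulation."""
    result = []
    for row in rows:
        if row["family"] == "Sharks":
            result.append(row["name"])
    return result

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE animals (name TEXT, family TEXT)")
conn.executemany("INSERT INTO animals VALUES (:name, :family)", animals)

def sharks_declarative():
    """Declarative: describe the result and let the database choose the plan."""
    rows = conn.execute(
        "SELECT name FROM animals WHERE family = 'Sharks' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]

print(sharks_imperative(animals))
print(sharks_declarative())
```

If an index on family were added later, the declarative version could benefit without any change to the query text. The loop would have to be rewritten to take advantage of it.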


Beyond SQL: Modern Query Approaches

Chapter 2 also explores different query styles:

  • SQL → relational

  • MongoDB aggregation → document pipelines

  • Cypher → graph queries

  • SPARQL → RDF graph queries

  • Datalog → rule-based logic (very powerful, slightly mind-bending)

Datalog in particular is interesting:

You define rules, not just queries—and they can be reused and composed
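The following is not Datalog itself, only a rough Python sketch of the idea behind it. The within and born_in facts and both helper functions are hypothetical. One rule derives a transitive "within" relation by repeating until nothing new appears, and a second rule builds on the first.

```python
# Hypothetical facts: (place, containing place) and (person, birthplace).
within = {("Idaho", "United States"), ("United States", "North America")}
born_in = {("Lucy", "Idaho")}

def within_recursive(facts):
    """Naive fixed-point evaluation of two Datalog-style rules:
         within_rec(X, Y) :- within(X, Y).
         within_rec(X, Z) :- within(X, Y), within_rec(Y, Z).
    """
    derived = set(facts)  # first rule: every base fact is derived
    while True:
        new = {
            (x, z)
            for (x, y) in facts
            for (y2, z) in derived
            if y == y2
        } - derived
        if not new:
            return derived  # nothing new can be derived: fixed point reached
        derived |= new

def people_born_within(region, born_in, closure):
    """A second rule that reuses the result of the first."""
    return {
        person
        for (person, place) in born_in
        if place == region or (place, region) in closure
    }

closure = within_recursive(within)
print(people_born_within("North America", born_in, closure))  # {'Lucy'}
```

The composability is the interesting part: people_born_within does not care how the closure was computed, only that the rule exists.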


The Object-Relational Mismatch (a.k.a. Why ORMs Exist)

Your app uses objects.
Your DB uses tables.

That mismatch creates:

  • Boilerplate

  • Complexity

  • Performance trade-offs

This is why:

  • ORMs exist

  • And also why people complain about ORMs 😅
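A minimal sketch of the mismatch, with hypothetical User and Position classes and an invented to_rows helper. One nested object in your app has to be split into rows for two tables, and something has to do that translation in both directions.

```python
from dataclasses import dataclass, field

@dataclass
class Position:
    title: str
    company: str

@dataclass
class User:
    id: int
    name: str
    positions: list[Position] = field(default_factory=list)

def to_rows(user: User):
    """Shred one object into rows for two hypothetical tables: users, positions."""
    user_row = (user.id, user.name)
    # Each nested position becomes its own row, tied back by the user's id.
    position_rows = [(user.id, p.title, p.company) for p in user.positions]
    return user_row, position_rows

user = User(1, "Frank", [Position("Engineer", "Acme"), Position("Lead", "Globex")])
user_row, position_rows = to_rows(user)
print(user_row)
print(position_rows)
```

An ORM automates roughly this kind of mapping, which removes the boilerplate but also hides the queries it generates.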


The Convergence Trend

Here’s the plot twist:

Databases are starting to look… the same.

  • Relational DBs now support JSON

  • Document DBs are adding joins

  • Hybrid systems are emerging

Kleppmann calls this out clearly:

👉 The future is likely a mix of models, not one winner


Practical Takeaways (No Fluff)

If you remember nothing else, remember this:

1. Your data model is a long-term decision

Changing it later = pain.


2. Model your relationships first

  • Few relationships → Document

  • Many relationships → Relational or Graph


3. Don’t blindly follow trends

“Use Mongo” or “Use Postgres” is not architecture.


4. Complexity moves somewhere

  • Not in the DB? → It’s in your app

  • Not in your app? → It’s in the DB

Pick your poison wisely.


Final Thoughts

Chapter 2 is basically Kleppmann saying:

“Your database choice is not a tooling decision.
It’s a thinking framework.”

And once you see it that way, you stop asking:

“What database should I use?”

And start asking:

“What shape does my data actually have?”


