Chapter 2: Data Models and Query Languages

Frank Mendez
“The limits of my language mean the limits of my world.” — Ludwig Wittgenstein

The Real Backbone of Every System

If Chapter 1 was about how systems behave, Chapter 2 is about something more dangerous:
how you think about your data.

Because once you pick a data model…
you’re kind of stuck with it.


Why Data Models Matter More Than You Think

Most engineers treat databases like a tool:

“Just pick Postgres… or Mongo… or whatever works.”

But Martin Kleppmann makes a subtle point:

👉 Data models shape your thinking, not just your storage.

Every system is layered:

  • Real-world concepts (users, payments, events)

  • Application objects (classes, structs, JSON)

  • Database representation (tables, documents, graphs)

  • Storage (bytes, disk, memory)

Each layer hides the complexity of the layers below it behind a cleaner data model.

And here’s the catch:

Every abstraction makes some things easy… and others painful.


The Three Core Data Models

Let’s strip it down. Almost everything you’ll use falls into three camps:


1. Relational Model (SQL)

The OG. Still undefeated in many ways.

Core idea:

  • Data lives in tables (relations)

  • Rows = records

  • Columns = attributes

  • Relationships handled via joins

Strengths:

  • Handles complex relationships (many-to-many) extremely well

  • Mature ecosystem

  • Query optimizers do the hard work for you

Weaknesses:

  • Rigid schema (schema-on-write)

  • Mapping objects → tables can feel awkward (the famous ORM headache)
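
To make the many-to-many point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names (users, skills, user_skills) and the sample rows are made up for illustration; they are not from the book.

```python
import sqlite3

# Hypothetical schema: users, skills, and a join table linking them.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE skills (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE user_skills (
        user_id  INTEGER NOT NULL REFERENCES users(id),
        skill_id INTEGER NOT NULL REFERENCES skills(id),
        PRIMARY KEY (user_id, skill_id)
    );
    INSERT INTO users  VALUES (1, 'Frank'), (2, 'Ada');
    INSERT INTO skills VALUES (1, 'React'), (2, 'AI');
    INSERT INTO user_skills VALUES (1, 1), (1, 2), (2, 2);
""")

# One declarative join answers "who has the skill 'AI'?"
rows = conn.execute("""
    SELECT u.name
    FROM users u
    JOIN user_skills us ON us.user_id = u.id
    JOIN skills s       ON s.id = us.skill_id
    WHERE s.name = ?
    ORDER BY u.name
""", ("AI",)).fetchall()

ai_users = [name for (name,) in rows]
print(ai_users)  # ['Ada', 'Frank']
```

The join table is the whole trick: neither side has to embed the other, and the database works out how to combine them.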


2. Document Model (NoSQL / JSON)

Think MongoDB, Firestore, etc.

Core idea:

  • Store data as self-contained documents (usually JSON)

{
  "user": "Frank",
  "skills": ["React", "AI"],
  "experience": {
    "years": 8
  }
}

Strengths:

  • Flexible schema (schema-on-read)

  • Maps naturally to application code

  • Great for one-to-many / tree-like data

Weaknesses:

  • Weak support for joins

  • Many-to-many relationships get messy fast

  • You might push complexity into your app instead of the DB

👉 Translation:
You avoided SQL… but now you are the query engine. Congrats.
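
Here is roughly what "being the query engine" looks like. This is a sketch with plain Python dicts standing in for documents; the field names (_id, skill_ids) and the helper users_with_skill are hypothetical, not any particular database's API.

```python
# Hypothetical documents, as a document store might return them.
users = [
    {"_id": 1, "user": "Frank", "skill_ids": [10, 20]},
    {"_id": 2, "user": "Ada", "skill_ids": [20]},
]
skills = [
    {"_id": 10, "name": "React"},
    {"_id": 20, "name": "AI"},
]


def users_with_skill(skill_name):
    """Resolve the many-to-many reference by hand, in application code."""
    wanted = {s["_id"] for s in skills if s["name"] == skill_name}
    return sorted(u["user"] for u in users if wanted & set(u["skill_ids"]))


print(users_with_skill("AI"))  # ['Ada', 'Frank']
```

It works, but every lookup like this is code you now own, test, and optimize yourself.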


3. Graph Model

The underrated weapon.

Core idea:

  • Data = nodes (entities) + edges (relationships)

Perfect for:

  • Social networks

  • Recommendation systems

  • Fraud detection

  • Anything with deep relationships

Strengths:

  • Relationships are first-class citizens

  • Traversals are natural and efficient

  • Extremely flexible schema

Weaknesses:

  • Not ideal for simple CRUD apps

  • Requires a mindset shift

Graphs shine when:

“Everything is connected to everything.”
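
A graph database does this for you, but the shape of a traversal is easy to sketch in plain Python. The follows graph and the within_hops helper below are invented for illustration, with an adjacency list standing in for nodes and edges.

```python
from collections import deque

# Hypothetical "follows" graph as an adjacency list: node -> neighbours.
follows = {
    "frank": ["ada", "linus"],
    "ada": ["grace"],
    "linus": ["grace"],
    "grace": ["alan"],
    "alan": [],
}


def within_hops(graph, start, max_hops):
    """Return every node reachable from `start` in at most `max_hops` edges."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand past the hop limit
        for neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, depth + 1))
    seen.discard(start)
    return seen


print(sorted(within_hops(follows, "frank", 2)))  # ['ada', 'grace', 'linus']
```

Expressing "friends of friends, any depth" this way is a short loop; in SQL the same question usually needs a recursive query.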


The Real Trade-Off (No BS Version)

Let’s simplify the decision-making:

Use Case                         Best Model
Structured data, transactions    Relational
Nested data, fast iteration      Document
Highly connected data            Graph

And here’s the truth Kleppmann hints at:

There is no “best” model—only fit-for-purpose


Schema: Strict vs Flexible

This is where developers start fights on Twitter.

Schema-on-Write (Relational)

  • Define structure before storing

  • Enforced by the database

✔ Safe
❌ Less flexible


Schema-on-Read (Document)

  • Store anything

  • Interpret later

✔ Flexible
❌ Easy to shoot yourself in the foot

Kleppmann puts it nicely:

It’s not “schemaless”—it’s just schema somewhere else (usually your code)
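
A small sketch of what "schema in your code" tends to look like. It assumes a hypothetical history where older user documents store a single name field and newer ones store first_name separately; the function name read_first_name is made up.

```python
def read_first_name(doc):
    """Interpret a user document at read time, tolerating two layouts.

    Assumed (hypothetical) history: older documents store a single "name"
    field, newer ones store "first_name" separately.
    """
    if "first_name" in doc:
        return doc["first_name"]
    if doc.get("name"):
        return doc["name"].split(" ")[0]
    return None  # neither layout matched; the caller decides what to do


old_doc = {"name": "Frank Mendez"}
new_doc = {"first_name": "Frank", "last_name": "Mendez"}
print(read_first_name(old_doc), read_first_name(new_doc))  # Frank Frank
```

Nothing stops you from storing both shapes, but every reader of the data has to know about both.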


Query Languages: Declarative Wins

Another underrated concept.

Imperative (how to do it)

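// Assumes `animals` is an array of objects with a `family` field.
const sharks = [];
for (const animal of animals) {
  if (animal.family === "Sharks") {
    sharks.push(animal);
  }
}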

Declarative (what you want)

SELECT * FROM animals WHERE family = 'Sharks';

Declarative queries:

  • Let the database optimize execution

  • Are easier to parallelize, since they state the result and not the order of steps

  • Reduce application complexity

That’s why SQL is still everywhere.
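
One way to see the payoff, sketched with Python's built-in sqlite3 module and a made-up animals table: the query text stays the same while the database is free to change how it runs it, for example after an index is added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE animals (name TEXT, family TEXT)")
conn.executemany(
    "INSERT INTO animals VALUES (?, ?)",
    [("Great white", "Sharks"), ("Lion", "Felidae"), ("Hammerhead", "Sharks")],
)

QUERY = "SELECT name FROM animals WHERE family = 'Sharks' ORDER BY name"

before = conn.execute(QUERY).fetchall()

# Adding an index changes how the database may execute the query,
# but the query text itself stays exactly the same.
conn.execute("CREATE INDEX idx_animals_family ON animals (family)")
after = conn.execute(QUERY).fetchall()

print(before == after, [name for (name,) in after])
# True ['Great white', 'Hammerhead']
```

With a hand-written loop, the equivalent speed-up would mean rewriting the loop.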


Beyond SQL: Modern Query Approaches

Chapter 2 also explores different query styles:

  • SQL → relational

  • MongoDB aggregation → document pipelines

  • Cypher → graph queries

  • SPARQL → RDF graph queries

  • Datalog → rule-based logic (very powerful, slightly mind-bending)

Datalog in particular is interesting:

You define rules, not just queries—and they can be reused and composed
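
This is not Datalog itself, just a rough Python sketch of the idea: facts plus rules, applied repeatedly until nothing new can be derived. The located_in facts and the derive_within helper are invented for illustration.

```python
# Facts: located_in(child, parent). Hypothetical data.
located_in = {
    ("Manila", "Philippines"),
    ("Philippines", "Asia"),
    ("Tokyo", "Japan"),
    ("Japan", "Asia"),
}


def derive_within(facts):
    """Naive fixed-point evaluation of two rules:

    within(a, b) :- located_in(a, b).
    within(a, c) :- located_in(a, b), within(b, c).
    """
    within = set(facts)  # rule 1
    while True:
        # rule 2: chain a direct fact with anything already derived
        new = {(a, c) for (a, b) in facts for (b2, c) in within if b == b2}
        if new <= within:
            return within  # nothing new derived: fixed point reached
        within |= new


within = derive_within(located_in)
print(sorted(a for (a, c) in within if c == "Asia"))
# ['Japan', 'Manila', 'Philippines', 'Tokyo']
```

The second rule builds on the first, which is the reuse-and-compose property in miniature.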


The Object-Relational Mismatch (a.k.a. Why ORMs Exist)

Your app uses objects.
Your DB uses tables.

That mismatch creates:

  • Boilerplate

  • Complexity

  • Performance trade-offs

This is why:

  • ORMs exist

  • And also why people complain about ORMs 😅
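
A tiny sketch of the translation layer in question. The User class and the two-table layout (one row per user, one row per skill) are assumptions made up for this example; a real ORM automates this mapping and adds a lot more on top.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class User:
    id: int
    name: str
    skills: List[str] = field(default_factory=list)


def to_rows(user):
    """Flatten one object into rows for two hypothetical tables."""
    user_row = (user.id, user.name)
    skill_rows = [(user.id, skill) for skill in user.skills]
    return user_row, skill_rows


def from_rows(user_row, skill_rows):
    """Rebuild the object; this is the translation an ORM automates."""
    user_id, name = user_row
    return User(user_id, name, [s for (uid, s) in skill_rows if uid == user_id])


frank = User(1, "Frank", ["React", "AI"])
user_row, skill_rows = to_rows(frank)
print(user_row, skill_rows)  # (1, 'Frank') [(1, 'React'), (1, 'AI')]
```

One nested object becomes rows in two tables, and reading it back means stitching them together again.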


The Convergence Trend

Here’s the plot twist:

Databases are starting to look… the same.

  • Relational DBs now support JSON

  • Document DBs are adding joins

  • Hybrid systems are emerging

Kleppmann calls this out clearly:

👉 The future is likely a mix of models, not one winner


Practical Takeaways (No Fluff)

If you remember nothing else, remember this:

1. Your data model is a long-term decision

Changing it later = pain.


2. Model your relationships first

  • Few relationships → Document

  • Many relationships → Relational or Graph


3. Don’t blindly follow trends

“Use Mongo” or “Use Postgres” is not architecture.


4. Complexity moves somewhere

  • Not in the DB? → It’s in your app

  • Not in your app? → It’s in the DB

Pick your poison wisely.


Final Thoughts

Chapter 2 is basically Kleppmann saying:

“Your database choice is not a tooling decision.
It’s a thinking framework.”

And once you see it that way, you stop asking:

“What database should I use?”

And start asking:

“What shape does my data actually have?”

