Transactions: Keeping Your Data Sane in a Chaotic World

If Chapter 6 was about scaling your system,
Chapter 7 is about not corrupting your data while doing it.
Because here’s the uncomfortable truth:
Distributed systems don’t fail loudly.
They fail silently—with bad data.
And that’s worse.
🧠 What Is a Transaction?
A transaction is a way to group operations so they behave as one logical unit.
Either:
✅ everything succeeds
❌ nothing happens
No half-finished mess.
💡 Why Transactions Exist
Without transactions, you get:
missing data
duplicated records
inconsistent state
Example:
deduct money ✔
add money ❌
Congrats—you just invented a bug your users will definitely notice.
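Here's a minimal sketch of that transfer done right, using Python's built-in sqlite3. The table and account names are made up for illustration; the point is that `with conn:` commits on success and rolls back on any exception, so "deduct ✔, add ❌" can never be left half-finished.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 100), ("bob", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    try:
        with conn:  # opens a transaction: commit on success, rollback on error
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                         (amount, src))
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")  # triggers rollback
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
    except ValueError:
        pass  # the deduct was rolled back; no half-finished transfer

transfer(conn, "alice", "bob", 30)   # succeeds: 70 / 80
transfer(conn, "alice", "bob", 500)  # fails: balances stay 70 / 80
```

Either both updates land, or neither does.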
⚖️ ACID: The Classic Guarantees
Transactions are usually described with ACID:
🔒 Atomicity
All or nothing.
If something fails halfway:
👉 rollback everything
🧭 Consistency
Data stays valid according to rules.
(Not “consistent across replicas”—different meaning!)
🧱 Isolation
Transactions don’t interfere with each other.
Each one behaves like it’s alone in the system.
💾 Durability
Once committed:
👉 it stays, even after crashes
⚠️ Reality Check: ACID Is Not Absolute
Different systems interpret ACID differently.
Some databases:
relax isolation
weaken durability
Why?
👉 Performance and scalability.
🔥 Isolation Levels (Where Things Get Interesting)
Isolation is not binary.
It comes in levels—with trade-offs.
1. Read Committed
You only see committed data.
✅ avoids dirty reads
❌ still allows non-repeatable reads and phantoms
2. Repeatable Read
Same query = same result within a transaction
✅ more stable reads
❌ phantom reads can still slip through
3. Serializable (The Gold Standard)
Transactions behave as if executed one-by-one.
✅ safest
❌ slowest / hardest to scale
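You can watch isolation doing its job from Python. SQLite never allows dirty reads, so in this sketch a second connection only ever sees committed data; the file path is just a scratch location.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")
writer = sqlite3.connect(path)
reader = sqlite3.connect(path)
writer.execute("CREATE TABLE t (x INTEGER)")
writer.commit()

writer.execute("INSERT INTO t VALUES (1)")  # open transaction, not committed
uncommitted_view = reader.execute("SELECT COUNT(*) FROM t").fetchone()[0]

writer.commit()
committed_view = reader.execute("SELECT COUNT(*) FROM t").fetchone()[0]

print(uncommitted_view, committed_view)  # 0 1 (no dirty read)
```

The reader sees 0 rows while the write is in flight, and 1 row only after the commit.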
🧨 Common Concurrency Problems
Let’s talk about the bugs that ruin your day.
Dirty Reads
Reading uncommitted data
Non-Repeatable Reads
Same query → different result
Phantom Reads
New rows appear mid-transaction
Lost Updates (The Silent Killer)
Two users update the same data
👉 one overwrites the other
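The lost update is easy to reproduce. This sketch uses a plain dict standing in for a table, with the two "transactions" interleaved by hand: both read before either writes.

```python
store = {"likes": 10}

def read(key):            # step 1 of read-modify-write
    return store[key]

def write(key, value):    # step 2
    store[key] = value

a = read("likes")         # transaction A reads 10
b = read("likes")         # transaction B also reads 10
write("likes", a + 1)     # A writes 11
write("likes", b + 1)     # B also writes 11; A's update is silently lost

print(store["likes"])     # 11, not the expected 12
```

Nobody crashed, nothing errored, and the count is still wrong. That's why it's the silent killer.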
🧠 Concurrency Control Strategies
How do systems prevent chaos?
🔐 Locking (Pessimistic)
Lock data before modifying.
✅ Pros:
safe
predictable
❌ Cons:
slow
can deadlock
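Pessimistic locking in miniature, using Python's threading: take the lock *before* the read-modify-write, so increments can never interleave.

```python
import threading

counter = {"value": 0}
lock = threading.Lock()

def increment(n):
    for _ in range(n):
        with lock:                  # "lock data before modifying"
            counter["value"] += 1   # the read-modify-write is now safe

threads = [threading.Thread(target=increment, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter["value"])  # always 40000 with the lock
```

Drop the lock and the same code can lose updates; with it, the result is deterministic, at the cost of threads waiting their turn.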
⚡ MVCC (Multi-Version Concurrency Control)
Keep multiple versions of data.
Readers don’t block writers.
✅ Pros:
high performance
great for reads
❌ Cons:
more complex
Used by:
PostgreSQL
MySQL (InnoDB)
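Here's a toy version of the MVCC idea: every write appends a new version, and a reader takes a snapshot timestamp at begin and only sees versions committed at or before it. This is a sketch of the concept, not how PostgreSQL or InnoDB actually implement it.

```python
class MVCCStore:
    def __init__(self):
        self.clock = 0
        self.versions = {}  # key -> list of (commit_ts, value)

    def write(self, key, value):
        self.clock += 1     # assign a commit timestamp
        self.versions.setdefault(key, []).append((self.clock, value))

    def snapshot(self):
        return self.clock   # a reader's begin timestamp

    def read(self, key, ts):
        # newest version committed at or before the snapshot
        visible = [v for (t, v) in self.versions.get(key, []) if t <= ts]
        return visible[-1] if visible else None

store = MVCCStore()
store.write("x", "v1")
ts = store.snapshot()        # reader begins here
store.write("x", "v2")       # writer proceeds without blocking the reader
print(store.read("x", ts))               # v1: the reader's stable snapshot
print(store.read("x", store.snapshot())) # v2: a new reader sees the update
```

The writer never waited for the reader, and the reader's view never changed mid-flight. That's the trade: extra versions to store, in exchange for readers that don't block writers.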
🧪 Optimistic Concurrency
Assume no conflict.
Check before commit.
✅ Pros:
fast when conflicts are rare
❌ Cons:
retries needed
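A sketch of the optimistic pattern with a version number: a commit succeeds only if the version hasn't changed since the read, and a stale commit is rejected rather than silently overwriting. The class and method names are illustrative.

```python
class VersionedCell:
    def __init__(self, value):
        self.value = value
        self.version = 0

    def read(self):
        return self.value, self.version

    def commit(self, new_value, expected_version):
        if self.version != expected_version:
            return False            # conflict: someone else committed first
        self.value = new_value
        self.version += 1
        return True

def add_with_retry(cell, delta):
    while True:                     # the retry loop: the cost of optimism
        value, version = cell.read()
        if cell.commit(value + delta, version):
            return

cell = VersionedCell(10)
stale_value, stale_version = cell.read()  # read, then lose the race...
add_with_retry(cell, 1)                   # ...to another writer
assert not cell.commit(stale_value + 1, stale_version)  # rejected, not lost
add_with_retry(cell, 1)
print(cell.value)  # 12
```

No locks were held while the work happened; the check at commit time is what keeps the lost update from the earlier section off the table.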
🌍 Distributed Transactions (The Hard Mode)
Now add multiple services or databases.
Things get… complicated.
Two-Phase Commit (2PC)
Prepare phase: the coordinator asks every participant, "can you commit?"
Commit phase: only if everyone votes yes does it tell them all to commit.
Problem:
slow
can block
fragile in failures
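The two phases can be sketched in a few lines: a coordinator asks every participant to prepare, and only if all vote yes does anyone commit. Participant names are illustrative, and this toy ignores the hard parts (timeouts, crashes, blocked participants) that make real 2PC fragile.

```python
class Participant:
    def __init__(self, name, will_prepare=True):
        self.name = name
        self.will_prepare = will_prepare
        self.state = "idle"

    def prepare(self):              # phase 1: vote and promise to commit
        self.state = "prepared" if self.will_prepare else "aborted"
        return self.will_prepare

    def commit(self):               # phase 2, on unanimous yes
        self.state = "committed"

    def abort(self):                # phase 2, on any no
        self.state = "aborted"

def two_phase_commit(participants):
    votes = [p.prepare() for p in participants]  # phase 1: collect all votes
    if all(votes):
        for p in participants:
            p.commit()                           # phase 2: commit everywhere
        return True
    for p in participants:
        p.abort()                                # one "no" aborts everyone
    return False

ok = two_phase_commit([Participant("orders"), Participant("inventory")])
bad = two_phase_commit([Participant("orders"),
                        Participant("inventory", will_prepare=False)])
print(ok, bad)  # True False
```

Notice the weak spot even in the toy: between prepare and commit, every participant is stuck waiting on the coordinator. If it dies right there, they block.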
⚠️ Why Many Systems Avoid Distributed Transactions
Modern architectures (microservices, event-driven systems) often:
👉 avoid strict transactions entirely
Instead, they use:
eventual consistency
retries
idempotency
compensation logic
Because:
strict correctness across services = expensive and fragile
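Idempotency is the workhorse of that list, and it's simpler than it sounds. A sketch: tag every request with an ID, remember which IDs were handled, and make retries no-ops. The dict here stands in for a dedupe table; names are illustrative.

```python
processed = {}            # request_id -> result, stands in for a dedupe table
balance = {"amount": 100}

def charge(request_id, amount):
    if request_id in processed:        # retry of an already-handled request
        return processed[request_id]   # same result back, no side effects
    balance["amount"] -= amount
    processed[request_id] = {"status": "charged", "amount": amount}
    return processed[request_id]

charge("req-42", 30)
charge("req-42", 30)      # network retry: deducts nothing the second time
print(balance["amount"])  # 70, not 40
```

Now "just retry it" is a safe recovery strategy instead of a way to double-charge customers.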
🔄 The Shift in Thinking
Old mindset:
“We need strong consistency everywhere.”
New mindset:
“We need correctness where it matters—and flexibility where it doesn’t.”
💡 Real-World Example
Think e-commerce:
Do you really need:
inventory
payment
email
analytics
…all in one strict transaction?
No.
You:
ensure payment correctness
tolerate delays elsewhere
🧠 Final Takeaways
Transactions protect data integrity
ACID is a spectrum, not a guarantee
Isolation levels define behavior under concurrency
Distributed transactions are painful—avoid when possible
Modern systems favor pragmatic consistency
🔥 The Big Idea
Transactions are not about perfection.
They’re about controlling chaos just enough to keep data trustworthy.
Master this, and you stop fearing concurrency bugs—
and start designing systems that handle them gracefully.