🚀 From WebSockets to AI Agents: Building chatmesh (and Breaking It on Purpose)

Most developers build features. I wanted to build systems. So instead of another CRUD app, I started building a real-time chat backend in Go—something simple on the surface, but brutal underneath.
👉 Repo: chatmesh on GitHub
(Yes, I intentionally picked something that would break once it scales.)
🧠 Why chat?
Because chat forces you to deal with problems most apps avoid:
Persistent connections (WebSockets)
Concurrency
State management
Real-time delivery
Horizontal scaling
It’s not just “send message → receive message.”
It’s:
“What happens when 1,000 users connect across 5 servers?”
⚙️ Phase 1 — Everything Works (and That’s the Problem)
I built a single-node WebSocket server in Go.
Each connection:
has a read loop
has a write loop
communicates via channels
A central hub:
tracks clients
manages rooms
broadcasts messages
And guess what?
It worked perfectly.
That’s the trap.
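The single-node design above can be sketched in a few dozen lines of Go. This is an illustrative sketch, not chatmesh's actual code: the names `Hub` and `Client` are my own, and the real server would also run a read loop pulling frames off a WebSocket (e.g. via gorilla/websocket) rather than the bare channels shown here.

```go
package main

import (
	"fmt"
	"sync"
)

// Client is one connection. In a real server each client also owns
// a read loop draining the socket; here we keep only the write side
// to show the channel wiring between hub and client.
type Client struct {
	id   string
	send chan string // hub -> client; the write loop drains this
}

// Hub owns all connection state for a single node.
type Hub struct {
	mu        sync.Mutex
	clients   map[*Client]bool
	broadcast chan string
}

func newHub() *Hub {
	return &Hub{
		clients:   make(map[*Client]bool),
		broadcast: make(chan string),
	}
}

func (h *Hub) register(c *Client) {
	h.mu.Lock()
	defer h.mu.Unlock()
	h.clients[c] = true
}

// run is the hub's event loop: one goroutine fans every
// broadcast message out to every connected client.
func (h *Hub) run() {
	for msg := range h.broadcast {
		h.mu.Lock()
		for c := range h.clients {
			c.send <- msg
		}
		h.mu.Unlock()
	}
}

func main() {
	hub := newHub()
	go hub.run()

	a := &Client{id: "A", send: make(chan string, 1)}
	b := &Client{id: "B", send: make(chan string, 1)}
	hub.register(a)
	hub.register(b)

	hub.broadcast <- "hello"
	fmt.Println(a.id, <-a.send)
	fmt.Println(b.id, <-b.send)
}
```

Note what makes this feel clean: everything lives in one process, so `h.clients` is the complete truth about who is connected. That assumption is exactly what breaks next.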
💥 Phase 2 — Add One More Server (Everything Breaks)
I introduced:
multiple instances
a load balancer
Now:
User A → Server 1
User B → Server 2
Messages stopped syncing.
Nothing crashed.
But the system was fundamentally broken.
🔥 The Core Problem
WebSockets are stateful.
That means:
connections live inside a specific server
memory is not shared
load balancers are blind to connection state
So scaling horizontally causes:
fragmented reality
🧠 Phase 3 — Thinking Like a Systems Engineer
The fix wasn’t “better code.”
It was:
change how the system communicates
I introduced a Pub/Sub layer (Redis).
Now:
Client → Server → Pub/Sub → All Servers → Clients
Each instance:
publishes messages
subscribes to events
broadcasts locally
Now the system:
scales horizontally
stays consistent
behaves like one system
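The Client → Server → Pub/Sub → All Servers → Clients flow can be sketched like this. To keep the sketch runnable without a Redis server, a tiny in-process `Broker` stands in for Redis Pub/Sub; in chatmesh that role would be played by an actual Redis client (e.g. go-redis `PUBLISH`/`SUBSCRIBE`). The `Instance` type and its method names are illustrative.

```go
package main

import (
	"fmt"
	"sync"
)

// Broker stands in for Redis Pub/Sub so this sketch runs
// without a Redis server.
type Broker struct {
	mu   sync.Mutex
	subs []chan string
}

func (b *Broker) Subscribe() <-chan string {
	b.mu.Lock()
	defer b.mu.Unlock()
	ch := make(chan string, 8)
	b.subs = append(b.subs, ch)
	return ch
}

func (b *Broker) Publish(msg string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	for _, ch := range b.subs {
		ch <- msg
	}
}

// Instance is one chat server. It never talks to other instances
// directly: it publishes every client message to the broker, and
// rebroadcasts whatever arrives on its subscription to its own
// locally connected clients.
type Instance struct {
	name   string
	broker *Broker
	local  chan string // stands in for "broadcast to local WebSocket clients"
}

func (s *Instance) Start() {
	events := s.broker.Subscribe()
	go func() {
		for msg := range events {
			s.local <- msg // deliver to clients connected to this node
		}
	}()
}

// HandleClientMessage is what the WebSocket read loop would call.
func (s *Instance) HandleClientMessage(msg string) {
	s.broker.Publish(msg) // every instance, including this one, sees it
}

func main() {
	broker := &Broker{}
	s1 := &Instance{name: "server-1", broker: broker, local: make(chan string, 8)}
	s2 := &Instance{name: "server-2", broker: broker, local: make(chan string, 8)}
	s1.Start()
	s2.Start()

	// User A is connected to server-1; User B to server-2.
	s1.HandleClientMessage("hi from A")

	fmt.Println(s1.name, "delivers:", <-s1.local)
	fmt.Println(s2.name, "delivers:", <-s2.local)
}
```

The design choice worth noticing: each instance still only broadcasts to its own clients, exactly like the single-node hub. The Pub/Sub layer is what makes every instance see every message, so no server needs to know where any particular user is connected.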
⚡ The Shift
At this point, chatmesh stopped being “a chat app.”
It became:
a real-time event system
a distributed messaging layer
a foundation for something bigger
🤖 Why This Matters for AI Agents
Modern AI systems aren’t just models.
They’re systems that:
communicate in real-time
maintain shared state
coordinate actions
stream responses
That’s exactly what this architecture enables.
Your “chat backend” becomes:
an AI agent communication layer
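Concretely: a streamed model response is just a sequence of small events pushed through the same broadcast path, one chunk at a time instead of one finished message. A minimal sketch, with a faked token source (a real implementation would read chunks from a model API) and a plain channel standing in for the per-client send channel:

```go
package main

import (
	"fmt"
	"strings"
)

// streamResponse delivers a response chunk by chunk instead of
// waiting for the full text. out stands in for the hub's
// per-client send channel; in a distributed setup each chunk
// would be published to the Pub/Sub layer instead.
func streamResponse(tokens []string, out chan<- string) {
	for _, tok := range tokens {
		out <- tok
	}
	close(out)
}

func main() {
	out := make(chan string, 8)
	go streamResponse([]string{"Hello", ", ", "world"}, out)

	var b strings.Builder
	for tok := range out {
		b.WriteString(tok) // the client renders incrementally
	}
	fmt.Println(b.String())
}
```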
🧩 What’s Next for chatmesh
The roadmap:
✅ WebSocket real-time layer
⏳ Redis Pub/Sub (scaling)
⏳ Streaming LLM responses
⏳ Tool execution layer
⏳ Agent memory (vector DB / Redis)
⏳ Multi-agent coordination
🧠 Lessons Learned
1. Working locally means nothing
Single-node success is misleading.
2. Scaling is not an add-on
It changes your architecture completely.
3. State is your biggest enemy
Especially when it’s in memory.
4. Distributed systems = communication problems
Not coding problems.
5. Build things that break
That’s where real learning happens.
🚀 Final Thought
Most portfolios show:
dashboards
APIs
CRUD apps
Few show:
real-time systems
distributed architecture
scaling failures
That difference?
That’s where engineering starts.
🔗 Check the Project
👉 chatmesh repository
If you’re building something similar—or intentionally breaking your own system to learn—then you’re on the right path.