Scaling WebSocket Connections to 1 Million Users: Lessons from Building Wibe Chat
When we started building Wibe Chat, our prototype handled 100 connections beautifully. At 1,000, things got interesting. At 10,000, we rewrote our connection layer. At 100,000, we rewrote it again.
Scaling WebSocket connections is fundamentally different from scaling HTTP APIs. With HTTP, each request is independent. With WebSockets, each connection is stateful and persistent.
Here's what we learned building infrastructure that handles over a million concurrent connections.
The Challenge
A WebSocket connection is a persistent TCP socket. Unlike HTTP requests that complete in milliseconds, a WebSocket connection lives as long as the user's session.
This means each connection consumes server memory (20-50KB), holds a file descriptor (the default Linux soft limit is 1,024 open files per process), requires regular heartbeat pings, and needs state so messages can be routed to the specific server instance holding it.
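The file descriptor ceiling is usually the first one to bite. As a minimal sketch (Go on Linux, illustrative rather than our production startup code), a connection server can raise its soft limit toward the hard limit at boot:

```go
package main

import (
	"log"
	"syscall"
)

// raiseFDLimit lifts the soft open-file limit up to the hard limit so a single
// process can hold tens of thousands of sockets. Illustrative sketch only.
func raiseFDLimit() error {
	var rl syscall.Rlimit
	if err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &rl); err != nil {
		return err
	}
	rl.Cur = rl.Max // soft limit may be raised as far as the hard limit
	return syscall.Setrlimit(syscall.RLIMIT_NOFILE, &rl)
}

func main() {
	if err := raiseFDLimit(); err != nil {
		log.Fatalf("raising RLIMIT_NOFILE: %v", err)
	}
	var rl syscall.Rlimit
	syscall.Getrlimit(syscall.RLIMIT_NOFILE, &rl)
	log.Printf("open-file soft limit now %d", rl.Cur)
}
```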
At 10,000 connections, a single server handles it. At 100,000, you need multiple servers with coordination. At 1,000,000, you need a distributed system with edge nodes.
Our Architecture
Layer 1: Edge Nodes. Lightweight connection servers in 30+ locations. Each handles up to 50,000 connections. Their job: accept connections, handle heartbeats, route messages. Stateless for business logic — fast, cheap, replaceable.
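To make the heartbeat job concrete, here is a minimal sketch of an edge-style connection handler in Go using the gorilla/websocket package — an assumption for illustration, not necessarily our stack, and the timeouts are placeholder values:

```go
package main

import (
	"log"
	"net/http"
	"time"

	"github.com/gorilla/websocket"
)

const (
	pingInterval = 20 * time.Second // how often we ping the client
	pongWait     = 50 * time.Second // how long we wait for any read (including pongs)
)

var upgrader = websocket.Upgrader{ReadBufferSize: 1024, WriteBufferSize: 1024}

// serveWS upgrades the HTTP request, then keeps the connection alive with
// pings and drops it if the client stops answering. Illustrative sketch.
func serveWS(w http.ResponseWriter, r *http.Request) {
	conn, err := upgrader.Upgrade(w, r, nil)
	if err != nil {
		return
	}
	defer conn.Close()

	conn.SetReadDeadline(time.Now().Add(pongWait))
	conn.SetPongHandler(func(string) error {
		// Any pong pushes the read deadline forward.
		return conn.SetReadDeadline(time.Now().Add(pongWait))
	})

	// Writer: periodic heartbeat pings.
	go func() {
		ticker := time.NewTicker(pingInterval)
		defer ticker.Stop()
		for range ticker.C {
			if err := conn.WriteControl(websocket.PingMessage, nil, time.Now().Add(5*time.Second)); err != nil {
				return // connection is gone; stop pinging
			}
		}
	}()

	// Reader: forward client messages to the router layer (omitted here).
	for {
		if _, _, err := conn.ReadMessage(); err != nil {
			return // dead or closed connection
		}
	}
}

func main() {
	http.HandleFunc("/ws", serveWS)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```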
Layer 2: Message Router. Validates messages, persists to database, looks up channel subscribers, determines which edge nodes hold those connections, fans out messages. Uses consistent hashing to distribute channels across router instances.
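The consistent-hashing piece is simpler than it sounds. A stripped-down ring with virtual nodes (Go, crc32 for brevity; our production router differs) shows how a channel ID maps to a router instance:

```go
package main

import (
	"fmt"
	"hash/crc32"
	"sort"
	"strconv"
)

// Ring maps channel IDs onto router instances with consistent hashing, so
// adding or removing a router only moves a small fraction of channels.
type Ring struct {
	hashes  []uint32          // sorted virtual-node hashes
	routers map[uint32]string // virtual-node hash -> router address
}

func NewRing(routers []string, vnodes int) *Ring {
	r := &Ring{routers: make(map[uint32]string)}
	for _, addr := range routers {
		for i := 0; i < vnodes; i++ {
			h := crc32.ChecksumIEEE([]byte(addr + "#" + strconv.Itoa(i)))
			r.hashes = append(r.hashes, h)
			r.routers[h] = addr
		}
	}
	sort.Slice(r.hashes, func(i, j int) bool { return r.hashes[i] < r.hashes[j] })
	return r
}

// Route returns the router instance responsible for a channel.
func (r *Ring) Route(channelID string) string {
	h := crc32.ChecksumIEEE([]byte(channelID))
	i := sort.Search(len(r.hashes), func(i int) bool { return r.hashes[i] >= h })
	if i == len(r.hashes) {
		i = 0 // wrap around the ring
	}
	return r.routers[r.hashes[i]]
}

func main() {
	ring := NewRing([]string{"router-a", "router-b", "router-c"}, 128)
	fmt.Println(ring.Route("channel:general")) // e.g. "router-b"
}
```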
Layer 3: Storage. Distributed database optimized for time-series appends. In-memory cache for hot channels (95%+ cache hit rate) plus persistent storage for history.
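The read path is roughly: check the in-memory cache for the channel, fall back to persistent storage on a miss, and cache the result. A simplified sketch — the Store interface and types here are hypothetical stand-ins, not our actual schema:

```go
package main

import (
	"fmt"
	"sync"
)

// Message is a minimal stand-in for a stored chat message.
type Message struct {
	ID      int64
	Channel string
	Body    string
}

// Store is a hypothetical interface over the persistent message database.
type Store interface {
	RecentMessages(channel string, limit int) ([]Message, error)
}

// HotCache keeps recent messages for active channels in memory so most
// reads never touch the database.
type HotCache struct {
	mu     sync.RWMutex
	store  Store
	byChan map[string][]Message
}

func NewHotCache(store Store) *HotCache {
	return &HotCache{store: store, byChan: make(map[string][]Message)}
}

// Recent serves from memory when possible and falls back to the store,
// caching the result for the next reader.
func (c *HotCache) Recent(channel string, limit int) ([]Message, error) {
	c.mu.RLock()
	msgs, ok := c.byChan[channel]
	c.mu.RUnlock()
	if ok {
		return msgs, nil // cache hit: the common case for hot channels
	}

	msgs, err := c.store.RecentMessages(channel, limit)
	if err != nil {
		return nil, err
	}
	c.mu.Lock()
	c.byChan[channel] = msgs
	c.mu.Unlock()
	return msgs, nil
}

// stubStore is a toy Store used only to make the example runnable.
type stubStore struct{}

func (stubStore) RecentMessages(channel string, limit int) ([]Message, error) {
	return []Message{{ID: 1, Channel: channel, Body: "welcome"}}, nil
}

func main() {
	cache := NewHotCache(stubStore{})
	msgs, _ := cache.Recent("channel:general", 50) // miss: hits the store
	msgs, _ = cache.Recent("channel:general", 50)  // hit: served from memory
	fmt.Println(len(msgs), "messages")
}
```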
Handling Reconnections
We spent more time on reconnection logic than any other feature. When a client reconnects, it presents a "last seen" message ID. The edge node fetches all messages since that ID and delivers them. For long offline periods, messages are batched into compressed chunks — the client renders the first batch immediately while the rest loads.
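Server-side, the resume flow boils down to "fetch everything after the last seen ID and deliver it in bounded batches." A hedged sketch of that loop in Go (fetchSince and sendBatch are hypothetical hooks, not our SDK's API):

```go
package main

import "fmt"

// Message is a minimal stand-in for a stored chat message.
type Message struct {
	ID   int64
	Body string
}

const batchSize = 500

// resumeSession replays everything after lastSeenID in bounded batches, so the
// client can render the first batch while the rest streams in.
func resumeSession(
	channel string,
	lastSeenID int64,
	fetchSince func(channel string, afterID int64, limit int) ([]Message, error),
	sendBatch func(msgs []Message) error,
) error {
	cursor := lastSeenID
	for {
		batch, err := fetchSince(channel, cursor, batchSize)
		if err != nil {
			return err
		}
		if len(batch) == 0 {
			return nil // client is caught up
		}
		if err := sendBatch(batch); err != nil {
			return err // connection dropped mid-replay; client retries with its new last seen ID
		}
		cursor = batch[len(batch)-1].ID // advance past the last delivered message
	}
}

func main() {
	// Toy in-memory backlog standing in for the real message store.
	backlog := []Message{{ID: 101, Body: "hi"}, {ID: 102, Body: "hello"}, {ID: 103, Body: "hey"}}
	fetch := func(_ string, after int64, limit int) ([]Message, error) {
		var out []Message
		for _, m := range backlog {
			if m.ID > after && len(out) < limit {
				out = append(out, m)
			}
		}
		return out, nil
	}
	send := func(msgs []Message) error { fmt.Println("delivering", len(msgs), "messages"); return nil }
	_ = resumeSession("channel:general", 101, fetch, send)
}
```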
Numbers
- Concurrent connections: 1M+ routinely
- Message latency (p50): 47ms globally
- Message latency (p99): 120ms globally
- Connection success rate: 99.7% on first attempt
- Reconnection time: Median 1.2 seconds
- Memory per connection: 32KB average
- Peak throughput: 500K messages/second
- Uptime: 99.99% over 12 months
What We'd Do Differently
Start with edge nodes from day one. We initially ran everything in a single region. The migration to edge was painful.
Invest in observability early. Real-time systems fail subtly. A message delivered in 200ms instead of 50ms triggers no alert, but users notice.
Don't underestimate mobile networks. A huge percentage of users are on 3G/4G. Designing for high-latency, low-bandwidth from the start saves rewrites.
The Takeaway
Scaling WebSocket connections gets exponentially harder with each order of magnitude. 1K to 10K is a config change. 10K to 100K is an architecture change. 100K to 1M is a distributed systems problem.
If real-time messaging is a feature (not the product), use an SDK and save months of engineering.