WebSocket Architecture for Real-Time Messaging
How to design a WebSocket architecture for real-time messaging systems, covering connection management, heartbeat patterns, reconnection strategies, and scaling considerations used in Whisper.
Real-time communication requires a persistent, bidirectional channel between client and server. HTTP's request-response model does not support this — the server cannot push data to the client without the client asking first. WebSockets solve this by establishing a long-lived TCP connection that either side can use to send messages at any time.
Whisper, Klivvr's notification and messaging service, uses WebSockets as its primary real-time delivery channel. This article covers the architecture decisions behind Whisper's WebSocket layer, from connection lifecycle management to scaling patterns.
Connection Lifecycle
A WebSocket connection goes through four phases: handshake, authentication, active communication, and termination. Each phase needs explicit handling.
import { WebSocketServer, WebSocket } from "ws";
import { IncomingMessage } from "http";
interface ClientConnection {
id: string;
userId: string;
socket: WebSocket;
connectedAt: Date;
lastPingAt: Date;
subscriptions: Set<string>;
metadata: Record<string, unknown>;
}
class WhisperWebSocketServer {
private wss: WebSocketServer;
private connections = new Map<string, ClientConnection>();
private userConnections = new Map<string, Set<string>>();
constructor(port: number) {
this.wss = new WebSocketServer({ port });
this.wss.on("connection", (socket, request) =>
this.handleConnection(socket, request)
);
}
private async handleConnection(
socket: WebSocket,
request: IncomingMessage
): Promise<void> {
// Phase 1: Authentication
const token = this.extractToken(request);
if (!token) {
socket.close(4001, "Authentication required");
return;
}
const user = await this.authenticateToken(token);
if (!user) {
socket.close(4003, "Invalid token");
return;
}
// Phase 2: Register connection
const connectionId = crypto.randomUUID();
const connection: ClientConnection = {
id: connectionId,
userId: user.id,
socket,
connectedAt: new Date(),
lastPingAt: new Date(),
subscriptions: new Set(),
metadata: { deviceType: request.headers["x-device-type"] },
};
this.connections.set(connectionId, connection);
this.addUserConnection(user.id, connectionId);
// Phase 3: Setup message handling
socket.on("message", (data) =>
this.handleMessage(connection, data.toString())
);
socket.on("close", () => this.handleDisconnect(connection));
socket.on("error", (error) => this.handleError(connection, error));
// Send connection acknowledgment
this.send(connection, {
type: "connected",
connectionId,
serverTime: new Date().toISOString(),
});
// Deliver any queued messages
await this.deliverPendingMessages(user.id, connection);
}
private send(connection: ClientConnection, data: unknown): void {
if (connection.socket.readyState === WebSocket.OPEN) {
connection.socket.send(JSON.stringify(data));
}
}
private handleDisconnect(connection: ClientConnection): void {
this.connections.delete(connection.id);
this.removeUserConnection(connection.userId, connection.id);
}
private addUserConnection(userId: string, connId: string): void {
if (!this.userConnections.has(userId)) {
this.userConnections.set(userId, new Set());
}
this.userConnections.get(userId)!.add(connId);
}
private removeUserConnection(userId: string, connId: string): void {
this.userConnections.get(userId)?.delete(connId);
if (this.userConnections.get(userId)?.size === 0) {
this.userConnections.delete(userId);
}
}
private extractToken(request: IncomingMessage): string | null {
const url = new URL(request.url ?? "", "ws://localhost");
return url.searchParams.get("token");
}
private async authenticateToken(
token: string
): Promise<{ id: string } | null> {
// Verify JWT or session token
return tokenService.verify(token);
}
private async deliverPendingMessages(
userId: string,
connection: ClientConnection
): Promise<void> {
const pending = await messageQueue.getPending(userId);
for (const message of pending) {
this.send(connection, message);
await messageQueue.acknowledge(message.id);
}
}
}Heartbeat and Connection Health
WebSocket connections can silently die — the client loses network connectivity, a proxy times out, or a load balancer closes an idle connection. Without active health checking, the server holds stale connections that consume memory and produce delivery failures.
class HeartbeatManager {
private interval: NodeJS.Timeout | null = null;
private pingIntervalMs: number;
private timeoutMs: number;
constructor(pingIntervalMs: number = 30000, timeoutMs: number = 10000) {
this.pingIntervalMs = pingIntervalMs;
this.timeoutMs = timeoutMs;
}
start(
connections: Map<string, ClientConnection>,
onTimeout: (connection: ClientConnection) => void
): void {
this.interval = setInterval(() => {
const now = Date.now();
for (const [id, connection] of connections) {
const timeSinceLastPing = now - connection.lastPingAt.getTime();
if (timeSinceLastPing > this.pingIntervalMs + this.timeoutMs) {
// Connection has not responded to ping — terminate
onTimeout(connection);
continue;
}
if (timeSinceLastPing > this.pingIntervalMs) {
// Time to send a ping
if (connection.socket.readyState === WebSocket.OPEN) {
connection.socket.ping();
}
}
}
}, this.pingIntervalMs / 2);
}
stop(): void {
if (this.interval) {
clearInterval(this.interval);
}
}
}The client mirrors this with its own heartbeat handler, responding to server pings and detecting when the connection has been lost.
Message Protocol
Whisper uses a typed message protocol that distinguishes between different message categories and supports acknowledgment.
interface WhisperMessage {
id: string;
type: MessageType;
channel?: string;
payload: unknown;
timestamp: string;
requiresAck: boolean;
}
type MessageType =
| "notification" // Push notification content
| "transaction_alert" // Real-time transaction event
| "chat_message" // In-app chat message
| "system_event" // System status updates
| "presence" // User online/offline status
| "ack" // Acknowledgment
| "subscribe" // Subscribe to a channel
| "unsubscribe"; // Unsubscribe from a channel
class MessageRouter {
private handlers = new Map<
MessageType,
(conn: ClientConnection, msg: WhisperMessage) => Promise<void>
>();
register(
type: MessageType,
handler: (conn: ClientConnection, msg: WhisperMessage) => Promise<void>
): void {
this.handlers.set(type, handler);
}
async route(
connection: ClientConnection,
raw: string
): Promise<void> {
const message: WhisperMessage = JSON.parse(raw);
const handler = this.handlers.get(message.type);
if (!handler) {
console.warn(`No handler for message type: ${message.type}`);
return;
}
await handler(connection, message);
if (message.requiresAck) {
connection.socket.send(
JSON.stringify({
id: crypto.randomUUID(),
type: "ack",
payload: { messageId: message.id },
timestamp: new Date().toISOString(),
requiresAck: false,
})
);
}
}
}Scaling with Multiple Server Instances
A single WebSocket server has a limit on the number of concurrent connections it can handle — typically 50,000-100,000 depending on hardware and message volume. Scaling beyond that requires multiple server instances with a coordination layer.
import { Redis } from "ioredis";
class DistributedWhisperServer {
private localConnections = new Map<string, ClientConnection>();
private redis: Redis;
private pubsub: Redis;
private serverId: string;
constructor(redis: Redis) {
this.redis = redis;
this.pubsub = redis.duplicate();
this.serverId = crypto.randomUUID();
}
async start(): Promise<void> {
// Subscribe to cross-server messages
await this.pubsub.subscribe("whisper:broadcast");
this.pubsub.on("message", (channel, data) => {
const message = JSON.parse(data);
if (message.sourceServer === this.serverId) return;
this.deliverLocally(message.userId, message.payload);
});
}
async sendToUser(userId: string, payload: unknown): Promise<boolean> {
// Try local delivery first
if (this.deliverLocally(userId, payload)) {
return true;
}
// User might be on another server — publish to Redis
await this.redis.publish(
"whisper:broadcast",
JSON.stringify({
sourceServer: this.serverId,
userId,
payload,
})
);
return true;
}
private deliverLocally(userId: string, payload: unknown): boolean {
// Find all local connections for this user
let delivered = false;
for (const [, conn] of this.localConnections) {
if (conn.userId === userId && conn.socket.readyState === WebSocket.OPEN) {
conn.socket.send(JSON.stringify(payload));
delivered = true;
}
}
return delivered;
}
async registerConnection(userId: string, connId: string): Promise<void> {
// Track which server handles which user
await this.redis.sadd(`whisper:user:${userId}:servers`, this.serverId);
await this.redis.expire(`whisper:user:${userId}:servers`, 3600);
}
}Conclusion
WebSocket architecture for real-time messaging requires attention to every phase of the connection lifecycle — authentication, health monitoring, message routing, and graceful termination. Whisper's architecture handles these concerns through typed message protocols, server-side heartbeat monitoring, acknowledgment-based delivery guarantees, and Redis-coordinated multi-server scaling. The result is a real-time channel that delivers notifications, transaction alerts, and chat messages to Klivvr's users with sub-second latency and reliable delivery.
Related Articles
Omnichannel Messaging Strategy for Fintech
How to build a unified omnichannel messaging strategy across push, SMS, email, and in-app channels — covering channel selection, message consistency, and unified customer experience.
Real-Time Notifications in Banking
How real-time notifications transform the banking experience, covering transaction alerts, security notifications, compliance requirements, and the business impact of instant communication.
Scaling Notification Systems
Strategies for scaling notification systems to millions of users, covering horizontal scaling, fan-out patterns, rate limiting, and infrastructure design — with patterns from Whisper.