Scaling Notification Systems
Strategies for scaling notification systems to millions of users, covering horizontal scaling, fan-out patterns, rate limiting, and infrastructure design — with patterns from Whisper.
A notification system that works for 10,000 users will not work for 10 million. As user base and notification volume grow, every component faces scaling pressure: the WebSocket servers that maintain persistent connections, the message queues that buffer notifications, the push service integrations that deliver to devices, and the database that tracks delivery status.
This article covers the scaling strategies Whisper uses to deliver notifications reliably at scale.
Horizontal Scaling of WebSocket Servers
WebSocket servers are stateful — each server holds a set of active connections. Scaling requires distributing connections across multiple server instances and coordinating message delivery between them.
class ScalableWhisperCluster {
private serverId: string;
private redis: Redis;
private localConnections = new Map<string, WebSocket>();
constructor(redis: Redis) {
this.redis = redis;
this.serverId = `whisper-${process.env.HOSTNAME}-${process.pid}`;
}
async registerConnection(userId: string, socket: WebSocket): Promise<void> {
this.localConnections.set(userId, socket);
// Register in Redis: which server handles which user
await this.redis.hset(
"whisper:connections",
userId,
this.serverId
);
await this.redis.expire("whisper:connections", 3600);
}
async sendToUser(userId: string, message: unknown): Promise<boolean> {
// Try local delivery
const localSocket = this.localConnections.get(userId);
if (localSocket?.readyState === WebSocket.OPEN) {
localSocket.send(JSON.stringify(message));
return true;
}
// Look up which server has this user
const targetServer = await this.redis.hget("whisper:connections", userId);
if (!targetServer || targetServer === this.serverId) {
return false; // User not connected
}
// Route through Redis pub/sub to the correct server
await this.redis.publish(
`whisper:server:${targetServer}`,
JSON.stringify({ userId, message })
);
return true;
}
async startListening(): Promise<void> {
const subscriber = this.redis.duplicate();
await subscriber.subscribe(`whisper:server:${this.serverId}`);
subscriber.on("message", (channel, data) => {
const { userId, message } = JSON.parse(data);
const socket = this.localConnections.get(userId);
if (socket?.readyState === WebSocket.OPEN) {
socket.send(JSON.stringify(message));
}
});
}
}Fan-Out Pattern for Broadcast Notifications
Some notifications target many users simultaneously — a system maintenance announcement, a new feature rollout, or a market-wide alert. The fan-out pattern distributes these bulk operations efficiently.
class NotificationFanOut {
private batchSize: number;
private concurrency: number;
private pushService: UnifiedPushService;
constructor(config: { batchSize: number; concurrency: number }) {
this.batchSize = config.batchSize;
this.concurrency = config.concurrency;
}
async broadcastToSegment(
segmentQuery: string,
template: NotificationTemplate,
variables: Record<string, unknown>
): Promise<BroadcastResult> {
const stats = { total: 0, sent: 0, failed: 0, skipped: 0 };
// Stream users from database in batches
let offset = 0;
let batch: User[];
do {
batch = await this.getUserBatch(segmentQuery, offset, this.batchSize);
offset += this.batchSize;
// Process batch with controlled concurrency
const results = await this.processBatch(batch, template, variables);
stats.total += batch.length;
stats.sent += results.filter((r) => r.success).length;
stats.failed += results.filter((r) => !r.success).length;
stats.skipped += results.filter((r) => r.skipped).length;
} while (batch.length === this.batchSize);
return stats;
}
private async processBatch(
users: User[],
template: NotificationTemplate,
variables: Record<string, unknown>
): Promise<Array<{ success: boolean; skipped: boolean }>> {
const semaphore = new Semaphore(this.concurrency);
const results: Array<{ success: boolean; skipped: boolean }> = [];
await Promise.all(
users.map(async (user) => {
await semaphore.acquire();
try {
// Check rate limit before sending
const allowed = await rateLimiter.check(user.id);
if (!allowed) {
results.push({ success: false, skipped: true });
return;
}
const rendered = templateRenderer.render(
template,
"push",
user.locale,
{ ...variables, userName: user.name }
);
await this.pushService.sendToUser(user.id, {
title: rendered.title ?? "",
body: rendered.body,
priority: "normal",
});
results.push({ success: true, skipped: false });
} catch {
results.push({ success: false, skipped: false });
} finally {
semaphore.release();
}
})
);
return results;
}
private async getUserBatch(
query: string,
offset: number,
limit: number
): Promise<User[]> {
const result = await db.query(
`SELECT id, name, locale FROM users WHERE ${query} ORDER BY id LIMIT $1 OFFSET $2`,
[limit, offset]
);
return result.rows;
}
}
class Semaphore {
private permits: number;
private waiting: Array<() => void> = [];
constructor(permits: number) {
this.permits = permits;
}
async acquire(): Promise<void> {
if (this.permits > 0) {
this.permits--;
return;
}
return new Promise((resolve) => this.waiting.push(resolve));
}
release(): void {
if (this.waiting.length > 0) {
this.waiting.shift()!();
} else {
this.permits++;
}
}
}Connection Pooling and Resource Management
Each WebSocket connection consumes memory for the socket buffer, the connection state, and any subscriptions. At scale, resource management becomes critical.
class ConnectionResourceManager {
private maxConnectionsPerServer: number;
private currentConnections = 0;
constructor(maxConnections: number = 50000) {
this.maxConnectionsPerServer = maxConnections;
}
canAcceptConnection(): boolean {
return this.currentConnections < this.maxConnectionsPerServer;
}
trackConnection(): void {
this.currentConnections++;
this.reportMetrics();
}
releaseConnection(): void {
this.currentConnections--;
this.reportMetrics();
}
getUtilization(): number {
return this.currentConnections / this.maxConnectionsPerServer;
}
private reportMetrics(): void {
metrics.gauge("whisper.connections.active", this.currentConnections);
metrics.gauge("whisper.connections.utilization", this.getUtilization());
if (this.getUtilization() > 0.8) {
metrics.increment("whisper.connections.high_utilization_warning");
}
}
}Monitoring and Alerting
At scale, visibility into system health is essential. Whisper tracks metrics at every layer.
class NotificationMetrics {
async recordDelivery(
channel: string,
status: "sent" | "delivered" | "failed",
latencyMs: number
): Promise<void> {
metrics.increment(`whisper.delivery.${channel}.${status}`);
metrics.histogram(`whisper.delivery.${channel}.latency_ms`, latencyMs);
}
async recordQueueDepth(priority: string, depth: number): Promise<void> {
metrics.gauge(`whisper.queue.${priority}.depth`, depth);
}
async getHealthStatus(): Promise<SystemHealth> {
const queueDepths = await this.getAllQueueDepths();
const deliveryRate = await this.getDeliveryRate("5m");
const errorRate = await this.getErrorRate("5m");
return {
status: errorRate > 0.05 ? "degraded" : "healthy",
queueDepths,
deliveryRatePerSecond: deliveryRate,
errorRate,
activeConnections: connectionManager.currentConnections,
serverUtilization: connectionManager.getUtilization(),
};
}
}Conclusion
Scaling a notification system requires addressing capacity at every layer: WebSocket servers scaled horizontally with Redis coordination, fan-out patterns for bulk notifications, resource management for connection limits, and comprehensive monitoring for operational visibility. Whisper's scaling architecture ensures that notification delivery remains reliable as Klivvr's user base grows — whether delivering a single transaction alert or broadcasting to millions of users simultaneously.
Related Articles
Omnichannel Messaging Strategy for Fintech
How to build a unified omnichannel messaging strategy across push, SMS, email, and in-app channels — covering channel selection, message consistency, and unified customer experience.
Real-Time Notifications in Banking
How real-time notifications transform the banking experience, covering transaction alerts, security notifications, compliance requirements, and the business impact of instant communication.
Reducing Notification Fatigue
Strategies for reducing notification fatigue in mobile apps, covering frequency capping, smart scheduling, user preference management, and content relevance optimization.