Designing Event Schemas for Long-Term Evolution
Best practices for designing event schemas that can evolve gracefully over time, covering versioning strategies, compatibility rules, and TypeScript patterns for schema management in event-driven systems.
Events are the contracts of an event-driven system. When a producer emits an event, every consumer trusts that the event conforms to a known schema. When that schema needs to change --- and it will --- the consequences ripple across every service that touches that event. Getting schema design right from the beginning is one of the highest-leverage investments you can make. Getting it wrong leads to painful migrations, broken consumers, and the kind of production incidents that ruin weekends.
This article lays out the principles, patterns, and TypeScript implementations for designing event schemas that evolve gracefully over months and years in systems like Starburst.
Why Event Schemas Matter More Than You Think
In a traditional request-response system, the contract between client and server is defined by an API endpoint. If the endpoint changes, you update the client. The old version of the contract ceases to exist.
Events are different. Because events are persisted in an event store or log, old versions of events live alongside new versions indefinitely. An event sourced system might replay events from months or years ago, and those events must still be interpretable. This creates a unique challenge: your schema must be both stable enough to support historical data and flexible enough to accommodate future requirements.
// The envelope pattern: Separate event metadata from payload
interface EventEnvelope<T extends string = string, P = unknown> {
readonly eventId: string;
readonly eventType: T;
readonly schemaVersion: number;
readonly timestamp: Date;
readonly source: string;
readonly aggregateId: string;
readonly aggregateType: string;
readonly payload: P;
readonly metadata: {
correlationId: string;
causationId: string;
userId?: string;
traceId?: string;
};
}
// Versioned payload types
interface UserRegisteredV1 {
userId: string;
email: string;
name: string;
}
interface UserRegisteredV2 {
userId: string;
email: string;
firstName: string;
lastName: string;
registrationSource: string;
}
// Type-safe event definitions with version
type UserRegisteredEventV1 = EventEnvelope<"UserRegistered", UserRegisteredV1>;
type UserRegisteredEventV2 = EventEnvelope<"UserRegistered", UserRegisteredV2>;
The schemaVersion field in the envelope is your lifeline. It tells consumers exactly which version of the payload to expect, enabling them to handle different versions appropriately.
Compatibility Rules: Forward, Backward, and Full
When evolving a schema, you need to understand three types of compatibility.
Backward compatibility means new consumers can read old events. If you add an optional field to an event, old events that lack that field can still be processed by new consumer code. Backward compatibility is the minimum requirement for any event schema change.
Forward compatibility means old consumers can read new events. If a new event includes an extra field that old consumers do not know about, those consumers should ignore it gracefully. Forward compatibility requires consumers to be tolerant of unknown fields.
Full compatibility means the schema is both forward and backward compatible. This is the gold standard and should be your default goal.
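In practice, forward compatibility comes down to writing tolerant readers: consumers that copy out only the fields they understand and silently ignore the rest. A minimal sketch of the pattern, using a hypothetical order event (the `readOrderEvent` helper and its fields are illustrative, not part of any schema above):

```typescript
// Tolerant-reader sketch: the consumer extracts only the fields it
// depends on, so a newer producer can add fields without breaking it.
interface KnownOrderFields {
  orderId: string;
  total: number;
}

function readOrderEvent(raw: Record<string, unknown>): KnownOrderFields | null {
  const orderId = raw["orderId"];
  const total = raw["total"];
  // Validate only the fields this consumer actually uses.
  if (typeof orderId !== "string" || typeof total !== "number") {
    return null;
  }
  // Unknown fields (say, a new "currency" field from a newer
  // producer) are dropped here, not rejected.
  return { orderId, total };
}
```

Because the reader never enumerates fields it does not use, a producer can ship a new schema version before every consumer has upgraded.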
// Schema compatibility validation
interface SchemaRule {
name: string;
validate(oldSchema: EventSchema, newSchema: EventSchema): ValidationResult;
}
interface EventSchema {
eventType: string;
version: number;
fields: FieldDefinition[];
}
interface FieldDefinition {
name: string;
type: string;
required: boolean;
defaultValue?: unknown;
}
interface ValidationResult {
compatible: boolean;
errors: string[];
warnings: string[];
}
// Rule: New required fields are not backward compatible
const noNewRequiredFields: SchemaRule = {
name: "NoNewRequiredFields",
validate(oldSchema, newSchema) {
const errors: string[] = [];
const oldFieldNames = new Set(oldSchema.fields.map((f) => f.name));
for (const field of newSchema.fields) {
if (!oldFieldNames.has(field.name) && field.required) {
errors.push(
`New required field "${field.name}" breaks backward compatibility. ` +
`Make it optional with a default value instead.`
);
}
}
return {
compatible: errors.length === 0,
errors,
warnings: [],
};
},
};
// Rule: Existing fields cannot be removed (forward compatibility)
const noRemovedFields: SchemaRule = {
name: "NoRemovedFields",
validate(oldSchema, newSchema) {
const errors: string[] = [];
const newFieldNames = new Set(newSchema.fields.map((f) => f.name));
for (const field of oldSchema.fields) {
if (!newFieldNames.has(field.name)) {
errors.push(
`Removing field "${field.name}" breaks forward compatibility. ` +
`Deprecate it instead.`
);
}
}
return {
compatible: errors.length === 0,
errors,
warnings: [],
};
},
};
// Rule: Field types cannot change
const noTypeChanges: SchemaRule = {
name: "NoTypeChanges",
validate(oldSchema, newSchema) {
const errors: string[] = [];
const newFields = new Map(
newSchema.fields.map((f) => [f.name, f])
);
for (const oldField of oldSchema.fields) {
const newField = newFields.get(oldField.name);
if (newField && newField.type !== oldField.type) {
errors.push(
`Changing type of field "${oldField.name}" from ` +
`"${oldField.type}" to "${newField.type}" breaks compatibility.`
);
}
}
return {
compatible: errors.length === 0,
errors,
warnings: [],
};
},
};
// Schema compatibility checker
class CompatibilityChecker {
private rules: SchemaRule[] = [
noNewRequiredFields,
noRemovedFields,
noTypeChanges,
];
check(oldSchema: EventSchema, newSchema: EventSchema): ValidationResult {
const allErrors: string[] = [];
const allWarnings: string[] = [];
for (const rule of this.rules) {
const result = rule.validate(oldSchema, newSchema);
allErrors.push(...result.errors);
allWarnings.push(...result.warnings);
}
return {
compatible: allErrors.length === 0,
errors: allErrors,
warnings: allWarnings,
};
}
}
Upcasting: Bridging Schema Versions
When you replay historical events, they may be in an older schema version. Upcasting transforms old events into the current schema version so that consumer code only needs to handle the latest format.
// Upcaster registry
type Upcaster<TFrom = any, TTo = any> = (payload: TFrom) => TTo;
interface UpcasterDefinition {
eventType: string;
fromVersion: number;
toVersion: number;
transform: Upcaster;
}
class UpcasterRegistry {
private upcasters = new Map<string, UpcasterDefinition[]>();
register(definition: UpcasterDefinition): void {
const key = definition.eventType;
if (!this.upcasters.has(key)) {
this.upcasters.set(key, []);
}
this.upcasters.get(key)!.push(definition);
// Keep sorted by fromVersion for sequential application
this.upcasters.get(key)!.sort(
(a, b) => a.fromVersion - b.fromVersion
);
}
upcast<T>(
eventType: string,
payload: unknown,
fromVersion: number,
targetVersion: number
): T {
const chain = this.upcasters.get(eventType) ?? [];
let current = payload;
let currentVersion = fromVersion;
for (const upcaster of chain) {
if (
upcaster.fromVersion === currentVersion &&
currentVersion < targetVersion
) {
current = upcaster.transform(current);
currentVersion = upcaster.toVersion;
}
}
if (currentVersion !== targetVersion) {
throw new Error(
`Cannot upcast ${eventType} from version ${fromVersion} ` +
`to version ${targetVersion}. ` +
`Reached version ${currentVersion}.`
);
}
return current as T;
}
}
// Registering upcasters for the UserRegistered event
const registry = new UpcasterRegistry();
// V1 -> V2: Split "name" into "firstName" and "lastName",
// add "registrationSource"
registry.register({
eventType: "UserRegistered",
fromVersion: 1,
toVersion: 2,
transform: (v1: UserRegisteredV1): UserRegisteredV2 => {
const nameParts = v1.name.split(" ");
return {
userId: v1.userId,
email: v1.email,
firstName: nameParts[0] ?? "",
lastName: nameParts.slice(1).join(" ") || "",
registrationSource: "unknown",
};
},
});
// Usage in event processing
function processEvent(envelope: EventEnvelope): void {
const currentVersion = 2;
if (envelope.schemaVersion < currentVersion) {
const upcastedPayload = registry.upcast<UserRegisteredV2>(
envelope.eventType,
envelope.payload,
envelope.schemaVersion,
currentVersion
);
handleUserRegistered(upcastedPayload);
} else {
handleUserRegistered(envelope.payload as UserRegisteredV2);
}
}
function handleUserRegistered(payload: UserRegisteredV2): void {
console.log(
`Processing registration for ${payload.firstName} ${payload.lastName}`
);
}
Upcasting keeps consumer code clean --- it only deals with the latest version of each event. The complexity of handling multiple versions is isolated in the upcaster chain, which is registered once and applied automatically.
Schema Registry: Centralizing Schema Management
In a system with many event types and multiple producers and consumers, managing schemas manually becomes untenable. A schema registry provides a centralized repository for event schemas, with version tracking and compatibility checking built in.
// Schema registry
interface SchemaRegistryEntry {
eventType: string;
version: number;
schema: EventSchema;
createdAt: Date;
deprecated: boolean;
deprecationMessage?: string;
}
class SchemaRegistry {
private schemas = new Map<string, SchemaRegistryEntry[]>();
private compatibilityChecker = new CompatibilityChecker();
register(entry: SchemaRegistryEntry): void {
const key = entry.eventType;
if (!this.schemas.has(key)) {
this.schemas.set(key, []);
}
const existing = this.schemas.get(key)!;
const latestVersion = existing.length > 0
? existing[existing.length - 1].version
: 0;
if (entry.version !== latestVersion + 1) {
throw new Error(
`Expected version ${latestVersion + 1}, got ${entry.version}. ` +
`Versions must be sequential.`
);
}
// Check compatibility with the latest version
if (existing.length > 0) {
const latest = existing[existing.length - 1];
const result = this.compatibilityChecker.check(
latest.schema,
entry.schema
);
if (!result.compatible) {
throw new SchemaCompatibilityError(
`Schema for ${key} v${entry.version} is not compatible ` +
`with v${latest.version}`,
result.errors
);
}
if (result.warnings.length > 0) {
console.warn(
`Schema warnings for ${key} v${entry.version}:`,
result.warnings
);
}
}
existing.push(entry);
}
getLatest(eventType: string): SchemaRegistryEntry | null {
const versions = this.schemas.get(eventType);
if (!versions || versions.length === 0) return null;
return versions[versions.length - 1];
}
getVersion(
eventType: string,
version: number
): SchemaRegistryEntry | null {
const versions = this.schemas.get(eventType);
return versions?.find((v) => v.version === version) ?? null;
}
deprecate(
eventType: string,
version: number,
message: string
): void {
const entry = this.getVersion(eventType, version);
if (entry) {
entry.deprecated = true;
entry.deprecationMessage = message;
}
}
getAllEventTypes(): string[] {
return Array.from(this.schemas.keys());
}
}
class SchemaCompatibilityError extends Error {
constructor(
message: string,
public readonly compatibilityErrors: string[]
) {
super(message);
this.name = "SchemaCompatibilityError";
}
}
In production, a schema registry is often backed by a dedicated service like Confluent Schema Registry or a custom implementation using a database. The key is that every producer and consumer validates events against the registry before producing or consuming them.
Runtime Validation with TypeScript
Compile-time types are not enough. Events come from external sources --- message brokers, HTTP endpoints, other services --- and their structure cannot be guaranteed at compile time. Runtime validation is essential.
// Runtime validation using a schema definition
interface Validator<T> {
validate(input: unknown): ValidationOutcome<T>;
}
type ValidationOutcome<T> =
| { success: true; value: T }
| { success: false; errors: string[] };
// Builder pattern for creating validators
class EventValidator<T> implements Validator<T> {
private rules: Array<{
path: string;
check: (value: any) => string | null;
}> = [];
requireString(path: string): this {
this.rules.push({
path,
check: (value) =>
typeof value === "string" ? null : `${path} must be a string`,
});
return this;
}
requireNumber(path: string): this {
this.rules.push({
path,
check: (value) =>
typeof value === "number" && !isNaN(value)
? null
: `${path} must be a number`,
});
return this;
}
optionalString(path: string): this {
this.rules.push({
path,
check: (value) =>
value === undefined || value === null || typeof value === "string"
? null
: `${path} must be a string or absent`,
});
return this;
}
requireArray(path: string): this {
this.rules.push({
path,
check: (value) =>
Array.isArray(value) ? null : `${path} must be an array`,
});
return this;
}
validate(input: unknown): ValidationOutcome<T> {
if (typeof input !== "object" || input === null) {
return {
success: false,
errors: ["Input must be a non-null object"],
};
}
const errors: string[] = [];
for (const rule of this.rules) {
const value = this.getNestedValue(input, rule.path);
const error = rule.check(value);
if (error) {
errors.push(error);
}
}
if (errors.length > 0) {
return { success: false, errors };
}
return { success: true, value: input as T };
}
private getNestedValue(obj: any, path: string): any {
return path.split(".").reduce(
(current, key) => current?.[key],
obj
);
}
}
// Usage
const userRegisteredValidator = new EventValidator<UserRegisteredV2>()
.requireString("userId")
.requireString("email")
.requireString("firstName")
.requireString("lastName")
.requireString("registrationSource");
function processIncomingEvent(raw: unknown): void {
const result = userRegisteredValidator.validate(raw);
if (!result.success) {
console.error("Invalid event payload:", result.errors);
return;
}
// result.value is typed as UserRegisteredV2
handleUserRegistered(result.value);
}
Practical Tips for Schema Design
Name events as facts in the past tense. Events should describe what happened, not what might happen: OrderPlaced, PaymentReceived, UserRegistered. Past-tense names make the temporal, immutable nature of events explicit.
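One way to make the convention hard to violate is a closed union of event names; a small sketch (the names are illustrative):

```typescript
// A closed union of past-tense event names gives the compiler a
// single source of truth for what a producer may emit.
type DomainEventType =
  | "OrderPlaced"
  | "PaymentReceived"
  | "UserRegistered";

// Only registered, past-tense names are accepted at compile time;
// makeEventType("PlaceOrder") would be a type error.
function makeEventType(name: DomainEventType): DomainEventType {
  return name;
}
```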
Include enough context for consumers to act independently. An OrderPlaced event should include the customer name and order details, not just an order ID that requires a lookup. This reduces coupling between services and makes the system more resilient to individual service failures.
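The contrast can be sketched with two hypothetical payload shapes (all field names here are illustrative):

```typescript
// A thin event forces every consumer to call back into the order service:
interface OrderPlacedThin {
  orderId: string; // consumers must look everything else up
}

// A self-contained event carries the context consumers need to act alone:
interface OrderPlacedSelfContained {
  orderId: string;
  customerId: string;
  customerName: string;
  lines: Array<{ sku: string; quantity: number; unitPrice: number }>;
  totalAmount: number;
  currency: string;
}

// A notification service can render a confirmation message from the
// event alone, with no synchronous call to the order service.
function confirmationSubject(event: OrderPlacedSelfContained): string {
  return `Thanks, ${event.customerName}! Order ${event.orderId} received.`;
}
```

If the order service is down, consumers of the self-contained event keep working; consumers of the thin event stall on the lookup.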
Avoid deeply nested payloads. Flat or shallow structures are easier to evolve, validate, and query. If you find yourself nesting more than two levels deep, consider whether the event should be split into multiple events.
Reserve fields for future use. If you can foresee a field being needed, add it as optional now. It is far easier to make an optional field required later than to add a new required field to an existing schema.
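The evolution path this enables can be sketched as follows (the invoice event and its taxRegion field are hypothetical):

```typescript
// "Optional first, required later" evolution path.
interface InvoiceCreatedV2 {
  invoiceId: string;
  amount: number;
  // Added as optional: old events simply lack it, so no consumer breaks.
  taxRegion?: string;
}

// Once every producer populates the field, a later version can require it.
interface InvoiceCreatedV3 {
  invoiceId: string;
  amount: number;
  taxRegion: string;
}

// The V2 -> V3 upcaster only has to supply a default for the old events.
function upcastInvoiceV2toV3(v2: InvoiceCreatedV2): InvoiceCreatedV3 {
  return { ...v2, taxRegion: v2.taxRegion ?? "unspecified" };
}
```

Going the other direction, adding a brand-new required field, would fail the NoNewRequiredFields rule from the compatibility checker above.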
Document every event schema. Include the business meaning, when the event is produced, who consumes it, and what invariants hold. Your future self will thank you.
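One lightweight way to keep that documentation next to the code is a doc comment on the payload type. The template below is a suggestion, not a prescribed format; the producer and consumer names are hypothetical:

```typescript
/**
 * Emitted when a user completes registration.
 *
 * Produced by: identity-service, after email verification succeeds.
 * Consumed by: billing-service (creates account), crm-sync (new lead).
 * Invariants: email is verified and unique at the time of emission.
 *
 * @since schemaVersion 2
 */
interface UserRegisteredV2Documented {
  userId: string;
  email: string;
  firstName: string;
  lastName: string;
  registrationSource: string;
}
```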
Conclusion
Event schema design is one of the most consequential decisions in an event-driven system. Schemas that are well-designed from the start, versioned carefully, and validated rigorously will save countless hours of debugging and migration work down the road. The patterns we have covered --- the envelope pattern, compatibility rules, upcasting, schema registries, and runtime validation --- provide a comprehensive toolkit for managing schema evolution in TypeScript.
When building event-processing systems like Starburst, investing in schema infrastructure is not gold-plating; it is building the foundation that allows the rest of the system to evolve safely and confidently over time.
Related Articles
Monitoring Event-Driven Systems at Scale
A practical guide to building comprehensive monitoring and observability for event-driven systems, covering metrics, distributed tracing, alerting strategies, and operational dashboards for maintaining healthy event processing pipelines.
Migrating to Event-Driven Architecture
A practical guide for planning and executing a migration from traditional request-response systems to event-driven architecture, covering assessment frameworks, migration strategies, risk management, and organizational change.
Real-Time Data Processing: Business Impact and ROI
An exploration of the business value of real-time data processing, covering measurable ROI, competitive advantages, and practical frameworks for justifying investment in event-driven infrastructure.