Data Modeling for CRM Systems
An in-depth guide to designing data models for CRM systems in TypeScript, covering entity relationships, event sourcing, temporal data, and schema evolution strategies.
A CRM system is, at its core, a data system. The quality of every feature — segmentation, analytics, campaign management, customer support — depends on the quality of the underlying data model. A poorly designed model creates friction everywhere: queries become contorted, features require workarounds, and migrations turn into multi-sprint projects. A well-designed model, by contrast, makes new features feel like natural extensions of what already exists.
At Klivvr, designing CVM Nova's data model was one of the most consequential architectural decisions we made. This article shares the patterns we use, the trade-offs we navigated, and the TypeScript interfaces that express our model in code.
The Core Entity Graph
Every CRM revolves around a small set of core entities: customers, interactions, accounts, and relationships. The challenge is not identifying these entities — that part is obvious — but defining the boundaries between them and the cardinalities that connect them.
interface Customer {
id: string;
externalId?: string;
type: "individual" | "business";
status: "active" | "inactive" | "churned" | "prospect";
createdAt: Date;
updatedAt: Date;
}
interface CustomerProfile {
customerId: string;
firstName?: string;
lastName?: string;
email?: string;
phone?: string;
dateOfBirth?: Date;
address?: Address;
preferences: Record<string, unknown>;
customFields: Record<string, unknown>;
}
interface Address {
line1: string;
line2?: string;
city: string;
state?: string;
postalCode: string;
country: string;
}
interface Account {
id: string;
customerId: string;
type: string;
status: "open" | "closed" | "suspended";
balance: number;
currency: string;
openedAt: Date;
closedAt?: Date;
}
interface Interaction {
id: string;
customerId: string;
channel: "email" | "phone" | "chat" | "branch" | "app" | "web";
type: string;
direction: "inbound" | "outbound";
timestamp: Date;
metadata: Record<string, unknown>;
}Several design decisions are worth calling out. First, we separate Customer from CustomerProfile. The Customer entity is the stable identity — it has an ID, a type, and a status. The CustomerProfile contains mutable attributes that change over time. This separation makes it possible to track profile changes without complicating the core entity, and it enables clean handling of data privacy operations like profile deletion while preserving the customer's transactional history.
Second, the Interaction type is intentionally generic. Rather than modeling separate types for emails, phone calls, chat sessions, and app events, we use a single Interaction entity with a channel and type discriminator. The specific payload lives in metadata. This keeps the core schema stable even as new interaction types emerge.
Third, customFields on the profile is a deliberate escape hatch. Every CRM deployment has unique data requirements. Rather than forcing schema changes for every client-specific attribute, we provide a typed-at-the-edge, schema-flexible storage mechanism. The trade-off is that customFields cannot be indexed or queried as efficiently as first-class columns, but it avoids the alternative: a model so generic that nothing is queryable at all.
Event Sourcing for Customer History
In a CRM, history is as important as current state. You need to know not just that a customer's status is "churned" but when it changed, what triggered the change, and what the status was before. Event sourcing captures this naturally by modeling state changes as an append-only sequence of events.
interface CustomerEvent {
id: string;
customerId: string;
eventType: string;
payload: Record<string, unknown>;
timestamp: Date;
source: string;
correlationId?: string;
}
// Specific event types
interface StatusChangedEvent extends CustomerEvent {
eventType: "customer.status.changed";
payload: {
previousStatus: string;
newStatus: string;
reason: string;
};
}
interface ProfileUpdatedEvent extends CustomerEvent {
eventType: "customer.profile.updated";
payload: {
changedFields: string[];
previousValues: Record<string, unknown>;
newValues: Record<string, unknown>;
};
}
interface TransactionEvent extends CustomerEvent {
eventType: "transaction.completed";
payload: {
accountId: string;
amount: number;
currency: string;
merchantCategory?: string;
};
}
class CustomerAggregate {
private events: CustomerEvent[] = [];
private state: Customer & { profile: CustomerProfile };
constructor(
initialState: Customer & { profile: CustomerProfile },
events: CustomerEvent[]
) {
this.state = { ...initialState };
this.events = [];
for (const event of events) {
this.apply(event);
}
}
apply(event: CustomerEvent): void {
this.events.push(event);
switch (event.eventType) {
case "customer.status.changed": {
const payload = event.payload as StatusChangedEvent["payload"];
this.state.status = payload.newStatus as Customer["status"];
break;
}
case "customer.profile.updated": {
const payload = event.payload as ProfileUpdatedEvent["payload"];
for (const field of payload.changedFields) {
(this.state.profile as Record<string, unknown>)[field] =
payload.newValues[field];
}
break;
}
}
this.state.updatedAt = event.timestamp;
}
getState(): Customer & { profile: CustomerProfile } {
return { ...this.state };
}
getStateAt(timestamp: Date): Customer & { profile: CustomerProfile } {
const aggregate = new CustomerAggregate(
this.state,
this.events.filter((e) => e.timestamp <= timestamp)
);
return aggregate.getState();
}
}The getStateAt method is particularly powerful. It enables time-travel queries: what was this customer's status on January 15th? What was their address when they opened their second account? These queries are invaluable for compliance, dispute resolution, and retroactive analytics.
CVM Nova does not use pure event sourcing for all reads — that would be too slow for dashboard queries against millions of customers. Instead, we maintain materialized read models (projections) that are rebuilt from events. The event store is the source of truth; the read models are optimized views for specific query patterns.
Temporal Data and Slowly Changing Dimensions
CRM data is inherently temporal. A customer's segment today is not their segment last month. Their credit score changes. Their product holdings evolve. Modeling this temporal dimension correctly is essential for accurate analytics.
We use the slowly changing dimension (SCD) Type 2 pattern for attributes that need historical tracking. Each row has an effective date range, and queries filter by the desired point in time.
interface TemporalAttribute<T> {
customerId: string;
attributeName: string;
value: T;
effectiveFrom: Date;
effectiveTo: Date | null; // null means current
source: string;
}
class TemporalAttributeStore {
private attributes: TemporalAttribute<unknown>[] = [];
setCurrent<T>(
customerId: string,
attributeName: string,
value: T,
source: string
): void {
// Close the current record
const current = this.attributes.find(
(a) =>
a.customerId === customerId &&
a.attributeName === attributeName &&
a.effectiveTo === null
);
const now = new Date();
if (current) {
current.effectiveTo = now;
}
// Insert the new record
this.attributes.push({
customerId,
attributeName,
value,
effectiveFrom: now,
effectiveTo: null,
source,
});
}
getAsOf<T>(
customerId: string,
attributeName: string,
asOfDate: Date
): T | undefined {
const record = this.attributes.find(
(a) =>
a.customerId === customerId &&
a.attributeName === attributeName &&
a.effectiveFrom <= asOfDate &&
(a.effectiveTo === null || a.effectiveTo > asOfDate)
);
return record?.value as T | undefined;
}
getHistory<T>(
customerId: string,
attributeName: string
): TemporalAttribute<T>[] {
return this.attributes.filter(
(a) => a.customerId === customerId && a.attributeName === attributeName
) as TemporalAttribute<T>[];
}
}This pattern works well for attributes that change infrequently — segment membership, credit tier, relationship manager assignment. For high-frequency data like transaction counts or engagement scores, we use time-series aggregation tables instead, with pre-computed values at daily, weekly, and monthly granularity.
Schema Evolution and Migration
A CRM data model is never finished. New business requirements, regulatory changes, and integration partners all drive schema changes. The model must evolve without breaking existing functionality or losing historical data.
CVM Nova uses a versioned migration system with forward-only migrations. Each migration is a TypeScript file that describes the schema change and the data transformation required.
interface Migration {
version: number;
name: string;
up: () => Promise<void>;
validate: () => Promise<boolean>;
}
const migration0042: Migration = {
version: 42,
name: "add-customer-risk-score",
up: async () => {
// Add new column with default value
await db.query(`
ALTER TABLE customer_profiles
ADD COLUMN risk_score DECIMAL(5,2) DEFAULT NULL
`);
// Backfill from existing data
await db.query(`
UPDATE customer_profiles cp
SET risk_score = (
SELECT COALESCE(AVG(t.risk_indicator), 0)
FROM transactions t
WHERE t.customer_id = cp.customer_id
AND t.created_at > NOW() - INTERVAL '90 days'
)
`);
},
validate: async () => {
const result = await db.query(`
SELECT COUNT(*) as total,
COUNT(risk_score) as populated
FROM customer_profiles
WHERE status = 'active'
`);
const { total, populated } = result.rows[0];
// At least 90% of active profiles should have a risk score
return populated / total > 0.9;
},
};The validate function is critical. It runs after the migration and checks that the schema change produced sensible results. A migration that adds a column but leaves it empty for most active customers is probably a bug in the backfill query. Catching this immediately — rather than when a downstream report looks wrong — saves hours of debugging.
We also follow the expand-and-contract pattern for breaking changes. Instead of renaming a column directly, we add the new column (expand), migrate data and update application code, then remove the old column (contract) in a subsequent release. This ensures zero-downtime deployments and gives us a rollback path at every step.
Handling Multi-Tenancy
CVM Nova serves multiple clients, each with their own customer base and configuration. Multi-tenancy in the data model is implemented through a tenant identifier on every table, enforced by row-level security policies at the database level.
interface TenantAware {
tenantId: string;
}
interface TenantCustomer extends Customer, TenantAware {}
class TenantScopedRepository<T extends TenantAware> {
constructor(
private readonly tenantId: string,
private readonly tableName: string
) {}
async findById(id: string): Promise<T | null> {
const result = await db.query(
`SELECT * FROM ${this.tableName} WHERE id = $1 AND tenant_id = $2`,
[id, this.tenantId]
);
return result.rows[0] ?? null;
}
async findAll(filters: Partial<T>): Promise<T[]> {
const conditions = ["tenant_id = $1"];
const values: unknown[] = [this.tenantId];
let paramIndex = 2;
for (const [key, value] of Object.entries(filters)) {
conditions.push(`${key} = $${paramIndex}`);
values.push(value);
paramIndex++;
}
const result = await db.query(
`SELECT * FROM ${this.tableName} WHERE ${conditions.join(" AND ")}`,
values
);
return result.rows;
}
}The tenant ID is injected into every repository at construction time and automatically applied to every query. Application code never needs to remember to filter by tenant — the repository handles it. This prevents the catastrophic bug of returning one tenant's customer data to another tenant, which in a financial services context would be a regulatory disaster.
Conclusion
Data modeling for CRM systems is a discipline that rewards patience and punishes shortcuts. The patterns described here — entity separation, event sourcing, temporal attributes, versioned migrations, and tenant isolation — form the backbone of CVM Nova's data layer. None of them are novel in isolation, but their combination creates a model that is queryable, auditable, evolvable, and secure.
The most important principle is to model the domain, not the UI. Screens change frequently; the underlying business concepts — customers, interactions, accounts, relationships — change slowly. When your data model reflects the domain faithfully, new features are straightforward to build because the data they need already has a natural home. When the model reflects the UI, every new screen requires a new table, and the database becomes a mirror of design debt.
Related Articles
Real-Time Customer Profiles with Event Streaming
A technical guide to building real-time customer profile systems using event streaming in TypeScript, covering event-driven architecture, stream processing, profile materialization, and consistency guarantees.
Customer Engagement Metrics That Matter
A practical guide to defining, measuring, and acting on customer engagement metrics in CRM platforms, with a focus on metrics that drive retention and revenue in fintech.
Data-Driven CRM: Strategy and Implementation
A strategic guide to building and operating a data-driven CRM practice, covering organizational alignment, data governance, analytics maturity models, and practical implementation roadmaps.