
Event Sourcing for Developers: A Complete Beginner-to-Pro Guide

What Is Event Sourcing and Why Should You Care

Imagine a bank that only stores your current balance. A customer calls and asks, "Why is my balance 300 USD?" The bank has no answer—it overwrote every previous state. Event sourcing fixes this by persisting each transaction as an immutable event. Instead of storing "balance = 300," you store "Deposited 50 USD," "Withdrew 20 USD," and so on. The current state becomes a left-fold of those events.

This approach gives you an audit trail for free, effortless time travel, and the ability to rebuild any past state. Payment platforms, logistics engines, and even collaborative editors like Notion lean on event sourcing for exactly these superpowers.

Core Concepts in Plain English

Events Are Facts

An event is a simple, immutable fact that happened in the past: OrderPlaced, PaymentCaptured, InventoryDecreased. Events are named in the past tense and contain just enough data to describe the fact.

Event Store vs Database

An event store is a specialized database optimized for append-only writes. You can use PostgreSQL, MongoDB, or dedicated engines like EventStoreDB. Writes are transactional and sequential per stream, guaranteeing order.
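
For reference, one possible Postgres layout for such a store looks like this; the unique constraint on (stream_id, version) is what rejects concurrent or duplicate writes, and the snippets later in this article assume it exists:

CREATE TABLE events (
  stream_id  text        NOT NULL,
  type       text        NOT NULL,
  data       jsonb       NOT NULL,
  version    bigint      NOT NULL,
  created_at timestamptz NOT NULL DEFAULT now(),
  -- Rejects a second write to the same position in a stream.
  UNIQUE (stream_id, version)
);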

Projections Read the Write Model

Your read models—HTML tables, API responses, search indexes—are not written directly. Instead, they are built by listening to events and updating themselves. This separation is often paired with CQRS (Command Query Responsibility Segregation), but you can adopt event sourcing without CQRS.
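
As a minimal sketch of the idea, a projection is just code that reacts to each event and updates a denormalized table. Here onEvent is a hypothetical hook invoked once per event in stream order, using the Postgres client introduced in the next section:

// Hypothetical projection: keep an order_summaries read table up to date.
async function onEvent(event) {
  if (event.type === 'OrderPlaced') {
    await sql`
      INSERT INTO order_summaries (order_id, status)
      VALUES (${event.data.id}, 'placed')
    `;
  }
  if (event.type === 'PaymentCaptured') {
    await sql`
      UPDATE order_summaries SET status = 'paid'
      WHERE order_id = ${event.data.orderId}
    `;
  }
}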

Minimal Code Example in Node.js

Below is a stripped-down example using PostgreSQL as the event store. No external frameworks—just raw SQL and Node.

import postgres from 'postgres';
const sql = postgres('postgres://user:pass@localhost/events');

async function appendEvent(streamId, type, data, expectedVersion) {
  // Optimistic concurrency: the unique constraint on (stream_id, version)
  // makes a concurrent or duplicate write to the same position fail.
  return sql`
    INSERT INTO events (stream_id, type, data, version)
    VALUES (${streamId}, ${type}, ${data}, ${expectedVersion + 1})
  `;
}

async function getEvents(streamId) {
  // Events come back in exactly the order they were appended to the stream.
  return sql`
    SELECT type, data, version
    FROM events
    WHERE stream_id = ${streamId}
    ORDER BY version
  `;
}

function rebuildState(events) {
  // Current state is a left-fold of the event history.
  return events.reduce((state, event) => {
    switch (event.type) {
      case 'ItemAdded':
        return { ...state, items: [...state.items, event.data.item] };
      case 'ItemRemoved':
        return { ...state, items: state.items.filter(i => i !== event.data.item) };
      default:
        return state; // unknown events are ignored, keeping replay forward-compatible
    }
  }, { items: [] });
}

Call appendEvent when something happens, then call getEvents plus rebuildState when you need the current cart. That is the entire engine.
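
For instance, a tiny cart session could look like this (the stream id and payloads are illustrative):

// Append three facts to a fresh cart stream, then rebuild its state.
await appendEvent('cart-42', 'ItemAdded', { item: 'book' }, 0);
await appendEvent('cart-42', 'ItemAdded', { item: 'pen' }, 1);
await appendEvent('cart-42', 'ItemRemoved', { item: 'pen' }, 2);

const state = rebuildState(await getEvents('cart-42'));
// state.items is now ['book']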

Designing Events That Age Well

Keep Them Small

Store only the data you would need to replay the fact tomorrow or in five years. A ShippingAddressUpdated event needs the new address, not the full order snapshot.
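
For example, a lean version of that event carries nothing beyond the identifiers and the new value (the field names here are illustrative):

// Small, replayable fact: the order it applies to and the new address.
const event = {
  type: 'ShippingAddressUpdated',
  data: { orderId: 'order-17', address: '221B Baker Street, London' },
};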

Use Explicit Schema

Write down the event name and payload shape in a shared schema registry. A simple JSON file in the repo works at small scale; larger teams adopt Apache Avro or JSON Schema to enforce compatibility.
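
A minimal JSON Schema entry for the event above might look like this; the exact registry layout is a team decision:

{
  "$id": "ShippingAddressUpdated.v1",
  "type": "object",
  "required": ["orderId", "address"],
  "properties": {
    "orderId": { "type": "string" },
    "address": { "type": "string" }
  }
}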

Version Upward

When a payload needs to change, add new fields with defaults instead of mutating existing ones. Consumers coded against v1 will happily ignore v2 fields, giving you backward compatibility.
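
As a sketch, a v2 producer can introduce an optional field while the replay code supplies a default for older events:

// v1 payload: { orderId, address }. v2 adds deliveryNotes; v1 events lack it.
function normalizeShippingAddressUpdated(data) {
  return {
    orderId: data.orderId,
    address: data.address,
    deliveryNotes: data.deliveryNotes ?? '', // default keeps v1 events replayable
  };
}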

Handling Duplicate Events

Networks retry, users double-click, and messages replay. Use idempotency keys or the combination of streamId plus expectedVersion to reject duplicates. The SQL snippet above increments version atomically, so a duplicate write with the same version fails on a unique constraint.
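
In practice that means catching the unique-constraint violation on retry; Postgres reports it as SQLSTATE 23505, which the postgres client surfaces as the error's code property:

// Retry-safe append: a duplicate (stream_id, version) trips the unique constraint.
async function appendEventIdempotent(streamId, type, data, expectedVersion) {
  try {
    await appendEvent(streamId, type, data, expectedVersion);
    return true; // fact recorded
  } catch (err) {
    if (err.code === '23505') return false; // duplicate: already appended
    throw err;
  }
}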

Snapshotting for Performance

Replaying thousands of events for every read is wasteful. After every N events, persist a snapshot containing the rolled-up state. When you need the current cart, fetch the latest snapshot plus only the events that arrived after it. Snapshots are pure optimization; you can delete and rebuild them at any time.

async function saveSnapshot(streamId, version, state) {
  // Persist the rolled-up state together with the version it was computed at.
  await sql`
    INSERT INTO snapshots (stream_id, version, snapshot)
    VALUES (${streamId}, ${version}, ${state})
  `;
}

async function loadSnapshot(streamId) {
  // Fetch the most recent snapshot, if any.
  const [row] = await sql`
    SELECT version, snapshot
    FROM snapshots
    WHERE stream_id = ${streamId}
    ORDER BY version DESC
    LIMIT 1
  `;
  return row; // undefined when no snapshot exists yet
}
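
Tying the two together, a read path might fetch the snapshot and fold only the tail of the stream. This sketch assumes the cart reducer from rebuildState has been factored out as applyEvent(state, event):

async function loadState(streamId) {
  const snapshot = await loadSnapshot(streamId);
  const fromVersion = snapshot ? snapshot.version : 0;
  const tail = await sql`
    SELECT type, data, version FROM events
    WHERE stream_id = ${streamId} AND version > ${fromVersion}
    ORDER BY version
  `;
  // Start the fold from the snapshot instead of the empty state.
  const initial = snapshot ? snapshot.snapshot : { items: [] };
  return tail.reduce(applyEvent, initial);
}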

Event Sourcing and CQRS: The Good Marriage

The command side handles writes: validate, produce events, append them to the store. The query side is denormalized and fast: read models built from events. Need a leaderboard? Subscribe to PlayerScored events and update a Redis sorted set. Need a finance report? Subscribe to InvoiceIssued and dump rows into BigQuery. Each view is eventually consistent, usually within milliseconds on a local network.
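
The leaderboard case fits in a few lines with the node-redis client; subscribe is a hypothetical helper that delivers events of a given type in stream order:

import { createClient } from 'redis';

const redis = createClient();
await redis.connect();

// Projection: accumulate scores in a sorted set keyed by player.
subscribe('PlayerScored', async (event) => {
  await redis.zIncrBy('leaderboard', event.data.points, event.data.playerId);
});

// Query side: top ten players, highest score first.
const top = await redis.zRangeWithScores('leaderboard', 0, 9, { REV: true });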

Pitfalls That Hurt in Production

Treating Events as Commands

Commands can be rejected; events cannot. Validate during the command phase, then store only the resulting facts. Storing PlaceOrder instead of OrderPlaced couples your audit log to business rules that may evolve.
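
A sketch of that split, reusing appendEvent from earlier; placeOrder is a hypothetical command handler that may reject, while only successful outcomes become facts:

async function placeOrder(command, currentVersion) {
  if (command.items.length === 0) {
    // Commands can be rejected; nothing reaches the event store.
    throw new Error('Cannot place an empty order');
  }
  // Validation passed: store the resulting fact, never the command itself.
  return appendEvent(command.orderId, 'OrderPlaced', { items: command.items }, currentVersion);
}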

Large Payloads

Do not embed PDFs or images inside events. Store a reference to blob storage and keep the event small. Network hops and storage bills grow quickly.

Privacy Leaks

An immutable stream means you cannot delete personal data. Either encrypt sensitive fields per user or store a mapping table outside the stream and leave only a pseudonym in the event.
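
The encryption route is often called crypto-shredding: encrypt personal fields with a per-user key kept outside the stream, and deleting that key later renders the immutable events unreadable. A sketch with Node's built-in crypto module, where userKey is a hypothetical 32-byte key loaded from a separate key table:

import crypto from 'node:crypto';

function encryptField(plaintext, userKey) {
  const iv = crypto.randomBytes(12);
  const cipher = crypto.createCipheriv('aes-256-gcm', userKey, iv);
  const encrypted = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  // Store iv + auth tag + ciphertext together inside the event payload.
  return Buffer.concat([iv, cipher.getAuthTag(), encrypted]).toString('base64');
}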

Migrating Existing Systems

You do not have to start green-field. Capture changes with database triggers or application hooks, stream them into an event store, and gradually move read models. Run both models in parallel until confidence is high, then switch writes.
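
As a hedged Postgres sketch, a trigger on a hypothetical orders table can append a change event into the same events table used earlier:

CREATE OR REPLACE FUNCTION capture_order_change() RETURNS trigger AS $$
BEGIN
  INSERT INTO events (stream_id, type, data, version)
  VALUES (
    'order-' || NEW.id,
    'OrderRowChanged',
    to_jsonb(NEW),
    COALESCE((SELECT MAX(version) FROM events
              WHERE stream_id = 'order-' || NEW.id), 0) + 1
  );
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER orders_capture
AFTER INSERT OR UPDATE ON orders
FOR EACH ROW EXECUTE FUNCTION capture_order_change();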

Testing Strategy

Golden rule: given a sequence of events, assert that the next event is produced and the state is correct. Unit tests stay pure: feed events in, assert events out. Integration tests spin up Postgres inside Docker and verify snapshots.

test('applying loyalty points emits PointsAdded', () => {
  const events = [new OrderPlaced({ id: 1, total: 200 })];
  const newEvent = applyLoyaltyPolicy(events);
  expect(newEvent.type).toBe('PointsAdded');
  expect(newEvent.data.points).toBe(20);
});
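
For completeness, one shape the policy under test could take: a pure function from history to the next event, with the 10% rate implied by the assertions above (OrderPlaced instances are assumed to expose type and data):

function applyLoyaltyPolicy(events) {
  const order = events.find(e => e.type === 'OrderPlaced');
  return {
    type: 'PointsAdded',
    data: { orderId: order.data.id, points: order.data.total * 0.1 },
  };
}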

Deployment Checklist

  • Turn on WAL archiving for Postgres to avoid event loss.
  • Monitor append latency; anything above 50 ms on SSD signals I/O pressure.
  • Back up both events and snapshots; run restore tests every quarter.
  • Use a single writer per stream to prevent concurrency conflicts; scale by partitioning streams.

When Not to Use Event Sourcing

Skip it for high-frequency sensor data where only the last value matters, and for simple CRUD apps with zero audit requirements. If your domain experts never ask "How did we get here?", event sourcing is probably over-engineering.

Key Takeaways

Event sourcing stores facts, not state, giving you an audit trail and flexibility at the price of extra complexity. Start small: one aggregate, one read model, one projection. Master append-only writes, idempotency, and snapshots before scaling out. Do that, and you will own a system that can tell the story of every single click—something classic state-based models will never rival.


