Event Stores and Tags: A Misunderstood Optimization
A Critique of Tagging in Event Stores

Tags in event stores sound helpful, even obvious. They promise easier correlation, cleaner queries, and clearer intent. Instead of digging into payloads, you just filter by labels. The event type tells you what happened; the tag tells you to whom.
That is the argument made in the DCB (Dynamic Consistency Boundary) article, which presents tagging as a pragmatic enhancement to event-sourced systems.
I used to see it the same way. My stance was: you can use tags, but you don't have to.
Ralf Westphal challenged that. Repeatedly. And to be honest, I did not fully appreciate the depth of his point until now.
Tags add nothing to the concept of an event store. Tags make it harder to understand and more difficult to implement.
He was right.
What tags introduce is premature structure. They assume that correlation must be declared at write time. That events must carry external identifiers describing their "subject". It sounds efficient, but it quietly undermines the very basis of event sourcing.
Event Sourcing Is About Deferring Meaning
The strength of event sourcing is that events do not require immediate interpretation.
They simply record what happened, allowing meaning to emerge later depending on context and intent.
Events do not declare what they mean; they become meaningful through interpretation. Event logs are not spreadsheets. They are streams of recorded facts, waiting to be understood from different perspectives, at different times, by different consumers.
This article is not an attack on tags as metadata. It's a critique of tagging as a core feature of event stores because that crosses a line:
It introduces a black-and-white mindset into a system that thrives in shades of grey.
The Tag Temptation
Tags are seductive.
They promise clarity and convenience. Add a "customer: alice-smith" or "order: 12345" tag to your event, and correlation becomes simple. It feels like good housekeeping, like documenting what the event is about.
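To see the appeal concretely, here is a minimal in-memory sketch of what tag-based correlation looks like from the caller's side. All names here are hypothetical, not any particular store's API:

```typescript
// Hypothetical shape of a tagged event: the payload records what happened,
// the tags record what the writer thinks it is "about".
type TaggedEvent = {
  type: string;
  data: Record<string, unknown>;
  tags: Record<string, string>; // write-time labels, the "housekeeping"
};

const log: TaggedEvent[] = [];

// Appending: the writer must already know whom the event concerns.
log.push({
  type: "OrderPlaced",
  data: { orderId: "12345", total: 99.5 },
  tags: { customer: "alice-smith", order: "12345" },
});

// Querying by label feels effortless, and that is exactly the temptation.
const aliceEvents = log.filter((e) => e.tags["customer"] === "alice-smith");
```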
But it's a shortcut with a cost.
In the DCB article, tags are said to correlate events with “instances in the domain.” That wording carries an assumption: that such instances already exist, independently of the events. That they are real, stable things the system should recognize and tag.
But that's not how event-sourced systems work.
"Instances" (customers, deliveries, projects) are not predefined. They are retrospective constructs, shaped by event interpretation. They are derived from patterns and context, not from tags.
By tagging events, we are no longer just annotating. We are declaring. We are baking one interpretation into the storage layer, saying: this event happened to Alice, even though “Alice” may not exist yet, or that meaning might change later.
Worse, this shifts focus away from what happened toward metadata about what we think it meant.
That's a shift from event-first to structure-first thinking. It's reversing causality.
And once tags become the primary mechanism for querying or grouping events, they stop being metadata, and start being structural. At that point:
We are no longer sourcing events. We are indexing snapshots. That's database thinking.
Black, White, Grey
When it comes to event modeling, two extremes are easy to recognize:
- Black box: Events are opaque blobs. They're appendable and replayable, but you can't inspect or query them.
- White box: Events follow rigid schemas. They're tightly versioned, fully structured, and easily queryable, but inflexible.
Both are wrong.
The sweet spot is grey.
Events that are introspectable, typically JSON, but without fixed contracts. No enforced schemas, no premature structure. Just enough form to support interpretation, but never to prescribe it.
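As a sketch of what such a grey event might look like, assuming JSON payloads and nothing more than a type name for orientation (the field names are illustrative):

```typescript
// A "grey" event: introspectable JSON with a type name for context,
// but no enforced contract. The shape is whatever the producer recorded.
type GreyEvent = {
  type: string;                     // what happened
  recordedAt: string;               // when it was recorded
  payload: Record<string, unknown>; // inspectable, but not schema-bound
};

const placed: GreyEvent = {
  type: "OrderPlaced",
  recordedAt: new Date().toISOString(),
  payload: { orderId: "12345", customer: "alice-smith", total: 99.5 },
};

// Consumers may look inside whenever they need to...
if (placed.payload["customer"] === "alice-smith") {
  // ...but nothing forces every event to carry a "customer" field.
}
```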
As Ralf puts it:
"Grey payloads are highly flexible, conceptually parsimonious, fast enough until proven (!) otherwise."
Tags pull you out of grey.
They impose a fixed structure for correlation. They encode assumptions about identity, ownership, and domain semantics into every write. They hardcode a worldview that might not hold tomorrow.
Tags Are Not Free
At first glance, tags seem harmless. Just a few fields to help with querying. But here's what they really do.
Tags must be attached at write time. That means the writer needs to know the correct subject of an event in advance. Which identity? Which domain concept? According to which rule?
You're committing to an interpretation before you've even seen what happens next. That's not just brittle; it's backwards: meaning should be derived from the record, not declared ahead of it.
Tags are informal contracts. A tag like order_id = 123 or customer = alice-smith assumes those concepts are universal and stable. Over time, they become required conventions. Queries start to rely on them. If they’re missing or change format, things break.
It's schema drift, just without the migrations.
To make tags useful, you need to index them. That means query paths, maintenance overhead, consistency rules. All of this, just to query facts that are already in the payload.
Why add a second layer of structure, when the payload already contains the truth?
Why trust a label, when you can inspect the fact?
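A sketch of what inspecting the fact looks like, with hypothetical field names: the consumer answers the correlation question by reading payloads directly, so no second structure has to be written, indexed, or kept consistent:

```typescript
type StoredEvent = { type: string; payload: Record<string, unknown> };

// Each consumer decides for itself what "concerns this customer" means.
// Here it is "any payload field carrying the customer id"; another
// consumer is free to interpret the same log differently.
function concernsCustomer(e: StoredEvent, customerId: string): boolean {
  return Object.values(e.payload).includes(customerId);
}

const events: StoredEvent[] = [
  { type: "OrderPlaced", payload: { orderId: "12345", customerId: "alice-smith" } },
  { type: "ParcelShipped", payload: { parcelId: "p-9", recipient: "bob-jones" } },
];

const aliceFacts = events.filter((e) => concernsCustomer(e, "alice-smith"));
```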
Performance matters deeply. But it must be approached with precision, not assumption.
Premature optimizations like tagging for correlation often solve unproven problems while introducing hidden complexity.
If your system needs indexing, prove it with real workloads and real bottlenecks.
Until then, favor clarity over convenience.
In most systems, a JSON-based event store with introspectable payloads and projection-driven queries is fast enough until it's measurably not.
That's when you optimize. Not before.
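For illustration, a minimal projection sketch, assuming a simple in-memory log and a hypothetical OrderPlaced event shape. The read model is folded from the log, which is where query convenience belongs:

```typescript
type Evt = { type: string; payload: Record<string, unknown> };

// Fold the log into a read model: orders placed per customer.
// If this ever becomes too slow, measure it first, then optimize.
function projectOrdersPerCustomer(log: Evt[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const e of log) {
    if (e.type !== "OrderPlaced") continue;
    const id = e.payload["customerId"];
    if (typeof id === "string") {
      counts.set(id, (counts.get(id) ?? 0) + 1);
    }
  }
  return counts;
}
```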
Want Tags? Model Them
Let's be clear: correlation is important. But the right place to model correlation is not in a metadata side-channel. It's in the stream itself.
If tagging matters in your domain, make it explicit. Treat it as behavior. Record it as an event.
Instead of tagging "customer: alice-smith", record:
CustomerWasTagged { customerId: "alice-smith", tag: "vip" }.
Now that information is part of the log: versioned, auditable, replayable. It reflects a real action, not an implicit assumption. You can track when a correlation was made, by whom, and in response to what.
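A sketch of how that might look in practice. The taggedBy field reflects the "by whom" above, and CustomerTagWasRemoved is my addition for symmetry, not part of the original example:

```typescript
// Tagging as behavior: the correlation itself is a recorded fact,
// replayable and auditable like any other event.
type DomainEvent =
  | { type: "CustomerWasTagged"; customerId: string; tag: string; taggedBy: string }
  | { type: "CustomerTagWasRemoved"; customerId: string; tag: string };

// Derive the current tags of a customer by replaying the log.
function currentTags(log: DomainEvent[], customerId: string): Set<string> {
  const tags = new Set<string>();
  for (const e of log) {
    if (e.customerId !== customerId) continue;
    if (e.type === "CustomerWasTagged") tags.add(e.tag);
    else tags.delete(e.tag);
  }
  return tags;
}
```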
This approach keeps the event store honest. It avoids structural assumptions. It respects causality.
And it aligns with a foundational truth of event-sourced systems:
The producer of events does not care about — nor take care of — event consumers.
Its sole responsibility is to faithfully record what happened. In that, the producer must be ego-less. It must not encode interpretation into the event.
That’s difficult because decisions must still be made: when to record, what to record, at what granularity.
But the goal is always the same: capture what happened, without collapsing it into what we think it means.
When producers start embedding their own interpretations (whether through tags, stream IDs, or inferred identities) they close off future possibilities. They decide too early what matters, and in doing so, they let potential information fall through the cracks.
That's why event sourcing works best when producers stay impartial and let meaning emerge later, when it's actually needed.
Let Event Stores Be Event Stores
An event store is not a document store. Not a blob store. Not a read-optimized table.
It is a log of what happened, in the order it happened, with no declared meaning beyond the facts it records.
The moment you start injecting tags, identities, and labels, you are no longer modeling behavior. You are modeling structure.
You are turning facts into opinions. You are asking the store to behave like a relational index, and that's not its job.
Let the event store be the source of truth, not the source of structure. Let correlation happen downstream. Let meaning emerge. Let consumers decide.
That's the whole point. The job of the event store is simple: capture what happened. Faithfully. Durably. Transparently.
Everything else is interpretation.
Cheers!