When I hear other developers and architects make the case for event sourcing, the argument I hear most often is audit trails. That argument mostly convinces business people. In this blog post, I want to make the case for event sourcing not just to business people but also to developers.
I want to show that event sourcing also lets you apply domain-driven design better and keeps your application more maintainable.
As developers, we’re used to big anemic domain models with the active record paradigm, updating our core entities in place and keeping only the latest state. In complex systems where entities have long and complex life cycles, financial contracts for example, this leads to unmaintainable systems. Why? By keeping only the current state of our entity, we have become attached to our domain model as the only representation of our core entities. The state of an entity was gradually built up by a series of user interactions or state changes, but we’ve lost the history and the intent of those changes. The life cycle of our long-lived entities becomes clear only by reading all of the application’s code, which is often impractical. That life cycle is almost invisible: it’s lost in hundreds of tables, where each logical user interaction resulted in updating some of the entity’s data. Figuring out which operations took place on an entity is almost impossible.
In my experience, I often see a simple user interaction result in updates to thousands of properties across lots of entities (often stored in database tables). These properties in our domain model are only updated as a side effect of the user operation; the user interaction itself is never stored. With event sourcing, we store only the user interaction and derive the domain model by replaying the events within our entities. The life cycle of our core entities is explicitly stored in the database as a series of state changes with business intent. When we want to restructure our domain model, we have more information to work with. Because our domain model is only derived from events, we can rewrite it for free, as it’s not stored anywhere. If we want to derive another set of properties from the user interactions to enforce business invariants, we can do that very easily. By event sourcing our aggregates, we decouple our current understanding of the business (a domain model with lots of derived properties) from the state, which is now stored as events with business intent.
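To make the idea of replaying concrete, here is a minimal sketch in Python. It is not taken from a real system: the event names `StartDateAgreed` and `ContractActivated`, the `Contract` class and all of its fields are made up for illustration. The only thing that would be persisted is the list of events; the `Contract` object is rebuilt from them on demand.

```python
from dataclasses import dataclass
from datetime import date


# Events: immutable facts with business intent (names are hypothetical).
@dataclass(frozen=True)
class ContractSigned:
    contract_id: str
    signed_on: date


@dataclass(frozen=True)
class StartDateAgreed:
    contract_id: str
    start_date: date


@dataclass(frozen=True)
class ContractActivated:
    contract_id: str
    activated_on: date


class Contract:
    """Domain model that is never stored; it is derived from events."""

    def __init__(self):
        self.contract_id = None
        self.signed_on = None
        self.start_date = None
        self.active = False

    def apply(self, event):
        # Each event updates only the properties this model cares about.
        if isinstance(event, ContractSigned):
            self.contract_id = event.contract_id
            self.signed_on = event.signed_on
        elif isinstance(event, StartDateAgreed):
            self.start_date = event.start_date
        elif isinstance(event, ContractActivated):
            self.active = True

    @classmethod
    def replay(cls, events):
        # Rebuild the current state by applying the events in order.
        contract = cls()
        for event in events:
            contract.apply(event)
        return contract


# The stored history of one contract (illustrative data).
history = [
    ContractSigned("c-1", date(2020, 1, 6)),
    StartDateAgreed("c-1", date(2020, 2, 1)),
    ContractActivated("c-1", date(2020, 2, 1)),
]

contract = Contract.replay(history)
```

Adding a new derived property would mean adding a field and a branch in `apply`. The stored history stays untouched.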
In Eric Evans’s book Domain-Driven Design, he talks a lot about redesigning domain models. Unfortunately, in large complex systems with long-lived aggregates, remodeling domain models is hard, especially when using the active record paradigm. The state of an aggregate can only be understood as a whole (sometimes distributed over hundreds of tables with lots of derived properties), no longer as a series of clear state changes (events) that capture intent. A better understanding of the state of our existing aggregates allows us to refactor our domain model more easily and apply domain-driven design better. That is crucial in businesses where you need to support your state for years and still be agile.
By having a better record of the life cycle of our core entities, and because we do not store the domain model in the database itself, we are far more flexible in applying domain-driven design. We can continually rework our domain model, achieving conceptual breakthroughs and better domain models, while still supporting the aggregates (financial contracts) that have lived in our database for years.
An example. In a previous system, we modeled a contract with a single status field. In order, the first five statuses were: INITIAL, PENDING, CONFIRMED, READY_TO_BE_ACTIVATED, ACTIVATED. For our initial use cases, these statuses were fine. PENDING, for example, was used while the mandatory 14-day withdrawal period for consumers was active. Once the contract was in the CONFIRMED status, the system would continue with required actions, such as sending the initial invoice. But later on, contracts for businesses were added as well. These contracts didn’t need the PENDING status, as they had no withdrawal period. But the status was still used when the contract wouldn’t start immediately because of a delayed start date. It was difficult to refactor the domain model to better reflect this. For one thing, the intent behind the PENDING status was absent. It was hard to deduce the actual meaning of the PENDING status for all contracts, because this status was only derived from the actual state changes with business intent, which were not stored. Furthermore, other processes looked at the status field instead of reacting to events with business intent, and migrating hundreds of thousands of existing contracts in our system was considered risky. When you change a schema in a database, you don’t keep the old data; you actively overwrite it. Even restoring a backup a day after the migration would remove the updates made after the migration, which would be a big problem.
These days, with continuous integration and deployment, you’re usually not allowed to have downtime during deployments. The best practice for that in active record systems is to make only additions to your schema. This also makes it hard to refactor your domain model, as you tend to leave your current model in place and work around it.
Later on, we remodeled part of that system as an event-sourced system. From then on, we depended only on the events, not on the model within our entity. Based on the events with actual business intent (like the ContractSigned event), we could refactor our model with ease, creating different status fields for business and consumer contracts. Later on, we even had the breakthrough that a single status for a contract was not enough. There are actually multiple processes within a contract: processes for billing, and for coordinating the activation of the contract with external services and companies. These breakthroughs in our understanding were very easy to implement by just throwing away the domain model in our entities and building it up again from the events.
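The following Python sketch illustrates that kind of remodeling. It is a simplification, not our actual code: apart from `ContractSigned`, the event names, the `customer_type` field and the state values are assumptions made for the example. Two versions of the model are derived from the same stored events, so moving from the first to the second needs no data migration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContractSigned:
    contract_id: str
    customer_type: str  # "consumer" or "business" (assumed field)


@dataclass(frozen=True)
class WithdrawalPeriodExpired:  # hypothetical event
    contract_id: str


@dataclass(frozen=True)
class InitialInvoiceSent:  # hypothetical event
    contract_id: str


def single_status(events):
    """First model: one status field shared by every kind of contract."""
    status = "INITIAL"
    for event in events:
        if isinstance(event, ContractSigned):
            status = "PENDING"
        elif isinstance(event, WithdrawalPeriodExpired):
            status = "CONFIRMED"
    return status


def separate_processes(events):
    """Reworked model: one state per process, built from the same events."""
    state = {"withdrawal": "NOT_APPLICABLE", "billing": "NOT_STARTED"}
    for event in events:
        if isinstance(event, ContractSigned) and event.customer_type == "consumer":
            state["withdrawal"] = "RUNNING"
        elif isinstance(event, WithdrawalPeriodExpired):
            state["withdrawal"] = "EXPIRED"
        elif isinstance(event, InitialInvoiceSent):
            state["billing"] = "INVOICED"
    return state


# The same append-only history serves both versions of the model.
business_history = [
    ContractSigned("b-1", "business"),
    InitialInvoiceSent("b-1"),
]

old_view = single_status(business_history)
new_view = separate_processes(business_history)
```

For the business contract, the first model reports `PENDING` even though no withdrawal period exists, while the reworked model says so explicitly and tracks billing on its own.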
The test library that Axon provides is perfect for this. It validates your aggregate (a collection of entities that enforce business invariants) as a black box. If, after a refactoring, your existing tests still pass, meaning that given a set of events and a command the aggregate still produces the same output, your refactoring of the aggregate’s internals is successful.
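The style of such a black-box test can be sketched in a few lines of plain Python. This is not Axon’s API; the `given_when` helper, the `ActivateContract` command, the `ContractActivated` event and the aggregate below are all hypothetical, and only illustrate the given-events, when-command, expect-events shape.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContractSigned:
    contract_id: str


@dataclass(frozen=True)
class ContractActivated:  # hypothetical event
    contract_id: str


@dataclass(frozen=True)
class ActivateContract:  # hypothetical command
    contract_id: str


class ContractAggregate:
    """Internals are free to change; only events in and events out matter."""

    def __init__(self):
        self._signed = False
        self._active = False

    def apply(self, event):
        if isinstance(event, ContractSigned):
            self._signed = True
        elif isinstance(event, ContractActivated):
            self._active = True

    def handle(self, command):
        if isinstance(command, ActivateContract):
            if not self._signed:
                raise ValueError("contract has not been signed")
            if self._active:
                return []  # already active, so nothing new happens
            return [ContractActivated(command.contract_id)]
        raise TypeError(f"unknown command: {command!r}")


def given_when(aggregate_type, given_events, command):
    """Replay the given events, handle the command, return the new events."""
    aggregate = aggregate_type()
    for event in given_events:
        aggregate.apply(event)
    return aggregate.handle(command)


# Given a signed contract, when it is activated, expect ContractActivated.
emitted = given_when(
    ContractAggregate,
    [ContractSigned("c-1")],
    ActivateContract("c-1"),
)
```

Because the test only mentions events and commands, the private fields of `ContractAggregate` can be renamed or restructured without touching the test.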
The first thing I learned when I tried to design an event-based system was to make your events capture the intent of the business or user. With events actually capturing the business intent, you have a better history of your aggregates. Furthermore, with event sourcing, changing your domain model doesn’t require an irreversible database migration. Your event store remains immutable, with append-only data access. Zero-downtime deployments are way easier, as you can run multiple versions of the application on the same data store.
By using event sourcing, you create an extra layer between your state and your domain model. That layer provides lots of useful information and extra flexibility in domain modeling. By being better at domain modeling, you keep your application more maintainable in the long run.