My Current Architecture

Callum Linington
Jan 18, 2022 · 11 min read

So I’ve been playing about with Event Sourcing, trying to consume as much information as I can from the likes of Greg Young, Vaughn Vernon and David Schmitz.

I’ve been loving using EventStoreDb and have been quickly spinning up instances using Docker Compose. It’s changing not only the way I think about larger architectures, but also my fundamental architecture.

I want everything to be real-time, eliminating almost all polling calls and making extensive use of gRPC, WebSockets and the Pub/Sub style EventStore. I also kinda want to get away from that CRUD style of “object in, object out”, in favour of bits of information and domain models.

I have spoken about this on the Unhandled Exception Podcast with Dan Clarke, and I’m going to try and solidify some of those ideas here.

EventStore

Like any good book, we’ll start at the beginning and hopefully finish up at the end, but there are no guarantees.

So we should start with the database, where the most fundamental change is. Most developers will be familiar with a SQL DB (MS, Postgres, MySQL) or some document store (Mongo, Cosmos, Raven etc…). SQL is rigidly structured, and a document is not entirely unstructured but is a little more flexible, as it’s meant to represent a programming object.

To start to understand an event store, we’ll compare it to Mongo. In Mongo, each document has some special properties starting with _, which allow Mongo to understand some extra stuff about each document. EventStoreDb has a message envelope called EventData with similar extra properties: one called data, which takes the "document"/"entity", and another called eventType, which groups a set of events. Finally, you store all these events, with their corresponding event types, in a stream.

Fig 1. Event Stream

In Domain Driven Design terms, this is your Domain Model’s consistency boundary.

Another property of the EventStore is that you can subscribe to particular streams (or to everything). This means that as soon as an event is appended to a stream (published, or saved, in other words), anything that has subscribed will get that event. This is what sparks the beginning of a more reactive architecture.
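
With the official .NET client, a catch-up subscription on a single stream looks roughly like this (a sketch only: the stream name is made up, and the subscribe overloads vary a little between client versions):

open EventStore.Client

// Connect to a local instance spun up via Docker Compose
let client =
    new EventStoreClient(
        EventStoreClientSettings.Create "esdb://localhost:2113?tls=false")

// The callback fires for every event already in the stream,
// and then for each new event as it is appended
client.SubscribeToStreamAsync(
    "profile-123",
    FromStream.Start,
    (fun _ resolved _ ->
        printfn $"Received {resolved.Event.EventType}"
        System.Threading.Tasks.Task.CompletedTask))
|> ignore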

Backends

So now that we have a reactive EventStore, we need a couple of services interacting with it: writing to it, and reading from it.

Writing is pretty straightforward, right? You just append to the stream. So here you want your validated models to be pushed in: validated in the sense of structure, values and business rules.

But this poses a question: if, like the image above, you want to store updates which are part of an aggregate, you’ll have to reconstruct the full aggregate before issuing changes, to ensure the consistency boundary holds. That is to say, some updates depend on several properties. If, for instance, it’s just a single property update, then the previous value doesn’t really have any impact.

When you are reconstituting data, this is where things get interesting, and you have several options for quickly obtaining the aggregate: storing it in memory, in Redis, or in another DB like Mongo.

Ingestion

Assuming we’re using HTTP, we could use minimal APIs.

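Something like this F# sketch (the ProfileRequest shape, the event type name and the stream naming are all made up for illustration):

open System
open System.Text.Json
open Microsoft.AspNetCore.Builder
open EventStore.Client

// Illustrative request shape; swap in your own validated model
type ProfileRequest = { Id: string; Email: string }

let builder = WebApplication.CreateBuilder()
let app = builder.Build()

let client =
    new EventStoreClient(
        EventStoreClientSettings.Create "esdb://localhost:2113?tls=false")

app.MapPost(
    "/profile",
    Func<ProfileRequest, Threading.Tasks.Task>(fun request ->
        // Wrap the serialised JSON in the EventData envelope...
        let eventData =
            EventData(
                Uuid.NewUuid(),
                "ProfileCreated",
                ReadOnlyMemory(JsonSerializer.SerializeToUtf8Bytes request))
        // ...and append it to that profile's stream
        client.AppendToStreamAsync(
            $"profile-{request.Id}", StreamState.Any, [ eventData ])
        :> Threading.Tasks.Task))
|> ignore

app.Run()
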
This is a very abbreviated version, but it shows the gist: call the endpoint, spin up a connection to the EventStore, write some serialised JSON into the event, and then append it to the stream.

Streams are append-only because they’re immutable storage, so you can only publish new events. This means that if you need to make corrections, you simply write a new event that supersedes the old data.

Re-constitution/Re-composition

This is the most interesting bit, and definitely informs the way we can build the UI.

First of all, we need to make some events that will represent the stream.

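The EmailUpdate and ProfileStreamEvents names below come straight from this example; the ProfileCreated shape is just a guess to make the union concrete:

// A small payload type makes JsonSerializer.Deserialize() easy to deal with
type EmailUpdate = { Email: string }

type ProfileCreated = { Id: string; Name: string; Email: string }

// The discriminated union of everything that can appear on a profile stream
type ProfileStreamEvents =
    | Created of ProfileCreated
    | EmailUpdated of EmailUpdate
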
I created the EmailUpdate object because it just makes JsonSerializer.Deserialize() easier to deal with.

Now, to get from EventData to a ProfileStreamEvents discriminated union (so we can match against it and decide how to handle the incoming event), we need to write something that automatically de-serialises the data and picks the appropriate DU case to populate. Fortunately, I’ve already done this in an open source library.

Basic usage is that the library’s Helpers module takes a resolved event and hands you back the populated DU case.
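
Hand-rolled against the types above, the equivalent of what the library does is roughly:

open System.Text.Json
open EventStore.Client

// Match on the stored eventType and de-serialise the payload
// into the corresponding union case
let toProfileEvent (resolved: ResolvedEvent) : ProfileStreamEvents option =
    let payload = resolved.Event.Data.Span
    match resolved.Event.EventType with
    | "ProfileCreated" ->
        Some (Created (JsonSerializer.Deserialize<ProfileCreated> payload))
    | "EmailUpdated" ->
        Some (EmailUpdated (JsonSerializer.Deserialize<EmailUpdate> payload))
    | _ -> None // unknown event types are skipped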

So once we’ve reconstructed the discriminated union event, we need to re-constitute or re-compose the domain entity.

In order for this to be effective, when we start from the beginning of a stream we need an initial object — and we’re NOT using null…

In F# we can simply make a static default member which sets the defaults.
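
For example (the Profile shape here is illustrative), together with the apply function that folds events over that default to re-compose the entity:

type Profile =
    { Id: string
      Name: string
      Email: string }

    // The seed for folding a stream from the beginning; no nulls in sight
    static member Default = { Id = ""; Name = ""; Email = "" }

// Applying one event to the current state produces the next state...
let apply (state: Profile) (event: ProfileStreamEvents) =
    match event with
    | Created c -> { Id = c.Id; Name = c.Name; Email = c.Email }
    | EmailUpdated e -> { state with Email = e.Email }

// ...so re-composing the entity is just a fold over the stream's events
let recompose events = List.fold apply Profile.Default events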

Let’s look at the different ways we can store this object and then recall it when later stream subscription events arrive.

In Memory

The overall architecture will look something like this:

Fig 2. Backend with Cache

So for this method of re-composition you’ll probably do something like
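
A sketch using IMemoryCache from Microsoft.Extensions.Caching.Memory, assuming handle is wired up as the subscription callback and Profile/apply are as defined earlier:

open Microsoft.Extensions.Caching.Memory

let cache = new MemoryCache(MemoryCacheOptions())

// For each event the subscription delivers: read the current state
// (falling back to the default), apply the event, write it back
let handle (streamId: string) (event: ProfileStreamEvents) =
    let current =
        match cache.TryGetValue<Profile>(streamId) with
        | true, profile -> profile
        | _ -> Profile.Default
    cache.Set(streamId, apply current event) |> ignore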

This will probably be roughly the same for Redis — but you could create your own Redis-backed implementation of the IMemoryCache interface if you so desire.

Mongo

So you can imagine, if you’re Facebook with millions of users, that even if you partitioned over several instances, holding all that state in memory would eat up so much of it that it wouldn’t be efficient.

So you can keep the state inside Mongo if you wish; it will look something like:

Fig 3. Backend with Mongo

This way you can let the database server deal with partitions etc…
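
The handler keeps the same shape, but upserts the recomposed state into Mongo instead (a sketch with the official .NET driver; the database and collection names are made up):

open MongoDB.Driver

let mongo = MongoClient "mongodb://localhost:27017"
let profiles =
    mongo.GetDatabase("read-models").GetCollection<Profile> "profiles"

// Read the current document (or fall back to the default),
// apply the event, and upsert the result
let handleWithMongo (streamId: string) (event: ProfileStreamEvents) =
    let filter = Builders<Profile>.Filter.Eq((fun p -> p.Id), streamId)
    let existing = profiles.Find(filter).FirstOrDefault()
    let current =
        if isNull (box existing) then Profile.Default else existing
    profiles.ReplaceOne(filter, apply current event,
                        ReplaceOptions(IsUpsert = true))
    |> ignore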

Thoughts

This looks a bit overkill: why write into the store only to immediately receive a notification that gets written directly into Mongo? Why not just cut out the middle man of the EventStore and use Mongo alone?

The great benefit of using these stores is that they don’t really need to be resilient to data loss: if anything happens to them, just destroy them and rebuild the state from the beginning of the streams you use. This is super nice when testing things and you want a completely fresh start.

But this is just a mega small sample. In reality you can start to split read and write: rather than one background service and web server doing all of the processing, we can have one ingestion service and several re-composition style services. Other services can read whatever bits of data they’re interested in from the streams, to create a new type of Domain Model with new consistency boundaries. Reads don’t even have to be done in that domain: other domains can subscribe to the streams, or we could even feed, say, a GraphQL service.

Backend For Frontend (BFF)

So the next part of our architecture is GraphQL.

The reason is that GraphQL is amazing at combining data from a bunch of different sources into an easy-to-use schema — and if this is federated, then each domain can control its own schema, and a final GraphQL service can compile all the schemas together and present a unified front.

The other bonus of using GraphQL is that UI developers can just request the data they need, without worrying about where it comes from or what order they need to fetch it in.

To give a quick overview of what GraphQL is, imagine you have this JSON:

{
  "id": 123,
  "name": "jeff",
  "age": 20
}

If you were asking for this from GraphQL, the query would look like:

person (id: "123") {
  name,
  age
}

So essentially you’re defining the structure of the JSON that will be returned, whilst ignoring how it gets produced. On the backend, each of those fields has a field resolver.

The default will be something like

let output receivedData =
    { name = receivedData["name"] }

But you can actually make them as complicated as you want, even asynchronous. So for example, if the output JSON looks like:

{
  "id": 123,
  "name": "jeff",
  "orders": [
    {
      "orderId": 123,
      "items": [
        {
          "name": "Keyboard",
          "cost": 100.00,
          "qty": 2
        }
      ]
    }
  ]
}

The resolver would look like this:

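A rough sketch in the GraphQL .NET style (the builder APIs differ between versions, and fetchOrdersFor stands in for wherever the orders actually come from):

open System.Threading.Tasks
open GraphQL.Types

type Order = { Name: string; Cost: decimal; Qty: int }
type Person = { Id: string; Name: string }

// Stand-in for an async call to whichever service owns orders
let fetchOrdersFor (personId: string) =
    Task.FromResult [ { Name = "Keyboard"; Cost = 100.00m; Qty = 2 } ]

type PersonGraphType() as this =
    inherit ObjectGraphType<Person>()
    do
        // Plain property resolvers
        this.Field(fun p -> p.Id) |> ignore
        this.Field(fun p -> p.Name) |> ignore
        // An asynchronous resolver; OrderGraphType is defined elsewhere
        this.FieldAsync<ListGraphType<OrderGraphType>>(
            "orders",
            resolve = (fun ctx -> task {
                let! orders = fetchOrdersFor ctx.Source.Id
                return box orders
            }))
        |> ignore
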
I’ll leave the OrderGraphType up to your imagination, but assume it looks similar to the top two field lines for most properties.

This will ultimately give the GraphQL schema of:

person (id: string) {
  id,
  name,
  orders {
    name
    cost
    qty # there could be many more properties, but the beauty of GraphQL is you can choose what you need
  }
}

Everything above is for illustrative purposes and may not fully compile, but it is a basic use of the dotnet-graphql library, which I’ve used to great success when working with the Azure DevOps APIs.

So the architecture would look something like:

Fig 4. BFF Complete Architecture

UI

So we’re now onto the UI aspect of the architecture. With the backends, we’ve covered the EventStore for storing all the Domain Model data (partial immutable state at different points in time), and we’ve covered the different ways you can interact with it, with each Domain having its own backing store for the recomposed state. Finally, above all the different services, I’ve introduced the GraphQL BFF.

The awesome property of having an event driven architecture based on an Event Store is that subscriptions make data updates “real-time”. Re-composition (or re-constitution, depending on whether you’re just recreating the Domain Model or forming a new one; re-composition being the latter) is therefore real-time too, and we can propagate that through the services and the BFF, via gRPC streams or WebSockets (I believe EventStoreDb uses gRPC streams for its subscriptions), all the way to the UI.

And most importantly we can carry the architecture through.

Generally, for my UIs I’m most comfortable using either React or Svelte (on rare occasions, Aurelia), so for the following examples I will be using React.

My go-to packages are Redux, React-Router, Redux-Observable and RxJs.

A quick rundown:

  1. Redux is a functional immutable state store; it relies on actions being dispatched through reducers, functions that take action data and merge it with the state to produce new state (see the sketch after this list)
  2. redux-observable is a library that injects RxJs into the Redux middleware and allows you to handle actions as event streams (very consistent with the current architecture)
  3. RxJs is the Reactive Extensions implementation of Observables; it allows for handling events as streams (as opposed to callback functions, or Promises)
  4. react-router is a routing library
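
The parallel with the rest of this architecture is worth spelling out: a reducer is the same fold we used for re-composition, just applied to UI actions. In F# terms (purely illustrative; the real thing is JavaScript):

// A reducer folds actions over state, exactly like `apply` folded
// events over the domain model earlier
type CounterAction =
    | Increment
    | Decrement

let reducer state action =
    match action with
    | Increment -> state + 1
    | Decrement -> state - 1

// Dispatching three actions folds them into a final state of 1
let finalState = List.fold reducer 0 [ Increment; Increment; Decrement ]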

So, it goes something a bit like this: a message arrives over the WebSocket (or a GraphQL subscription), gets dispatched as an action, an epic handles it as an RxJs stream, and the reducer folds it into the store. Then any view can be reactive to this by subscribing to its slice of the store and re-rendering when it changes.

Obviously, if you want to go the other way and push data down from the UI, simply create an event in the UI and push it down to the hub, or to a GraphQL subscription. Job’s a good’un.

What have we just read through?

I’ve run through a sort-of circular event driven architecture. This architecture is driven by the EventStore, whose subscriptions allow real-time data update notifications. These are then propagated through the services (via gRPC). The services then feed a federated GraphQL service (via WebSockets), which allows the UI to hit a single endpoint and fully dictate the exact data it needs, without worrying about all the different places it comes from. Any changes that the UI makes get sent back down the socket, propagated through the services and appended to a stream in the EventStore, thus looping back to the beginning (pinging the subscribed services).

Here’s an illustration:

Fig 5. Full End-to-End Cycle

Why?

My current bugbear with architectures, especially ones with REST at their core, is that they’re so static. They’re also based around rigid, entity-based behaviours.

What I’ve tried to do here is outline a Domain Driven, Real-Time, Event Driven Architecture based around an Event Sourcing store. This architecture allows one to focus on behaviours over entities, real-time over request/response, and flexible data re-composition.

Why not use queues? Well, you can, but the state probably shouldn’t be passed around in those events. By all means use fat events (events with as much data as needed for the next process to act upon them), but I feel these can get a bit messy. A pattern that I like is the choreography pattern, which is a bit like a fast food restaurant or coffee shop: your order is taken, then as each part of your order is completed, either the ticket is updated or it is moved to a new place. In this scenario a stream can be the ticket; every time an update is made, it can be pushed into the stream. A service can subscribe and wait for all the data it needs, and once that arrives it can fire off a new process. Or, one can use the EventStore projections to do that instead. In my eyes this becomes a lot cleaner, because all the data is stored in one place — you can follow the trail.

To me, this ends up following the real world more closely. State is constantly changing around you; you only care about it when it changes and what the new state is, so this architecture reflects that. I feel that this fits in more with life. This architecture is also just another tool in the toolbox: use it when it makes sense.

We haven’t covered a few more nice things from the EventStore (the message bus and projections), and we’ve glossed over GraphQL Subscriptions. These are also super useful tools for making this architecture super real-time and easy to use — there will be follow-up posts.

Pros & Cons

Pros

  • Real-time: everything becomes more reactive, and in a world that is more connected and more instantaneous, this makes it a breeze
  • No need for the Outbox pattern, as events are written directly to the EventStore in a single complete transaction
  • GraphQL allows for individual domain ownership of each schema (their way of reconstituting the data)
  • A lot of flexibility around consuming the data; because the data is immutable, we don’t have to worry about accidentally modifying it
  • A central data source makes it easy to keep track of all the data — however, each domain can have its own snapshots
  • Easily pluggable with other event driven architectures
  • GraphQL means zero friction for UI developers gathering the data they need to display, as it does all the hard work of compiling it
  • Destroyable caches: because they’re built from the ground up from the streams, you can just rebuild state each time

Cons

  • Relies heavily on Domain Driven Design and a good understanding of all its concepts, which can be a big learning curve
  • Feels like there are more stores for data; however, the domain stores are just calculated from the EventStore
  • Data broken down into its change log is a lot more verbose, and more admin interfaces will be needed to interrogate stream states
  • More boilerplate and process around storing data than standard SQL or Mongo — but the trade-off is more information
