MDO: It’s about the data
It’s always been about the data, and it will always be about the data.
Microservices architecture evolution is on everyone’s lips these days. It’s harder than it should be, because we keep trying to do it backwards. We focus on the application, but it’s actually all about the data.
Everything old is new again. I recently attended an architecture training session with the amazing Mark Richards. Amongst a lot of great material, one anecdote stuck with me: he told us that the toughest part of evolving to a microservices-based architecture is getting the data separation needed to implement the single responsibility principle (SRP).
It reminded me strongly of Fred Brooks’s 1975 comment on the primacy of data:
Show me your flowcharts and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won’t usually need your flowcharts; they’ll be obvious.
Fred Brooks, “The Mythical Man-Month”, pp. 102–3.
My take-away from those two points is that good microservices architecture, especially when evolving from a monolith, has to start with understanding the data. Data is the hardest thing to separate, and there are often really good reasons for that: the prohibitive cost of data management tools like SQL databases, the usual kinds of constraints like foreign keys, data custodianship and sovereignty, and of course, your DBA.
We were all taught about database normalization. That kind of thinking is based on two now arguably flawed “truisms”: first, that data storage is expensive; second, that updates must be synchronous, so that at no point could the data model have any lost children. Every coder with a few years in the SQL trenches knows that de-normalization is a tried-and-true performance technique, so storage usage is apparently secondary to performance, and we all know how much cheaper storage is now. As for the second truism, the eventual consistency revolution has largely shown that expensive synchronous updates are not necessarily required to get eventual correctness.
We know that data separation is hard, and it’s true that in some cases you just can’t split up the data on SRP lines. However, when there is a level of flexibility, there is a huge opportunity waiting here: start your analysis from the smallest bit of data and work backwards towards the flows to build your microservices. Let me coin a trendy label to further distinguish this pattern: Micro-Data-Oriented. MDO.
Why do we have big relational structures anyway? Do the presentation layers need the big join? Yes-ish, but that’s what caches are for.
Does the business logic need a foreign key? The DBAs all scream “YES!”, but let’s consider that closely. What the foreign key does is validate referential integrity. In doing so at the low level, it splits your business logic across layers, by putting a business rule like “All purchase orders can only be created with valid product IDs” into the DBMS.
The rest of your business logic and validity checking is in the microservice itself. That split starts to play havoc with SRP and debugging. What happens if you discontinue a product? Then you get “it depends” in several places across your system. That’s a recipe for issues in the future.
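As a minimal sketch of keeping that rule in one place, the Python below puts the product-ID check inside the purchase-order service instead of in a foreign key. The names ProductCatalog, PurchaseOrderService and UnknownProductError are hypothetical, and the in-memory sets and dicts stand in for whatever store each service really owns.

```python
from dataclasses import dataclass, field


class UnknownProductError(ValueError):
    """Raised when an order references a product that cannot be ordered."""


@dataclass
class ProductCatalog:
    """Hypothetical stand-in for the product service, the source of truth for product IDs."""
    active: set[str] = field(default_factory=set)
    discontinued: set[str] = field(default_factory=set)

    def is_orderable(self, product_id: str) -> bool:
        return product_id in self.active

    def discontinue(self, product_id: str) -> None:
        self.active.discard(product_id)
        self.discontinued.add(product_id)


@dataclass
class PurchaseOrderService:
    """Hypothetical order service that owns its own rows and its own rules."""
    catalog: ProductCatalog
    orders: dict[int, str] = field(default_factory=dict)  # order_id -> product_id
    _next_id: int = 1

    def create_order(self, product_id: str) -> int:
        # The business rule is stated once, here, not in a DBMS constraint.
        if not self.catalog.is_orderable(product_id):
            raise UnknownProductError(product_id)
        order_id = self._next_id
        self._next_id += 1
        self.orders[order_id] = product_id
        return order_id
```

In this sketch, discontinuing a product has a single answer: existing orders keep their product ID, and new orders for it are refused by the one service that owns the rule.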
Data grooming and validation could be continuous, rather than enforced via referential integrity. That’s an idea worth exploring. If each data field, most especially each foreign key, had a defined source of truth, then we could think about “de-normalized” data where one of the exposed interfaces might implement “delete all entities related to some_foreign_key”. This would still keep SRP intact: the microservice would then delete its own entities.
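One way to picture continuous grooming is a periodic pass that asks the source of truth whether each referenced key still exists. The sketch below is an assumption about how that might look, not a prescribed design: OrderStore, delete_by_product and groom are hypothetical names, and product_exists stands in for a call to whichever service owns product IDs.

```python
from typing import Callable


class OrderStore:
    """Owns order rows; product_id is a de-normalized copy of a foreign key."""

    def __init__(self) -> None:
        self._rows: dict[int, str] = {}  # order_id -> product_id

    def add(self, order_id: int, product_id: str) -> None:
        self._rows[order_id] = product_id

    def product_ids(self) -> set[str]:
        return set(self._rows.values())

    def delete_by_product(self, product_id: str) -> int:
        """Exposed interface: delete all entities related to product_id."""
        doomed = [oid for oid, pid in self._rows.items() if pid == product_id]
        for oid in doomed:
            del self._rows[oid]
        return len(doomed)


def groom(store: OrderStore, product_exists: Callable[[str], bool]) -> int:
    """One grooming pass: drop rows whose product the source of truth no longer knows."""
    removed = 0
    for pid in store.product_ids():
        if not product_exists(pid):
            removed += store.delete_by_product(pid)
    return removed
```

A scheduler could run groom at whatever interval the business can tolerate; between passes the store may briefly hold orphaned rows, which is the eventual-consistency trade-off.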
That “cascaded delete” could be driven by event sourcing, because there’s no good reason, other than relational-style thinking, to make “ON DELETE CASCADE” cross responsibility boundaries.
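The event-driven cascade can be sketched as follows. This is a simplification under stated assumptions: EventBus is a hypothetical in-process stand-in for a real broker or event store, and ProductDeleted, OrderService and ReviewService are illustrative names. Strictly speaking it shows event notification with an append-only log rather than full event sourcing.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ProductDeleted:
    product_id: str


class EventBus:
    """In-process stand-in for a message broker or event log."""

    def __init__(self) -> None:
        self._handlers: dict[type, list[Callable]] = defaultdict(list)
        self.log: list[object] = []  # append-only record of published events

    def subscribe(self, event_type: type, handler: Callable) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event: object) -> None:
        self.log.append(event)
        for handler in self._handlers[type(event)]:
            handler(event)


class OrderService:
    def __init__(self, bus: EventBus) -> None:
        self.orders: dict[int, str] = {}  # order_id -> product_id
        bus.subscribe(ProductDeleted, self.on_product_deleted)

    def on_product_deleted(self, event: ProductDeleted) -> None:
        # Each service removes only the entities it owns.
        self.orders = {o: p for o, p in self.orders.items() if p != event.product_id}


class ReviewService:
    def __init__(self, bus: EventBus) -> None:
        self.reviews: dict[int, str] = {}  # review_id -> product_id
        bus.subscribe(ProductDeleted, self.on_product_deleted)

    def on_product_deleted(self, event: ProductDeleted) -> None:
        self.reviews = {r: p for r, p in self.reviews.items() if p != event.product_id}
```

The product service publishes one ProductDeleted event and knows nothing about who holds copies of its IDs; each subscriber decides what the deletion means for its own data.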
The take-away I want to leave you with is that to rapidly evolve to a microservices architecture, you need to turn your thinking towards how even data integrity can be turned into a single responsibility. Micro-Data-Oriented thinking lets you simplify your data model, which makes it a lot easier to clean up lines of data ownership.