In this post, we’ll share some of our experiences after 3 years of running a microservice architecture in production.
Several years ago, microservices were the hottest new tech fad in town. Every consultant and speaker who wasn’t talking about DevOps or Kubernetes would pitch this “new” way of doing things.
Once the “getting people hooked” part of the tech-fad cycle was complete and early adopters had started to switch over, the “it isn’t a silver bullet” phase began, with many talks on the problems of correctly splitting up services and teams.
For the last few years, we have been in the “was this even a good idea in the first place” phase. And honestly, all tech fads must go through that one at some point.
This topic seems to be somewhat controversial and emotional for some people.
So as a disclaimer, this post does not aim to convince anyone to use, or not to use, a microservice architecture. It’s simply an experience report. 🙂
At the end of the day, everything is a trade-off. I believe that, in most cases, a team that knows what it’s doing on the technical side and works well together will have a higher impact on software quality and success than the decision to build many small things or one big thing.
In the rest of this post, we’ll talk about our motivation for going down this road in the first place and the things we learned along the way.
For a small tech startup in its first few years, it’s likely that things will change a LOT, be it requirements, team size, number of users or simply the entire focus of the product. Everything is possible at this point.
You don’t want to wake up one day with an idea for pivoting the product and be limited by an architecture you built only to fit the initial ideas.
You also want to be able to scale up without too much effort, if the need arises.
The ability to build small, isolated services with low complexity can be a great benefit when you need to change things rapidly. It also helps with getting rid of things you no longer need: simply delete the whole service once nothing depends on it anymore.
Microservices can also provide a good deal of flexibility for knowledge sharing and for scaling up, both as a team and from a technical perspective, at the cost of some infrastructural overhead.
As stated above, this doesn’t mean that you can’t reach these advantages with a different architecture.
However, these are some of the reasons why we went down that path in the first place and why we keep using microservices.
One interesting aspect we noticed is the lifecycle of microservices in our product. Once we had built a new service and put it in production, we observed three different lifecycles:
- It stays the same, gets updates and a bit of regular maintenance and just does its job happily ever after
- We replace the service – no hard feelings, but it just didn’t work out with us
- We update it quite often and it becomes part of the “core-functionality” of the whole system
The first and second cases are not very interesting. They are easy to handle, although updates and maintenance become a bit of a hassle once the number of services increases. You can automate this, so it isn’t a big deal.
The ability to easily get rid of a whole service and all the accompanying complexity is a very nice aspect. Once it’s gone, it’s gone for good and doesn’t remain as a zombie with hidden complexity within a bigger system.
The third case is where we noticed some issues over the long term. If you build new functionality, you have to decide where to put it. Either you build a whole new service for it, or you append it to an existing service. That decision is usually made based on trade-offs at the time of planning the implementation.
Under time pressure, however, it can become attractive to simply append rather than to create a new service, as there is always a bit of overhead involved in adding a new thing to an existing infrastructure. You can reduce this cost, but it’s still there.
This can lead to services growing not only in terms of lines of code, but also in terms of the responsibilities they have, until they become mini-monoliths within the system.
The problem with this is that the whole infrastructure is optimized for small services with few responsibilities and low cognitive overhead.
And not only that, this “mudballification” of services is contagious and will spread until eventually you lose all the benefits of your micro-module system and gain all the downsides of a monolithic system, without any of its upsides. We don’t want to end up there.
Another side-effect of this is that the dependencies between the services might get unnecessarily complex. It’s fine if many services depend on one service for a cross-cutting concern (e.g. authentication).
But these dependencies should be clear and it should be possible to easily separate modules if they become too big.
To remediate this situation, where a service gets out of hand in terms of complexity and responsibilities during a tight feature cycle, there needs to be a chunk of time to cleanly separate things again.
The longer you put this off, the worse it gets, until it becomes difficult and costly to untangle the ball of microservices you made.
It also becomes harder to split the mess up conceptually. The cognitive work of splitting things up afterwards is higher than that of designing them that way upfront.
This is comparable to not cleanly separating modules in a big system and not refactoring towards that goal for a long time, which at some point makes it costly to refactor anything at all.
So we learned that just using microservices doesn’t save you from this kind of mistake. You need to regularly refactor and clean up either way.
API versioning is an important topic in and of itself when it comes to your public API. But with a microservice architecture, it also becomes relevant for inter-service communication.
To take full advantage of the low-risk, isolated deployments that microservices allow, it’s important not to introduce deployment dependencies between services.
You never want to be in a situation where you have to hold off on deploying Service A until Service B is deployed.
In the worst case, you’ll get a cycle and you can’t deploy without downtime anymore.
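Such a cycle can be caught mechanically before it bites. The following is a minimal sketch, not something taken from our actual setup: it assumes you keep a map from each service to the services that must be deployed before it, and it uses a plain depth-first search to report one cycle if there is any. The function name and the service names are made up for illustration.

```python
from typing import Dict, List, Optional


def find_deploy_cycle(deps: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle of deploy-order dependencies, or None if there is none.

    deps maps a service to the services that must be deployed before it.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state: Dict[str, int] = {}
    path: List[str] = []

    def visit(service: str) -> Optional[List[str]]:
        state[service] = GREY
        path.append(service)
        for dep in deps.get(service, []):
            if state.get(dep, WHITE) == GREY:
                # dep is already on the current path, so we closed a loop
                return path[path.index(dep):] + [dep]
            if state.get(dep, WHITE) == WHITE:
                cycle = visit(dep)
                if cycle:
                    return cycle
        path.pop()
        state[service] = BLACK
        return None

    for service in deps:
        if state.get(service, WHITE) == WHITE:
            cycle = visit(service)
            if cycle:
                return cycle
    return None


# Hypothetical service names, for illustration only.
independent = {"billing": ["auth"], "reports": ["billing"], "auth": []}
coupled = {"billing": ["auth"], "auth": ["reports"], "reports": ["billing"]}
```

Calling `find_deploy_cycle(independent)` returns `None`, while `find_deploy_cycle(coupled)` returns the loop it found, starting and ending at the same service. A check like this could run in CI, but the better outcome is that the map stays empty because no service ever has to wait for another.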
To avoid this, services should version their internal APIs properly, so any breaking API change leads to a new API version. As you can imagine, this can potentially lead to many API versions.
However, once you deploy a new version and all other services use it, you can delete the old one.
This is something you have to think about and plan for, and it means a bit of tedious work for every breaking change.
However, the nice upside is that this forces you to think about your interfaces more and to treat other services as API consumers. It forces you to respect the boundaries between different elements of the system.
With multiple teams working on different services, this can become even more of an issue.
But for us the need to think deeply about our interfaces and to have control of the internal and external APIs is a plus.
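To make the versioning workflow a bit more concrete, here is a minimal sketch of the idea, assuming a single internal endpoint that keeps several versions alive side by side and remembers which version each consumer called last. The class, the handler functions and the consumer names are all hypothetical; a real setup would more likely do this at the HTTP routing layer and read usage from metrics or logs.

```python
from typing import Callable, Dict, List

Handler = Callable[[dict], dict]


class VersionedEndpoint:
    """Serves several versions of one internal endpoint side by side."""

    def __init__(self) -> None:
        self._handlers: Dict[int, Handler] = {}
        self._last_used: Dict[str, int] = {}  # consumer -> version it called last

    def register(self, version: int, handler: Handler) -> None:
        self._handlers[version] = handler

    def call(self, consumer: str, version: int, payload: dict) -> dict:
        if version not in self._handlers:
            raise LookupError(f"API version {version} is not available")
        self._last_used[consumer] = version
        return self._handlers[version](payload)

    def unused_versions(self) -> List[int]:
        """Old versions that no known consumer calls anymore."""
        if not self._handlers:
            return []
        in_use = set(self._last_used.values())
        latest = max(self._handlers)
        return sorted(v for v in self._handlers if v not in in_use and v != latest)

    def retire(self, version: int) -> None:
        """Delete an old version, but only once nothing depends on it."""
        if version not in self.unused_versions():
            raise ValueError(f"API version {version} is still in use or is the latest")
        del self._handlers[version]


# Hypothetical handlers: v2 is a breaking change because it splits "name".
def get_user_v1(payload: dict) -> dict:
    return {"id": payload["id"], "name": "Ada Lovelace"}


def get_user_v2(payload: dict) -> dict:
    return {"id": payload["id"], "first_name": "Ada", "last_name": "Lovelace"}


endpoint = VersionedEndpoint()
endpoint.register(1, get_user_v1)
endpoint.register(2, get_user_v2)
```

As long as some consumer still calls version 1, `unused_versions()` stays empty and `retire(1)` refuses to delete it. Once every consumer has moved to version 2, version 1 shows up as unused and can be removed without coordinating any deployments.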
Multiple Technologies within the Stack
The potential to use different technologies for different services is nice, but can also bite you. It’s great that you can use a technology that’s very well suited to a specific problem. However, there is a trade-off in terms of understanding every technology well enough to build something robust for production use.
In our experience, a couple of different languages and approaches within these languages are fine. However, there is an upper limit to the number of different technologies you want to run within a cluster.
This depends on the number of developers and teams working on the whole product and on the cognitive load on every single member.
A broad tech stack is also a potential problem when hiring new developers, as it will be hard to find anyone who has experience with all the things you’re using. However, it promotes hiring people who aren’t afraid to learn new things and are confident in their ability to adapt. This is in tune with our hiring as a small startup. People have lots of autonomy and will have to jump into things they haven’t done before at times.
Experimenting with new tech is great for motivation and promotes innovation, and you can do it in a safe way for internal tools.
Once something deals with production user data, however, you need to have confidence in your ability to analyze and fix unexpected issues.
You can bend any pattern or architecture to your will and solve your problems based on your own trade-offs. As long as you make an informed decision and don’t just blindly follow what’s en vogue at tech conferences, things should work out.
Here at Timeular we’re happy overall with our microservice architecture. However, we’re also aware that we would have saved ourselves some infrastructure headaches and money if we had built things in a less distributed way.
Another worry we have is the breadth of our tech stack. It’s mind-boggling to count all the different technologies in use both in our services and on the infrastructure layer. The urge to try new things and experiment needs to be kept in check here.
In our case, the experimentation and flexibility our solution grants us have paid off nicely. Especially with a small team like ours, it’s nice that once things are set up, deployments and maintenance are smooth and low-risk.
Another thing we like about our setup is the mindset of being able to replace things without too much effort. This helps to reduce overall complexity with little risk and keeps things fresh and understandable.
That’s it. I hope this was useful, or at least interesting to read. I’m curious to see how such an experience report might look in another three years, so see you then. 🙂