How Fine Should I Slice My Services?
Much of the appeal of microservices to programmers is that they resonate with the UNIX philosophy - make each program do one thing well and keep concerns separated. While microservices are all the rage at most companies, there are a few dissenting voices who favour the opposite end of the spectrum - monolithic services. This often results in both sides indulging in fairly engaging debates. Occasionally 140 characters at a time.
There are patterns to getting monolithic services to work well in production systems. Etsy is one of the places that seems to have it working reasonably well (continuous integration, multiple deployments a day, per-developer VMs, monitoring, dashboards & alarms). Personally though, I'm wary about a few aspects of the monolithic setup. I'm not sure how well it can work once a development team grows past a few hundred engineers. Netflix has spoken publicly about its struggles with a monolithic architecture as it grew in its early years. I'd heard similar stories about the early days while I was at Amazon (though that might not be entirely applicable, as it was quite a few years back). Scaling individual components can also prove a challenge in such architectures.
There is an aspect of microservice design that I've had my own share of lively debates over: deciding which bits of functionality belong in a service. A very strict separation of concerns can lead to very minimal services (or, at the other extreme, bloated ones). I've listed some aspects I've learnt to pay attention to when thinking about how to structure my services.
1) Scaling characteristics:
When application components have very different scaling characteristics, it tends to make sense to break them out. That allows you to scale each component independently. If necessary, you could also use a different machine type for each component to help it perform better. For example, compute-intensive services have different characteristics from those that are network- or IO-intensive.
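As a rough sketch of why this matters (the workload names and multipliers below are illustrative assumptions, not a real deployment config), compute-bound and IO-bound components scale along different axes, which is much easier to express when they live in separate services:

```python
import os

def workers_for(workload: str) -> int:
    """Illustrative worker sizing for two hypothetical workload types."""
    cores = os.cpu_count() or 4
    if workload == "compute":
        # e.g. image resizing: CPU-bound, roughly one worker per core
        return cores
    if workload == "io":
        # e.g. a fan-out proxy: workers mostly wait on the network,
        # so you can run many more of them than you have cores
        return cores * 16
    raise ValueError(f"unknown workload: {workload}")

# Split into two services, each can be sized (and given a machine type)
# independently; in one process you'd be forced to compromise on both.
print(workers_for("compute"), workers_for("io"))
```

In a monolith, both workloads share one process and one fleet, so you end up provisioning for the worst case of each.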
2) Services with different availability tiers:
While I was at Amazon it was common to characterize services as Tier-1 (essential to the functioning of the site) or Tier-2 (could take a bit of downtime without affecting users). Given the scale at which Amazon operates, there were plenty of war stories related to this. Typically a single service handled both Tier-1 and Tier-2 traffic, and an issue on the Tier-2 side took the Tier-1 functionality down with it, leading to unhappy users. Breaking services up along availability SLAs was strongly encouraged.
To make things more interesting, this also extends to understanding your service's dependencies well. It makes no sense to depend on services with lower SLAs than the one you need to provide your clients. If your service has strict availability requirements, you can only take on dependencies that can provide those guarantees.
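A quick back-of-the-envelope sketch makes the dependency point concrete (the numbers here are made up for illustration): if your service hard-depends on several downstream services and has no fallbacks, its best-case availability is roughly the product of theirs, assuming independent failures.

```python
def composite_availability(dependency_availabilities):
    """Upper bound on a service's availability given hard dependencies
    (independent failures, no fallbacks or graceful degradation)."""
    result = 1.0
    for a in dependency_availabilities:
        result *= a
    return result

# A service targeting 99.9% that takes on five 99.5% dependencies:
deps = [0.995] * 5
print(f"{composite_availability(deps):.4f}")  # ~0.9752 - well below 99.9%
```

Five individually-reasonable 99.5% dependencies already cap you below 97.6%, which is why strict-SLA services have to be picky about what they call.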
3) Latency concerns:
Pretty often, when companies start transitioning from a large monolithic architecture to microservices, they go all out in making their services as fine-grained as possible (no point in half measures :-)). Sometimes this is the natural progression over a couple of years, as existing components are migrated to services and every new feature is wrapped up as a shiny little service. Being overzealous about microservices can mean each service has to make a flood of calls to its dependencies to serve a single request. Each call over the network adds a few milliseconds of latency to your application (see: Numbers everyone should know). If you're making a decent number of calls, this adds up; I've seen services making dozens of calls per request, and the latency ended up being terrible. There are options for tackling high service latencies in these scenarios: co-locating services on the same machines, batching calls, adding a cache layer, or pre-computing results, depending on your setup. In some cases it helps to merge services (or not break them up in the first place) if latency is a concern and other aspects don't strongly dictate a split.
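To sketch how those milliseconds add up (the 3 ms round-trip figure and the service names are assumptions for illustration): thirty sequential calls at 3 ms each cost ~90 ms before you've done any real work, while fanning out independent calls concurrently keeps you close to one round trip.

```python
import asyncio

ROUND_TRIP_MS = 3  # assumed per-call network overhead, not a measurement

async def call_dependency(name: str) -> str:
    # Stand-in for a real network call to a downstream service.
    await asyncio.sleep(ROUND_TRIP_MS / 1000)
    return f"{name}: ok"

async def sequential(deps):
    # One call after another: total latency ~ len(deps) * ROUND_TRIP_MS
    return [await call_dependency(d) for d in deps]

async def concurrent(deps):
    # Independent calls issued together: total latency ~ ROUND_TRIP_MS
    return await asyncio.gather(*(call_dependency(d) for d in deps))

deps = [f"service-{i}" for i in range(30)]
results = asyncio.run(concurrent(deps))
print(len(results))
```

Concurrency only helps when the calls are independent, though; chains of calls that each need the previous result still pay the full sequential cost, which is where batching, caching, or merging services come in.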
4) Service maintenance:
Each service you maintain carries a certain amount of overhead. You need to worry about deploying to it regularly with minimal downtime for users. Monitoring and alerts need to be set up as well. It's one more service on your on-call rotation (in other words, another service to wake you up at 3am). Depending on the number of services you spin up and the tools at your disposal, this can amount to a good deal of work. I've been on teams where someone had to spend a day or two every week just to ensure all our services were deployed worldwide in a sane fashion, and that was with a fairly mature deployment / monitoring / alerting ecosystem at the company.
In service design, as with most interesting things in life, there is no silver bullet. Requirements that matter in one context might not be important in another. I find it useful to consider the aspects listed above while designing new services. It is equally important, though, to keep in mind what you're trying to optimize for (speed of development, performance, scalability, robustness), as that helps you make the right choices along the way.