When I first encountered the Fallacies of Distributed Computing, I felt relieved. At last, all the thoughts that had been swirling around my head concerning the problems I’d had when building numerous distributed systems were there in one neat list. Articles that build on the basic list, such as this one by A Rotem-Gal-Oz, provide such clarity that you would think it impossible to make the same mistakes again.
So why is it that, time and time again, people design and build distributed systems as though the whole thing will be self-contained on a single server? Often the reason you’ll hear is “this system is too simple to have to worry about those things”.
So let’s look at these typical simple systems:
- Client Server: This may be desktop machines talking to a remote server, or web applications where users connect over the Internet. Multiple nodes have to talk to one another.
- N-Tier: Redundancy is likely a concern so a logical tier may be made up of multiple physical servers. Even more nodes need to talk to one another.
- Shared Database: In all but the most trivial situations data will be shared between users via a common database. This is a separate node in a distributed application. More nodes.
- Storage: Storage itself can be a further networked node. When high-performance databases use SANs, there are network connections between the database servers and the SAN devices. Even more nodes.
This list is quite an historical look at things. Today all modern applications are distributed:
- Classic dynamic web applications where users connect to servers over HTTP
- Single page applications where client-side libraries are regularly communicating with the server to update page fragments
- Mobile applications where the network between client and server is unreliable
- Microservices-based applications where almost all activity involves many services coordinating over a network
Building modern applications is complex, so as humans we have a natural tendency to create simple mental models. The models we make need to feel reasonable to the individual and are heavily based on previous experience. It is no coincidence that early remote procedure call frameworks attempted to simulate making in-process calls: it was a model that was well understood.
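To see why that model is so seductive, here is a minimal sketch (all names are hypothetical, and the “network” is faked for illustration) of the RPC proxy pattern: the caller sees a plain method, while every invocation actually crosses a transport that can fail or stall.

```python
class RemoteCalculator:
    """Client-side proxy: callers see an ordinary method, not a network hop."""

    def __init__(self, transport):
        # `transport` stands in for the real network layer (sockets, HTTP, ...)
        self._transport = transport

    def add(self, a, b):
        # Reads exactly like an in-process call, but every invocation
        # crosses the network and can time out, fail, or return slowly --
        # none of which the method signature admits.
        return self._transport("add", a, b)


def fake_transport(method, *args):
    # Simulated server for illustration; a real transport would serialize
    # the request and send it over the wire.
    if method == "add":
        return sum(args)
    raise ValueError(f"unknown method: {method}")


calc = RemoteCalculator(fake_transport)
result = calc.add(2, 3)  # looks local; the network is invisible to the caller
```

The comfort of that invisible network is precisely the problem: the well-understood model hides exactly the things the fallacies warn about.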
An interesting side effect of waterfall projects is that often the developer who built the system wasn’t the same person who actually had to get the thing working once it reached a staging or production environment with a realistic number of servers. The developer happily rolled off the project, having formed a mental model that didn’t actually work in practice.
And life is not always rosy in Agile projects…
The same mistakes can be made under the banner of YAGNI, “You Aren’t Gonna Need It”. I worry about YAGNI because it contains a kernel of truth, but it is often expressed in a manner that makes it very easy to get into trouble. This is particularly the case when it’s paired with other maxims like “the simplest thing that could possibly work”. You hear this a lot when delivery pressure increases, and when that happens, considering the fallacies is one of the first things to go. The team thinks only of the happy path, where latency is zero, the network is reliable and bandwidth is infinite… Oops.
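The gap between the happy path and reality can be sketched in a few lines. This example (names and the flaky operation are invented for illustration) contrasts a bare remote call that assumes the network is reliable with a defensive version that treats timeouts as a normal event, using bounded retries and a small backoff:

```python
import time

def flaky_remote_call(attempt_log, fail_times=2):
    # Simulated remote operation that fails the first `fail_times` calls,
    # standing in for a real network request on an unreliable network.
    attempt_log.append(1)
    if len(attempt_log) <= fail_times:
        raise TimeoutError("simulated network timeout")
    return "ok"

# Happy path: one attempt, no handling -- fine right up until the
# network isn't reliable:
#
#   result = flaky_remote_call(log)   # raises TimeoutError on a bad day

def call_with_retries(op, log, retries=3, backoff=0.01):
    # Defensive version: the unreliable network is part of the design,
    # not a surprise. Retries are bounded and backed off, and the final
    # failure is surfaced rather than swallowed.
    for attempt in range(retries):
        try:
            return op(log)
        except TimeoutError:
            if attempt == retries - 1:
                raise
            time.sleep(backoff * (attempt + 1))

log = []
result = call_with_retries(flaky_remote_call, log)  # succeeds on the third try
```

The defensive version is barely longer, but it forces the team to decide up front how many retries are acceptable and what happens when they run out, which is exactly the thinking the happy path skips.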
The saving grace of Agile projects is that you will see your mistakes early. The people who implemented the problems have to fix them, and they will throw away their old mental models and replace them with ones that are fit for purpose in the modern era of software applications.
Now, if only I could speed up time…