"After all, the engineers only needed to refuse to fix anything, and modern industry would grind to a halt." -Michael Lewis
This blog is a place to learn about and better understand what it takes to scale backend systems. By scale, we mean that the system keeps performing more or less "normally" (that is, the business requirements, performance, availability, and so on all stay within a reasonable SLA/SLO) even as load keeps growing. In most cases, more load means more users, though sometimes it means the same users using the system more heavily, or pushing it to its limits by using it in ways it was never designed for.
Scaling a system is a very hard thing to do, and there is much less room for error than in just getting your solution to work. Things that perform well at low scale can quickly break down at larger scale, and designs that satisfy an MVP for a few hundred users can render the service completely unusable for tens of thousands.
Further, many of the tools that enable scalability in today's technology landscape are deceptively complicated: use them carelessly and you will back yourself into a corner where maintainability becomes a nightmare. As practitioners know, the most expensive part of scaling most systems is often the engineering effort itself. Maximizing both scalability and maintainability is critical if you want to enable the business to succeed.
Nick works as a backend software engineer in the Pacific Northwest, and specializes in scaling complex, distributed systems. He can be contacted via email.