Scalable is the term used in software engineering to describe software systems that can accommodate growth. In this first part of the series, we will explore what precisely is meant by the ability to scale — known, not surprisingly, as scalability. We’ll also describe a few examples that put hard numbers on the capabilities and characteristics of contemporary applications and give a brief history of the origins of the massive systems we routinely build today. Finally, we will describe two general principles for achieving scalability that will recur in various forms throughout the rest of this series of articles and examine the indelible link between scalability and cost.
What is Scalability?
Intuitively, scalability is a pretty straightforward concept. If we ask Wikipedia for a definition, it tells us “scalability is the property of a system to handle a growing amount of work by adding resources to the system”. We all know how we scale a highway system — we add more traffic lanes so it can handle a greater number of vehicles. Some of my favorite people know how to scale beer production — they add more capacity in terms of the number and size of brewing vessels, the number of staff to perform and manage the brewing process, and the number of kegs they can fill with tasty fresh brews. Think of any physical system — a transit system, an airport, elevators in a building — and how we increase capacity is pretty obvious.
Unlike physical systems, software is somewhat amorphous. You cannot point at it, see it, or touch it, and external observation gives little sense of how it behaves internally. It’s a digital artifact. At its core, the streams of 1s and 0s that make up executable code and data are hard for anyone to tell apart. So, what does scalability mean in terms of a software system?
Put very simply, and without getting into definition wars, scalability defines a software system’s capability to handle growth in some dimension of its operations. Examples of operational dimensions are:
the number of simultaneous user or external (e.g. sensor) requests a system can process
the amount of data a system can effectively process and manage
the value that can be derived from the data a system stores
For example, imagine a major supermarket chain is rapidly opening new stores and increasing the number of self-checkout kiosks in every store. This requires the core supermarket software systems to:
Handle increased volume from item sale scanning without decreased response time. Instantaneous responses to item scans are necessary to keep customers happy.
Process and store the greater data volumes generated from increased sales. This data is needed for inventory management, accounting, planning and likely many other functions.
Derive ‘real-time’ (e.g. hourly) sales data summaries from each store, region and country and compare them to historical trends. This trend data can help highlight unusual events in regions (e.g. unexpected weather conditions, large crowds at events, etc.) and help the affected stores respond quickly.
Evolve the stock ordering prediction subsystem to be able to correctly anticipate sales (and hence the need for stock reordering) as the number of stores and customers grows.
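To make the third requirement a little more concrete, here is a minimal Python sketch of hourly sales summaries compared against a historical baseline. It is an illustration only: the record layout (store id, timestamp, amount), the function names `hourly_totals` and `flag_unusual`, and the 50% deviation threshold are all assumptions made for this example, not part of any real supermarket system.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def hourly_totals(sales):
    """Sum sale amounts per (store_id, hour) bucket.

    `sales` is an iterable of (store_id, timestamp, amount) tuples
    (a hypothetical record layout chosen for this sketch).
    """
    totals = defaultdict(float)
    for store_id, timestamp, amount in sales:
        # Truncate the timestamp to the start of its hour.
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        totals[(store_id, hour)] += amount
    return dict(totals)


def flag_unusual(totals, history, threshold=0.5):
    """Return buckets that deviate from the historical mean by more than
    `threshold`, expressed as a fraction of that mean.

    `history` maps (store_id, hour_of_day) to a list of past hourly totals.
    """
    unusual = []
    for (store_id, hour), total in sorted(totals.items()):
        past = history.get((store_id, hour.hour))
        if not past:
            continue  # no baseline to compare against
        baseline = mean(past)
        if baseline > 0 and abs(total - baseline) / baseline > threshold:
            unusual.append((store_id, hour, total, baseline))
    return unusual


sales = [
    ("store-1", datetime(2024, 1, 1, 9, 5), 150.0),
    ("store-1", datetime(2024, 1, 1, 9, 40), 50.0),
    ("store-2", datetime(2024, 1, 1, 9, 10), 20.0),
]
history = {("store-1", 9): [90.0, 110.0], ("store-2", 9): [20.0, 20.0]}

totals = hourly_totals(sales)
unusual = flag_unusual(totals, history)
```

On a single machine this loop is trivial. The scalability question is what happens to it as the number of stores, kiosks and sales records grows: the work in `hourly_totals` grows with every scanned item, which is exactly the kind of growth the requirements above describe.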
These dimensions are effectively the scalability requirements of a system. If, over a year, the supermarket chain opens 100 new stores and grows sales by 400 times (some of the new stores are big), then the software system needs to scale to provide the processing capacity the supermarket needs to operate efficiently. If the systems don’t scale, we could lose sales because customers are unhappy. We might hold stock that will not be sold quickly, increasing costs. We might miss opportunities to increase sales by responding to local circumstances with special offerings. All of these reduce customer satisfaction and profits. None are good for business.
Successfully scaling is therefore crucial for our imaginary supermarket’s business growth, and is in fact t