From working as an IT consultant and a sysadmin for a variety of media production companies, I learned quickly that a problem is rarely as simple as it first appears, and yet few are so complex that they can’t be broken down into components to produce a net solution. That’s the start of system design: the process of looking at the needs of an organization and designing a schema of data processing to produce the desired results.
System design is a lot like Object-Oriented Programming (OOP), and in fact, is often implemented as Object-Oriented Design (OOD) on the premise that real-world systems can be emulated in literal code. If you are, for instance, designing a pipeline for a restaurant, you can correlate the parts in your OOD to Python classes representing everything from the customer, waiter, and chef, to the ingredients of the dish, the billing process, and so on. With this kind of abstraction, the waiter isn’t one specific waiter, nor is the customer a specific customer, nor is a recipe a specific recipe.
Once objects have been defined, they serve as templates for any real-world object that fits. Ideally, you should be able to run a virtual instance of your system to expose bugs and to test redundancy, availability, and reliability well before instantiating it in the physical world.
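As a rough sketch of the restaurant example, the classes below act as templates rather than specific people or dishes. The class names follow the text, but every field, method, and sample value (`price`, `cook`, `take_order`, "Sam", "Tomato soup") is an assumption made up for illustration, and the code assumes Python 3.9 or later.

```python
from dataclasses import dataclass, field


@dataclass
class Recipe:
    """Template for any dish, not one specific dish."""
    name: str
    ingredients: list[str]
    price: float


@dataclass
class Customer:
    """Template for any customer."""
    name: str


@dataclass
class Chef:
    """Template for any chef who can prepare a recipe."""
    name: str

    def cook(self, recipe: Recipe) -> str:
        return f"{recipe.name} prepared by {self.name}"


@dataclass
class Waiter:
    """Template for any waiter; also stands in for the billing process."""
    name: str
    bills: dict[str, float] = field(default_factory=dict)

    def take_order(self, customer: Customer, recipe: Recipe, chef: Chef) -> str:
        dish = chef.cook(recipe)
        # Add the price of the dish to whatever the customer already owes.
        self.bills[customer.name] = self.bills.get(customer.name, 0.0) + recipe.price
        return dish


# Hypothetical instances: the names and values are illustrative only.
soup = Recipe("Tomato soup", ["tomato", "basil", "cream"], 6.50)
waiter = Waiter("Sam")
served = waiter.take_order(Customer("Alex"), soup, Chef("Robin"))
print(served)
print(waiter.bills)
```

Any number of customers, waiters, or recipes can be created from the same templates, which is what makes a virtual run of the system possible.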
When tasked with designing a system for any purpose, you must first determine what components are involved. A high-level view is best, in the beginning, to protect yourself from getting mired in minutiae. Any given component of a design is bound to have its own smaller components, and while it’s important to be willing to break objects apart and design how each one works, initially you only need broad definitions. As you look at the problem being solved, create objects representing the things you know you’ll need for your solution to work. For instance, if you were asked to design a load-balanced website infrastructure, you know that a client computer is one object class, a load balancer is another, and a web server is another. These may not be all of the objects required, but it’s a good start, and at least allows for a basic diagram.
Here is an over-simplified view of what has been defined so far:
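The same three objects can also be roughed out in Python. This is only a sketch: the method names (`handle`, `forward`, `request`) are hypothetical, and round-robin distribution is an assumption, since nothing so far says how the load balancer picks a server.

```python
from itertools import cycle


class WebServer:
    """Template for any web server behind the load balancer."""

    def __init__(self, name):
        self.name = name

    def handle(self, request):
        return f"{self.name} served {request}"


class LoadBalancer:
    """Passes each request to the next server in the pool (round-robin, assumed)."""

    def __init__(self, servers):
        self._pool = cycle(servers)

    def forward(self, request):
        return next(self._pool).handle(request)


class Client:
    """Template for any client computer."""

    def __init__(self, name):
        self.name = name

    def request(self, balancer, path):
        return balancer.forward(f"{path} for {self.name}")


balancer = LoadBalancer([WebServer("web1"), WebServer("web2")])
client = Client("laptop")
responses = [client.request(balancer, "/index.html") for _ in range(3)]
for line in responses:
    print(line)
```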
Inputs and outputs
With the most important components defined, you can start looking at the inputs and outputs involved. You must determine what data each object class provides and what data each object class accepts. For instance, a load-balancing proxy server accepts TCP from a network connection, but "network connection" is a broad term nowadays, so it introduces the opportunity to create granular definitions in your diagram.
In a large organization, you may have network connections originating from inside the organization as well as from outside, so you may have reason to define those as unique instances of the client class:
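One possible way to capture that distinction in code is to keep a single client class and record where each instance originates. In the sketch below, the `Origin` enum, the `Request` type, and the pool names are all hypothetical, and routing by origin is an assumed policy chosen only to show an input being accepted and an output being produced.

```python
from dataclasses import dataclass
from enum import Enum


class Origin(Enum):
    INTERNAL = "internal"
    EXTERNAL = "external"


@dataclass(frozen=True)
class Request:
    """What a Client provides and a LoadBalancer accepts."""
    origin: Origin
    path: str


@dataclass(frozen=True)
class Client:
    """One class; internal and external clients are just different instances."""
    name: str
    origin: Origin

    def send(self, path: str) -> Request:
        return Request(origin=self.origin, path=path)


class LoadBalancer:
    """Accepts a Request and outputs the name of the pool that should handle it."""

    def route(self, request: Request) -> str:
        # Assumed policy: internal and external traffic go to separate pools.
        if request.origin is Origin.INTERNAL:
            return "intranet-pool"
        return "public-pool"


office_pc = Client("office-pc", Origin.INTERNAL)
visitor = Client("visitor", Origin.EXTERNAL)
balancer = LoadBalancer()

print(balancer.route(office_pc.send("/wiki")))
print(balancer.route(visitor.send("/index.html")))
```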
More complex systems have a greater diversity of inputs and outputs, but these inputs and outputs help you define how components relate to one another. This is true for all manner of systems you might find yourself designing:
- A web application that requires front-end code as well as a database back end is bound to have more components than a static web page.
- A physical art exhibit that encourages viewer interaction has more input than a painting behind glass.
- Recipes on a gourmet menu have more ingredient dependencies than a short order cook’s recipes at a diner.
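To illustrate how inputs and outputs define relationships, the sketch below lists what each component of a small web application accepts and provides, then links every output to the components that accept it. The component names and data labels are invented for the example.

```python
# Hypothetical components of a small web application.
components = {
    "browser": {"accepts": {"html"}, "provides": {"http_request"}},
    "front_end": {
        "accepts": {"http_request", "rows"},
        "provides": {"html", "sql_query"},
    },
    "database": {"accepts": {"sql_query"}, "provides": {"rows"}},
}


def connections(components):
    """Return (producer, consumer, data) for every output that matches an input."""
    links = []
    for producer, p_spec in components.items():
        for consumer, c_spec in components.items():
            if producer == consumer:
                continue
            for data in sorted(p_spec["provides"] & c_spec["accepts"]):
                links.append((producer, consumer, data))
    return links


links = connections(components)
for producer, consumer, data in links:
    print(f"{producer} -> {consumer}: {data}")
```

Adding a component means adding one entry with its inputs and outputs; its relationships to the rest of the system fall out of the matching.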
After classifying objects and gaining a full understanding of how they relate to one another, you can start a visual diagram in earnest. At this stage, the details do matter and should be accounted for, but abstraction is still key. You are not mapping topography or creating a flowchart, but classifying the object types and how they fit together.
I use the open source diagramming tool Dia for quick and official visualization while relying upon Inkscape for extrapolated versions for upper management. For instance, the system design of your load-balanced web server might be codified like this:
Extrapolated, that translates to:
The Client class that served as a template for any number of networked computers translates, in a visual implementation of your design, to exactly that: any number of networked computers. Classes in Object-Oriented Design construct instances of the objects they define. They are not representative of the resources required to fulfill them, only an indicator of what kinds of resources are required or expected.
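In code, the same idea looks roughly like this: one class, and as many instances as a given deployment calls for. The `build_clients` helper and the hostnames are hypothetical.

```python
class Client:
    """Template for any networked computer; it says nothing about how many exist."""

    def __init__(self, hostname):
        self.hostname = hostname


def build_clients(count):
    """Construct `count` Client instances from the one class definition."""
    return [Client(f"client-{n}") for n in range(1, count + 1)]


# The same template serves a small office or a whole campus.
small_office = build_clients(3)
campus = build_clients(250)

print(len(small_office), len(campus))
print(small_office[0].hostname, campus[-1].hostname)
```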
Not every system design is transformed into code for testing, but knowing that it’s possible is a powerful thing. When you believe you have completed your design, you should be able to, either as a mental exercise or as executable source code, virtualize your system, and witness that it functions.
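As an example of what such a virtual run might look like, the sketch below simulates an outage in the load-balanced design and checks that requests still get answered. The failover behavior, the `healthy` flag, and the method names are assumptions for illustration, not part of any particular load balancer.

```python
class WebServer:
    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy

    def handle(self, path):
        if not self.healthy:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name}: 200 OK {path}"


class LoadBalancer:
    def __init__(self, servers):
        self.servers = list(servers)
        self._next = 0

    def forward(self, path):
        # Try each server at most once, starting where the last request left off.
        for _ in range(len(self.servers)):
            server = self.servers[self._next]
            self._next = (self._next + 1) % len(self.servers)
            try:
                return server.handle(path)
            except ConnectionError:
                continue
        raise ConnectionError("no healthy web servers available")


web1, web2 = WebServer("web1"), WebServer("web2")
balancer = LoadBalancer([web1, web2])

# Normal operation: the first request lands on web1.
first = balancer.forward("/")
assert first.startswith("web1")

# Simulate an outage: every request should now be answered by web2.
web1.healthy = False
results = [balancer.forward("/") for _ in range(3)]
assert all(r.startswith("web2") for r in results)

# With both servers down, the design has no redundancy left.
web2.healthy = False
try:
    balancer.forward("/")
    total_outage_detected = False
except ConnectionError:
    total_outage_detected = True
assert total_outage_detected
print("virtual run passed")
```

A run like this exposes the obvious weakness: two web servers tolerate one failure, and the load balancer itself is a single point of failure that the sketch does not model.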
This kind of exercise is key to learning how individual components of system design fit together into "the big picture," and how taking a thorough look at all of a complex system’s moving parts can help you identify oversights or weaknesses before you deploy.
An excellent resource for learning system design in detail is the ebook System Design Primer by Donne Martin. It’s published under a Creative Commons Attribution (CC BY 4.0) license and provides plenty of scenarios (and solutions) to practice on and learn from.