A Philosophy of Software Design by John Ousterhout - All about complexity
Writing computer software is one of the purest creative activities in the history of the human race. Programmers aren’t bound by practical limitations such as the laws of physics; we can create exciting virtual worlds with behaviors that could never exist in the real world. Programming doesn’t require great physical skill or coordination, like ballet or basketball. All programming requires is a creative mind and the ability to organize your thoughts. If you can visualize a system, you can probably implement it in a computer program.
This book is a must-read for programmers who want to improve their design skills and build better, longer-lasting software. Its principles apply at both the architectural level and the level of code organization.
Contents are copied from the book and the author’s talk at Google.
The greatest limitation in writing software is our ability to understand the systems we are creating.
It’s all about complexity
If we want to make it easier to write software, so that we can build more powerful systems more cheaply, we must find ways to make software simpler.
This book is about one thing: complexity. Dealing with complexity is the most important challenge in software design. It is what makes systems hard to build and maintain, and it often makes them slow as well.
Furthermore, the investments you make in good design will pay off quickly. The modules you defined carefully at the beginning of a project will save you time later as you reuse them over and over. The clear documentation that you wrote six months ago will save you time when you return to the code and add a new feature. The time you spent honing your design skills will also pay for itself: as your skills and experience grow, you will find that you can produce good designs more and more quickly. Good design doesn’t really take much longer than quick-and-dirty design, once you know how.
The reward for being a good designer is that you get to spend a larger fraction of your time in the design phase, which is fun. Poor designers spend most of their time chasing bugs in complicated and brittle code. If you improve your design skills, not only will you produce higher quality software more quickly, but the software development process will be more enjoyable.
The nature of complexity
It is easier to tell whether a design is simple than it is to create a simple design, but once you can recognize that a system is too complicated, you can use that ability to guide your design philosophy towards simplicity.
Complexity is anything related to the structure of a software system that makes it hard to understand and modify the system.
Complexity is more apparent to readers than to writers. If you write a piece of code and it seems simple to you, but other people think it is complex, then it is complex. Your job as a developer is not just to create code that you can work with easily, but to create code that others can also work with easily. Software should be designed for ease of reading, not ease of writing.
Symptoms of complexity:
- Change amplification: a seemingly simple change requires code modifications in many different places.
- Cognitive load: how much a developer needs to know in order to complete a task. A higher cognitive load means that developers have to spend more time learning the required information, and there is a greater risk of bugs because they have missed something important. Sometimes an approach that requires more lines of code is actually simpler, because it reduces cognitive load.
- Unknown unknowns: it is not obvious which pieces of code must be modified to complete a task, or what information a developer must have to carry out the task successfully.
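As a small illustration of change amplification, here is a hypothetical sketch (the constant and function names are invented for this example). If every page hard-coded the banner color, changing it would mean editing each page. Keeping the decision in one place reduces the change to a single line:

```python
# Hypothetical page renderers; all names are illustrative assumptions.

# The banner color is decided in exactly one place.
BANNER_COLOR = "#003366"


def render_home() -> str:
    # Refers to the shared constant instead of repeating the literal.
    return f'<div style="background:{BANNER_COLOR}">Home</div>'


def render_about() -> str:
    return f'<div style="background:{BANNER_COLOR}">About</div>'
```

The same constant also helps with unknown unknowns: a developer changing the color does not need to discover which pages use it.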
Complexity is caused by two things: dependencies and obscurity. Complexity isn’t caused by a single catastrophic error; it accumulates in lots of small chunks. Once complexity has accumulated, it is hard to eliminate, since fixing a single dependency or obscurity will not, by itself, make a big difference.
Strategic vs. tactical programming
If you want a good design, you must take a more strategic approach where you invest time to produce clean designs and fix problems. The strategic approach produces better designs and is actually cheaper than the tactical approach over the long run.
Once you start down the tactical path, it’s difficult to change. Tactical programming makes it nearly impossible to produce a good system design. If you are in a company leaning in this direction, you should realize that once a code base turns to spaghetti, it is nearly impossible to fix. You will probably pay high development costs for the life of the product.
The first step towards becoming a good software designer is to realize that working code isn’t enough. It’s not acceptable to introduce unnecessary complexities in order to finish your current task faster. The most important thing is the long-term structure of the system. Most of the code in any system is written by extending the existing code base, so your most important job as a developer is to facilitate those future extensions. Thus, you should not think of “working code” as your primary goal, though of course your code must work. Your primary goal must be to produce a great design, which also happens to work. This is strategic programming.
Once you start delaying design improvements, it’s easy for the delays to become permanent and for your culture to slip into the tactical approach. The longer you wait to address design problems, the bigger they become; the solutions become more intimidating, which makes it easy to put them off even more.
One of the most important factors for success of a company is the quality of its engineers. The best way to lower development costs is to hire great engineers: they don’t cost much more than mediocre engineers but have tremendously higher productivity. However, the best engineers care deeply about good design. If your code base is a wreck, word will get out, and this will make it harder for you to recruit. As a result, you are likely to end up with mediocre engineers. This will increase your future costs and probably cause the system structure to degrade even more.
Modules should be deep
The best modules are those whose interfaces are much simpler than their implementations. Such modules have two advantages. First, a simple interface minimizes the complexity that a module imposes on the rest of the system. Second, if a module is modified in a way that does not change its interface, then no other module will be affected by the modification.
Deep classes are more efficient than shallow ones, because they get more work done for each method call. Shallow classes result in more layer crossings, and each layer crossing adds overhead.
A deep module maximizes the amount of complexity that is concealed.
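A minimal sketch of what a deep module can look like, assuming a hypothetical `SettingsStore` class (not from the book): the interface is two methods, while the implementation hides file parsing, the missing-file case, and atomic writes.

```python
import json
import os
import tempfile


class SettingsStore:
    """Hypothetical deep module: a two-method interface over file persistence."""

    def __init__(self, path: str) -> None:
        self._path = path

    def get(self, key: str, default=None):
        """Return the stored value for key, or default if it was never set."""
        return self._load().get(key, default)

    def put(self, key: str, value) -> None:
        """Store value under key, persisting it to disk."""
        data = self._load()
        data[key] = value
        self._write_atomically(data)

    def _load(self) -> dict:
        # A missing file simply means "no settings yet"; callers never see it.
        try:
            with open(self._path, encoding="utf-8") as f:
                return json.load(f)
        except FileNotFoundError:
            return {}

    def _write_atomically(self, data: dict) -> None:
        # Write to a temporary file in the same directory, then rename it over
        # the target so readers never observe a half-written file.
        directory = os.path.dirname(os.path.abspath(self._path))
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f)
        os.replace(tmp_path, self._path)
```

Because callers only see `get` and `put`, the storage format or write strategy could change later without touching any other module.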
Pull complexity downwards
It is more important for a module to have a simple interface than a simple implementation.
When developing a module, look for opportunities to take a little bit of extra suffering upon yourself in order to reduce the suffering of your users.
General-purpose modules are deeper
The module’s functionality should reflect your current needs, but its interface should not. Instead, the interface should be general enough to support multiple uses.
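One way to picture this, as a hedged sketch with invented names (`TextBuffer`, `backspace`): the text class offers general `insert` and `delete` operations, and a feature such as backspace is built on top of them in the user-interface code rather than baked into the class.

```python
class TextBuffer:
    """Hypothetical text storage with a general-purpose interface."""

    def __init__(self, text: str = "") -> None:
        self._text = text

    def insert(self, position: int, new_text: str) -> None:
        """Insert new_text so that it starts at index position."""
        self._text = self._text[:position] + new_text + self._text[position:]

    def delete(self, start: int, end: int) -> None:
        """Remove the characters in the half-open range [start, end)."""
        self._text = self._text[:start] + self._text[end:]

    def __str__(self) -> str:
        return self._text


def backspace(buffer: TextBuffer, cursor: int) -> int:
    """UI-level feature: delete the character before the cursor.

    Returns the new cursor position. At the start of the text the range is
    empty, so nothing is deleted.
    """
    start = max(cursor - 1, 0)
    buffer.delete(start, cursor)
    return start
```

The buffer knows nothing about keys or cursors, so the same two methods could also serve a delete key, cut, or find-and-replace.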
One of the most important elements of software design is determining who needs to know what, and when. When the details are important, it is better to make them explicit and as obvious as possible.
Avoid configuration parameters
You should avoid configuration parameters as much as possible. Before exporting a configuration parameter, ask yourself: “will users (or higher-level modules) be able to determine a better value than we can determine here?” When you do create configuration parameters, see if you can compute reasonable defaults automatically, so users will only need to provide values under exceptional conditions. Ideally, each module should solve a problem completely; configuration parameters result in an incomplete solution, which adds to system complexity.
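To make the "compute reasonable defaults automatically" idea concrete, here is a rough sketch. The `RetryPolicy` class, the factor of 3, and the 1-second starting guess are all assumptions chosen for illustration. The module derives its timeout from the response times it observes, and an explicit value is only needed in exceptional cases.

```python
import statistics
from typing import Optional


class RetryPolicy:
    """Hypothetical policy that derives its timeout from measurements."""

    def __init__(self, timeout_seconds: Optional[float] = None) -> None:
        # An explicit value is accepted, but most callers should not need it.
        self._override = timeout_seconds
        self._samples = []

    def record_response_time(self, seconds: float) -> None:
        """Remember how long a successful request took."""
        self._samples.append(seconds)

    def timeout(self) -> float:
        """Return how long to wait before retrying a request."""
        if self._override is not None:
            return self._override
        if not self._samples:
            return 1.0  # assumed initial guess before any measurements exist
        # Assumed heuristic: wait a few times longer than a typical response.
        return 3 * statistics.mean(self._samples)
```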
Better together or better apart?
If the components are truly independent, then separation is good: it allows the developer to focus on a single component at a time, without being distracted by the other components. On the other hand, if there are dependencies between the components, then separation is bad: developers will end up flipping back and forth between the components. Even worse, they may not be aware of the dependencies, which can lead to bugs.
When designing methods, the most important goal is to provide clean and simple abstractions. Each method should do one thing and do it completely. The method should have a clean and simple interface, so that users don’t need to have much information in their heads in order to use it correctly. The method should be deep: its interface should be much simpler than its implementation. If a method has all of these properties, then it probably doesn’t matter whether it is long or not.
Define errors out of existence
The exceptions thrown by a class are part of its interface; classes with lots of exceptions have complex interfaces, and they are shallower than classes with fewer exceptions. An exception is a particularly complex element of an interface.
Throwing exceptions is easy; handling them is hard. Thus, the complexity of exceptions comes from the exception handling code. The best way to reduce the complexity damage caused by exception handling is to reduce the number of places where exceptions have to be handled.
The best way to reduce bugs is to make software simpler.
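A small sketch of defining an error out of existence, using a hypothetical `substring` helper: instead of raising when the indices fall outside the text, the function is defined to return whatever part of the requested range overlaps the text, so callers have nothing to handle.

```python
def substring(text: str, start: int, end: int) -> str:
    """Return the characters of text in [start, end).

    Out-of-range indices are not an error: the range is clamped to the
    bounds of the text, and an inverted range yields an empty string.
    """
    start = max(0, min(start, len(text)))
    end = max(start, min(end, len(text)))
    return text[start:end]
```

Whether clamping is the right semantics depends on the callers; the point of the sketch is only that a slightly broader definition of the normal case can remove an exception from the interface.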
Design special cases out of existence
Special cases should be eliminated wherever possible. The best way to do this is by designing the normal case in a way that automatically handles the special cases without any extra code.
Special cases of any form make code harder to understand and increase the likelihood of bugs.
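As an illustration (the `Selection` type and `delete_selection` function are invented for this sketch), "nothing selected" can be represented as an empty range rather than a separate state. The normal-case code then handles it with no extra check:

```python
from dataclasses import dataclass


@dataclass
class Selection:
    """Hypothetical selection covering the half-open range [start, end)."""

    start: int = 0
    end: int = 0  # start == end means nothing is selected


def delete_selection(text: str, selection: Selection) -> str:
    """Return text with the selected characters removed.

    There is no "is anything selected?" test: an empty selection removes
    zero characters through the same code path.
    """
    return text[:selection.start] + text[selection.end:]
```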
Write comments
Developers should be able to understand the abstraction provided by a module without reading any code other than its externally visible declarations.
The process of writing comments, if done correctly, will actually improve a system’s design. Conversely, a good software design loses much of its value if it is poorly documented.
The reason for writing comments is that statements in a programming language can’t capture all of the important information that was in the mind of the developer when the code was written. The guiding principle is that comments should describe things that aren’t obvious from the code: their purpose is to capture information that was in the mind of the designer but couldn’t be represented in the code. Without documentation, future developers will have to re-derive or guess at the developer’s original knowledge; this will take additional time, and there is a risk of bugs if the new developer misunderstands the original designer’s intentions.
Comments augment the code by providing information at a different level of detail. Some comments provide information at a lower, more detailed, level than the code; these comments add precision by clarifying the exact meaning of the code. Other comments provide information at a higher, more abstract, level than the code; these comments offer intuition, such as the reasoning behind the code, or a simpler and more abstract way of thinking about the code.
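The two levels can be seen side by side in a short, hypothetical function (the name and default values are assumptions for illustration). The docstring adds precision the signature cannot express, such as units and whether counting starts at 0 or 1, while the inline comment gives the reasoning behind the formula:

```python
def next_retry_delay(attempt: int, base_seconds: float = 0.5,
                     cap_seconds: float = 30.0) -> float:
    """Return how long to wait, in seconds, before retry number `attempt`.

    attempt is 1-based: the first retry is attempt 1.
    The result never exceeds cap_seconds.
    """
    # Double the delay on every attempt so that a struggling server gets
    # progressively more time to recover instead of being hit at a fixed rate.
    return min(cap_seconds, base_seconds * 2 ** (attempt - 1))
```

Neither comment repeats what the code already says; each records something a reader would otherwise have to guess.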
Modify existing code
Ideally, when you have finished with each change, the system will have the structure it would have had if you had designed it from the start with that change in mind. To achieve this goal, you must resist the temptation to make a quick fix. Instead, think about whether the current system design is still the best one, in light of the desired change. If not, refactor the system so that you end up with the best possible design. With this approach, the system design improves with every modification. If you’re not making the design better, you are probably making it worse.
Consistency
If a system is consistent, it means that similar things are done in similar ways, and dissimilar things are done in different ways. Consistency creates cognitive leverage: once you have learned how something is done in one place, you can use that knowledge to immediately understand other places that use the same approach. If a system is not implemented in a consistent fashion, developers must learn about each situation separately.
The more nit-picky that code reviewers are, the more quickly everyone on the team will learn the conventions, and the cleaner the code will be.
Consistency is another example of the investment mindset. It will take a bit of extra work to ensure consistency: work to decide on conventions, work to create automated checkers, work to look for similar situations to mimic in new code, and work in code reviews to educate the team. The return on this investment is that your code will be more obvious. Developers will be able to understand the code’s behavior more quickly and accurately, and this will allow them to work faster, with fewer bugs.
Design for performance
It’s tempting to rush off and start making performance tweaks, based on your intuitions about what is slow. Don’t do this! Programmers’ intuitions about performance are unreliable. This is true even for experienced developers. If you start making changes based on intuition, you’ll waste time on things that don’t actually improve performance, and you’ll probably make the system more complicated in the process.
Design it twice
Designing software is hard, so it’s unlikely that your first thoughts about how to structure a module or system will produce the best design. You’ll end up with a much better result if you consider multiple options for each major design decision: design it twice.
Try to pick approaches that are radically different from each other; you’ll learn more that way. Even if you are certain that there is only one reasonable approach, consider a second design anyway, no matter how bad you think it will be. It will be instructive to think about the weaknesses of that design and contrast them with the features of other designs. After you have roughed out the designs for the alternatives, make a list of the pros and cons of each one.
It is also worth considering other factors:
- Does one alternative have a simpler interface than another?
- Is one interface more general-purpose than another?
- Does one interface enable a more efficient implementation than another?
- Is one interface easier to use for higher-level software?
Once you have compared alternative designs, you will be in a better position to identify the best design. The best choice may be one of the alternatives, or you may discover that you can combine features of multiple alternatives into a new design that is better than any of the original choices.
Smart people
The design-it-twice principle is sometimes hard for really smart people to embrace. When they are growing up, smart people discover that their first quick idea about any problem is sufficient for a good grade; there is no need to consider a second or third possibility. This makes it easy to develop bad work habits. However, as these people get older, they get promoted into environments with harder and harder problems. Eventually, everyone reaches a point where your first ideas are no longer good enough; if you want to get really great results, you have to consider a second possibility, or perhaps a third, no matter how smart you are. The design of large software systems falls in this category: no-one is good enough to get it right with their first try.
Unfortunately, I often see smart people who insist on implementing the first idea that comes to mind, and this causes them to underperform their true potential (it also makes them frustrating to work with). Perhaps they subconsciously believe that “smart people get it right the first time,” so if they try multiple designs it would mean they are not smart after all. This is not the case. It isn’t that you aren’t smart; it’s that the problems are really hard! Furthermore, that’s a good thing: it’s much more fun to work on a difficult problem where you have to think carefully, rather than an easy problem where you don’t have to think at all.
The design-it-twice approach not only improves your designs, but it also improves your design skills. The process of devising and comparing multiple approaches will teach you about the factors that make designs better or worse. Over time, this will make it easier for you to rule out bad designs and hone in on really great ones.
Process
Because software is so malleable, software design is a continuous process that spans the entire lifecycle of a software system; this makes software design different from the design of physical systems such as buildings, ships, or bridges.
The waterfall model rarely works well for software. Software systems are intrinsically more complex than physical systems; it isn’t possible to visualize the design for a large software system well enough to understand all of its implications before building anything. As a result, the initial design will have many problems. The problems do not become apparent until implementation is well underway.
Incremental development means that software design is never done. Design happens continuously over the life of a system: developers should always be thinking about design issues. Incremental development also means continuous redesign. The initial design for a system or component is almost never the best one; experience inevitably shows better ways to do things. As a software developer, you should always be on the lookout for opportunities to improve the design of the system you are working on, and you should plan on spending some fraction of your time on design improvements.
It isn’t possible to visualize a complex system well enough at the outset of a project to determine the best design. The best way to end up with a good design is to develop a system in increments, where each increment adds a few new abstractions and refactors existing abstractions based on experience.
Test-driven development (TDD)
The problem with test-driven development is that it focuses attention on getting specific features working, rather than finding the best design. This is tactical programming pure and simple, with all of its disadvantages. Test-driven development is too incremental: at any point in time, it’s tempting to just hack in the next feature to make the next test pass. There’s no obvious time to design, so it’s easy to end up with a mess.
One place where it makes sense to write the tests first is when fixing bugs. Before fixing a bug, write a unit test that fails because of the bug. Then fix the bug and make sure that the unit test now passes. This is the best way to make sure you really have fixed the bug. If you fix the bug before writing the test, it’s possible that the new unit test doesn’t actually trigger the bug, in which case it won’t tell you whether you really fixed the problem.
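The bug-fix workflow might look like the following sketch, which uses an invented `average` function and an assumed bug (a crash on empty input). The test is written first and seen to fail against the buggy code, and only then is the fix added:

```python
import unittest


def average(values):
    """Return the arithmetic mean of values, or 0.0 for an empty sequence."""
    # The fix. Before it, an empty sequence reached the division below and
    # raised ZeroDivisionError (the hypothetical bug in this example).
    if not values:
        return 0.0
    return sum(values) / len(values)


class AverageRegressionTest(unittest.TestCase):
    def test_empty_input_returns_zero(self):
        # Written before the fix and observed to fail, which confirms that
        # the test really triggers the bug.
        self.assertEqual(average([]), 0.0)

    def test_normal_input_is_unchanged(self):
        self.assertEqual(average([1, 2, 3]), 2.0)
```

The tests can be run with `python -m unittest` from the directory containing the file.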
Summary
Here are the most important software design principles discussed in this book:
- Complexity is incremental: you have to sweat the small stuff.
- Working code isn’t enough.
- Make continual small investments to improve system design.
- Modules should be deep.
- Interfaces should be designed to make the most common usage as simple as possible.
- It’s more important for a module to have a simple interface than a simple implementation.
- General-purpose modules are deeper.
- Separate general-purpose and special-purpose code.
- Different layers should have different abstractions.
- Pull complexity downward.
- Define errors (and special cases) out of existence.
- Design it twice.
- Comments should describe things that are not obvious from the code.
- Software should be designed for ease of reading, not ease of writing.
- The increments of software development should be abstractions, not features.
Here are a few of the most important red flags discussed in this book. The presence of any of these symptoms in a system suggests that there is a problem with the system’s design:
- Shallow module: the interface for a class or method isn’t much simpler than its implementation.
- Information leakage: a design decision is reflected in multiple modules.
- Temporal decomposition: the code structure is based on the order in which operations are executed, not on information hiding.
- Overexposure: an API forces callers to be aware of rarely used features in order to use commonly used features.
- Pass-through method: a method does almost nothing except pass its arguments to another method with a similar signature.
- Repetition: a nontrivial piece of code is repeated over and over.
- Special-general mixture: special-purpose code is not cleanly separated from general purpose code.
- Conjoined methods: two methods have so many dependencies that it is hard to understand the implementation of one without understanding the implementation of the other.
- Comment repeats code: all of the information in a comment is immediately obvious from the code next to the comment.
- Implementation documentation contaminates interface: an interface comment describes implementation details not needed by users of the thing being documented.
- Vague name: the name of a variable or method is so imprecise that it doesn’t convey much useful information.
- Hard to pick name: it is difficult to come up with a precise and intuitive name for an entity.
- Hard to describe: in order to be complete, the documentation for a variable or method must be long.
- Nonobvious code: the behavior or meaning of a piece of code cannot be understood easily.
Other good notes
Every rule has its exceptions, and every principle has its limits. If you take any design idea to its extreme, you will probably end up in a bad place. Beautiful designs reflect a balance between competing ideas and approaches.
There is quite a bit of scientific evidence that outstanding performance in many fields is related more to high-quality practice than innate ability.
The author also gave this talk at Google, which is well worth watching.