Book Summary: A Philosophy of Software Design
Written by John Ousterhout, a Professor of Computer Science at Stanford University, this book is about managing and reducing complexity in software projects. It draws on his experience as an academic, his various commercial endeavours, and his work on the Tcl scripting language. It’s a short and light read (about 170 pages) that discusses several principles which can be applied during the development of software systems to achieve cleanly designed abstractions. It also highlights several red flags that can indicate major design problems and what to do about them. Most importantly, the author acknowledges that all design principles can be taken too far, and gives some advice on what not to do.
The book starts by defining complexity as anything that can make a software project hard to understand and modify. Complexity, as far as the author is concerned, is not about how sophisticated a system is or how many features it has (though it is true that big systems tend to be complex), but how easy it is to maintain. If a system is easy to understand and modify, then it is simple no matter how large it is. Otherwise, it is complicated.
Complexity typically manifests itself in a software project in three major ways, listed here in order of increasing severity:
- When a simple change requires modifying code in many places.
- When developers need to carry a lot of information in their heads to complete a task. This increases the chances that they might miss something, leading to bugs.
- When it is not obvious what information or changes are needed in order to carry out a task (unknown unknowns).
How do software systems become complicated?
Complexity is incremental. It’s usually not one thing that makes a system complicated but an accumulation of bad decisions over a period of time. The author gives two main reasons why software projects become complex:
Dependencies: A dependency between two or more parts of a system is only desirable if it is clear and obvious. When it is not clear which code depends on which, even simple changes to the system will take a long time, and there will be a high risk of bugs showing up after deploying to production.
Obscurity: This occurs when some important information about the system is not obvious, for example, when it’s not clear in what order a set of methods must be run to carry out an operation. If code is not obvious, a reader must expend a lot of time and energy trying to understand it, and the likelihood of misunderstanding it is high.
Cultivating the right mindset
The author distinguishes between programming with a tactical mindset and programming with a strategic one. The former is the default in most companies, and it focuses on getting features working as fast as possible, which usually means taking a shortcut or two. The main problem with this approach is that planning for the future and finding the best design take a back seat to getting working code out the door. This may increase productivity in the short term, but it hurts the maintainability of the system in ways that may be difficult or impossible to correct later.
A strategic programmer, on the other hand, recognises that “working code” is not a good enough standard if long-term maintainability is a paramount concern. Instead of trying to ship something as quickly as possible, this mindset encourages taking some time to find the best design that also happens to work. This could mean trying out different designs before picking a solution, or being proactive about writing documentation. Essentially, this type of programming requires an investment mindset, but it is often at odds with the realities of commercial software development.
How to design modules
According to the book, a module is a relatively independent unit of code with an interface and an implementation. It can take many forms such as a function, class, package, or service. The interface of a module describes what the module does, but not how it is implemented. For a method, this includes its signature and information about any exceptions it throws. The correct use of these formal aspects of an interface can usually be enforced by the programming language or linting tools.
There is also an informal aspect to interfaces, which covers any other information that users of the module need to know in order to use it correctly. For example, if one method must be called before another, that ordering is part of the interface and should be described using comments. If this information is omitted, it may lead to the unknown unknowns problem described earlier.
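As a rough illustration of an informal interface, the sketch below uses a made-up ReportWriter class (the class and its method names are assumptions for this example, not something from the book). The call-order rule cannot be expressed in the method signatures, so it is stated in the comments and checked at run time.

```python
from typing import List, Optional


class ReportWriter:
    """Collects lines for a report (hypothetical example class).

    Informal interface: open() must be called before write(), and
    close() must be called to obtain the finished text. The language
    cannot enforce this ordering, so the comment has to state it.
    """

    def __init__(self) -> None:
        self._lines: Optional[List[str]] = None  # None means "not open"

    def open(self) -> None:
        """Start a new, empty report. Must be called before write()."""
        self._lines = []

    def write(self, line: str) -> None:
        """Append one line. Raises RuntimeError if open() was not called."""
        if self._lines is None:
            raise RuntimeError("open() must be called before write()")
        self._lines.append(line)

    def close(self) -> str:
        """Return the report text and reset the writer to "not open"."""
        if self._lines is None:
            raise RuntimeError("open() must be called before close()")
        text = "\n".join(self._lines)
        self._lines = None
        return text
```

Without the docstrings, a caller would only discover the ordering requirement by hitting the RuntimeError, which is the unknown unknowns problem in miniature.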
In chapters 4-11 of the book, the author provides general guidelines and examples on how to design modules in a way that will help to reduce a system’s complexity. A selection of these guidelines is discussed below.
1. Provide good defaults
The interface of a module should make its common use cases as simple as possible. When almost every user of a module requires a certain behaviour, it should be provided by default, with an easy way to opt out. Don’t make users of the module have to do extra work to get the desired behaviour for the most common use cases.
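A minimal sketch of this guideline, using a hypothetical split_fields helper (the name and parameters are invented for illustration): the common case needs no configuration, and the less common cases opt out through keyword arguments.

```python
from typing import List


def split_fields(line: str, delimiter: str = ",", strip: bool = True) -> List[str]:
    """Split a delimited line into fields.

    The defaults cover the common case (comma-separated, surrounding
    whitespace trimmed); callers with unusual needs opt out explicitly.
    """
    fields = line.split(delimiter)
    if strip:
        fields = [field.strip() for field in fields]
    return fields


# Common case: no configuration needed.
print(split_fields("alice, 30 , london"))  # ['alice', '30', 'london']

# Opting out of both defaults.
print(split_fields("a | b", delimiter="|", strip=False))  # ['a ', ' b']
```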
2. Favour deep modules over shallow ones
Deep modules are those that provide simple interfaces to complex functionality, while shallow ones are those that have a complicated interface without hiding much complexity. Shallow modules add complexity to a system because of the cost of learning and using their interfaces, but without providing a compensating benefit by hiding complicated implementation details.
The author is specifically against the idea that classes or methods should always be small, which is the prevalent view in the industry. He argues that this philosophy tends to produce a large number of shallow modules, each of which adds some complexity to the system by virtue of having its own interface, as well as extra verbosity from the boilerplate each module requires.
The best modules have simple interfaces which abstract away complex implementations. Larger classes that encapsulate closely related functionality usually fall into this category, although the author notes that you can take the notion of larger classes too far. Chapter 9 of the book goes into great detail on how to decide whether to combine or separate modules in order to reduce the complexity of a system.
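To make the contrast concrete, here is a small sketch with two invented functions (both are assumptions for illustration). The first is shallow because its interface is about as complicated as its one-line body. The second is deeper because two parameters hide several steps the caller never has to think about.

```python
import re
from collections import Counter
from typing import Dict, List, Optional, Tuple


# Shallow: callers must learn a new name and signature, yet the function
# hides nothing that a plain assignment would not express just as clearly.
def add_null_value(record: Dict[str, Optional[str]], attribute: str) -> None:
    record[attribute] = None


# Deeper: a small interface hides tokenising, case folding, counting and
# tie-breaking.
def top_words(text: str, n: int = 3) -> List[Tuple[str, int]]:
    """Return the n most frequent words in text, most frequent first.

    Words are compared case-insensitively. Ties are broken
    alphabetically so the result is deterministic.
    """
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))[:n]


print(top_words("The cat and the hat and the bat", 2))  # [('the', 3), ('and', 2)]
```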
3. Hide unimportant information
One technique for producing deeper modules is information hiding. Each module should omit details about its implementation from its interface so that they are invisible to other modules. This simplifies the interface to the module so that the abstraction is easier for other developers to use.
Information hiding also makes it easier to evolve the system over time. For example, you can switch the algorithm used by a module to a faster or more efficient one without affecting users of the module. This would be much harder to do if such information were leaked to the outside world, because the change would then require changes to all the classes that rely on the leaked information, whether implicitly or explicitly.
When applying this principle, ensure that only unimportant details are omitted from the interface. If details that are important to the abstraction are left out of an interface, it will obscure the correct use of the associated module because users will no longer have all the information they need to use it just by looking at its interface.
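The sketch below shows one way this might look in practice, using a hypothetical SortedScores class (the name and methods are assumptions for this example). The interface mentions only what callers need, while the storage strategy stays private and could be replaced without touching any caller.

```python
import bisect
from typing import List


class SortedScores:
    """Records scores and answers rank queries (hypothetical example).

    The interface says nothing about how scores are stored, so the
    sorted list used below could be swapped for a tree or another
    structure without changing any code that uses this class.
    """

    def __init__(self) -> None:
        self._scores: List[int] = []  # private detail: kept in ascending order

    def add(self, score: int) -> None:
        """Record one score."""
        bisect.insort(self._scores, score)

    def count_below(self, threshold: int) -> int:
        """Return how many recorded scores are strictly below threshold."""
        return bisect.bisect_left(self._scores, threshold)
```

Note that the behaviour callers depend on ("strictly below") is documented, in line with the point above about not omitting details that matter to the abstraction.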
4. Eliminate unnecessary errors
In chapter 10, the author identifies exception handling as one of the worst sources of complexity in software systems. An exception that a module throws becomes part of that module’s interface, so users of the module must be aware of it, and it increases the number of places where a potential exception must be handled.
A good way to reduce exception handling complexity is to consider whether an exceptional case can be eliminated entirely. The author gives the example of Java’s String.substring() method, which throws an IndexOutOfBoundsException when a supplied index is negative or larger than the length of the String object. This forces developers to check each index to ensure it is not out of bounds before calling the method.
An alternative design is to treat indices that are less than 0 or greater than the string length as if they were equal to 0 and the length of the string respectively. This effectively defines the error case out of existence, making it unnecessary to perform additional checks or write exception handling code when using the method.
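The alternative design can be sketched in a few lines. This is written in Python rather than Java, and the substring function here is a stand-in made up for illustration, not a real library method. The clamping is done explicitly because a bare Python slice would interpret a negative index as counting from the end of the string.

```python
def substring(text: str, begin: int, end: int) -> str:
    """Return the characters of text from index begin up to, not including, end.

    Out-of-range indices are not an error: values below 0 act as 0 and
    values above len(text) act as len(text). If begin is not less than
    end after clamping, the result is the empty string.
    """
    begin = max(0, min(begin, len(text)))
    end = max(0, min(end, len(text)))
    return text[begin:end]


print(substring("hello", -3, 2))   # he
print(substring("hello", 3, 99))   # lo
print(substring("hello", 10, 20))  # prints an empty line
```

Callers no longer need bounds checks or a try/except block, because every pair of integer indices has a well-defined result.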
The practice of in-code documentation
Comments aim to capture the information that was in the mind of the module designer, but couldn’t be represented in the code. This can be the reasoning behind a key design decision, or a specific quirk that motivates a piece of code. A well-written comment can help other developers who make modifications to the system at a later time to work more quickly and accurately since they’ll have the full context on why a specific decision was made without having to guess or ask the original developer (who may no longer be available, or may have forgotten).
Chapters 12-16 of the book are dedicated to the practice of in-code documentation. In general, the author is in favour of writing comments because he believes that they can improve a system’s design significantly if done right.
Debunking the myths of commenting code
Chapter 12 discusses four excuses often presented by developers as reasons for their unwillingness to write comments along with rebuttals for each one.
- Good code is self-documenting: No matter how good the code is, it cannot capture the informal aspects of a module’s interface, such as why a particular design decision was made or the meaning of its result, hence the need for comments. The presence of comments also makes it easy to get a high-level description of what the module does without having to read the code itself.
- No time to write comments: Taking the time to write comments is part of the investment mindset discussed earlier. To develop a software system that is easy to maintain, you have to spend some extra time doing the work to create that structure, and the benefits of having good documentation upfront will quickly offset any additional costs.
- Outdated comments become misleading: Comments can sometimes get stale, but keeping them up to date shouldn’t take a huge effort, especially if they are positioned next to the code they describe. Code reviews can also be effective at detecting and correcting outdated comments.
- All comments are worthless: Many software developers question the usefulness of comments because of how many useless comments they’ve seen in the wild. This occurs because commenting is often viewed as drudge work, and many developers do not take the time to learn how to write comments properly, which results in poor documentation. This problem is solvable by following some guidelines for writing good documentation, which chapter 13 tackles.
Some guidelines for writing good comments
- Comments should describe things that cannot be inferred from the code.
- Try to place yourself in the shoes of the reader, and ask yourself what they need to know about the entity being documented.
- Be consistent in the way you write comments. Use the standard conventions for the specific programming language, or adopt one if no standards exist.
- Use words in the comment that differ from those in the name of the entity being documented, so that the comment clarifies its purpose rather than repeating the name. Focus on what the entity represents, and not on how it is manipulated.
- Interface comments (those that provide high-level information on how to use a class or method) should not describe the implementation. Instead, document things like side effects, constraints on arguments, dependencies, exceptions, or preconditions that must hold before a method is invoked.
- When documenting implementation details, focus on the what and why of the code. The how can be deduced by reading the code itself.
- Write comments when writing the code. Putting them off until later increases the chances that they don’t get written at all.
- Place comments where developers are most likely to see them (usually right next to the relevant code).
- Avoid duplicating comments in multiple places because it makes it harder to keep them up to date. Ensure that each design decision is documented only once, and referenced in other places if necessary.
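A few of the guidelines above can be seen together in a short sketch. The moving_average function is invented for this example. Its docstring acts as the interface comment (result shape, constraints, exceptions), while the inline comment explains why the code is written the way it is rather than restating what each line does.

```python
from typing import List


def moving_average(values: List[float], window: int) -> List[float]:
    """Return the average of each run of `window` consecutive values.

    The result has len(values) - window + 1 entries, or is empty when
    there are fewer than `window` values.
    Raises ValueError if window is not positive.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if len(values) < window:
        return []

    # Why: keeping a running sum makes each step O(1) instead of
    # re-adding the whole window for every output value.
    running = sum(values[:window])
    averages = [running / window]
    for i in range(window, len(values)):
        running += values[i] - values[i - window]
        averages.append(running / window)
    return averages


print(moving_average([1, 2, 3, 4], 2))  # [1.5, 2.5, 3.5]
```

By contrast, a comment such as "add the next value to running" on the loop body would only repeat what the code already says.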
Choosing identifier names
The quality of names in a software system can affect its perceived complexity in non-trivial ways. Good names provide clarity by telling the reader what an entity is and what it is not, which reduces the need for extensive comments. Bad names tend to mislead readers or require them to seek more documentation.
When choosing a name, try to capture the essential purpose of the entity being named using as few words as possible. Generic names are not recommended in most cases because they increase the chances that the named entity will be misused. It’s also important to be consistent when using names, so that the same name means the same thing throughout the codebase regardless of the context in which it is used.
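As a small illustration, the two functions below contain identical logic. All of the names are made up for this example; only the second set tells the reader what the values mean and what units they are in.

```python
# Vague: "data", "flag" and "x" say nothing about meaning or units,
# so a caller has to read the body to use the function correctly.
def process(data, flag, x):
    return [d for d in data if (d > x) == flag]


# More precise names for exactly the same logic.
def filter_by_threshold(latencies_ms, threshold_ms, keep_slower):
    """Keep latencies above threshold_ms if keep_slower is True,
    otherwise keep those at or below it."""
    return [ms for ms in latencies_ms if (ms > threshold_ms) == keep_slower]


print(filter_by_threshold([5, 20, 50], 10, True))   # [20, 50]
print(filter_by_threshold([5, 20, 50], 10, False))  # [5]
```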
Test Driven Development considered harmful
While the author is in favour of writing unit tests, especially because they aid in catching problems while refactoring, he is not a fan of Test Driven Development (TDD). He argues that TDD encourages tactical programming through its emphasis on getting specific features working instead of finding the right design, which is better achieved by considering the units of development as abstractions instead of features.
The only time he recommends using TDD is when fixing bugs that can be reproduced using a test beforehand, so that the test can confirm that the bug was actually fixed after the necessary changes have been made.
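A sketch of that bug-fixing workflow, using a hypothetical parse_percentage function and an invented bug report: the regression test is written first so that it reproduces the failure, and it passes only once the fix is in place.

```python
import unittest


def parse_percentage(text: str) -> float:
    """Convert a string such as "45%" to the fraction 0.45.

    Hypothetical bug report: input with surrounding spaces (" 45% ")
    used to raise ValueError. The strip() call below is the fix.
    """
    return float(text.strip().rstrip("%")) / 100


class ParsePercentageRegressionTest(unittest.TestCase):
    def test_surrounding_whitespace_is_ignored(self):
        # Written before the fix to reproduce the report. Without the
        # strip() call the trailing space keeps rstrip("%") from
        # removing the percent sign, and float() raises ValueError.
        self.assertAlmostEqual(parse_percentage(" 45% "), 0.45)


if __name__ == "__main__":
    unittest.main()
```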
A Philosophy of Software Design presents some fresh ideas on the practice of designing software at a relatively high level. The author is not afraid to go against conventional wisdom, and he does a good job of explaining how he came about the ideas in the book through easy to follow examples.
I think the book is a good read for developers of any level, but it should be especially useful for intermediate to senior level developers who have adequate experience in the field. If you have recommendations for other good books on software design, please share them in the comments.
Thanks for reading, and happy coding!