Clean Code by Robert C. Martin - Code quality is your responsibility
If you care about the quality of your software, the happiness of your colleagues, and your own professionalism, you should read this book.
You will pay for bad code very soon, much sooner than you might expect. We spend most of our development time reading code; the ratio of reading to writing is around 10:1. So please write clean code: it will help you and your colleagues a lot.
Text in italics, bulleted and numbered lists, and source code examples are quotes from the book.
If you want to go fast, if you want to get done quickly, if you want your code to be easy to write, make it easy to read.
A working program is not enough. It must also be readable, flexible, maintainable, and reusable. Getting software to work and making software clean are two very different activities. Most of us have limited room in our heads, so we focus on getting our code to work more than organization and cleanliness. The problem is that too many of us think that we are done once the program works.
There are many recommendations and guidelines to help us make our code clean. They are listed below, then summarized in Smells and Heuristics. On a few points I have my own view; those are covered in the very last section: What do I not agree.
#1: Clean Code
Most managers may defend the schedule and requirements with passion; but that’s their job. It’s your job to defend the code with equal passion. It is the programmer’s responsibility to tell managers about the importance of code quality and the cost of owning a mess. Writing clean code will save you time sooner than you can imagine.
You will not make the deadline by making the mess. Indeed, the mess will slow you down instantly, and will force you to miss the deadline. The only way to make the deadline - the only way to go fast - is to keep the code as clean as possible at all times.
Some people can “smell” bad code but don’t know how to write clean code, just as someone can tell that a dish tastes bad without knowing how to cook a good one.
Many definitions
There are many definitions of clean code and many points of view on it, depending on perspective, background, and experience.
Bjarne Stroustrup, inventor of C++ and author of “The C++ Programming Language”:
I like my code to be elegant and efficient. The logic should be straightforward to make it hard for bugs to hide, the dependencies minimal to ease maintenance, error handling complete according to an articulated strategy, and performance close to optimal so as not to tempt people to make the code messy with unprincipled optimizations. Clean code does one thing well.
Grady Booch, author of “Object Oriented Analysis and Design with Applications”:
Clean code is simple and direct. Clean code reads like well-written prose. Clean code never obscures the designer’s intent but rather is full of crisp abstractions and straightforward lines of control.
“Big” Dave Thomas, founder of OTI, godfather of the Eclipse strategy:
Clean code can be read, and enhanced by a developer other than its original author. It has unit and acceptance tests. It has meaningful names. It provides one way rather than many ways for doing one thing. It has minimal dependencies, which are explicitly defined, and provides a clear and minimal API.
Michael Feathers, author of “Working Effectively with Legacy Code”:
Clean code always looks like it was written by someone who cares.
Ron Jeffries, author of “Extreme Programming Installed” and “Extreme Programming Adventures in C#”:
Reduced duplication, high expressiveness, and early building of simple abstractions. That’s what makes clean code for me.
Leave this world a little better than you found it… Leave the code a little cleaner than you checked it out…
#2: Meaningful Names
Choosing good names takes time but saves more than it takes. So take care with your names and change them when you find better ones.
There are many simple rules for good names:
- Use intention-revealing names
- Avoid disinformation
- Make meaningful distinctions: distinguish names in such a way that the reader knows what the differences offer
- Use pronounceable names
- Use searchable names
- Avoid encodings
- Avoid mental mapping
- Classes and objects should have noun or noun phrase names
- Methods should have verb or verb phrase names
- Don’t be cute
- Pick one word for one abstract concept and stick with it
- Avoid using the same word for two purposes
- Use solution domain names: technical terms, computer science terms, algorithm names, pattern names, math terms…
- Use problem domain names
- Add meaningful context
- Don’t add gratuitous context: shorter names are generally better than longer ones, so long as they are clear
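A few of these rules are easier to see in code. The following is my own minimal sketch, not a quote from the book; the board-game domain and the names `Cell`, `GameBoard`, and `FLAGGED` are hypothetical. It shows an intention-revealing method name, a searchable named constant instead of a bare number, a noun for the class, and a verb phrase for the method.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical domain, used only to illustrate naming.
final class Cell {
    // Searchable, intention-revealing constant instead of a bare 4 in the code.
    static final int FLAGGED = 4;

    private final int status;

    Cell(int status) {
        this.status = status;
    }

    // Reads like a question, so callers do not need to know the encoding.
    boolean isFlagged() {
        return status == FLAGGED;
    }
}

// Class name is a noun; method name is a verb phrase.
final class GameBoard {
    private final List<Cell> cells = new ArrayList<>();

    void add(Cell cell) {
        cells.add(cell);
    }

    // A vague name such as getThem() would force the reader to study the body.
    List<Cell> getFlaggedCells() {
        List<Cell> flaggedCells = new ArrayList<>();
        for (Cell cell : cells) {
            if (cell.isFlagged()) {
                flaggedCells.add(cell);
            }
        }
        return flaggedCells;
    }
}
```

With names like these, a call such as `board.getFlaggedCells()` needs no comment to explain what comes back.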
#3: Clean Functions
Functions should be small, but deep. Keeping functions both small and deep is hard. Whatever size they are, they should be clean. You can achieve this by refactoring and extracting functions.
Do one thing: Functions should do one thing. They should do it well. They should do it only.
Use descriptive names: You know you are working on clean code when each routine turns out to be pretty much what you expected. Don’t be afraid to make a name long. A long descriptive name is better than a short enigmatic name. A long descriptive name is better than a long descriptive comment.
Arguments: Keep the number of function arguments as small as possible. Each argument adds detail the reader must absorb, and it also makes testing harder. Consider using an instance variable or wrapping related arguments in a class.
Avoid output arguments: If a function is going to transform its input argument, the transformation should appear as the return value. If your function must change the state of something, have it change the state of its own object.
Avoid flag arguments: It immediately complicates the signature of the method, loudly proclaiming that this function does more than one thing.
Command query separation: Functions should either do something or answer something, but not both.
A function should have no side effects.
Anything that forces you to check the function signature is equivalent to a double-take. It’s a cognitive break and should be avoided.
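To make command query separation and the advice on flag arguments concrete, here is a small sketch of my own (not from the book). The classes `Settings` and `ReportRenderer` and all their method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical settings store illustrating command query separation.
final class Settings {
    private final Map<String, String> attributes = new HashMap<>();

    // Query: answers a question and changes nothing.
    boolean attributeExists(String name) {
        return attributes.containsKey(name);
    }

    // Query: returns the stored value, or null if the attribute is absent.
    String attribute(String name) {
        return attributes.get(name);
    }

    // Command: changes state and returns nothing.
    void setAttribute(String name, String value) {
        attributes.put(name, value);
    }
}

// Hypothetical renderer illustrating how to avoid a flag argument.
final class ReportRenderer {
    // Instead of render(boolean forSuite), which does two things,
    // expose two methods whose names say what each one does.
    String renderForSuite() {
        return "suite report";
    }

    String renderForSingleTest() {
        return "single test report";
    }
}
```

A caller writes `if (settings.attributeExists("username"))` and then `settings.setAttribute("username", "unclebob")`, and never has to check a signature to learn whether a call also changed something.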
#4: Comments
A comment is an excuse for bad code. The proper use of comments is to compensate for our failure to express ourself in code.
When code evolves, it is hard to keep its comments correct and up to date. Inaccurate comments are more dangerous than no comments at all. The only truly good comment is the comment that you found a way not to write.
We know it’s a mess. So we say to ourselves, “Ooh, I’d better comment that!” No! You’d better clean it!
Try to make the code self-explanatory. Instead of doing this:
// Check to see if the employee is eligible for full benefits
if ((employee.flags & HOURLY_FLAG) &&
    (employee.age > 65))
Do this:
if (employee.isEligibleForFullBenefits())
Bad comments
- Mumbling. If you decide to write a comment, then spend the time necessary to make sure it is the best comment you can write.
- Redundant: repeat, clutter, obscure code
- Misleading
- Mandated comments
- Journal comments
- Noise: restate the obvious and provide no new information. We learned to ignore them, our eyes simply skip over them. Eventually the comments begin to lie as the code around them changes. Bad examples:
/**
 * Returns the day of the month.
 *
 * @return the day of the month.
 */
public int getDayOfMonth() {
    return dayOfMonth;
}
/** The name. */
private String name;
/** The version. */
private String version;
- Don’t use a comment when you can use function names and variable names
- Position markers. For example:
// Actions ///////////////////////////////////////
- Closing brace comments: If you find yourself wanting to mark your closing braces, try to shorten your functions instead.
- Attributions and bylines
- Commented-out code: Just delete the code. We won’t lose it. Promise.
- HTML comments
- Nonlocal information: Make sure a comment describes the code it appears near. Don’t offer systemwide information in the context of a local comment.
- Too much information
- Inobvious connection between a comment and the code it describes. It is a pity when a comment needs its own explanation.
- Function headers
Necessary comments
But in some cases, we need comments:
- Legal, copyright, authorship
- Informative
// format matched kk:mm:ss EEE, MMM dd, yyyy
Pattern timeMatcher = Pattern.compile(
"\\d*:\\d*:\\d* \\w*, \\w* \\d*, \\d*");
- Explanation of intent behind a decision
- Clarification: make obscure thing more readable
- Warning of consequences
- //TODO comments
- Amplify importance of something that may be missed
- Docs for public APIs
#5: Code Formatting
Code formatting is important. It is about communication, and communication is the professional developer’s first order of business.
The coding style and readability set precedents that continue to affect maintainability and extensibility long after the original code has been changed beyond recognition. Your style and discipline survives, even though your code does not.
There are many guidelines for keeping code format “natural” and easy to navigate:
- Small files are usually easier to understand than large files.
- Related concepts should be kept close to each other. We want to avoid forcing our readers to hop around through our source files and classes.
- Keep lines short.
Variables
- Variables should be declared as close to their usage as possible.
- Instance variables should be declared at the top of the class, because in a well-designed class, they are used by many, if not all, of the methods of the class.
Functions
- If one function calls another, they should be vertically close, and the caller should be above the callee, if possible. This gives the program a natural flow. This creates a nice flow down the source code module from high level to low level.
This is the exact opposite of languages like Pascal, C, and C++ that enforce functions to be defined, or at least declared, before they are used.
#6: Objects and Data Structures
Procedural code (code using data structures) makes it easy to add new functions without changing the existing data structures. OO code, on the other hand, makes it easy to add new classes without changing existing functions.
Procedural code makes it hard to add new data structures because all the functions must change. OO code makes it hard to add new functions because all the classes must change.
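The trade-off can be sketched with shapes. This is my own minimal illustration, not a quote from the book, and all class names are hypothetical. In the procedural half, adding a `perimeter` function touches nothing else, but adding a new shape means editing every function. In the OO half, adding a new shape is one new class, but adding `perimeter()` means editing every class.

```java
// Procedural style: plain data structures, behavior lives in separate functions.
final class SquareData {
    final double side;

    SquareData(double side) {
        this.side = side;
    }
}

final class CircleData {
    final double radius;

    CircleData(double radius) {
        this.radius = radius;
    }
}

final class Geometry {
    // Every function like this one must change when a new data structure appears.
    static double area(Object shape) {
        if (shape instanceof SquareData) {
            SquareData square = (SquareData) shape;
            return square.side * square.side;
        }
        if (shape instanceof CircleData) {
            CircleData circle = (CircleData) shape;
            return Math.PI * circle.radius * circle.radius;
        }
        throw new IllegalArgumentException("Unknown shape: " + shape);
    }
}

// OO style: data is hidden, behavior is polymorphic.
interface Shape {
    // Every class must change when a new function is added here.
    double area();
}

final class Square implements Shape {
    private final double side;

    Square(double side) {
        this.side = side;
    }

    @Override
    public double area() {
        return side * side;
    }
}

final class Circle implements Shape {
    private final double radius;

    Circle(double radius) {
        this.radius = radius;
    }

    @Override
    public double area() {
        return Math.PI * radius * radius;
    }
}
```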
#7: Error Handling
Error handling is important, but if it obscures logic, it’s wrong.
Some techniques and considerations for error handling:
- Use exceptions rather than return codes
- Write your try-catch-finally statement first
- Use unchecked exceptions: stack trace, informative error messages
- Define exception classes in terms of a caller’s needs: think about how exceptions are caught
- Define the normal flow
- Don’t return null. Checking null after each function call is bad. If we don’t check, what exactly should you do in response to a NullPointerException thrown from the depths of your application?
- Don’t pass null. Avoid whenever possible.
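Several of these points fit in one short sketch. The code below is my own illustration, not from the book; `Payroll`, `EmployeeNotFoundException`, and their methods are hypothetical names. It returns an empty list instead of null so the normal flow needs no null checks, and it uses an unchecked exception with an informative message for the case the caller cannot ignore.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Unchecked exception, named for what the caller needs to know.
final class EmployeeNotFoundException extends RuntimeException {
    EmployeeNotFoundException(String employeeId) {
        super("No employee with id '" + employeeId + "'");
    }
}

// Hypothetical payroll store.
final class Payroll {
    private final Map<String, List<Integer>> paymentsByEmployee = new HashMap<>();

    void record(String employeeId, List<Integer> payments) {
        paymentsByEmployee.put(employeeId, payments);
    }

    // Never returns null: an unknown employee simply has no payments.
    List<Integer> paymentsFor(String employeeId) {
        return paymentsByEmployee.getOrDefault(employeeId, Collections.emptyList());
    }

    // The normal flow stays free of null checks.
    int totalPay(String employeeId) {
        int total = 0;
        for (int payment : paymentsFor(employeeId)) {
            total += payment;
        }
        return total;
    }

    // When absence really is an error, fail with context instead of returning null.
    List<Integer> requirePayments(String employeeId) {
        if (!paymentsByEmployee.containsKey(employeeId)) {
            throw new EmployeeNotFoundException(employeeId);
        }
        return paymentsByEmployee.get(employeeId);
    }
}
```

Whether a missing employee should yield an empty result or an exception is a design decision; the sketch offers both only to show the two alternatives to returning null.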
#8: Boundaries
Learning the third-party code is hard. Integrating the third-party code is hard too. Doing both at the same time is doubly hard.
Learning/Boundary test: verify that the third-party packages we are using work the way we expect them to. If the third-party package changes in some way incompatible with our tests, we will find out right away.
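A learning test can be as small as the sketch below. It is my own example, not from the book. To keep it self-contained, two JDK APIs (`String.split` and `java.time.LocalDate.plusMonths`) stand in for a third-party package, and a hypothetical `check` helper stands in for a test framework's assertions.

```java
import java.time.LocalDate;

// Learning tests: they pin down behavior we rely on, not behavior we wrote.
final class LearningTests {

    // Hypothetical stand-in for a test framework assertion.
    static void check(boolean condition, String expectation) {
        if (!condition) {
            throw new AssertionError("Expectation no longer holds: " + expectation);
        }
    }

    // We rely on split(",") dropping trailing empty strings.
    static void splitDropsTrailingEmptyStrings() {
        String[] parts = "a,b,,".split(",");
        check(parts.length == 2, "split drops trailing empty strings");
    }

    // We rely on plusMonths clamping to the last valid day of the month.
    static void plusMonthsClampsToEndOfMonth() {
        LocalDate result = LocalDate.of(2024, 1, 31).plusMonths(1);
        check(result.equals(LocalDate.of(2024, 2, 29)), "plusMonths clamps to month end");
    }

    static void runAll() {
        splitDropsTrailingEmptyStrings();
        plusMonthsClampsToEndOfMonth();
    }
}
```

If a new release of the package changes either behavior, `LearningTests.runAll()` fails at upgrade time rather than somewhere deep in production code.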
#9: Unit Tests
It is unit tests that keep your code flexible, maintainable, and reusable.
Having dirty tests is equivalent to, if not worse than, having no tests. The problem is that tests must change as the production code evolves. Test code is as important as production code.
Keep only expressive code in a test: build the data, operate on that data, and check the expected results. Everything else is noise and should be refactored into helper methods.
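The build, operate, check shape looks like this in a minimal sketch of my own (not from the book). `ShoppingCart`, `cartWith`, and the hand-rolled `assertEquals` are hypothetical; in a real project a test framework would supply the assertion.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical production class under test.
final class ShoppingCart {
    private final List<Integer> pricesInCents = new ArrayList<>();

    void add(int priceInCents) {
        pricesInCents.add(priceInCents);
    }

    int totalInCents() {
        int total = 0;
        for (int price : pricesInCents) {
            total += price;
        }
        return total;
    }
}

final class ShoppingCartTest {

    // The test body holds only the three expressive steps.
    static void totalIsSumOfItemPrices() {
        ShoppingCart cart = cartWith(250, 199);   // build
        int total = cart.totalInCents();          // operate
        assertEquals(449, total);                 // check
    }

    // Setup noise lives in a helper, out of the reader's way.
    private static ShoppingCart cartWith(int... pricesInCents) {
        ShoppingCart cart = new ShoppingCart();
        for (int price : pricesInCents) {
            cart.add(price);
        }
        return cart;
    }

    // Minimal stand-in for a framework assertion.
    private static void assertEquals(int expected, int actual) {
        if (expected != actual) {
            throw new AssertionError("expected " + expected + " but was " + actual);
        }
    }
}
```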
FIRST rule
- Fast
- Independent
- Repeatable: in any environment: production, QA, local without network
- Self-validating: have boolean output: pass or fail
- Timely
#10: Classes
- Single responsibility: We regularly encounter classes that do far too many things. But, a class or module should have one, and only one, reason to change.
- Organizing for change: Needs will change, therefore code will change. Change is continual. Change introduces risks, and must be retested.
This chapter focuses on making classes small. But I would prefer keeping them “deep”, as discussed in the “A Philosophy of Software Design” post and the “What do I not agree” section below. Sometimes deep and small pull in two different directions. For the same reason, I also disagree with some of the code improvement examples in this chapter.
#11: Systems
Software systems are unique compared to physical systems. Their architectures can grow incrementally.
It is a myth that we can get systems “right the first time.”
About design process: Big Design Up Front (BDUF) is harmful because it inhibits adapting to change, due to the psychological resistance to discarding prior effort and because of the way architecture choices influence subsequent thinking about design.
Whether you are designing systems or individual modules, never forget to use the simplest thing that can possibly work.
About decision making: It is also best to postpone decisions until the last possible moment. This isn’t lazy or irresponsible; it lets us make informed choices with the best possible information. A premature decision is a decision made with suboptimal knowledge.
#12: Emergence
A design is “simple” if it follows these rules, in order of importance:
- Runs all the tests
- Contains no duplication
- Expresses the intent of the programmer
- Minimizes the number of classes and methods
#13: Concurrency
Concurrency is hard:
- Concurrency incurs some overhead, both in performance as well as writing additional code.
- Correct concurrency is hard, even for simple problems.
- Concurrency bugs aren’t usually repeatable, so they are often ignored as one-offs instead of the true defects they are.
- Concurrency often requires a fundamental change in design strategy.
Concurrency myths and misconceptions:
- Concurrency always improves performance
- Design does not change when writing concurrent programs
Useful recommendations:
- Keep your concurrency-related code separate from other code
- Take data encapsulation to heart; severely limit the access of any data that may be shared
- Attempt to partition data into independent subsets that can be operated on by independent threads, possibly in different processors
- Keep your synchronized sections as small as possible
- Do not ignore system failures as one-offs
- Do not try to chase down nonthreading bugs and threading bugs at the same time. Make sure your code works outside of threads.
- Make your thread-based code especially pluggable so that you can run it in various configurations
- Run your threaded code on all target platforms early and often
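A few of these recommendations can be shown together. The sketch below is my own, not from the book, and `WordTally` is a hypothetical class. The shared counter is fully encapsulated, the synchronized sections contain only the shared update or read, and the word-counting logic is a pure function that can be tested without any threads.

```java
// Hypothetical thread-safe tally of words seen across many lines.
final class WordTally {
    private final Object lock = new Object();

    // The only shared mutable state; never exposed directly.
    private int totalWords = 0;

    void addLine(String line) {
        // Work on thread-local data happens outside the lock.
        int words = countWords(line);

        // The synchronized section is as small as possible.
        synchronized (lock) {
            totalWords += words;
        }
    }

    int totalWords() {
        synchronized (lock) {
            return totalWords;
        }
    }

    // Pure, non-threaded logic: verify it works before adding threads.
    static int countWords(String line) {
        String trimmed = line.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }
}
```

Because `countWords` has no threading concerns, a failure there is clearly a nonthreading bug, which keeps the two kinds of defect from being chased at the same time.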
Deadlock
The book has a very good section on deadlock, covering its causes, how to avoid it, and how to test for it, in Appendix A, page 335.
#14: Successive Refinement
If we have learned anything over the last couple of decades, it is that programming is a craft more than it is a science. To write clean code, you must first write dirty code and then clean it.
It is not enough for code to work. Code that works is often badly broken. Programmers who satisfy themselves with merely working code are behaving unprofessionally. They may fear that they don’t have time to improve the structure and design of their code, but I disagree. Nothing has a more profound and long-term degrading effect upon a development project than bad code. Bad schedules can be redone, bad requirements can be redefined. Bad team dynamics can be repaired. But bad code rots and ferments, becoming an inexorable weight that drags the team down. Time and time again I have seen teams grind to a crawl because, in their haste, they created a malignant morass of code that forever thereafter dominated their destiny.
Of course bad code can be cleaned up. But it’s very expensive. As code rots, the modules insinuate themselves into each other, creating lots of hidden and tangled dependencies. Finding and breaking old dependencies is a long and arduous task. On the other hand, keeping code clean is relatively easy. If you made a mess in a module in the morning, it is easy to clean it up in the afternoon. Better yet, if you made a mess five minutes ago, it’s very easy to clean it up right now.
So the solution is to continuously keep your code as clean and simple as it can be. Never let the rot get started.
Smells and Heuristics
Key points stated in previous chapters:
General:
- Multiple languages in one source file
- Obvious behavior is unimplemented
- Incorrect behavior at the boundaries
- Overridden safeties
- Duplication
- Code at wrong level of abstraction
- Base classes depending on their derivatives
- Too much information
- Dead code
- Vertical separation
- Inconsistency
- Clutter
- Artificial coupling
- Feature envy
- Selector arguments
- Obscured intent
- Misplaced responsibility
- Inappropriate static
- Use explanatory variables
- Function names should say what they do
- Understand the algorithm
- Make logical dependencies physical
- Prefer polymorphism to if/else or switch/case
- Follow standard conventions
- Replace magic numbers with named constants
- Be precise
- Structure over convention
- Encapsulate conditionals
- Avoid negative conditionals
- Functions should do one thing
- Hidden temporal couplings
- Don’t be arbitrary
- Encapsulate boundary conditions
- Functions should descend only one level of abstraction
- Keep configurable data at high levels
- Avoid transitive navigation
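Some of the general heuristics are quick to illustrate. The following is my own sketch, not a quote from the book; `ReminderTimer`, `TimerCleanup`, and the 30-second grace period are hypothetical. It shows named constants instead of magic numbers, an explanatory variable, and a conditional encapsulated behind a positively phrased method.

```java
// Hypothetical timer used only to illustrate the heuristics.
final class ReminderTimer {
    private final long expiresAtMillis;
    private final boolean recurrent;

    ReminderTimer(long expiresAtMillis, boolean recurrent) {
        this.expiresAtMillis = expiresAtMillis;
        this.recurrent = recurrent;
    }

    boolean hasExpiredBy(long instantMillis) {
        return instantMillis >= expiresAtMillis;
    }

    boolean isRecurrent() {
        return recurrent;
    }
}

final class TimerCleanup {
    // Named constants replace the magic numbers 1000 and 30000.
    static final long MILLIS_PER_SECOND = 1_000L;
    static final long GRACE_PERIOD_MILLIS = 30 * MILLIS_PER_SECOND;

    // Encapsulated conditional: callers write if (shouldBeDeleted(timer, now)).
    static boolean shouldBeDeleted(ReminderTimer timer, long nowMillis) {
        // Explanatory variable names the intermediate result.
        boolean expiredBeyondGracePeriod =
                timer.hasExpiredBy(nowMillis - GRACE_PERIOD_MILLIS);
        return expiredBeyondGracePeriod && !timer.isRecurrent();
    }
}
```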
Names:
- Choose descriptive names
- Choose names at the appropriate level of abstraction
- Use standard nomenclature where possible
- Unambiguous names
- Use long names for long scopes
- Avoid encodings
- Names should describe side-effects
Functions:
- Too many arguments
- Output arguments
- Flag arguments
- Dead function
Tests:
- Insufficient tests
- Use a coverage tool
- Don’t skip trivial tests
- An ignored test is a question about an ambiguity
- Test boundary conditions
- Exhaustively test near bugs
- Patterns of failure are revealing
- Test coverage patterns can be revealing
- Tests should be fast
Comments:
- Inappropriate information
- Obsolete comment
- Redundant comment
- Poorly written comment
- Commented-out code
Environment:
- Build requires more than one step
- Tests require more than one step
Java:
- Avoid long import lists by using wildcards
- Don’t inherit constants
- Constants versus enums
What do I not agree
Small and deep
The small function/class rule of this book, in some specific cases, contradicts the deep module principle of “A Philosophy of Software Design” by John Ousterhout. Where the two conflict, I prefer the “deep module” principle. Deep does not mean unclean. With good skills, we can make deep modules and classes clean.
Listing 3-7
To make functions “do just one thing”, the author breaks them into smaller ones. In my opinion, this increases the reference cost: the reader must jump forward and backward to get the big picture. Each function is shallow. They do only “elemental” things and provide no extra calculation beyond calling another function; they are just function forwarders. Needing many references to get one thing done makes the code harder to read.
My proposed change: functions can do more than one thing, as long as they provide helpful calculation and stay clean and easy to read. There is no need to scatter function references here and there just to keep functions small.
Listing 9-4, 9-5
The reader must know the upper- and lower-case convention, or read many functions in the production code, to understand what HBchL actually means.
Final words
Reading the book is not enough. You must practice. Doing code reviews over many years will help you get the most out of it.
Happy coding. And make your colleagues happy coders, too.