Clean Code by Robert C. Martin - Code quality is your responsibility

April 4, 2020

If you care about the quality of your software, the happiness of your colleagues, and your own professionalism, you should read this book.

You will pay for bad code very soon, much sooner than you might expect. We spend most of our development time reading code; the ratio of reading to writing is around 10:1. So please write clean code. It will help you and your colleagues a lot.

Clean Code: A Handbook of Agile Software Craftsmanship, by Robert C. Martin. Image source: kobo.com

Text in italics, bulleted/numbered lists, and source code examples are quotes from the book.

If you want to go fast, if you want to get done quickly, if you want your code to be easy to write, make it easy to read.

A working program is not enough. It must also be readable, flexible, maintainable, and reusable. Getting software to work and making software clean are two very different activities. Most of us have limited room in our heads, so we focus on getting our code to work more than organization and cleanliness. The problem is that too many of us think that we are done once the program works.

There are many recommendations and guidelines to help us make our code clean. They are listed below, then summarized in Smells and Heuristics. On a few points I have my own view; those are covered in the very last section: What do I not agree.

#1: Clean Code

Most managers may defend the schedule and requirements with passion; but that’s their job. It’s your job to defend the code with equal passion. It is the programmer’s responsibility to tell managers about the importance of code quality and the cost of owning a mess. Writing clean code will save you time sooner than you can imagine.

You will not make the deadline by making the mess. Indeed, the mess will slow you down instantly, and will force you to miss the deadline. The only way to make the deadline - the only way to go fast - is to keep the code as clean as possible at all times.

Some people can “smell” bad code but don’t know how to write clean code, just as someone can tell a dish tastes bad without knowing how to cook a good one.

Many definitions

There are many definitions of, and points of view on, clean code, depending on perspective, background, and experience.

Bjarne Stroustrup, inventor of C++ and author of “The C++ Programming Language”:

I like my code to be elegant and efficient. The logic should be straightforward to make it hard for bugs to hide, the dependencies minimal to ease maintenance, error handling complete according to an articulated strategy, and performance close to optimal so as not to tempt people to make the code messy with unprincipled optimizations. Clean code does one thing well.

Grady Booch, author of “Object Oriented Analysis and Design with Applications”:

Clean code is simple and direct. Clean code reads like well-written prose. Clean code never obscures the designer’s intent but rather is full of crisp abstractions and straightforward lines of control.

“Big” Dave Thomas, founder of OTI, godfather of the Eclipse strategy:

Clean code can be read, and enhanced by a developer other than its original author. It has unit and acceptance tests. It has meaningful names. It provides one way rather than many ways for doing one thing. It has minimal dependencies, which are explicitly defined, and provides a clear and minimal API.

Michael Feathers, author of “Working Effectively with Legacy Code”:

Clean code always looks like it was written by someone who cares.

Ron Jeffries, author of “Extreme Programming Installed” and “Extreme Programming Adventures in C#”:

Reduced duplication, high expressiveness, and early building of simple abstractions. That’s what makes clean code for me.

Leave this world a little better than you found it… Leave the code a little cleaner than you checked it out…

#2: Meaningful Names

Choosing good names takes time but saves more than it takes. So take care with your names and change them when you find better ones.

There are many simple rules for good names:

Use intention-revealing names
Avoid disinformation
Make meaningful distinctions: distinguish names in such a way that the reader knows what the differences offer
Use pronounceable names
Use searchable names
Avoid encodings
Avoid mental mapping
Classes and objects should have noun or noun phrase names
Methods should have verb or verb phrase names
Don’t be cute
Pick one word for one abstract concept and stick with it
Avoid using the same word for two purposes
Use solution domain names: technical terms, computer science terms, algorithm names, pattern names, math terms…
Use problem domain names
Add meaningful context
Don’t add gratuitous context: shorter names are generally better than longer ones, so long as they are clear
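
A few of these rules are easier to see side by side. The sketch below is my own illustration, not a listing from the book; the names calc, WORK_DAYS_PER_WEEK, and workWeeksNeeded are made up for the example.

```java
// Hypothetical example: the same calculation with poor and with better names.
final class NamingSketch {

    // Hard to read: what are d, 4 and 5? None of them can be searched for.
    static int calc(int d) {
        return (d + 4) / 5;
    }

    // Intention-revealing, pronounceable, and searchable.
    static final int WORK_DAYS_PER_WEEK = 5;

    static int workWeeksNeeded(int workDays) {
        // Round up: a partially used week still counts as a whole week.
        return (workDays + WORK_DAYS_PER_WEEK - 1) / WORK_DAYS_PER_WEEK;
    }

    public static void main(String[] args) {
        System.out.println(workWeeksNeeded(11));
    }
}
```

Both methods return the same numbers, but only the second one tells the reader why.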

#3: Clean Functions

Functions should be small, but deep. Keeping functions both small and deep is hard. Whatever their size, they should be clean. You can achieve this by refactoring and extracting functions.

Do one thing: Functions should do one thing. They should do it well. They should do it only.

Use descriptive names: You know you are working on clean code when each routine turns out to be pretty much what you expected. Don’t be afraid to make a name long. A long descriptive name is better than a short enigmatic name. A long descriptive name is better than a long descriptive comment.

Arguments: Keep the number of function arguments as low as possible. Each argument introduces more detail and makes testing harder. Consider using an instance variable or wrapping related arguments in a class.

Avoid output arguments: If a function is going to transform its input argument, the transformation should appear as the return value. If your function must change the state of something, have it change the state of its own object.
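
A minimal sketch of the difference, assuming a hypothetical Report class (the class, its footer text, and its method names are invented for illustration):

```java
// Hypothetical Report class, for illustration only.
final class Report {
    private static final String FOOTER = "\n-- end of report --";
    private final StringBuilder text = new StringBuilder();

    Report(String body) {
        text.append(body);
    }

    // Output-argument style (avoid): the caller has to check the signature
    // to learn that 'target' is modified.
    static void appendFooter(StringBuilder target) {
        target.append(FOOTER);
    }

    // Preferred: the object changes its own state.
    void appendFooter() {
        text.append(FOOTER);
    }

    String text() {
        return text.toString();
    }
}
```

At the call site, report.appendFooter() reads unambiguously, while Report.appendFooter(buffer) leaves the reader wondering whether buffer is an input or an output.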

Avoid flag arguments: It immediately complicates the signature of the method, loudly proclaiming that this function does more than one thing.
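
As a small illustration (the Greeter class and its methods are hypothetical, not taken from the book), a boolean flag can usually be replaced by two clearly named methods:

```java
// Hypothetical example: one method with a boolean flag versus two named methods.
final class Greeter {

    // Flag style (avoid): greet("Ann", true) says nothing at the call site.
    static String greet(String name, boolean formal) {
        return formal ? "Good day, " + name + "." : "Hi " + name + "!";
    }

    // Split style: each method does one thing and the call site reads clearly.
    static String greetFormally(String name) {
        return "Good day, " + name + ".";
    }

    static String greetCasually(String name) {
        return "Hi " + name + "!";
    }
}
```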

Command query separation: Functions should either do something or answer something, but not both.
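
A rough sketch of the idea, using a hypothetical Attributes store (all names here are assumptions for the example): the mixed method both changes state and reports on it, while the separated version offers one query and one command.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical attribute store, for illustration only.
final class Attributes {
    private final Map<String, String> values = new HashMap<>();

    // Mixed (avoid): changes state AND reports whether the attribute existed.
    boolean set(String name, String value) {
        if (!values.containsKey(name)) {
            return false;
        }
        values.put(name, value);
        return true;
    }

    // Query: answers something, changes nothing.
    boolean exists(String name) {
        return values.containsKey(name);
    }

    // Command: does something, returns nothing.
    void put(String name, String value) {
        values.put(name, value);
    }

    String get(String name) {
        return values.get(name);
    }
}
```

With the separated methods, a call such as if (attributes.exists("username")) attributes.put("username", "unclebob") leaves no doubt about which part asks and which part acts.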

A function should have no side effects.

Anything that forces you to check the function signature is equivalent to a double-take. It’s a cognitive break and should be avoided.

#4: Comments

Comments are an excuse for bad code. The proper use of comments is to compensate for our failure to express ourself in code.

When code evolves, it is hard to keep its comments correct and up to date. Inaccurate comments are more dangerous than no comments at all. The only truly good comment is the comment that you found a way not to write.

We know it’s a mess. So we say to ourselves, “Ooh, I’d better comment that!” No! You’d better clean it!

Try to make the code self-explanatory. Instead of doing this:

// Check to see if the employee is eligible for full benefits
if ((employee.flags & HOURLY_FLAG) &&
    (employee.age > 65))

Do this:

if (employee.isEligibleForFullBenefits())
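
The book shows only the call site. One way the method behind it might look is sketched below; the Employee class, the value of HOURLY_FLAG, and the helper isHourly are my assumptions, and only the flag name and the age limit of 65 come from the snippet above.

```java
// Hypothetical Employee class; everything except the flag name and the
// age limit is assumed for illustration.
final class Employee {
    static final int HOURLY_FLAG = 0x1;
    private static final int FULL_BENEFITS_AGE_THRESHOLD = 65;

    private final int flags;
    private final int age;

    Employee(int flags, int age) {
        this.flags = flags;
        this.age = age;
    }

    boolean isHourly() {
        return (flags & HOURLY_FLAG) != 0;
    }

    // The condition now has a name, so the call site needs no comment.
    boolean isEligibleForFullBenefits() {
        return isHourly() && age > FULL_BENEFITS_AGE_THRESHOLD;
    }
}
```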

Bad comments

Mumbling. If you decide to write a comment, then spend the time necessary to make sure it is the best comment you can write.
Redundant: repeat, clutter, obscure code
Misleading
Mandated comments
Journal comments
Noise: restate the obvious and provide no new information. We learned to ignore them, our eyes simply skip over them. Eventually the comments begin to lie as the code around them changes. Bad examples:

/**
 * Returns the day of the month.
 *
 * @return the day of the month.
 */
public int getDayOfMonth() {
  return dayOfMonth;
}

/** The name. */
private String name;

/** The version. */
private String version;

Don’t use a comment when you can use function names and variable names
Position markers. For example:

// Actions ///////////////////////////////////////

Closing brace comments: If you find yourself wanting to mark your closing braces, try to shorten your functions instead.
Attributions and bylines
Commented-out code: Just delete the code. We won’t lose it. Promise.
HTML comments
Nonlocal information: Make sure a comment describes the code it appears near. Don’t offer systemwide information in the context of a local comment.
Too much information
Inobvious connection between a comment and the code it describes. It is a pity when a comment needs its own explanation.
Function headers

Necessary comments

But in some cases, we need comments:

Legal, copyright, authorship
Informative

// format matched kk:mm:ss EEE, MMM dd, yyyy
Pattern timeMatcher = Pattern.compile(
  "\\d*:\\d*:\\d* \\w*, \\w* \\d*, \\d*");

Explanation of intent behind a decision
Clarification: make obscure things more readable
Warning of consequences
//TODO
Amplify importance of something that may be missed
Docs for public APIs

#5: Code Formatting

Code formatting is important. It is about communication, and communication is the professional developer’s first order of business.

The coding style and readability set precedents that continue to affect maintainability and extensibility long after the original code has been changed beyond recognition. Your style and discipline survives, even though your code does not.

There are many guidelines for keeping code format “natural” and easy to navigate:

Small files are usually easier to understand than large files.
Related concepts should be kept close to each other. We want to avoid forcing our readers to hop around through our source files and classes.
Keep lines short.

Variables

Variables should be declared as close to their usage as possible.
Instance variables should be declared at the top of the class, because in a well-designed class, they are used by many, if not all, of the methods of the class.

Functions

If one function calls another, they should be vertically close, and the caller should be above the callee, if possible. This gives the program a natural flow. This creates a nice flow down the source code module from high level to low level.

This is the exact opposite of languages like Pascal, C, and C++ that enforce functions to be defined, or at least declared, before they are used.

#6: Objects and Data Structures

Procedural code (code using data structures) makes it easy to add new functions without changing the existing data structures. OO code, on the other hand, makes it easy to add new classes without changing existing functions.

Procedural code makes it hard to add new data structures because all the functions must change. OO code makes it hard to add new functions because all the classes must change.
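
The trade-off can be sketched with two versions of an area calculation. This is a simplified illustration in the spirit of the book's shapes example, not a copy of its listings; all class names are my own.

```java
// Procedural style: plain data structures plus one class of functions.
final class SquareData {
    final double side;
    SquareData(double side) { this.side = side; }
}

final class CircleData {
    final double radius;
    CircleData(double radius) { this.radius = radius; }
}

final class Geometry {
    // Adding perimeter(Object) here touches no data structure,
    // but adding a TriangleData means editing every function in this class.
    static double area(Object shape) {
        if (shape instanceof SquareData) {
            SquareData square = (SquareData) shape;
            return square.side * square.side;
        }
        if (shape instanceof CircleData) {
            CircleData circle = (CircleData) shape;
            return Math.PI * circle.radius * circle.radius;
        }
        throw new IllegalArgumentException("Unknown shape: " + shape);
    }
}

// OO style: each class hides its data and implements the operation itself.
// Adding a Triangle touches no existing class,
// but adding perimeter() means editing every Shape.
interface Shape {
    double area();
}

final class Square implements Shape {
    private final double side;
    Square(double side) { this.side = side; }
    public double area() { return side * side; }
}

final class Circle implements Shape {
    private final double radius;
    Circle(double radius) { this.radius = radius; }
    public double area() { return Math.PI * radius * radius; }
}
```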

#7: Error Handling

Error handling is important, but if it obscures logic, it’s wrong.

Some techniques and considerations for error handling:

Use exceptions rather than return codes
Write your try-catch-finally statement first
Use unchecked exceptions: stack trace, informative error messages
Define exception classes in terms of a caller’s needs: concern about how exceptions are caught
Define the normal flow
Don’t return null. Checking for null after each function call is bad. And if you don’t check, what exactly should you do in response to a NullPointerException thrown from the depths of your application?
Don’t pass null. Avoid whenever possible.
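
Several of these points fit into one small sketch. The Payroll class and UnknownDepartmentException below are hypothetical names chosen for the example: the lookup never returns null, an empty department yields an empty list, and an unknown one raises an informative unchecked exception.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical unchecked exception with an informative message.
final class UnknownDepartmentException extends RuntimeException {
    UnknownDepartmentException(String department) {
        super("Unknown department: " + department);
    }
}

// Hypothetical payroll lookup, for illustration only.
final class Payroll {
    private final Map<String, List<Integer>> salariesByDepartment = new HashMap<>();

    void addDepartment(String department) {
        salariesByDepartment.putIfAbsent(department, new ArrayList<>());
    }

    void addSalary(String department, int salary) {
        salariesByDepartment.computeIfAbsent(department, d -> new ArrayList<>()).add(salary);
    }

    // Never returns null: callers can iterate without a null check.
    List<Integer> salariesOf(String department) {
        List<Integer> salaries = salariesByDepartment.get(department);
        if (salaries == null) {
            throw new UnknownDepartmentException(department);
        }
        return Collections.unmodifiableList(salaries);
    }

    // The normal flow reads straight through, with no error-code branches.
    int totalPay(String department) {
        int total = 0;
        for (int salary : salariesOf(department)) {
            total += salary;
        }
        return total;
    }
}
```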

#8: Boundaries

Learning the third-party code is hard. Integrating the third-party code is hard too. Doing both at the same time is doubly hard.

Learning/Boundary test: verify that the third-party packages we are using work the way we expect them to. If the third-party package changes in some way incompatible with our tests, we will find out right away.
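
A learning test can be very small. In the sketch below, java.lang.String stands in for a third-party API, and plain checks stand in for a test framework; both are assumptions made to keep the example self-contained. A real project would target the actual library and would probably use something like JUnit.

```java
import java.util.Arrays;

// Learning test sketch: records the library behavior our code relies on.
final class SplitLearningTest {

    // Expectation we rely on: split(",") drops trailing empty strings.
    static void trailingEmptyStringsAreDropped() {
        String[] parts = "a,b,,".split(",");
        check(Arrays.equals(parts, new String[] {"a", "b"}));
    }

    // Expectation we rely on: a negative limit keeps them.
    static void negativeLimitKeepsTrailingEmptyStrings() {
        String[] parts = "a,b,,".split(",", -1);
        check(Arrays.equals(parts, new String[] {"a", "b", "", ""}));
    }

    private static void check(boolean condition) {
        if (!condition) {
            throw new AssertionError("Library behaves differently than expected");
        }
    }

    public static void main(String[] args) {
        trailingEmptyStringsAreDropped();
        negativeLimitKeepsTrailingEmptyStrings();
        System.out.println("learning tests passed");
    }
}
```

If a new release of the library changed this behavior, these checks would fail before any production code did.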

#9: Unit Tests

It is unit tests that keep your code flexible, maintainable, and reusable.

Having dirty tests is equivalent to, if not worse than, having no tests. The problem is that tests must change as the production code evolves. Test code is as important as production code.

Put only expressive code in the test: build data, operate on that data, and check expected results. Other things are considered noise and should be refactored to helper methods.
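
The build, operate, check shape might look like the sketch below. It avoids a test framework to stay self-contained; the helpers stackOf and assertEquals are hypothetical and only exist to keep noise out of the test body.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical framework-free test; a real project would likely use JUnit.
final class StackBehaviourTest {

    static void popReturnsMostRecentlyPushedItem() {
        Deque<String> stack = stackOf("first", "second"); // build
        String popped = stack.pop();                      // operate
        assertEquals("second", popped);                   // check
        assertEquals(1, stack.size());
    }

    // Helpers keep setup and comparison details out of the test itself.
    private static Deque<String> stackOf(String... items) {
        Deque<String> stack = new ArrayDeque<>();
        for (String item : items) {
            stack.push(item);
        }
        return stack;
    }

    private static void assertEquals(Object expected, Object actual) {
        if (!expected.equals(actual)) {
            throw new AssertionError("Expected " + expected + " but was " + actual);
        }
    }

    public static void main(String[] args) {
        popReturnsMostRecentlyPushedItem();
        System.out.println("test passed");
    }
}
```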

FIRST rule

Fast
Independent
Repeatable: in any environment: production, QA, local without network
Self-validating: have boolean output: pass or fail
Timely

#10: Classes

Single responsibility: We regularly encounter classes that do far too many things. But, a class or module should have one, and only one, reason to change.
Organizing for change: Needs will change, therefore code will change. Change is continual. Change introduces risks, and must be retested.

This chapter focuses on making classes small. But I would prefer keeping them “deep”, as stated in the “A Philosophy of Software Design” post and the “What do I not agree” section below. Sometimes deep and small pull in two different directions. For the same reason, I also don’t agree with some of the code improvement examples in this chapter.

#11: Systems

Software systems are unique compared to physical systems. Their architectures can grow incrementally.

It is a myth that we can get systems “right the first time.”

About design process: Big Design Up Front (BDUF) is harmful because it inhibits adapting to change, due to the psychological resistance to discarding prior effort and because of the way architecture choices influence subsequent thinking about design.

Whether you are designing systems or individual modules, never forget to use the simplest thing that can possibly work.

About decision making: It is also best to postpone decisions until the last possible moment. This isn’t lazy or irresponsible; it lets us make informed choices with the best possible information. A premature decision is a decision made with suboptimal knowledge.

#12: Emergence

A design is “simple” if it follows these rules, in order of importance:

Runs all the tests
Contains no duplication
Expresses the intent of the programmer
Minimizes the number of classes and methods

#13: Concurrency

Concurrency is hard:

Concurrency incurs some overhead, both in performance as well as writing additional code.
Correct concurrency is hard, even for simple problems.
Concurrency bugs aren’t usually repeatable, so they are often ignored as one-offs instead of the true defects they are.
Concurrency often requires a fundamental change in design strategy.

Concurrency myths and misconceptions:

Concurrency always improves performance
Design does not change when writing concurrent programs

Useful recommendations:

Keep your concurrency-related code separate from other code
Take data encapsulation to heart; severely limit the access of any data that may be shared
Attempt to partition data into independent subsets that can be operated on by independent threads, possibly in different processors
Keep your synchronized sections as small as possible
Do not ignore system failures as one-offs
Do not try to chase down nonthreading bugs and threading bugs at the same time. Make sure your code works outside of threads.
Make your thread-based code especially pluggable so that you can run it in various configurations
Run your threaded code on all target platforms early and often
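
Two of these recommendations, limiting access to shared data and keeping synchronized sections small, can be sketched as follows. The HitCounter class and its method names are invented for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical hit counter, for illustration only.
final class HitCounter {
    private final Object lock = new Object();
    // Shared data is private and only touched while holding the lock.
    private final Map<String, Integer> hitsByUrl = new HashMap<>();

    void record(String url) {
        String key = normalize(url);   // thread-local work stays outside the lock
        synchronized (lock) {          // critical section: a single map update
            hitsByUrl.merge(key, 1, Integer::sum);
        }
    }

    int hitsFor(String url) {
        String key = normalize(url);
        synchronized (lock) {
            return hitsByUrl.getOrDefault(key, 0);
        }
    }

    private static String normalize(String url) {
        return url.trim().toLowerCase(Locale.ROOT);
    }

    // Drives the counter from several threads and returns the final count.
    static int recordConcurrently(int threadCount, int hitsPerThread) {
        HitCounter counter = new HitCounter();
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < threadCount; i++) {
            Thread thread = new Thread(() -> {
                for (int j = 0; j < hitsPerThread; j++) {
                    counter.record(" /Home ");
                }
            });
            threads.add(thread);
            thread.start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for workers", e);
            }
        }
        return counter.hitsFor("/home");
    }
}
```

Because the counting logic is an ordinary class, it can also be exercised from a single thread first, which matches the advice to make sure the code works outside of threads.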

Deadlock

The book has a very good part about deadlock (causes, how to avoid it, and how to test for it) in Appendix A, page 335.

If we have learned anything over the last couple of decades, it is that programming is a craft more than it is a science. To write clean code, you must first write dirty code and then clean it.

It is not enough for code to work. Code that works is often badly broken. Programmers who satisfy themselves with merely working code are behaving unprofessionally. They may fear that they don’t have time to improve the structure and design of their code, but I disagree. Nothing has a more profound and long-term degrading effect upon a development project than bad code. Bad schedules can be redone, bad requirements can be redefined. Bad team dynamics can be repaired. But bad code rots and ferments, becoming an inexorable weight that drags the team down. Time and time again I have seen teams grind to a crawl because, in their haste, they created a malignant morass of code that forever thereafter dominated their destiny.

Of course bad code can be cleaned up. But it’s very expensive. As code rots, the modules insinuate themselves into each other, creating lots of hidden and tangled dependencies. Finding and breaking old dependencies is a long and arduous task. On the other hand, keeping code clean is relatively easy. If you made a mess in a module in the morning, it is easy to clean it up in the afternoon. Better yet, if you made a mess five minutes ago, it’s very easy to clean it up right now.

So the solution is to continuously keep your code as clean and simple as it can be. Never let the rot get started.

Smells and Heuristics

Key points stated in previous chapters:

General:

Multiple languages in one source file
Obvious behavior is unimplemented
Incorrect behavior at the boundaries
Overridden safeties
Duplication
Code at wrong level of abstraction
Base classes depending on their derivatives
Too much information
Dead code
Vertical separation
Inconsistency
Clutter
Artificial coupling
Feature envy
Selector arguments
Obscured intent
Misplaced responsibility
Inappropriate static
Use explanatory variables
Function names should say what they do
Understand the algorithm
Make logical dependencies physical
Prefer polymorphism to if/else or switch/case
Follow standard conventions
Replace magic numbers with named constants
Be precise
Structure over convention
Encapsulate conditionals
Avoid negative conditionals
Functions should do one thing
Hidden temporal couplings
Don’t be arbitrary
Encapsulate boundary conditions
Functions should descend only one level of abstraction
Keep configurable data at high levels
Avoid transitive navigation
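
A few of the general heuristics (named constants instead of magic numbers, explanatory variables, encapsulated conditionals) can be combined in one short sketch. The SessionTimer class and its 30-second limit are assumptions made for illustration.

```java
// Hypothetical timer, for illustration only.
final class SessionTimer {
    // Replace magic numbers with named constants.
    private static final long EXPIRY_MILLIS = 30_000;

    private final long startedAtMillis;
    private final boolean recurrent;

    SessionTimer(long startedAtMillis, boolean recurrent) {
        this.startedAtMillis = startedAtMillis;
        this.recurrent = recurrent;
    }

    boolean hasExpired(long nowMillis) {
        long elapsedMillis = nowMillis - startedAtMillis; // explanatory variable
        return elapsedMillis >= EXPIRY_MILLIS;
    }

    // Encapsulate conditionals: callers ask one well-named question
    // instead of repeating "expired and not recurrent" everywhere.
    boolean shouldBeDeleted(long nowMillis) {
        return hasExpired(nowMillis) && !recurrent;
    }
}
```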

Names:

Choose descriptive names
Choose names at the appropriate level of abstraction
Use standard nomenclature where possible
Unambiguous names
Use long names for long scopes
Avoid encodings
Names should describe side-effects

Functions:

Too many arguments
Output arguments
Flag arguments
Dead function

Tests:

Insufficient tests
Use a coverage tool
Don’t skip trivial tests
An ignored test is a question about an ambiguity
Test boundary conditions
Exhaustively test near bugs
Patterns of failure are revealing
Test coverage patterns can be revealing
Tests should be fast

Comments:

Inappropriate information
Obsolete comment
Redundant comment
Poorly written comment
Commented-out code

Environment:

Build requires more than one step
Tests require more than one step

Java:

Avoid long import lists by using wildcards
Don’t inherit constants
Constants versus enums

What do I not agree

Small and deep

The small function/class rule of this book, in some specific cases, contradicts the deep module principle of “A Philosophy of Software Design” by John Ousterhout. Where the two conflict, I prefer the “deep module” principle. Deep does not mean unclean. With good skills, we can make deep modules/classes clean.

Listing 3-7

To make functions “do just one thing”, the author breaks them into smaller ones. In my opinion, this increases the reference cost: the reader must jump forward and backward to get the big picture. Each function is shallow. They do only “elemental” things and provide no extra calculation beyond calling another function. They are just function forwarders. Needing many references to get one thing done makes the code harder to read.

My proposed change: functions can do more than one thing, as long as they provide helpful calculation and stay clean and easy to read. There is no need to scatter function references here and there just to keep functions small.
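
To make the point concrete, here is a deliberately tiny sketch of my own (it is not Listing 3-7, and all names are hypothetical). Both methods produce the same result; the first spreads it over several forwarding functions, the second keeps it in one place.

```java
// Hypothetical example contrasting shallow forwarders with one deeper function.
final class PageRendering {

    // Many tiny steps: the reader hops through four methods to see one idea.
    static String renderInSmallSteps(String title, String body) {
        return joinParts(buildHeader(title), buildBody(body));
    }

    private static String buildHeader(String title) {
        return wrapHeader(title); // pure forwarding, adds nothing
    }

    private static String wrapHeader(String title) {
        return "<h1>" + title + "</h1>";
    }

    private static String buildBody(String body) {
        return "<p>" + body + "</p>";
    }

    private static String joinParts(String header, String body) {
        return header + body;
    }

    // One slightly longer function: it does more than "one thing",
    // but it reads top to bottom without jumping around.
    static String render(String title, String body) {
        String header = "<h1>" + title + "</h1>";
        String paragraph = "<p>" + body + "</p>";
        return header + paragraph;
    }
}
```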

Listing 9-4, 9-5

The reader must know the upper- and lower-case convention, or read many functions in the production code, to understand what HBchL actually means.

Final words

Reading the book is not enough. You must practice. Doing code reviews over many years will help you make the most of it.

Happy coding. And make your colleagues happy coders, too.


"Writing clean code is what you must do in order to call yourself a professional. There is no reasonable excuse for doing anything less than your best."