(No) Comment?!

Many software developers feel bad because they make little use of comments in their code. Often, using lengthy comments is considered good style. In the old days, with languages like C or assembler, things got messy pretty fast, so comments were the only way to keep track of processor registers or pointer arithmetic. In modern programming languages, more powerful abstractions are available. The question is, does this change the strategy of commenting your source code?

First of all, I'd like to differentiate between interface documentation and actual code comments. Interface documentation is often generated from specially formatted code comments, but it isn't a comment in the usual sense. I'll discuss interface documentation separately, later in this article.

In agile software development processes, you usually try to keep the number of supporting artifacts low. Lengthy analysis documents, fancy diagrams and the like tend to become outdated very soon as the code base is constantly being refactored. Updating artifacts takes time and discipline and you can't be agile if you're carrying around too much weight.

With comments it's not much different. Although they live inside the code base, comments can be seen as artifacts, too. They have to be kept up to date and cause confusion if they aren't. If a piece of code and its comment diverge, you can never be sure whether the code has a bug or the comment is outdated. In that case, the situation is worse than having no comment at all.

Based on books about software design (most notably Martin Fowler, Refactoring, and Eric Evans, Domain-Driven Design) and my own experience, I worked out the following strategy:

I try to avoid writing comments when possible. If a piece of code is getting complex and cannot be understood easily, I try to refactor it to make it simpler. Decomposing a method and using intention-revealing variable and method names usually make the code easier to read. Only as a last resort, if a piece of code is inherently complicated, do I add comments.
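To make the refactoring idea concrete, here is a minimal sketch in Python (chosen only for illustration; the article names no language). The `Order` type, the discount rule, and all constants are hypothetical. The first version needs a comment to explain its condition; the second moves that explanation into names.

```python
from dataclasses import dataclass


@dataclass
class Order:
    # Hypothetical order type, invented for this example.
    total: float
    customer_years: int


def discount_before(order: Order) -> float:
    # loyal customers with large orders get 10 % off
    if order.customer_years >= 3 and order.total > 100.0:
        return order.total * 0.10
    return 0.0


# Refactored: the names say what the comment used to say.
LOYALTY_YEARS = 3
LARGE_ORDER_TOTAL = 100.0
LOYALTY_DISCOUNT_RATE = 0.10


def is_loyal_customer(order: Order) -> bool:
    return order.customer_years >= LOYALTY_YEARS


def is_large_order(order: Order) -> bool:
    return order.total > LARGE_ORDER_TOTAL


def discount_after(order: Order) -> float:
    if is_loyal_customer(order) and is_large_order(order):
        return order.total * LOYALTY_DISCOUNT_RATE
    return 0.0
```

Both functions behave identically; the difference is that the second one cannot drift away from its explanation, because the explanation is the code.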

Generally, I favour comments on why something is done over those telling how something is done; it's the code's job to communicate the "how" part. Failed attempts and reasons why they failed are valuable, too, as they help a later maintainer to avoid old mistakes.
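As an illustration of the difference, the following Python sketch contrasts a "how" comment with a "why" comment that also records a failed attempt. The function names and the scenario are assumptions made up for this example; `math.fsum` is the standard library's precision-preserving summation.

```python
import math
from typing import Sequence


def average_how(values: Sequence[float]) -> float:
    # "How" comment: add up all values and divide by their count.
    # It only repeats what the code already says.
    return sum(values) / len(values)


def average_why(values: Sequence[float]) -> float:
    # "Why" comment: the inputs can differ by many orders of magnitude,
    # and naive summation silently drops the small ones.
    # Failed attempt (hypothetical scenario): the built-in sum() was used
    # first and produced 0.0 for [1e16, 1.0, -1e16].
    return math.fsum(values) / len(values)
```

The second comment tells a maintainer something the code cannot: why the obvious solution was rejected, so nobody "simplifies" it back.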

When it comes to interface documentation, I differentiate between private, public, and published interfaces. Inside a class, where there are many small private methods, it's hardly feasible to document everything. In-depth documentation would hold me back from necessary refactorings. Much the same holds for the majority of public methods, but I try to write at least a few sentences about each class, its purpose, and its responsibilities.

Things are completely different with published interfaces. A published interface consists of all classes, interfaces, and methods (or other public features) that are to be used by external clients. I document classes, their invariants, method parameters, constraints etc. extensively. These interfaces typically don't change a lot over time so the effort is certainly worth it. When published interfaces do change, it's often a major effort that requires changing client code. In this case, an update to the documentation is unavoidable anyway.
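For a published interface, the documentation might look like the following Python sketch. The `Account` class, its invariant, and its methods are hypothetical and serve only to show the level of detail: class purpose, invariant, parameters, constraints, and error behaviour.

```python
class Account:
    """A bank account that holds a balance in whole cents.

    Invariant: the balance is never negative.

    This class is a hypothetical example of a published interface.
    """

    def __init__(self, balance_cents: int = 0) -> None:
        """Create an account.

        Args:
            balance_cents: Opening balance in cents. Must be >= 0.

        Raises:
            ValueError: If balance_cents is negative.
        """
        if balance_cents < 0:
            raise ValueError("opening balance must not be negative")
        self._balance_cents = balance_cents

    @property
    def balance_cents(self) -> int:
        """Current balance in cents; always >= 0."""
        return self._balance_cents

    def withdraw(self, amount_cents: int) -> int:
        """Withdraw money and return the new balance in cents.

        Args:
            amount_cents: Amount to withdraw. Must be > 0 and must not
                exceed the current balance.

        Raises:
            ValueError: If the amount is not positive or exceeds the
                balance. The account is left unchanged in that case.
        """
        if amount_cents <= 0:
            raise ValueError("amount must be positive")
        if amount_cents > self._balance_cents:
            raise ValueError("insufficient funds")
        self._balance_cents -= amount_cents
        return self._balance_cents
```

Note that constraints on external callers are enforced with exceptions here rather than assertions, since a published interface cannot trust its clients.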

Assertions are another useful way of documenting code. They can check pre-conditions and state the caller's responsibilities at the same time. I run my test suites with assertions enabled to catch bugs but disable them in production environments.
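A minimal Python sketch of assertions as documentation follows; the function `split_evenly` is a hypothetical internal helper. In Python, `assert` statements are skipped when the interpreter runs with the `-O` option, which matches the approach of enabling them in tests and disabling them in production.

```python
from typing import List


def split_evenly(total_cents: int, parts: int) -> List[int]:
    """Split an amount into `parts` shares that differ by at most one cent."""
    # Pre-conditions: these lines state the caller's responsibilities
    # and check them whenever assertions are enabled.
    assert parts > 0, "caller must request at least one part"
    assert total_cents >= 0, "caller must pass a non-negative amount"

    base, remainder = divmod(total_cents, parts)
    # The first `remainder` shares receive one extra cent.
    shares = [base + 1 if i < remainder else base for i in range(parts)]

    # Post-condition: no cent is lost or invented.
    assert sum(shares) == total_cents
    return shares
```

For example, `split_evenly(100, 3)` returns `[34, 33, 33]`. Because the checks vanish under `-O`, they should only guard against programming errors, never against invalid input from outside.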
