Code Farms Inc

1.5 THE PROGRAM AS A RECORD OF AN IDEA


Most people consider a computer program merely as a set of instructions for the machine. In reality, the logic of software systems is usually so complex that programs (the actual code) are the only exact records of what the software should do. If we want to use programs in this way, they'd better be easy to read.

One of the curses of software engineering is that a program and its documentation seldom agree. This problem has a very real reason.

Years ago, when everybody programmed in assembly languages, the program documentation had to describe the complete logic of the code. This was done either in pseudo code or in pictures, but the complete logic (and the emphasis here is on the word complete) had to be recorded somewhere, since the programs were so cryptic that a person trying to understand the code could get lost easily, even in a relatively small program.

When this method was followed, the program documentation (the original logic design) and the actual code corresponded to each other, except when the assembly code was modified without updating the documentation. This usually happened during debugging or when new functions were added to the code. When attempting to understand the code, new members of the team would study the documentation first, and approach the assembly code only when solving a problem locally.

Since people began to use programming languages that combine natural English with mathematical expressions, the code itself has become more readable than any pseudo code. It does not contain any shortcuts, and contains the full logic. It describes precisely what the computer will do, and it cannot become obsolete. Note how, after getting their first bearings, new members of a team trying to understand an existing C or C++ program will likely go straight for the code, using the documentation only for general guidance.

The purpose of documentation is changing from providing an exact description of the implementation details to providing higher level information which helps the programmer understand the purpose of each particular module, including interactions with other modules. Documentation (in this new sense) is a means of communication from one brain to another, and it takes real intelligence, empathy, and imagination to create a document which the reader finds illuminating and interesting. [Quoted from Trygve Reenskaug, who coded the Smalltalk example in Section 6.4.]

A similar tendency to move from detailed instruction to more global information can be observed in the commenting of code. Without good comments spread through the code, the logic of a typical FORTRAN program is difficult to understand. However, C++ programs are often coded (or even published) practically without comments. Comments seem to be unnecessary since they can tell only what can be seen directly from the code. [Compare this with the current Smalltalk practice, where comments are very important and used ever more extensively.]

For example, in [STR] Section 6.4, Stroustrup describes a complete program which includes a screen manager, a shape library, and an application program. The program consists of about 200 lines of code, presented in 16 sections. Except for six general comments providing a hint at the beginning of some major functions, only a few in-line comments appear: if clipping (p. 195), if top to bottom (p. 196), if left to right (p. 196).

The program is so easy to read that additional comments may only confuse the issue. One may feel that comments are not needed in such a case since the text of the book provides the explanation. This may be true but, according to my observation, C++ programs with very few or no comments are common throughout the industry. If a comment duplicates information which can be obtained by reading the code, there is an inherent danger that, during maintenance, the code will be changed without a proper update of the comment. The comment and the code then disagree, and the reader cannot tell which of the two to trust.
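The danger described above can be shown with a small sketch. The constant and both function names are hypothetical, invented only for this illustration. The first function carries a comment that repeated a detail of the code and was left behind when the limit was changed; the second relies on its names alone.

```cpp
#include <cassert>

// Hypothetical limit, invented for this sketch. Assume it was raised
// from 3 to 5 during maintenance.
const int kMaxRetries = 5;

// The in-line comment below duplicated the code and was not updated
// when the limit changed. It now contradicts what the function does.
bool shouldRetryStale(int attempts) {
    // retry at most 3 times
    return attempts < kMaxRetries;
}

// The same logic without the duplicated detail. There is nothing here
// that can drift away from the code.
bool shouldRetry(int attempts) {
    return attempts < kMaxRetries;
}
```

Both functions behave identically, so `assert(shouldRetryStale(4))` holds even though the comment suggests otherwise. Only the code records what the program actually does.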

What I would like to show by this discussion is that, with the more advanced languages, the program is becoming its own documentation, at least at the most basic level. Often documentation is inadequate or completely missing, because of time pressure or other circumstances. It is critical that programs be easy to read.

The methodology that we will develop in this book aims in this direction. However, we must not forget that computer programs reflect the minds of their creators. Clear, organized programs are the result of good planning and thinking, while unreadable code often indicates hurried, disorganized design and a confused programmer.

Naur expands the idea of a program being the ultimate record of the solution even further [NAU]. The entire programming activity can be viewed as building a theory on how to solve a particular problem. The theory evolves as we analyze the problem and experiment with different solutions. In the beginning, the theory reflects human thinking based on intuition, and possibly omits important details. Gradually, the theory covers more aspects of the problem, until it eventually yields a generally applicable solution. The program itself (not its documentation) is the record of the theory and again, it is critical that the program can be understood simply from the code. [See Section 6.4.]

Note the dynamic nature of this notion. Since the program must evolve, the duality between documentation and code is impractical, and indeed unmaintainable. The logic of the solution and the instructions for the computer (the code itself) must become one unit.

Object-oriented programming poses an interesting challenge to this theory. In a private communication, Lakos (see also [LAK]) suggested the following example: let us assume that a programmer designed class Set, with an iterator which returns the elements of the set in an undetermined order. However, when reading the code, this intention may not be obvious. Unless there is additional documentation such as comments, or unless the name of the class suggests certain behavior, we may see only a linked list or an array. The abstraction has been lost; we have broken encapsulation. The information which we have does not allow us to reimplement the originally intended solution.

I believe that this example demonstrates the true reason for comments in the code. Comments can record the programmer's intention and reasons for certain implementation styles, and in that sense, become an important part of the code.
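The Set example can be sketched in C++ as follows. This is only an illustration under stated assumptions, not the code Lakos had in mind: the class is restricted to integers, the member names are invented, and the choice of a vector for storage is arbitrary. Without the comments, a reader would see only an array that happens to keep insertion order.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical class, invented for this sketch.
// INTENT: a mathematical set of integers. Iteration visits every element
// exactly once, in an UNSPECIFIED order. Callers must not depend on the
// order, so the storage may later be replaced by a tree or a hash table.
class Set {
public:
    // Adds value if it is not yet present; returns true if it was added.
    bool insert(int value) {
        if (contains(value)) return false;
        items_.push_back(value);
        return true;
    }

    // Linear search; adequate for a sketch.
    bool contains(int value) const {
        for (int item : items_) {
            if (item == value) return true;
        }
        return false;
    }

    std::size_t size() const { return items_.size(); }

    // Iteration order is unspecified (see the class comment).
    std::vector<int>::const_iterator begin() const { return items_.begin(); }
    std::vector<int>::const_iterator end() const { return items_.end(); }

private:
    // Storage only. Insertion order is NOT part of the contract.
    std::vector<int> items_;
};
```

A caller would write `for (int v : s) { ... }` and treat the elements as arriving in no particular order. The comments carry exactly the information that the code alone cannot: that the visible ordering is an accident of the implementation and not part of the design.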
