Wordman's Coding Convention

Coding standards are important for large projects because a) the vast majority of the time spent on a software system goes toward maintaining the system and b) code is very rarely maintained by the person who originally wrote it. Coding conventions bring a standard “feel” to the code, allowing for much easier maintenance and debugging.

Goals

Every code standard attempts to meet a certain set of goals. For example, the main goal of the Hungarian notation system used by Microsoft is to let you identify the type of a variable from its name. This seems to be a fairly useless goal, so we will try to do better here. The primary goal of this convention is to make it easier to debug code. More specifically, the goals of this convention are:

The standard is based (very) loosely on Sun’s “Code Conventions for the Java Programming Language”. This standard is nowhere near as strict as Sun’s and you can assume that anything not covered by this standard may be done however you like.

Keep in mind when reading this that when you write code, you are not writing it for yourself. Your code will be viewed by a number of other engineers. The idea of a coding convention is to give these other engineers the best chance of understanding your code in the least possible time. Roughly 80% of the life of a source code file is spent being maintained by someone other than the person who wrote it originally.

Convention Requirements

The following sections detail requirements for all new code written for NetTransact. Existing code may be modified to meet these requirements, but doing so is optional. Any such modification should take into consideration the difficulty of merging the changed code.

Class Names

Class names should be nouns, in mixed case with the first letter of each internal word capitalized. Names should be simple and descriptive, avoiding use of abbreviations (unless the abbreviation is the common form, such as XML).

class TemplateEngine;
class BillerLoginScreen;

Additionally, interface classes should always start with a capital I.

class IFilterInterface;

Method Names

Methods should be verbs, in mixed case with the first letter lowercase, with the first letter of each internal word capitalized.

sort();
sortByDate();
renderPage();

Packages

Package names should follow the convention of com. followed by the company name.

Variable Names

For any variable in code, you should be able to tell at a glance if it is a local, a global, a static or a data member of a class. Variable names should also make it instantly clear whether a variable is an object or some integer type.

Each type of variable is indicated in a specified way:

Data member of a class       Should begin with “m_”
Static variable of a class   Should begin with “s_”
Method argument              Should begin with “a_”
Local variable               Should begin with “v_”
Class constant               Should begin with “k_”

Some variables should also contain a type code after the underscore to indicate the general type of data of the variable.

<none>   An object
b        A boolean
n        Integer type of some kind (int, short, etc.)
p        A pointer (not used in Java code)
r        A reference (not used in Java code)

Examples:

m_nSetting       Integer data member of a class
m_Name           String data member of a class (strings are objects in C++, Java and others)
m_BillerArray    Object data member of a class
m_pParent        Pointer within a class
s_PropertyName   Static data member of a class
s_bUseTracking   Static boolean member of a class
a_nIndex         Integer passed into the current method
a_pWindow        Pointer passed into the current method
a_rWindow        Reference passed into the current method
v_nIndex         Local integer variable
v_BillerName     Local object variable
v_bContinue      Local boolean variable
k_nPayment       Class constant integer
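
As a rough sketch of how the prefixes and type codes look when used together in Java, consider the class below. The class Account and all of its members are hypothetical, invented only for illustration; they are not part of NetTransact.

```java
/**
 * Hypothetical class showing the variable naming convention in use.
 */
class Account
{
   /** Class constant integer: "k_" prefix, "n" type code. */
   static final int k_nMaxDeposit = 1000;

   /** Static integer counting every Account created so far. */
   static int s_nAccountCount = 0;

   /** Object data member: "m_" prefix, no type code. */
   private String m_OwnerName;

   /** Integer data member. */
   private int m_nBalance = 0;

   /** Boolean data member. */
   private boolean m_bFrozen = false;

   /**
    * Creates an account for the given owner.
    *
    * @param a_OwnerName Name of the account owner.
    */
   Account( String a_OwnerName )
   {
      m_OwnerName = a_OwnerName;
      ++s_nAccountCount;
   }

   /**
    * Adds money to the account if the deposit is acceptable.
    *
    * @param a_nAmount Amount to deposit.
    * @return true if the deposit was accepted.
    */
   boolean deposit( int a_nAmount )
   {
      boolean v_bAccepted = false;
      if ( !m_bFrozen && a_nAmount > 0 && a_nAmount <= k_nMaxDeposit )
      {
         m_nBalance += a_nAmount;
         v_bAccepted = true;
      }
      return v_bAccepted;
   }

   /**
    * @return The current balance.
    */
   int getBalance()
   {
      return m_nBalance;
   }

   /**
    * @return The name of the account owner.
    */
   String getOwnerName()
   {
      return m_OwnerName;
   }
}
```

Reading deposit, you can tell from the names alone which values came in as arguments, which belong to the object, and which live only inside the method.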

Formatting

For the most part, this coding convention places no restrictions on formatting choices. For example, do you put spaces between method arguments? Do your parentheses use white space or not? This convention does not care, because most formatting decisions do not particularly address the goals of this convention. For example, any of these variations is fine, because there is not much real difference between them:

void foo(int argA,int argB);
void foo(int argA, int argB);
void foo( int argA, int argB );
void foo ( int argA, int argB );

If you have no real preference, the second style in the list is the recommended choice.

The convention does place one condition on formatting, however. This condition will be controversial, because many people have very strong feelings about it. This condition is not issued lightly, and is here only because there are valid, justifiable reasons for setting it. The condition deals with the placement of braces.

This convention requires that braces be matched in-line vertically. That is, the starting and ending braces must be in the same column position within a file, like so:

void fooBar()
{
   // body of function
}

As this will no doubt annoy some readers, the rationale behind this convention is described in the following paragraphs.

In the early life of Unix, a popular use of braces evolved. This form places the opening brace of a matched pair on the same line as the declaration, like so:

void fooBar() {
   // body of function
}

While many people use (and religiously defend) this convention, few can tell you the reasoning behind it. The driving force behind this idea is (unique among specific coding conventions) money. Specifically, a textbook or manual that follows this coding convention is usually cheaper to produce than if it used other coding conventions. This is because code using this convention takes up less vertical space on a page, which usually reduces the total number of pages in the book and, thus, the cost.

These days, proponents of the “same line” convention extend this concept to their coding, indicating that this convention allows the display of more code at once on screen. This, they claim, reduces the need to scroll back and forth, improving coding efficiency. Unfortunately, even assuming that this is true, this convention has very little else going for it, apart from personal aesthetic judgments (i.e. “it just looks better”).

But, you may ask, so what? Isn’t this pretty much the same as using spaces before or after arguments? If those aren’t regulated, why should the use of braces? The answer is that in addition to possessing few positive traits, the “same line” convention contains some serious drawbacks. Specifically, it has two “features” that interfere with the third goal of this coding convention: legibility.

The first (and less critical) drawback to the same line convention deals with white space. It is widely accepted that code using more white space is generally easier to read than code with less white space. For example, compare the following code:

Less white space:

nItems = m_List.Count(); --nItems;
i = j = 0; firstItem = m_List.First();
if ( firstItem )
   j = 1;

More white space:

nItems = m_List.Count();
--nItems;
i = 0;
j = 0;
firstItem = m_List.First();
if ( firstItem )
   j = 1;

By avoiding a newline before the opening brace, the “same line” convention intentionally eliminates white space at every brace level, reducing legibility. Strangely, many proponents of the “same line” convention (such as Sun, in their coding convention) agree that more white space is preferred, but use the “same line” convention anyway (without explanation, in Sun’s case).

More crucially, the “same line” convention intentionally obfuscates the block metaphor the braces are intended to represent. At a glance the closing brace is easy to spot, but you must intentionally hunt down the opening brace, hiding at the end of the line. This is an enormous obstacle to legibility, requiring the user to expend conscious thought seeking something that should not require it. Aligning the braces vertically acts as a subconscious guide, allowing the brain to focus on debugging.

While there is some truth to the argument that the indentation provides the necessary blocking, there are two problems with relying on it. First, compilers for C-style languages determine blocks from the braces and ignore indentation entirely. This code, for example, is legal, but confusing and potentially the source of a very large error because its indentation is off:

if ( firstItem )
   j = 1; i = 1;
// DON’T DO THIS

Second, indentation often gets mangled from one machine to another, because the tab settings can differ between machines (some set a tab equal to four spaces, some set it equal to three). In general, vertically aligned bracing provides a much better visual outline of the code than indentation alone.
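
To make the indentation problem concrete, here is a minimal Java sketch. The class IndentationDemo and its two methods are hypothetical names made up for this illustration. Both methods look as though they set both variables only when the condition is true, but only the braced one actually does.

```java
/**
 * Hypothetical demonstration that indentation does not define a block.
 */
class IndentationDemo
{
   /**
    * Indented to look as if both assignments are conditional.
    *
    * @param a_bFirstItem The condition being tested.
    * @return The final value of v_nI.
    */
   static int misleading( boolean a_bFirstItem )
   {
      int v_nI = 0;
      int v_nJ = 0;
      if ( a_bFirstItem )
         v_nJ = 1; v_nI = 1;   // DON'T DO THIS: v_nI = 1 always runs
      return v_nI;
   }

   /**
    * The same logic with vertically aligned braces.
    *
    * @param a_bFirstItem The condition being tested.
    * @return The final value of v_nI.
    */
   static int braced( boolean a_bFirstItem )
   {
      int v_nI = 0;
      int v_nJ = 0;
      if ( a_bFirstItem )
      {
         v_nJ = 1;
         v_nI = 1;
      }
      return v_nI;
   }
}
```

Calling misleading( false ) returns 1 while braced( false ) returns 0, even though the two bodies look equivalent when you skim them.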

In short, the “same line” convention for braces intentionally decreases legibility to fit more code on screen at once. Since “fitting code on screen” is not a goal of this coding convention, but legibility is, this coding convention requires vertically aligned braces.

Source Code Organization

Class definitions should obey the following rules:

Methods should be short and concise. A method should do exactly one thing. If it does more, it should be split into multiple methods. If you find that a method is deeply nested (indented three or more times), it should also probably be split.

Documentation Comments

This convention uses JavaDoc to automatically generate developer documentation. This requires developers to provide JavaDoc comments with every class, interface, constructor, method, and field.

A single JavaDoc comment, set inside the comment delimiters /**...*/, must be provided immediately before each class, method or interface declaration. Only one JavaDoc comment is used per declaration. JavaDoc comments should not be positioned inside a method or constructor definition block, because the JavaDoc processor associates documentation comments with the first declaration after the comment.

The idea of such comments is that there are tools which can parse the code for these comments and generate HTML references to all of the classes in the project. We will then publish these classes to an internal web server for use by developers.

Most Java development environments support automatic generation of JavaDoc “skeletons”, defining the JavaDoc comments with placeholders you can fill in.

Implementation Comments

Implementation comments are the more typical kind of comments used in code, providing insight into the use of a particular variable or line of code, etc. Code should be commented as much as is practical. The goal of commenting code is to explain to engineers who will be tracking bugs down what is happening as quickly and concisely as possible.

Developers have acquired many styles of documentation over the years, but it turns out that many of these (such as large blocks of * characters) break JavaDoc, so this convention places a few restrictions on style of implementation comments. Only comments described in the following sections will be allowed.

Block Comments

For large blocks of commentary that do not belong in JavaDoc comments, such as a file header or to describe a complex algorithm, the following style is used:

/*
 * Part of a block comment that goes on and on and tells you how
 * wonderful the code is.
 */

The important thing here is not to start the comment with /**, because comments that start with these three characters will be processed as JavaDoc comments. You should avoid using // style comments for block comments.

Single Line Comments

Single line comments precede the line they describe. These are the most common kind of comments. They should be indented at the same level as the following line, and can take either of these two forms:

/* One style of single line comment */
// Another style of single line comment

Trailing Comments

Trailing comments come at the end of a line, usually describing a specific field or branch condition. They are generally very short. These can use the same styles as single line comments:

return false;    // Explain why here.

Code Exclusion Comments

Sometimes you need to comment out blocks of code. The preferred method for this is to use ifdefs, provided the language you are using supports them. If not, the next best method is to start the block with /* and end it with */. Again, be careful not to use /** to start the block. This allows the code to be commented back in with limited effort. Placing // in front of each line in the block is less desirable, but allowed.

Bug Fix Comments

When making a fix to a bug you should enter a comment near the fix containing the date, your initials and the bug number, using the following format:

// <initials> m/d/yy <bugnumber>: <comment>
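
As an illustration of this format, the Java sketch below marks a one-line fix. Everything specific in it is hypothetical: the class PageCounter, the initials XYZ, the date and the bug number 1234 were invented for the example.

```java
/**
 * Hypothetical helper used to illustrate a bug fix comment.
 */
class PageCounter
{
   /**
    * Computes how many pages are needed to show a number of items.
    *
    * @param a_nItemCount Total number of items.
    * @param a_nPageSize  Number of items per page; must be positive.
    * @return The number of pages required.
    */
   static int countPages( int a_nItemCount, int a_nPageSize )
   {
      // XYZ 3/14/02 1234: Round up so a partial last page is counted.
      int v_nPages = ( a_nItemCount + a_nPageSize - 1 ) / a_nPageSize;
      return v_nPages;
   }
}
```

A search of the source for the bug number finds this line directly, and the short comment tells the next engineer why the rounding is there.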

There are two reasons for this kind of comment. The first involves preventing a sort of infinite recursion on bug fixes. Say Usario is in charge of feature X. A bug occurs in feature X, and Usario finds and kills it, but does not comment the fix he made. Unbeknownst to Usario, the code he fixed is also used by feature Y, and his fix winds up breaking feature Y. The bug that is now in feature Y is reported and assigned to Benutzer. Benutzer comes in and, seeing no comment, undoes the fix made by Usario. Benutzer also does not add a comment. This fixes feature Y, but re-breaks feature X. So a bug gets assigned to Usario… and so on forever. Well, maybe not forever. Usually after the third repetition or so, it gets figured out, but that tends to take weeks. This all sounds a bit goofy, but has definitely happened here repeatedly.

Now, suppose Usario had added the proper bug fix comment, with bug number. When Benutzer gets into the code, he can look up Usario’s bug, and figure out a way to solve both Usario’s and his own problem. Note that, with the bug number there, Usario does not have to remember why he made the fix, since often it would have been made weeks previously. Crucially, both problems can be unit tested by Benutzer before he checks in the fix.

The second reason isn’t needed often, but can be very costly (or impossible) to perform if the comments are not there. Sometimes it is necessary to figure out what changes were made to fix a certain bug. With the bug numbers in the comments, a search can be done to figure this out very quickly. Without the bug numbers around, this task becomes extremely difficult.

Note that bug fix comments need appear only at the beginning of the fix (or someplace nearby), and need not “wrap” the fix with beginning and end comments as previously required. This is because the begin and end comments made the code very confusing when bugs began to accumulate. Sometimes making the comments will be overkill (e.g. when changing every third line of a method). In such cases, just keep the two reasons for this kind of comment in mind and use your best judgment about keeping the code legible. For example, if you make significant changes to a method, a single comment detailing all of the changes is probably sufficient.

Along the same lines, if you completely rewrite a method or a class, you probably don’t need to be concerned with preserving these numbers. Just make certain that the situations addressed by those bug fixes are addressed and unit tested.

Convention Recommendations

This section recommends a couple of style practices that you might want to use. These are not part of the requirement, so you are free to ignore them; however, you may find some of the rationale for each item persuasive.

One Line, One Operation

Many people code using a strict requirement that a single line of code should do exactly one thing. The idea is that your code is much clearer and (perhaps more importantly) easier to debug. Consider this line of C code, for example:

theString += *++aBuffer; // Don’t do this

Can you explain what this line does in fewer than ten words? Did you understand what it did in less than three seconds? For most people, the answer to both of these questions is no. This is because the line above actually does three things. It increments a pointer, it dereferences the pointer, and it appends the dereferenced character to a string. Understanding the code also requires explicit knowledge of C’s order of operation rules to figure out how the * and ++ operators resolve. Under a one line, one operation discipline, the above line of code becomes these three lines:

++aBuffer;
char tempChar = *aBuffer;
theString += tempChar;

Each line does exactly one thing. This code is close to being self-commenting, unlike the single line version. Did you follow this code more easily than the single line version?
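
The same discipline carries over to Java, where an array index plays the role of the pointer. The sketch below is only an illustration; the class OneOperationDemo and its methods are hypothetical names, and an int array stands in for the character buffer.

```java
/**
 * Hypothetical comparison of a compact line and its one-operation form.
 */
class OneOperationDemo
{
   /**
    * Compact form: increment, array read and addition on one line.
    *
    * @param a_Values Array with at least two elements.
    * @return The second element of the array.
    */
   static int compact( int[] a_Values )
   {
      int v_nTotal = 0;
      int v_nIndex = 0;
      v_nTotal += a_Values[ ++v_nIndex ];   // Don't do this
      return v_nTotal;
   }

   /**
    * The same logic with one operation per line.
    *
    * @param a_Values Array with at least two elements.
    * @return The second element of the array.
    */
   static int stepwise( int[] a_Values )
   {
      int v_nTotal = 0;
      int v_nIndex = 0;
      ++v_nIndex;
      int v_nValue = a_Values[ v_nIndex ];
      v_nTotal += v_nValue;
      return v_nTotal;
   }
}
```

Both methods return the same result, but in the stepwise version each intermediate value has a name you can inspect in a debugger.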

Another advantage of this approach is that debugging is a bit cleaner. Consider this code for example:

foo( bar(), fum() ); // Don’t pass functions as parameters like this

Suppose you want to step into foo. You will first have to step into and out of bar, then into and out of fum, then into foo. Not the best use of your time. Also, suppose you want to check the return value of fum in the debugger. This can be a bit difficult. The following code is preferable:

int returnBar = bar();
int returnFum = fum();
foo( returnBar, returnFum );

As a side note, some people worry that adding extra local variables like this is somehow inefficient. It isn’t. In fact, an optimizing compiler will typically generate identical code for the single line version and the three line version.

One last aspect of one line, one operation coding is the declaration of variables. One line, one operation applies here as well, meaning that you should declare only one variable per line.

One Method, One Screen

As a rule of thumb, the code for a method should fit in a single screen. If the method is longer than this, it should be broken into smaller methods. Another clue that you should break up a method is if your nesting is more than three levels deep.

One Method, One Return

A method should only return from one place, the end. Often, you will get the urge to place a return statement at some random place in the middle of the method (usually inside an if condition). Strongly resist this urge. Often, you may find yourself saying “a return here will make this code so much easier”. In almost all cases, if you are saying this, it is because your code is much too complicated. Re-think and re-organize it. Code that needs returns in the middle is generally going to be buggy and much harder for others to understand.

Generally, if you start writing a method with the “one return” principle in mind, it is much easier to write good code.
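
As a small sketch of what this looks like in practice, the two Java methods below compute the same value. The class ShippingDemo, its methods and the pricing rule are all hypothetical, chosen only to keep the example short.

```java
/**
 * Hypothetical comparison of early returns and a single return.
 */
class ShippingDemo
{
   /**
    * Version with returns scattered through the method.
    *
    * @param a_nWeight  Weight of the parcel.
    * @param a_bExpress true for express delivery.
    * @return The shipping cost.
    */
   static int costWithEarlyReturns( int a_nWeight, boolean a_bExpress )
   {
      if ( a_nWeight <= 0 )
      {
         return 0;
      }
      if ( a_bExpress )
      {
         return a_nWeight * 3;
      }
      return a_nWeight;
   }

   /**
    * Version that returns from one place, the end.
    *
    * @param a_nWeight  Weight of the parcel.
    * @param a_bExpress true for express delivery.
    * @return The shipping cost.
    */
   static int cost( int a_nWeight, boolean a_bExpress )
   {
      int v_nCost = 0;
      if ( a_nWeight > 0 )
      {
         if ( a_bExpress )
         {
            v_nCost = a_nWeight * 3;
         }
         else
         {
            v_nCost = a_nWeight;
         }
      }
      return v_nCost;
   }
}
```

With a single return, a breakpoint on the last line of cost always shows the final value of v_nCost, whichever branch was taken.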

Local Initialization

Making sure that you assign a value to a local variable when it is declared often eliminates bugs down the road. What usually happens is that someone writes code like this:

int itemCount; // Don’t do this
// some other code is here
itemCount = m_List.GetCount();

Then someone will come in much later and insert code between where itemCount is defined and initialized that makes use of itemCount in an uninitialized state. Generally, it is good to get into the habit of doing this:

int itemCount = 0;
// some other code is here
itemCount = m_List.GetCount();

The only time you would not want to do this is if the initialization of the variable is somehow computationally costly.

Scope as Locally as Possible

Similar to the idea of initializing locals at declaration time as a “defense measure”, another good practice is to define local variables with as local a scope as possible, as close to where they are first used as possible.
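
A minimal Java sketch of this practice follows. The class ScopeDemo and its method are hypothetical names used only for illustration.

```java
/**
 * Hypothetical example of declaring locals in the narrowest scope.
 */
class ScopeDemo
{
   /**
    * Adds up the squares of the given values.
    *
    * @param a_Values Values to square and sum.
    * @return The sum of the squares.
    */
   static int sumOfSquares( int[] a_Values )
   {
      int v_nTotal = 0;
      for ( int v_nValue : a_Values )
      {
         // v_nSquare exists only inside the loop body, where it is used.
         int v_nSquare = v_nValue * v_nValue;
         v_nTotal += v_nSquare;
      }
      return v_nTotal;
   }
}
```

Because v_nSquare is declared inside the loop, no code before or after the loop can accidentally read or reuse it.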

Scope and Variable Names

Avoid declaring the same variable names in nested scopes. For example, if a class has a data member called theList, do not declare a local or an argument called theList inside a method of that class. Note that if you follow the convention for variable names, most of this problem will be eliminated. For example, following the convention, it is impossible to have, say, a class data member with the same name as a local variable in a method of that class.
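
The Java sketch below shows the kind of bug that name clashes invite, and how the prefixes rule it out. Both classes, ShadowDemo and PrefixedDemo, are hypothetical and exist only for this illustration.

```java
/**
 * Hypothetical class whose argument hides a data member of the same name.
 */
class ShadowDemo
{
   /** Data member named without a prefix, inviting a clash. */
   private String name = "unset";

   /**
    * Bug: inside this method "name" refers to the argument, so the
    * assignment copies the argument to itself and the member never changes.
    *
    * @param name The new name.
    */
   void setNameShadowed( String name )
   {
      name = name;
   }

   /**
    * @return The stored name.
    */
   String getName()
   {
      return name;
   }
}

/**
 * Hypothetical class following the naming convention; no clash is possible.
 */
class PrefixedDemo
{
   /** Object data member. */
   private String m_Name = "unset";

   /**
    * @param a_Name The new name.
    */
   void setName( String a_Name )
   {
      m_Name = a_Name;
   }

   /**
    * @return The stored name.
    */
   String getName()
   {
      return m_Name;
   }
}
```

After calling setNameShadowed the member still holds its original value, while setName on the prefixed class stores the new one as intended.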

Consider How Class Names are Sorted

When defining the name of a class, keep in mind that classes are usually sorted alphabetically within the development environment. Therefore, it is usually desirable to give classes names that list the general subject first, as this groups them together in the environment and source control system.

For example, say you have two classes for dealing with widgets, one responsible for editing widgets and the other responsible for displaying them. Naming these classes WidgetEditor and WidgetDisplayer is a better choice than calling them DisplayWidget and EditWidget, as then the classes are grouped by their common subject matter. Using the latter scheme, all of the display classes are grouped together, and this is rarely how you will interact with these classes as a developer. Usually, you go into code based on subject matter.
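
To see the effect of the two naming schemes, the Java sketch below simply sorts a few class names the way a file listing would. The helper class NameSortDemo and the Gadget class names are hypothetical additions made for the comparison.

```java
/**
 * Hypothetical helper that sorts class names alphabetically.
 */
class NameSortDemo
{
   /**
    * Returns a sorted copy of the given names.
    *
    * @param a_Names Class names in any order.
    * @return The same names in alphabetical order.
    */
   static String[] sorted( String[] a_Names )
   {
      String[] v_Sorted = a_Names.clone();
      java.util.Arrays.sort( v_Sorted );
      return v_Sorted;
   }

   /**
    * Prints both naming schemes after sorting.
    *
    * @param a_Args Unused.
    */
   public static void main( String[] a_Args )
   {
      String[] v_SubjectFirst = { "WidgetEditor", "GadgetEditor", "WidgetDisplayer", "GadgetDisplayer" };
      String[] v_VerbFirst = { "EditWidget", "DisplayGadget", "DisplayWidget", "EditGadget" };
      System.out.println( String.join( ", ", sorted( v_SubjectFirst ) ) );
      System.out.println( String.join( ", ", sorted( v_VerbFirst ) ) );
   }
}
```

With subject-first names the two Widget classes end up next to each other; with verb-first names they are separated by classes that deal with unrelated subjects.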

Cut & Paste: The Enemy

Never, ever cut a block of code from one place and then paste the identical block in another bunch of code. Always (at least) create a new method with the common code, and have both locations call that method instead. Often, some other way of sharing code will be more appropriate (such as making a common parent class), but the main point is that you should never have the same code in an application more than once. Share. This cuts down on bugs and makes code much easier to maintain.

The crime of cut & paste code is so awful that, if you see it committed in someone else’s code, you should immediately find the culprit and make them fix it.
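
A minimal sketch of the simplest remedy, a shared method, is shown below in Java. The class LabelDemo, its methods and the label text are hypothetical and serve only to illustrate the idea.

```java
/**
 * Hypothetical example of sharing code through a common method.
 */
class LabelDemo
{
   /**
    * The common code lives here, in exactly one place.
    *
    * @param a_FirstName First name, possibly padded with spaces.
    * @param a_LastName  Last name, possibly padded with spaces.
    * @return The name in "Last, First" form.
    */
   private static String formatName( String a_FirstName, String a_LastName )
   {
      String v_FullName = a_LastName.trim() + ", " + a_FirstName.trim();
      return v_FullName;
   }

   /**
    * @param a_FirstName First name of the biller.
    * @param a_LastName  Last name of the biller.
    * @return A label for a biller.
    */
   static String billerLabel( String a_FirstName, String a_LastName )
   {
      String v_Label = "Biller: " + formatName( a_FirstName, a_LastName );
      return v_Label;
   }

   /**
    * @param a_FirstName First name of the payer.
    * @param a_LastName  Last name of the payer.
    * @return A label for a payer.
    */
   static String payerLabel( String a_FirstName, String a_LastName )
   {
      String v_Label = "Payer: " + formatName( a_FirstName, a_LastName );
      return v_Label;
   }
}
```

If the name format ever changes, or a bug turns up in the trimming, there is one method to fix rather than two pasted copies that can drift apart.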

Bracing Single-Line Conditions

While not a requirement of this convention, there are some benefits to slavishly using braces on every condition, even those that are only a single line, like so:

if ( firstItem )
{
   i = 0;
}

Some consider this overly paranoid, but it can prevent certain kinds of bugs. This same idea carries over to switch/case statements as well.
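
One bug of this kind is the so-called dangling else, sketched below in Java. The class DanglingElseDemo and its methods are hypothetical names for this illustration. In the unbraced version the else is indented to match the outer if, but the language attaches it to the nearest if, which is the inner one.

```java
/**
 * Hypothetical demonstration of the dangling else problem.
 */
class DanglingElseDemo
{
   /**
    * Without braces: the else pairs with the inner if.
    *
    * @param a_bOuter Outer condition.
    * @param a_bInner Inner condition.
    * @return A word describing which branch ran.
    */
   static String unbraced( boolean a_bOuter, boolean a_bInner )
   {
      String v_Result = "none";
      if ( a_bOuter )
         if ( a_bInner )
            v_Result = "inner";
      else
         v_Result = "else";   // Belongs to the inner if, despite the indentation
      return v_Result;
   }

   /**
    * With braces: the else pairs with the outer if, as the layout suggests.
    *
    * @param a_bOuter Outer condition.
    * @param a_bInner Inner condition.
    * @return A word describing which branch ran.
    */
   static String braced( boolean a_bOuter, boolean a_bInner )
   {
      String v_Result = "none";
      if ( a_bOuter )
      {
         if ( a_bInner )
         {
            v_Result = "inner";
         }
      }
      else
      {
         v_Result = "else";
      }
      return v_Result;
   }
}
```

For a false outer condition, unbraced returns "none" while braced returns "else", so the two versions disagree exactly where the indentation is misleading.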

Java Code Sample

/*
* @(#)Sample.java 1.2 01/09/19
*
* Copyright (c) 2002 DivNull Software
* Centereach, NY 11720
* All Rights Reserved.
*
*/

package com.divnull.sample;
import java.blah.blahdy.BlahBlah;

/**
* Class description goes into this JavaDoc comment.
*
* @version 1.2 19 Sep 2001
* @author Firstname Lastname
*/

public class Sample extends SomeClass
{
   /* A class implementation comment can go here. */

   /** s_nClassVar1 documentation (JavaDoc) comment */
   public static int s_nClassVar1;

   /**
    * s_ClassVar2 JavaDoc comment that happens to be
    * more than one line long
    */
   private static Object s_ClassVar2;

   /** m_InstanceVar1 documentation comment */
   public Object m_InstanceVar1;

   /** m_nInstanceVar2 documentation comment */
   protected int m_nInstanceVar2;

   /** m_InstanceVar3 documentation comment */
   private Object[] m_InstanceVar3;

   /**
    * ... constructor Sample documentation comment...
    */
   public Sample()
   {
      // ...implementation goes here...
   }

   /**
    * ... method doSomething documentation comment...
    *
    * @return Something calculated by the method.
    */
   public int doSomething()
   {
      int v_nRetVal = 0;
      // ...implementation goes here...
      return v_nRetVal;
   }

   /**
    * ... method doSomethingElse documentation comment...
    *
    * @param   a_NameStr   A name used by the method. This should
    * be a good description of this parameter.
    * @throws   SomeException   Describes why this exception would be thrown.
    */
   public void doSomethingElse( String a_NameStr ) throws SomeException
   {
      // ...implementation goes here...
   }
}

