Coding Guidelines for CS-701

Spring 2003

Introduction

This document gives the rules you are to follow for all the programs you write for CS-701. The goal is for you to write code that is correct, robust, efficient, and easy to understand. Most of the detailed specifications given in this document apply to the "easy to understand" criterion, because code that is easy to understand is also the most likely to be correct, robust, and efficient!

Programming Language

All code for this course is to be written in C++ and compiled using the GNU g++ compiler.

All code must compile and link with no warnings and no errors when compiled with g++ using the

-g  -Wwrite-strings  -Wall

command line options.

There is a separate web page on writeable strings that explains the significance of the -Wwrite-strings option.
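
As a brief illustration of what that option checks, here is a minimal sketch. All of the names in it (courseName, courseLabel, capitalizeLabel(), and labelMatchesCourse()) are made up for this example. A string literal may not be modified, so a pointer to one should be declared const char *, and text you intend to change should live in an array of its own.

```cpp
#include <string.h>

//  Read-only text: the pointer type promises not to change it.
//  Declared as plain "char *", this line would draw a warning.
const char *courseName = "CS-701";

//  Writable text: the array holds its own copy of the characters.
char courseLabel[] = "cs-701";

/*  capitalizeLabel()
 *  ------------------------------------------------------------
 *  Changes the first two letters of courseLabel to upper case.
 */
void capitalizeLabel()
{
  courseLabel[0] = 'C';
  courseLabel[1] = 'S';
}

/*  labelMatchesCourse()
 *  ------------------------------------------------------------
 *  Returns true if courseLabel and courseName hold the same text.
 */
bool labelMatchesCourse()
{
  return strcmp( courseLabel, courseName ) == 0;
}
```

Calling capitalizeLabel() changes only the array; the literal that courseName points to is never written.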

Because you are programming in C++ you must supply a function prototype for every function you define except main(). (The C language does not require prototypes, but C++ does.)

Your function prototype must be in a header file if and only if the function is defined in one source module and referenced from another module. If the function is referenced only from within the same module in which it is defined and if the definition precedes all references to the function, the definition itself may be used as the prototype.
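
Here is a minimal sketch of these rules inside a single source module. The function names are hypothetical. isSeparator() is defined before its only references, so its definition doubles as its prototype. countWords() is referenced (by totalWords()) before it is defined, so it needs a prototype; that prototype would move to a header file only if another module called countWords().

```cpp
//  Prototype: countWords() is referenced before its definition.
int countWords( const char *text );

/*  isSeparator()
 *  ------------------------------------------------------------
 *  Returns true if c is a space, tab, or newline.
 */
bool isSeparator( char c )
{
  return c == ' ' || c == '\t' || c == '\n';
}

/*  totalWords()
 *  ------------------------------------------------------------
 *  Returns the number of words in two strings combined.
 */
int totalWords( const char *first, const char *second )
{
  return countWords( first ) + countWords( second );
}

/*  countWords()
 *  ------------------------------------------------------------
 *  Returns the number of separator-delimited words in text.
 */
int countWords( const char *text )
{
  int numWords = 0;
  for ( int i = 0; text[i] != '\0'; i++ )
  {
    //  A word starts at a non-separator that begins the text
    //  or that follows a separator.
    if ( !isSeparator( text[i] ) &&
         ( i == 0 || isSeparator( text[i - 1] ) ) )
    {
      numWords++;
    }
  }
  return numWords;
}
```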

You may use most, but not all, of the features of the C++ language in your code. Parts of the language not to use include:

Project Management

An important goal of this course is for you to learn how to use the Revision Control System (RCS) and the make utility to manage programming projects. There are separate web pages on Using RCS and Using make that will be assigned at the appropriate time in the course. Meantime, the following material is to help you get your projects set up correctly in general.

Set up a separate project directory for each assignment, and put only the files that actually are part of that assignment in the project directory. For assignments that are continuations of previous assignments, reuse the same project directory.

Create a subdirectory named RCS (must be all capital letters) in the project directory. You will practice using rcs to keep track of program versions in this course.

Once you have learned how to use rcs and make (or gmake as the case may be), you will be required to make the project directory "clean" so it contains nothing but the RCS subdirectory and possibly a text file named README (spelled and capitalized just like that) before submitting an assignment.

To submit an assignment, make the project directory "clean," change to the directory above the project directory, and create a tar file of the project directory using a command like the following:

    tar cvf Assignment_1.tar Assignment_1

This example assumes the project directory name is Assignment_1. After you create the tar file, submit it to me as an attachment to an email message.

Warning! Tar files are "binary" (not text) files, and will be corrupted if you are not careful when you copy them from one computer to another. If you use ftp to transfer a tar file, for example, you have to be sure to use "binary mode" for the file transfer. To be sure your tar file is in good shape, you should email a copy to yourself using exactly the same technique you will use to send it to me. Then create a temporary directory separate from your project directory, save the tar file there, and be sure you can extract the contents of the received copy successfully using a command like the following:

    tar xvf Assignment_1.tar

Text Editing

You must use a Unix programmer's editor to prepare all source files for this course. Two free editors that will satisfy this requirement are vim ("Vi Improved") and emacs. Both of these editors normally come preinstalled on all Unix systems. There are other editors available for various Unix systems, but these two are industry standards, and you should pick one of them unless you are already very familiar with a different one. Of the two, vim is easier to use and perhaps more universally available, so that is the assumed editor of choice for the course. Examples of editors that are not programmer's editors are pico and the editors packaged with various Linux desktop environments, such as KDE or Gnome. And, of course, Notepad, which is a Windows program anyway.

What makes an editor a "programmer's editor?" Any programmer's editor will include at least the following features to help you produce easy to read and well-formatted source code:

Setting Up Your Editor

There is a web page here to help you install and set up the current version of vim on your computer. The current version of vim should work well as soon as you install it. But that web page includes a sample initialization file (.vimrc) that you should copy to your home directory to be sure all features, such as tab expansion, are set up properly.

Coding Style

The remainder of this document tells you how to structure your C++ code so that it meets the course requirements for Correctness, Robustness, and Efficiency.

A correct program is one that does what it is supposed to do when all inputs take on values anywhere within their expected range. Be sure to test your programs for correctness before submitting them. Pay special attention to situations on the edge of the expected range of values. For example, if your program has to read lines of text from a file, it should not have a problem dealing with empty lines.
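
As a sketch of the empty-line case, here is one way to count the lines in an input stream so that empty lines are handled like any others. The function names are hypothetical, and countLinesInText() exists only so the edge cases are easy to try on a string instead of a file.

```cpp
#include <istream>
#include <sstream>
#include <string>

/*  countLines()
 *  ------------------------------------------------------------
 *  Returns the number of lines available from input.  Empty
 *  lines are counted just like any other line.
 */
int countLines( std::istream &input )
{
  int numLines = 0;
  std::string line;
  while ( std::getline( input, line ) )
  {
    numLines++;
  }
  return numLines;
}

/*  countLinesInText()
 *  ------------------------------------------------------------
 *  Returns the number of lines in a string, for testing.
 */
int countLinesInText( const std::string &text )
{
  std::istringstream input( text );
  return countLines( input );
}
```

Inputs worth trying are the ones on the edge: no text at all, text made up only of empty lines, and a last line with no newline at the end.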

A robust program is one that behaves in a "reasonable manner" when it encounters input values that are not within their expected range or if expected parameters are missing. The reasonable thing to do, depending on the severity of the error and the nature of the program, is to issue an error message and continue processing (recovery) or to issue an error message and terminate (abort). Be sure to test your program's behavior in response to "bad input" before submitting it too. Instead of "garbage in, garbage out" your code should operate on the "garbage in, explanation out" principle. Explanations should be meaningful, but terse.
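
A minimal sketch of "garbage in, explanation out," assuming a program that expects a count between 1 and 100. The function parseCount(), the constants MIN_COUNT and MAX_COUNT, and the wording of the messages are all invented for this example. The caller decides whether a false return means recovery or abort.

```cpp
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

#define MIN_COUNT 1
#define MAX_COUNT 100

/*  parseCount()
 *  ------------------------------------------------------------
 *
 *  Summary
 *    Converts text to an integer between MIN_COUNT and MAX_COUNT.
 *
 *  Arguments
 *    text    The characters to convert.
 *    count   Pointer to an int that receives the value.  It is
 *            modified only if the conversion succeeds.
 *
 *  Return Value
 *    true if text is valid; false, after printing an explanation
 *    on stderr, if it is not.
 *
 *  Algorithm
 *    1.  Convert the text.
 *    2.  Explain and fail if the text is not an integer.
 *    3.  Explain and fail if the value is out of range.
 *    4.  Store the value and succeed.
 */
bool parseCount( const char *text, int *count )
{
  //  Convert the text.
  char *end = 0;
  errno = 0;
  long value = strtol( text, &end, 10 );

  //  Explain and fail if the text is not an integer.
  if ( end == text || *end != '\0' )
  {
    fprintf( stderr, "\"%s\" is not an integer\n", text );
    return false;
  }

  //  Explain and fail if the value is out of range.
  if ( errno == ERANGE || value < MIN_COUNT || value > MAX_COUNT )
  {
    fprintf( stderr, "%s is not between %d and %d\n",
             text, MIN_COUNT, MAX_COUNT );
    return false;
  }

  //  Store the value and succeed.
  *count = (int) value;
  return true;
}
```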

An efficient program is one that performs only those computations necessary to accomplish the work at hand. Aside from the obvious advantage of executing quickly, efficient code is typically much easier for someone else to read and understand than code which performs extraneous operations, which the reader has to understand in order to know that they can safely be ignored!

Source File Structure

As mentioned in the Introduction, making code easy to read is one important way to achieve the goals of correctness, robustness, and efficiency. To make your code easy to read, all source modules (.cc files and .h files) must contain the following sections in the order listed here:

File Introduction

The file must begin with a block of comments that introduce the file, called the File Introduction. The first line of the file introduction must contain the RCS keyword, $Id$, which will be expanded by RCS to give the file's name, date of modification, and some other information. Be sure to punctuate and capitalize the keyword exactly as shown, or RCS won't recognize it. Using the RCS utility for project management will be covered in class.

The File Introduction then continues with comments that give a Summary of the file's contents.

If there is more than one function definition in the file, follow the file summary with a Functions section, which is a list of the names of all the functions defined in the file, along with a brief phrase identifying each. Each item in this list should normally fit on a single line. The list must be in the same order as the sequence of function definitions in the file. Do not list functions referenced from within the file, only functions defined in the file.

The Revision History for the file is the section of the file introduction in which you list the changes made to the file and the dates the changes were made. The good news is that you can generate the contents of this section completely automatically by putting the RCS keyword $Log$ inside your comments.

Include the Author's Name in the Author section of the File Introduction. When more than one programmer works on a file, the authors' names go in the revision history. For this course, only one person works on a file, so the author's name goes in its own section. If I give you some code to use as basis for part of a project, put your name underneath mine with a comment to the effect that you modified the code. (Don't put in my name unless I provided a significant part of the code. Sample code given in class, for example, is not "significant.")

Each of the File Introduction sections ("Summary," "Function Names," "Revision History," and "Author") should be preceded by its own sub-heading name, but you may omit these names if you think it makes your code easier to read.

Here is a template you can use for your File Introductions:

     //  $Id$
     /*
      *  Summary
      *
      *    [A sentence or two giving the role of this module in the
      *    overall structure of the project.]
      *
      *  Function Names
      *
      *    [A list of the names of all functions defined in this
      *    module in the order in which they are defined.  Follow each
      *    name with a phrase summarizing what the function does.]
      *
      *  Revision History
      *
      *    $Log$
      *
      *  Author:  [Your Name]
      *
      */

Notice that the first line uses the // type of comment, but that the other lines use the /* ... */ style. It's important to use the /* ... */ for the part of the comment block that includes the RCS $Log$ keyword because RCS will expand $Log$ into a list of comments designed to fit inside /* ... */ comment blocks, not in // comment lines.

All Makefiles and man page files must also have a file header with comments giving the $Id$, Summary, Revision History, and Author sections.

Include Files, Manifest Constants, and Macro Definitions

All #include and #define statements follow the File Introduction section. Put #define statements that are used in multiple source modules into a header file. Although ANSI C does not require it, putting a header file name in angle brackets (< and >) conventionally means the header file is one that is supplied with the compiler or part of a standard programming package that is installed on the development system. Putting the name of the header file in quotes (") conventionally means that the header file is specific to the current project and is located in the project directory.
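
The sketch below shows both conventions in one listing. The header name linelimit.h, the constant MAX_LINE_LENGTH, and the function lineFits() are hypothetical, and the #ifndef guard around the header contents is a common practice assumed here, not something these guidelines require. The project #include is commented out only so the listing compiles as a single file.

```cpp
//  ---- What the project header "linelimit.h" might contain ----
#ifndef LINELIMIT_H
#define LINELIMIT_H

#define MAX_LINE_LENGTH 72

bool lineFits( const char *line );

#endif

//  ---- What a source module using that header might contain ----
#include <string.h>         //  angle brackets: comes with compiler
//  #include "linelimit.h"  //  quotes: lives in project directory

/*  lineFits()
 *  ------------------------------------------------------------
 *  Returns true if line is no longer than MAX_LINE_LENGTH
 *  characters, not counting the terminating null.
 */
bool lineFits( const char *line )
{
  return strlen( line ) <= MAX_LINE_LENGTH;
}
```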

Function Definitions

Every function definition (including main() !!!) begins with a Function Introduction, followed by the function definition itself.

Function Introduction

A function introduction is a block of comments that contains the following information in the order listed here:

Name
The name of the function.
Summary
A statement of the purpose of the function. Use one or more complete sentences. Write in the present tense.
Arguments
List the arguments to be passed to the function using the names used in the function definition. Give a phrase telling what each is used for and any assumptions made about the valid range of values the argument may meaningfully take. If any arguments are pointers to values that are modified by the function, say so.
Return Value
Tell what values are returned by the function. If no value is returned, say so.
Global Variables
List any global variables that are referenced or modified by the function.
Algorithm
List the steps the function executes. Use imperative sentences.

Here is a template you could use for Function Introductions. As with the File Introduction, the heading names are recommended, but not required.

  /*  functionName()
   *  ------------------------------------------------------------
   *
   *  Summary
   *
   *  Arguments
   *
   *  Return Value
   *
   *  Global Variables Referenced
   *
   *  Global Variables Modified
   *
   *  Algorithm
   *
   *    1.
   *
   */

Note the comment line with dashes under the function name. The idea is to make it clear where each function definition begins. The next programmer to read your code (the one who has to add a new feature or fix a bug) will be eternally grateful for this textual landmark. Likewise, notice that all comments need to have two spaces between the comment leader (the //, /*, or the * at the beginning of the line) and the textual part of the comment. This whitespace is an important part of making your code easy to read.

Omit the Global Variables sections if your program doesn't use global variables.

The Algorithm section does not need to be at all elaborate. The idea here is to provide the reader with a guide to the code that makes up the function definition. The code itself will give the details of the algorithm; the comments here, which should parallel the lines of comments in the function body, should give the reader an overview of the algorithm being implemented.
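
Here is one way a filled-in Function Introduction and its function body might look. The function findLargest() is invented for this example. Notice how each comment in the body repeats a step from the Algorithm section, and that the Global Variables sections are omitted because none are used.

```cpp
/*  findLargest()
 *  ------------------------------------------------------------
 *
 *  Summary
 *
 *    Finds the position of the largest value in an array of
 *    integers.
 *
 *  Arguments
 *
 *    values      The array to search.  It is not modified.
 *    numValues   The number of elements in values.  May be zero.
 *
 *  Return Value
 *
 *    The index of the first occurrence of the largest value, or
 *    -1 if numValues is less than 1.
 *
 *  Algorithm
 *
 *    1.  Return -1 if the array is empty.
 *    2.  Assume the first element is the largest.
 *    3.  Replace the assumption with any later element that is
 *        larger.
 *    4.  Return the index of the largest element found.
 *
 */
int findLargest( const int values[], int numValues )
{
  //  Return -1 if the array is empty.
  if ( numValues < 1 )
  {
    return -1;
  }

  //  Assume the first element is the largest.
  int largest = 0;

  //  Replace the assumption with any later element that is larger.
  for ( int i = 1; i < numValues; i++ )
  {
    if ( values[i] > values[largest] )
    {
      largest = i;
    }
  }

  //  Return the index of the largest element found.
  return largest;
}
```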

Writing Function Definitions

Use meaningful variable names. (However, "anonymous" variable names like i, j, and k are OK for integers used to index arrays.) In general, you do not need to comment your variable declarations. Use a consistent style of capitalization and underscores for variable names. Manifest constants (set up using #define) are normally all capitals. Variable and function names should be mixed upper/lower case, with underscores or internal capitals to separate words inside a name. (num_commands or numCommands, for example.) Data types that you define using struct, enum, or typedef conventionally have names that end in "_t." (struct node_t { ... }; for example)
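
A short sketch of these naming conventions in use. Every name in it (MAX_COMMANDS, command_t, countWithArgs(), and so on) is hypothetical.

```cpp
//  Manifest constant: all capitals.
#define MAX_COMMANDS 8

//  Programmer-defined type: the name ends in "_t".
struct command_t
{
  const char *name;
  int numArgs;
};

/*  countWithArgs()
 *  ------------------------------------------------------------
 *  Returns how many of the first numCommands entries in commands
 *  take at least one argument.  No more than MAX_COMMANDS entries
 *  are examined.
 */
int countWithArgs( const command_t commands[], int numCommands )
{
  int numWithArgs = 0;
  for ( int i = 0; i < numCommands && i < MAX_COMMANDS; i++ )
  {
    if ( commands[i].numArgs > 0 )
    {
      numWithArgs++;
    }
  }
  return numWithArgs;
}
```

The names numCommands and numWithArgs say what they count, while i is acceptable because it does nothing but index the array.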

Use a consistent indentation style that shows the lexical structure of your code. Blank lines and other whitespace generally improve your code's legibility. Choose an indent increment of either 2 or 4 spaces. Anything larger will result in lines of code that get pushed over to the right margin and are hard to read.

No code or comment line may be more than 72 characters long.

Remember the following feature of the language when breaking up long pieces of code so they will meet this requirement:

    printf( "This is a very long line of text that I want to print\n" );

can be rewritten as:

    printf( "This is a very long "
            "line of text that I "
            "want to print\n" );

The only comments you have to write in your function definitions are ones that correspond to the steps you listed in the Algorithm section of the Function Introduction. These comments go on lines by themselves just before the code that implements each step of the algorithm. Of course, you should add other comments if a piece of code is difficult to understand, but try to write clear code so the need for these extra comments is minimized.

There are only two acceptable arguments to the exit() function: EXIT_SUCCESS and EXIT_FAILURE. These two constants are defined in stdlib.h, along with the function prototype for exit(). These two constants normally have the values of 0 and 1 respectively, and it is all right to use these values explicitly rather than the constant names. But using any other values makes your code "unconventional" and thus more difficult for other programmers to understand easily.
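
For example, a program that requires exactly one command line argument might decide its exit status with a function like the one sketched here. The function checkArgumentCount() and the wording of the usage message are assumptions made for this example.

```cpp
#include <stdio.h>
#include <stdlib.h>

/*  checkArgumentCount()
 *  ------------------------------------------------------------
 *  Returns EXIT_SUCCESS if argc shows exactly one command line
 *  argument after the program name.  Otherwise prints a usage
 *  message on stderr and returns EXIT_FAILURE.
 */
int checkArgumentCount( int argc, const char *programName )
{
  if ( argc != 2 )
  {
    fprintf( stderr, "Usage: %s filename\n", programName );
    return EXIT_FAILURE;
  }
  return EXIT_SUCCESS;
}
```

In main(), a result of EXIT_FAILURE from checkArgumentCount( argc, argv[0] ) can be passed directly to exit().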

Avoid flag variables wherever possible. If you must have one, give it a meaningful name, and use type bool rather than int for it. Use true and false as boolean literals; they are part of the language, but other forms, such as TRUE and FALSE, are not defined on all systems.
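
The sketch below shows the same search written both ways; both function names are hypothetical. The first version needs a flag, which at least has a meaningful name and type bool. The second returns as soon as the answer is known and needs no flag at all.

```cpp
/*  hasNegativeWithFlag()
 *  ------------------------------------------------------------
 *  Returns true if any of the first numValues elements of values
 *  is less than zero.  Uses a flag to stop the loop.
 */
bool hasNegativeWithFlag( const int values[], int numValues )
{
  bool foundNegative = false;
  for ( int i = 0; i < numValues && !foundNegative; i++ )
  {
    foundNegative = values[i] < 0;
  }
  return foundNegative;
}

/*  hasNegative()
 *  ------------------------------------------------------------
 *  Returns the same result as hasNegativeWithFlag(), but with
 *  no flag variable.
 */
bool hasNegative( const int values[], int numValues )
{
  for ( int i = 0; i < numValues; i++ )
  {
    if ( values[i] < 0 )
    {
      return true;
    }
  }
  return false;
}
```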

Notes About This Style



Christopher Vickery
Queens College of CUNY