Foundations of Computer Systems
For coding assignments in this course, your program will be evaluated not only on correctness (i.e., whether it follows the assignment specification) but also on design and coding style. This document details specific design and style issues that I will look for when reviewing your programs. Because many aspects of design are highly situational, this is by no means an exhaustive list; you should follow these guidelines while also exercising your own judgment when designing your code.
This guide is targeted towards C and C++ code, but most of these principles apply to working in any programming language.
Good code should be largely self-documenting - i.e., the code itself (such as variable and function names) should generally make it clear what you are doing. More documentation is not necessarily better: clear code that needs no documentation is preferable to unclear code with extensive documentation.
Comments should not describe what the code does, but why. Novice coders (or those just starting in a new programming language) often make the mistake of documenting what the code is doing (e.g., "dereferences the pointer"). Such comments are not necessary; as a general rule of thumb, you should assume that the reader is already comfortable with the language. Instead, your comments should provide detail that is not already evident from the code itself.
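As a rough illustration of the difference, the sketch below uses a hypothetical function (find_last_index and NOT_FOUND are names made up for this example, not part of any assignment). The comment inside the function explains why the loop runs backwards rather than restating what the loop does. It assumes the array is small enough that an index fits in an int.

```c
#include <stddef.h>

#define NOT_FOUND (-1)

/* Returns the index of the last occurrence of target, or NOT_FOUND. */
int find_last_index(const int *values, size_t value_count, int target)
{
    /*
     * A "what" comment here ("loop backwards over the array") would add
     * nothing. The "why" is worth stating: scanning from the end means the
     * first match we hit is the last occurrence, so we can return at once.
     */
    for (size_t i = value_count; i > 0; i--) {
        if (values[i - 1] == target) {
            return (int)(i - 1);
        }
    }
    return NOT_FOUND;
}
```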
There are several parts of your code that do generally deserve comments:
Proper use of whitespace is essential to the readability of your code. In particular:
One easy way to make your code hard to read is to write excessively long lines of code. For this course, you should not have any lines of code longer than 100 characters (whitespace included). Note that 80 characters is a fairly typical 'real-world' limit, so this requirement is looser than what you will often encounter. Lines exceeding the maximum length should be broken up across multiple lines. To quickly check the maximum line length of file.c, you can run the following command on Linux:
wc -L file.c
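One possible way to break up a long line is to split a parameter list or call after a comma and align the continuation with the first argument. The sketch below is only an illustration; format_report and its parameters are hypothetical names, and other alignment conventions are equally acceptable as long as you apply them consistently.

```c
#include <stdio.h>

/*
 * Writes a one-line report into buffer and returns the number of
 * characters written, as reported by snprintf.
 */
int format_report(char *buffer, size_t buffer_size, const char *student_name,
                  int assignment_number, double score)
{
    /* Written on one line, this call would run past the length limit. */
    return snprintf(buffer, buffer_size,
                    "Student %s scored %.1f on assignment %d",
                    student_name, score, assignment_number);
}
```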
Variable names should clearly describe what the variable contains. Local variables with obvious purposes (such as a loop counter) may be named with a single letter (e.g., i); other variables should not. Generic variable names (e.g., asdf) should never be used. Variable names should be descriptive but concise; e.g., a name like sumOfAllArrayValues could probably be better named simply arraySum. Variable names should generally be nouns (e.g., arraySum) while function names should generally be verbs.
Variable names with multiple words should be formatted consistently. For example, arraySum and array_sum are both fine, but using arraySum in one place and array_sum in another is not. Pick your convention and stick to it -- standard C convention generally dictates using underscores to separate words, but this convention is not required as long as you are consistent.
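A small sketch of these naming guidelines, using the underscore convention throughout. The function name compute_array_sum and the parameter names are invented for this example; the point is only that the function reads as a verb, the variables read as nouns, and a single-letter name is reserved for the loop counter.

```c
#include <stddef.h>

/* Verb-style function name: says what the function does. */
long compute_array_sum(const int *values, size_t value_count)
{
    long array_sum = 0; /* noun-style variable name: says what it holds */

    /* A single-letter name is fine for an obvious loop counter. */
    for (size_t i = 0; i < value_count; i++) {
        array_sum += values[i];
    }
    return array_sum;
}
```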
"Magic numbers" are numbers in your code that have a meaning beyond their own values. For example, in the line
num_days = num_years * 365, the number 365 has a significance beyond simply being the number 365 -- it's the number of days in a year. Magic numbers like these should be named using #define at the top of the file, as follows:
#define DAYS_IN_YEAR 365
...
num_days = num_years * DAYS_IN_YEAR;
Magic values should never be used directly in your code. Note, however, that not every number actually has a meaning beyond its own value -- e.g., the values 0 and 1 usually do not need to be named.
One of the most important aspects of coding style is consistency -- not only in the areas covered by this guide, but in other areas as well (e.g., whether curly braces go on the same line as their associated keyword or on the next line). You are allowed to make your own style choices in such cases, but you should always be consistent. Unexpected style changes in a program substantially detract from readability even if the individual style choices in question are reasonable.
The issue of consistency is particularly important when working in a team. Since your team members may have their own personal preferences and conventions (which may differ from yours), it is critical to agree in advance which conventions you will use. A great way to waste time and annoy your partners at the same time is to write code using multiple different conventions, then go back later and change all your partners' code to match your own code's style. Avoid this problem and decide on your style conventions in advance!
"Dead code" is code that is not actually active in your program. Such code might include old debugging statements that you commented out (e.g.,
printf), or an old function you wrote that is never called anywhere in your program. While some dead code is an inevitable byproduct of development, you should remove all of it before submitting: a submitted program should never contain any dead code.
To the extent possible, you should strive to make your code modular. Writing appropriately modular code includes the following:
main) is almost certainly too long and should be modularized.
When writing a program, we normally assume that all functions will complete successfully (e.g., the user provides a valid input, a file can be read successfully, and so forth). It is equally important, particularly with system-level code, to consider failure cases. For many types of errors, a program may have no reasonable option but to exit (e.g., if a call to malloc fails and the program needs the memory to proceed), but it is still better to recognize this failure and cleanly exit with an error message than to blindly proceed and crash sometime later (likely with an uninformative message). In general, any time you make a call that might fail -- even if you think it's extremely unlikely that it will fail -- your code should still check for the failure case and respond appropriately.
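The malloc case might look something like the sketch below. The helper allocate_scores is a hypothetical name used only for illustration, and it assumes the caller asks for at least one element (malloc(0) is allowed to return NULL without that being an error). Exiting is just one reasonable response; in other situations, returning an error code to the caller may be more appropriate.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: returns a buffer with room for count ints, or
 * exits with a message if the allocation fails. Assumes count > 0.
 */
int *allocate_scores(size_t count)
{
    int *scores = malloc(count * sizeof *scores);

    /* Fail here, with a clear message, rather than crash later on NULL. */
    if (scores == NULL) {
        fprintf(stderr, "error: could not allocate space for %zu scores\n", count);
        exit(EXIT_FAILURE);
    }
    return scores;
}
```

The caller is still responsible for releasing the buffer with free once it is no longer needed.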
Strive for clean design over optimizing the performance of every bit of code. In the vast majority of cases, you will spend far more time debugging a more complex design than worrying about (or waiting for) your code to actually run. Of course, you should not neglect program efficiency completely, particularly when considering your high-level program design (e.g., choosing a logarithmic-time data structure over one that is only linear-time). However, be careful not to complicate your code by fine-tuning small details in a way that compromises its readability. Choosing a faster data structure to use is generally a sound decision. Replacing a multiplication operation by a bit-shift to shave off a few processor cycles is generally not. The famed computer scientist Donald Knuth is quoted as saying "premature optimization is the root of all evil". Trust him.
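To make the bit-shift example concrete, here is a small sketch with two hypothetical functions that compute the same value (for inputs that do not overflow). The first states its intent directly. The second saves little or nothing in practice, since optimizing compilers will typically generate comparable machine code for both, and it forces the reader to translate the shift back into a multiplication.

```c
#define GROWTH_FACTOR 2

/* Clear: the intent (scaling the capacity) is obvious at a glance. */
unsigned int grow_capacity(unsigned int capacity)
{
    return capacity * GROWTH_FACTOR;
}

/* Same result for a factor of 2, but the reader must decode the shift. */
unsigned int grow_capacity_shifted(unsigned int capacity)
{
    return capacity << 1;
}
```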