C and C++ Elements to Avoid

The Text Itself

Indentation means that the contents of every block are shifted to the right, by a fixed amount of whitespace, relative to the code that contains them. This makes the code easier to read and follow.

Code without indentation is harder to read and so should be avoided. The Wikipedia article about indentation style lists several styles; pick one and follow it consistently.

Some people call their variables "file". However, "file" can mean a file handle, a file name, or the contents of the file. As a result, this name should be avoided; one can use the abbreviations "fh" for a file handle and "fn" for a file name instead.
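
As a minimal sketch of this naming advice, the function below keeps the file name, the file handle and the contents apart by name. The helper count_lines() and its convention of returning -1 on failure are assumptions made for illustration only.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: counts the lines in the file whose name is fn.
// Returns -1 if the file cannot be opened (an assumed convention).
long count_lines(const std::string &fn)
{
    std::ifstream fh(fn); // fh is the file handle, not the file name
    if (!fh)
    {
        return -1;
    }

    long num_lines = 0;
    std::string line; // the contents are named after what they hold
    while (std::getline(fh, line))
    {
        ++num_lines;
    }
    return num_lines;
}
```

A caller would write something like count_lines(fn) and never has to guess whether a variable holds a name or an open stream.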

In C++, classes should start with an uppercase letter (see the Wikipedia article about letter case) and starting them with a lowercase letter is not recommended.

# Bad code


class my_class
{
    .
    .
    .
};

A better name would be:

class MyClass
{
    .
    .
    .
};

Your code should not include unnamed numerical constants also known as "magic numbers" or "magic constants". For example, there is one in this code to shuffle a deck of cards:

# Bad code


for (int i = 0; i < 52; i++)
{
    const int j = i + rand() % (52-i);
    swap(cards[i], cards[j]);
}

This code is bad because the meaning of 52 is not explained, and because it has to be changed in two places if the size of the deck ever changes. A better version would be:

const int deck_size = 52;

for (int i = 0; i < deck_size; i++)
{
    const int j = i + rand() % (deck_size - i);
    swap(cards[i], cards[j]);
}
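
For readers who want something they can compile, here is a self-contained sketch of the same idea. It swaps the hand-written loop for std::shuffle from the standard library, and the helper name make_shuffled_deck() and the representation of cards as the integers 0 to 51 are assumptions made for this example.

```cpp
#include <algorithm>
#include <numeric>
#include <random>
#include <vector>

// The size of the deck is stated once, under a meaningful name.
constexpr int deck_size = 52;

// Hypothetical helper: returns the cards 0 .. deck_size-1 in a
// pseudo-random order determined by the given seed.
std::vector<int> make_shuffled_deck(unsigned int seed)
{
    std::vector<int> cards(deck_size);
    std::iota(cards.begin(), cards.end(), 0); // fill with 0, 1, 2, ...

    std::mt19937 rng(seed);
    std::shuffle(cards.begin(), cards.end(), rng);
    return cards;
}
```

If the deck ever grows (say, by adding jokers), only the definition of deck_size needs to change.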

Many non-native English speakers write code with comments and even identifiers in their native language. The problem with this is that programmers who do not speak that language will have a hard time understanding what is going on, especially after the authors post the code to an Internet forum in order to get help with it.

Consider what Eric Raymond wrote in his "How to Become a Hacker" document (where hacker is a software enthusiast and not a computer intruder):

4. If you don't have functional English, learn it.

As an American and native English-speaker myself, I have previously been reluctant to suggest this, lest it be taken as a sort of cultural imperialism. But several native speakers of other languages have urged me to point out that English is the working language of the hacker culture and the Internet, and that you will need to know it to function in the hacker community.

Back around 1991 I learned that many hackers who have English as a second language use it in technical discussions even when they share a birth tongue; it was reported to me at the time that English has a richer technical vocabulary than any other language and is therefore simply a better tool for the job. For similar reasons, translations of technical books written in English are often unsatisfactory (when they get done at all).

Linus Torvalds, a Finn, comments his code in English (it apparently never occurred to him to do otherwise). His fluency in English has been an important factor in his ability to recruit a worldwide community of developers for Linux. It's an example worth following.

Being a native English-speaker does not guarantee that you have language skills good enough to function as a hacker. If your writing is semi-literate, ungrammatical, and riddled with misspellings, many hackers (including myself) will tend to ignore you. While sloppy writing does not invariably mean sloppy thinking, we've generally found the correlation to be strong — and we have no use for sloppy thinkers. If you can't yet write competently, learn to.

So if you're posting code for public scrutiny, make sure it is written with English identifiers and comments.

Avoid violating “The Law of Demeter” (see the Wikipedia article about it for more information). Namely, chaining many nested method calls, such as obj->get_employee("sophie")->get_address()->get_street(), is not advisable and should be avoided.

A better option would be to provide methods in the containing objects to access those methods of their contained objects. And an even better way would be to structure the code so that each object handles its own domain.
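
The first option can be sketched as follows. The classes Address, Employee and Company, and the method get_employee_street(), are hypothetical names invented for this example; they only illustrate how delegating methods spare the caller from walking through the contained objects.

```cpp
#include <map>
#include <string>
#include <utility>

class Address
{
public:
    explicit Address(std::string street) : street_(std::move(street)) {}
    const std::string &get_street() const { return street_; }

private:
    std::string street_;
};

class Employee
{
public:
    explicit Employee(Address address) : address_(std::move(address)) {}

    // Delegating method: callers need not know that an Address exists.
    const std::string &get_street() const { return address_.get_street(); }

private:
    Address address_;
};

class Company
{
public:
    void add_employee(const std::string &name, Employee employee)
    {
        employees_.emplace(name, std::move(employee));
    }

    // Callers ask the company directly instead of chaining calls.
    // Throws std::out_of_range if there is no such employee.
    const std::string &get_employee_street(const std::string &name) const
    {
        return employees_.at(name).get_street();
    }

private:
    std::map<std::string, Employee> employees_;
};
```

With this in place, a caller writes company.get_employee_street("sophie") and is not affected if the way addresses are stored changes later.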

As noted in Martin Fowler's "Refactoring" book (but held as a fact for a long time beforehand), duplicate code is a code smell, and should be avoided. The solution is to extract duplicate functionality into subroutines, methods and classes.
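
As a small illustration, assume two functions that each used to contain their own copy of a loop that sums a list and divides by its length. Extracting that loop into one helper might look like the sketch below; all three function names are hypothetical, and treating the average of an empty list as 0 is an assumption.

```cpp
#include <numeric>
#include <vector>

// Extracted helper: the averaging logic lives in exactly one place.
// By assumption, an empty list has an average of 0.
double average(const std::vector<double> &values)
{
    if (values.empty())
    {
        return 0.0;
    }
    const double sum = std::accumulate(values.begin(), values.end(), 0.0);
    return sum / static_cast<double>(values.size());
}

// The two callers below share average() instead of duplicating it.
bool is_above_average(const std::vector<double> &values, double candidate)
{
    return candidate > average(values);
}

double deviation_from_average(const std::vector<double> &values,
                              double candidate)
{
    return candidate - average(values);
}
```

A fix to the averaging logic (for example, a different policy for empty lists) now has to be made only once.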

Another common code smell is long subroutines and methods. The solution to these is to extract several shorter methods out, with meaningful names.

It is a good idea to avoid global variables and static variables inside functions, at least those that are not constant. This is because such variables interfere with multithreading and re-entrancy, and prevent the code from being instantiated more than once. If you need to share several common variables, define an environment struct or class and pass a pointer to it to each of the functions.
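
A minimal sketch of the environment approach follows. The struct LabelEnv and the function make_label() are hypothetical; the point is only that the state lives in an object owned by the caller, so several independent instances can coexist.

```cpp
#include <string>

// Hypothetical environment struct holding state that might otherwise
// have been kept in global or function-static variables.
struct LabelEnv
{
    std::string prefix;
    int next_id;
};

// Each function receives a pointer to the environment it operates on.
std::string make_label(LabelEnv *env)
{
    return env->prefix + std::to_string(env->next_id++);
}
```

Two environments, for instance one per thread or one per test, keep separate counters and never interfere with each other.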

With many editors, it is common to write new code, or modify existing code, so that some lines end up containing trailing whitespace, such as spaces (ASCII 32 or 0x20) or tab characters. This trailing whitespace normally does not cause much harm, but it is not needed, harms the code’s consistency, and may confuse patching/diffing and version control tools. Furthermore, it can usually be eliminated easily without harm.

Here is an example of trailing whitespace, demonstrated using the --show-ends flag of the GNU cat command:

> cat --show-ends toss-coins.pl
$
use strict;$
use warnings;$
$
my @sides = (0,0);$
$
my ($seed, $num_coins) = @ARGV;$
$
srand($seed);  $
$
for my $idx (1 .. $num_coins)$
{$
    $sides[int(rand(2))]++;$
    $
    print "Coin No. $idx\n";$
}$
$
print "You flipped $sides[0] heads and $sides[1] tails.\n";$
>

While you should not feel bad about having trailing whitespace, it is a good idea to search for it from time to time using a command such as ack '[ \t]+$' (in ack version 1.x it should be ack -a '[ \t]+$'), and get rid of it.
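
The same check is easy to express in code, should you want to build it into a tool of your own. The function below is a sketch with a hypothetical name; it assumes the line has already been stripped of its newline, and it mirrors the pattern [ \t]+$ for that single line.

```cpp
#include <string>

// Returns true if the line ends with at least one space or tab.
// The line is assumed not to include its terminating newline.
bool has_trailing_whitespace(const std::string &line)
{
    return !line.empty() && (line.back() == ' ' || line.back() == '\t');
}
```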

Some editors can also highlight trailing whitespace when it is present.

Finally, there are CPAN modules that can check for and report trailing whitespace.

You should add #include guards, or the less standard but widely supported #pragma once, to header files (“*.h”, “*.hpp” or whatever) to prevent their contents from being processed more than once when they are pulled in by several “#include” directives. Otherwise, the repeated definitions may result in compiler warnings or errors.
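
A guarded header typically looks like the sketch below. The file name shapes.h, the macro name SHAPES_H and the contents are all hypothetical; any macro name that is unique within the project will do.

```cpp
// shapes.h (hypothetical header)
#ifndef SHAPES_H
#define SHAPES_H

struct Rectangle
{
    int width;
    int height;
};

// Defined inline so that the header may be included from several
// source files without causing duplicate definitions at link time.
inline int area(const Rectangle &rect)
{
    return rect.width * rect.height;
}

#endif // SHAPES_H
```

The second time the header is included in the same translation unit, SHAPES_H is already defined and the preprocessor skips everything up to the #endif. With #pragma once, the three guard lines are replaced by that single directive at the top of the file.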

On various online forums, we are often asked questions like: “What is the speediest way to do task X?” or “Which of these pieces of code will run faster?”. The answer is that in this day and age of extremely fast computers, you should optimise for clarity and modularity first, and worry about speed when and if you find it becomes a problem. Professor Don Knuth had this to say about it:

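“We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.”

(Knuth reportedly attributed the exact quote to C.A.R. Hoare.)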

While you should be conscious of efficiency, and the performance sanity of your code and algorithms when you write programs, excessive and premature micro-optimisations are probably not going to yield a major performance difference.

If you do find that your program runs too slowly, refer to our resources about Optimising and Profiling code.

You should make sure that the HTML markup you generate is valid HTML and that it validates as XHTML 1.0, HTML 4.01, HTML 5.0, or a different modern standard. For more information, see the “Designing for Compatibility” section in a previous talk.

If you want to group a certain sub-expression in a regular expression without capturing it (into the $1, $2, $3, etc. variables and related capture variables in Perl, or their equivalents elsewhere), then you should cluster it using (?: … ) instead of capturing it using a plain ( … ), or alternatively not group it at all if grouping is not needed. That is because a cluster is faster and cleaner, and better conveys your intentions, than a capture.
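
In C++, std::regex with its default ECMAScript grammar accepts the same (?: … ) syntax. The sketch below uses a hypothetical helper, surname_of(), to show that a cluster leaves the capture numbering untouched: the surname is capture number 1 even though a group precedes it.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns the surname that follows a title such
// as "Dr. ", or an empty string if the text contains no such title.
std::string surname_of(const std::string &text)
{
    // (?:Mr|Ms|Dr) groups the alternation without capturing it, so
    // the surname is the first (and only) capture.
    const std::regex re(R"((?:Mr|Ms|Dr)\. (\w+))");

    std::smatch match;
    if (std::regex_search(text, match, re))
    {
        return match[1].str();
    }
    return "";
}
```

Had the title been written as a plain (Mr|Ms|Dr), the surname would have moved to capture number 2, and every piece of code that reads the captures would have had to follow.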

When passing a string that is not a literal constant as the first parameter to “printf()”/“sprintf()” and friends, one runs the risk of format string vulnerabilities. As a result, it is important to always use a literal constant string as the format. For example:

# Bad code


fgets(str,sizeof(str), stdin);
str[sizeof(str)-1] = '\0';
printf(str);

should be replaced with:

fgets(str,sizeof(str), stdin);
str[sizeof(str)-1] = '\0';
printf("%s", str);

One can also use the relevant warning flags of GCC and compatible compilers, such as -Wformat -Wformat-security (or -Werror=format-security to turn the warning into an error), to detect such calls.
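
The same rule applies to the other members of the family. The following sketch wraps std::snprintf() in a hypothetical helper, format_line(), with an assumed fixed buffer of 256 bytes; because the untrusted text is passed as an argument to a literal "%s" format, any conversion specifiers it contains are copied verbatim rather than interpreted.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: copies untrusted text through a literal format.
// Input longer than the buffer is truncated (an assumed limitation).
std::string format_line(const std::string &untrusted)
{
    char buf[256];
    // The format is a literal constant; the untrusted text is data.
    std::snprintf(buf, sizeof(buf), "%s", untrusted.c_str());
    return std::string(buf);
}
```

Passing a string such as "100%n%s" returns it unchanged, whereas using it directly as the format would have made the function read or write memory it was never meant to touch.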

It is a very good idea for C and C++ code to use a good build and configuration system. There’s a page listing some prominent alternatives. For simple setups, a makefile may be suitable, but more complex projects call for a configuration and build system such as CMake.

It is important to use a bug tracking system to maintain a list of bugs and issues that need to be fixed in your code, and of features that you'd like to work on. Sometimes, a simple file kept inside the version control system would be enough, but at other times, you should opt for a web-based bug tracker.

This is a short list of the sources from which this advice was taken; it also contains material for further reading:

  1. A large part of this document is derived from a similar document written earlier for the Perl programming language.

  2. The book "Perl Best Practices" by Damian Conway - contains a lot of good advice and food for thought, but sometimes should be deviated from. Also see the "PBP Module Recommendation Commentary" on the Perl 5 Wiki.

  3. "Ancient Perl" on the Perl 5 Wiki.

  4. chromatic's "Modern Perl" Book and Blog

  5. The book Refactoring by Martin Fowler - not particularly about Perl, but still useful.

  6. The book The Pragmatic Programmer: From Journeyman to Master - also not particularly about Perl, and I found it somewhat disappointing, but it is an informative book.

  7. The list “How to tell if a FLOSS project is doomed to FAIL”.

  8. Advice given by people on Freenode's #perl channel, on the Perl Beginners mailing list, and on other Perl forums.

  9. Advice given by people on Freenode’s ##programming channel and on other forums.

Version Control Repository

This document is maintained in a GitHub repository, which one can clone and fork, and to which one can send pull requests and file issues. Note that all contributions are assumed to be licensed under the Creative Commons Attribution 4.0-and-above (CC-by) licence. Enjoy!

Coverage

TODO: fill in.