C Coding Guidelines

Last update: Tue Mar 20 21:22:22 2018.


These are my opinions on the way C code ought to be written. New code should follow this style. Old code, when modified in any significant way, should be converted to this style.


Code Examples

  1. Naming
    Right FooKBGetInfoFree()
    Avoid FOO_KB_GET_INFO_FREE()
    Avoid FOO_kb_get_info_free()
    Avoid FOO_kbgetinfofree()

  2. If statements
    Right
    if (foo)
      bar(foo);
    Wrong
    if (foo) bar(foo);
    Avoid
    foo && bar(foo);

  3. Block structure
    Right
    if (foo) {
      bar(foo);
      ++gik;
    }
    else {
      mumble(foo);
      --gik;
    }
    Wrong
    if (foo) { bar(foo);    ++gik; }
    else     { mumble(foo); --gik; }
    Wrong
    if (foo)
      {
      bar(foo);
      ++gik;
      }
    else
      {
      mumble(foo);
      --gik;
      }
    OK
    if (foo)
    {
      bar(foo);
      ++gik;
    }
    else
    {
      mumble(foo);
      --gik;
    }
    Avoid
    if (foo) {
      bar(foo);
      ++gik;
    } else {
      mumble(foo);
      --gik;
    }

  4. Spaces around operators
    Right
    f = (c * 9 / 5) + 32;
    OK
    f = 32 + c * 9 / 5;
    Avoid
    f=32+c*9/5;

  5. Spaces after punctuation
    Right
    foo = bar(dog, wombat, bear, fox, ferret, deer, squirrel);
    for (i = 0, j = 0; i < num; ++i)
      j += a[i];
    Wrong
    foo = bar(dog,wombat,bear,fox,ferret,deer,squirrel);
    for (i = 0,j = 0;i < num;++i)
      j += a[i];

  6. Spacing for casts
    Right
    foo = (thing *) bar;
    Wrong
    foo = (thing*) bar;
    Wrong
    foo = (thing*)bar;
    Avoid
    foo = (thing *)bar;

  7. Spacing around parentheses
    Right
    a = (b + c);
    mumble(q);
    Wrong
    a = ( b + c );
    mumble( q );
    Wrong
    a = ( b+c );
    mumble (q);

  8. Safe initializations
    Right
    Cdef1(void,    DoIt,
          void *,  ptr)
    {
      mumble  a  = (mumble) ptr;
      boink   b;
      splat   c;
    
      if (!a)
        return;
    
      b = a->weep;
      c = a->drip;
    
      ...
    }
    Wrong
    Cdef1(void,    DoIt,
          void *,  ptr)
    {
      mumble  a  = (mumble) ptr;
      boink   b  = a->weep;
      splat   c  = a->drip;
    
      if (!a)
        return;
    
      ...
    }

  9. Win16-safe pointer < or >
    Right
    thingy  *foo;
    thingy  *bar;
    
    if ((unsigned char huge *) foo < (unsigned char huge *) bar)
      mumble(foo, bar);
    Wrong
    thingy  *foo;
    thingy  *bar;
    
    if (foo < bar)
      mumble(foo, bar);
    Wrong
    thingy  *foo;
    thingy  *bar;
    
    if ((unsigned long) foo < (unsigned long) bar)
      mumble(foo, bar);

  10. Protected header files
    myheader.h
    #ifndef MYHEADER_PROTECT
    #define MYHEADER_PROTECT
    
    typedef short myInt2;
    
    #endif /* !MYHEADER_PROTECT */
    foo.c
    #ifndef MYHEADER_PROTECT
    # include "myheader.h"
    #endif
    
    int
    main(int argc, char **argv) {
      ...
    }


"Rules"

  1. Don't check in code without getting it reviewed.

  2. Do not check in code unless it works and has been tested.

  3. Don't make control structures look like function calls. Function calls look like exit(1). Control structures look like while (1). Notice the space.

  4. Macros as lvalues are very confusing to read/debug. Accessor macros (often the same macro) should be used sparingly. An example is #define BAR_lsv(tag) (((BAR_LSV *)(global->gen))->tag), which is also confusing because global is not an argument of the macro.

  5. Lines should be no longer than 79 characters.

  6. Code should not contain tabs. Tabs on Unix are 8 spaces. Tabs on Macintosh are 4 spaces. Tabs on Windows are 4 spaces. In Emacs, setting indent-tabs-mode to nil is a good idea.

  7. In a declaration like int *i; the asterisk binds to the i, meaning "i is a pointer to int." Avoid writing int* i; because what it implies contradicts how the parser sees it. After all, we don't declare an array with int[10] i;
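
    A minimal sketch of why the placement matters. The function name DeclDemoSecondIsInt is hypothetical, and putting two variables in one declaration is deliberate here, only to show what the parser does with int* p, q;

```c
/* Hypothetical demo: "int* p, q;" declares one pointer and one int. */
int
DeclDemoSecondIsInt(void) {
  int*  p, q;   /* reads like "two pointers", but only p is a pointer */

  q = 7;        /* legal only because q is a plain int */
  p = &q;       /* p really is a pointer to int */
  return *p;
}
```

    Writing int *p; and int q; on separate lines makes the same declarations say what they mean.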

  8. An open brace should be the last non-comment, non-whitespace character on a line of code. A close brace should be alone on a line of code, unless followed by the while of a do...while.

  9. Comments of the type:
         /*
          * this is some text
          * about something
          */
    should be avoided because it is quite time-consuming to keep the asterisks vertically aligned after making significant changes.

  10. Line up identifiers wherever it improves readability. This includes declarations and assignments into structures. As a corollary, limit variable definitions to one per line.

  11. Don't overuse the ternary operator ((foo != bar) ? foo : -1). We're not all MIT Lisp hackers who love to see (isusr ? (exptopic ? MSG_TCMP_UEXPORT1 : MSG_TCMP_UEXPORT0) : (exptopic ? MSG_TCMP_FEXPORT1 : MSG_TCMP_FEXPORT0)) in code.
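
    One way the nested expression above might be unrolled, as a sketch only. The enum values and the function name exportMsg are assumptions; the text gives only the MSG_TCMP_* names.

```c
/* Hypothetical message IDs; the real values are not given in the text. */
enum {
  MSG_TCMP_UEXPORT0,
  MSG_TCMP_UEXPORT1,
  MSG_TCMP_FEXPORT0,
  MSG_TCMP_FEXPORT1
};

/* Same selection as the nested ?: expression, one decision per line. */
static int
exportMsg(int isusr, int exptopic) {
  if (isusr) {
    if (exptopic)
      return MSG_TCMP_UEXPORT1;
    return MSG_TCMP_UEXPORT0;
  }

  if (exptopic)
    return MSG_TCMP_FEXPORT1;
  return MSG_TCMP_FEXPORT0;
}
```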

  12. Each file should have an overview comment block explaining what it's for. Every function should have an introductory block of comments explaining what the function does, with what, and how, including error conditions. Functions should also include running commentary as appropriate (Emacs's ESC-; is great for this). Files should be broken into logical sections with banner comments.

  13. The names of things should imply their scope. This also helps segment the namespace. First, the package should have a name, such as FooIdx. The files will have names like foo_idx.c, fidx_scm.c, fidx_utl.c, etc. The entrypoints exported by the package should begin with capital letters, as in FooIdxNew(). Static or intra-package functions should begin with lowercase, as in fooIdxNames(). Type names should follow the same rules. In addition, the following relation should be observed:
         typedef struct _XXX {
           ...
         } XXXRec, *XXX;
    Macros should also begin with their package name. Symbolic names for values of a certain type should be named <type>_<name>, as in FooQueryWeight_Nice. Other macros should probably follow the tradition of all uppercase.
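
    Put together, a small FooIdx package might look roughly like the sketch below. The field numNames, the value FooQueryWeight_Urgent, and the functions FooIdxNumNames() and FooIdxFree() are invented for illustration; only FooIdxNew(), fooIdxNames(), and FooQueryWeight_Nice come from the text.

```c
#include <stdlib.h>

/* Record type and handle type, following the typedef relation above. */
typedef struct _FooIdx {
  int  numNames;              /* hypothetical field */
} FooIdxRec, *FooIdx;

/* Symbolic values are named <type>_<name>. */
typedef enum {
  FooQueryWeight_Nice,
  FooQueryWeight_Urgent       /* hypothetical second value */
} FooQueryWeight;

/* Lowercase initial: private to this file. */
static int
fooIdxNames(FooIdx idx) {
  return idx->numNames;
}

/* Capital initial: entrypoint exported by the package. */
FooIdx
FooIdxNew(void) {
  return (FooIdx) calloc(1, sizeof(FooIdxRec));
}

/* Exported wrapper that tolerates a null handle. */
int
FooIdxNumNames(FooIdx idx) {
  if (!idx)
    return 0;
  return fooIdxNames(idx);
}

void
FooIdxFree(FooIdx idx) {
  free(idx);
}
```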

  14. Even in C, avoid C++ reserved words. More and more C code is now being run through C++ compilers. Avoid class, public, protected, private, virtual, friend, inline, this, operator, new, delete, catch, throw, try, template, bool, false, and true.

  15. Avoid "multiple ownership" and other ambiguous situations in which it's unclear who should free a resource. Just because things happen in a certain order today doesn't mean they will tomorrow. One way to do this is with "attach" and "detach" functions that explicitly transfer ownership. "Get" or "peek" functions return a read-only version without giving up ownership; the longevity of this version is unclear. Other functions might return a copy for the caller to own; this is safe but can be inefficient and sometimes inappropriate. If you must have shared ownership, use reference counting and consider C++.
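
    A rough sketch of the attach/detach/peek idea. The Box type and all three function names are hypothetical, and the sketch assumes attached strings were allocated with malloc().

```c
#include <stdlib.h>

typedef struct _Box {
  char  *name;     /* owned by the box while attached */
} BoxRec, *Box;

/* Attach: the box takes ownership; the caller must not free name. */
void
BoxAttachName(Box box, char *name) {
  free(box->name);           /* release whatever was owned before */
  box->name = name;
}

/* Detach: ownership returns to the caller, who must free the result. */
char *
BoxDetachName(Box box) {
  char  *name = box->name;

  box->name = NULL;
  return name;
}

/* Peek: read-only view, valid only until the next attach or detach. */
const char *
BoxPeekName(Box box) {
  return box->name;
}
```

    The function name tells the reader who frees the string, so the answer does not depend on today's order of calls.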

  16. Use const when you can. This will allow the compiler to find errors for you. This is especially important when writing functions for general use. Declare arguments (especially pointers) as const if possible. This allows clients to use functions without casting away const.
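
    For instance, a general-purpose helper that only reads its argument can say so in its signature. StrUtlCount is a hypothetical name used only for this sketch.

```c
#include <stddef.h>

/* Hypothetical helper: counts occurrences of ch in str, never writes. */
size_t
StrUtlCount(const char *str, char ch) {
  size_t  count = 0;

  for (; *str; ++str) {
    if (*str == ch)
      ++count;
  }
  return count;
}
```

    A client holding a const char * can call this directly. Had the parameter been declared char *, that client would need a cast, and the compiler could no longer catch an accidental write.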

  17. Write functions with standard return-code semantics. The standard library uses zero for success. Deviations from this model will confuse experienced programmers.
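
    A sketch of zero-for-success semantics. The BufUtl names and the specific nonzero codes are assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical status codes: zero is success, nonzero names the failure. */
enum {
  BufUtl_Ok       = 0,
  BufUtl_BadArg   = 1,
  BufUtl_TooSmall = 2
};

/* Copies src, including its terminator, into dst if it fits.
   On BufUtl_TooSmall, dst holds a truncated, unterminated copy. */
int
BufUtlCopy(char *dst, size_t dstLen, const char *src) {
  size_t  i;

  if (!dst || !src)
    return BufUtl_BadArg;

  for (i = 0; i < dstLen; ++i) {
    dst[i] = src[i];
    if (!src[i])
      return BufUtl_Ok;
  }
  return BufUtl_TooSmall;
}
```

    Callers can then write if (BufUtlCopy(buf, sizeof(buf), text)) to detect any failure, which is the form experienced programmers expect.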


A Platonic Ideal

Code that's "good" is...
  1. Portable
    1. The code itself

      Runs on all/many machines

      A types.h sort of file with ifdef'd typedefs will accomplish this quite well.
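
      Such a file might look roughly like the sketch below. UInt1 appears elsewhere in these guidelines; Int2, UInt2, Int4, UInt4, and the TYPES_PROTECT guard are assumed names. Where C99 is available, <stdint.h> provides similar fixed-width types.

```c
#ifndef TYPES_PROTECT
#define TYPES_PROTECT

#include <limits.h>

/* unsigned char is one byte by definition; 8 bits on most platforms. */
typedef unsigned char   UInt1;

/* Pick whichever built-in type is 16 bits wide on this platform. */
#if USHRT_MAX == 0xFFFF
typedef short           Int2;
typedef unsigned short  UInt2;
#elif UINT_MAX == 0xFFFF
typedef int             Int2;
typedef unsigned int    UInt2;
#else
# error "types.h: no 16-bit integer type found"
#endif

/* Pick whichever built-in type is 32 bits wide on this platform. */
#if UINT_MAX == 0xFFFFFFFF
typedef int             Int4;
typedef unsigned int    UInt4;
#elif ULONG_MAX == 0xFFFFFFFF
typedef long            Int4;
typedef unsigned long   UInt4;
#else
# error "types.h: no 32-bit integer type found"
#endif

#endif /* !TYPES_PROTECT */
```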

      1. Safe datatypes

        For total interchangeability, each unit should define and export all of the datatypes it uses. Clients would then cast everything back and forth. Disappointingly, many compilers won't enforce that types have the same name. Also, casts tell compilers to stop checking types, thus removing the safety net. There still must be some conventions, like that strings are null-terminated ASCII, regardless of signedness.

        Some interchangeability can be traded for convenience if a group (e.g., a company) decides to use one set of data types throughout all of its code. This corpus of code will now depend on the .h file defining the data types. Someone will need to decide if this .h file gets shipped to any customers or is used in customer interfaces.

        A company may decide to have one set of portable datatypes defined for internal use and a different set of datatypes for customers.

      2. Use generic C

        Macros can "genericize" code, massaging away differences between compilers and dialects. These macros should be separate from data-type macros.

      3. Separate platform specific code from generic

        The vast majority of code should be platform-independent. Differences should be abstracted and the implementations should be ifdef'd for each platform. People doing ports should be able to identify the platform-specific files and change only them in order to complete the port.

        It may be a good idea to reserve a portion of the filename/symbol namespace for OSD (operating-system-dependent) code so that OSD files/symbols are easily identified and cannot conflict with generic files/symbols.
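
        A small sketch of the idea: generic code calls one function, and each platform supplies its own body. The function OsdPathSeparator and its Osd prefix are hypothetical; _WIN32 is the macro that Windows compilers commonly predefine.

```c
/* Generic interface, as it would appear in a shared header. */
char  OsdPathSeparator(void);

/* Platform-specific implementation, selected at compile time. */
#if defined(_WIN32)
char
OsdPathSeparator(void) {
  return '\\';
}
#else
char
OsdPathSeparator(void) {
  return '/';
}
#endif
```

        In a real tree each body would more likely live in its own platform-specific file, so that a port touches only those files.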

    2. Data "pickling"

      All platforms should be able to read and write the same files and talk on the network to other platforms. Conventions for big-endian vs. little-endian, etc. should be set up.

      With internationalization, text-based control files might become an issue.
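
      One possible byte-order convention, sketched under the assumption that files and network messages store 32-bit values most significant byte first. The Pickle names are hypothetical.

```c
/* Store a 32-bit value as four bytes, most significant byte first. */
void
PicklePutU4(unsigned char *buf, unsigned long val) {
  buf[0] = (unsigned char) ((val >> 24) & 0xFF);
  buf[1] = (unsigned char) ((val >> 16) & 0xFF);
  buf[2] = (unsigned char) ((val >> 8) & 0xFF);
  buf[3] = (unsigned char) (val & 0xFF);
}

/* Rebuild the value from that byte order, whatever the host's own is. */
unsigned long
PickleGetU4(const unsigned char *buf) {
  return ((unsigned long) buf[0] << 24) |
         ((unsigned long) buf[1] << 16) |
         ((unsigned long) buf[2] << 8) |
         (unsigned long) buf[3];
}
```

      Because the code shifts values instead of copying memory, it produces the same bytes on big-endian and little-endian machines.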

  2. Modular

    1. Non-conflicting namespace

      Proper namespace management yields two benefits: partitioning and documentation. Partitioning guarantees that external symbols in one module will not collide with those in another. This is important since it's hard to predict what pieces of code will need to interoperate. Usually partitioning is accomplished via prefix assignment. Documentation is a side-effect of intelligent partitioning. It means that a developer can instantly tell from the name of a function, datatype, or variable what it is and what part of the system it occupies.

      File names should correspond to symbol names defined within. Symbols private to the file should start with lowercase.

    2. Reusable

      1. Versatile

        Does something another program might need.

        Proper documentation and "advertising" is essential to keep others from reimplementing.

      2. Uses hooks (callbacks)

        Specific functionality provided by client.
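
        As a sketch, a generic traversal can leave the per-element work to a client-supplied hook. All names here (ArrUtlHook, ArrUtlForEach, ClientSumHook) are hypothetical.

```c
#include <stddef.h>

/* Hook type: the client decides what happens to each element. */
typedef void  (*ArrUtlHook)(int value, void *clientData);

/* Generic traversal; it knows nothing about what the hook does. */
void
ArrUtlForEach(const int *arr, size_t num, ArrUtlHook hook,
              void *clientData) {
  size_t  i;

  for (i = 0; i < num; ++i)
    hook(arr[i], clientData);
}

/* Example client hook: accumulates a sum in the int at clientData. */
void
ClientSumHook(int value, void *clientData) {
  *(int *) clientData += value;
}
```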

    3. Separable

      One module doesn't drag in five others.

      Use encapsulation, client-supplied callbacks, and lazy initialization.

      1. Asks client to provide needed services.

        Instead of relying on MUMBLE_alloc() for memory, a package can take a pointer to an alloc function at initialization time and then be flexible enough to work with malloc() too.
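
        A minimal sketch of a package that takes its allocator at initialization time. The Pkg names are hypothetical, and falling back to malloc() and free() when the client passes NULL is an added assumption, not something the text prescribes.

```c
#include <stdlib.h>

typedef void  *(*PkgAllocFn)(size_t size);
typedef void   (*PkgFreeFn)(void *ptr);

typedef struct _Pkg {
  PkgAllocFn  allocFn;    /* supplied by the client at init time */
  PkgFreeFn   freeFn;
} PkgRec, *Pkg;

/* NULL for either function falls back to malloc() or free(). */
void
PkgInit(Pkg pkg, PkgAllocFn allocFn, PkgFreeFn freeFn) {
  pkg->allocFn = allocFn ? allocFn : malloc;
  pkg->freeFn  = freeFn ? freeFn : free;
}

/* All memory the package needs goes through the client's functions. */
void *
PkgAlloc(Pkg pkg, size_t size) {
  return pkg->allocFn(size);
}

void
PkgFree(Pkg pkg, void *ptr) {
  pkg->freeFn(ptr);
}
```

        A client with its own allocator passes a pair of functions with these signatures to PkgInit(); the package itself never names MUMBLE_alloc() or any other specific allocator.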

  3. Internationalizable

    1. Multibyte string support

    2. Easily translatable (separate) strings

  4. Well-written

    This is probably the most important aspect of code, and the most difficult to teach or codify.

    1. Does the right thing

    2. Maintainable style

      Can be understood and debugged.

