Ghostscript C coding guidelines

For other information, see the Ghostscript overview.


Introduction

This document describes Ghostscript's C coding conventions. It is primarily prescriptive, documenting what developers should do when writing new code; the companion developer documentation (Develop.htm) is primarily descriptive, documenting the way things are.

We encourage following the general language usage and stylistic rules for any code that will be integrated with Ghostscript, even if the code doesn't use Ghostscript's run-time facilities or have anything to do with PostScript, PDF, or page description languages. Ghostscript itself follows some additional conventions; these are documented separately under "Ghostscript conventions" below.


C language do's and don'ts

There are several different versions of the C language, and even of the ANSI C standard. Ghostscript versions through 7.0 were designed to compile on pre-ANSI compilers as well as on many compilers that implemented the ANSI standard with varying faithfulness. Ghostscript versions since 7.0 do not cater for pre-ANSI compilers: they must conform to the ANSI 1989 standard (ANS X3.159-1989), with certain restrictions and a few conventional additions.

Preprocessor

Conditionals

Restrictions:

Macros

Restrictions:

Other

Restrictions:

Lexical elements

Do not use:

Restrictions:

Scoping (extern, static, ...)

Do not use:

Restrictions:

Scalars

Restrictions:

Arrays

Restrictions:

Typedefs

Restrictions:

Structures

Restrictions:

Unions

Restrictions:

Expressions

Restrictions:

Statements

Restrictions:

Procedures

Restrictions:

Standard library

Restrictions:


Language extensions

Scoping

static
Ghostscript assumes the compiler supports the static keyword for declaring variables and procedures as local to a particular source file.
inline
inline is available even if the compiler does not support it. Be aware, however, that it may have no effect. In particular, do not use inline in header files. Instead, use the extern_inline facility described just below.
extern_inline
Compilers that do support inline vary in how they decide whether to (also) compile a closed-code copy of the procedure. Because of this, putting an inline procedure in a header file may produce multiple closed copies, causing duplicate name errors at link time. extern_inline provides a safe way to put inline procedures in header files, regardless of compiler. Unfortunately, the only way we've found to make this fully portable involves a fair amount of boilerplate. For details, please see stdpre.h.

Scalar types

bool, true, false
bool is intended as a Boolean type, with canonical values true and false. In a more reasonable language, such as Java, bool is an enumerated type requiring an explicit cast to or from int; however, because C's conditionals are defined as producing int values, we can't even define bool as a C enum without provoking compiler warnings.

Even though bool is a synonym for int, treat them as conceptually different types:

byte, ushort, uint, ulong
These types are simply shorthands for unsigned char, unsigned short, unsigned int, and unsigned long.

In addition, the use of byte * indicates a Ghostscript-style string, with explicit length given separately, as opposed to a null terminated C-style string, which is char *.

floatp
This is a synonym for double. It should be used for, and only for, procedure parameters that would otherwise be float. (As noted above, procedure parameters should not be declared as float.)
bits8, bits16, bits32
These are unsigned integer types of the given width. Use them wherever the actual width matters: do not, for example, use short assuming that it is 16 bits wide. New code can use the C99 fixed-width types from stdint_.h.
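
A minimal sketch of two of the conventions above, assuming a C99 compiler. The typedefs are repeated here only so that the fragment stands alone (in Ghostscript they come from its own headers), the standard <stdint.h> is used in place of the stdint_.h wrapper, and bytes_equal and get_u16_be are hypothetical helper names, not Ghostscript procedures.

```c
#include <stdint.h>
#include <string.h>

typedef unsigned char byte;     /* shorthands as described above */
typedef unsigned int uint;

/*
 * Compare two Ghostscript-style strings.  Each is a byte * whose length
 * is passed separately, so an embedded zero byte is ordinary data.
 */
static int
bytes_equal(const byte *a, uint a_len, const byte *b, uint b_len)
{
    if (a_len != b_len)
        return 0;
    return memcmp(a, b, a_len) == 0;
}

/*
 * Read a 16-bit big-endian value.  The result type states the width
 * explicitly instead of assuming that short is 16 bits wide.
 */
static uint16_t
get_u16_be(const byte *p)
{
    return (uint16_t)(((uint)p[0] << 8) | p[1]);
}
```

Note that strlen would stop at the first zero byte of such a string, which is why the length always travels with the pointer.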

Stylistic conventions

Ghostscript's coding rules cover not only the use of the language, but also many stylistic issues like the choice of names and the use of whitespace. The stylistic rules are meant to produce code that is easy to read. It's important to observe them as much as possible in order to maintain a consistent style, but if you find these rules getting in your way or producing ugly-looking results once in a while, it's OK to break them.

Formatting

Indentation

We've formatted all of our code using the GNU indent program.

indent -bad -nbap -nsob -br -ce -cli4 -npcs -ncs \
   -i4 -di0 -psl -lp -lps somefile.c

does a 98% accurate job of producing our preferred style. Unfortunately, there are bugs in all versions of GNU indent, requiring both pre- and post-processing of the code.

Put indentation points every 4 spaces, with 8 spaces = 1 tab stop.

Don't indent the initial # of preprocessor commands. However, for nested preprocessor commands, do use indentation between the # and the command itself. Use 2 spaces per level of nesting, e.g.:

#ifndef xyz
#  define xyz 0
#endif
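
With two levels of nesting, the same rule might look like the sketch below. TRACE_LEVEL and QUIET_BUILD are hypothetical names chosen for illustration.

```c
/* Each nesting level adds 2 spaces between the # and the command. */
#ifndef TRACE_LEVEL
#  ifdef QUIET_BUILD
#    define TRACE_LEVEL 0
#  else
#    define TRACE_LEVEL 1
#  endif
#endif
```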

For assignments (including chain assignments), put the entire statement on one line if it will fit; if not, break it after a = and indent all the following lines. I.e., format like this:

var1 = value;
var1 = var2 = value;
var1 =
    value;
var1 =
    var2 = value;
var1 = var2 =
    value;

But not like this:

var1 =
var2 = value;

Indent in-line blocks thus:

{
   ... declarations ...
   {{ blank line if any declarations above }}
   ... statements ...
}

Similarly, indent procedures thus:

return_type
proc_name(... arguments ...)
{
   ... declarations ...
   {{ blank line if any declarations above }}
   ... statements ...
}

If a control construct (if, do, while, or for) has a one-line body, use this:

... control construct ...
   ... subordinate simple statement ...

This is considered particularly important so that a breakpoint can be set easily inside the conditional.

If it has a multi-line body, use this:

... control construct ... {
   ... subordinate code ...
}

If the subordinate code has declarations, see blocks above.

For if-else statements, do this:

if ( ... ) {
   ... subordinate code ...
} else if ( ... ) {
   ... subordinate code ...
} else {
   ... subordinate code ...
}

When there are more than two alternatives, as in the example above, use the above ("parallel") syntax rather than the following ("nested") syntax:

if ( ... ) {
   ... subordinate code ...
} else {
   if ( ... ) {
      ... subordinate code ...
   } else {
      ... subordinate code ...
   }
}

Similarly, for do-while statements, do this:

do {
   ... body ...
} while ( ... condition ... );

Spaces

Do put a space:

Don't put a space:

Parentheses

Parentheses are important in only a few places:

Anywhere else, given the choice, use fewer parentheses.

For stylistic consistency with the existing Ghostscript code, put parentheses around conditional expressions even if they aren't syntactically required, unless you really dislike doing this. Note that the parentheses should go around the entire conditional expression, not just the condition. For instance, instead of

hpgl_add_point_to_path(pgls, arccoord_x, arccoord_y,
   (pgls->g.pen_down) ? gs_lineto : gs_moveto);

use

hpgl_add_point_to_path(pgls, arccoord_x, arccoord_y,
   (pgls->g.pen_down ? gs_lineto : gs_moveto));

Preprocessor

Conditionals

Using preprocessor conditionals can easily lead to unreadable code, since the eye really wants to read linearly rather than having to parse the conditionals just to figure out what code is relevant. It's OK to use conditionals that have small scope and that don't change the structure or logic of the program (typically, they select between different sets of values depending on some configuration parameter), but where possible, break up source modules rather than use conditionals that affect the structure or logic.

Macros

Ghostscript code uses macros heavily to extend the rather weak abstraction capabilities of the C language, specifically in the area of memory management and garbage collection: in order to read Ghostscript code effectively, you simply have to learn some of these macros as though they were part of the language. The current code also uses macros heavily for other purposes, but because they make the code harder to read and debug, we are trying to phase these out as rapidly as possible and to apply the rules that follow consistently in new code.

Define macros in the smallest scope you can manage (procedure, file, or .h file), and #undef them at the end of that scope: that way, someone reading the code can see the definitions easily when reading the uses. If that isn't appropriate, define them in as large a scope as possible, so that they effectively become part of the language. This places an additional burden on the reader, but it can be amortized over reading larger amounts of code.

Try hard to use procedures instead of macros. Use "inline" if you really think the extra speed is needed, but only within a .c file: don't put inline procedures in .h files, because most compilers don't honor "inline" and you'll wind up with a copy of the procedure in every .c file that includes the .h file.

If you define a macro that looks like a procedure, make sure it will work wherever a procedure will work. In particular, put parentheses around every use of an argument within the macro body, so that the macro will parse correctly if some of the arguments are expressions, and put parentheses around the entire macro body. (This is still subject to the problem that an argument may be evaluated more than once, but there is no way around this in C, since C doesn't provide for local variables within expressions.)
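
As a small illustration of why both sets of parentheses matter, compare the two definitions below. SCALE_BAD and SCALE_GOOD are hypothetical names used only for this sketch.

```c
/* Fragile: neither the arguments nor the body are parenthesized. */
#define SCALE_BAD(x, n) x * n

/* Robust: every argument use and the whole body are parenthesized. */
#define SCALE_GOOD(x, n) ((x) * (n))

/*
 * SCALE_BAD(1 + 2, 3)    expands to 1 + 2 * 3        and yields 7.
 * SCALE_GOOD(1 + 2, 3)   expands to ((1 + 2) * (3))  and yields 9.
 * 12 / SCALE_BAD(2, 3)   expands to 12 / 2 * 3       and yields 18.
 * 12 / SCALE_GOOD(2, 3)  expands to 12 / ((2) * (3)) and yields 2.
 */
```

Neither form protects against an argument with side effects being evaluated more than once if the macro body uses it more than once.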

If you define macros for special loop control structures, make their uses look somewhat like ordinary loops, for instance:

BEGIN_RECT(xx, yy) {
  ... body indented one position ...
} END_RECT(xx, yy);

If at all possible, don't use free variables in macros -- that is, variables that aren't apparent as arguments of the macro. If you must use free variables, list them all in a comment at the point where the macro is defined.
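
The following sketch shows one way a loop-control macro pair could follow these rules. BEGIN_ROWS, END_ROWS, and sum_rows are hypothetical names, not Ghostscript macros: the loop variable and both bounds are explicit arguments, so the macros use no free variables, and they are removed with #undef at the end of the scope that needs them.

```c
/* Iterate y over [y0, y1); the loop variable is declared by the macro. */
#define BEGIN_ROWS(y, y0, y1)\
    { int y; for (y = (y0); y < (y1); ++y)
#define END_ROWS(y)\
    }

/* Add up the row indices 0 .. height - 1. */
static int
sum_rows(int height)
{
    int sum = 0;

    BEGIN_ROWS(y, 0, height) {
        sum += y;
    } END_ROWS(y);
    return sum;
}

#undef BEGIN_ROWS
#undef END_ROWS
```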

If you define new macros or groups of macros, especially if they aren't simply inline procedures or named constant values, put some extra effort into documenting them, to compensate for the fact that macros are intrinsically harder to understand than procedures.

Comments

The most important descriptive comments are ones in header files that describe structures, including invariants; but every procedure or structure declaration, or group of other declarations, should have a comment. Don't spend a lot of time commenting executable code unless something unusual or subtle is going on.

Naming

Use fully spelled-out English words in names, rather than contractions. This is most important for procedure and macro names, global variables and constants, values of #define and enum, struct and other typedef names, and structure member names, and for argument and variable names which have uninformative types like int. It's not very important for arguments or local variables of distinctive types, or for local index or count variables.

Avoid names that run English words together: "hpgl_compute_arc_center" is better than "hpgl_compute_arccenter". However, for terms drawn from some predefined source, like the names of PostScript operators, use a term in its well-known form (for instance, gs_setlinewidth rather than gs_set_line_width).

Procedures, variables, and structures visible outside a single .c file should generally have prefixes that indicate what subsystem they belong to (in the case of Ghostscript, gs_ or gx_). This rule isn't followed very consistently.

Types

Many older structure names don't have _t on the end, but this suffix should be used in all new code. (The _s structure name is needed only to satisfy some debuggers. No code other than the structure declaration should refer to it.)

Declare structure types that contain pointers to other instances of themselves like this:

typedef struct xxx_s xxx_t;
struct xxx_s {
   ... members ...
   xxx_t *ptr_member_name;
   ... members ...
};
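
A concrete instance of this pattern might look like the following sketch, in which list_node_t and list_length are hypothetical names.

```c
#include <stddef.h>

typedef struct list_node_s list_node_t;
struct list_node_s {
    int value;
    list_node_t *next;          /* NULL at the end of the list */
};

/* Count the nodes reachable from head. */
static int
list_length(const list_node_t *head)
{
    const list_node_t *cursor = head;
    int count = 0;

    while (cursor != NULL) {
        ++count;
        cursor = cursor->next;
    }
    return count;
}
```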

If, to maintain data abstraction and avoid including otherwise unnecessary header files, you find that you want the type xxx_t to be available in a header file that doesn't include the definition of the structure xxx_s, use this approach:

#ifndef xxx_DEFINED
#  define xxx_DEFINED
typedef struct xxx_s xxx_t;
#endif
struct xxx_s {
   ... members ...
};

You can then copy the first 4 lines in other header files. (Don't ever include them in an executable code file.)

Don't bother using const for anything other than with pointers as described below. However, in those places where it is necessary to cast a pointer of type const T * to type T *, always include a comment that explains why you are "breaking const".

Pointers

Use const for pointer referents (that is, const T *) wherever possible and appropriate.

If you find yourself wanting to use void *, try to find an alternative using unions or (in the case of super- and subclasses) casts, unless you're writing something like a memory manager that really treats memory as opaque.
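
As one possible reading of the union alternative, the sketch below stores either an integer or a real value without resorting to void *. The names value_t, value_kind_t, and value_as_real are hypothetical.

```c
typedef enum {
    value_is_int,
    value_is_real
} value_kind_t;

typedef struct value_s {
    value_kind_t kind;          /* selects the valid member of u */
    union {
        int i;
        double d;
    } u;
} value_t;

/* Return the stored value as a double, whichever member is in use. */
static double
value_as_real(const value_t *pv)
{
    if (pv->kind == value_is_int)
        return (double)pv->u.i;
    return pv->u.d;
}
```

Unlike a void * member, the union tells the reader (and the compiler) exactly which representations are possible.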

Procedures

In general, don't create procedures that are private and only called from one place. However, if a compound statement (especially an arm of a conditional) is too long for the eye to match its enclosing braces "{...}" easily -- that is, longer than 10 or 15 lines -- and it doesn't use or set a lot of state held in outer local variables, it may be more readable if you put it in a procedure.

Miscellany

Local variables

Don't assign new values to procedure parameters. It makes debugging very confusing when the parameter values printed for a procedure are not the ones actually supplied by the caller. Instead use a separate local variable initialized to the value of the parameter.
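
For example, a procedure that needs to consume its argument can work on a copy, as in this sketch (count_digits is a hypothetical procedure):

```c
/* Return the number of decimal digits needed to print value. */
static int
count_digits(unsigned int value)
{
    unsigned int remaining = value;     /* working copy of the parameter */
    int digits = 1;

    while (remaining >= 10) {
        remaining /= 10;
        ++digits;
    }
    return digits;
}
```

In a debugger, value still shows what the caller passed, however far the loop has progressed.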

If a local variable is only assigned a value once, assign it that value at its declaration, if possible. For example,

int x = some expression ;

rather than

int x;
...
x = some expression ;

Use a local pointer variable like this to "narrow" pointer types:

int
someproc(... gx_device *dev ...)
{
   gx_device_printer *const pdev = (gx_device_printer *)dev;

   ...
}

Don't "shadow" a local variable or procedure parameter with an inner local variable of the same name. I.e., don't do this:

int
someproc(... int x ...)
{
   ...
   int x;
   ...
}

Compiler warnings

We want Ghostscript to compile with no warnings. This is a constant battle as compilers change and new code is added. Work hard to eliminate warnings by improving the code structure instead of patching over them. If the compiler can't figure out that a variable is always initialized, a future reader will probably have trouble as well.


File structuring

All files

Keep file names within the "8.3" format for portability:

For files other than documentation files, use only lower case letters in the names; for HTML documentation files, capitalize the first letter.

Every code file should start with comments containing

  1. a copyright notice,
  2. the name of the file in the form of an RCS Id:
    /* $Id: filename.ext $*/

    (using the comment convention appropriate to the language of the file), and

  3. a summary, no more than one line, of what the file contains.

If you create a file by copying the beginning of another file, be sure to update the copyright year and change the file name.

Makefiles

Use the extension .mak for makefiles.

For each

#include "xxx.h"

make sure there is a dependency on $(xxx_h) in the makefile. If xxx ends with a "_", this rule still holds, so that if you code

#include "math_.h"

the makefile must contain a dependency on "$(math__h)" (note the two underscores "__").

List the dependencies bottom-to-top, like the #include statements themselves; within each level, list them alphabetically. Do this also with #include statements themselves whenever possible (but sometimes there are inter-header dependencies that require bending this rule).

For compatibility with the build utilities on OpenVMS, always put a space before the colon that separates the target(s) of a rule from the dependents.

General C code

List #include statements from "bottom" to "top", that is, in the following order:

  1. System includes ("xxx_.h")
  2. gs*.h
  3. gx*.h (yes, gs and gx are in the wrong order.)
  4. s*.h
  5. i*.h (or other interpreter headers that don't start with "i")

Headers (.h files)

In header files, always use the following at the beginning of a header file to prevent double inclusion:

{{ Copyright notice etc. }}

#ifndef <filename>_INCLUDED
#define <filename>_INCLUDED

{{ The contents of the file }}

#endif /* <filename>_INCLUDED */

The header file is the first place that a reader goes for information about procedures, structures, constants, etc., so ensure that every procedure and structure has a comment that says what it does. Divide procedures into meaningful groups set off by some distinguished form of comment.

Source (.c files)

After the initial comments, arrange C files in the following order:

  1. #include statements
  2. Exported data declarations
  3. Explicit externs (if necessary)
  4. Forward declarations of procedures
  5. Private data declarations
  6. Exported procedures
  7. Private procedures

Be flexible about the order of the declarations if necessary to improve readability. Many older files don't follow this order, often without good reason.


Ghostscript conventions

Specific names

The Ghostscript code uses certain names consistently for certain kinds of values. Some of the commonest and least obvious are these two:

code

A value to be returned from a procedure:
< 0   An error code defined in gserrors.h (or ierrors.h)
0     Normal return
> 0   A non-standard but successful return (which must be documented, preferably with the procedure's prototype)

status

A value returned from a stream procedure:
< 0   An exceptional condition as defined in scommon.h
0     Normal return (or, from the "process" procedure, means that more input is needed)
1     More output space is needed (from the "process" procedure)

Structure type descriptors

The Ghostscript memory manager requires run-time type information for every structure. (We don't document this in detail here: see the Structure descriptors section of the developer documentation for details.) Putting the descriptor for a structure next to the structure definition will help keep the two consistent, so immediately after the definition of a structure xxx_s, define its structure descriptor:

struct xxx_s {
   ... members ...
};
#define private_st_xxx()  /* in <filename>.c */\
  gs_private_st_<whatever>(st_xxx, xxx_t,\
    "xxx_t", xxx_enum_ptrs, xxx_reloc_ptrs,\
    ... additional parameters as needed ...)

The file that implements operations on this structure (<filename>.c) should then include, near the beginning, the line:

private_st_xxx();

In much existing code, structure descriptors are declared as public, which allows clients to allocate instances of the structure directly. We now consider this bad design. Instead, structure descriptors should always be static; the implementation file should provide one or more procedures for allocating instances, e.g.,

xxx_t *gs_xxx_alloc(P1(gs_memory_t *mem));

If it is necessary to make a structure descriptor public, it should be declared in its clients as

extern_st(st_xxx);

"Objects"

Ghostscript makes heavy use of object-oriented constructs, including analogues of classes, instances, subclassing, and class-associated procedures. However, these constructs are implemented in C rather than C++, for two reasons:

Classes

The source code representation of a class is simply a typedef for a C struct. See Structures, above, for details.

Procedures

Ghostscript has no special construct for non-virtual procedures associated with a class. In some cases, the typedef for the class is in a header file but the struct declaration is in the implementation code file: this provides an extra level of opaqueness, since clients then can't see the representation and must make all accesses through procedures. You should use this approach in new code, if it doesn't increase the size of the code too much or require procedure calls for very heavily used accesses.
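
The opaque-type arrangement can be sketched in a single fragment as follows. Everything here is hypothetical: counter_t and its procedures are invented for illustration, and malloc/free stand in for whatever allocator real code would use.

```c
#include <stdlib.h>

/* ---- What the header file would expose: typedef and procedures. ---- */
typedef struct counter_s counter_t;

counter_t *counter_alloc(int start);
int counter_next(counter_t *pc);
void counter_free(counter_t *pc);

/* ---- What the implementation file would keep to itself. ---- */
struct counter_s {
    int value;
};

/* Allocate a counter; returns NULL if allocation fails. */
counter_t *
counter_alloc(int start)
{
    counter_t *pc = (counter_t *)malloc(sizeof(*pc));

    if (pc != NULL)
        pc->value = start;
    return pc;
}

/* Return the current value, then advance the counter. */
int
counter_next(counter_t *pc)
{
    return pc->value++;
}

void
counter_free(counter_t *pc)
{
    free(pc);
}
```

Because clients see only the typedef, the members of counter_s can change without touching any client code.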

Ghostscript uses three different approaches for storing and accessing virtual procedures, plus a fourth one that is recommended but not currently used. For exposition, we assume the class (type) is named xxx_t, it has a virtual procedure void (*virtu)(P1(xxx_t *)), and we have a variable declared as xxx_t *pxx.

  1. The procedures are stored in a separate, constant structure of type xxx_procs, of which virtu is a member. The structure definition of xxx_t includes a member defined as const xxx_procs *procs (always named procs). The construct for calling the virtual procedure is pxx->procs->virtu(pxx).
  2. The procedures are defined in a structure of type xxx_procs as above. The structure definition of xxx_t includes a member defined as xxx_procs procs (always named procs). The construct for calling the virtual procedure is pxx->procs.virtu(pxx).
  3. The procedures are not defined in a separate structure: each procedure is a separate member of xxx_t. The construct for calling the virtual procedure is pxx->virtu(pxx).
  4. The procedures are defined in a structure of type xxx_procs as above. The structure definition of xxx_t includes a member defined as xxx_procs procs[1] (always named procs). The construct for calling the virtual procedure is again pxx->procs->virtu(pxx).

Note that in approach 1, the procedures are in a shared constant structure; in approaches 2 - 4, they are in a per-instance structure that can be changed dynamically, which is sometimes important.

In the present Ghostscript code, approach 1 is most common, followed by 2 and 3; 4 is not used at all. For new code, you should use 1 or 4: that way, all virtual procedure calls have the same form, regardless of whether the procedures are shared and constant or per-instance and mutable.
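
The sketch below puts approaches 1 and 4 side by side to show that the call has the same form in both. The class names aaa_t and bbb_t, the calls member, and call_both are hypothetical, and ordinary ANSI prototypes are written out directly.

```c
/* Approach 1: a shared, constant procedure structure. */
typedef struct aaa_s aaa_t;
typedef struct aaa_procs_s {
    void (*virtu)(aaa_t *);
} aaa_procs;
struct aaa_s {
    const aaa_procs *procs;
    int calls;
};

static void
aaa_virtu(aaa_t *paa)
{
    paa->calls += 1;
}

static const aaa_procs aaa_shared_procs = {
    aaa_virtu
};

static void
aaa_init(aaa_t *paa)
{
    paa->procs = &aaa_shared_procs;
    paa->calls = 0;
}

/* Approach 4: a per-instance, mutable structure declared as procs[1]. */
typedef struct bbb_s bbb_t;
typedef struct bbb_procs_s {
    void (*virtu)(bbb_t *);
} bbb_procs;
struct bbb_s {
    bbb_procs procs[1];
    int calls;
};

static void
bbb_virtu(bbb_t *pbb)
{
    pbb->calls += 10;
}

static void
bbb_init(bbb_t *pbb)
{
    pbb->procs->virtu = bbb_virtu;      /* can be replaced per instance */
    pbb->calls = 0;
}

/* The two calls are written identically, although the storage differs. */
static int
call_both(aaa_t *paa, bbb_t *pbb)
{
    paa->procs->virtu(paa);
    pbb->procs->virtu(pbb);
    return paa->calls + pbb->calls;
}
```

The one-element array in approach 4 decays to a pointer in expressions, which is what lets pbb->procs->virtu(pbb) compile unchanged.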

Subclassing

Ghostscript's class mechanism allows for subclasses that can add data members, or can add procedure members if approach 1 or 3 (above) is used. Since C doesn't support subclassing, we use a convention to accomplish it. In the example below, gx_device is the root class; it has a subclass gx_device_forward, which in turn has a subclass gx_device_null. First we define a macro for all the members of the root class, and the root class type. (As for structures in general, classes need a structure descriptor, as discussed in Structures above: we include these in the examples below.)

#define gx_device_common\
    type1 member1;\
    ...
    typeN memberN

typedef struct gx_device_s {
    gx_device_common;
} gx_device;

#define private_st_gx_device()  /* in gsdevice.c */\
  gs_private_st_<whatever>(st_gx_device, gx_device,\
    "gx_device", device_enum_ptrs, device_reloc_ptrs,\
    ... additional parameters as needed ...)

We then define a similar macro and type for the subclass.

#define gx_device_forward_common\
    gx_device_common;\
    gx_device *target

typedef struct gx_device_forward_s {
    gx_device_forward_common;
} gx_device_forward;

#define private_st_device_forward()  /* in gsdevice.c */\
  gs_private_st_suffix_add1(st_device_forward, gx_device_forward,\
    "gx_device_forward", device_forward_enum_ptrs, device_forward_reloc_ptrs,\
    gx_device, target)

Finally, we define a leaf class, which doesn't need a macro because we don't currently subclass it. (We can create the macro later if needed, with no changes anywhere else.) In this particular case, the leaf class has no additional data members, but it could have some.

typedef struct gx_device_null_s {
    gx_device_forward_common;
} gx_device_null;

#define private_st_device_null()  /* in gsdevice.c */\
  gs_private_st_suffix_add0_local(st_device_null, gx_device_null,\
    "gx_device_null", device_null_enum_ptrs, device_null_reloc_ptrs,\
    gx_device_forward)

Note that the above example is not the actual definition of the gx_device structure type: the actual type has some additional complications because it has a finalization procedure. See base/gxdevcli.h for the details.

If you add members to a root class (such as gx_device in this example), or change existing members, do this in the gx_device_common macro, not the gx_device structure definition. Similarly, to change the gx_device_forward class, modify the gx_device_forward_common macro, not the structure definition. Only change the structure definition if the class is a leaf class (one with no _common macro and no possibility of subclassing), like gx_device_null.

Error handling

Every caller should check for error returns and, in general, propagate them to its callers. By convention, nearly every procedure returns an int to indicate the outcome of the call:

< 0   Error return
0     Normal return
> 0   Non-error return other than the normal case

To make a procedure generate an error and return it, as opposed to propagating an error generated by a lower procedure, you should use

return_error(error_number);

Sometimes it is more convenient to generate the error in one place and return it in another. In this case, you should use

code = gs_note_error(error_number);
...
return code;

In executables built for debugging, the -E (or -Z#) command line switch causes return_error and gs_note_error to print the error number and the source file and line: this is often helpful for identifying the original cause of an error.

See the file base/gserrors.h for the error return codes used by the graphics library, most of which correspond directly to PostScript error conditions.
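
The checking and propagation convention described in this section can be sketched as follows. EXAMPLE_ERROR_RANGE is a hypothetical stand-in for an error code from gserrors.h, check_index and fetch_element are hypothetical procedures, and a plain return is used where real Ghostscript code would use return_error.

```c
#define EXAMPLE_ERROR_RANGE (-1)        /* hypothetical error code */

/* Return 0 if index is valid for an array of the given size. */
static int
check_index(int index, int size)
{
    if (index < 0 || index >= size)
        return EXAMPLE_ERROR_RANGE;     /* the error is generated here */
    return 0;
}

/* Store values[index] in *presult; return 0 or a negative error code. */
static int
fetch_element(const int *values, int size, int index, int *presult)
{
    int code = check_index(index, size);

    if (code < 0)
        return code;                    /* propagate to the caller */
    *presult = values[index];
    return 0;
}
```

On an error return, *presult is left untouched, so the caller must test code before using the result.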


Copyright © 2000-2006 Artifex Software, Inc. All rights reserved.

This software is provided AS-IS with no warranty, either express or implied. This software is distributed under license and may not be copied, modified or distributed except as expressly authorized under the terms of that license. Refer to licensing information at http://www.artifex.com/ or contact Artifex Software, Inc., 7 Mt. Lassen Drive - Suite A-134, San Rafael, CA 94903, U.S.A., +1(415)492-9861, for further information.

Ghostscript version 9.07, 12 February 2013