cpphs 1.0 review

by rbytes.net on

License: LGPL (GNU Lesser General Public License)
File size: 42K
Developer: Malcolm Wallace
0 stars award from rbytes.net

cpphs is a liberalised re-implementation of cpp, the C pre-processor, in Haskell.

Why re-implement cpp? Rightly or wrongly, the C pre-processor is widely used in Haskell source code. It enables conditional compilation for different compilers, different versions of the same compiler, and different OS platforms.
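
As an illustration of the conditional compilation described above, here is a minimal sketch of a Haskell module that selects definitions at preprocessing time. It assumes the file is built with GHC, which predefines the symbols `__GLASGOW_HASKELL__`, `mingw32_HOST_OS` and `darwin_HOST_OS`; with another compiler or a standalone preprocessor run, such symbols would have to be supplied some other way (for example with -D). The names `compilerName` and `platform` are made up for this example.

```haskell
{-# LANGUAGE CPP #-}
module Main (main) where

-- Compiler label chosen at preprocessing time.
compilerName :: String
#ifdef __GLASGOW_HASKELL__
compilerName = "GHC"
#else
compilerName = "another Haskell compiler"
#endif

-- Platform label chosen at preprocessing time.
-- The *_HOST_OS symbols are assumed to be predefined by the compiler.
platform :: String
#if defined(mingw32_HOST_OS)
platform = "Windows"
#elif defined(darwin_HOST_OS)
platform = "macOS"
#else
platform = "some other platform"
#endif

main :: IO ()
main = putStrLn (compilerName ++ " on " ++ platform)
```

Only one branch of each conditional survives preprocessing, so the compiler proper never sees the alternatives.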

It is also occasionally used for its macro language, which can enable certain forms of platform-specific detail-filling, such as the tedious boilerplate generation of instance definitions and FFI declarations. However, there are two problems with cpp, aside from the obvious aesthetic ones:

* For some Haskell systems, notably Hugs on Windows, a true cpp is not available by default.
* Even for the other Haskell systems, the common cpp provided by the gcc 3.x series is changing subtly in ways that are incompatible with Haskell's syntax. There have always been problems with, for instance, string gaps, and prime characters in identifiers. These problems are only going to get worse.

So, it seemed right to provide an alternative to cpp, both more compatible with Haskell, and itself written in Haskell so that it can be distributed with compilers.
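
The macro-based boilerplate generation mentioned earlier can be sketched roughly as follows. This is an illustrative example only: the class `Describe` and the macro `DESCRIBE_INSTANCE` are hypothetical names, and explicit braces are used so that the generated instance does not depend on layout.

```haskell
{-# LANGUAGE CPP #-}
module Main (main) where

-- Hypothetical class used only for this illustration.
class Describe a where
  describe :: a -> String

-- Hypothetical macro: each call generates one instance declaration.
#define DESCRIBE_INSTANCE(TY, TXT) instance Describe TY where { describe _ = TXT }

DESCRIBE_INSTANCE(Int, "an Int")
DESCRIBE_INSTANCE(Bool, "a Bool")
DESCRIBE_INSTANCE(Char, "a Char")

main :: IO ()
main = mapM_ putStrLn [describe (1 :: Int), describe True, describe 'x']
```

Each macro call stands in for a full instance declaration, which is the kind of repetitive text the macro language is occasionally used to generate.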

This version of the C pre-processor is pretty much feature-complete, and compatible with the -traditional style. It has two main modes:

* conditional compilation only (--nomacro),
* and full macro-expansion (default).

In --nomacro mode, cpphs performs only conditional compilation actions: #include, #if, and #ifdef directives are processed according to text-replacement definitions (both command-line and internal), but no parameterised macro expansion is performed. In full compatibility mode (the default), textual replacements and macro expansions are also processed in the remaining body of non-cpp text.

Working features:

#ifdef      simple conditional compilation
#if         the full boolean language of defined(), &&, ||, ==, etc.
#elif       chained conditionals
#define     in-line definitions (text replacements and macros)
#undef      in-line revocation of definitions
#include    file inclusion
#line       line number directives
\           line continuations within all # directives
/**/        token catenation within a macro definition
##          ANSI-style token catenation
#           ANSI-style token stringisation
__FILE__    special text replacement for DIY error messages
__LINE__    special text replacement for DIY error messages
__DATE__    special text replacement
__TIME__    special text replacement
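
The "DIY error messages" use of `__FILE__` and `__LINE__` listed above could look something like the sketch below. The helper `located` and the macro `HERE` are hypothetical names chosen for this example. It assumes that `__FILE__` is replaced by a quoted string and `__LINE__` by a numeric literal.

```haskell
{-# LANGUAGE CPP #-}
module Main (main) where

-- Hypothetical helper: prefix a message with a source location.
located :: String -> Int -> String -> String
located file line msg = file ++ ":" ++ show line ++ ": " ++ msg

-- Hypothetical macro: captures the file and line of the call site.
#define HERE(msg) (located __FILE__ __LINE__ (msg))

main :: IO ()
main = putStrLn HERE("unexpected empty list")
```

The printed location depends on the file name and on the line where `HERE` is used, so no fixed output is shown here.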

Macro expansion is recursive. Redefinition of a macro name does not generate a warning. Macros can be defined on the command-line with -D just like textual replacements. Macro names are permitted to be Haskell identifiers e.g. with the prime ' and backtick ` characters, which is slightly looser than in C, but they still may not include operator symbols.

Numbering of lines in the output is preserved so that any later processor can give meaningful error messages. When a file is #include'd, cpphs inserts #line directives for the same reason. Numbering should be correct even in the presence of line continuations. If you don't want #line directives in the final output, use the --noline option.

Any syntax error in a cpp directive produces a message on stderr and halts the program. Failure to find a #include'd file produces a warning on stderr, but processing continues.

Differences from cpp:
In general, cpphs is based on the -traditional behaviour, not ANSI C, and has the following main differences from the standard cpp.

* The # that introduces any cpp directive must be in the first column of a line (whereas ANSI permits whitespace before the #).
* Generates the #line n "filename" syntax, not the # n "filename" variant.
* C comments are only removed from within cpp directives. They are not stripped from other text. (Consider that in Haskell, /* and */ and */* are all valid operator symbols.) However, you can turn on C-comment removal with the --strip option.
* Macros are never expanded within Haskell comments, strings, or character constants, unless you give the --text option to disable lexing the input as Haskell.
* Macros are always expanded recursively, unlike ANSI, which detects and prevents self-recursion. For instance, #define foo x:foo expands foo once only to x:foo in ANSI, but in cpphs it becomes an infinite list x:x:x:x:..., i.e. cpphs does not terminate.
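
The difference in recursion behaviour can be modelled with a small toy program. This is a sketch of the two expansion strategies over a list of tokens, not the actual code of cpphs or of any cpp. The names `Defs`, `expandAnsi`, `expandRec` and `fooDefs` are invented for the illustration. Laziness lets us inspect a finite prefix of the otherwise endless expansion.

```haskell
module Main (main) where

-- Toy model: each macro name maps to a list of replacement tokens.
type Defs = [(String, [String])]

-- ANSI-style model: a name is not expanded again inside its own expansion.
expandAnsi :: Defs -> [String] -> [String]
expandAnsi defs = go []
  where
    go active = concatMap (expandTok active)
    expandTok active tok =
      case lookup tok defs of
        Just body | tok `notElem` active -> go (tok : active) body
        _                                -> [tok]

-- Unrestricted recursive model: every occurrence of a name is expanded
-- again, so a self-referential definition yields an infinite token list.
expandRec :: Defs -> [String] -> [String]
expandRec defs = concatMap expandTok
  where
    expandTok tok = maybe [tok] (expandRec defs) (lookup tok defs)

-- Models: #define foo x:foo
fooDefs :: Defs
fooDefs = [("foo", ["x", ":", "foo"])]

main :: IO ()
main = do
  putStrLn (concat (expandAnsi fooDefs ["foo"]))           -- prints x:foo
  putStrLn (concat (take 8 (expandRec fooDefs ["foo"])))   -- prints x:x:x:x:
```

In the second model only the `take 8` keeps the program finite; a preprocessor that must emit the whole expansion has no such cut-off.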

Macro definition language:

* Accepts /**/ for token-pasting in a macro definition. However, /* */ (with any text between the open/close comment) inserts whitespace.
* The ANSI ## token-pasting operator is available with the --hashes flag. This is to avoid misinterpreting any valid Haskell operator of the same name.
* Replaces a macro formal parameter with the actual, even inside a string (double or single quoted). This is -traditional behaviour, not supported in ANSI.
* Recognises the # stringisation operator in a macro definition only if you use the --hashes option. (It is an ANSI addition, only needed because quoted stringisation (above) is prohibited by ANSI.)
* Preserves whitespace within a textual replacement definition exactly (modulo newlines), but leading and trailing space is eliminated.
* Preserves whitespace within a macro definition, including trailing whitespace, exactly (modulo newlines), but leading space is eliminated.
* Preserves whitespace within macro call arguments exactly (including newlines), but leading and trailing space is eliminated.
* With the --layout option, line continuations in a textual replacement or macro definition are preserved as line-breaks in the macro call. (Useful for layout-sensitive code in Haskell.)
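
Two of the rules above, /**/ token-pasting and replacement of a formal parameter inside a string, can be illustrated with the following sketch. The macros `GETTER` and `TRACE` and the definitions `getName` and `count` are hypothetical. The results noted in the comments are what the rules above imply for cpphs; an ANSI-style preprocessor would be expected to treat both macros differently.

```haskell
{-# LANGUAGE CPP #-}
module Main (main) where

-- Hypothetical macros for illustration.
#define GETTER(field) get/**/field
#define TRACE(v) ("v = " ++ show (v))

getName :: String
getName = "cpphs"

count :: Int
count = 3

main :: IO ()
main = do
  -- Intended result after preprocessing: putStrLn getName
  putStrLn GETTER(Name)
  -- Intended result after preprocessing: putStrLn ("count = " ++ show (count))
  putStrLn TRACE(count)
```

The second macro relies on the -traditional rule that the formal parameter v is replaced even inside the quoted string.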

What's New in This Release:
* This release now includes a compatibility script for command line arguments to match the original cpp.
* There are several minor bugfixes, e.g. quotes around replacements for special macros like __FILE__, etc.
* Interaction with preprocessors like hsc2hs is also improved: if they allow non-cpp directives like #def, these are now passed through to the output with a warning to stderr, rather than halting with an error.
* Likewise, a #! line in a shell script is now ignored.
