Parsing and Printing tools

Camlp5 provides two original parsing tools:

The first parsing tool, stream parsers, is the elementary system. It is pure syntactic sugar: the code is converted directly into basic OCaml constructs (functions, pattern matching, try). A stream parser is a function. However, the system handles neither associativity nor parsing levels, and left recursion results in infinite loops, just as with a function whose first action is a call to itself.

The second parsing tool, extensible grammars, is more sophisticated. A grammar written with it is more readable and looks like grammars written with tools such as "yacc". Extensible grammars take care of associativity, left recursion, and parsing levels. They are dynamically extensible, which is what allows the syntax extensions that Camlp5 provides for OCaml syntax.

In both cases, the input data are streams.

Camlp5 also provides a pretty printing tool, a module for controlling line length.

The next sections give an overview of the two parsing tools and the pretty printing tool.

Stream parsers

Stream parsers are an elementary recursive descent parsing system. Streams are actually lazy lists. At each step, the head of the list is compared against a stream pattern. There are two kinds of stream parsers:

The differences are about:

In the imperative version, there are also lexers, a shorter syntax for when the stream elements are of the specific type 'char'.
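
As a rough picture of what a stream parser amounts to once the syntactic sugar is removed, here is a hand-written sketch in plain OCaml. It is not the code Camlp5 generates: the "stream" type (a mutable token list), the functions "peek" and "junk", and the two exceptions are assumptions made so that the example runs without any library. It shows a parser as an ordinary function built from pattern matching and "try", and why the repetition is written on the right of the first operand rather than as a left-recursive rule.

```ocaml
(* Hypothetical stand-ins for this sketch only, not the Camlp5 API. *)
type token = Int of int | Plus | Minus

(* A "stream" here is just a mutable reference to the remaining tokens. *)
type stream = token list ref

exception Parse_failure          (* first token did not match; nothing consumed *)
exception Parse_error of string  (* failure after some tokens were consumed *)

let peek (s : stream) = match !s with [] -> None | t :: _ -> Some t
let junk (s : stream) = match !s with [] -> () | _ :: rest -> s := rest

(* atom ::= INT *)
let atom s =
  match peek s with
  | Some (Int n) -> junk s; n
  | _ -> raise Parse_failure

(* expr ::= atom (('+' | '-') atom)*
   A rule such as "expr ::= expr '+' atom" would make [expr] call itself
   before consuming anything, and so loop forever. *)
let expr s =
  let rec rest acc =
    match peek s with
    | Some Plus -> junk s; rest (acc + operand ())
    | Some Minus -> junk s; rest (acc - operand ())
    | _ -> acc
  and operand () =
    (* An operator has already been consumed, so a missing operand is an error. *)
    try atom s with Parse_failure -> raise (Parse_error "operand expected")
  in
  rest (atom s)
```

Left associativity comes only from the way the accumulator is threaded through "rest"; nothing in the system provides it, which is the limitation described above.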

Extensible grammars

Extensible grammars manipulate grammar entries. Grammar entries are abstract values internally containing mutable stream parsers. When a grammar entry is created, its internal parser is empty, i.e. it raises "Stream.Failure" if used. A specific syntactic construction, introduced by the keyword "EXTEND", allows extending grammar entries with new grammar rules.
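
The idea of an entry as a value holding a mutable parser can be sketched in a few lines of plain OCaml. This is only a model under stated assumptions, not Camlp5's implementation or interface: the record type "entry", the functions "create", "extend" and "keyword", the exception "No_match", and the choice to try the newest rule first are all hypothetical.

```ocaml
(* Hypothetical model of a grammar entry; none of these names come from Camlp5. *)
type stream = string list ref   (* remaining tokens, a stand-in for real streams *)

exception No_match  (* plays the role of the failure raised by an empty parser *)

(* An entry wraps a mutable parser. *)
type 'a entry = { mutable parse : stream -> 'a }

(* A freshly created entry has an "empty" parser that always fails. *)
let create () = { parse = (fun _ -> raise No_match) }

(* [extend e rule] tries the new rule first and falls back on the old parser.
   The rule must raise [No_match] without consuming input when it does not apply. *)
let extend e rule =
  let old = e.parse in
  e.parse <- (fun s -> try rule s with No_match -> old s)

(* A rule that recognizes one fixed token and returns [value]. *)
let keyword kw value (s : stream) =
  match !s with
  | t :: rest when t = kw -> s := rest; value
  | _ -> raise No_match

let boolean : bool entry = create ()

let () =
  extend boolean (keyword "true" true);
  extend boolean (keyword "false" false)
```

Because the parser is stored in a mutable field, code that already holds "boolean" sees the new rules as soon as they are added, which is the property that makes dynamic extension possible.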

In contrast to stream parsers, grammar entries take care of associativity, left factorization, and levels. Moreover, the syntax for grammars allows defining optional calls, lists, and lists with separators. However, grammar entries are not functions and cannot take parameters.
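
To make "optional calls, lists and lists with separators" concrete, the following plain OCaml sketch writes them as ordinary higher-order functions over a toy stream. It only illustrates the meaning of these constructs; the grammar syntax provides them as built-in notation, and the names "opt", "list0", "list1sep", "token", "ident" and "No_match" below are assumptions of the sketch.

```ocaml
(* Hypothetical helpers for illustration; not the Camlp5 grammar syntax. *)
type stream = string list ref

exception No_match

(* Recognize one fixed token. *)
let token t (s : stream) =
  match !s with
  | x :: rest when x = t -> s := rest; x
  | _ -> raise No_match

(* Recognize any token other than a comma. *)
let ident (s : stream) =
  match !s with
  | x :: rest when x <> "," -> s := rest; x
  | _ -> raise No_match

(* Optional call: [Some result], or [None] when the parser does not apply. *)
let opt p s = try Some (p s) with No_match -> None

(* Possibly empty list of items. *)
let rec list0 p s =
  match opt p s with
  | Some x -> x :: list0 p s
  | None -> []

(* Non-empty list of items separated by [sep]. *)
let list1sep p sep s =
  let first = p s in
  let rec more () =
    match opt sep s with
    | Some _ -> let x = p s in x :: more ()
    | None -> []
  in
  first :: more ()
```

For instance, "list1sep ident (token \",\")" applied to the tokens "a", ",", "b" returns the list of the two identifiers.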

Since the internal system is stream parsers, extensible grammars use recursive descent parsing.

The parser for the OCaml language in Camlp5 is written with extensible grammars.

Pretty module

The "Pretty" module is an original tool for controlling how lines are displayed. The user has to specify two functions where:

The system first tries the first function. At any time, if the line overflows, i.e. if its size is greater than some "line length" specified in the module interface, or if it contains newlines, the function is aborted and control is given to the second function.
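
The try-then-fall-back behaviour can be sketched in plain OCaml as follows. This is a simplified model, not the "Pretty" module itself: the names "line_length", "Overflow", "check", "horiz_or_vertic" and "print_list" are assumptions of the sketch, and the overflow test is done once on the finished string, whereas the text above describes the first function being aborted at any time.

```ocaml
(* Hypothetical model of the two-function scheme; not the Pretty module's API. *)
let line_length = ref 20

exception Overflow

(* Return [s] if it fits on one line, otherwise abort the current attempt. *)
let check s =
  if String.length s > !line_length || String.contains s '\n'
  then raise Overflow
  else s

(* Try the first function; if its result overflows, use the second one. *)
let horiz_or_vertic horiz vertic =
  try check (horiz ()) with Overflow -> vertic ()

(* Print a list on one line when it fits, otherwise one item per line.
   The second function manages indentation and line ends by hand. *)
let print_list items =
  horiz_or_vertic
    (fun () -> "[" ^ String.concat "; " items ^ "]")
    (fun () -> "[\n  " ^ String.concat ";\n  " items ^ "\n]")
```

With the line length set to 20, a short list comes out on a single line, while a list whose one-line form is longer than 20 characters is produced by the second function.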

This is a basic but powerful system. It assumes that the programmer takes care of the current indentation, and of the beginning and the end of the lines.

The module will be extended in the future to hide the management of indentation and line continuations, and to supply functions combining the two cases above, in which the programmer can specify the possible places where newlines can be inserted.