Camlp5 - Pretty Printing OCaml Programs

Camlp5 provides extension kits to pretty print programs in revised syntax and normal syntax. Other extension kits make it possible to rebuild the parsers, or the EXTEND statements, in their initial syntax. The pretty print system is itself extensible by adding new rules. We present here how it works in the Camlp5 sources.

The pretty print system of Camlp5 uses the library modules Pretty (an original system to format output) and Extfun (another original system of extensible functions).

This documentation is intended for programmers who want to understand how the pretty printing of OCaml programs works in Camlp5, who want to adapt, modify or debug it, or who want to add their own pretty printing extensions.

Introduction

The files doing the pretty printing are located in the Camlp5 sources, in the directory "etc". Look at them if you are interested in creating new ones. The main ones are:

We present here how this system works inside these files.

Principles

Using module Pretty

All functions in OCaml pretty printing take a parameter named "the printing context" (variable pc). It is a record holding:

A typical pretty printing function calls the function horiz_vertic of the library module Pretty. This function takes two functions as parameters:

Both functions concatenate the strings by using the function sprintf of the library module Pretty, which controls whether the printed data fits on the line or not. They generally call, recursively, other pretty printing functions with the same behaviour.

Let us see a (fictitious) example of printing an OCaml application. Suppose we have an application expression "e1 e2" to pretty print, where e1 and e2 are sub-expressions. If both expressions and their application fit on a single line, we want to see:

e1 e2

On the other hand, if they do not fit on a single line, we want to see e2 on another line with, say, an indentation of 2 spaces:

e1
  e2

Here is a possible implementation. The function has been named expr_app and can call the function expr to print the sub-expressions e1 and e2:

value expr_app pc e1 e2 =
  horiz_vertic
    (fun () ->
       let s1 = expr {(pc) with aft = ""} e1 in
       let s2 = expr {(pc) with bef = ""} e2 in
       sprintf "%s %s" s1 s2)
    (fun () ->
       let s1 = expr {(pc) with aft = ""} e1 in
       let s2 =
         expr
           {(pc) with
              ind = pc.ind + 2;
              bef = tab (pc.ind + 2)}
           e2
       in
       sprintf "%s\n%s" s1 s2)
;

The first function is the horizontal printing. It ends with a sprintf separating the printing of e1 and e2 by a space. The possible "before part" (pc.bef) and "after part" (pc.aft) are transmitted in the calls of the sub-functions.

The second function is the vertical printing. It ends with a sprintf separating the printing of e1 and e2 by a newline. The second line starts with an indentation, using the "before part" (pc.bef) of the second call to expr.

The pretty printing library function Pretty.horiz_vertic calls the first (horizontal) function and, if it fails (either because s1 or s2 is too long or contains newlines, or because the final string produced by sprintf is too long), calls the second (vertical) function.

Notice that the parameter pc contains a field pc.bef (what has to be printed before, on the same line), which in both cases is transmitted to the printing of e1 (since the syntax {(pc) with aft = ""} builds a record in which pc.bef is kept). The same holds for the field pc.aft, which is transmitted to the printing of e2.
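The fallback from horizontal to vertical printing can be illustrated with a small self-contained model. This is only a sketch written in normal OCaml syntax, not the code of the Pretty module: the exception Too_wide, the function check, the 20-column limit and the function atom are all hypothetical, and the width test is done once on the final string, whereas the text above says the real sprintf controls it during formatting.

```ocaml
(* Toy model of the horizontal/vertical fallback. NOT the real Pretty module:
   Too_wide, check, line_length and atom are illustrative assumptions. *)
exception Too_wide

let line_length = ref 20 (* hypothetical width limit *)

(* Return [s] if it fits on one line, otherwise signal a failure. *)
let check s =
  if String.contains s '\n' || String.length s > !line_length then
    raise Too_wide
  else s

(* Try the horizontal layout first; on failure, use the vertical one. *)
let horiz_vertic horiz vertic =
  try check (horiz ()) with Too_wide -> vertic ()

(* Simplified printing context: indentation, "before" and "after" parts. *)
type pc = { ind : int; bef : string; aft : string }

let tab n = String.make n ' '

(* Sub-expressions are plain names here, printed as bef ^ name ^ aft. *)
let atom pc name = pc.bef ^ name ^ pc.aft

let app pc e1 e2 =
  horiz_vertic
    (fun () ->
       let s1 = atom { pc with aft = "" } e1 in
       let s2 = atom { pc with bef = "" } e2 in
       Printf.sprintf "%s %s" s1 s2)
    (fun () ->
       let s1 = atom { pc with aft = "" } e1 in
       let s2 =
         atom { pc with ind = pc.ind + 2; bef = tab (pc.ind + 2) } e2
       in
       Printf.sprintf "%s\n%s" s1 s2)
```

With pc = {ind = 0; bef = "let x = "; aft = ";"}, a short application such as app pc "f" "y" stays on one line, while a function name that makes the line exceed the limit sends the argument to the next line, indented by 2 spaces. In both layouts the "before part" ends up in front of e1 and the "after part" behind e2.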

Using module Extfun and its syntax

This system is combined with the extensible functions to allow the extensibility of the pretty printing. The pretty printers of Camlp5 can then be used as "kits", added or not depending on what has to be pretty printed and how. In particular, the pretty printing kit "pr_r.cmo" alone does not rebuild parsers in their original syntax. When "pr_rp.cmo" is added, the parsers are rebuilt: the code of "pr_rp.ml" is just an extension of some parts of the pretty printing extensible functions of "pr_r.ml".

The code above actually looks like:

value expr_app =
  extfun Extfun.empty with
  [ <:expr< $e1$ $e2$ >> ->
      fun curr next pc ->
        horiz_vertic
          (fun () ->
             let s1 = curr {(pc) with aft = ""} e1 in
             let s2 = next {(pc) with bef = ""} e2 in
             sprintf "%s %s" s1 s2)
          (fun () ->
             let s1 = curr {(pc) with aft = ""} e1 in
             let s2 =
               next
                 {(pc) with
                    ind = pc.ind + 2;
                    bef = tab (pc.ind + 2)}
                 e2
             in
             sprintf "%s\n%s" s1 s2)
  | e ->
      fun curr next pc -> next pc e ]
;

The extensible functions have a syntax tree (here <:expr< $e1$ $e2$ >>) as parameter. To be extensible, the syntax tree must be the first parameter (it is not possible to apply extensions inside a closure). The other parameters, in particular the printing context pc are given in the semantic action.

The parameters curr and next are provided by the pretty printing system for OCaml programs. They correspond to the pretty printing of, respectively, the current level and the next level. Since application in OCaml is left associative, the first sub-expression is printed at the same (current) level and the second one is printed at the next level. We also see a call to next in the last (second) case of the function, which hands all other expressions over to the next level.
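The role of curr and next can be sketched independently of Camlp5. The following toy printer, in normal OCaml syntax, is an illustration under assumptions and not the Camlp5 implementation: the type expr, the three level functions, print_from and top_ref are hypothetical names, and there is no printing context or line breaking. Each level receives curr (same level) and next (following level) and passes unmatched expressions to next; the last level parenthesizes and restarts at the top.

```ocaml
(* Toy model of printing by levels with [curr] and [next]; all names are
   illustrative assumptions, not Camlp5 definitions. *)
type expr =
  | Var of string
  | App of expr * expr
  | Add of expr * expr

(* Forward reference to the top-level printer, set at the end. *)
let top_ref : (expr -> string) ref = ref (fun _ -> assert false)

(* Left associative: left operand at the current level, right at the next. *)
let level_add curr next e =
  match e with
  | Add (a, b) -> curr a ^ " + " ^ next b
  | e -> next e

let level_app curr next e =
  match e with
  | App (f, x) -> curr f ^ " " ^ next x
  | e -> next e

(* Last level: anything that is not atomic is parenthesized and printed
   again from the top level. *)
let level_simple _curr _next e =
  match e with
  | Var v -> v
  | e -> "(" ^ !top_ref e ^ ")"

let levels = [ level_add; level_app; level_simple ]

(* Call the first level of [lvls], giving it itself as [curr] and the
   remaining levels as [next]. *)
let rec print_from lvls e =
  match lvls with
  | [] -> failwith "no rule matched"
  | lev :: rest -> lev (print_from lvls) (print_from rest) e

let print e = print_from levels e
let () = top_ref := print
```

Because the left operand of an application goes through curr, "f x y" needs no parentheses, whereas an application in argument position reaches the last level and is printed as "f (g x)".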

Dangling else, bar, semicolon

In normal syntax, there are cases where it is necessary to enclose expressions between parentheses (or between begin and end, which is equivalent in that syntax). Three tokens may cause problems: the "else", the vertical bar "|" and the semicolon ";". Here are examples where the presence of these tokens forces the previous expression to be parenthesized. In these three examples, removing the enclosing begin..end would change the meaning of the expression, because the dangling token would be included in that expression:

Dangling else:

if a then begin if b then c end else d

Dangling bar:

function
  A ->
    begin match a with
      B -> c
    | D -> e
    end
| F -> g

Dangling semicolon:

if a then b
else begin
  let c = d in
  e
end;
f

The information is transmitted by the value pc.dang. In the first example above, while displaying the "then" part of the outer "if", the sub-expression is called with the value pc.dang set to "else" to inform the last sub-sub-expression that it is going to be followed by that token. When an "if" expression has to be displayed without an "else" part and its pc.dang is "else", it has to be enclosed in parentheses (or begin..end).

This problem does not exist in revised syntax. While pretty printing in revised syntax, the parameter pc.dang is not necessary and remains the empty string.
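The propagation of the dangling token can be sketched with a toy printer for "if" expressions only. This is a minimal illustration in normal OCaml syntax, assuming a hypothetical type expr and a plain string parameter dang in place of the field pc.dang; it handles neither the bar nor the semicolon, and it does no line breaking.

```ocaml
(* Toy model of the "dangling else" protection. [dang] plays the role of
   pc.dang: the token that will follow the printed expression, or "". *)
type expr =
  | Var of string
  | If of expr * expr * expr option (* the "else" part is optional *)

let rec print dang e =
  match e with
  | Var v -> v
  | If (c, t, Some el) ->
      (* The "then" part is followed by "else"; the "else" part inherits
         whatever follows the whole expression. *)
      Printf.sprintf "if %s then %s else %s"
        (print "" c) (print "else" t) (print dang el)
  | If (c, t, None) ->
      if dang = "else" then
        (* An "else" follows: enclose this "if" so it cannot capture it.
           Inside the enclosure nothing dangles any more. *)
        Printf.sprintf "begin %s end" (print "" e)
      else Printf.sprintf "if %s then %s" (print "" c) (print dang t)
```

Printing If (Var "a", If (Var "b", Var "c", None), Some (Var "d")) with an empty dang gives back the first example: the inner "if" has no "else" part and is followed by one, so it is enclosed between begin and end.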

By level

For each level of pretty printing, there is such a function. The example showed the pretty printing of expressions at the level "apply". There are other functions for the levels "top", "add", "mul", "simple", and so on. The global pretty printing variable for expressions is a record named "pr_expr" (in the module Pcaml.Printers), where the levels are defined by a list, something like this:

pr_expr.pr_levels :=
  [{pr_label = "top"; pr_rules = expr_top};
   {pr_label = "add"; pr_rules = expr_add};
   {pr_label = "mul"; pr_rules = expr_mul};
   {pr_label = "apply"; pr_rules = expr_app};
   {pr_label = "simple"; pr_rules = expr_simple}]
;

where we find, in particular, our function expr_app defined above.

The call to a specific level is done by the function pr_expr.pr_fun with the level name. It returns the function taking the "printing context" (pc) and the expression as parameters, and returning the pretty printed string. For example, the call to the top level of expressions has been defined as:

value expr pc e = pr_expr.pr_fun "top" pc e;

The same holds for the other pretty printing functions, for patterns, structures, signatures, and so on.

To extend some level, in another file, the function find_pr_level can be used to get the level to be extended, e.g.

let expr_app = find_pr_level "apply" pr_expr.pr_levels in
expr_app.pr_rules :=
  extfun expr_app.pr_rules with
  [ <:expr< .... >> -> ...
  | <:expr< .... >> -> ... 
  | ... ];
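How a level is found by its label and then extended can be sketched with a self-contained model in normal OCaml syntax. Everything here is an assumption made for illustration: the rules return an option (None meaning "no rule matched") instead of being Extfun values, the function extend tries the new rules before the old ones, the level list is a fixed toplevel value, and pr_fun and find_pr_level are simplified versions that take no printing context.

```ocaml
(* Toy model of labelled levels and their extension. NOT the Camlp5 code:
   rules are plain functions returning an option instead of Extfun values. *)
type expr = Var of string | App of expr * expr | Int of int

type printer = expr -> string
type rules = printer -> printer -> expr -> string option (* None: no match *)

type level = { pr_label : string; mutable pr_rules : rules }

let levels =
  [ { pr_label = "apply";
      pr_rules =
        (fun curr next -> function
           | App (f, x) -> Some (curr f ^ " " ^ next x)
           | _ -> None) };
    { pr_label = "simple";
      pr_rules = (fun _ _ -> function Var v -> Some v | _ -> None) } ]

(* Apply the rules of the first level; if none matches, try the next level. *)
let rec pr_from lvls e =
  match lvls with
  | [] -> "<unprintable>"
  | lev :: rest ->
      let curr = pr_from lvls and next = pr_from rest in
      (match lev.pr_rules curr next e with
       | Some s -> s
       | None -> next e)

(* Start printing at the level named [label]. *)
let pr_fun label e =
  let rec drop = function
    | [] -> failwith ("unknown level " ^ label)
    | (lev :: _) as l when lev.pr_label = label -> l
    | _ :: rest -> drop rest
  in
  pr_from (drop levels) e

let find_pr_level label = List.find (fun lev -> lev.pr_label = label) levels

(* Extend a level in place: new rules first, then the previous ones. *)
let extend lev new_rules =
  let old = lev.pr_rules in
  lev.pr_rules <-
    (fun curr next e ->
       match new_rules curr next e with
       | Some s -> Some s
       | None -> old curr next e)

let sample = App (Var "f", Int 42)

(* Before the extension, no level knows how to print integers. *)
let before = pr_fun "apply" sample

let () =
  extend (find_pr_level "simple")
    (fun _ _ -> function Int n -> Some (string_of_int n) | _ -> None)

(* After the extension, the same call prints the integer. *)
let after = pr_fun "apply" sample
```

Since the rules are stored in a mutable field, the extension takes effect for every later call without touching the level list, which is the property that lets a separate file act as an optional "kit".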

Copyright 2007 Daniel de Rauglaudre (INRIA)
