3 Compilation

In this section, we present a compilation scheme close to the one described in [20, 1] and implemented in compilers such as the hbc compiler or the Objective Caml compiler. This classical scheme will be refined later into an optimized scheme, using the same notation and concepts.

3.1 Output of the match compiler

The compilation of pattern-matching is described by the scheme C that maps a clause matrix to a lambda-code expression. We now describe the specific lambda-code constructs that the scheme C outputs while compiling patterns.

Let-bindings: let (x l_x) l. Nested let-bindings are abbreviated as:

let (x₁ l₁) (x₂ l₂) ⋯ (x_n l_n) l
Static exceptions, exit and traps, catch l₁ with l₂. If exit is encountered while evaluating the body l₁, then the result of evaluating catch l₁ with l₂ is the result of evaluating the handler l₂; otherwise it is the result of evaluating l₁. By contrast with dynamic exceptions, static exceptions are directly compiled as jumps to the associated handlers (plus some environment adjustment, such as stack pops), whereas traps do not generate any code.
Switch constructs:

switch l with case c₁: l₁ ⋯ case c_k: l_k default: d

The result of a switch construct is the evaluation of the l_i corresponding to the constructor c_i appearing as the head of the value v of l. If the head constructor of v doesn't appear in the case list, the result is the evaluation of the default d expression.

The default clause default: d can be omitted. In such a case the switch behavior is unspecified on non-recognized values. Scheme C can thus omit the default clause when it is known that case lists will cover all possibilities at runtime. We use the keyword switch* to highlight switch constructs with no default clause.

Those switch constructs are quite sophisticated; they are compiled later into more basic constructs: tests, branches and jump tables. We in fact modified the Objective Caml compiler to improve the compilation of switch constructs, using techniques first introduced in the context of compiling the case statement of Pascal [3]. The key points are using range tests, which can typically be performed by a single (unsigned) test and branch plus possibly one addition, cutting sparse case lists into denser ones, and deciding whether a jump table or a test sequence is more appropriate in each situation. A survey of these techniques can be found in [19].
Accessors: field n x, where x is a variable and n is an integer offset. By convention, the first argument of non-constant constructors stands at offset zero.
Sequences: l₁; l₂ and units: ()
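
To make the constructs above concrete, here is a minimal sketch of them as an OCaml datatype, with a small interpreter that illustrates the intended meaning of catch, exit and switch. All names (value, lam, Static_exit, eval) are assumptions made for this illustration and do not come from the compiler. The interpreter models exit with an OCaml exception purely for convenience; as stated above, a real back end compiles static exceptions as jumps.

```ocaml
(* Hypothetical encoding of the lambda-code constructs; illustration only. *)
type value = Constr of string * value list   (* head constructor and arguments *)

type lam =
  | Const of value
  | Var of string
  | Let of string * lam * lam                        (* let (x l_x) l *)
  | Exit                                             (* exit *)
  | Catch of lam * lam                               (* catch l1 with l2 *)
  | Switch of lam * (string * lam) list * lam option (* None stands for switch* *)
  | Field of int * string                            (* field n x *)
  | Seq of lam * lam                                 (* l1; l2 *)
  | Unit                                             (* () *)

(* Assumption: exit is modelled by an OCaml exception in this interpreter. *)
exception Static_exit

let rec eval env = function
  | Const v -> v
  | Var x -> List.assoc x env
  | Let (x, lx, l) -> eval ((x, eval env lx) :: env) l
  | Exit -> raise Static_exit
  | Catch (l1, l2) -> (try eval env l1 with Static_exit -> eval env l2)
  | Switch (l, cases, default) ->
    let (Constr (c, _)) = eval env l in
    (match List.assoc_opt c cases, default with
     | Some li, _ -> eval env li
     | None, Some d -> eval env d
     | None, None -> failwith "switch*: behavior unspecified on this value")
  | Field (n, x) ->
    (* The first argument of a constructor stands at offset zero. *)
    let (Constr (_, args)) = List.assoc x env in
    List.nth args n
  | Seq (l1, l2) -> ignore (eval env l1); eval env l2
  | Unit -> Constr ("()", [])
```

In this sketch, a switch whose default is Exit and whose scrutinee has an unlisted head constructor transfers control to the handler of the enclosing Catch.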

3.2 Initial state

Input to the pattern matching compiler C consists of two arguments: a vector of variables x of size n and a clause matrix P → L of width n and height m.

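$$
x = (x_1\ x_2\ \cdots\ x_n), \qquad
P \to L =
\begin{pmatrix}
p_1^1 & p_2^1 & \cdots & p_n^1 & \to & l^1 \\
p_1^2 & p_2^2 & \cdots & p_n^2 & \to & l^2 \\
      &       & \vdots &       &     &     \\
p_1^m & p_2^m & \cdots & p_n^m & \to & l^m
\end{pmatrix}
$$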

The initial matrix is generated from source input. Given a pattern-matching expression (in Caml syntax):

match x with | p¹ -> e¹ | p² -> e² … | p^m -> e^m

The initial call to C is:

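$$
\mathtt{catch}\ C\left((x),
\begin{pmatrix}
p^1 & \to & l^1 \\
p^2 & \to & l^2 \\
    & \vdots &  \\
p^m & \to & l^m
\end{pmatrix}\right)\ \mathtt{with}\ (\mathtt{failwith}\ \texttt{"Partial match"})
$$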

Here the lⁱ's are the translations to lambda-code of the eⁱ's, and (failwith "Partial match") is a runtime failure that occurs when the whole pattern matching fails.

3.3 Classical scheme

By contrast with previous presentations, we assume that matrix P → L has at least one row (i.e. m > 0). This condition simplifies our presentation, without restricting its generality. Hence, scheme C is defined by cases on non-empty clause matrices:

If n is zero (i.e. when there are no more columns), then the first row of P matches the empty vector ():

$$
C\left((),
\begin{pmatrix}
\to & l^1 \\
\to & l^2 \\
\vdots & \\
\to & l^m
\end{pmatrix}\right) = l^1
$$
If n is not zero, then a simple compilation is possible, using the following four rules.
1. If all patterns in the first column of P are variables, y¹, y², ..., y^m, then:
  C(x, P → L) = C((x₂ x₃… x_n), P' → L')
  where
  
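  $$
  P' \to L' =
  \begin{pmatrix}
  p_2^1 & \cdots & p_n^1 & \to & \mathtt{let}\ (y^1\ x_1)\ l^1 \\
  p_2^2 & \cdots & p_n^2 & \to & \mathtt{let}\ (y^2\ x_1)\ l^2 \\
        & \vdots &       &     &                               \\
  p_2^m & \cdots & p_n^m & \to & \mathtt{let}\ (y^m\ x_1)\ l^m
  \end{pmatrix}
  $$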
  We call this rule the variable rule. It also handles wild-card patterns: they are treated like variables, except that the let-binding is omitted.
2. If all patterns in the first column of P are constructor patterns c(q₁, … , q_a), then let C be the set of matched constructors, that is, the set of the head constructors of the p₁ⁱ's.
  
  Then, for each constructor c in C, we define the specialized clause matrix S(c,P → L) by mapping the following transformation on the rows of P.
  
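  $$
  \begin{array}{l|l}
  p_1^i & S(c, P \to L) \\ \hline
  c(q_1^i, \ldots, q_a^i) & q_1^i\ \cdots\ q_a^i\ \ p_2^i\ \cdots\ p_n^i \to l^i \\
  c'(q_1^i, \ldots, q_{a'}^i)\ \ (c' \neq c) & \text{No row}
  \end{array}
  $$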
  (Matrices S(c,P → L) and P → L define the same matching predicate when x₁ is bound to some value c(v₁, … , v_a).) Furthermore, for a given constructor c of arity a, let y₁, … ,y_a be fresh variables. Then, for any constructor c in C, we define the lambda-expression r(c):
  
    (let (y₁ (field 0 x₁))   ...   (y_a (field (a−1) x₁))   C((y₁, … ,y_a,x₂, … ,x_n),S(c,P → L)))
  
  Finally, assuming C = {c₁,…, c_k}, the compilation result is:
  
    switch x₁ with   case c₁: r(c₁) ⋯ case c_k: r(c_k)   default: exit
  (Note that the default clause can be omitted when C makes up a full signature.) We call this rule the constructor rule.
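3. If P has only one row and this row starts with an or-pattern:
  
  $$
  P =
  \begin{pmatrix}
  (q_1 \mid \cdots \mid q_o) & p_2 & \cdots & p_n & \to & l
  \end{pmatrix},
  $$
  
  then the compilation result is:
  
  $$
  C\left((x_1),
  \begin{pmatrix}
  q_1 & \to & () \\
      & \vdots &  \\
  q_o & \to & ()
  \end{pmatrix}\right);\
  C\left((x_2\ \cdots\ x_n),\ (p_2\ \cdots\ p_n \to l)\right)
  $$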
  This rule is the orpat rule. Observe that it duplicates neither patterns nor actions. However, variables in or-patterns are not supported, since, in clause q_i → (), the scope of the variables of q_i is the action “()”.
4. Finally, if none of the previous rules applies, the clause matrix P → L is cut into two clause matrices P₁ → L₁ and P₂ → L₂, such that P₁ → L₁ is the largest prefix of P → L for which one of the variable, constructor or orpat rules applies.
  
  Then the compilation result is:
  
  catch C(x, P₁ → L₁) with C(x, P₂ → L₂)
  This rule is the mixture rule.
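
The four rules can be put together in a short OCaml sketch. It is only an illustration under simplifying assumptions, not the code of any actual compiler: the types pat, lam and row and all function names are hypothetical, actions are opaque strings, fresh variables come from a global counter, and the default clause is always emitted because the sketch has no access to constructor signatures.

```ocaml
(* Hypothetical, simplified types: none of these names come from the paper. *)
type pat =
  | PVar of string              (* variable pattern *)
  | PWild                       (* wild-card *)
  | PCon of string * pat list   (* constructor pattern c(q1, ..., qa) *)
  | POr of pat list             (* or-pattern (q1 | ... | qo), without variables *)

type lam =
  | Action of string            (* an opaque action l^i *)
  | Unit
  | Exit
  | Var of string
  | Field of int * string
  | Let of string * lam * lam
  | Seq of lam * lam
  | Catch of lam * lam
  | Switch of string * (string * lam) list * lam option

type row = pat list * lam
type kind = KVar | KCon | KOr

let kind_of_row : row -> kind = function
  | (PVar _ | PWild) :: _, _ -> KVar
  | PCon _ :: _, _ -> KCon
  | POr _ :: _, _ -> KOr
  | [], _ -> invalid_arg "kind_of_row: row has no column"

(* Fresh variables y1, y2, ... (global counter, for illustration only). *)
let counter = ref 0
let fresh () = incr counter; Printf.sprintf "y%d" !counter

(* Longest prefix whose elements satisfy [f], and the remainder. *)
let rec span f = function
  | x :: xs when f x -> let a, b = span f xs in (x :: a, b)
  | xs -> ([], xs)

(* Mixture rule: cut off the largest prefix handled by a single rule. *)
let cut (rows : row list) =
  match rows with
  | [] -> invalid_arg "cut: empty matrix"
  | r :: rest ->
    (match kind_of_row r with
     | KOr -> ([ r ], rest)     (* the orpat rule needs exactly one row *)
     | k -> let p, s = span (fun r' -> kind_of_row r' = k) rest in (r :: p, s))

(* Variable rule on one row: bind y^i to x1, or just drop a wild-card. *)
let variable_row x : row -> row = function
  | PVar y :: ps, l -> (ps, Let (y, Var x, l))
  | PWild :: ps, l -> (ps, l)
  | _ -> invalid_arg "variable_row"

(* S(c, P -> L): keep rows headed by c and expose their arguments. *)
let specialize c (rows : row list) : row list =
  List.filter_map (function
      | PCon (c', qs) :: ps, l when c' = c -> Some (qs @ ps, l)
      | _ -> None) rows

let rec compile (xs : string list) (rows : row list) : lam =
  match xs, rows with
  | _, [] -> invalid_arg "compile: the matrix must have at least one row"
  | [], (_, l) :: _ -> l        (* n = 0: action of the first row *)
  | x :: xs', first :: _ ->
    let p1, p2 = cut rows in
    let head =
      match kind_of_row first with
      | KVar -> compile xs' (List.map (variable_row x) p1)
      | KCon -> constructor_rule x xs' p1
      | KOr -> orpat_rule x xs' first
    in
    (match p2 with
     | [] -> head
     | _ -> Catch (head, compile xs p2))

and constructor_rule x xs' rows =
  let add acc = function
    | PCon (c, qs) :: _, _ when not (List.mem_assoc c acc) ->
      acc @ [ (c, List.length qs) ]
    | _ -> acc
  in
  let case (c, a) =
    let ys = List.init a (fun _ -> fresh ()) in
    let body = compile (ys @ xs') (specialize c rows) in
    let bind i y = (y, Field (i, x)) in
    (c, List.fold_right (fun (y, f) l -> Let (y, f, l)) (List.mapi bind ys) body)
  in
  Switch (x, List.map case (List.fold_left add [] rows), Some Exit)

and orpat_rule x xs' = function
  | POr qs :: ps, l ->
    Seq (compile [ x ] (List.map (fun q -> ([ q ], Unit)) qs),
         compile xs' [ (ps, l) ])
  | _ -> invalid_arg "orpat_rule"
```

For instance, on the variable "x" and the two rows Nil → a and z → b, the first row alone forms the prefix P₁ (constructor rule) and the second row forms P₂ (variable rule), so the sketch returns a catch whose body is a switch on x with the single case Nil and default exit, and whose handler is let (z x) b.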

This paper doesn't deal with optimizing let-bindings, which are carelessly introduced by scheme C. This job is left to a later compilation phase.
