-
If n is zero. then we have:
C*((),
|
|
, ex, def, ctx) = l1, ∅
|
Observe that the jump summary is empty since no exit is outputed.
- With respect to section 3.3,
the variable rule only changes as regards the
extra arguments ex, def and ctx.
We only describe these changes.
The performed recursive call
returns code l and jump summary ρ :
l,ρ =
C*(…, …, ex, ⇓(def), ⇓(ctx))
Exhaustiveness information ex does not change,
while def and ctx are pushed.
The variable rule returns l unchanged and ρ popped.
- In the constructor rule, let C = {c1, … ,ck} be the
matched constructors, let also Σ be the signature of their
type.
For a given constructor c ∈ C,
the performed recursive call is:
C*(…, …, ex, S(c,def), S(c, ctx))
Exhaustiveness information ex is passed unchanged,
while the other two extra arguments
are specialized (specialization of trap handlers being the natural
extension of matrix specialization).
Each recursive call returns a lambda-code l(c) and a jump summary
ρc.
Lambda-code l(c) gets wrapped into let-bindings like in
section 3.3, yielding the final lambda-code r(c).
We then define a case list L and a jump summary
ρrec as
follows:
L = case c1: r(c1) ⋯
case ck: r(ck) |
ρrec = { …, i ↦
|
|
COL(ρc(i)), … } |
|
The case list is as before, while the jump summary is the union of
the the jump summaries produced by recursive calls, once
collected.
Optimizations are then performed.
For clarity, optimizations are described as a two phase process:
first,
extend (or not extend) the case list L
with constructors taken from Σ\ C,
and add (or not add) a default case;
then, compute the final jump summary.
A first easy case is when Σ\ C is empty or
when ex is total. Then, the case list L is not augmented.
Otherwise, we distinguish two cases :
-
If Σ\ C is finite,
then for all constructors c in this set we consider
the context
Qc•Q'c = EX(c(_, … , _),ctx)
Then, trap handlers (P1, e1) ; … ; (Pt, et)
are scanned left-to-right,
stopping at the smallest i, such that the intersection
Q'c ∩ Pi is not empty.
That is, we find the trap handler where to jump to when
the head constructor of x1 is c, in order
to extend the case list as follows :
L = L case
c:
(exit
ei)
It is possible that ei does not exist
(when Q'c is empty).
This means that x1 head constructor will never be c at runtime.
- If Σ\ C is infinite (as in the case of integers)
or considered too large (as it might be in the case of characters),
then, a default case is added to the case list :
L = L default: (exit
e1)
That is, all non-recognized constructors lead to a jump the nearest
enclosing reachable trap-handler.
However it is still possible to extend the case list for particular
constructors, applying the previous procedure (a) to the constructors that
appear in the first column of reachable trap handler matrices and not
in C.
The final jump summary is computed by considering the final case list L.
For a given trap handler number ei let
{c'1, … ,c'k'} be the set of constructors such that
case c'
j
:
(exit e
i
)
appears in L.
Then the jump summary ρei is defined as:
ρei = { ei ↦
EX(c'1(_, … , _) ∣⋯ ∣
c'k'(_, … , _)), ctx) }
Moreover, if there is a default clause, the jump summary ρd is
defined as:
ρd = { e1 ↦ ctx}
Finally the constructor rule returns a switch on case list L and
the jump summary built by performing the union of
ρrec, of all ρei's and, when
appropriate, of ρd.
The constructor rule performs many context unions, so that
contexts may become huge.
Fortunately, contexts can be made smaller using a simple observation.
Namely, let p and q be two rows in a context,
such that p is less precise than q (i.e., all
instances of q are instances of p).
Then, row q can be removed from the context, without modifying
its meaning as a set of value vectors.
Hence, while performing context union, one can leave aside some
pattern rows.
If the produced context is still too large,
then contexts are safely approximated by first replacing some patterns
in them by wild-cards (typically all the pattern in a given column) and
then removing rows using the previous remark.
Rough experiments lead us to set the
maximal admissible context size to 32 rows,
yielding satisfactory compilation time in pathological examples
and exact contexts in practical examples.
- Or-pattern compilation operates on matrices whose first column
contains at least one or-pattern.
Additionally, when p1i is a or-pattern, then for all j, i < j ≤ m
one of the following, mutually exclusive, conditions must hold:
-
p1i and p1j are not compatible.
- p1i and p1j are compatible,
and (p2i… pni) is less precise than
(p2j… pnj)
Conditions (a) and (b) guarantee that, whenever p1i
matches the first value vector v1 of a value v, but row i
does not
match v, then no further row in P
matches v either. This is necessary since further rows of P
won't be reachable in case of failure in the or-pattern trap handler.
Now, consider one row number i, such that
p1i is the or-pattern q1 ∣⋯ ∣qo.
Further assume that this or-pattern binds the variables
y1, … ,yv.
First, we allocate a fresh trap number e and
divide P → L into the following or-body
P' → L' and or-trap P'' → L'' clauses:
P' → L' =
|
⎛
⎜
⎜
⎜
⎜
⎜
⎜
⎜
⎝ |
⋮ |
p1i−1 |
… |
pni−1 |
→ |
li−1 |
q1 |
… |
_ |
→ |
(exit e y1...yv) |
⋮ |
qo |
… |
_ |
→ |
(exit e y1...yv) |
p1i+1 |
… |
pni+1 |
→ |
lj+1 |
⋮ |
|
⎞
⎟
⎟
⎟
⎟
⎟
⎟
⎟
⎠ |
|
In the or-body matrix, observe that the or-pattern is expanded, while the
other patterns in row number i are replaced by wild-cards
and the action is replaced by exits.
Recursive calls are performed as follows:
l',ρ' |
= |
C*(x, P' → L', ex, def, ctx) |
l'', ρ'' |
= |
… |
… C*(x2 ↔ n, P'' → L'',
ex, ⇓(EX(p,def)), ⇓(EX(p,ctx))) |
Outputed code finally is
catch l' with (e y
1
... y
v
) l''
and the returned jump summary is ρ = ρ' ∪ ⇑(ρ'').
- The mixture rule is responsible for feeding the other rules with
appropriate clause matrices.
We first consider the case of a random division.
Hence let us cut P → L into Q → M and
R → N at some row.
Then a fresh trap number e is allocated and
a first recursive call is performed:
lq, ρq = C*(x, Q → M, partial, (R, e) ; def, ctx)
The exhaustiveness information is partial, since
nothing about the exhaustiveness of Q derives from the exhaustiveness
of P. Reachable trap handlers are extended.
Then, a second recursive call is performed:
lr, ρr = C*(x, R → N, ex, def, ρq(e))
It is no surprise that the context argument to the new call is
extracted from the jump summary of the previous call.
Argument ex does not change.
Indeed, if matching by P cannot fail, then matching by R neither can.
Then, the scheme can output the code
l = catch
lq with (
e)
lr
and return the jump summary
(ρq \ { e }) ∪ ρr,
where ρq \ { e } stands for
ρq with the binding for e removed.
Of course, our optimizing compiler does not perform a random division
into two matrices. It instead divides P → L right away into
several sub-matrices.
This can be described formally as several, clever, applications of
the random mixture rule,
so that one of the three previous rules apply
to each matrix in the division.
The aim of the optimizing mixture rule is thus
to perform a division of P into as few sub-matrices as possible.
We present a simple, greedy, approach that
scans P downwards.
We only describe the case when p11 is a constructor pattern.
Thus, having performed the classical mixture rule,
we are in a situation
where the i topmost rows of P have a
constructor pattern in first position (i.e. are constructor rows for short)
and where p1i+1 is not
a constructor pattern.
At that point, a matrix C has been built, which encompasses
all the rows of P from 1 to i.
Let us further write P' for what remains of P, and let
O and R be two new, initially empty matrices.
We then scan the rows of P' from top to bottom, appending them at the
end of C, O or R.
That is, given row number j in P':
-
If p'1j is a variable, then append
row j at the end of R.
- If p'1j is a constructor pattern, then...
-
If row j is not compatible with all the rows of
both R and O, then append row j at the end of
C (i.e., move row j above all the rows that have been
extracted from P' at previous stages).
- If row j is not compatible with all the rows of
R and that one of conditions (a) or (b) for applying the
or-pattern rule are met by O with row j
appended at the end, then do such an append.
- Otherwise, append row j at the end of R.
- If p'1j is a or-pattern, then consider cases (ii) and (iii).
When the scan of P' is over, three matrices, C, O and R have
been built.
In the case where O is empty,
matrix C is valid input to the constructor rule;
otherwise,
appending the rows of O at the end of C yields valid input for
applying
(maybe more than once) the or-pattern rule,
which will in turn yield valid input to
the constructor rule (provided that (_ |
...)
or
patterns have been replaced by semantically equivalent wild-cards in a
previous phase).
Thus, the matrix built by appending O at the end of C is
recorded into the overall division and the division process is
restarted with input R, unless R is empty.
Finally, the full process divides the input matrix P into several
matrices, each of which is valid input to the other rules of the
compilation scheme.