7 Experimental results

7 Experimental results

We compare the performance of the code generated by the Objective-Caml compilers version 3.00 and 3.01, where the former implements the scheme of section 3.3 and the latter implements our new optimizing scheme (there are other differences of minor relevance to our purpose). For most programs there is little difference; this is natural since pattern-matching usually accounts for a small fraction of most programs running time. A full analysis of the efficiency of our optimizations would in fact require counting relevant instructions (test, branches and indirect branches through jump tables), both statically and dynamically. By lack of time, we only present some programs that demonstrate significant improvement.

Our first benchmark is the traditionnal fib, that we write using a or-pattern.

let rec fib n = match n with | (0|1) -> 1 | _ -> fib (n-1) + fib (n-2)

Here, we simply measure the execution time of computing fib 38. Our second benchmark, pcf, is a byte-code compiler and interpreter for PCF. We compute the geometric mean of the execution time for a set of five different PCF programs. The time-consuming part of this program is the byte-code machine which we coded in the style of the byte-code machine included in [14], the winning entry of the 2000 ICFP programming contest. (we also give figures for this program under the name raytrace).

Experiments were performed on a lightly loaded 366Mhz Pentium Pro Linux PC. The tables show wall-clock times (in seconds) and ratios:

	`fib`		`raytrace`		`pcf`
`V 3.00`	5.36	100	1.69	100	8.12	100
`V 3.01`	3.74	71	1.62	96	5.08	63

Obviously, as demonstrated by the fib example, compilation of or-patterns has much improved. Testing similar examples confirms that fact. Improvements also comes from the better compilation of switches. The pcf example is more interesting, it shows that our optimizations yield a 37% speed-up, in the case of a typical ML application (a quickly written, compact, prototype implementation of some programming language). The raytrace example exhibits less important improvements on the whole test suite of the contest; however, improvements are noticeable for some inputs.

It should also be noticed that the new compiler somehow equates the runtime performance of various coding styles, a feature that is important for a high-level construct such as pattern-matching. Variations in coding style include the relative ordering of non-overlapping patterns and on the order of arguments in pairs.

We also performed measurements on a 500Mhz Dec Alpha server. They suggest that the effects of our optimization do not depend on the targeted architecture.

	`fib`		`pcf`
`V 3.00`	3.4	100	4.13	100
`V 3.01`	2.5	74	2.86	69

The raytrace example is is omitted because it relies on IEEE floating point arithmetic, which is not implemented in the Objective Caml compiler for this architecture.

More detailed information on these benchmarks is available at http://caml.inria.fr/pattern/speed.html.