Most languages allow type names and variable names to be redefined. This may cause difficulties when typechecking uses type names: assuming a type name t and a variable x of type t, redefining t to be a different type invalidates the typing hypothesis x:t. Hence we need to distinguish between the two types named t. To this end, we introduce a notion of identifiers distinct from names: each identifier has a name, but it also records the binding location of this name; hence, two identifiers with the same name but different binding locations are distinct identifiers [Cardelli-Quest-impl]. The abstract type of identifiers has the following signature:
module type IDENT =
sig
type t
val new: string -> t
val name: t -> string
val equal: t -> t -> bool
type 'a tbl
val emptytbl: 'a tbl
val add: t -> 'a -> 'a tbl -> 'a tbl
val find: t -> 'a tbl -> 'a
end
new returns a fresh identifier with the name given as argument;
name returns the name of the given identifier; equal checks the
equality (same binding location) of two identifiers.
The parameterized type 'a tbl implements applicative dictionaries associating
identifiers to data of type 'a; add returns the given dictionary
enriched with an (identifier, data) pair; find retrieves the data
associated with an identifier, raising the Not_found exception if
the identifier is unbound. Here is a sample implementation of IDENT, representing identifiers as pairs of a name and an integer stamp incremented at each new operation, and 'a tbl as association lists (balanced binary trees would be better).
module Ident : IDENT =
struct
type t = {name: string; stamp: int}
let currstamp = ref 0
let new s = currstamp := !currstamp + 1; {name = s; stamp = !currstamp}
let name id = id.name
let equal id1 id2 = (id1.stamp = id2.stamp)
type 'a tbl = (t * 'a) list
let emptytbl = []
let add id data tbl = (id, data) :: tbl
let rec find id1 = function
[] -> raise Not_found
| (id2, data) :: rem -> if equal id1 id2 then data else find id1 rem
end
We refer to named types, values (variables), and modules either by identifier (if we are in the scope of their binding) or via the dot notation, e.g. M.x to refer to the x component of module M. The data type path represent both kinds of references:
type path =
Pident of Ident.t (* identifier *)
| Pdot of path * string (* access to a module component *)
Since modules can be nested, paths may be arbitrarily long,
e.g. M.N.P.x, which reads ((M.N).P).x.
Notice that access to a module component is by name (the second argument of
Pdot is a string) and not by identifier. The
reason is that the access is not in the scope of the identifier
binding. To avoid ambiguity, we require that all components of a
module (at the same nesting level) have distinct names.
Path equality extends naturally identifier equality:
let rec path_equal p1 p2 =
match (p1, p2) with
(Pident id1, Pident id2) -> Ident.equal id1 id2
| (Pdot(r1, field1), Pdot(r2, field2)) -> path_equal r1 r2 & field1 = field2
| (_, _) -> false
For typechecking modules, we will need to substitute paths for identifiers. Substitutions are defined by the following signature:
module type SUBST =
sig
type t
val identity: t
val add: Ident.t -> path -> t -> t
val path: path -> t -> path
end
Subst.add i p sigma extends the substitution sigma by [i <- p].
Subst.path p sigma applies sigma to the path p.
Here is a sample implementation of SUBST, where substitutions are
represented as dictionaries from identifiers to paths (type path
Ident.tbl).
module Subst : SUBST =
struct
type t = path Ident.tbl
let identity = Ident.emptytbl
let add = Ident.add
let rec path p sub =
match p with
Pident id -> (try Ident.find id sub with Not_found -> p)
| Pdot(root, field) -> Pdot(path root sub, field)
end
The abstract syntax for the base language is provided as an implementation of the following signature:
module type CORE_SYNTAX =
sig
type term
type val_type
type def_type
type kind
val subst_valtype: val_type -> Subst.t -> val_type
val subst_deftype: def_type -> Subst.t -> def_type
val subst_kind: kind -> Subst.t -> kind
end
The type term is the abstract syntax tree for definitions of value names:
a value expression in a functional language, a variable declaration or
procedure definition in a procedural language, or a set of clauses
defining a predicate in a logic language. The type val_type
represents type expressions for these terms;
def_type represents the type expressions that can be bound to a type
name. In many languages, val_type and def_type are identical, but
ML, for instance, has type schemes for val_type, but type
constructors (type expressions possibly parameterized by other types)
for def_type. Finally, the type kind describes the
various kinds that a def_type may have. Many languages have only one
kind of definable types; in ML, the kind of a def_type is the arity
of a type constructor.
Given the syntax for a core language (a module of signature CORE_SYNTAX), we build the abstract syntax structure for the module language specified below. The core language syntax is re-exported as a substructure Core of the module language syntax, in order to record the core language on top of which the module language is built; the remainder of the signature refers to the core language a.s.t. types as components of the Core substructure.
module type MOD_SYNTAX =
sig
module Core: CORE_SYNTAX (* the core syntax we started with *)
type type_decl =
{ kind: Core.kind;
manifest: Core.def_type option } (* either abstract or manifest *)
type mod_type =
Signature of signature (* sig ... end *)
| Functor_type of Ident.t * mod_type * mod_type (* functor(X: mty) mty *)
and signature = specification list
and specification =
Value_sig of Ident.t * Core.val_type (* val x: ty *)
| Type_sig of Ident.t * type_decl (* type t [= ty] *)
| Module_sig of Ident.t * mod_type (* module X: mty *)
type mod_term =
Longident of path (* X or X.Y.Z *)
| Structure of structure (* struct ... end *)
| Functor of Ident.t * mod_type * mod_term (* functor (X: mty) mod *)
| Apply of mod_term * mod_term (* mod1(mod2) *)
| Constraint of mod_term * mod_type (* (mod : mty) *)
and structure = definition list
and definition =
Value_str of Ident.t * Core.term (* let x = expr *)
| Type_str of Ident.t * Core.kind * Core.def_type (* type t = ty *)
| Module_str of Ident.t * mod_term (* module X = mod *)
val subst_typedecl: type_decl -> Subst.t -> type_decl
val subst_modtype: mod_type -> Subst.t -> mod_type
end
Module terms (type mod_term) denote either structures or
functors. Structures are sequences of definitions: of a value
identifier equal to a core term, of a type identifier equal to a
definable core type, or of a (sub-)module identifier equal to a module
term. Functors are parameterized module terms, i.e.. functions from
module terms to module terms; a module type is explicitly given for
the parameter. Other module terms are module identifiers and access
paths (Longident), referring to module terms bound elsewhere;
applications of a functor to a module (Apply); and restriction of a
module term by a module type (Constraint).Module types are either signatures or functor types. Functor types are dependent function types: they consist of a module type for the argument, a module type for the result, and a name for the argument, which may appear in the result type. A signature describes the interface of a structure, as a sequence of type specifications for identifiers bound in the structure. Value specifications are of the form ``this value identifier has that value type''; module specifications, ``this module identifier has that module type''. Type specifications consist of a kind and an optional definable type revealing the implementation of the type; the type identifier is said to be manifest if its implementation is shown in the specification, and abstract otherwise. Manifest types play an important role for recording type equalities, propagating them through functors, and express so-called sharing constraints between functor arguments [Leroy-manifest-types]. Not all components of a structure need to be specified in a matching signature: identifiers not mentioned in the signature are hidden and remain local to the structure.
The functor that takes an implementation of CORE_SYNTAX and returns the corresponding implementation of MOD_SYNTAX is trivial:
module Mod_syntax(Core_syntax: CORE_SYNTAX) =
struct
module Core = Core_syntax
type type_decl = ... (* as in the signature MOD_SYNTAX *)
type mod_type = ...
type mod_term = ...
let subst_typedecl decl sub =
{ kind = Core.subst_kind decl.kind sub;
manifest = match decl.manifest with
None -> None
| Some dty -> Some(Core.subst_deftype dty sub) }
let rec subst_modtype mty sub =
match mty with
Signature sg -> Signature(List.map (subst_sig_item sub) sg)
| Functor_type(id, mty1, mty2) ->
Functor_type(id, subst_modtype mty1 sub, subst_modtype mty2 sub)
and subst_sig_item sub = function
Value_sig(id, vty) -> Value_sig(id, Core.subst_valtype vty sub)
| Type_sig(id, decl) -> Type_sig(id, subst_typedecl decl sub)
| Module_sig(id, mty) -> Module_sig(id, subst_modtype mty sub)
end
The substitution functions are simple morphisms over declarations and
module types, calling the substitution functions from Core_syntax to
deal with core-language types and kinds. They assume that identifiers
are bound at most once, so that name captures cannot occur.
Type-checking for the base language necessitates type information for module identifiers, in order to type module accesses such as M.x. Before specifying the base-language typechecker, we therefore need to develop an environment structure that records type information for value, type and module identifiers, and answers queries such as ``what is the type of the value M.x?''.
module type ENV =
sig
module Mod: MOD_SYNTAX
type t
val empty: t
val add_value: Ident.t -> Mod.Core.val_type -> t -> t
val add_type: Ident.t -> Mod.type_decl -> t -> t
val add_module: Ident.t -> Mod.mod_type -> t -> t
val add_spec: Mod.specification -> t -> t
val add_signature: Mod.signature -> t -> t
val find_value: path -> t -> Mod.Core.val_type
val find_type: path -> t -> Mod.type_decl
val find_module: path -> t -> Mod.mod_type
end
Environments are handled in a purely applicative way, without side-effects: each add operation leaves the original environment unchanged and returns a fresh environment enriched with the given binding. add_value records the value type of a value identifier; add_type, the declaration of a type identifier; add_module, the module type of a module identifier. add_spec records one of the three kinds of bindings described by the given specification; add_signature records in turn all specifications of the given signature.
Below is a simple implementation of environments, parameterized by an A.S.T. structure for modules.
module Env(Mod_syntax: MOD_SYNTAX) =
struct
module Mod = Mod_syntax
type binding =
Value of Mod.Core.val_type
| Type of Mod.type_decl
| Module of Mod.mod_type
type t = binding Ident.tbl
let empty = Ident.emptytbl
For simplicity, all three kinds of bindings are stored in the same
table. This does not preclude the language from having several name
spaces (e.g. a type and a value can have the same name): names from
different spaces are bound at different locations, and therefore
receive distinct identifiers. The add functions are straightforward:
let add_value id vty env = Ident.add id (Value vty) env
let add_type id decl env = Ident.add id (Type decl) env
let add_module id mty env = Ident.add id (Module mty) env
let add_spec item env =
match item with
Mod.Value_sig(id, vty) -> add_value id vty env
| Mod.Type_sig(id, decl) -> add_type id decl env
| Mod.Module_sig(id, mty) -> add_module id mty env
let add_signature = List.fold_right add_spec
The find functions returns the typing information associated with a
path in an environment. If the input path is just an identifier, then
a simple lookup in the environment suffices. If the path is a dot
access, e.g. M.x, the signature of M is looked up in the
environment, then scanned to find its x field and the associated
type informations. Moreover, some substitutions are required to preserve
the dependencies between signature components. Assume for instance
that the module M has the following signature:
M : sig type t val x: t end
Then, the type of the value M.x is not t as indicated in the
signature (that t becomes unbound once lifted out of the signature),
but M.t. More generally, in the type of a component of a signature,
all identifiers bound earlier in the signature must be prefixed by the
path leading to the signature. This substitution can either be
performed each time a path is looked up, or, more efficiently, be
computed in advance when a module identifier with a signature type is
introduced in the environment.
Below is a naive implementation where the substitution is computed and
applied at path lookup time.
let rec find path env =
match path with
Pident id ->
Ident.find id env
| Pdot(root, field) ->
match find_module root env with
Mod.Signature sg -> find_field root field Subst.identity sg
| _ -> error "structure expected in dot access"
and find_field p field subst = function
[] -> error "no such field in structure"
| Mod.Value_sig(id, vty) :: rem ->
if Ident.name id = field
then Value(Mod.Core.subst_valtype vty subst)
else find_field p field subst rem
| Mod.Type_sig(id, decl) :: rem ->
if Ident.name id = field
then Type(Mod.subst_typedecl decl subst)
else find_field p field (Subst.add id (Pdot(p, Ident.name id)) subst) rem
| Mod.Module_sig(id, mty) :: rem ->
if Ident.name id = field
then Module(Mod.subst_modtype mty subst)
else find_field p field (Subst.add id (Pdot(p, Ident.name id)) subst) rem
and find_value path env =
match find path env with Value vty -> vty | _ -> error "value field expected"
and find_type path env =
match find path env with Type decl -> decl | _ -> error "type field expected"
and find_module path env =
match find path env with Module mty -> mty | _ -> error "module field expected"
end
As the reader may have noticed, error handling is extremely simplified
in this paper: we assume given an error function that prints a
message and aborts. Similarly, Not_found exceptions raised by
Ident.find are not handled. A better implementation would use
exceptions to gather more context before printing the error.
The type-checker for the base language must implement the following signature:
module type CORE_TYPING =
sig
module Core: CORE_SYNTAX
module Env: ENV with module Mod.Core = Core
(* Typing functions *)
val type_term: Env.t -> Core.term -> Core.val_type
val kind_deftype: Env.t -> Core.def_type -> Core.kind
val check_valtype: Env.t -> Core.val_type -> unit
val check_kind: Env.t -> Core.kind -> unit
(* Type matching functions *)
val valtype_match: Env.t -> Core.val_type -> Core.val_type -> bool
val deftype_equiv: Env.t -> Core.def_type -> Core.def_type -> bool
val kind_match: Env.t -> Core.kind -> Core.kind -> bool
val deftype_of_path: path -> Core.kind -> Core.def_type
(* Elimination of dependencies *)
val nondep_valtype: Env.t -> Ident.t -> Core.val_type -> Core.val_type
val nondep_deftype: Env.t -> Ident.t -> Core.def_type -> Core.def_type
val nondep_kind: Env.t -> Ident.t -> Core.kind -> Core.kind
end
The Core and Env components record the a.s.t. types and the
environment structure over which the type-checker is built. Of course,
the environment structure must be compatible with the a.s.t. structure: in
SML parlance, some of their type components must share. In our system,
this is expressed by the notation ENV with module Mod.Core = Core,
which is actually syntactic sugar for the following signature that
enriches ENV with type equalities over its Mod.Core component:
sig
module Mod: sig
module Core: sig
type term = Core.term
type val_type = Core.val_type
type def_type = Core.def_type
type kind = Core.kind
(* remainder of CORE_SYNTAX unchanged *)
end
(* remainder of MOD_SYNTAX unchanged *)
end
(* remainder of ENV unchanged *)
end
The main typing function is type_term, which takes a term and an environment, and returns the principal type of the term in that environment (principal w.r.t. the valtype_match ordering on value types). Depending on the base language, this function implements type inference (propagate types from the declarations of variables and function parameters) or ML-style type reconstruction (guess the types of function parameters as well). For simplicity, all typing functions are assumed to print a message and abort on error.
Three auxiliary functions kind_deftype, check_valtype and check_kind check the well-formedness of type and kind expressions in an environment, in particular that all type paths are bound and all kind constraints are met. In addition, kind_deftype infers and returns the kind of the given definable type.
The three predicates valtype_match, deftype_equiv and kind_match are used when checking an implementation against a specification, e.g. a structure against a signature. In a language with subtyping, valtype_match e t_1 t_2 checks that the type t_1 is a subtype of t_2 in the environment e; in a language with ML-style polymorphism, that t_1 is a type schema more general than t_2; in a language with coercions, that t_1 can be coerced into t_2. Similarly, deftype_equiv e t_1 t_2 checks that the definable types t_1 and t_2 are equivalent (identical modulo the type equalities induced by manifest type specifications contained in e).
The functions nondep_valtype, nondep_deftype and nondep_kind are used to eliminate all occurrences of a given identifier in the given type or kind (see section (ref)).
Finally, deftype_of_path transforms a type path and its kind into the corresponding definable type. For instance, in the case of ML, given the path t and the arity 2, it returns the parameterized type ('a, 'b) |-> ('a, 'b) t.
The type-checker for the module language has the following interface:
module type MOD_TYPING =
sig
module Mod: MOD_SYNTAX
module Env: ENV with module Mod = Mod
val type_module: Env.t -> Mod.mod_term -> Mod.mod_type
val type_definition: Env.t -> Mod.definition -> Mod.specification
end
The main entry point is type_module, which infers and returns the
type of a module term. The intended usage for a separate compiler
is to parse a whole implementation file as a module term, then pass it to
type_module. If an interface file is also given, type_module
should be applied to the constrained term (m:M), where m is the
implementation (a module term) and M the interface (a module type).
The alternate entry point type_definition is intended for
interactive use: the toplevel loop reads a definition, infers its
specification, and prints the outcome.The implementation of the type-checker is parameterized by an A.S.T. structure, an environment structure, and a type-checker for the core language, all three operating on compatible types:
module Mod_typing
(TheMod: MOD_SYNTAX)
(TheEnv: ENV with module Mod = TheMod)
(CT: CORE_TYPING with module Core = TheMod.Core and module Env = TheEnv) =
struct
module Mod = TheMod
module Env = TheEnv
open Mod (* Allows to omit the `Mod.' prefix -- saves on typing *)
let rec modtype_match env mty1 mty2 = ... (* see section 2.9 *)
let rec strengthen_modtype path mty = ... (* see section 2.10 *)
let nondep_modtype env id mty = ... (* see section 2.11 *)
We postpone the definition of the three auxiliary functions above to the
following sections. The check_modtype function below checks the
well-formedness of a user-supplied module type --- in particular, that
no identifier is used before being bound.
let rec check_modtype env = function
Signature sg -> check_signature env sg
| Functor_type(param, arg, res) ->
check_modtype env arg; check_modtype (Env.add_module param arg env) res
and check_signature env = function
[] -> ()
| Value_sig(id, vty) :: rem ->
CT.check_valtype env vty; check_signature env rem
| Type_sig(id, decl) :: rem ->
CT.check_kind env decl.kind;
begin match decl.manifest with
None -> ()
| Some typ -> CT.kind_deftype env typ; ()
end;
check_signature (Env.add_type id decl env) rem
| Module_sig(id, mty) :: rem ->
check_modtype env mty; check_signature (Env.add_module id mty env) rem
After checking a type specification or module specification in a
signature, we add it to the environment before checking the remainder
of the signature, since subsequent signature elements may refer to
the type or module just checked. No such dependency occurs for value
specifications. Similarly, the result type of a functor may depend on
its parameter (the type of the Mod_typing functor itself is an example).
let rec type_module env = function
Longident path ->
strengthen_modtype path (Env.find_module path env)
| Structure str ->
Signature(type_structure env str)
| Functor(param, mty, body) ->
check_modtype env mty;
Functor_type(param, mty, type_module (Env.add_module param mty env) body)
| Apply(funct, arg) ->
(match type_module env funct with
Functor_type(param, mty_param, mty_res) ->
let mty_arg = type_module env arg in
modtype_match env mty_arg mty_param;
(match arg with
Longident path ->
subst_modtype mty_res (Subst.add param path Subst.identity)
| _ ->
try
nondep_modtype (Env.add_module param mty_arg env) param mty_res
with Not_found ->
error "cannot eliminate dependency in functor application")
| _ -> error "application of a non-functor")
| Constraint(modl, mty) ->
check_modtype env mty;
modtype_match env (type_module env modl) mty;
mty
and type_structure env = function
[] -> []
| stritem :: rem ->
let sigitem = type_definition env stritem in
let sigrem = type_structure (Env.add_spec sigitem env) rem in
sigitem :: sigrem
and type_definition env = function
Value_str(id, term) -> Value_sig(id, CT.type_term env term)
| Module_str(id, modl) -> Module_sig(id, type_module env modl)
| Type_str(id, kind, typ) ->
if CT.kind_match env (CT.kind_deftype env typ) kind
then Type_sig(id, {kind = kind; manifest = Some typ})
else error "kind mismatch in type definition"
end
A reference to a module identifier or module component of a structure
(Longident) is typed by a lookup in the environment, followed by a
``strengthening'' operation (strengthen_modtype) that turns abstract
type specifications into specifications of types manifestly equal to
themselves. Strengthening ensures that the identities of abstract
types are preserved; this is detailed in
section (ref).In the case of a structure, each definition is typed, then entered in the environment before typing the remainder of the structure, which can depend on the definition. Type definitions are assigned manifest signatures, which reveal their implementations; the type can be abstracted later, if desired, using a module constraint.
The typing of functor definitions is straightforward. For functor applications, we type the functor and its argument, then check that the type of the argument matches the type of the functor parameter. That is, the argument must provide at least all the components required by the functor, with types at least as general. Matching between module types is detailed in section (ref). Determining the result type of the application raises a subtle difficulty: since functor types are dependent, the result type of the functor can refer to the parameter name; according to the standard elimination rule for dependent function types, the parameter name must therefore be replaced by the actual argument to obtain the type of the application. If the actual argument is a path, this causes no difficulties, because we can always substitute a path for a module identifier anywhere in the module language. But if the argument is not a path, then the substitution is not always possible. Consider:
module F = functor(X: sig type t end) struct type t = X.t end
module A = F(struct type t = int end)
The result type of F is sig type t = X.t end, and attempting to
replace X by struct type t = int end in this type creates an
ill-formed module access (struct type t = int end).t. (Recall that
accesses to structure components are restricted to module paths;
lifting this restriction would compromise the type abstraction
properties of the module system [Leroy-appl-functors].)
However, not all functor applications to non-paths run into this
problem: everything works fine if the functor type is non-dependent;
more subtly, there are also cases where the dependency can be
eliminated using manifest type information from the type of the
argument. The function nondep_modtype, explained below in
section (ref), attempts to eliminate the dependency in the
functor result type; only if it fails is an error generated.
A module type M matches a module type N if any module m satisfying the specification M also satisfies N. This allows several degrees of flexibility. If M and N are signatures, then M may specify more components than N; components common to both signatures may be specified more tightly in M than in N (e.g. N specifies a type t abstract and M manifest). If M and N are functor types, then M's result type can be more precise than N's, or M's argument type can be less precise (accepting more arguments) than N's. All in all, module type matching resembles subtyping in a functional language with records, with some extra complications due to the dependencies in functor types and signatures.
let rec modtype_match env mty1 mty2 =
match (mty1, mty2) with
(Signature sig1, Signature sig2) ->
let (paired_components, subst) = pair_signature_components sig1 sig2 in
let ext_env = Env.add_signature sig1 env in
List.iter (specification_match ext_env subst) paired_components
| (Functor_type(param1, arg1, res1), Functor_type(param2, arg2, res2)) ->
let subst = Subst.add param1 (Pident param2) Subst.identity in
let res1' = Mod.subst_modtype res1 subst in
modtype_match env arg2 arg1;
modtype_match (Env.add_module param2 arg2 env) res1' res2
| (_, _) ->
error "module type mismatch"
As outlined above, matching between functor types is contravariant in
the argument types. Since the result types may depend on the parameters,
we need to identify the two parameter identifiers. For matching the
result types, we assign the parameter the more precise of the two
argument types, allowing more type equalities to be derived about
components of the parameter.Matching between signatures proceed in several steps. First, the signature components are paired: to each component of sig2, we associate the component of sig1 with same name and class. This pass also builds a substitution that equates the identifiers of the paired components, so that these identifiers are considered equal when matching specifications of components that depend on these identifiers.
and pair_signature_components sig1 sig2 =
match sig2 with
[] -> ([], Subst.identity)
| item2 :: rem2 ->
let rec find_matching_component = function
[] -> error "unmatched signature component"
| item1 :: rem1 ->
match (item1, item2) with
(Value_sig(id1, _), Value_sig(id2, _))
when Ident.name id1 = Ident.name id2 -> (id1, id2, item1)
| (Type_sig(id1, _), Type_sig(id2, _))
when Ident.name id1 = Ident.name id2 -> (id1, id2, item1)
| (Module_sig(id1, _), Module_sig(id2, _))
when Ident.name id1 = Ident.name id2 -> (id1, id2, item1)
| _ -> find_matching_component rem1 in
let (id1, id2, item1) = find_matching_component sig1 in
let (pairs, subst) = pair_signature_components sig1 rem2 in
((item1, item2) :: pairs, Subst.add id2 (Pident id1) subst)
After pairing, all components of the richer signature sig1 are added to the
typing environment; this allows matching of specifications to take
advantage of all type equalities specified in sig1. Finally, the
specifications of paired components are matched pairwise.
and specification_match env subst = function
(Value_sig(_, vty1), Value_sig(_, vty2)) ->
if not (CT.valtype_match env vty1 (Core.subst_valtype vty2 subst))
then error "value components do not match"
| (Type_sig(id, decl1), Type_sig(_, decl2)) ->
if not (typedecl_match env id decl1 (Mod.subst_typedecl decl2 subst))
then error "type components do not match"
| (Module_sig(_, mty1), Module_sig(_, mty2)) ->
modtype_match env mty1 (Mod.subst_modtype mty2 subst)
and typedecl_match env id decl1 decl2 =
CT.kind_match env decl1.kind decl2.kind &
begin match (decl1.manifest, decl2.manifest) with
(_, None) -> true
| (Some typ1, Some typ2) -> CT.deftype_equiv env typ1 typ2
| (None, Some typ2) ->
CT.deftype_equiv env (CT.deftype_of_path (Pident id) decl1.kind) typ2
end
Matching pairs of specifications is straightforward: value
specifications match if their value types satisfy the
valtype_match predicate provided by the core language type-checker.
Module specifications match if their module types do. For type
specifications, the kinds should obviously agree. No additional
condition is required if the second type is specified abstract.
If it is specified manifestly equal to some definable type d, then
the first type must either be specified manifestly equal to a type
equivalent to d, or specified abstract but provably equivalent to
d in the current context.The following ML example illustrates all cases of type specification matching:
M = sig type 'a t type u = int type v = u type w type z = w end
N = sig type 'a t type v = int type z type w = z end
The two t specifications match because both are abstract with the
same kind (arity 1). The v=u specification in M matches the
v=int specification in N because u is equivalent to int in the
environment enriched by M's components. The abstract type z in N
is matched because z is manifest with the right kind (arity 0) in M.
Finally, the w=z specification in N is matched by the w
component of M, despite it being abstract, because w and z are
equivalent in the enriched environment.
Consider a module path p with a signature containing an abstract type t:
p : sig type t ... end
What makes p.t abstract is that, since the signature contains no
type equality over t, p.t is incompatible with any other type
except itself. However, the identity of p.t must be preserved, in
particular across rebindings. Assume for instance that p is bound to
a module identifier m:
module m = p
If we assign m the same signature as p, sig type t ... end, then
m.t and p.t are different types. The identity of the abstract
type p.t was lost. The correct signature for m that
preserves p.t's identity is:
m : sig type t = p.t ... end
Fortunately, this signature is a perfectly legal signature for p
itself: an abstract type t component of a path p is always manifestly equal
to itself, p.t. The following function strengthen_modtype
replaces all abstract type specifications in a module type by the
corresponding manifest types rooted at the given path:
let rec strengthen_modtype path mty =
match mty with
Signature sg -> Signature(List.map (strengthen_spec path) sg)
| Functor_type(_, _, _) -> mty
and strengthen_spec path item =
match item with
Value_sig(id, vty) -> item
| Type_sig(id, decl) ->
let m = match decl.manifest with
None -> Some(CT.deftype_of_path (Pdot(path, Ident.name id)) decl.kind)
| Some ty -> Some ty in
Type_sig(id, {kind = decl.kind; manifest = m})
| Module_sig(id, mty) ->
Module_sig(id, strengthen_modtype (Pdot(path, Ident.name id)) mty)
In type_module, this strengthening operation is performed
systematically on a module path each time it is referenced. It can be
shown that this ensures inference of minimal module types
[Leroy-manifest-types] and implements the same notion of
type generativity as in SML [Leroy-generativity].
As mentioned in section (ref), a difficulty arises when applying a functor with a truly dependent type to a module expression that is not a path. Continuing the example given above,
module F = functor(X: sig type t end) struct type t = X.t end
module A = F(struct type t = int end)
we cannot just replace X by the actual argument in the signature of
the functor result (sig type t = X.t end). However, we can take
advantage of the fact that the t component of the actual argument is
known (from the signature of the argument) to be int: instead of
replacing X by the argument, we will replace X.t by int,
obtaining the correct signature sig type t = int end for A. The
formal justification for this trick [Harper-Lillibridge-94]
is to notice that the functor type for F,
functor(X: sig type t end) sig type t = X.t end
is a subtype of
functor(X: sig type t = int end) sig type t = int end
therefore the former can be weakened to the latter
just before typing the application. The latter type being
non-dependent, no problem arises from the argument not being a path.
Algorithmically, what we are doing is: given the result type
sig type t = X.t end of the functor, find an equivalent type that
does not depend on the parameter X, under the assumption that X
has type sig type t = int end, the actual type of the argument.
The functions nondep_valtype, nondep_deftype and nondep_kind
provided by the core language type-checker perform the same operation on
value types, definable types, and kinds.
let rec nondep_modtype env param = function
Signature sg -> Signature(nondep_signature env param sg)
| Functor_type(id, arg, res) ->
Functor_type(id, nondep_modtype env param arg,
nondep_modtype (Env.add_module id arg env) param res)
and nondep_signature env param = function
[] -> []
| item :: rem ->
let rem' = nondep_signature (Env.add_spec item env) param rem in
match item with
Value_sig(id, vty) ->
Value_sig(id, CT.nondep_valtype env param vty) :: rem'
| Type_sig(id, decl) ->
let decl' =
{kind = CT.nondep_kind env param decl.kind;
manifest = match decl.manifest with
None -> None
| Some ty -> Some(CT.nondep_deftype env param ty)} in
Type_sig(id, decl') :: rem'
| Module_sig(id, mty) ->
Module_sig(id, nondep_modtype env param mty) :: rem'
We assume that the CT.nondep functions raise the Not_found
exception if they cannot eliminate the dependency on the parameter
(for instance, we are trying to eliminate X in X.t under the
assumption X : sig type t end). This exception goes through
nondep_modtype transparently and is caught in type_module, causing
an error to be reported.One can go further in trying to eliminate dependencies: in covariant position, a manifest type specification type t = tau where the parameter cannot be eliminated from tau can be turned into an abstract type type t; a value specification val x: tau can be removed from a signature. This corresponds to taking a minimal non-dependent supertype of the functor result type, instead of a non-dependent equivalent type as in the nondep_modtype function above. By taking non-dependent supertypes, more programs are well-typed, but the inferred types are often confusing, because information is lost without warning when moving to a supertype. The approach followed in this paper (signal an error if some information would be lost but eliminate silently the dependency otherwise) seems preferable in practice.