

The Caml-IDL mapping

This section describes how IDL types, function declarations, and interfaces are mapped to Caml types, functions and classes.

Base types

IDL type ty                                     Caml type [[ty]]
"byte", "short"                                 "int"
"int", "long" with "[camlint]" attribute        "int"
"int", "long" with "[nativeint]" attribute      "nativeint"
"int", "long" with "[int32]" attribute          "int32"
"int", "long" with "[int64]" attribute          "int64"
"char", "unsigned char", "signed char"          "char"
"float", "double"                               "float"
"boolean"                                       "bool"

Depending on the attributes, the "int" and "long" integer types are converted to one of the Caml integer types "int", "nativeint", "int32", or "int64". Values of Caml type "int32" are exactly 32 bits wide and values of type "int64" are exactly 64 bits wide on all platforms. Values of type "nativeint" have the natural word size of the platform, and are large enough to accommodate any C "int" or "long int" without loss of precision. Values of Caml type "int" have the natural word size of the platform minus one bit of tag, hence the conversion from IDL types "int" and "long" loses the most significant bit on 32-bit platforms. On 64-bit platforms, the conversion from "int" is exact, but the conversion from "long" loses the most significant bit.

If no explicit integer attribute is given for an "int" or "long" type, the "int_default" or "long_default" attribute of the enclosing interface, if any, determines the kind of the integer. If no "int_default" or "long_default" attribute is in scope, the kind "camlint" is assumed, which maps IDL "int" and "long" types to the Caml "int" type.
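
The width difference between the Caml integer types can be checked directly from OCaml. The following is a minimal sketch that uses only the OCaml standard library (no camlidl-generated stubs are involved); the name "truncated" is chosen here for illustration. It shows that converting a wide value to the tagged "int" type keeps only the low bits, which is the loss of the most significant bit described above.

```ocaml
(* Width, in bits, of the Caml integer types on the current platform. *)
let () =
  Printf.printf "int: %d bits, nativeint: %d bits\n"
    Sys.int_size Nativeint.size

(* Converting to the tagged [int] type keeps only the low [Sys.int_size]
   bits of the value, so the most significant bit is dropped. *)
let truncated = Int64.to_int Int64.max_int

let () = Printf.printf "Int64.max_int converted to int: %d\n" truncated
```

When such a loss is not acceptable, the "[nativeint]", "[int32]" or "[int64]" attributes select a Caml type that holds the full C value.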

Pointers

The mapping of IDL pointer types depends on their kinds. Writing [[ty]] for the Caml type corresponding to the IDL type ty, we have:


       [ref] ty *  =====>  [[ty]]
    [unique] ty *  =====>  [[ty]] option
       [ptr] ty *  =====>  [[ty]] Com.opaque

In other terms, IDL pointers of kind "ref" are ignored during the mapping: "[ref] ty *" is mapped to the same Caml type as ty. A pointer p to a C value c = *p is translated to the Caml value corresponding to c.

IDL pointers of kind "unique" are mapped to an "option" type. The option value is "None" for a null pointer, and "Some(v)" for a non-null pointer to a C value c that translates to the ML value v.

IDL pointers of kind "ptr" are mapped to a "Com.opaque" type. This is an abstract type that encapsulates the C pointer without attempting to convert it to an ML data structure.

IDL pointers of kind "ignore" denote struct fields and function parameters that need not be exposed in the Caml code. Those pointers are simply set to null when converting from Caml to C, and ignored when converting from C to Caml. They cannot occur elsewhere.

If no explicit pointer kind is given, the "pointer_default" attribute of the enclosing interface, if any, determines the kind of the pointer. If no "pointer_default" attribute is in scope, the kind "unique" is assumed.
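
On the Caml side, a "unique" pointer is therefore handled by pattern matching on an option. The sketch below assumes a hypothetical IDL function whose result has type "[unique] int *", which would be exposed as a Caml function of type "int -> int option". The function "find" is a pure-OCaml stand-in written for this example, not a generated stub.

```ocaml
(* Stand-in for a stub of type int -> int option (hypothetical):
   None plays the role of a null pointer, Some v of a pointer to v. *)
let find key = if key >= 0 then Some (key * 2) else None

(* Client code must handle both the null and the non-null case. *)
let describe key =
  match find key with
  | None -> "null pointer"
  | Some v -> Printf.sprintf "pointer to %d" v

let () = print_endline (describe 3)
```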

Arrays

IDL arrays of characters that carry the "[string]" attribute are mapped to the Caml "string" type:

IDL type ty                        Caml type [[ty]]
"[string] char []"                 "string"
"[string] unsigned char []"        "string"
"[string] signed char []"          "string"
"[string] byte []"                 "string"

Caml string values are translated to standard null-terminated C strings. Be careful about embedded null characters in the Caml string, which will be recognized as end of string by C functions.
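
A Caml program can guard against this before calling a stub. The helpers below are a sketch using only the OCaml standard library; the names "c_view" and "has_embedded_null" are invented for this example. "c_view" computes the part of a Caml string that a C function relying on null termination would see.

```ocaml
(* The part of [s] that C code sees: everything before the first null
   character, or all of [s] if it contains none. *)
let c_view s =
  match String.index_opt s '\000' with
  | Some i -> String.sub s 0 i
  | None -> s

(* True if [s] would be silently truncated on the C side. *)
let has_embedded_null s = String.contains s '\000'

let () = Printf.printf "%S is seen by C as %S\n" "ab\000cd" (c_view "ab\000cd")
```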

IDL arrays carrying the "[bigarray]" attribute are translated to Caml "big arrays", as described in the next section.

All other IDL arrays are translated to ML arrays:


        ty []  =====>  [[ty]] array

For instance, "double []" becomes "float array". Consequently, multi-dimensional arrays are translated to Caml arrays of arrays. For instance, "int [][]" becomes "int array array".

If the "unique" attribute is given, the IDL array is translated to an ML option type:


        [string,unique] char []  =====>  string option
        [unique] ty []           =====>  [[ty]] array option

As in the case of pointers of kind "unique", the option value is "None" for a null C pointer, and "Some(v)" for a non-null C pointer to a C array that translates to the ML string or array v.

Conversion between a C array and an ML array proceeds element by element. For the conversion from C to ML, the number of elements of the ML array is determined as follows (in the order presented):

For instance, C values of IDL type "[length_is(n)] double[]" are mapped to Caml "float array" of "n" elements. C values of IDL type "double[10]" are mapped to Caml "float array" of 10 elements.

The "length_is" and "size_is" attributes take as argument one or several limited expressions. Each expression applies to one dimension of the array. For instance, "[size_is(*dimx, *dimy)] double d[][]" specifies a matrix of "double" whose first dimension has size "*dimx" and the second has size "*dimy".

Big arrays

IDL arrays of integers or floats that carry the "[bigarray]" attribute are mapped to one of the Caml "Bigarray" types: "Array1.t" for one-dimensional arrays, "Array2.t" for 2-dimensional arrays, "Array3.t" for 3-dimensional arrays, and "Genarray.t" for arrays of 4 dimensions or more.

If the "[fortran]" attribute is given, the big array is accessed from Caml using the Fortran conventions (array indices start at 1; column-major memory layout). By default, the big array is accessed from Caml using the C conventions (array indices start at 0; row-major memory layout).
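
The two conventions can be compared with the standard "Bigarray" module alone. The sketch below builds the arrays from OCaml rather than receiving them from a stub, and the names "c", "f", "c_flat" and "f_flat" are chosen for this example. It stores a value in the first row, second column of a 2x3 matrix under each layout, then flattens the matrix to show where that element sits in memory order.

```ocaml
open Bigarray

(* C conventions: indices start at 0, rows are contiguous in memory. *)
let c = Array2.create float64 c_layout 2 3
let () = Array2.fill c 0.0; Array2.set c 0 1 42.0

(* Fortran conventions: indices start at 1, columns are contiguous. *)
let f = Array2.create float64 fortran_layout 2 3
let () = Array2.fill f 0.0; Array2.set f 1 2 42.0

(* One-dimensional views over the same memory, keeping each layout. *)
let c_flat = reshape_1 (genarray_of_array2 c) 6
let f_flat = reshape_1 (genarray_of_array2 f) 6

(* Second element in row-major order; third in column-major order. *)
let () = assert (Array1.get c_flat 1 = 42.0)
let () = assert (Array1.get f_flat 3 = 42.0)
```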

If the "[managed]" attribute is given on a big array type that is the result type or an out parameter type of a function, Caml assumes that the corresponding C array was allocated using "malloc()" and is not referenced anywhere else; the Caml garbage collector will then free the C array when the corresponding Caml big array becomes unreachable. By default, Caml assumes that result or out C arrays are statically or permanently allocated: it keeps a pointer to them during conversion to Caml big arrays, and does not free them when the Caml big arrays become unreachable.

Structs

IDL structs are mapped to Caml record types. The names and types of the IDL struct fields determine the names and types of the Caml record type:


struct s { ... ; ty_i id_i ; ... } 

becomes  

type s = { ... ; id_i : [[ty_i]] ; ... }

Example: "struct s { int n; double d[4]; }" becomes "type s = {n: int; d: float array}".
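
The record type from the example is used like any other Caml record. The short sketch below declares it by hand and builds a value; the names "v" and "sum" are chosen for illustration. Note that the bound 4 of the C array does not appear in the Caml type, so the Caml type alone does not enforce it.

```ocaml
(* Record type given above for: struct s { int n; double d[4]; } *)
type s = { n : int; d : float array }

(* A value with four elements, matching the bound of the C array. *)
let v = { n = 4; d = [| 1.0; 2.0; 3.0; 4.0 |] }

let sum = Array.fold_left ( +. ) 0.0 v.d

let () = Printf.printf "n = %d, sum of d = %g\n" v.n sum
```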

Exceptions to this rule are as follows:

Unions

IDL discriminated unions are translated to Caml sum types. Each case of the union corresponds to a constructor of the sum type. The constructor is constant if the union case has no associated field, otherwise it has one argument corresponding to the union case field. If the union has a "default" case, an extra constructor "Default_unionname" (where unionname is the name of the union) is added to the Caml sum type, carrying an "int" argument (the value of the discriminating field), and possibly another argument corresponding to the default field. Examples:


union u1 { case A: int x; case B: case C: double d; case D: ; } 
becomes
type u1 = A of int | B of float | C of float | D

union u2 { case A: int x; case B: double d; default: ; }
becomes
type u2 = A of int | B of float | Default_u2 of int

union u3 { case A: int x; default: double d; }
becomes
type u3 = A of int | Default_u3 of int * float

All IDL unions must be discriminated, either via the special syntax "union name switch(int discr) ...", or via the attribute "switch_is(discr)", where discr is a C l-value built from other parameters of the current function, or other fields of the current "struct". Both the discriminant and the case labels must be of an integer type. Unless a "default" case is given, the value of the discriminant must be one of the cases of the union.
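
Caml code consumes such a sum type by pattern matching, with the "Default_" constructor covering discriminant values that match no listed case. The sketch below assumes a hypothetical union "union shape { case CIRCLE: double radius; case POINT: ; default: ; }"; the union, its labels and the function "describe" are invented for this example, and the type is written by hand following the rules above.

```ocaml
(* Sum type that the rules above would give for the hypothetical union. *)
type shape = CIRCLE of float | POINT | Default_shape of int

let describe = function
  | CIRCLE r -> Printf.sprintf "circle of radius %g" r
  | POINT -> "point"
  (* The argument is the value of the discriminant that matched no case. *)
  | Default_shape tag -> Printf.sprintf "unrecognized discriminant %d" tag

let () = print_endline (describe (CIRCLE 2.0))
```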

Enums

IDL enums are translated to Caml enumerated types (sum types with only constant constructors). The names of the constructors are determined by the names of the enum labels. The values attached to the enum labels are ignored. Example: "enum e { A, B = 2, C = 4 }" becomes "type enum_e = A | B | C".

The "set" attribute can be applied to a named enum to denote a bitfield obtained by logical "or" of zero, one or several labels of the enum. The corresponding ML value is a list of zero, one or several constructors of the Caml enumerated type. Consider for instance:


enum e { A = 1, B = 2, C = 4 };
typedef [set] enum e eset;

The Caml type "eset" is equal to "enum_e list". The C integer 6 (= "B | C") is translated to the ML list "[B; C]". The ML list "[A; C]" is translated to the C integer "A | C", that is "5".
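
The correspondence between bitfields and lists can be modeled in plain OCaml. The functions below are an illustrative sketch, not the generated conversion code; the names "bit", "to_c" and "of_c" are invented here, and the order of the constructors in the resulting list is an assumption based on the example above.

```ocaml
type enum_e = A | B | C

(* C value attached to each label in the IDL declaration above. *)
let bit = function A -> 1 | B -> 2 | C -> 4

(* ML list to C integer: logical "or" of the bits of the labels. *)
let to_c labels = List.fold_left (fun acc l -> acc lor bit l) 0 labels

(* C integer to ML list: keep the labels whose bit is set. *)
let of_c n = List.filter (fun l -> n land bit l <> 0) [A; B; C]

let () = Printf.printf "[A; C] corresponds to %d\n" (to_c [A; C])
```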

Type definitions

An IDL "typedef" statement is normally translated to a Caml type abbreviation. For instance, "typedef [string] char * str" becomes "type str = string".

If the "abstract" attribute is given, a Caml abstract type is generated instead of a type abbreviation, thus hiding from Caml the representation of the type in question. For instance, "typedef [abstract] void * handle" becomes "type handle". In this case, the IDL type in the "typedef" is ignored.

If the "mltype("caml-type-expr")" attribute is given, the Caml type is made equal to caml-type-expr. This is often used in conjunction with the "ml2c" and "c2ml" attributes to implement custom translation of data structures between C and ML. For instance, "typedef [mltype("int list")] struct mylist_struct * mylist" becomes "type mylist = int list".

If the "c2ml(funct-name)" and "ml2c(funct-name)" attributes are given, the user-provided C functions given as attributes will be called to perform C to Caml and Caml to C conversions for values of the typedef-ed type, instead of using the "camlidl"-generated conversion functions. This allows user-controlled translation of data structures. The prototypes of the conversion functions must be


        value c2ml(ty * input, camlidl_ctx ctx);
        void ml2c(value input, ty * output, camlidl_ctx ctx);

where ty is the name of the type defined by "typedef". In other terms, the "c2ml" function is passed a reference to a ty and returns the corresponding Caml value, while the "ml2c" function is passed a Caml value as first argument and stores the corresponding C value in the ty reference passed as second argument. (The extra "camlidl_ctx" argument is for internal use by the generated stub code; just ignore it.)

If the "errorcheck(fn)" attribute is provided for the "typedef" ty, the error checking function fn is called each time a function result of type ty is converted from C to Caml. The function can then check the ty value for values indicating an error condition, and raise the appropriate exception. If in addition the "errorcode" attribute is provided, the conversion from C to Caml is suppressed: values of type ty are only passed to fn for error checking, then discarded.

Functions

IDL function declarations are translated to Caml functions. The parameters and results of the Caml function are determined from those of the IDL function according to the following rules:

Examples:


int f([in] double x, [in] double y)             f : float -> float -> int

Two "double" input, one "int" output


void g([in] int x)                              g : int -> unit

One "int" input, no output


int h()                                         h : unit -> int

No input, one "int" result


void i([in] int x, [out] double * y)            i : int -> float

One "int" input, one "double" output (as an "out" parameter)


int j([in] int x, [out] double * y)             j : int -> int * float

One "int" input, one "int" output (in the result), one "double" output (as an "out" parameter)


void k([in,out,ref] int * x)                    k : int -> int

The "in,out" parameter is both one "int" input and one "int" output.


HRESULT l([in] int x, [out] int * res1, [out] int * res2)
                                                l : int -> int * int

"HRESULT" is a predefined type with the "errorcode" attribute, hence it is ignored. What remains is one "int" input and two "int" outputs ("out" parameters).


void m([in] int len, [in,size_is(len)] double d[])
                                                m : float array -> unit

"len" is a dependent parameter, hence is ignored. The only input is the "double" array


void n([in] int inputlen, [out] int * outputlen, 
       [in,out,size_is(inputlen),length_is(*outputlen)] double d[])
                                                n : float array -> float array

The two parameters "inputlen" and "outputlen" are dependent, hence ignored. The "double" array is both an input and an output.


void p([in] int dimx, [in] int dimy,
       [in,out,bigarray,size_is(dimx,dimy)] double d[][])
p : (float, Bigarray.float64_elt, Bigarray.c_layout) Bigarray.Array2.t -> unit

The two parameters "dimx" and "dimy" are dependent (determined from the dimensions of the big array argument), hence ignored. The two-dimensional array "d", although marked "[in,out]", is a big array, hence passed as an input that will be modified in place by the C function "p". The Caml function has no outputs.
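
From the Caml side, these shapes mean that several outputs arrive as a tuple and that dependent parameters never appear. The sketch below uses pure-OCaml stand-ins with the same types as the stubs for "j", "k" and "n"; the function bodies are invented for illustration and do not call any C code.

```ocaml
(* Stand-ins with the types of the stubs for j, k and n (bodies invented). *)
let j (x : int) : int * float = (0, float_of_int x /. 2.0)
let k (x : int) : int = x + 1
let n (d : float array) : float array = Array.map (fun v -> v *. 2.0) d

(* The function result and the "out" parameter of j come back as a pair. *)
let status, y = j 3

(* The "in,out" parameter of k is passed as an argument and returned. *)
let incremented = k 41

(* No length argument: the lengths are carried by the arrays themselves. *)
let doubled = n [| 1.0; 2.0 |]

let () = Printf.printf "status = %d, y = %g\n" status y
```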

Error checking:

For every output that is of a named type with the "errorcheck(fn)" attribute, the error checking function fn is called after the C function returns. That function is assumed to raise a Caml exception if it finds an output denoting an error.

Custom calling and deallocation sequences:

The IDL declaration for a function can optionally specify a custom calling sequence and/or a custom deallocation sequence, via "quote" clauses following the function declaration:


function-decl:
  attributes type-spec {'*'} ident '(' params ')' 
  { 'quote' '(' ident ',' string ')' } ;

The general shape of a "camlidl"-generated stub function is as follows:


value caml_wrapper(value camlparam1, ..., value camlparamK)
{
  /* Convert the function parameters from Caml to C */
  param1 = ...;
  ...
  paramN = ...;
  /* Call the C function 'ident' */
  _res = ident(param1, ..., paramN);
  /* Convert the function result and out parameters to Caml values */
  camlres = ...;
  /* Return result to Caml */
  return camlres;
}

A "quote(call, string)" clause causes the C statements in string to be inserted in the generated stub code instead of the default calling sequence "_res = ident(param1, ..., paramN)". Thus, the statements in string find the converted parameters in local variables that have the same names as the parameters in the IDL declaration, and should leave the result of the function, if any, in the local variable named "_res".

A "quote(dealloc, string)" clause causes the C statements in string to be inserted in the generated stub code just before the stub function returns, hence after the conversion of the C function results to Caml values. Again, the statements in string have access to the function result in the local variable named "_res", and to out parameters in local variables having the same names as the parameters. Since the function results and out parameters have already been converted to Caml values, the code in string can safely deallocate the data structures they point to.

Custom calling sequences are typically used to rearrange or combine function parameters, and to perform extra error checks on the arguments and results. For instance, the Unix "write" system call can be specified in IDL as follows:


        int write([in] int fd,
                  [in,string,length_is(len)] char * data,
                  [in] int len,
                  [in] int ofs,
                  [in] int towrite)
          quote(call,
            " /* Validate the arguments */
              if (ofs < 0 || towrite < 0 || ofs + towrite > len) failwith(\"write\");
              /* Perform the write */
              _res = write(fd, data + ofs, towrite);
              /* Validate the result */
              if (_res == -1) failwith(\"write\"); ");

Custom deallocation sequences are useful to free data structures dynamically allocated and returned by the C function. For instance, a C function "f" that returns a "malloc"-ed string can be specified in IDL as follows:


        [string] char * f([in] int x)
          quote(dealloc, "free(_res); ");

If the string is returned as an "out" parameter instead, we would write:


        void f ([in] int x, [out, string*] char ** str)
          quote(dealloc, "free(*str); ");

Interfaces

IDL interfaces that do not have the "object" attribute are essentially ignored. That is, the declarations contained in the interface are processed as if they occurred at the top-level of the IDL file. The "pointer_default", "int_default" and "long_default" attributes to the interface can be used to specify the default pointer kind and integer mappings for the declarations contained in the interface. Other attributes, as well as the name of the super-interface if any, are ignored.

IDL interfaces having the "object" attribute specify COM-style object interfaces. The function declarations contained in the interface specify the methods of the COM interface. Other kinds of declarations (type declarations, "import" statements, etc) are treated as if they occurred at the top-level of the IDL file. An optional super-interface can be given, in which case the COM interface implements the methods of the super-interface in addition to those specified in the IDL interface. Example:


[object, uuid(...)] interface IA { typedef int t; int f(int x); }
[object] interface IB : IA { import "foo.idl"; void g([string] char * s); }

This defines a type "t" and imports the file "foo.idl" as usual. In addition, two interfaces are declared: "IA", containing one method "f" from "int" to "int", and "IB", containing two methods, "f" from "int" to "int" and "g" from "string" to "unit".

The definition of an object interface i generates the following Caml definitions:

Example: in the "IA" and "IB" example above, the following Caml definitions are generated for "IA":


type iA
val iid_iA : iA Com.iid
class iA_class : iA Com.interface -> object method f : int -> int end
val use_iA : iA Com.interface -> iA_class
val make_iA : #iA_class -> iA Com.interface

For "IB", we get:


type iB
val iA_of_iB : iB Com.interface -> iA Com.interface
class iB_class :
  iB Com.interface -> object inherit iA_class method g : string -> unit end
val use_iB : iB Com.interface -> iB_class
val make_iB : #iB_class -> iB Com.interface
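
The inheritance relation between the two classes can be tried out without camlidl. The sketch below uses simplified stand-ins: the generated classes take an "iA Com.interface" or "iB Com.interface" argument, which is omitted here, and the classes are declared virtual so that a Caml object can supply the methods. The names "call_f", "b" and "result" are invented for this example.

```ocaml
(* Simplified stand-ins for the generated classes (no Com.interface). *)
class virtual iA_class = object
  method virtual f : int -> int
end

class virtual iB_class = object
  inherit iA_class
  method virtual g : string -> unit
end

(* #iA_class accepts any object offering at least the methods of iA_class. *)
let call_f (o : #iA_class) x = o#f x

(* A Caml object implementing both methods of the IB interface. *)
let b = object
  inherit iB_class
  method f x = x + 1
  method g s = print_endline s
end

let result = call_f b 41
```

The "#iB_class" argument type of "make_iB" plays the same role: any Caml object providing the methods "f" and "g" with these types is accepted.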

Error handling in interfaces: Conventionally, methods of COM interfaces always return a result of type "HRESULT" that says whether the method succeeded or failed, and in the latter case returns an error code to its caller.

When calling an interface method from Caml, if the method returns an "HRESULT" denoting failure, the exception "Com.Error" is raised with a message describing the error. Successful "HRESULT" return values are ignored. To make them available to Caml, "camlidl" defines the types "HRESULT_bool" and "HRESULT_int". If those types are used as return types instead of "HRESULT", failure results are mapped to "Com.Error" exceptions as before, but successful results are mapped to the Caml types "bool" and "int" respectively. (For "HRESULT_bool", the "S_OK" result is mapped to "true" and other successful results are mapped to "false". For "HRESULT_int", the low 16 bits of the result code are returned as a Caml "int".)
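
The three treatments of an "HRESULT" can be modeled in plain OCaml. This is an illustrative sketch, not the code of the camlidl runtime: the exception "Com_error" is a hypothetical stand-in for "Com.Error", and the functions "check", "hresult_bool" and "hresult_int" are invented here. It relies on the COM conventions that a failure "HRESULT" is negative when read as a signed 32-bit integer and that "S_OK" is 0.

```ocaml
(* Hypothetical stand-in for Com.Error, carrying only a message. *)
exception Com_error of string

(* Raise on failure codes, that is, codes with the top bit set. *)
let check (hr : int32) =
  if Int32.compare hr 0l < 0 then
    raise (Com_error (Printf.sprintf "HRESULT 0x%08lX" hr))

(* HRESULT_bool: S_OK (0) gives true, other success codes give false. *)
let hresult_bool hr = check hr; Int32.equal hr 0l

(* HRESULT_int: the low 16 bits of a success code. *)
let hresult_int hr = check hr; Int32.to_int (Int32.logand hr 0xFFFFl)

let () = Printf.printf "S_OK as bool: %b\n" (hresult_bool 0l)
```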

When calling a Caml method from a COM client, any exception that escapes the Caml method is mapped back to a failure "HRESULT". A textual description of the uncaught exception is saved using "SetErrorInfo", and can be consulted by the COM client using "GetErrorInfo" (this is the standard convention for passing extended error information in COM).

If the IDL return type of the method is not one of the "HRESULT" types, any exception escaping the Caml method aborts the whole program after printing a description of the exception. Hence, programmers of Caml components should either use "HRESULT" as result type, or make very sure that all exceptions are properly caught by the method.

