The Zen Computational Linguistics Toolkit

Welcome to the Zen site. It presents a computational linguistics toolkit developed by Gérard Huet at the INRIA Paris-Rocquencourt center.

Zen is implemented in Pidgin ML, which is a core subset of the Objective Caml programming language under the so-called revised syntax.

The Zen toolkit has been abstracted from a computational linguistics library for Sanskrit that is under development. Its application to the analysis of Sanskrit euphony (sandhi) is described in an article available in PS and PDF formats.

This toolkit has been applied by Sylvain Pogodalla and Nicolas Barth to the morphological analysis of French verbs (300,000 inflected forms for 6,500 verbs); visit the LiToTe linguistic resources site.

Documentation is available in literate programming style as a PDF document. The background articles for using the toolkit are an article on its use for Sanskrit tagging, an article describing the mixed automata Aum technology, and an article on modular transducers.

A compressed tar file is available. Under Unix/Linux/MacOSX, untarring this file produces a directory ZEN_3.1, whose README file provides installation information. This version works with OCaml version 3.11 or later. Enjoy!

This library, with copyright INRIA 2002-2012, is distributed as open source software under the LGPL license.



Gerard.Huet@inria.fr
Last update: July 30th, 2012