This page is totally outdated: contact me if you need access to the parsers I ported to French, the data set, grammars, etc. (last updated October 2012)


  • Febril : A Brill tagger front end with a modified version of the core (beware: it changes the default POS assigned to an unknown word). It includes a toy oral corpus, a lexicon, and some inference rules, run by a Perl engine, that correct the tagger output. Everything is based on the Multext format.
    The GUI is in Tcl/Tk, the core in C and the rules in Perl, so it should work anywhere (it has been used by many students on Cygwin, Mac OS X and of course Linux).
    The current archive contains the binaries compiled for Mac OS X 10.3.9; if you have problems compiling, feel free to ask me.

Download
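
To give an idea of what correcting the tagger output with rules means, here is a minimal sketch in Python. It is not the Perl engine shipped with Febril: the rule, the function names and the simplified tags (DET, PRON, VERB) are assumptions made for illustration, and they are not Multext tags.

```python
from typing import Callable, List, Optional, Tuple

Token = Tuple[str, str]  # (word form, tag)

# A rule inspects the token at position i and returns a corrected tag,
# or None when it does not apply.
Rule = Callable[[List[Token], int], Optional[str]]


def det_before_verb_is_pronoun(tokens: List[Token], i: int) -> Optional[str]:
    """Hypothetical rule: a determiner directly followed by a verb
    is retagged as a pronoun (e.g. French clitic "la" in "je la mange")."""
    _, tag = tokens[i]
    if tag == "DET" and i + 1 < len(tokens) and tokens[i + 1][1] == "VERB":
        return "PRON"
    return None


def apply_rules(tokens: List[Token], rules: List[Rule]) -> List[Token]:
    """Apply the first matching rule at each position, left to right."""
    out = list(tokens)
    for i in range(len(out)):
        for rule in rules:
            new_tag = rule(out, i)
            if new_tag is not None:
                out[i] = (out[i][0], new_tag)
                break
    return out


# Tagger output in which "la" was wrongly tagged as a determiner.
tagged = [("je", "PRON"), ("la", "DET"), ("mange", "VERB")]
corrected = apply_rules(tagged, [det_before_verb_is_pronoun])
```

In this sketch each rule sees the already corrected left context, so the order of the rules matters.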


  • Debil : An interface to the Xerox XLE parser for LFG. For now it is very basic and oriented towards producing test suites. It is rapidly evolving towards a universal interface for parsers (mainly DyALog, SXLFG and XLFG). The current version deals with XLE.
    It is also written in Tcl/Tk and should be usable anywhere if you have Tcl/Tk installed along with the following libraries:

http://heanet.dl.sourceforge.net/sourceforge/incrtcl/itcl3.3.tar.gz

http://heanet.dl.sourceforge.net/sourceforge/incrtcl/itk3.3.tar.gz (install with configure, make, make install; it takes 2 minutes)

If you are on Mandrake, also run "urpmi iwidgets"; on Fedora Core 4 you'll have to grab the packages for iwidgets.

Put debil.tcl somewhere in your path and invoke it with debil.tcl | xle


Download


  • DependencyTag parser : These are the Prolog sources from my PhD thesis. They of course contain tons of bugs, and they need DyALog and SWI-Prolog in order to work.
  • It has some interesting features, by the way:
From an LTAG grammar and an input automaton, it generates a derivation forest as a unification grammar in two ways:
one which overgenerates (extraction in O(n^3)),
the other which is sound (extraction in O(n^6)) but has some drawbacks: in order to avoid non-termination on some rules (see Shieber's restriction in [Shieber et al, 92]), it has to be run under a tabular logic programming environment and the stack of adjunctions has to be evaluated bottom-up.

So one first has to generate the forest using SWI-Prolog; the forest then has to be interpreted with DyALog. I included a very light loop detection emulator using a depth-n recognizer.

*The unifications are precompiled into the trees, and the feature unifications are computed during the walk of the forest (usually, people generate the derivation trees, then extract the derived tree and only then apply the unification on the nodes). So far I have included a basic treatment of a few features in this scheme.
*If a lexical entry has control canvas information (control verb), the program generates a dependency graph.
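
For readers unfamiliar with feature unification, the following Python sketch shows the basic operation on flat feature sets. It is only an illustration under strong assumptions: the real parser is written in Prolog, and LTAG feature structures (top and bottom features, variables shared across nodes) are richer than the flat dictionaries used here. The feature names and the `unify` function are hypothetical.

```python
from typing import Dict, Optional

Features = Dict[str, str]  # flat feature set, e.g. {"num": "sg"}


def unify(a: Features, b: Features) -> Optional[Features]:
    """Return the merged feature set, or None if two values clash."""
    result = dict(a)
    for name, value in b.items():
        if name in result and result[name] != value:
            return None  # same feature, different values: unification fails
        result[name] = value
    return result


subject = {"num": "sg", "pers": "3"}
verb = {"num": "sg"}
clash = {"num": "pl"}

merged = unify(subject, verb)   # compatible: features are merged
failed = unify(subject, clash)  # incompatible number: no result
```

Checking features while walking the forest means that a failed unification can discard a derivation early, instead of after the derived tree has been built.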

I'm working on including the treatment of coordination as in [Seddah and Sagot, 2006].

Don't be disappointed: it's far from being usable in a production environment; it's first a proof of concept, then a parser. The grammar format follows the DyALog tree description.

This version dates from March 2004; I have to grab the latest one from my old lab server.
Download


More to come. I plan to include:

  • the lexicon made by E. Jacquey and me in 2002
  • some useful tools for drawing trees using a Motif interface (I know, we're not in the 90's anymore, but still, it's really practical)
  • if I'm able to get the rights back, the stuff I made while I was at GFI-IE (the website used to be www.gfi-ie.com but it now points to the main IT company), mainly an NLP interface between an ontology and a geographic database