Premières Journées / 1st Workshop on

Tabulation en Analyse Syntaxique et Déduction

Tabulation in Parsing and Deduction

TAPD'98

Paris, France
2-3 Avril 1998 / April 2-3, 1998

Organisées par / Organized by INRIA
en collaboration avec / in collaboration with CEDRIC du / of CNAM

TAPD'98 Papers available online

Linear Indexed Automata and Tabulation of TAG Parsing
M.-J. Nederhof, Univ. of Groningen, NL

We present a new kind of recognizer for tree-adjoining languages, the linear indexed automata. Such recognizers allow straightforward realization by means of logical programming languages. We show that the computations by such automata, which are in general nondeterministic, can be tabulated by means of an extension of a technique originally devised for context-free languages. The proposed application of this work is the design of efficient parsing algorithms for tree-adjoining grammars.

(BibTeX reference.)

Elementary Tree Representation
V. Diaz, V. Carrillo, and M. Toro, Univ. of Sevilla, Spain

Most of the parsers defined for TAG (Tree Adjoining Grammar) formalism are based in classical CFG (Context-Free Grammar) parsers. An elementary tree is considered a set of CFG-rules and adjunction operation is simulated by substitution. Additional control considerations must be added in order to establish the real power of TAGs. In this paper, we present one elementary tree representation for TAG (plain representation). This representation is compared with respect to other (multilayer representation) that is commonly used in the literature in the definition of TAG parsers.

Keywords: TAG, TIG, CFG, Parser.

(BibTeX reference.)

Grammar Compaction and Computation Sharing in Automata-based Parsing
J. Carroll, N. Nicolov, M. Smets, O. Shaumyan, and D. Weir, Univ. of Sussex, UK

Wide-coverage grammars in Lexicalised Tree-Adjoining Grammar (LTAG) and related formalisms are structurally complex, containing many hundreds of elementary trees. In the context of the development of a full-scale LTAG-like grammar and parsing system, we have investigated the claim that because many of these trees have a great deal of structure in common, a parser that manipulates trees individually performs a considerable amount of redundant computation. This claim has been used to motivate a parsing technique that encodes trees as finite state automata and captures overlapping computation through automata minimisation. Our preliminary results show that this technique leads to considerable computation sharing.

(BibTeX reference.)

Parsing and Generation with Tabulation and Compilation
K. Hasida, Electronical Laboratory, Japan
T. Miyata, NIST, Japan

(BibTeX reference.)

A Method for Preserving Ambiguities in Chart Generation
H. Shemtov, Xerox Park, USA

(BibTeX reference.)

GALENA: Tabular DCG Parsing for Natural Languages
M. Vilares Ferro, M.A. Alonso Pardo, J. Grana Gil, and D. Cabreno Souto, Univ. of Coruna, Spain

We present a definite clause based parsing environment for natural languages, whose operational model is the dynamic interpretation of logical push-down automata. We attempt to briefly explain our design decisions in terms of a set of properties that practical natural language processing systems should incorporate. The aim is to show both the advantages and the drawbacks of our approach.

(BibTeX reference.)

Partial Parsing, Deduction and Tabling
V. Rocio, and J. G. Lopes, Univ. Nova de Lisboa, Portugal

(BibTeX reference.)

Scheduling in SLG Revisited
J. Freire, Bell Labs, USA
T. Swift, and D. S. Warren, SUNY at Stony Brook, USA

Scheduling has been recently recognized as a crucial aspect of tabled evaluations, since the efficiency of an evaluation is highly sensitive to the order in which operations are performed. The empiric results of \cite{FSW@jflp97}, \cite{fsw@iclp97} and {FW@ilpsws97} highlighted some of the characteristics of specific scheduling strategies. However, the lack of a formal structure made it difficult to define these strategies more precisely, as well as compare them. In this paper we address this problem by proposing a formal framework to define scheduling strategies, SLG$_{sched}$.

(BibTeX reference.)

A General Tabulation Procedure for Extended Constraint Logic Programs
C. Viegas Damasio, Univ. Aberta, Lisbon, Portugal
L. Moniz Pereira, Univ. Nova de Lisboa, Portugal

We present a paraconsistent extension of well-founded semantics for extended logic programs with constraints, \wfsxp. We then describe the first sound and complete proof procedure for both normal logic programs under \wfs\ and extended constraint logic programs under the \wfsxp\ semantics. The procedure integrates tabulation and constraint handling features and, in particular, constructive negation.

(BibTeX reference.)

Tabling Abduction
J. J. Alferes, and L. Moniz Pereira, Univ. Nova de Lisboa, Portugal

(BibTeX reference.)

Tabulation-based Induction Proofs with Application to Automated Verification
A. Roychoudhury, C.R. Ramakrishnan, I.V. Ramakrishnan, and S.A. Smolka, SUNY at Stony Brook, USA

(BibTeX reference.)

A Chart-like Parser for Context Sensitive Grammars
A.G. Manousopoulou, G. Papakonstantinou, and P.Tsanakas, Technical Univ. of Athens, Greece

(BibTeX reference.)

Bounded Fixed-Point Definability and Tabular Recognition of Languages
H. Leiss, Univ. of Munich, Germany

By relating positive inductive definitions to space-bounded computations of alternating Turing machines, Rounds, Comp. Linguistics 14, 1988, has given uniform grammatical characterizations of the \EXPTIME\ and \PTIME\ languages. But his proof gives fairly poor bounds for language recognition with context-free resp. head grammars.

We improve Rounds' analysis in two respects: first, we introduce a modified class of language definitions that allow restricted forms of negative inductions, and second, we show how to build table-driven recognizers from such definitions. For a wide and natural class of language definitions we thereby obtain fairly efficient recognizers; we can recognize the boolean closure of context-free resp. head languages in the well-known $O(n^3)$ resp. $O(n^6)$ steps on a RAM. Our `bounded' fixed-point formulas apparently can not define an arbitrary \PTIME\ language.

Our method is based on the existence of fixed-points for a class of operators that need neither be monotone nor increasing, but assume a norm or at least a well-founded quasi-ordering on the underlying set.

(BibTeX reference.)

Linear Categorial Deduction via First-order Compilation
M. Hepple, Univ. of Sheffield, UK

This paper addresses the deduction method for implicational linear logic described in (Hepple \cite{hepple:1996}), which involves compilation into a system of indexed first-order formulae, over which deduction can be made using just a single inference rule in a manner which avoids redundant computation. The approach has applications in relation to the parsing of categorial grammars and to the `glue language' approach to LFG semantics. However, the approach is shown to allow some wasted effort in the construction of intermediate results which are guarenteed not to be able to contribute to an overall analysis. This paper shows how the creation of such unwanted results can be blocked via the computation of exclusion relations between formulae. The method for computing these exclusion relations is framed in terms of an alternative formulation of the indexed first order system.

(BibTeX reference.)

On the Use of Tabling for Abstract Interpretation: An Experiment with Abstract Equation Systems
G. Janssens, and K. Sagonas, Katholieke Univ. Leuven, Belgium

Abstract interpretation is a widely used method for the static analysis of programs. This paper reports on our recent experiments with an alternative approach to implementing program analyses based on abstract interpretation. Instead of using a special-purpose system for abstract interpretation (such as PLAI, GAIA, or


AMAI

), we follow the approach advocated in~\cite{CodishDemoenSagonas} and use a logic programming system which supports tabling (namely XSB). To obtain the results of the analysis, we directly compile the program to be analysed into a tabled logic program such that its execution by XSB coincides with the abstract interpretation of the original program. We show how this approach is adapted to abstractions which satisfy the requirements of the framework of Bruynooghe~\cite{Bruynooghe@JLP-91} and that the approach is practical and competitive with state-of-the-art generic abstract interpretation systems implemented in Prolog. Our experiments are based on the non-trivial domain of abstract equation systems which is a parameterized domain which captures structure information together with modes, linearity and sharing.

(BibTeX reference.)

A Tabular Method of Finding the Optimal Word String together with its Dependency Structure
K. Ozeki, Univ. of Electro-Communications, Tokyo, Japan

This paper describes a tabular method of finding the word string, together with its dependency structure, that has the maximum preference score in a given series of word sets. The technique, which originated in the problem of constraining the process of Japanese speech recognition by linguistic knowledge, is potentially applicable to parsing and disambiguation of various natural languages. The preference score of a word string is defined by taking syntax, semantics, and reliability of each word into consideration. A set of recurrence equations is derived based on the principle of Dynamic Programming, which leads to an algorithm to solve the combinatorial optimization problem. In the algorithm, searching for the optimal word string proceeds concurrently with parsing of the string, looking up and filling in two triangular tables.

(BibTeX reference.)

A Generalized CYK Algorithm for Parsing Stochastic CFG
J.-C. Chappelier, and M. Rajman, EPFL, Lausanne, Switzerland

We present a bottom-up parsing algorithm for stochastic context-free grammars that is able (1) to deal with multiple interpretations of sentences containing compound words; (2) to extract $N$-most probable parses in $\mathcal{O}(n^3)$ and compute at the same time all possible parses of any portion of the input sequence with their probabilities; (3) to deal with "out of vocabulary" words. Explicitly extracting all the parse trees associated to a given input sentence depends on the complexity of the grammar, but even in the case where this number is exponential in $n$, the chart used by the algorithm for the representation is of $\mathcal{O}(n^2)$ space complexity.

(BibTeX reference.)