Grammars which are Re-usable to Automatically Analyze Languages
The aim of the EUREKA research project GRAAL (the acronym stands for
"Grammars which are Re-usable to Automatically Analyze Languages") is
to provide a linguistic toolbox consisting of modules for Natural
Language Processing, which will serve as building blocks for a wide
range of NLP applications.
The budget, spread over 4 years, exceeds 20 million ECUs. This
project gathers the skills of more than 50 people (1300 man-months)
from 11 companies and research organisations in 6 European
countries:
- GSI-Erli (France), project co-ordinator and leader for
- AEROSPATIALE (France),
- EDF (France),
- FIAT (Italy),
- Institute for Language and Speech Processing (ILSP) (Greece),
- Instituto de Linguistica Teorica e Computacional (ILTEC)
- IRIT-CNRS (France),
- IRST (Italy),
- ISSCO (Switzerland),
- LINGSOFT (Finland),
- NOKIA (Finland).
This project will permit a considerable reduction in the development
costs of applications that implement automatic language processing
techniques. Today such applications can only be developed at the
cost of considerable effort on both grammars and dictionaries. It is
therefore important to try to reduce these costs
through the capitalisation and maximum re-use of the various
1. The GRAAL ToolBox, with its own grammar-writing formalism, should
allow industrial partners to benefit from recent results in NLP
research. Methods of analysis and representation which have already
proved useful in research systems can now be applied in full-scale
industrial systems, where their utility and dependability can be further
The theoretical foundations of the GRAAL formalism are those of Typed
Feature Structures and of unification-based systems,
extended with a limited set of constraints. The grammars can
thus be `declarative', `reversible' and `modular'. Grammar modularity
is the key concept, allowing the grammar designer to build a `core'
grammar, which can then be modified by extensions which are specific
to an application or a set of applications.
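As an illustration of the unification idea behind such formalisms, the
following is a minimal sketch in Python. It is not the GRAAL formalism:
it assumes untyped feature structures represented as nested
dictionaries, ignores types, constraints and shared (re-entrant)
values, and all names (unify, core_noun, extension) are hypothetical.
It shows how a `core' description might be combined with an
application-specific extension, and how incompatible information makes
unification fail.

```python
def unify(a, b):
    """Unify two feature structures (nested dicts with atomic leaves).

    Returns a new structure holding the information of both inputs,
    or None when the inputs carry conflicting values.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)  # shallow copy; the inputs are never mutated
        for feature, b_value in b.items():
            if feature in result:
                merged = unify(result[feature], b_value)
                if merged is None:
                    return None  # clash somewhere below this feature
                result[feature] = merged
            else:
                result[feature] = b_value
        return result
    # Atomic values (or an atom against a dict) unify only when equal.
    return a if a == b else None


# Hypothetical 'core' description and an application-specific extension.
core_noun = {"cat": "noun", "agr": {"num": "sg"}}
extension = {"agr": {"pers": "3"}, "domain": "aerospace"}

combined = unify(core_noun, extension)
clash = unify(core_noun, {"agr": {"num": "pl"}})
```

Here combined holds the features of both descriptions, while clash is
None because the two structures disagree on the value of num. Because
this operation does not depend on the order of its arguments, it fits
the declarative style described above.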
For dictionaries, GRAAL re-uses the results of the GENELEX EUREKA
project, with regard to both models and lexicographic resources.
2. A considerable part of this project is dedicated to development,
implementation and maintenance tools. Just as there are software
development workbenches, our objective in this project is to implement a
linguistic application development workbench that makes it possible to:
- finalize generic grammars (development, interactive finalizing,
tests, validation, management of releases...),
- implement applications using these grammars (characterisation of
applications, adaptation of generic dictionaries and grammars,
- maintain these applications (quality monitoring, evolution
management, non-regression tests...).
3. Beyond the development of basic tools, GRAAL aims to implement
working pilot applications for the partners who are
also users (AEROSPATIALE, EDF, Nokia, FIAT). The applications to be
implemented include automatic text indexing using various types of
reference files (thesauri, term collections...), knowledge extraction
(support for building terminological bases, thesauri or knowledge
bases...), computer-aided translation and machine translation, in
particular translation of simplified languages.
The planned systems will initially be concerned primarily
with French, English and Italian.