Attempts to create general-purpose natural language processing systems are
frustrated by the problem of world knowledge; no matter how sophisticated grammars
become, interpretation of texts that are unrestricted in form or
subject-matter requires the ability to access and reason over unpredictably
large amounts of non-linguistic information. Nevertheless, many individual
applications permit a much simpler approach, in which the texts to be
handled are of limited complexity (characterizable in terms of a
`sublanguage') and the knowledge needed to interpret them is finitely
The FNSRS is currently funding a project at ISSCO entitled ``Sublanguage
and Semantic Modelling for MT in a Finite Domain'', which investigates
We adopt a pragmatic approach to these objectives, employing a mixture of
rule-based and statistical methods as appropriate, without excluding the
possibility of guidance by a human expert.

The impetus for this research proposal derives from a programme of work
which ISSCO has been following for a number of years. The Swiss Federal
Institute for the Study of Snow and Avalanches (IFENA) requires a system to
translate avalanche warning bulletins from German into French, and
eventually Italian, and ISSCO has been developing a prototype translation
system for avalanche bulletins using the ELU linguistic
- Domain modelling: given a corpus of texts, create a semantic model
(ontology, knowledge base, etc.) representing the objects and events
referred to in the corpus, and relations between them.
- Sublanguage definition: given a corpus of texts, discover the least
complex language sufficient to express their content.
- Grammar construction: given a sublanguage definition and a domain model,
devise a grammar which relates expressions in the sublanguage to
representations interpretable within the domain model.
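
As a rough illustration of how the first two tasks might be approached statistically, the following Python sketch computes two simple indicators over a corpus: the growth of the vocabulary as sentences are added (a flattening curve suggests a closed sublanguage), and the number of distinct sentence patterns that remain once words are replaced by categories from a domain model. This is a minimal sketch under stated assumptions and not the method used in the project: the example sentences, the category names and the function names are all invented for illustration, and a real domain model would be far richer than a word-to-category table.

```python
from collections import Counter


def vocabulary_growth(sentences):
    """Return the cumulative number of distinct word types after each sentence."""
    seen = set()
    growth = []
    for sentence in sentences:
        seen.update(sentence.lower().split())
        growth.append(len(seen))
    return growth


def sentence_patterns(sentences, domain_model):
    """Replace each word known to the domain model by its category, then count patterns."""
    patterns = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        pattern = tuple(domain_model.get(tok, tok) for tok in tokens)
        patterns[pattern] += 1
    return patterns


# Hypothetical toy corpus (invented English sentences, not real bulletin text).
corpus = [
    "avalanche danger is high above 2000 m",
    "avalanche danger is moderate above 1800 m",
    "avalanche danger is low below 1500 m",
]

# Hypothetical hand-written domain model: a flat word-to-category table.
domain_model = {
    "high": "LEVEL", "moderate": "LEVEL", "low": "LEVEL",
    "above": "RELATION", "below": "RELATION",
    "2000": "ALTITUDE", "1800": "ALTITUDE", "1500": "ALTITUDE",
}

growth = vocabulary_growth(corpus)
patterns = sentence_patterns(corpus, domain_model)

print("vocabulary growth:", growth)
for pattern, count in patterns.items():
    print(count, " ".join(pattern))
```

In this toy case the three sentences reduce to a single pattern once the domain categories are substituted, which is the kind of regularity a sublanguage grammar could then be written to cover. On a real corpus the same counts would only be a starting point for inspection by a human expert.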