I work as a researcher and computer manager for ISSCO
Information retrieval

DicoPro is a project funded within the Multilingual Information Society Programme (MLIS), an EU initiative launched by the European Commission's DG XIII. The goals of the DicoPro project encompass three domains:

  • data: the conversion of several dictionaries to electronic form, stored in standard SGML format (XML).
  • tools: the development of a uniform, cross-platform tool to enable translators and other language professionals connected to an intranet to consult dictionaries and related lexical data from multiple sources
  • licensing: the exploration and assessment of various distribution and licensing schemes for electronic dictionaries, whereby usability and convenience for the end user are balanced against the publishers' need to protect their property
Collaborating in DicoPro are a research institute (ISSCO), two dictionary publishers (Hachette, HarperCollins), three translation/documentation companies (Multilingual Technology, L&H Mendez, Xynos), and a language industry consultant (LIM). DicoPro was launched in April, 1998.

A demonstration version of the DicoPro client is available for public download. It requires Java 1.2. Once the DicoPro demo is installed, you can test it with our public Dictionary Information Server.
DicoPro Applet Prototype


Sylex is a prototype for indexing and consulting in parallel, with an Internet browser, the multilingual texts (German, French and Italian) of the systematic collection of Swiss Federal laws.
The indexes use the powerful Multilingual Core Language Technology toolset to refine requests and thus the relevance of the answers. This includes language analysis components such as a segmenter, a morphological analyser and a tagger. Recognition of lexemes and construction of ontologies (semantic information) improve the indexing and thus increase the successful retrieval of documents and information by means of intelligent natural language analysis. Resources currently exist for French, English, German and (partly) Italian.

Demo: (short texts)
Sylex with forms Sylex with Java1.2 interface

Free text Medical Document Retrieval by using terminological resources and statistical linguistics
This project is funded by the Swiss National Science Foundation (Subside no 3200-049832.96). It started in April 1998 and will finish in 2001.

The work is done in collaboration with the Division d'Informatique générale of the Hôpital cantonal universitaire.
demo coming soon

Text Processing/Corpus Analysis/Linguistics
(particularly part-of-speech tagging)
Many Multext applications will require the ability to perform various kinds of analysis on word tokens. For example, in some cases it will be necessary to abstract away from inflectional variation, so that e.g. walk, walks, walking, and walked are all treated as the same word type at the level of textual annotations. Conversely, it will sometimes be desirable to make use of richer information than that available in the raw text, so that e.g. walking can be identified as the present participle of `walk'. In addition, it is easy to envisage a need for flexibility in the triangular relation between word-token, textual annotations and lexical information; a single fixed linguistic analysis cannot fulfil the requirements of diverse text processing tasks. Mmorph provides the means by which lexicons can be constructed and modified, and texts annotated with lexical information.
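
The two directions described above (collapsing inflected forms onto one word type, and enriching a token with lexical information) can be illustrated with a small Python sketch. This is not Mmorph's lexicon format or interface: the LEXICON table, the tag names and both function names are assumptions made only for illustration.

```python
from collections import defaultdict

# Hypothetical toy lexicon: surface form -> list of (lemma, morphological tag).
# The entries and tag names are illustrative, not Mmorph's actual format.
LEXICON = {
    "walk": [("walk", "verb.base")],
    "walks": [("walk", "verb.pres.3sg")],
    "walking": [("walk", "verb.prespart")],
    "walked": [("walk", "verb.past"), ("walk", "verb.pastpart")],
}


def annotate(tokens):
    """Attach every known lexical analysis to each token (empty list if unknown)."""
    return [(tok, LEXICON.get(tok.lower(), [])) for tok in tokens]


def group_by_lemma(tokens):
    """Abstract away from inflection: group surface forms under their lemma."""
    groups = defaultdict(set)
    for tok, analyses in annotate(tokens):
        for lemma, _tag in analyses:
            groups[lemma].add(tok.lower())
    return dict(groups)
```

Calling annotate(["walking"]) returns the richer view (the token with its present-participle analysis), while group_by_lemma treats all four forms as the single word type walk. Unknown tokens simply receive no analysis.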
POS tagging

A set of tools for multilingual part-of-speech tagging based on a Hidden Markov Model.
Basic technology that has proven useful for monolingual processing tasks is adapted and extended to accommodate a range of natural languages. Emphasis is placed on facilities for experimenting with different tagsets and for helping the user evaluate and modify the results. Aside from the tagger, the tools include modules to prepare the text for training and tagging, define new tagsets or modify existing ones, declare linguistic preferences, and train or retrain with hand-annotated data, as well as facilities to compare results with hand-corrected data. These tools provide a flexible environment for experimenting with a range of tagsets in different languages.
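
As a rough illustration of how a Hidden Markov Model assigns tags, the sketch below runs Viterbi decoding over a bigram HMM in Python. It is not the code of the tagger described here: the function name, the tiny tagset and all probabilities are invented for the example, and unseen events are given a small floor probability as a simplifying assumption.

```python
import math


def viterbi(words, tags, start_p, trans_p, emit_p, floor=1e-12):
    """Return the most probable tag sequence for `words` under a bigram HMM."""
    if not words:
        return []

    def lp(p):
        # Log-probability with a floor so unseen events do not give log(0).
        return math.log(max(p, floor))

    # best[i][t] = (log-prob of the best path ending in tag t at position i, previous tag)
    best = [{t: (lp(start_p.get(t, 0)) + lp(emit_p[t].get(words[0], 0)), None)
             for t in tags}]
    for i in range(1, len(words)):
        column = {}
        for t in tags:
            prev, score = max(
                ((p, best[i - 1][p][0] + lp(trans_p[p].get(t, 0))) for p in tags),
                key=lambda item: item[1],
            )
            column[t] = (score + lp(emit_p[t].get(words[i], 0)), prev)
        best.append(column)

    # Follow the back-pointers from the best final tag.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for i in range(len(words) - 1, 0, -1):
        tag = best[i][tag][1]
        path.append(tag)
    return path[::-1]


# Hypothetical toy model; a real tagger estimates these tables from training data.
TAGS = ["DET", "NOUN", "VERB"]
START = {"DET": 0.8, "NOUN": 0.1, "VERB": 0.1}
TRANS = {
    "DET": {"NOUN": 0.9, "VERB": 0.05, "DET": 0.05},
    "NOUN": {"VERB": 0.8, "NOUN": 0.1, "DET": 0.1},
    "VERB": {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1},
}
EMIT = {
    "DET": {"the": 0.9},
    "NOUN": {"dog": 0.5, "walks": 0.1},
    "VERB": {"walks": 0.6},
}

print(viterbi(["the", "dog", "walks"], TAGS, START, TRANS, EMIT))
```

Changing the tagset in this sketch only means supplying different tables, which is the kind of experimentation with tagsets that the tools above are meant to support.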

The Tagger (version 3.00)

Thesis of the Postgrade Course in Gestion moderne des documents électroniques
Amélioration des recherches Full-Text sur base de connaissances morphologiques (in French)

Try the BCP Align
BCPVIEW allows various types of searches in aligned text pairs. First the texts are processed by an alignment program that produces files containing 'index' information. The interactive lookup program then uses these files to output a region of the first text (source text) and the corresponding region of the second text (target text), one or both of which satisfy a search criterion.
The alignment program is based on statistical methods (length of paragraphs and sentences) (Church and Gale, 1991) applied to parallel texts. The texts are aligned and indexed on a sentence-by-sentence basis.
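
To give an idea of what length-based alignment involves, here is a simplified Python sketch using dynamic programming over sentence lengths. It is not the BCP alignment program, and it replaces the probabilistic cost of the cited method with a plain absolute length difference; the function name and the two penalty values are assumptions chosen for illustration.

```python
def align_by_length(src, tgt, skip_penalty=50, merge_penalty=5):
    """Align two lists of sentences using only their character lengths.

    Returns a list of (source_indices, target_indices) pairs.
    Allowed pairings: 1-1, 1-0, 0-1, 2-1 and 1-2.
    """
    n, m = len(src), len(tgt)
    sl = [len(s) for s in src]
    tl = [len(t) for t in tgt]
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0
    # (source sentences consumed, target sentences consumed, fixed penalty)
    moves = [(1, 1, 0), (1, 0, skip_penalty), (0, 1, skip_penalty),
             (2, 1, merge_penalty), (1, 2, merge_penalty)]
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for di, dj, penalty in moves:
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                # Simplified cost: absolute difference of the grouped lengths.
                c = cost[i][j] + abs(sum(sl[i:ni]) - sum(tl[j:nj])) + penalty
                if c < cost[ni][nj]:
                    cost[ni][nj] = c
                    back[ni][nj] = (i, j)
    # Walk the back-pointers from the end to recover the pairings.
    pairs = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        pairs.append((list(range(pi, i)), list(range(pj, j))))
        i, j = pi, pj
    return pairs[::-1]
```

With three source sentences of 20, 30 and 10 characters and four target sentences of 21, 14, 15 and 11 characters, the sketch pairs the second source sentence with the second and third target sentences, since their combined length is the closest match.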
This tool enables a lexicographer to quickly find all sentences in the source text and in the target text where a term S of the source text is (or is not) translated as a term T in the target text.
Different types of term specifications are provided, such as exact matches, relaxed matches, prefix matches and regular expression matches. Searches can combine terms with the operators 'and' and 'andnot', to access sentence pairs where terms S and T both occur, or sentence pairs where S appears and T does not.
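
The search over aligned sentence pairs can be sketched in Python as follows. This is not BCPVIEW's query syntax or implementation: the function names are hypothetical, and 'relaxed' is assumed here to mean case-insensitive matching, which the description above does not specify.

```python
import re


def make_matcher(term, mode="exact"):
    """Build a predicate that tests whether a sentence contains `term`."""
    if mode == "exact":
        pattern = re.compile(r"\b" + re.escape(term) + r"\b")
    elif mode == "relaxed":
        # Assumption: 'relaxed' is taken to mean case-insensitive.
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    elif mode == "prefix":
        pattern = re.compile(r"\b" + re.escape(term) + r"\w*")
    elif mode == "regex":
        pattern = re.compile(term)
    else:
        raise ValueError("unknown match mode: " + mode)
    return lambda sentence: pattern.search(sentence) is not None


def search(pairs, src_match, tgt_match, op="and"):
    """Return the aligned pairs where S occurs and T occurs ('and') or does not ('andnot')."""
    if op not in ("and", "andnot"):
        raise ValueError("unknown operator: " + op)
    hits = []
    for source, target in pairs:
        in_target = tgt_match(target)
        if src_match(source) and (in_target if op == "and" else not in_target):
            hits.append((source, target))
    return hits


# Hypothetical aligned sentence pairs, invented for the example.
PAIRS = [
    ("The house is big.", "La maison est grande."),
    ("He went home.", "Il est rentré à la maison."),
    ("The house burned.", "Le bâtiment a brûlé."),
]
```

For instance, search(PAIRS, make_matcher("house"), make_matcher("maison"), "andnot") returns only the third pair, which is the kind of query a lexicographer would use to find places where "house" was not translated as "maison".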

Internet Courses

What is the Internet?
How to get connected?
How to look for and save useful information?
What commercial software is available?
How to use e-mail?
What resources are available on the net?
How to create personal, advertising or commercial pages?


WCalendar (V1.0) is a simple but useful tool for consulting a calendar and adding user events to it using a Web browser.

Unix Backup Tool (V1.3) The Unix Backup Tool (UBT) is written in Tcl/Tk and uses Expect to control interactive commands such as dump. It automatically builds the list of file systems to back up from template descriptions, and provides a convenient interface for changing the options and following the backup progress.

