ISSCO/TIM/ETI, Université de Genève

 

 

 

 

 

 

Resources, Tools and Projects for Multimodal Dialogue Understanding and Management: a Web-based Review

 

 

 

 

 

 

Andrei Popescu-Belis

 

 

 

Report IM2.MDM-05    June 2003

 

Plan of the report

  1. Introduction
  2. Conferences, tutorials and courses
  3. Annotation guidelines and formats
  4. Software tools, mainly for transcription and annotation
  5. Corpora
  6. Project descriptions

 

*          *          *          *          *

 

1.    Introduction

 

This report gathers resources related to multimodal dialogue management. The resources include conference material, courses, tutorials, project descriptions, annotation guidelines, data, and software tools. Each resource is described by the following fields: title, URL, type (organization, tool, corpus, etc.), dates (start, end, last release, etc.), abstract, and sometimes references. This document is complemented by report IM2-MDM-05, which contains bibliographic references, many of them related to the present entries.

The resources are grouped by organisational categories rather than by theme: the main theme is described in the report's title (multimodal dialogue understanding and management). Quite often the items in our review cover several subtopics.
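
The field layout described above can be sketched as a small record type with a reader for the plain-text form of the entries. This is only an illustrative sketch: the names `ResourceEntry` and `parse_entries` are hypothetical, and the code assumes that each field label (Title, URL, Type, Dates, Abstract, REF) stands alone on its own line and that a line of asterisks separates sections.

```python
from dataclasses import dataclass, field

# Field labels used in this report; each is assumed to sit alone on a line.
FIELD_LABELS = ("Title", "URL", "Type", "Dates", "Abstract", "REF")


@dataclass
class ResourceEntry:
    """One catalogue entry; each field holds the non-empty lines under its label."""
    fields: dict = field(default_factory=dict)

    def get(self, label):
        return self.fields.get(label, [])


def parse_entries(text):
    """Split plain text into entries; every 'Title' label starts a new entry."""
    entries, current, label = [], None, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("*"):
            # Section separator: stop collecting lines for the previous field.
            label = None
            continue
        if line in FIELD_LABELS:
            if line == "Title":
                current = ResourceEntry()
                entries.append(current)
            label = line
            if current is not None:
                current.fields.setdefault(label, [])
        elif current is not None and label is not None:
            current.fields[label].append(line)
    return entries


sample = """Title

MATE Workbench

URL

http://www.cogsci.ed.ac.uk/~dmck/MateCode/

Type

Annotation software
"""
entries = parse_entries(sample)
```

Fields that are absent from an entry (here Dates and Abstract) come back as empty lists, which matches the "sometimes references" wording of the field description.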

 

*          *          *          *          *

 

2. Conferences, tutorials and courses

 

Title

ESSLLI 2001 Course on Dialog Annotation

URL

http://purl.org/net/gor/esslli01/

Type

Course material

Dates

Summer 2001

Abstract

Languages for the Annotation and Specification of Dialogues. Documents and resources of the course, by Gregor Erbach.

 

 

Title

ELSNET Summer School 2001 – Courses on Text and Speech Corpora

URL

http://ufal.ms.mff.cuni.cz/~ess2001/down.html

Type

Course material

Dates

Summer 2001

Abstract

Steven Bird, Annotation graphs in theory and practice

Lou Burnard, Encoding of language corpora: principles and practice

Henk van den Heuvel and Eric Sanders, Validation of Speech Databases (including transcription)

Amy Isard, Ole Bernsen and Laila Dybkjaer, Dialogue corpora (MATE)

Uli Tuerk, Multimodal Corpora: Design, Technical Aspects and Handling

Chalapathy Neti, Gerasimos Potamianos and Giri Iyengar, Joint audio-visual speech processing and resources

Geoffrey Sampson, Annotation at the grammatical level

Esther Grabe, Prosodic annotation

Jan Hajic, Linguistic Annotation of a Large Corpus: From Morphology to Syntax

 

 

Title

CMU Graduate Seminar on Spoken Dialog Processing

URL

http://fife.speech.cs.cmu.edu/Courses/11716/
http://www-2.cs.cmu.edu/~aliceo/dialogpapers.html
http://www-2.cs.cmu.edu/~aliceo/dialoglinks.html

Type

Course material, papers and hyperlinks

Dates

2000

Abstract

Carnegie Mellon University. A seminar by Alex Rudnicky, valuable for its list of readings, which includes online versions of most of the articles. See also another site with papers for this seminar, as well as a comprehensive list of websites (with papers), both by Alice Oh.

 

 

Title

SIGdial – ACL Special Interest Group on Dialogue

URL

http://www.sigdial.org

Type

Organization

Dates

Permanent

Abstract

SIGdial, among other activities, coordinates a library of discourse processing development tools, software, data, bibliographic references, publications, and links to related web pages. See especially their list of resources.

 

 

Title

SIGmedia – ACL Special Interest Group on Multimedia Language Processing

URL

http://www.sigmedia.org

Type

Organization

Dates

Permanent (since 1992)

Abstract

SIGMEDIA promotes research which focuses on the use of natural language within the context of a multimedia environment. This includes: (1) the interpretation of multimodal input, (2) the generation of multimodal output, (3) multimedia information retrieval and (4) anthropomorphic conversational agents.

 

 

Title

SIGdial Workshops on Discourse and Dialogue

URL

http://www.sigdial.org/events/events.php?page=4

Type

Workshops, resources

Dates

2000-2003, ongoing

Abstract

2nd: Aalborg, Denmark, September 1-2, 2001 (just before Eurospeech 2001):
http://www.sigdial.org/sigdialworkshop01/

3rd: Philadelphia, PA, USA, July 11-12, 2002:
http://www.mlab.uiah.fi/~kjokinen/sigdial/

4th: Sapporo, Japan, July 5-6, 2003 (in conjunction with ACL 2003):
http://www.speech.cs.cmu.edu/sigdial2003/

 

 

Title

Series of Workshops on the Semantics and Pragmatics of Dialogue

URL

http://www.cis.uni-muenchen.de/sil/workshop/dialogwsh.html (MunDial'97)
http://parlevink.cs.utwente.nl/Conferences/twlt13.html (Twendial'98)
http://earth.let.uva.nl/~amstelog/ (Amstelogue'99)
http://www.ling.gu.se/gotalog (Gotalog'00)
http://www.uni-bielefeld.de/BIDIALOG (Bidialog'01)
http://www.ltg.ed.ac.uk/edilog/ (EDILOG 02)

Type

Workshops, resources

Dates

1997–2002

Abstract

EDILOG 2002, 6th Workshop on the Semantics and Pragmatics of Dialogue, University of Edinburgh, Scotland, September 4-6, 2002.

Previous workshops in this series include: MunDial'97 (Munich), Twendial'98 (Twente), Amstelogue'99 (Amsterdam), Gotalog'00 (Gothenburg), Bidialog'01 (Bielefeld).

 

 

Title

International CLASS Workshop on Natural, Intelligent and Effective Interaction in Multimodal Dialogue Systems

URL

http://www.class-tech.org/events/NMI_workshop2.html
http://www.class-tech.org/events/NMI_workshop2/papers/

Type

Workshop, resources

Dates

June 28-29, 2002

Abstract

Copenhagen, Denmark

 

 

Title

Dagstuhl Seminar 2001: Coordination and Fusion in Multimodal Interaction

URL

http://www.dfki.de/~wahlster/Dagstuhl_Multi_Modality/

Type

Research seminar, documentation (slides)

Dates

2001

Abstract

Slides and conclusions of an important four-day research seminar. Slides are available for the following Working Groups: (1) Research Roadmap of Multimodality, (2) Data Collection and Multimodal Annotation Tools, (3) Software Architectures for Multimodal Systems, (4) Multimodal Meaning Representation.

 

 

Title

NIST Automatic Meeting Transcription Data Collection and Annotation Workshop

URL

http://www.nist.gov/speech/test_beds/mr_proj/

Type

Presentations/slides

Dates

November 2, 2001

Abstract

Initial informal workshop held at NIST to explore a collaboration among sites collecting meeting room corpora. The workshop addressed issues in data collection and annotation approaches, data sharing, common annotation standards and tools, and distribution of corpora. Great enthusiasm was expressed by all of the sites to create a collaborative project where data could be shared via a set of common standards. The following presentation slides in PDF are available: "NIST Automatic Meeting Transcription Project", "Meeting Data Collection at CMU/ISL", "The Meeting Recorder Project at ICSI", "MITRE's Work Relevant to Meeting Room Data Collection/Annotation", "LDC Meeting Transcription, Parameters and Progress", "Univ. of Washington's Meeting Data Collection Effort".

 

*          *          *          *          *

 

3. Annotation guidelines and formats

 

Title

MATE Supported Coding Schemes (deliverable D1.1)

URL

http://mate.nis.sdu.dk/about/D1.1/
http://www.dfki.de/mate/d11/annex.html

Type

Project Report

Dates

July 1998

Abstract

See especially the ANNEX, which provides a list of all the schemes, by domain. The most relevant to IM2.MDM are the Dialog Act coding and the Coreference coding.

1/ Communication Problems: Bethan L. Davies' Coding Scheme, Chat, Odense University Scheme, SWBD-DAMSL.

2/ Coreference Schemes: Bruneseaux and Romary, DRAMA, MUC-7 Coreference Task, Poesio and Vieira (1,2), UCREL Anaphoric Annotation, *-U Line.

3/ Dialogue Act Schemes: Alparon, Chiba Coding Scheme, Chat, Coconut, Condon and Cech's Coding Scheme, C-Star, DAMSL, Janus, Flammia's Coding Scheme, LinLin, Maptask, Nakatani et al.'s Coding Scheme, SLSA, SWBD-DAMSL, Traum's Coding Scheme, Verbmobil.

4/ Morpho-Syntactical Schemes.          

5/ Prosody Schemes: PROSPA, IPA, TEI, ToBI, SAMPA, SAMPROSA, INTSINT, SAMSINT, IPO, TSM, TILT Model, Verbmobil, KIM, PROZODIAG, Göteborg.

6/ Cross-level Schemes: BNC, Bonn Focus Research, CHILDES, DAMSL, Kiel Corpus Format, Partitur Format at BAS, SABLE, SAM Standards, TEI, TRAINS Dialogue Corpus, Verbmobil II Conventions for Spontaneous Speech.

 

 

Title

MATE Dialogue Annotation Guidelines (deliverable D2.1)

URL

http://www.ims.uni-stuttgart.de/projekte/mate/mdag/
http://www.andreasmengel.de/pubs/mdag.pdf

Type

Annotation scheme

Dates

January 2000

Abstract

The annotation scheme proposed in the MATE project deals with the following phenomena or "levels": prosody, morpho-syntax, dialogue acts, coreference, communication problems. A cross-level analysis is also provided. The MATE workbench uses this annotation scheme.

REF

A. Mengel, L. Dybkjaer, J.M. Garrido, U. Heid, M. Klein, V. Pirrelli, M. Poesio, S. Quazza, A. Schiffrin & C. Soria 2000, MATE Dialogue Annotation Guidelines, Deliverable 2.1 of the MATE Project LE4-8370, 8 Jan 2000.

 

 

Title

MATE Workbench

URL

http://www.cogsci.ed.ac.uk/~dmck/MateCode/

Type

Annotation software

Dates

Version 0.18, 26 June 2001, demo

Abstract

"Early demo version" of the MATE annotation tool for spoken dialogue transcription. See separate entries for the MATE project and its guidelines.

 

 

Title

AGTK: Annotation Graphs Toolkit

URL

http://agtk.sourceforge.net/

Type

Software

Dates

2000

Abstract

Annotation Graphs are a formal framework for representing linguistic annotations of time series data. Annotation Graphs abstract away from file formats, coding schemes and user interfaces, providing a logical layer for annotation systems.

The package is provided with several annotation tools, such as TableTrans and MultiTrans.
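
As a rough illustration of the idea, an annotation graph can be modelled as a set of nodes, some of them anchored to time offsets in the signal, connected by labelled arcs that carry the annotations. The sketch below is not the AGTK API: the names `AnnotationGraph`, `Arc`, `add_node`, `add_arc`, `arcs_of_kind` and `span`, as well as the sample words and offsets, are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Arc:
    start: str   # id of the source node
    end: str     # id of the target node
    kind: str    # annotation layer, e.g. "word" or "speaker"
    label: str   # annotation content


class AnnotationGraph:
    def __init__(self):
        self.nodes = {}   # node id -> time offset in seconds, or None if unanchored
        self.arcs = []

    def add_node(self, node_id, offset: Optional[float] = None):
        self.nodes[node_id] = offset

    def add_arc(self, start, end, kind, label):
        if start not in self.nodes or end not in self.nodes:
            raise KeyError("both end points must be existing nodes")
        self.arcs.append(Arc(start, end, kind, label))

    def arcs_of_kind(self, kind):
        return [a for a in self.arcs if a.kind == kind]

    def span(self, arc):
        """Return the (start, end) offsets of an arc; None marks an unanchored node."""
        return (self.nodes[arc.start], self.nodes[arc.end])


g = AnnotationGraph()
g.add_node("n0", 0.00)
g.add_node("n1")            # word boundary without a known time offset
g.add_node("n2", 0.85)
g.add_arc("n0", "n1", "word", "good")
g.add_arc("n1", "n2", "word", "morning")
g.add_arc("n0", "n2", "speaker", "A")
```

Because several layers (here words and speaker turns) share the same nodes, the structure stays independent of any particular file format or coding scheme.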

REF

Bird Steven and Mark Liberman 2001, "A formal framework for linguistic annotation", Speech Communication, 33, 1-2, p. 23-60.
http://arxiv.org/abs/cs/0010033

 

 

Title

ATLAS: Architecture and Tools for Linguistic Analysis Systems

URL

http://www.nist.gov/speech/atlas/

Type

Annotation Framework: Guidelines, Software

Dates

Started 2000, updated 2003

Abstract

ATLAS (an initiative involving NIST, LDC and MITRE) is aimed at corpus construction, evaluation infrastructure, and multi-modal visualization, with a focus on the development of linguistic applications. The main goal is to provide an abstraction over the diversity of linguistic annotations (rooted in the 'Annotation Graphs').

ATLAS is made of four main components: an annotation ontology, an Application Programming Interface, an interchange format for linguistic data and a type definition infrastructure.

NIST has created a Java instantiation of the data model and provides an Application Programming Interface (jATLAS). Linguistic data expressed using ATLAS can be serialized to XML using the ATLAS Interchange Format (AIF).

REF

Bird Steven, David Day, John Garofolo, John Henderson, Christophe Laprun and Mark Liberman 2000, "ATLAS: A flexible and extensible architecture for linguistic annotation", Proceedings LREC 2000 (Second International Conference on Language Resources and Evaluation), Athens, Greece, volume III/III, p. 1699-1706.
http://arxiv.org/abs/cs/0007022

 

 

Title

Multiparty Discourse Group / DAMSL

URL

http://www.cs.rochester.edu/research/cisd/projects/

Type

Organization

Dates

Up to 2001 (?)

Abstract

The Multiparty Discourse Group, a multi-site group of which the University of Rochester is a member, is part of the Discourse Resource Initiative (DRI). The goal of the Multiparty Discourse Group is to devise a common high-level framework of dialog acts. The group also provides the DAT annotation tool.

 

 

Title

Meeting Recorder Annotation Guidelines for Dialog (ICSI, Berkeley)

URL

http://www.icsi.berkeley.edu/Speech/mr/mtgrcdrtrans.html (prosody, etc.)
http://www.icsi.berkeley.edu/Speech/mr/docs/Draft5.pdf (dialogue acts)

Type

Annotation Guidelines

Dates

August 2002

Abstract

These guidelines are based on DAMSL for the annotation of dialogue acts and were adapted to multiparty dialogues in meetings.

 

 

Title

Verbmobil Dialogue Act Annotation

URL

http://coral.lili.uni-bielefeld.de/~vmobil/vm-anno/vm-annotations.html
http://www.dfki.de/cgi-bin/verbmobil/htbin/decode.cgi/share/VM-depot/FTP-SERVER/vm-reports/report-226-98.ps.gz
http://www.dfki.de/~kipp/

Type

Annotation guidelines

Dates

2000

Abstract

The first URL describes “Annotation Schemata in Verbmobil Phase II”.

The second URL is a report on “Dialogue Acts in VERBMOBIL-2”, which defines the second edition of dialogue acts, not only for appointment scheduling dialogues, but also for travel planning in general.

The third URL is the web page of one of the main contributors to the dialog act annotation, Michael Kipp (see his publications).

See also the entry for Verbmobil under 'Projects'.

 

 

Title

EXMARaLDA

URL

http://www.rrz.uni-hamburg.de/exmaralda/index.html

Type

Annotation Guidelines

Dates

Version 1.2.4, May 2003

Abstract

XML-based system for transcribing and annotating spoken discourse on the computer. The acronym means "EXtensible MARkup Language for Discourse Annotation". The long term goal of the project is the implementation of a multilingual database of spoken discourse. The DTDs and some Java tools are available for download.

REF

Thomas Schmidt 2001, "The transcription system EXMARaLDA: An application of the annotation graph formalism as the Basis of a Database of Multilingual Spoken Discourse", Proceedings of IRCS Workshop on Linguistic Databases, Philadelphia, PA, USA, pp. 219-227.

 

 

*          *          *          *          *

 

4. Software tools, mainly for transcription and annotation

 

Title

LDC – Linguistic Annotation

URL

http://www.ldc.upenn.edu/annotation/

Type

Repertoire of tools

Dates

Last updated: December 2001

Abstract

This web page describes tools and formats for creating and managing linguistic annotations. The focus is on tools which have been widely used for constructing annotated linguistic databases, and on the formats commonly adopted by such tools and databases.

The basic data may be in the form of time functions (audio, video and/or physiological recordings) or it may be textual. The added notations may include transcriptions of all sorts (from phonetic features to discourse structures), part-of-speech and sense tagging, syntactic analysis, 'named entity' identification, co-reference annotation, and so on.

 

 

Title

Transcriber, by ETCA/DGA (France) and LDC (USA)

URL

http://www.etca.fr/CTA/gip/Projets/Transcriber/

Type

Software, transcription/annotation tool

Dates

Latest version: 1.4.5, July 2002.

Abstract

Transcriber assists the manual annotation of speech signals. It provides a user-friendly graphical interface for segmenting long duration speech recordings, transcribing them, and labelling speech turns, topic changes and acoustic conditions. It works on various Unix systems (Linux, Sun Solaris, Silicon Graphics) and Windows NT, and is freely distributed under the GNU GPL. The data is stored in a transparent XML format.

 

 

Title

SoundScriber

URL

http://www.lsa.umich.edu/eli/micase/soundscriber.html

Type

Software, transcription tool

Dates

1998

Abstract

SoundScriber is a very simple freeware program for Windows 95. Besides normal playback features, it offers keystrokes to control the program while working in another window, variable speed playback, and "walking" (plays a small stretch of the file several times, then advances to a new piece, overlapping slightly with the previous one, such that it is possible to transcribe continuously without having to manually pause or rewind).
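
The "walking" playback can be pictured as a schedule of overlapping stretches, each played several times before moving on. The following sketch only illustrates that principle and is not taken from SoundScriber: the function name `walk_schedule` and the default window length, overlap and repetition count are assumptions.

```python
def walk_schedule(duration, window=4.0, overlap=1.0, repeats=3):
    """Return the (start, end) stretches, in seconds, in playback order.

    Each stretch lasts `window` seconds, is played `repeats` times, and the
    next stretch starts `overlap` seconds before the previous one ended.
    """
    if not 0 <= overlap < window:
        raise ValueError("overlap must be non-negative and smaller than window")
    schedule = []
    start = 0.0
    while start < duration:
        end = min(start + window, duration)
        schedule.extend([(start, end)] * repeats)
        if end >= duration:
            break          # the last stretch reached the end of the recording
        start = end - overlap
    return schedule


plan = walk_schedule(10.0, window=4.0, overlap=1.0, repeats=2)
```

With these hypothetical settings a ten-second recording is covered by three stretches (0-4 s, 3-7 s, 6-10 s), each played twice.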

 

 

Title

VoiceWalker / SoundWriter

URL

http://www.linguistics.ucsb.edu/resources/computing/download/download.htm

Type

Software (playback for transcription)

Dates

Version 2.0, April 1999

Abstract

VoiceWalker 1.1 is basically a glorified digital tape deck. It can step (or "walk") through a recording, repeating short overlapping segments for a specified number of repetitions, then moving on to the next segment. Works with WAV files, Windows AVI files and Quicktime MOV files. Windows 95/98. SoundWriter incorporates the ability to align transcripts with sound files.

 

 

Title

Anvil: A Tool for Annotation of Video and Spoken Language

URL

http://www.dfki.de/~kipp/anvil/

Type

Software

Dates

Version 4.0, April 2003

Abstract

Anvil is a generic video annotation tool. Originally developed for gesture research, it has also proved suitable for research in the fields of Linguistics, Ethology, Anthropology, Psychotherapy, Embodied Agents, Computer Animation and Human-Computer Interaction (HCI).

It offers frame-accurate, hierarchical multi-layered annotation with objects that contain attribute-value pairs. Layers and attributes are all user-defined, so that Anvil can accommodate arbitrary annotation schemes. Available from the author by email.

 

 

Title

TASX-annotator – Time Aligned Signal data eXchange

URL

http://tasxforce.lili.uni-bielefeld.de/

Type

Software

Dates

Latest version: June 2002

Abstract

The TASX-annotator is a central component of the TASX-environment, which allows the annotation and transcription of video (multi-channel) and audio data.

Needs Java 2, Java Media Framework 2.1.1, saxon, xerces, PerlTools, jexmaralda (see above).

 

 

Title

MMAX: A Tool for Multi-Modal Annotation in XML

URL

http://www.eml.villa-bosch.de/english/Research/NLP/Downloads/

Type

Software

Dates

Version 0.94, April 2003

Abstract

This tool allows the annotation of all kinds of linguistic data which consist of markables, attributes assigned to these markables and relations between them. The tool was used to annotate bridging relations ('associative anaphora') in a corpus, to develop and evaluate bridging resolution systems, etc.

 

 

Title

DAT annotation tool

URL

http://www.cs.rochester.edu/research/cisd/resources/damsl/

Type

Software

Dates

Version 1.10, 1998

Abstract

The goal of the Multiparty Discourse Group of the DRI (see these entries) is to devise a common high-level framework of dialog acts. The group, based at the University of Rochester, provides the DAT (dialog acts tagger) annotation tool.

 

 

Title

GATE: General Architecture for Text Engineering

URL

http://gate.ac.uk/

Type

Software

Dates

Version 2.1, February 2003

Abstract

GATE provides a TIPSTER-compliant framework to combine text processing modules in order to perform various tasks. GATE can be freely downloaded for research purposes and is supplied with an Information Extraction system.

 

 

Title

TRINDIKIT

URL

http://www.ling.gu.se/projekt/trindi/

Type

Software

Dates

2000

Abstract

The TRINDI project (see separate entry) provides this toolkit for building and experimenting with dialogue move engines and information states. The toolkit specifies formats for defining information states, update rules, dialogue moves, as well as associated algorithms.
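
The interplay of information states, update rules and dialogue moves can be illustrated with a very small engine. This sketch is not TRINDIKIT code and does not follow its formats: the names `UpdateRule` and `apply_rules`, the dictionary layout of the information state, and the two sample rules are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class UpdateRule:
    name: str
    precondition: Callable[[dict], bool]   # tests the information state
    effect: Callable[[dict], None]         # modifies the information state


def apply_rules(state, rules, max_steps=100):
    """Repeatedly apply the first applicable rule until none applies."""
    fired = []
    for _ in range(max_steps):
        rule = next((r for r in rules if r.precondition(state)), None)
        if rule is None:
            break
        rule.effect(state)
        fired.append(rule.name)
    return fired


# Hypothetical information state: latest dialogue move, questions under
# discussion (qud), and the system's agenda of planned moves.
state = {"latest_move": ("ask", "destination?"), "qud": [], "agenda": []}


def integrate_ask(s):
    """Push the asked question onto the qud and consume the move."""
    _, question = s["latest_move"]
    s["qud"].append(question)
    s["latest_move"] = None


rules = [
    UpdateRule(
        "integrate_ask",
        lambda s: s["latest_move"] is not None and s["latest_move"][0] == "ask",
        integrate_ask,
    ),
    UpdateRule(
        "plan_answer",
        lambda s: bool(s["qud"]) and ("answer", s["qud"][-1]) not in s["agenda"],
        lambda s: s["agenda"].append(("answer", s["qud"][-1])),
    ),
]

fired = apply_rules(state, rules)
```

Here the user's "ask" move is first integrated into the state, after which a second rule plans an answer; the engine stops when no precondition holds any longer.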

 

*          *          *          *          *

 

5. Corpora

 

Title

SIGdial – ACL Special Interest Group on Dialogue

URL

http://www.iet.com/Projects/sigdial/info.html
http://www.iet.com/Projects/sigdial/resources.html

Type

Organization

Dates

Permanent

Abstract

SIGdial, among other activities, coordinates a library of discourse processing development tools, software, data, bibliographic references, publications, and links to related web pages. See especially their list of resources.

 

 

Title

Switchboard Corpus

URL

http://www.isip.msstate.edu/projects/switchboard/
http://www.colorado.edu/ling/faculty/jurafsky/ws97.tar.gz

Type

Corpus

Dates

Released August 1997; updated 2002, 2003 (minor fixes).

Abstract

Corpus of two-party telephone conversations, transcribed. Part of the corpus is available for download from the Discourse Language Modeling Project at the second URL.

 

 

Title

HCRC Map Task Corpus and XML Annotations

URL

http://www.hcrc.ed.ac.uk/dialogue/maptask.html (presentation)
http://www.hcrc.ed.ac.uk/maptask/ (data and tools)

Type

Corpus, tools

Dates

1992-2001

Abstract

The Map Task is a cooperative task involving two participants sitting opposite one another, each one having a map which the other cannot see. One speaker instructs the other to reproduce a route that is drawn on the instructor's map (the maps are not identical... but the participants are aware of that).

The HCRC corpus (8 CD-ROMs, raw audio files) consists of 128 digitally recorded unscripted dialogues (plus readings of landmark names). All dialogues are transcribed verbatim in standard orthography, including filled pauses, false starts, hesitations, repetitions and interruptions. The structure of the dialogue is annotated in XML. The files are available for download from the second URL above.

 

*          *          *          *          *

 

6. Project descriptions

 

Title

Meeting Recorder Project at ICSI, Berkeley

URL

http://www.icsi.berkeley.edu/Speech/mr/index.html

Type

Project

Dates

ongoing

Abstract

The project aims at investigating speech recognition for meetings. ICSI collects data using a meeting room equipped with a multichannel, studio-quality recording system. As of February 2001, ICSI had already collected 40 hours of 16-channel pilot data. Ten hours were hand-transcribed using conventions designed by Jane Edwards at ICSI. As for the goals and applications, while the basic idea is to develop recognition systems that are able to transcribe conventional meetings, more useful applications include search for particular information or production of automatic summaries. ICSI is also associated with the IM2 project.

REF

Morgan, N., Baron, D., Bhagat, S., Carvey, H., Dhillon, R., Edwards, J. A., Gelbart, D., Janin, A., Krupski, A., Peskin, B., Pfau, T., Shriberg, E., Stolcke, A., and Wooters, C. "Meetings about meetings: research at ICSI on speech in multiparty conversations." ICASSP 2003 (International Conference on Acoustics, Speech, and Signal Processing), Hong Kong, China.

 

 

Title

ISL's Meeting Browser

URL

http://www.is.cs.cmu.edu/meeting_room/

Type

Project

Dates

publications: 1999-2002

Abstract

This project at the Interactive Systems Laboratories (Carnegie Mellon University) develops a system for browsing a meeting. By automatically transcribing a meeting, the ISL Meeting Browser can answer many of the questions an absent individual may ask: the transcription can be searched for keywords and topics; emotion recognition provides clues as to the meeting's mood; summarization offers a quick overview; highlighted action items show assignments; and video playback is available. The project has collected about 100 hours of meeting recordings.

REF

Waibel, A., Bett, M., Metze, F., Ries, K., Schaaf, T., Schultz, T., Soltau, H., Yu, H., and Zechner, K. "Advances in Automatic Meeting Record Creation and Access." ICASSP 2001 (International Conference on Acoustics, Speech and Signal Processing), Salt Lake City, UT, USA.

 

 

Title

Meeting Tracking and Recognition (Genoa)

URL

http://www.ri.cmu.edu/projects/project_286.html

Type

Project

Dates

ended 2002

Abstract

At Carnegie Mellon University, Robotics Institute. Seems to have preceded the Meeting Browser Project. Although the web page signals that GENOA is terminated, it includes a list of publications, some of which are related to the ISL Meeting Browser (see above).

 

 

Title

Automatic Meeting Transcription Project at NIST

URL

http://www.nist.gov/speech/test_beds/mr_proj/
http://www.nist.gov/speech/test_beds/mr_proj/data_collection/scenarios.html

Type

Project

Dates

2001

Abstract

This project (volunteers' meeting in November 2001) provides a development and evaluation infrastructure for speech transcription in meetings. The infrastructure includes rich transcription and annotation conventions, a corpus of audio and video from meetings collected at NIST using a variety of microphones and video cameras, new evaluation protocols, metrics and software, as well as the sponsoring of workshops. Recent developments in this project could not be found.

 

 

Title

SmartKom Project at DFKI

URL

http://smartkom.dfki.de/start_en.html

Type

Project Description

Dates

1999-present

Abstract

Dialog-based Human-Technology Interaction by Coordinated Analysis and Generation of Multiple Modalities.

 

 

Title

M4

URL

http://www.dcs.shef.ac.uk/spandh/projects/m4/

Type

Project

Dates

Started March 1st, 2002

Abstract

This European IST project aims at building a demonstration system to enable structuring, browsing and querying of an archive of automatically analysed meetings (recorded in a room equipped with multimodal sensors). The program of work is divided into: Smart Meeting Room, Data Collection and Annotation (WP1); Multimodal Recognition (WP2); Multimodal Integration (WP3); Demonstration and Evaluation (WP4); Dissemination, Exploitation and Evaluation (WP5).

 

 

Title

TRIPS

URL

http://www.cs.rochester.edu/research/cisd/projects/

Type

Project, software

Dates

last modified 2000

Abstract

Follows the TRAINS Project, University of Rochester.

TRIPS, The Rochester Interactive Planning System, is the latest in a series of prototype collaborative planning assistants. The goal is an intelligent planning assistant that interacts with its human manager using a combination of natural language and graphical displays. The system understands the interaction as a dialogue between it and the human, thus providing a context for interpreting human utterances and actions, and a structure for deciding what to do in response.

 

 

Title

Verbmobil

URL

http://verbmobil.dfki.de/
http://www.dfki.de/cgi-bin/verbmobil/htbin/doc-access.cgi (document server)

Type

Project

Dates

1996-2000

Abstract

Verbmobil was a German long-term project aiming at the development of a mobile translation system for the translation of spontaneous speech in face-to-face situations.

 

 

Title

Smart Space Laboratory at NIST

URL

http://www.nist.gov/smartspace/

Type

Lab description

Dates

2001

Abstract

This laboratory provides support to the IT industry in technology research, standards and measurements. Our Modular Test Bed enables the research community to bring next generation technologies together in a vendor-neutral environment. It consists of a defined middleware API for real-time data transport, a connection broker server for sensor data sources, and processing data sinks.

Smart Spaces are work environments with embedded computers, information appliances, and multi-modal sensors which offer people unprecedented levels of access to information and assistance from computers. The NIST mission is to address the measurement, standards and interoperability challenges that must be met.

 

 

Title

Discourse Resource Initiative (DRI)

URL

http://www.georgetown.edu/luperfoy/Discourse-Treebank/dri-home.html

Type

Organization

Dates

 

Abstract

See in particular the section on the Multiparty Discourse Group.

 

 

Title

Annotating Argumentation Acts in Spoken Dialog

URL

http://www.cs.rochester.edu/research/cisd/projects/

Type

Project

Dates

 

Abstract

A project at the University of Rochester.

 

 

Title

Discourse Language Modelling Project

URL

http://www.colorado.edu/ling/jurafsky/ws97/

Type

Documents and resources

Dates

1998

Abstract

About the Switchboard corpus annotated with the DAMSL scheme.

 

 

Title

MATE: Multilevel Annotation, Tools Engineering

URL

http://mate.nis.sdu.dk/
http://www.cogsci.ed.ac.uk/~dmck/MateCode/

Type

European Project (Telematics Project LE4-8370).

Dates

1998-2000

Abstract

See in particular Deliverables D1.1 (on pre-existing annotation schemes) and D2.1 (on the MATE Dialogue Annotation Scheme: prosody, morpho-syntax, dialogue acts, coreference, communication problems), both listed under 'Annotation guidelines and formats' above. The project also designed an annotation workbench, available at the second URL above.

 

 

Title

NITE: Natural Interactivity, Tools Engineering

URL

http://nite.nis.sdu.dk/

Type

Project

Dates

2001-2003

Abstract

European HLT Project.

 

 

Title

PARADISE: A Framework for Evaluating Spoken Dialogue Agents

URL

http://www.research.att.com/~walker/cv.html
http://www.research.att.com/~walker/nle-4.pdf (NLE 2000)
http://www.research.att.com/~walker/comp-sl11.ps (CSL 1998)
http://www.research.att.com/~walker/acl21.ps (ACL 1997)

http://acl.ldc.upenn.edu/P/P97/P97-1035.pdf (ACL 1997)

Type

Evaluation Scheme

Dates

ca. 1998

Abstract

The scheme is described in several papers by Marilyn A. Walker (Univ. of Edinburgh), cf. URLs above.

 

Title

TRINDI

URL

http://www.ling.gu.se/projekt/trindi/

Type

Project description

Dates

1998-2000

Abstract

TRINDI (Task Oriented Instructional Dialogues) is a European project (LE4-8314) focusing on dialogues between humans and machines, which enable humans to make choices in the performance of a certain task, such as route planning. Scenarios are analyzed using a grid of user interaction levels (from menu selection to spoken questions), and machine interaction levels (from pre-stored text to generated speech).

Participants: Göteborg University (P. Bohlin, R. Cooper, E. Engdahl, S. Larsson, D. Traum), Edinburgh University (E. Klein, C. Matheson, M. Moens, M. Poesio), SRI Cambridge (I. Lewin, D. Milward, S. Pulman), Saarbrücken University (J. Bos, M. Pinkal), Xerox XRCE (L. Karttunen, A. Zaenen).

 

 

Title

WITAS

URL

http://www-csli.stanford.edu/semlab/witas/

Type

Project

Dates

2000-2002 and on

Abstract

The WITAS Project at CSLI. The main aim of the Conversational Interfaces project at CSLI is to build a general-purpose dialogue system which supports multi-modal activity-oriented dialogues with devices. The goal is to pilot an unmanned helicopter using human-computer dialogue. The system uses a common software base consisting of the Open Agent Architecture, the Nuance speech recogniser, Gemini (SRI's natural language parser and generator), and speech synthesis using Festival. The system handles "unscriptable" dialogues where there is no finite state transition network describing a conversation, and no clear end state for a conversation (unlike the "form-filling" paradigm of many travel-planning systems). Research aims to address specific theoretical questions such as: What is the right level of abstraction at which to describe dialogue moves and context? What is an effective multi-modal communication act? How can such acts be generated? What notion of dialogue context or "information state" is appropriate in multi-modal contexts?

 

 

Title

DiaLeague

URL

http://dialeague.csl.sony.co.jp/ 

Type

Project description

Dates

ongoing (?)

Abstract

DiaLeague Project at Sony CSL. A web-based interface for people to talk with artificial dialogue systems. The dialogues are multimodal in that they involve spatial operations, though currently the utterances are textual. The web server is quite rudimentary.

 

 

Title

Persona

URL

http://research.microsoft.com/ui/persona/home.htm

Type

Project (Microsoft Research, USA)

Dates

1997 (?)

Abstract

The Persona project develops technologies to produce conversational assistants that interact with a user in a natural spoken dialogue. The work is built upon the Whisper speaker-independent continuous speech recognition system and a broad coverage English understanding system, both also developed at Microsoft Research. In an initial prototype, an expressive 3-dimensional parrot named Peedy responds to user requests for music.