ISSCO/TIM/ETI,
Université de Genève
Resources,
Tools and Projects for Multimodal Dialogue Understanding and Management: a
Web-based Review
Andrei
Popescu-Belis
Report
IM2.MDM-05 – June 2003
* * * * *
This
report gathers resources related to multi-modal dialogue management. Resources
are: conference material, courses, tutorials, project descriptions, annotation
guidelines, data, and software tools. Each resource is described using
the following fields: title, URL, type (organization, tool, corpus, etc.),
dates (start, end, last release, etc.), abstract, and sometimes references. A
complement to this document is report IM2-MDM-05 containing bibliographic
references often related to the present entries.
The
resources are grouped by organisational categories rather than by theme: the
main theme is described in the report's title (multimodal dialogue
understanding and management). Quite often the items in our review cover
several subtopics.
* * * * *
|
Title |
ESSLLI 2001 Course on Dialog
Annotation |
|
URL |
|
|
Type |
Course
material |
|
Dates |
Summer
2001 |
|
Abstract |
Languages
for the Annotation and Specification of Dialogues. Documents and resources of
the course, by Gregor Erbach. |
|
Title |
ELSNET
Summer School 2001 – Courses on Text and Speech Corpora |
|
URL |
|
|
Type |
Course
material |
|
Dates |
Summer
2001 |
|
Abstract |
Steven Bird, Annotation graphs in theory and practice;
Lou Burnard, Encoding of language corpora: principles and practice;
Henk van den Heuvel and Eric Sanders, Validation of Speech Databases (including transcription);
Amy Isard, Ole Bernsen and Laila Dybkjaer, Dialogue corpora (MATE);
Uli Tuerk, Multimodal Corpora: Design, Technical Aspects and Handling;
Chalapathy Neti, Gerasimos Potamianos and Giri Iyengar, Joint audio-visual speech processing and resources;
Geoffrey Sampson, Annotation at the grammatical level;
Esther Grabe, Prosodic annotation;
Jan Hajic, Linguistic Annotation of a Large Corpus: From Morphology to Syntax |
|
Title |
CMU
Graduate Seminar on Spoken Dialog Processing |
|
URL |
http://fife.speech.cs.cmu.edu/Courses/11716/
|
|
Type |
Course
material, papers and hyperlinks |
|
Dates |
2000 |
|
Abstract |
|
|
Title |
SIGdial
– ACL Special Interest Group on Dialogue |
|
URL |
|
|
Type |
Organization |
|
Dates |
Permanent |
|
Abstract |
SIGdial,
among other activities, coordinates a library of
discourse processing development tools, software, data, bibliographic
references, publications, and links to related web pages. See especially
their list of resources. |
|
Title |
SIGmedia
– ACL Special Interest Group on Multimedia Language Processing |
|
URL |
|
|
Type |
Organization |
|
Dates |
Permanent
(since 1992) |
|
Abstract |
SIGMEDIA
promotes research which focuses on the use of natural language within the
context of a multimedia environment. This includes: (1) the interpretation of
multimodal input, (2) the generation of multimodal output, (3) multimedia
information retrieval and (4) anthropomorphic conversational agents. |
|
Title |
SIGdial
Workshops on Discourse and Dialogue |
|
URL |
|
|
Type |
Workshops,
resources |
|
Dates |
2000-2003,
ongoing |
|
Abstract |
2nd, 3rd and 4th workshops. |
|
Title |
Series
of Workshops on the Semantics and Pragmatics of Dialogue |
|
URL |
http://www.cis.uni-muenchen.de/sil/workshop/dialogwsh.html
(MunDial'97) |
|
Type |
Workshops,
resources |
|
Dates |
1997–2002 |
|
Abstract |
EDILOG
2002, 6th Workshop on the Semantics and Pragmatics of Dialogue, Previous
workshops in this series include: MunDial'97 ( |
|
Title |
International
CLASS Workshop on Natural, Intelligent and Effective Interaction in
Multimodal Dialogue Systems |
|
URL |
http://www.class-tech.org/events/NMI_workshop2.html
|
|
Type |
Workshop,
resources |
|
Dates |
|
|
Abstract |
|
|
Title |
Dagstuhl
Seminar 2001: Coordination and Fusion in Multimodal Interaction |
|
URL |
|
|
Type |
Research
seminar, documentation (slides) |
|
Dates |
2001 |
|
Abstract |
Slides and
conclusions of an important four-day research seminar. Slides of Working
Groups available: (1) Research Roadmap of Multimodality, (2) Data Collection
and Multimodal Annotation Tools, (3) Software Architectures for Multimodal
Systems, (4) Multimodal Meaning Representation |
|
Title |
NIST
Automatic Meeting Transcription Data Collection and Annotation Workshop |
|
URL |
|
|
Type |
Presentations/slides |
|
Dates |
|
|
Abstract |
Initial informal workshop held at NIST to explore a
collaboration among sites collecting meeting room corpora. The
workshop addressed issues in data collection and annotation approaches, data
sharing, common annotation standards and tools, and distribution of corpora.
Great enthusiasm was expressed by all of the sites to create a collaborative
project where data could be shared via a set of common standards. The following
presentation slides in PDF are available: "NIST Automatic Meeting
Transcription Project", "Meeting Data Collection at CMU/ISL",
"The Meeting Recorder Project at ICSI", "MITRE's
Work Relevant to Meeting Room Data Collection/Annotation", "LDC
Meeting Transcription, Parameters and Progress", " |
* * * * *
|
Title |
MATE Supported
Coding Schemes (deliverable D1.1) |
|
URL |
http://mate.nis.sdu.dk/about/D1.1/
|
|
Type |
Project
Report |
|
Dates |
July
1998 |
|
Abstract |
See especially the ANNEX, which provides a list of all the schemes, by domain.
The most relevant to IM2.MDM are the dialogue act and coreference coding schemes.
1/ Communication Problems: Bethan L. Davies' Coding Scheme, Chat.
2/ Coreference Schemes: Bruneseaux and Romary, DRAMA, MUC-7 Coreference Task, Poesio and Vieira (1,2), UCREL Anaphoric Annotation, *-U Line.
3/ Dialogue Act Schemes: Alparon, Chiba Coding Scheme, Chat, Coconut, Condon and Cech's Coding Scheme, C-Star, DAMSL, Janus, Flammia's Coding Scheme, LinLin, Maptask, Nakatani et al.'s Coding Scheme, SLSA, SWBD-DAMSL, Traum's Coding Scheme, Verbmobil.
4/ Morpho-Syntactical Schemes.
5/ Prosody Schemes: PROSPA, IPA, TEI, ToBI, SAMPA, SAMPROSA, INTSINT, SAMSINT, IPO, TSM, TILT Model, Verbmobil, KIM, PROZODIAG, Göteborg.
6/ Cross-level Schemes: BNC, Bonn Focus Research, CHILDES, DAMSL, Kiel Corpus Format, Partitur Format at BAS, SABLE, SAM Standards, TEI, TRAINS Dialogue Corpus, Verbmobil II Conventions for Spontaneous Speech. |
|
Title |
MATE
Dialogue Annotation Guidelines (deliverable D2.1) |
|
URL |
http://www.ims.uni-stuttgart.de/projekte/mate/mdag/ |
|
Type |
Annotation
scheme |
|
Dates |
January
2000 |
|
Abstract |
The
annotation scheme proposed in the MATE project deals with the following phenomena
or "levels": prosody, morpho-syntax, dialogue
acts, coreference, communication problems. A
cross-level analysis is also provided. The MATE workbench uses this
annotation scheme. |
|
REF |
A. Mengel, L. Dybkjaer, J.M. Garrido, U. Heid, M. Klein, V. Pirrelli, M. Poesio, S. Quazza, A. Schiffrin & C. Soria 2000, MATE Dialogue Annotation Guidelines,
Deliverable 2.1 of the MATE Project LE4-8370, 8 Jan 2000. |
|
Title |
MATE
Workbench |
|
URL |
|
|
Type |
Annotation
software |
|
Dates |
Version
0.18, |
|
Abstract |
"Early
demo version" of the MATE annotation tool for spoken dialogue
transcription. See separate entries for the MATE project and its guidelines. |
|
Title |
AGTK: Annotation
Graphs Toolkit |
|
URL |
|
|
Type |
Software |
|
Dates |
2000 |
|
Abstract |
Annotation
Graphs are a formal framework for representing linguistic annotations of time
series data. Annotation Graphs abstract away from file formats, coding
schemes and user interfaces, providing a logical layer for annotation
systems. The
package is provided with several annotation tools, such as TableTrans and MultiTrans. |
|
REF |
Bird
Steven and Mark Liberman 2001, "A formal
framework for linguistic annotation", Speech Communication, 33,
1-2, p. 23-60. |
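As an informal illustration of the idea behind annotation graphs (labelled arcs between time-anchored nodes, independent of any file format), the following minimal sketch may help. It is not AGTK's actual API: the class names, field names and the `labels` helper are hypothetical and were chosen only for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    """An anchor point in the signal; offset is a time in seconds."""
    ident: str
    offset: float


@dataclass(frozen=True)
class Arc:
    """A labelled annotation spanning two nodes."""
    start: Node
    end: Node
    tier: str   # annotation type, e.g. "word" or "dialogue-act"
    label: str


@dataclass
class AnnotationGraph:
    """Hypothetical container: just a list of arcs over shared nodes."""
    arcs: list = field(default_factory=list)

    def add(self, start, end, tier, label):
        self.arcs.append(Arc(start, end, tier, label))

    def labels(self, tier):
        """Return the labels of one tier, ordered by start offset."""
        selected = [a for a in self.arcs if a.tier == tier]
        selected.sort(key=lambda a: a.start.offset)
        return [a.label for a in selected]


# Two word arcs and one dialogue-act arc sharing the same nodes.
n0, n1, n2 = Node("n0", 0.0), Node("n1", 0.4), Node("n2", 0.9)
g = AnnotationGraph()
g.add(n1, n2, "word", "there")
g.add(n0, n1, "word", "hello")
g.add(n0, n2, "dialogue-act", "greeting")
print(g.labels("word"))
```

Because arcs of different tiers share nodes, annotations at several levels stay aligned on the same timeline without committing to a particular coding scheme.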
|
Title |
ATLAS:
Architecture and Tools for Linguistic Analysis Systems |
|
URL |
|
|
Type |
Annotation
Framework: Guidelines, Software |
|
Dates |
Started
2000, updated 2003 |
|
Abstract |
ATLAS
(an initiative involving NIST, LDC and MITRE) is aimed at corpus
construction, evaluation infrastructure, and multi-modal visualization, with
a focus on the development of linguistic applications. The main goal is to
provide an abstraction over the diversity of linguistic annotations (rooted
in the `Annotation Graphs'). ATLAS is
made of four main components: an annotation ontology,
an Application Programming Interface, an interchange format for linguistic
data and a type definition infrastructure. NIST has
created a Java instantiation of the data model and provides an Application Programming
Interface (jATLAS). Linguistic data expressed using
ATLAS can be serialized to XML using the ATLAS Interchange Format (AIF). |
|
REF |
Bird
Steven, David Day, John Garofolo, John Henderson, Christophe Laprun and Mark Liberman 2000, "ATLAS: A flexible and extensible
architecture for linguistic annotation", Proceedings LREC 2000
(Second International Conference on Language Resources and Evaluation), |
|
Title |
Multiparty
Discourse Group / DAMSL |
|
URL |
|
|
Type |
Organization |
|
Dates |
Up to
2001 (?) |
|
Abstract |
The
Multiparty Discourse Group, a multi-site group of which the |
|
Title |
Meeting
Recorder Annotation Guidelines for Dialog (ICSI, Berkeley) |
|
URL |
http://www.icsi.berkeley.edu/Speech/mr/mtgrcdrtrans.html
(prosody, etc.) |
|
Type |
Annotation
Guidelines |
|
Dates |
August 2002 |
|
Abstract |
Based on
DAMSL for the dialogue acts annotation, these guidelines were adapted to
multiparty dialogues in meetings |
|
Title |
Verbmobil
Dialogue Act Annotation |
|
URL |
http://coral.lili.uni-bielefeld.de/~vmobil/vm-anno/vm-annotations.html
|
|
Type |
Annotation
guidelines |
|
Dates |
2000 |
|
Abstract |
The
first URL describes “Annotation Schemata in Verbmobil
Phase II”. The
second URL is a report on “Dialogue Acts in VERBMOBIL-2”, which defines the
second edition of dialogue acts, not only for appointment scheduling
dialogues, but also for travel planning in general. The
third URL is the web page of one of the main contributors to the dialog act
annotation, Michael Kipp (see his publications). See also
the entry for Verbmobil under 'Projects'. |
|
Title |
EXMARaLDA |
|
URL |
|
|
Type |
Annotation
Guidelines |
|
Dates |
Version
1.2.4, May 2003 |
|
Abstract |
XML-based
system for transcribing and annotating spoken discourse on the computer. The
acronym means "EXtensible MARkup
Language for Discourse Annotation". The long-term goal of the project is
the implementation of a multilingual database of spoken discourse. The DTDs and some Java tools are available for download. |
|
REF |
Thomas Schmidt 2001, "The transcription system EXMARaLDA: An application of the annotation
graph formalism as the Basis of a Database of Multilingual Spoken Discourse",
Proceedings of IRCS Workshop on Linguistic Databases, |
* * * * *
|
Title |
LDC – Linguistic
Annotation |
|
URL |
|
|
Type |
Repertoire
of tools |
|
Dates |
Last
updated: December 2001 |
|
Abstract |
This web
page describes tools and formats for creating and managing linguistic annotations.
The focus is on tools which have been widely used for constructing annotated
linguistic databases, and on the formats commonly adopted by such tools and
databases. The
basic data may be in the form of time functions (audio, video and/or
physiological recordings) or it may be textual. The added notations may
include transcriptions of all sorts (from phonetic features to discourse
structures), part-of-speech and sense tagging, syntactic analysis, `named
entity' identification, co-reference annotation, and so on. |
|
Title |
Transcriber, by ETCA/DGA ( |
|
URL |
|
|
Type |
Software,
transcription/annotation tool |
|
Dates |
Latest
version: 1.4.5, July 2002. |
|
Abstract |
Transcriber assists the manual annotation of speech signals. It provides a user-friendly
graphical interface for segmenting long duration speech recordings, transcribing
them, and labelling speech turns, topic changes and acoustic conditions. It
works on various Unix systems (Linux, Sun Solaris, Silicon
Graphics) and Windows NT, and is freely distributed under the GNU GPL. The
data is stored in a transparent XML format. |
|
Title |
SoundScriber |
|
URL |
|
|
Type |
Software,
transcription tool |
|
Dates |
1998 |
|
Abstract |
SoundScriber is a very simple freeware program for Windows 95. Besides normal
playback features, it offers keystrokes to control the program while working
in another window, variable speed playback, and "walking" (plays a
small stretch of the file several times, then advances to a new piece,
overlapping slightly with the previous one, such that it is possible to
transcribe continuously without having to manually pause or rewind). |
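The "walking" playback described above can be sketched as a schedule of overlapping windows. This is only an illustration under assumed parameters: the function name, the default step, overlap and repeat values are hypothetical and are not taken from SoundScriber itself.

```python
def walk_segments(duration, step=4.0, overlap=1.0, repeats=3):
    """Return the (start, end) windows, in seconds, that a 'walking' player
    would play: each window is repeated `repeats` times, then playback
    advances so that the next window overlaps the previous one by `overlap`.
    """
    if not 0 <= overlap < step:
        raise ValueError("overlap must be non-negative and smaller than step")
    segments = []
    start = 0.0
    while start < duration:
        end = min(start + step, duration)
        segments.extend([(start, end)] * repeats)
        if end >= duration:
            break
        start = end - overlap  # step back slightly before moving on
    return segments


# A 10-second file, 4-second windows, 1-second overlap, each played twice.
schedule = walk_segments(10.0, step=4.0, overlap=1.0, repeats=2)
print(schedule)
```

The overlap gives the transcriber a short reminder of the previous stretch at the start of every new window.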
|
Title |
VoiceWalker / SoundWriter |
|
URL |
http://www.linguistics.ucsb.edu/resources/computing/download/download.htm
|
|
Type |
Software
(playback for transcription) |
|
Dates |
Version
2.0, April 1999 |
|
Abstract |
VoiceWalker 1.1 is basically a glorified digital tape deck. It can step (or
"walk") through a recording, repeating short overlapping segments
for a specified number of repetitions, then moving on to the next segment. Works
with WAV files, Windows AVI files and Quicktime MOV
files. Windows 95/98. SoundWriter incorporates the
ability to align transcripts with sound files. |
|
Title |
Anvil: A
Tool for Annotation of Video and Spoken Language |
|
URL |
|
|
Type |
Software |
|
Dates |
Version
4.0, April 2003 |
|
Abstract |
Anvil is
a generic video annotation tool. Originally developed for Gesture Research,
it has also proved suitable for research in the fields of Linguistics, Ethology, Anthropology, Psychotherapy,
Embodied Agents, Human-Computer Interaction (HCI) and Computer
Animation. It
offers frame-accurate, hierarchical multi-layered annotation with objects
that contain attribute-value pairs. Layers and attributes are all user-defined,
so that Anvil can accommodate arbitrary annotation schemes. Available from
the author by email. |
|
Title |
TASX-annotator
– Time Aligned Signal data eXchange |
|
URL |
|
|
Type |
Software |
|
Dates |
Latest
version: June 2002 |
|
Abstract |
The
TASX-annotator is a central component of the TASX-environment, which allows
the annotation and transcription of video (multi-channel) and audio data. It requires Java
2, Java Media Framework 2.1.1, saxon, xerces, PerlTools, and jexmaralda (see above). |
|
Title |
MMAX: A
Tool for Multi-Modal Annotation in XML |
|
URL |
http://www.eml.villa-bosch.de/english/Research/NLP/Downloads/
|
|
Type |
Software |
|
Dates |
Version
0.94, April 2003 |
|
Abstract |
This
tool allows the annotation of all kinds of linguistic data which consist of markables, attributes assigned to these markables and relations between them. The tool was used to
annotate bridging relations (`associative anaphora') in a corpus, to develop
and evaluate bridging resolution systems, etc. |
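The data model mentioned in this entry (markables, their attributes, and relations between them) can be pictured roughly as follows. This is a sketch for orientation only and does not reproduce MMAX's XML format: the class names, the attribute name `np_form` and the sample sentence are assumptions made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Markable:
    """A span of tokens that can carry annotations."""
    ident: str
    span: tuple                      # (first_token, last_token), inclusive
    attributes: dict = field(default_factory=dict)


@dataclass
class Relation:
    """A typed, directed link between two markables."""
    kind: str
    source: str                      # id of the anaphor
    target: str                      # id of the antecedent


def text_of(markable, tokens):
    """Return the surface text covered by a markable."""
    first, last = markable.span
    return " ".join(tokens[first:last + 1])


tokens = "I entered the house . The door was open .".split()
m1 = Markable("m1", (2, 3), {"np_form": "definite"})
m2 = Markable("m2", (5, 6), {"np_form": "definite"})
# "The door" is interpreted via "the house": a bridging relation.
bridging = Relation("bridging", source="m2", target="m1")

print(text_of(m2, tokens), "->", text_of(m1, tokens))
```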
|
Title |
DAT
annotation tool |
|
URL |
|
|
Type |
Software |
|
Dates |
Version
1.10, 1998 |
|
Abstract |
The goal
of the Multiparty Discourse Group of the DRI (see these entries) is to devise
a common high-level framework of dialog acts. The group, based at the |
|
Title |
GATE: General
Architecture for Text Engineering |
|
URL |
|
|
Type |
Software |
|
Dates |
Version
2.1, February 2003 |
|
Abstract |
GATE
provides a TIPSTER-compliant framework to combine text processing modules in
order to perform various tasks. GATE can be freely downloaded for research
purposes and is supplied with an Information Extraction system. |
|
Title |
TRINDIKIT |
|
URL |
|
|
Type |
Software |
|
Dates |
2000 |
|
Abstract |
The TRINDI
project (see separate entry) provides this toolkit for building and
experimenting with dialogue move engines and information states. The toolkit
specifies formats for defining information states, update rules, dialogue
moves, as well as associated algorithms. |
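To make the notions of information state, update rule and dialogue move more concrete, here is a small sketch of an information-state update loop. It is not TRINDIKIT code: the state fields, the rule names and the control strategy (apply the first applicable rule until none applies) are simplifying assumptions chosen for this illustration.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class InfoState:
    latest_move: tuple = ()                      # e.g. ("answer", "destination", "Geneva")
    agenda: list = field(default_factory=list)   # moves the system intends to perform
    shared: dict = field(default_factory=dict)   # information agreed on so far


@dataclass
class UpdateRule:
    name: str
    applies: Callable[[InfoState], bool]         # precondition on the state
    effect: Callable[[InfoState], None]          # update of the state


def integrate_answer(state):
    _, slot, value = state.latest_move
    state.shared[slot] = value
    state.latest_move = ()


def plan_question(state):
    state.agenda.append(("ask", "destination"))


RULES = [
    UpdateRule("integrateAnswer",
               lambda s: bool(s.latest_move) and s.latest_move[0] == "answer",
               integrate_answer),
    UpdateRule("planQuestion",
               lambda s: "destination" not in s.shared and not s.agenda,
               plan_question),
]


def update(state, rules=RULES):
    """Apply the first applicable rule repeatedly; return the rule names used."""
    applied = []
    changed = True
    while changed:
        changed = False
        for rule in rules:
            if rule.applies(state):
                rule.effect(state)
                applied.append(rule.name)
                changed = True
                break
    return applied


state = InfoState()
first = update(state)                 # the system plans to ask for a destination
state.agenda.pop()                    # pretend the question has been uttered
state.latest_move = ("answer", "destination", "Geneva")
second = update(state)                # the user's answer is integrated
print(first, second, state.shared)
```

Keeping rules as data (a precondition plus an effect) is what lets a dialogue move engine be reconfigured by editing the rule list rather than the control loop.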
* * * * *
|
Title |
SIGdial
– ACL Special Interest Group on Dialogue |
|
URL |
http://www.iet.com/Projects/sigdial/info.html
|
|
Type |
Organization |
|
Dates |
Permanent |
|
Abstract |
SIGdial,
among other activities, coordinates a library of discourse processing
development tools, software, data, bibliographic references, publications,
and links to related web pages. See especially their list of resources. |
|
Title |
Switchboard
Corpus |
|
URL |
http://www.isip.msstate.edu/projects/switchboard/ |
|
Type |
Corpus |
|
Dates |
Released
August 1997; updated 2002, 2003 (minor fixes). |
|
Abstract |
Corpus
of two-party telephone conversations, transcribed. Part of the corpus is
available for download from the Discourse Language Modeling
Project at the second URL. |
|
Title |
HCRC Map
Task Corpus and XML Annotations |
|
URL |
http://www.hcrc.ed.ac.uk/dialogue/maptask.html
(presentation) |
|
Type |
Corpus,
tools |
|
Dates |
1992-2001 |
|
Abstract |
The Map
Task is a cooperative task involving two participants sitting opposite one
another, each one having a map which the other cannot see. One speaker
instructs the other to reproduce a route that is drawn on the instructor's
map (the maps are not identical... but the participants are aware of that). The HCRC
corpus (8 CD-ROMs, raw audio files) consists of 128 digitally recorded
unscripted dialogues (plus readings of landmark names). All dialogues are
transcribed verbatim in standard orthography, including filled pauses, false
starts, hesitations, repetitions and interruptions. The structure of the
dialogue is annotated in XML. The files are available for download at the
above URL. |
* * * * *
|
Title |
Meeting
Recorder Project at ICSI, Berkeley |
|
URL |
|
|
Type |
Project |
|
Dates |
ongoing |
|
Abstract |
The
project aims at investigating speech recognition for meetings. ICSI collects
data using a meeting room equipped with a multichannel,
studio-quality recording system. As of February 2001, ICSI already had 40 hours of 16-channel
pilot data. Ten hours were hand-transcribed using conventions designed by
Jane Edwards at ICSI. As for the goals and applications, while the basic idea
is to develop recognition systems that are able to transcribe conventional
meetings, more useful applications include search for particular information
or production of automatic summaries. ICSI is also associated with the IM2
project. |
|
REF |
Morgan,
N., Baron, D., Bhagat, S., Carvey,
H., Dhillon, R., Edwards, J. A., Gelbart, D., Janin, A., Krupski, A., Peskin, B., Pfau, T., Shriberg, E., Stolcke, A., and Wooters, C.
"Meetings about meetings: research at ICSI on speech in multiparty
conversations." ICASSP 2003 (International Conference on Acoustics,
Speech, and Signal Processing), |
|
Title |
ISL's
Meeting Browser |
|
URL |
|
|
Type |
Project |
|
Dates |
publications:
1999-2002 |
|
Abstract |
This
project at the Interactive Systems Laboratories ( |
|
REF |
Waibel,
A., Bett, M., Metze, F., Ries, K., Schaaf, T., Schultz,
T., Soltau, H., Yu, H., and Zechner,
K. "Advances in Automatic Meeting Record Creation and Access." ICASSP
2001 (International Conference on Acoustics, Speech and Signal Processing),
|
|
Title |
Meeting
Tracking and Recognition ( |
|
URL |
|
|
Type |
Project |
|
Dates |
ended
2002 |
|
Abstract |
At
Carnegie Mellon University, Robotics Institute. Seems to have preceded the
Meeting Browser Project. Although the web page signals that |
|
Title |
Automatic
Meeting Transcription Project at NIST |
|
URL |
http://www.nist.gov/speech/test_beds/mr_proj/
|
|
Type |
Project |
|
Dates |
2001 |
|
Abstract |
This project
(volunteers' meeting in November 2001) provides a development and evaluation
infrastructure for speech transcription in meetings. The infrastructure
includes rich transcription and annotation conventions, a corpus of audio and
video from meetings collected at NIST using a variety of microphones and
video cameras, new evaluation protocols, metrics, and software, sponsoring
workshops, etc. Recent developments in this project could not be found. |
|
Title |
SmartKom
Project at DFKI |
|
URL |
|
|
Type |
Project
Description |
|
Dates |
1999-present |
|
Abstract |
Dialog-based
Human-Technology Interaction by Coordinated Analysis and Generation of
Multiple Modalities. |
|
Title |
M4 |
|
URL |
|
|
Type |
Project |
|
Dates |
Started |
|
Abstract |
This
European IST project aims at building a demonstration system to enable structuring,
browsing and querying of an archive of automatically analysed meetings
(recorded in a room equipped with multimodal sensors). The program of work is
divided into: Smart Meeting Room, Data Collection and Annotation (WP1); Multimodal
Recognition (WP2); Multimodal Integration (WP3); Demonstration and Evaluation
(WP4); Dissemination, Exploitation and Evaluation (WP5). |
|
Title |
TRIPS |
|
URL |
|
|
Type |
Project,
software |
|
Dates |
last
modified 2000 |
|
Abstract |
Following
the TRAINS Project, TRIPS (The Rochester Interactive Planning System) is the latest in a series of
prototype collaborative planning assistants. The goal is an intelligent
planning assistant that interacts with its human manager using a combination
of natural language and graphical displays. The system understands the
interaction as a dialogue between it and the human, thus providing a context
for interpreting human utterances and actions, and a structure for deciding
what to do in response. |
|
Title |
Verbmobil |
|
URL |
http://verbmobil.dfki.de/
|
|
Type |
Project |
|
Dates |
1996-2000 |
|
Abstract |
Verbmobil
was a German long-term project aiming at the development of a mobile translation
system for the translation of spontaneous speech in face-to-face situations. |
|
Title |
Smart
Space Laboratory at NIST |
|
URL |
|
|
Type |
Lab
description |
|
Dates |
2001 |
|
Abstract |
This
laboratory provides support to the IT industry in technology research,
standards and measurements. Our Modular Test Bed enables the research community
to bring next generation technologies together in a vendor-neutral
environment. It consists of a defined middleware API for real-time data
transport, a connection broker server for sensor data sources, and processing
data sinks. Smart
Spaces are work environments with embedded computers, information appliances,
and multi-modal sensors which offer people unprecedented levels of access to
information and assistance from computers. The NIST mission is to address the
measurement, standards and interoperability challenges that must be met. |
|
Title |
Discourse
Resource Initiative (DRI) |
|
URL |
http://www.georgetown.edu/luperfoy/Discourse-Treebank/dri-home.html
|
|
Type |
Organization |
|
Dates |
|
|
Abstract |
See in
particular the section on the Multiparty Discourse Group. |
|
Title |
Annotating
Argumentation Acts in Spoken Dialog |
|
URL |
|
|
Type |
Project |
|
Dates |
|
|
Abstract |
A project
at the |
|
Title |
Discourse
Language Modelling Project |
|
URL |
|
|
Type |
Documents
and resources |
|
Dates |
1998 |
|
Abstract |
About the
Switchboard corpus annotated with the DAMSL scheme. |
|
Title |
MATE : Multilevel
Annotation, Tools Engineering |
|
URL |
http://mate.nis.sdu.dk/ |
|
Type |
European
Project (Telematics Project LE4-8370). |
|
Dates |
1998-2000 |
|
Abstract |
See in
particular Deliverables D1.1 (on pre-existing annotation schemes) and D2.1 on
the MATE Dialogue Annotation Scheme (prosody, morpho-syntax,
dialogue acts, coreference, communication problems)
in the documents section above. The project also designed an annotation
workbench, available at the second URL above. |
|
Title |
NITE :
Natural Interactivity, Tools Engineering |
|
URL |
|
|
Type |
Project |
|
Dates |
2001-2003 |
|
Abstract |
European
HLT Project. |
|
Title |
|
|
URL |
http://www.research.att.com/~walker/cv.html
http://acl.ldc.upenn.edu/P/P97/P97-1035.pdf
(ACL 1997) |
|
Type |
Evaluation
Scheme |
|
Dates |
ca. 1998 |
|
Abstract |
The
scheme is described in several papers by Marilyn A. Walker ( |
|
Title |
TRINDI |
|
URL |
|
|
Type |
Project
description |
|
Dates |
1998-2000 |
|
Abstract |
TRINDI
(Task Oriented Instructional Dialogues) is a European project (LE4-8314)
focusing on dialogues between humans and machines, which enable humans to
make choices in the performance of a certain task, such as route planning.
Scenarios are analyzed using a grid of user interaction levels (from menu
selection to spoken questions), and machine interaction levels (from
pre-stored text to generated speech). Participants:
|
|
Title |
WITAS |
|
URL |
|
|
Type |
Project |
|
Dates |
2000-2002
and on |
|
Abstract |
The WITAS
Project at CSLI. The main aim of the Conversational Interfaces project at
CSLI is to build a general-purpose dialogue system which supports multi-modal
activity-oriented dialogues with devices. The goal is to pilot an unmanned
helicopter using human-computer dialogue. The system uses a common software
base consisting of the Open Agent Architecture, Nuance speech recogniser,
Gemini (SRI's Natural Language parser and
generator), and speech synthesis using Festival. The system handles "unscriptable" dialogues where there is no finite
state transition network describing a conversation, and no clear end state
for a conversation (different from the "form-filling" paradigm, such
as many travel-planning systems). Research aims to address specific
theoretical questions such as: what is the right level of abstraction at
which to describe dialogue moves and context? What is an effective
multi-modal communication act? How can they be generated? What notion of dialogue
context or "information state" is appropriate in multi-modal
contexts? |
|
Title |
DiaLeague |
|
URL |
|
|
Type |
Project
description |
|
Dates |
ongoing
(?) |
|
Abstract |
DiaLeague
Project at Sony CSL. A web-based interface for people to talk with artificial
dialogue systems. The dialogues are multimodal in that they involve spatial
operations, though currently the utterances are textual. The web server is
quite rudimentary. |
|
Title |
Persona |
|
URL |
|
|
Type |
Project
(Microsoft |
|
Dates |
1997 (?) |
|
Abstract |
The
Persona project develops technologies to produce conversational assistants
that interact with a user in a natural spoken dialogue. The work is built
upon the Whisper speaker-independent continuous speech recognition system and
a broad coverage English understanding system, both also developed at
Microsoft Research. In an initial prototype, an expressive 3-dimensional
parrot named Peedy responds to user
requests for music. |