Geneva, October 2013

DEVELOPMENT & EVALUATION OF MULTILINGUAL MULTIMODAL DIALOGUE SYSTEMS ON MOBILE DEVICES
   Nikolaos Tsourakis
Thesis defense for the degree of Doctor of Philosophy

Overview

Address specific issues related to the deployment and evaluation of mobile applications that use speech in combination with different modalities

Why Speech?
Why Mobility?
Why Multimodality?

Why Speech?

Babel

Why Mobile?

ENIAC (1946)

Why Multimodal?

      Input/Output modalities

Research Questions

  • Design. What architectures for mobile platforms can be incorporated in order to offer efficient and robust systems? Will the design be based on open standards? Will this infrastructure be easily extensible and usable by others?
  • Interaction. What are the appropriate output modalities for each situation? What kinds of different user interactions should be supported by the system?
  • Evaluation. What makes one system more successful than another? What kind of evaluation should be performed, and which metrics should it take into account?

The Regulus Platform

Regulus

Open source platform for constructing rule-based medium-vocabulary spoken dialogue applications

Key Modules

Recognition Message in Different Notations

  Prolog term:
  action_for_session(0.9059477290138602,
    action_sequence(
      recognise_and_dialogue_process_from_wavfile(c:/foobar.wav)))

  JSON:
  { "*action_for_session":
    [ "0.9059477290138602", {
      "*action_sequence": [
        { "*recognise_and_dialogue_process_from_wavfile":
          [ "c:/foobar.wav" ]}]}]}

  XML:
  <action_for_session>
    <atom>0.9059477290138602</atom>
    <action_sequence>
      <recognise_and_dialogue_process_from_wavfile>
        <atom>c:/foobar.wav</atom>
      </recognise_and_dialogue_process_from_wavfile>
    </action_sequence>
  </action_for_session>
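
The three notations above carry the same nested term. The following is a minimal sketch of how one in-memory representation could be rendered in each notation. The tuple-based term structure and the function names (`to_term_text`, `to_json_obj`, `to_xml`) are assumptions made for illustration and are not part of Regulus. The only conventions taken from the slide are the `*` prefix on functor keys in JSON and the `<atom>` wrapper in XML.

```python
import json
from xml.sax.saxutils import escape

# Hypothetical in-memory form: an atom is a str, a compound term is (functor, [args]).
term = ("action_for_session", [
    "0.9059477290138602",
    ("action_sequence", [
        ("recognise_and_dialogue_process_from_wavfile", ["c:/foobar.wav"]),
    ]),
])

def to_term_text(t):
    """Render as functor(arg,...) text, mirroring the first notation."""
    if isinstance(t, str):
        return t
    functor, args = t
    return f"{functor}({','.join(to_term_text(a) for a in args)})"

def to_json_obj(t):
    """Render as nested dicts/lists; functor keys get a '*' prefix."""
    if isinstance(t, str):
        return t
    functor, args = t
    return {"*" + functor: [to_json_obj(a) for a in args]}

def to_xml(t):
    """Render as nested elements; atoms are wrapped in <atom>."""
    if isinstance(t, str):
        return f"<atom>{escape(t)}</atom>"
    functor, args = t
    return f"<{functor}>{''.join(to_xml(a) for a in args)}</{functor}>"

if __name__ == "__main__":
    print(to_term_text(term))
    print(json.dumps(to_json_obj(term)))
    print(to_xml(term))
```

Keeping a single tree and deriving each notation from it means that a client on any platform can choose the format that is easiest to parse without changing the message content.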

Calendar

Offers multimodal access to a meeting database
Vocabulary of 211 words
"What meetings are there next week?"
"Where is the meeting?"
"What are my next three meetings?"
"Will Marianne attend?"
"Will anyone from IDIAP be at the meeting?"

MedSLT

Multilingual spoken language translation system designed for medical domains
Help in situations where no common language exists between the doctor and the patient
Headache/Chest pain/Abdominal pain
"Does bright light make the pain worse?"
English, French, Spanish, Catalan, Arabic, Japanese

CALL-SLT

Computer-assisted second language learning system
Restaurant: "Ich möchte einen Hamburger"
About me: "Mon frère s'appelle Stéphane"
Travel: "I need one ticket to London"
L1: English, French, Japanese, German, Arabic, Chinese
L2: English, French, Japanese, German, Greek, Swedish

Design

Overview

ASR Topologies

Network Speech Recognition
Distributed Speech Recognition
Embedded Speech Recognition

Selecting A Topology

High Level Architecture

Proposed Architectures

Performance

Error Rates & Response Latency

[Chart: error rates and response latency, desktop vs. mobile]

Comparing The Two Architectures

My Contributions

Interaction

Input/Output Modalities

Overview

Why Expose The System's Understanding?

Rephrasing Mechanisms

Case Study

What Do I Measure?

Key Results

Multimedia Prompts

Using Image Prompts

French native speaker (L1) practicing English (L2) (restaurant domain)

Case Study

Key Results

Using Video Prompts

Lessons

Experiment

Improvement Per User & Per Prompt

Comparisons based on a pre- and a post-test

[Chart: pre- and post-test comparison, rec vs. no-rec]

Why Gestures?

Core commands: move forward and backward in the dialogue flow, start and stop speaking, get help and abort an ongoing action

Methodology

Acceleration axes - Feature space

Gestures Classification

User Studies

Key Results

Evaluation

Medical Speech Translation Systems

The Pathway To Healthcare

Hôpital de Nyon - Suisse

The ISO/IEC Quality Model

My Tasks

Data Collection

Choose a number on a scale of 1-9 favoring the feature you like most

Weighted Quality Model

Metrics

104 metrics for 24 end-node characteristics
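
As a rough illustration of how a weighted quality model can combine metric results, the sketch below computes a weighted average of per-characteristic scores. Everything in it is an assumption: the characteristic names, the use of 1-9 ratings directly as weights, the scores normalized to [0, 1], and the function name `weighted_quality`. The slides do not specify the actual aggregation scheme.

```python
def weighted_quality(weights, scores):
    """Weighted average of characteristic scores.

    weights: characteristic -> importance rating (assumed here to be on a 1-9 scale)
    scores:  characteristic -> measured score, assumed normalized to [0, 1]
    """
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"no score for: {sorted(missing)}")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[name] * scores[name] for name in weights) / total

if __name__ == "__main__":
    # Illustrative values only, not data from the study.
    weights = {"usability": 9, "reliability": 6, "efficiency": 3}
    scores = {"usability": 0.8, "reliability": 0.5, "efficiency": 1.0}
    print(round(weighted_quality(weights, scores), 3))
```

Dividing by the sum of the weights keeps the result in the same range as the input scores, so systems rated with different weight profiles remain comparable.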

My Contributions

Conclusions
