Level 4: Basic Scripts

Up to Level 3, there are only very limited options for controlling the order in which prompts are presented. They can be collected into lessons, so that the user can select a set of similar prompts from a menu, or they can be assigned a number in the Group line, which means that the low-numbered prompts are presented before the high-numbered ones inside each lesson. If no numbers are assigned, the order of presentation is random.

This is enough for simple examples like the pronunciation course, but works less well when the course is supposed to teach the student how to carry out some kind of goal-directed conversation. For example, if the theme of the lesson is “restaurant language”, it is more natural to structure it as a conversation between the student and the waiter, where the student starts by ordering a drink and ends by asking for the bill. The course designer needs a way to break up the lesson into a series of exchanges which take place in a logical sequence that will not always be the same, and which to some extent is determined by what the student does.

Level 4 adds a mechanism for doing this: a lesson can include a “script”, which defines the lesson as a set of “steps”. As before, each step picks a Prompt from a specified group, presents it, and processes the student’s spoken response. What happens next now depends on whether the student manages to answer correctly or not, and possibly on other things (these are described in Level 6). For the moment, there are three possibilities. If the speech recognizer accepts the student’s response, the conversation proceeds to the next step, as defined by the script. If the recognizer rejects the response, the system repeats the step, possibly with a different multimedia file. If the student is rejected too many times, the system gives up and offers them something easier.

A lesson which uses a script defines it in the Lesson unit, e.g.

Lesson
Name           hotel
PrintName      Hotel booking
Description    Simple hotel booking dialogue
Client         dialogue_client
Script         hotel_booking_script
EndLesson

Scripts are written in XML, which is a little more complicated than the plain text files used to specify Prompts, but still not very complicated: XML is more or less like HTML, with which many nontechnical people now have some familiarity. A typical step, from a hotel booking lesson, looks like this:

<step>
   <id>ask_for_number_of_nights</id>
   <group>room_for_number_of_nights</group>
   <next_success>ask_how_to_pay</next_success>
   <next_limit>is_one_night_okay</next_limit>
</step>

The meanings of the individual lines are as follows:

  • The name of the step is ask_for_number_of_nights.
  • The prompt will be from a group called room_for_number_of_nights. These prompts will tell the student to say things like “I want a room for three nights”.
  • If the student’s response is accepted, the next step will be ask_how_to_pay. In this step, the student will be given a prompt to get a response like “Can I pay by Visa?”
  • If the student is rejected too many times, the system will move to the step is_one_night_okay. Here, the system will say something like “I don’t understand, but is one night okay?”, and the student will be told to say “Yes”.
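The decision made after each exchange can be sketched in a few lines of Python. This is only an illustration of the rule described above, not the actual CALL-SLT code: the Step class, the next_step function and the MAX_REJECTIONS constant are names invented for the sketch, and the limit of two rejections is taken from the hello/goodbye example later in this section.

```python
from dataclasses import dataclass

# Assumption: a student runs out of chances after two rejections.
# The constant name is invented for this sketch.
MAX_REJECTIONS = 2


@dataclass
class Step:
    """One <step> from a script, with the four fields described above."""
    id: str
    group: str
    next_success: str
    next_limit: str


def next_step(step: Step, accepted: bool, rejections_so_far: int) -> str:
    """Return the id of the step to run after one exchange.

    rejections_so_far counts the rejections on this step before the
    current response was processed.
    """
    if accepted:
        return step.next_success
    if rejections_so_far + 1 >= MAX_REJECTIONS:
        return step.next_limit
    return step.id  # rejected, but chances remain: repeat the step


step = Step(
    id="ask_for_number_of_nights",
    group="room_for_number_of_nights",
    next_success="ask_how_to_pay",
    next_limit="is_one_night_okay",
)
print(next_step(step, accepted=False, rejections_so_far=0))
```

A first rejection keeps the student on the same step; an accepted response or a second rejection moves on to the step named in the corresponding line.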

The system’s side of the conversation is supplied by multimedia prompts. So here, for example, we might have the Prompt

Prompt
Lesson         hotel
Multimedia     ask_how_many_nights
Group          room_for_number_of_nights
Text/french    Dis : tu voudrais une chambre pour 3 nuits
Response       ( i would like | could i have ) a room for three nights
EndPrompt

The Multimedia line is slightly different from the ones shown earlier; it does not contain the name of an actual multimedia file, but rather a “multimedia unit” which contains several files. Since the step may be repeated more than once if the student is rejected, the different files in the multimedia unit are played in order. Here, the multimedia unit is

Multimedia
Id             ask_how_many_nights
File           ask_how_many_nights1.flv
File           ask_how_many_nights2.flv
EndMultimedia

ask_how_many_nights1.flv is a video file showing an animated desk clerk saying “How many nights do you wish to stay?”, and ask_how_many_nights2.flv shows the same clerk saying “I’m sorry, how many nights was that?”
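As a rough illustration of “played in order”, the following Python sketch picks the file for a given attempt at a step. The function name file_for_attempt is invented here, and reusing the last file when there are more attempts than files is an assumption of the sketch, not documented behaviour.

```python
def file_for_attempt(files: list[str], attempt: int) -> str:
    """Pick the multimedia file for a given attempt at a step.

    attempt is 0 for the first presentation, 1 for the first repeat,
    and so on.
    """
    if not files:
        raise ValueError("a multimedia unit needs at least one file")
    # Assumption: if the step is repeated more often than there are
    # files, the last file is played again.
    return files[min(attempt, len(files) - 1)]


files = ["ask_how_many_nights1.flv", "ask_how_many_nights2.flv"]
for attempt in range(2):
    print(attempt, file_for_attempt(files, attempt))
```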

Creating video multimedia files

Creating the video multimedia files involves three steps, the last of which is optional:

  • Recording the audio files
  • Creating the videos and integrating the audio recordings
  • Changing the file format (if necessary)

In more detail:

  • So far, we have had good experiences using www.fiverr.com to have the audio files recorded by native speakers. One lesson can usually be recorded for $5. Alternatively, the audio files can be recorded by any other native speaker, as long as the recording quality is adequate.
  • The cartoon videos can be created on www.voki.com. After creating a user account, one can create the desired characters, backgrounds, etc. for free. Once the cartoon characters have been created, the audio file can be integrated via the same interface.
  • Finally, it may be necessary to convert the video to a format compatible with CALL-SLT.

The Voki site offers further options for a fee. In particular, files can be produced without the Voki logo that appears in the free version.

Hello and goodbye: a Lite course with a script

This example shows a minimal course that uses a script. In the first step, the system says “Hello”, and the student says “Hello” back; in the second step the system says “Goodbye” and the student says “Goodbye” back; we allow two ways to say “Hello” and two ways to say “Goodbye”. The system’s side of the dialogue is done using recorded multimedia.

The course directory contains a total of six files: the course file hello.txt, the script file toy_hello.xml, and the four multimedia files hello1.flv, hello2.flv, goodbye1.flv and goodbye2.flv. The directory structure looks like this:

                mynamespace
                   |
          ------------------
          |
      hello_course
          |
   ---------------------------------
   |              |                |
grammars       scripts          multimedia

hello.txt      toy_hello.xml    hello1.flv
                                hello2.flv
                                goodbye1.flv
                                goodbye2.flv

Let’s look at these files in more detail.

The course file

The course file is not very different from the ones we’ve already seen. First, we have a Course unit:

# ---------------------------------------------------
# Course

# One course, 'hello_course'

Course
Name           hello_course
Client         dialogue_client
L2             english
Languages      french
EndCourse

The only thing to note here is the Client line: for a course with a script, we need to specify a dialogue_client. Like the multimedia_client in Level 2, this will display multimedia. It will also automatically advance the dialogue between turns, using the script.

Next, we have a Lesson unit:

# ---------------------------------------------------
# Lessons

# One lesson, 'hello_goodbye'

Lesson
Name           hello_goodbye
PrintName      Hello and goodbye
Description    Learn to say hello and goodbye
Script         toy_hello
EndLesson

Again, this is almost the same as Lesson units we have seen earlier, except that we have the line

Script         toy_hello

This says that the lesson uses the script toy_hello.xml, which can be found in the course’s scripts directory.

The next part of the file, which makes up the greater part of it, specifies the four Prompts:

# ---------------------------------------------------
# Prompts

# Say hello

Prompt
Lesson         hello_goodbye
Group          hello_group
Multimedia     hello_multimedia
Text/french    Bonjour
Response       hello
EndPrompt

Prompt
Lesson         hello_goodbye
Group          hello_group
Multimedia     hello_multimedia
Text/french    Salut
Response       hi
EndPrompt

# Say goodbye

Prompt
Lesson         hello_goodbye
Multimedia     goodbye_multimedia
Group          goodbye_group
Text/french    Adieu
Response       goodbye
EndPrompt

Prompt
Lesson         hello_goodbye
Multimedia     goodbye_multimedia
Group          goodbye_group
Text/french    A plus
Response       bye
EndPrompt

Again, these are nearly the same as the Prompts we have seen at Level 3. The only difference is in the Multimedia lines. Instead of referring directly to actual multimedia files, these give the names of Multimedia units which appear at the end of the file, and look like this:

# ---------------------------------------------------
# Multimedia declarations

Multimedia
Id             hello_multimedia
File           hello1.flv
File           hello2.flv
EndMultimedia

Multimedia
Id             goodbye_multimedia
File           goodbye1.flv
File           goodbye2.flv
EndMultimedia

Each Multimedia declaration lists two different multimedia files, which are played in order if the student doesn’t immediately get the step right. So for example hello1.flv says “Hello”, but hello2.flv says “Sorry? Hello?”

As already noted, the multimedia files are placed in the multimedia directory.
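Because the Prompts now refer to Multimedia units by name, a misspelled name is an easy mistake to make. The Python sketch below shows one way such a mistake could be caught before the course is compiled. It is not part of CALL-SLT: parse_units and undeclared_multimedia are invented names, and the parser only handles the simple layout shown in this section (a unit name on its own line, “Key value” lines, and a closing End line). The check applies only to courses whose Multimedia lines name units rather than files.

```python
def parse_units(text: str) -> list[tuple[str, dict[str, list[str]]]]:
    """Split a course file into (unit kind, fields) pairs.

    Each field maps to a list of values, since a key such as File
    can occur several times in one unit.
    """
    units = []
    kind, fields = None, {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if kind is None:
            kind, fields = line, {}  # opening line, e.g. "Prompt"
        elif line == "End" + kind:
            units.append((kind, fields))
            kind = None
        else:
            parts = line.split(None, 1)
            value = parts[1].strip() if len(parts) > 1 else ""
            fields.setdefault(parts[0], []).append(value)
    return units


def undeclared_multimedia(units) -> set[str]:
    """Names used on Multimedia lines in Prompts but never declared."""
    declared = {f["Id"][0] for kind, f in units if kind == "Multimedia"}
    used = {
        name
        for kind, f in units
        if kind == "Prompt"
        for name in f.get("Multimedia", [])
    }
    return used - declared


COURSE = """
# Say hello
Prompt
Lesson         hello_goodbye
Group          hello_group
Multimedia     hello_multimedia
Text/french    Bonjour
Response       hello
EndPrompt

Multimedia
Id hello_multimedia
File hello1.flv
File hello2.flv
EndMultimedia
"""

units = parse_units(COURSE)
print(undeclared_multimedia(units))
```

An empty result means every Multimedia name used in a Prompt has a matching declaration.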

Complete course file

Here is the whole course file:

# ---------------------------------------------------
# Course

# One course, 'hello_course'

Course
Name           hello_course
Client         dialogue_client
L2             english
Languages      french
EndCourse

# ---------------------------------------------------
# Lessons

# One lesson, 'hello_goodbye'

Lesson
Name           hello_goodbye
PrintName      Hello and goodbye
Description    Learn to say hello and goodbye
Script         toy_hello
EndLesson

# ---------------------------------------------------
# Prompts

# Say hello

Prompt
Lesson         hello_goodbye
Group          hello_group
Multimedia     hello_multimedia
Text/french    Bonjour
Response       hello
EndPrompt

Prompt
Lesson         hello_goodbye
Group          hello_group
Multimedia     hello_multimedia
Text/french    Salut
Response       hi
EndPrompt

# Say goodbye

Prompt
Lesson         hello_goodbye
Multimedia     goodbye_multimedia
Group          goodbye_group
Text/french    Adieu
Response       goodbye
EndPrompt

Prompt
Lesson         hello_goodbye
Multimedia     goodbye_multimedia
Group          goodbye_group
Text/french    A plus
Response       bye
EndPrompt

# ---------------------------------------------------
# Multimedia declarations

Multimedia
Id             hello_multimedia
File           hello1.flv
File           hello2.flv
EndMultimedia

Multimedia
Id             goodbye_multimedia
File           goodbye1.flv
File           goodbye2.flv
EndMultimedia

The script file

The final piece of the course is the script file, which gives the structure of the dialogue. It looks like this:

<?xml version="1.0"?>
<script>
<!-- Strategy for toy hello/goodbye lesson -->

<!-- Hello -->
<step>
   <id>hello</id>
   <group>hello_group</group>
   <next_limit>goodbye</next_limit>
   <next_success>goodbye</next_success>
</step>

<!-- Goodbye -->
<step>
   <id>goodbye</id>
   <group>goodbye_group</group>
   <next_limit>exit</next_limit>
   <next_success>exit</next_success>
</step>

</script>

The script consists of two step units, each of which starts with the line <step> and ends with the line </step>. As you would expect, the first step covers the “hello” part of the dialogue, and the second step covers the “goodbye” part. The name of each step is given by the <id> tag, so the name of the first step is “hello” and the name of the second one is “goodbye”. The next line, tagged <group>, says which group of prompts to use for the step. In the “hello” step, for example, we fetch prompts from hello_group.

The rest of the material in each step tells the system what to do after an exchange has finished. If we look at the “hello” step, we see that there are two lines, marked <next_limit> and <next_success> respectively.

The first of these says where to go if the student has used up all their chances on that step, i.e. been rejected twice: the line

<next_limit>goodbye</next_limit>

specifies that in this case the system should move to the next step, “goodbye”. The following line,

<next_success>goodbye</next_success>

says what to do if the student’s response is accepted. In this very simple example, the result is the same, and we again go to the “goodbye” step.*

The “goodbye” step is similar, except that, as it is the last step in the dialogue, there is nowhere to go. The lines say that the system exits the dialogue if the student either runs out of tries or succeeds.
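Since every <next_limit> and <next_success> line must name either another step or the exit, a script can be checked mechanically. The following Python sketch reads the script with the standard xml.etree.ElementTree module and lists any targets that do not match a step id. The function names are invented for the sketch, and treating “exit” as the only special target is an assumption based on this example.

```python
import xml.etree.ElementTree as ET

SCRIPT = """<?xml version="1.0"?>
<script>
<step>
   <id>hello</id>
   <group>hello_group</group>
   <next_limit>goodbye</next_limit>
   <next_success>goodbye</next_success>
</step>
<step>
   <id>goodbye</id>
   <group>goodbye_group</group>
   <next_limit>exit</next_limit>
   <next_success>exit</next_success>
</step>
</script>
"""


def load_steps(xml_text: str) -> dict[str, dict[str, str]]:
    """Read a script into a dict: step id -> {tag: text} for that step."""
    root = ET.fromstring(xml_text)
    steps = {}
    for element in root.findall("step"):
        fields = {child.tag: (child.text or "").strip() for child in element}
        steps[fields["id"]] = fields
    return steps


def dangling_targets(steps) -> list[tuple[str, str, object]]:
    """List (step id, tag, target) for targets that name no step.

    Assumption: "exit" is the only target that is not a step id.
    A missing tag is reported with a target of None.
    """
    problems = []
    for step_id, fields in steps.items():
        for tag in ("next_success", "next_limit"):
            target = fields.get(tag)
            if target != "exit" and target not in steps:
                problems.append((step_id, tag, target))
    return problems


steps = load_steps(SCRIPT)
print(dangling_targets(steps))
```

For the script above the list of problems is empty; a misspelled step name in one of the two lines would show up as an entry in the list.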