This documentation is under active development. Criticism is very welcome.
LARA is a reading-listening tool that lets you mark up text to help students to improve their reading ability in a new language. Features include:
concordance pages for words, showing the contexts in which each word has previously occurred in the student’s own reading history
colour-codes for how often a word has been seen
audio recordings of both segments and individual words
an easy tool for recording the audio
mouseovers for translations
links to grammar resources etc
images and embedded audio files
LARA pages are intended to be placed on the web and read through a web-browser. All information can be accessed by clicking or hovering with the mouse.
The easiest way to understand what LARA does is to look at an example:
On the left, we have a page of text from Peter Rabbit. The reader has just clicked on the word “ran”, and on the right we get a list of all the places in Peter Rabbit where a form of “run” has turned up in the student’s reading history, which so far consists only of this one book. Note that the list contains examples with both “run” and “ran”. Clicking any word, on the left or on the right of the screen, produces a similar list.
The reader can hover the mouse over any word to hear a spoken recording and get a translation. Hovering over a loudspeaker icon gives a translation of the whole preceding sentence, and clicking it plays an audio recording of that sentence.
The colours show how often words have occurred to date. Red means once; green means two or three times; blue means four or five times; black means more than five times. When you start, everything is red. As you read, more and more of the words turn black.
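The colour bands described above can be illustrated with a minimal Python sketch. This is not LARA's actual implementation: the function names `colour_for_count` and `colour_lemmas` are hypothetical, and the sketch assumes each word in the reading history has already been reduced to its base form (so that “ran” counts as an occurrence of “run”).

```python
from collections import Counter


def colour_for_count(count: int) -> str:
    """Map an occurrence count to a colour band (hypothetical helper).

    Bands follow the description in the text: once is red, two or three
    times is green, four or five times is blue, more than five is black.
    """
    if count < 1:
        raise ValueError("a word that appears on the page has been seen at least once")
    if count == 1:
        return "red"
    if count <= 3:
        return "green"
    if count <= 5:
        return "blue"
    return "black"


def colour_lemmas(lemmas):
    """Count each base form in a reading history and assign it a colour.

    `lemmas` is assumed to be an iterable of base forms, one per word
    occurrence, in reading order.
    """
    counts = Counter(lemmas)
    return {lemma: colour_for_count(n) for lemma, n in counts.items()}


if __name__ == "__main__":
    # Illustrative input only, not taken from real LARA data.
    history = ["run", "rabbit", "run", "garden", "run", "run"]
    for lemma, colour in colour_lemmas(history).items():
        print(lemma, colour)
```

Because the counts are taken over the whole reading history rather than a single page, a word's colour drifts from red towards black as the student reads more text.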
This document explains
how to access content to read it: reader portal
how to create content using LARA tools: constructor portal
You can also download the underlying tools and run them on your laptop, though you’ll probably need some software skills to do that.
In the next section, we’ll start by showing you how to log in to the Portal and interact with some actual LARA content.
Who did what
The original LARA concept was suggested by Cathy Chua.
The first version of the LARA core engine was implemented by Manny Rayner in a mixture of SICStus Prolog and Python 3. The second version, which is described in this document, is being implemented by Manny Rayner and Matt Butterweck in pure Python 3.
The LARA portal is being implemented by Hanieh Habibi in PHP. A substantial part of the design is based on suggestions from Branislav Bédi.
The LARA GUI was implemented by Matt Butterweck in Python 3.
Turkish LARA SAAS servicing from the ITU Turkish NLP pipeline has been developed by Gülşen Eryiğit.
English LARA content has been developed by Cathy Chua and Manny Rayner.
Icelandic LARA content has been developed by Branislav Bédi.
Farsi LARA content has been developed by Elham Akhlaghi and Hanieh Habibi.
Japanese LARA content has been developed by Junta Ikeda.
German and Middle High German LARA content has been developed by Matt Butterweck.
Italian LARA content has been developed by Sabina Sestigiani.
Israeli Hebrew LARA content has been developed by Ghil’ad Zuckermann.
Barngarla LARA content has been developed by Ghil’ad Zuckermann.
French LARA content has been developed by Cathy Chua and Manny Rayner.
Swedish LARA content has been developed by Manny Rayner.
Turkish LARA content has been developed by Fatih Bektaş.
Irish LARA content has been developed by Harald Berthelsen and Neasa Ní Chiaráin.
Dutch LARA content has been developed by Helmer Strik.
This documentation was originally written by Manny Rayner, except for the sections “GUI Window” and “Creating a new content from a template”, which were written by Matt Butterweck. It has been edited and substantially rewritten by Cathy Chua.
Grateful thanks to Johanna Gerlach for help with LiteDevTools, Philippe Baudrion for organising the CALLector webspace, and Lionel Nicolas and Verena Lyding for flexibility in supporting contacts between the various people involved in developing LARA.
Table of contents
- The reader portal
- The constructor portal
- Advanced portal functionality
- Using the Python code: prerequisites
- Using the PHP code: prerequisites and installation
- Local content
- Directory structure
- Writing a local config file
- Config file parameters
- Format of tagged LARA text
- Adding HTML formatting to LARA text
- Including non-L2 text
- Adding <audio> tags
- First invocation of LARA compiler (“resources”)
- Recording LARA audio using LiteDevTools
- Filling in LARA translation spreadsheets
- Adding notes to words
- Second invocation of LARA compiler (“word_pages”)
- Editing a file from the content directory
- Opening the compiled HTML file in the browser
- Creating a new content from a template
- Making your LARA pages accessible
- Tagging and segmentation
- GUI Window
- Distributed content
- Internal documentation
- Invoking TreeTagger
- Performing multi word expression annotation
- Performing the “resources” step
- Performing the “word pages” step
- Processing LDT output
- Adding metadata for distributed LARA
- Merging two language resource directories
- Merging two translation spreadsheets
- Getting voices and L1s for a resource
- Getting voices and L1s for a resource file
- Getting the audio and translation files for a corpus resource
- Downloading a resource
- Exporting a corpus resource as a zipfile
- Importing a corpus resource from a zipfile
- Checking well-formedness of a config file
- Structured diff on tagged corpus
- Unzipping a file
- Converting a CSV file into a JSON file
- Converting a JSON file into a CSV file
- Compiling a reading history
- Incremental compilation of a reading history
- Getting a list of pages for a resource
- Cleaning the reading history cache