| Project ref. no. | MLIS-115 DicoPro |
| Project title | DicoPro - On-line dictionary consultation for language professionals on Intranet |
| Deliverable status | Public |
| Date of delivery | 29/10/1999 |
| Deliverable number | D10.2 |
| Deliverable title | Validation |
| Status & version | Final version 29/10/1999 |
| Number of pages | 73 pages |
| WP / Task responsible | WP-10 Validation / Hachette Livre |
| Author(s) | Marie-Françoise Poullet, Hachette Livre |
| EC Project Officer | Poul Andersen |
| Keywords | Validation cycles, Client software, Server, Dictionary data, Interface, Tests, Validation agreement |
| Abstract (for dissemination) | This report describes the periodic validation cycles and highlights the points that still need to be developed before the product is deployed in a professional context. |
Validation
Workpackage WP10, Deliverable D10.2
Release 0
Marie-Françoise Poullet (Hachette Livre)
Executive Summary
The aim of the validation workpackage was to test each element of the DicoPro tool not only in its strictly technical aspect (correct implementation of the functionalities) but also in its practical aspect, i.e. its compliance with the needs of the final user (user-friendliness, relevance, speed of the tool). This document presents the different cycles of tests that were carried out, describes the difficulties met during testing, reports the remaining bugs and proposes some improvements to the tool. In conclusion, the point of view of the potential user shows the potential interest of the DicoPro tool.
1.1 Goals of Validation Workpackage
1.3 DicoPro specificity and WP10 report structure
2.1 Drawing up of the document
2.2 Starting tests without validation agreement
2.3 Signing of the final validation agreement
3.1 Partners involved in testing and testing environment
3.2 Release of testing versions and rounds of tests
3.2.1 18/05/99 : DicoPro_01 Client available
3.2.2 29/06/99 : DicoPro_02 Client available
3.2.3 06/08/99 : DicoPro_03 Client available
3.2.4 23/09/99 : DicoPro_04 Client available
3.3 Conclusion on the final version
5. Dictionaries, data and layout
5.2 Data display and structure
6.2.1 As far as hardware, OS and intranet are concerned
6.2.2 As far as data are concerned
6.2.3 As far as users are concerned
6.3 Perspective for DicoPro tool
Validation Workpackage, named WP 10, is aimed at checking that:
the DicoPro tool, which is made up of two software elements (DicoPro Client and DicoPro Server), can be loaded, installed and used on various platforms
the tool meets user expectations in terms of functionality, performance and usability, i.e. queries can be made in the available dictionaries and give results that satisfy the users' expectations.
An analysis of DicoPro performance when used by professional translators should make it possible to evaluate the relevance of the tool and make its deployment in an industrial context easier.
WP10 essentially depends on the progress of previous or ongoing steps, i.e. the Tool Adaptation (WP6), Dictionary Information Service (WP7), Installation & Training (WP8) and Tuning (WP9) workpackages. It implies the active participation of the five WP10 partners, enabling ISSCO to work on and carry out improvements to the software and the layout of the lexical data.
Harper-Collins and Hachette Livre, who have supplied dictionaries, shall give their support to ISSCO as publishers of monolingual and bilingual dictionaries. L & H Mendez, MTL and Xynos shall help with the understanding of users' needs and the adaptation of the DicoPro tool to such needs.
The checking/validation step was to start at the beginning of April and finish at the end of August.
A meeting was held in Paris on April 16 to define the validation context, the bug reporting method and the projected schedule.
The versions of DicoPro Client, DicoPro Server and the dictionary data were downloaded from the ISSCO FTP site. Test results (and especially bugs) were reported to a dedicated e-mail distribution list including all WP10 partners; this list and the bug reports are available at the following address: http://www.issco.unige.ch/projects/dicopro/mail/dicopro-test.
This document offers a synthesis of WP10 activity. For more details, the above URL may be consulted.
DicoPro provides an open system to language professionals who are connected to an Intranet. It enables them to consult dictionaries from multiple sources through a uniform interface.
WP10 partners have had to test three elements of DicoPro:
a Client software which communicates via intranet with
a server, on which
the dictionaries are stored.
The data are confidential. Consequently, a confidentiality agreement (hereafter called the validation agreement) was to be drawn up and signed by WP10 partners before all the data could be supplied for consultation.
This document will focus on DicoPro specificity. Following this introduction:
Part two will present the validation agreement
Part three will sum up DicoPro Client tests
Part four will sum up DicoPro Server tests
Part five will analyse dictionary data
As a conclusion, part six will present the difficulties that emerged during the validation step, will show the limitations of this workpackage and synthesize the points of view of the target testers, who are also the target users.
In annex, the Validation agreement and the final testing report forms are included.
The aim of the validation workpackage was not only to perform tests to check the technical functions of the software, but also to validate the tool in a professional context. Therefore, it was necessary to make available to WP10 partners the whole corpora of all the dictionaries.
A confidentiality agreement had to be drawn up and signed by publishers and the other WP10 partners, using as a model the agreement that was previously signed between ISSCO and the publishers in November 1998.
The Hachette Livre Legal department drew up the validation agreement and released a first draft on February 26 1999 so as to give the partners enough time to improve it and sign it before the beginning of the validation period. At the meeting held in Paris on April 15 and 16 1999, the document was not yet finalized. Still, a discussion between the partners and the Hachette Livre lawyer made it possible to define jointly and clearly the terms and content of the agreement.
2.2. Starting tests without validation agreement
DicoPro Client version_01 was released on May 18 1999. Since a few WP10 partners had neither accepted nor given feedback on the new draft of the validation agreement, it was decided to start the tests on letter A only, with the available dictionaries, i.e. Hachette-Oxford French-English, English-French and Harper-Collins English-Italian.
Version 4 of the validation agreement was drawn up at the end of June, but feedback from some partners was still needed before it could be sent for signing.
2.3. Signing of the final validation agreement
Eight copies of version 4, which had been agreed as the final one, were sent on August 10 1999 to the first of the eight signatories (MTL). After a very long and time-consuming journey between the United Kingdom, Belgium, Greece, Switzerland, the Netherlands and France, the document has now been signed by all the WP10 participants. See the validation agreement in annex A (p. 14).
3. DicoPro Client
The DicoPro interface is independent of any specific platform and can run regardless of the type of operating system. The wider the range of client hardware, the better for a complete validation of the DicoPro tool. The partners' hardware and environments were indeed diverse.
DicoPro Client was tested at L & H Mendez by two persons on a Toshiba Satellite Pro 460 CDT, Windows 95, 64 MB RAM and a PC Pentium Desktop, Windows 95, 32 MB RAM.
Testers qualifications: translator/terminologist and director R&D.
DicoPro Client was tested at MTL by four persons on networked Solutions PC (233 MHz), Windows NT 4.0.
Testers qualifications: Language engineering consultant, plus three translators.
DicoPro Client was tested at Hachette Livre by two persons on two Macintosh G3 (300 MHz and 350 MHz), Mac OS 8.1 and 8.6. The Client software was also installed on a Windows 98 PC, but in the end this computer could not be used for testing.
Testers qualifications: lexicographer and multimedia projects manager.
DicoPro Client should have been tested at Harper-Collins by one person on Sun Ultra-5, Solaris 2.5.
Tester qualifications: analyst-programmer, typesetter.
DicoPro Client should have been tested at Xynos on a Pentium 200 MMX, 96 MB RAM, Windows 98.
Tester qualifications: technical manager and translators.
ISSCO team was working on a Client computer, on Unix (Sun/Solaris 2.6).
3.2. Release of testing versions and rounds of tests
Three rounds of tests were necessary to get to version 4, considered as the final version of the project and tested in a fourth round.
3.2.1. 18/05/99 : DicoPro_01 Client available
On May 14 1999, ISSCO asked WP10 partners to install a Java environment in a version compatible with the software (Java 1.2). A problem appeared with Macintosh and OS/2, because this version of Java was not available for these environments: Hachette Livre and LIM could not participate in the first round of testing.
L & H Mendez and MTL could load and install DicoPro_01 Client easily, but the connection to ISSCO server turned out to be more complex. L & H Mendez could connect to ISSCO on May 31, MTL on June 17. Both L & H Mendez and MTL experienced problems with the display of the dictionaries, which were later resolved.
Harper-Collins and Xynos did not participate in the first round of tests.
Bug reports put emphasis on cosmetic problems in the interface and on typographical errors in the layout of the dictionary entries (fonts, especially IPA fonts, size of characters, too many spaces between words).
A design problem was raised during this round of tests: the "right" and "left" arrows step through the entries of the dictionary concerned in alphabetical order in the right part of the screen, whereas the list of headwords displayed in the left part of the screen can be the result of a search unrelated to the entry displayed. This point, which immediately seemed illogical to the testers, has not been corrected yet. Will the professional user be bothered by this?
3.2.2. 29/06/99 : DicoPro_02 Client available
DicoPro_02 Client showed very few changes compared to the previous version, except for its compatibility with Java 1.1. Some of the interface problems were fixed. In the new version of the dictionaries, the bad display of accented letters on Macintosh was solved. Nevertheless, the display of IPA fonts is still a problem on the Macintosh at the time of writing. It seems that only a planned new version of Java for Macintosh could solve the problem, but this version is still not available.
DicoPro_02 was installed without problems at L & H Mendez, MTL and Hachette Livre. The server software remained at ISSCO.
Harper-Collins and Xynos did not participate in the second round of testing.
Bug reports put emphasis on queries: some queries on words included in the dictionaries returned no answer. Moreover, starting DicoPro was considered quite time-consuming.
In version 02, translators at MTL appreciated the possibility of automatic display of the first word in a list of results. In fact, nine times out of ten these translators felt they would make queries on exact words, instead of pattern or suffix.
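The three query types mentioned here (exact word, pattern, suffix) can be illustrated with a minimal sketch. It is not DicoPro's implementation: the function names, the sample headword list and the use of shell-style wildcards for patterns are all assumptions made for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical sample of headwords; the real dictionary data is not reproduced here.
HEADWORDS = ["abandon", "abbey", "ability", "able", "abnormal", "aboard"]


def exact_query(word, headwords):
    """Return the headwords identical to the query word."""
    return [h for h in headwords if h == word]


def suffix_query(suffix, headwords):
    """Return the headwords ending with the given suffix."""
    return [h for h in headwords if h.endswith(suffix)]


def pattern_query(pattern, headwords):
    """Return the headwords matching a wildcard pattern.

    Assumption: '*' stands for any run of characters and '?' for one character.
    """
    return [h for h in headwords if fnmatchcase(h, pattern)]
```

With this sample list, `exact_query("able", HEADWORDS)` returns a single headword, whereas `suffix_query("y", HEADWORDS)` and `pattern_query("ab*d", HEADWORDS)` scan every headword, which is one reason such searches may be slower on full dictionaries.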
Problems in the typographical layout of the entries were pointed out, but at that time ISSCO's priority was the development of the server together with a new version of the client software. Work on the dictionary data was postponed. But the layout as it stood then was quite worrying, since the logical structure of the entries was not respected.
3.2.3. 06/08/99 : DicoPro_03 Client available
DicoPro_03 Client was no longer compatible with the server version installed at ISSCO. It had to be tested in an intranet environment since DicoPro_03 server was made available on July 26 1999.
Because WP10 partners had some difficulty installing their own server, ISSCO allowed testers to connect to the University server during the full testing period.
The new round of tests was performed by MTL and Hachette Livre. Xynos did not participate, and both Harper-Collins and L & H Mendez tried but did not succeed in getting a working installation.
Two essential functions of the tool were still not available in this version: advanced and multiple searches. The typographical layout of the entries was not changed. Changes in the program were above all designed to improve the performance of the tool.
Testers spent a huge amount of time installing the Server and consequently had much less time to test the new version of the Client. Nevertheless, remarks were made about multiple windows, slowness of the tool and many other points (cf. http://www.issco.unige.ch/projects/dicopro/mail/dicopro-test).
The main problems raised in DicoPro_03 Client were:
the layout of the entries did not comply with their lexical structure
bad results of queries on words with accented characters (with suffix and pattern queries especially).
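The report does not say what caused the failures on accented characters. One common cause of this kind of problem is that the same accented letter can be stored in two different Unicode forms (a single precomposed character, or a base letter followed by a combining accent), so that a literal comparison fails. The sketch below assumes that cause, which may not be the one that affected DicoPro, and uses hypothetical function names; it normalizes both the headword and the query before a suffix comparison.

```python
import unicodedata


def normalize(text):
    """Bring text to Unicode NFC form so equivalent accented letters compare equal."""
    return unicodedata.normalize("NFC", text)


def suffix_match(headword, suffix):
    """Suffix comparison performed on normalized strings (illustrative helper)."""
    return normalize(headword).endswith(normalize(suffix))


composed = "abb\u00e9"     # "abbé" stored with a precomposed é
decomposed = "abbe\u0301"  # "abbé" stored as e + combining acute accent

# A literal suffix test misses the decomposed form; the normalized one does not.
literal_hit = decomposed.endswith("\u00e9")
normalized_hit = suffix_match(decomposed, "\u00e9")
```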
3.2.4. 23/09/99 : DicoPro_04 Client available
DicoPro_04 was tested in a systematic way by filling out an exhaustive grid covering all the projected functionalities. The exhaustive test carried out by L & H Mendez, MTL, Hachette Livre and Xynos can be consulted in annex B (p. 19). The remarks it contains should be taken into account when implementing a commercial version of the product.
MTL, Hachette Livre and Xynos were able to test DicoPro in an Intranet environment, L & H Mendez over the Internet through ISSCO's server. Harper-Collins tested both in an Intranet and through ISSCO's server.
3.3 Conclusion on the final version
Nearly all functions are correctly implemented. But there are still some problems:
- searches on words with accented letters are impossible on Macintosh
- indexes that appear at the top of some entries do not work
- lack of stability of the open dictionaries after multiple or advanced searches
- problems with personal bookmarks
- the "save" and "cancel" functions in the preferences need to be adjusted
- difficulties or ambiguity when using various frames in different dictionaries. This function should be reconsidered and certainly simplified.
The speed of the tool is correct in an Intranet context; however, it still seems to be an issue at Harper-Collins, where the Client and the Server are running on the same machine. It will be necessary to confirm that searches remain fast (above all pattern and suffix searches) when the full corpora of the dictionaries are on line.
The "dictionaries consultation" interface is simple and effective. The icons could be more homogeneous from a design point of view, and some of them could be more representative of their functions. The contextual inactivity of some icons should be clearer.
The user documentation matches the version tested and is very clear. Yet it is still not always available on line and, above all, it is not contextual.
The documentation for the installation of the tool (with its Java environment and fonts) should be checked before any industrial distribution.
4. DicoPro Server
The first tests on DicoPro Client were carried out using a connection to the ISSCO server. For security, each computer assigned to validation at the partners' firms was registered at ISSCO with its IP number. The difficulties encountered in connecting to the ISSCO server delayed the start of the tests. They did not seem that serious, because they were due to an Internet connection problem and because the goal of the validation step was to prove the efficiency of the program, i.e. the running of the Client and Server software on an Intranet.
Expected installation platforms for servers:
- L & H Mendez. Hardware: PC Pentium II (128 MB RAM), Windows NT 4. Qualifications of the person who tried to install the server: System engineer.
- MTL. Hardware: networked Solutions PC (233 MHz), Windows NT 4. Qualifications of the person who installed the server: Language engineering consultant and systems administrator.
- Hachette Livre. Hardware: Sun SPARCStation 5, Solaris 2.7. Qualifications of the person who installed the server: manager of Hachette intranet sites, Net engineer.
- Harper-Collins. Hardware: Sun Ultra-5, Solaris 2.5. Qualifications of the person who installed the server: analyst-programmer, typesetter.
- Xynos. Hardware: Intel Pentium II, 350 MHz, 128 MB RAM, Windows NT 4.0, Service Pack 5. Qualifications of the person who installed the server: technical manager.
On July 26 1999, the first version of DicoPro server was available. It was called DicoPro_03 Server in order to bring it into line with the client software DicoPro_03 Client.
A second server DicoPro_04 was available on October 4 1999. It is compatible with DicoPro_04 Client.
4.2. Installation problems
WP10 partners, except Hachette Livre, met serious difficulties when installing their server and defining its parameters. These problems had been solved by the end of the project at MTL, Harper-Collins and Xynos, but not at L & H Mendez.
When Hachette Livre loaded and installed the server, the connection was successful but access to the dictionaries was denied because of a licence problem. The bug, soon resolved, was due to the time difference between the partners' countries!
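The report gives no detail on how the licence check worked. As an illustration of how a time difference between countries can lead to such a denial, the sketch below assumes a hypothetical licence with a start time and compares it once using clock readings without zone information and once using timezone-aware values. All names, offsets and dates are invented for the example.

```python
from datetime import datetime, timedelta, timezone


def is_valid_naive(now_local, valid_from_local):
    """Compare raw clock readings, ignoring which time zone each comes from."""
    return now_local >= valid_from_local


def is_valid_aware(now, valid_from):
    """Compare timezone-aware datetimes, i.e. actual instants in time."""
    return now >= valid_from


# Hypothetical offsets: licence issuer at UTC+2, client machine at UTC+1.
issuer_tz = timezone(timedelta(hours=2))
client_tz = timezone(timedelta(hours=1))

valid_from = datetime(1999, 8, 1, 9, 0, tzinfo=issuer_tz)  # 07:00 UTC
now = datetime(1999, 8, 1, 8, 30, tzinfo=client_tz)        # 07:30 UTC

# Clock readings alone say 08:30 is before 09:00, so access is refused.
denied = not is_valid_naive(now.replace(tzinfo=None), valid_from.replace(tzinfo=None))
# Compared as instants, the licence has already started.
granted = is_valid_aware(now, valid_from)
```

Exchanging timestamps in a single reference zone such as UTC is one common way to avoid this class of error.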
Bug reports show the great efforts made by MTL, Xynos and L & H Mendez to identify and explain their problems, and those made by ISSCO to answer all questions as soon as possible. Problems were diverse: installation of the different modules, definition of parameters, connection, etc. MTL achieved its first client-server connection on an intranet on September 24 1999, Harper-Collins on October 6 1999, Xynos on October 7 1999.
A major difficulty in the validation step was connecting the Client and Server software via an Intranet: it seems essential to verify that the WP10 testing cycles have really revealed all the problems related to the implementation of DicoPro in a firm and have made it possible to solve them.
5. Dictionaries, data and layout
For the main part of the validation cycle, tests have been performed on the data of letter A in three dictionaries only:
- Harper-Collins English-Italian
- Oxford-Hachette English-French
- Hachette-Oxford French-English
Other dictionaries became available at the end of the validation cycle.
Bug reports put emphasis on the typographical layout, which did not comply with the data structure. From August 16 until September 23 1999, ISSCO worked on a new processing of the dictionary data and submitted a few samples to Hachette Livre. The problem turned out to be more complex than first thought: post-processing would have had to be followed by detailed checking and "cleaning up" of lexical entries. There was no time left to define such a new processing step which, besides, would have had to be adapted to each dictionary.
On September 24 1999, a new version of letter A for English-Italian, French-English and English-French dictionaries was available on ISSCO server.
Also on September 24 1999, letter A of Italian-English and monolingual French dictionaries was available on ISSCO server.
On September 26 1999, letter A of Spanish-English and English-Spanish dictionaries was available on ISSCO server.
These dictionaries were also available for partners to download and install on their own servers.
5.2. Data display and structure
In this project, priority was not given to correcting the typographical layout of data. Obviously, this point would have deserved a huge amount of time, so that publishers and users could think about the desirable layout of an on-line dictionary.
A question emerged during the validation step: should DicoPro make the translator's work easier through a uniform presentation of all the dictionaries, or should each dictionary retain its own identity and that of its publisher?
The data as supplied to ISSCO showed discrepancies in structure and SGML tagging from one publisher to another. Conversion of the dictionaries from SGML to HTML eventually proved more complex than expected, because the SGML format did not systematically reflect the detailed internal structure of the entries, but sometimes only took into account the typographical layout wished by the lexicographer. Post-processing designed to disambiguate some lexical entities could not solve all the problems.
Please refer to the Dictionary testing form filled out by L & H Mendez, MTL, Hachette Livre and Xynos in Annex D (p. 65) for precise remarks about the presentation of search results and the layout of entries.
Suggestions for better readability of the list of entries returned by a query and of the layout of a dictionary entry:
- in a list of multi-word compounds, only the first word appears, so it is necessary to display the whole entry to see if the relevant compound word is really defined
- for homonyms, it would be helpful if more information were given in the headwords list
- the number of words matching a search should appear at the bottom of the window for instance.
- the information could be more colour coded to separate the sense tags from the lexical items themselves, to emphasize compound words
- a use of full potentiality of the hyperlinks and cross-references
- avoid the italic font
- avoid the vast amount of blank space that makes the use of the scrolling bar necessary
- more effective links between elements in the left and right windows should be studied.
As the tool stands, the layout of entries remains too similar to the layout of paper dictionaries. The full potential of the tool has not been used.
5.3. Relevance of queries
It is quite difficult to assess the relevance of the searches in a professional context since the tests were not performed at full scale.
It seems that the tool is somewhat too powerful compared to the everyday needs of translators and that pattern and suffix searches are little used. It also seems that the Boolean logic of the advanced search is not particularly intuitive for the end user.
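To make concrete what Boolean logic in an advanced search involves, here is a minimal sketch based on set operations. It does not reproduce DicoPro's query syntax or data model: the sample entries and the function names are assumptions made for illustration.

```python
# Hypothetical entries: headword -> simplified entry text.
ENTRIES = {
    "abandon": "verb to give up completely",
    "abbey": "noun a monastery or convent",
    "able": "adjective having the power to do something",
    "aboard": "adverb on or onto a ship",
}


def containing(term, entries):
    """Return the set of headwords whose entry text contains the term as a whole word."""
    return {head for head, text in entries.items() if term in text.split()}


def search_and(a, b, entries):
    """Entries containing both terms (set intersection)."""
    return containing(a, entries) & containing(b, entries)


def search_or(a, b, entries):
    """Entries containing at least one of the terms (set union)."""
    return containing(a, entries) | containing(b, entries)


def search_not(a, b, entries):
    """Entries containing the first term but not the second (set difference)."""
    return containing(a, entries) - containing(b, entries)
```

For instance, `search_not("to", "verb", ENTRIES)` keeps only the entries that contain "to" and excludes those that also contain "verb". Having to reason about intersections and differences in this way may be part of what end users found unintuitive.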
A first difficulty in planning the DicoPro validation came from the tool itself: the software and its functions, the user interface and its ergonomics, and the layout of the data, which had to conform to a logical structure, all had to be checked at the same time. While the first two points could be tested by everyone following a defined method, the third point fell within the competence of the publishers who had supplied the structured data. However, such work required much more time and staff than the publishers had.
The second stumbling block came from the number of partners involved in WP10 (a natural tendency is to rely on the work of other partners), from the distance between them (it is quite difficult to follow a discussion via e-mail) and apparently from the different competences of the partners (diversity of qualifications is indeed precious but sometimes makes things more complex).
The third difficulty was obviously linked to the holidays period: holidays for partners involved in testing but also for IT personnel, whose assistance to the partners turned out to be crucial during the validation step.
Finally, changes of WP10 actors at ISSCO and Hachette Livre made it much harder to maintain the continuity that a project like DicoPro requires. In addition, ISSCO had to move at the end of September, when partners were already behind schedule.
6.2. Validation limitations
We cannot claim that exhaustive tests have been carried out as far as hardware and OS are concerned, but WP10 partners could demonstrate the principle of independence of the DicoPro software from the platform, since the Client could run in NT 4.0, Windows 95, Windows 98, Unix and MacOS environments and the Server could be installed on NT 4.0 and Unix platforms.
The difficulties with having Client and Server running together on intranet (or Internet) show that tests should be carried on from a Net implementation perspective if this product were to be distributed in an industrial context. A systematic analysis of the different types of nets should enable a complete set of documentation to be drawn up that could be used in any technical situation and provide the opportunity to correct bugs if needed.
6.2.2. As far as data are concerned
As said before, only letter A of three dictionaries was available during the testing period. The delay in signing the validation agreement meant that full-scale testing of DicoPro, as wished by all the partners, could not be performed. This agreement is now signed, and the real validation of lexical data could start in better conditions. It would be the next step of the project.
6.2.3. As far as users are concerned
Problems linked to the different installations (Client, Server, connection to ISSCO server, connection to intranet servers, fonts setting, etc.) required a lot of time to solve and had a real impact on the user tests.
The fact that all the dictionaries were not available and that all the partners but one were not able to install DicoPro in their intranet system prevented WP10 partners from validating the tool in a professional context.
The final user should have validated the product in his/her standard work conditions. But due to technical difficulties and delay, L & H Mendez, MTL and Xynos staff could not work with DicoPro tool as it was expected.
6.3. Perspective for DicoPro tool
In conclusion, the DicoPro tool still needs some corrections and improvements before being sold. Its use by language professionals must now be tested in a real professional context. The professional opinion will be crucial for the finalization of the product. Publishers will have to take the professional assessment into account to organize their dictionary data in an efficient layout.
Consequently, it is important to conclude this report with both MTL's assessment (as a professional translator) and Hachette Livre's proposal (as a dictionary publisher), which give a positive perspective for DicoPro. These conclusions are taken from the Dictionary testing forms (see annex D, p. 65).
MTL :
Our testers felt that the tool had the potential to be faster than using paper dictionaries (and more convenient in terms of desk space!) and faster than CDROMs particularly those CDROMs which are not uploadable on to a hard drive (so that you have to keep changing the disc in your CDROM drive). However, from the current validation exercise it is not possible to say that the tool is definitely faster for several reasons:
- Only the letter A was available so that the testers could not use the tool for real translation work.
- Not enough dictionaries were available so that the testers could not use the tool in an optimal real life situation (i.e where all the dictionaries a translator would normally need are available through the interface). If the translator still has to use CD-ROMs, thesauri and other lexical resources as well as DicoPro then little time is saved by using the tool.
- There have been several bugs, connection problems, server drops etc during the validation cycle which have either slowed the testers down, caused them to stop using the tool altogether, or resulted in them spending time filling in bug reports. This is not a criticism of the tool - this is exactly what should happen when testing a prototype. But having spent a morning compiling bug reports, taking screen dumps etc, it is not really reasonable to expect that tester to say that the tool is faster than using a paper dictionary! Usability testing is really the next stage, once the bugs and connection problems have been ironed out.
Our testers felt that the tool had the potential to be a very useful piece of software, as long as it had the backing of the major dictionary publishers, and preferably terminological resources should be available through the tool as well. One translator remarked that it would be useful to be able to connect to a remote server via the internet, so that one could have the software installed on a laptop PC and could access the data from anywhere where there was a phone line (eg a hotel room). This would harness one of the most useful aspects of the tool - the fact that you don't need to carry around several CDROMs or paper dictionaries.
Hachette Livre :
Translators should indeed save time using an on-line platform offering many dictionaries. The presentation of data should be harmonized for all the dictionaries since it would make the search much easier and faster: the end user would get accustomed himself/herself to a particular display of data and would find his/her bearings in the various dictionaries.
A solution for the future would be to propose to new data providers a tagging model developed in association with ISSCO specialists and in accordance with the requirements of terminologists and lexicographers. The added value of new dictionaries for the DicoPro platform lies in the lexical data themselves, which must be more specifically oriented towards technical and scientific fields or purely literary vocabularies, because that is where the needs of translators lie.
DicoPro
ON-LINE DICTIONARY CONSULTATION FOR
LANGUAGE PROFESSIONALS ON INTRANET
http://www.issco.unige.ch/projects/dicopro_public/
Email:
dicopro@issco.unige.ch
The DicoPro project was made possible by
European Community and Swiss government
funding.
Contents copyright ©1998-1999 DicoPro
Consortium