In what follows and throughout this report, ``tools'' is taken in its widest sense to mean any tool which may help the work of the SdT as a whole. Thus planning and management tools are included as well as tools more directly related to the production of translations.
One item which is not, strictly speaking, a computerized tool nonetheless plays an important role in facilitating the computerization of text handling: the EUROLOOK standard for document preparation, which is intended to ensure a uniform appearance of texts and whose use also facilitates conversion between the different text-processing systems in use. (Note that some of the requesting services use Word as their text-processing system, rather than Word Perfect, which is standard throughout the SdT.)
At the end of 1994, the following tools were in use.
There is also a project of the same name under way, managed by D.G. XIII in cooperation with the SdT, which aims at the intelligent integration of existing resources and the development of additional tools, such as alignment tools, concordance tools and databases of linguistic resources, including translation memories.
Recent policy has been to concentrate on the development of SYSTRAN as a tool for translating the less disseminated languages into German, English and French. Along the same lines, there are Contracts of Association with some national administrations, e.g. the Greek government, for the development of certain language pairs.
It is perhaps worth noting that, of all the tools available, some are used much more heavily than the rest. EURODICAUTOM and CELEX in particular are frequently consulted and much appreciated.
In order to find a solution, much of the material has been standardized. First a standard level of information was decided upon, and then a nomenclature created specifically with translation in mind was defined. The first nomenclature dealt with public works, and nomenclatures for supplies and services were subsequently added.
The system allows documents to be keyed in through codes, directly in the Publications Office. Text in each of the languages is generated directly from the codes. Consequently, a team of about 80 people, mainly involved in document preparation but also including a few translators, succeeds in generating a million pages of multi-lingual text a year.
By the end of 1995 a full nomenclature will be available and the whole of the translation process will have been automated. (At the moment, a brief summary is still translated by conventional means.)
All of the computing machinery and tools listed above of course require maintenance. Many of the tools are also affected by the change from UNIX and Q-One to PCs with Windows and Word-Perfect; their adaptation is either completed or under way. Other tools evolve with time or with the changing needs of the environment. Here, we simply pick out some of the important developments planned for the near future. This section will also begin to take us into some of the problem areas connected with the present and future functioning of the SdT.
However, encouraging the exchange of documents by electronic means does raise some problems. First, it is important that requests going through POETRY are also recorded in SUIVI. This is linked with a very common work-flow management problem caused by working electronically: because there is no physical object in the form of a pile of paper, people forget to send requests or send them twice, and similarly sometimes forget that a document has been received. The solution here is a change in working habits, but experience with other electronic tools shows that the change can take time. Another problem is validating the authorisation of a request for translation: signatures are not easily sent by electronic mail. On a more banal level, printing and photocopying may lead to organisational problems. Under the non-electronic system, requesters prepare as many copies of a document as the number of languages they request translations into, plus one extra. With an electronic system, this burden passes to the translation services, who must print out and copy the documents. This problem is aggravated by the difficulty the SdT experiences in keeping secretarial staff. The problem would, of course, disappear if all translations were done completely electronically. But several of the translators interviewed pointed out that even when they work on the screen, they like to have a paper copy of the translation at their side to refer to, partly because scrolling backwards and forwards on the screen is tiresome and inefficient, partly in order to have an overview of the text easily available, and partly because looking at a screen all the time is very tiring.
A problem with the current functioning of the SdT that was raised very frequently should be mentioned in connection with the document server. A document receives an identity when it arrives in the Translation Services and is given a translation number. In some cases, this number disappears when the document leaves the services and is replaced by a COM, SEC or C number. Moreover, a single SdT document does not always correspond to a single COM or SEC document: a COM document may very well consist of several SdT documents put together. The contrary may also sometimes be the case. Furthermore, the text and its translations may subsequently be modified, by the requester, by the Legal Services or by the Council. Sometimes, but by no means always, the modifications come back to the Translation Services. If they do not come back, the translations stored in the SdT archives do not correspond to the definitive version which is eventually published in the Official Journal. The translation may in any case be difficult to trace because of the change in numbering. The COM, SEC, C tool mentioned earlier is a partial response to this problem, but the only way to be absolutely sure that the definitive version is available for future reference would be to have access to the Official Journal versions, and perhaps to be able to search them by key-word rather than by document numbers which are susceptible to change. Full-text search would also be useful. Unfortunately, making the text of the Official Journal available in electronic form for archiving and retrieval involves a certain number of legal problems, since the Publications Office has separate contracts with a number of different publishers for publication of the Official Journal. CELEX makes the official versions of some documents available, but the issue was raised often enough to suggest that it is still felt to be pressing.
It should be noted too that the practice of modifying translations after they have left the Translation Services can be a source of considerable frustration for the translators, who sometimes see their work ``corrected'' in such a way as even to introduce grammatical errors. And since the published version is the official version, this can also lead to having to quote the offending passage!
The document consists of some 1,700 pages, and has up until now been prepared literally by using cut and paste on paper to introduce the modifications which take place each year. To increase the burden, even in the literal sense, the operation was carried out on A3 paper. The whole text was then re-typed in the Publications Office.
A prototype of the system allows the whole to be stored in a central repository as a single document, in the form of ``editorial objects'', coded in SGML. Each object contains a segment of text. Other information such as what part of the whole this object is, what language it is in, what version it is, who is working on it and so on is associated with the object. A set of filters allows the various actors to use their own tools: WinWord, Word-Perfect, Excel, Interleaf. This avoids re-typing, and also allows translation of parts which are unlikely to change to proceed without waiting for the whole document to be finished.
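The idea of ``editorial objects'' can be illustrated with a minimal sketch. It is not the prototype itself, which stores the objects in SGML; it only models, in Python, a text segment together with the associated information listed above. The class name, the field names and the two helper functions are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EditorialObject:
    """One text segment plus the information associated with it (hypothetical fields)."""
    part: str        # which part of the whole document this segment is
    language: str    # language of the segment
    version: int     # version of the segment
    worker: str      # who is currently working on it
    text: str        # the text segment itself


def latest(repository, part, language):
    """Return the highest-version object for a given part and language, or None."""
    matches = [o for o in repository
               if o.part == part and o.language == language]
    return max(matches, key=lambda o: o.version) if matches else None


def untranslated_parts(repository, source, target):
    """Parts that exist in the source language but not yet in the target language."""
    have = {o.part for o in repository if o.language == target}
    need = {o.part for o in repository if o.language == source}
    return sorted(need - have)


# A tiny central repository holding the whole document as separate objects.
repository = [
    EditorialObject("title-1", "fr", 1, "author", "Texte initial"),
    EditorialObject("title-1", "fr", 2, "author", "Texte modifié"),
    EditorialObject("title-1", "en", 1, "translator", "Initial text"),
    EditorialObject("title-2", "fr", 1, "author", "Autre partie"),
]
```

Because each part is addressed separately, a part that is unlikely to change can be handed to translation as soon as it exists, which is the behaviour `untranslated_parts` is meant to suggest.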
The system was successfully put into operation for the first time in 1995. Reactions from the three main actors are positive.
D.G. XIX had to change their working habits from working on paper to working on an electronic document, and had to hire auxiliary typing help to get the data entry done. Nonetheless, the advantage of having a clean electronic copy immediately available for internal distribution outweighed the minimal rise in cost.
For the SdT, the improvement was very noticeable. It proved possible to meet the deadlines easily, and because the document entered mainline production, it caused no major disruption to the normal functioning of the service.
The Publications Office experienced the most problems, primarily because the printer was unable to adapt to processing SGML files rather than introducing all data from scratch. Consequently, the time savings which had been foreseen did not materialize, although the deadlines were kept. Experience should help to iron out most of the difficulty.
The system will now be extended to include the other two Institutions, Council and Parliament. It will also serve as a model for a new project, SEI-Leg (for legislation) to provide for a central repository of Commission documents being worked on.
All in all, the experience proved the usefulness of storing documents in SGML, which makes them independent of the different word-processing systems used by the different actors and of changes over time arising from successive versions.
The procedure for selecting free-lance translators has recently changed. Previously, most translation units had a group of free-lances with whom they worked regularly. The free-lances were almost ``distant colleagues'', who became used to the texts dealt with by a particular service and to Commission procedures and in-house jargon as well as to the specific terminology of the work of the unit.
However, the volume of work passed to free-lances became so large that a new procedure, aiming at transparency in selection, had to be put in place. Jointly with the Parliament translation services, a call for expressions of interest was published. The three thousand or so individuals or agencies who responded were asked to translate a brief text for each of their languages, and those whose work was satisfactory are now recorded on a central register. This implies that any translation unit in the SdT can, subject to price constraints, call on the services of any free-lance, as can the services of Parliament. We shall see later that the change in procedures has given rise to some disquiet amongst the Heads of Department and those responsible for central or group planning. Here we simply note that the change in procedure implies re-design of APEX.
First, not all pages of translation are equal. Some texts are a great deal easier to translate than others. For example, an experienced translator, dictating, and working with a type of text with which he is very familiar, has been known to produce up to fifty pages of translation a day. The same translator, working on a different kind of text, may be lucky if he produces one page, especially if he runs into problems of terminology which require considerable research or if the author of the original does not write the most lucid prose.
Then there is the question of what exactly to count: many documents come in successive versions, and only the modifications have to be translated from one version to another. If a three-hundred-page document contains five minor modifications, it would be misleading, to say the least, to claim that three hundred pages were translated. On the other hand, it might also be misleading to count only the volume of the modifications: translating one sentence may involve reading the whole section in which it is embedded, or even more, to obtain sufficient context.
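The two ways of counting can be made concrete with a small sketch. It assumes, purely for illustration, that each version of a document is available as a list of text segments, and it uses Python's standard `difflib` module to find the segments that are new or altered. The function name `count_modified` is hypothetical, and neither figure it returns measures the reading needed to obtain context.

```python
from difflib import SequenceMatcher


def count_modified(old_segments, new_segments):
    """Return (total, changed): the number of segments in the new version,
    and how many of them are new or altered relative to the old version."""
    matcher = SequenceMatcher(a=old_segments, b=new_segments, autojunk=False)
    changed = 0
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "replace" and "insert" cover segments of the new version that
        # have to be translated; deletions add nothing to translate.
        if tag in ("replace", "insert"):
            changed += j2 - j1
    return len(new_segments), changed


old = ["Article 1 stays.", "Article 2 says A.", "Article 3 stays."]
new = ["Article 1 stays.", "Article 2 says B.", "Article 3 stays."]
total, changed = count_modified(old, new)
```

Reporting `total` overstates the work and reporting `changed` understates it, which is exactly the difficulty described above.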
Once again, though, the particular work context of the SdT creates special needs. The products currently on the market are primarily oriented towards a single-user environment, where one translator gradually builds up his own archive of texts. Another factor in a large translation service is that staff sometimes suffer from frequent interruptions in their work. Not all tools tolerate interruptions well: sometimes a user has to finish a job completely or abandon it and re-start from scratch, instead of being able to leave it for a while in mid-task.
Within a very large translation service, creating a translation memory is far from being straightforward. It would be naive to think that any translation done could simply and automatically be used to feed the translation memory. It may be that the new translation is not consistent with previous translations of the same or very similar text elements, and that in fact the previous translations are preferable. It may be that two people in two different departments are simultaneously translating very similar texts; the probability that they will produce the same translation is very low, and adding both to the archive may unnecessarily increase the amount of redundant material retrieved when the memory is consulted. It may be that the new translation is subsequently to be revised, and that it would be inappropriate to archive it before revision. If the reviser is not working on an electronic copy, there may be difficulties in any case in capturing his revisions. Thus, there is a series of questions to do with validation of the translation which have consequences for how the system can be used.
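These validation questions can be sketched as a simple gate in front of the memory. The sketch below is not a description of any existing product; the class, its method and the status strings are assumptions chosen to mirror the cases just listed: unrevised translations are held back, identical translations are not stored twice, and a translation that differs from an existing one is set aside for a human decision rather than added automatically.

```python
from dataclasses import dataclass, field


@dataclass
class TranslationMemory:
    """Toy memory mapping a source segment to one validated target segment."""
    entries: dict = field(default_factory=dict)
    conflicts: list = field(default_factory=list)

    def submit(self, source, target, revised):
        """Try to archive a translation; return a short status string."""
        if not revised:
            return "held"          # not archived before revision
        existing = self.entries.get(source)
        if existing is None:
            self.entries[source] = target
            return "added"
        if existing == target:
            return "duplicate"     # nothing new to store
        # A different translation already exists: keep the earlier one and
        # record the clash so that someone can decide which is preferable.
        self.conflicts.append((source, existing, target))
        return "conflict"


tm = TranslationMemory()
results = [
    tm.submit("Le Conseil", "The Council", revised=False),
    tm.submit("Le Conseil", "The Council", revised=True),
    tm.submit("Le Conseil", "The Council", revised=True),
    tm.submit("Le Conseil", "Council", revised=True),
]
```

Keeping the earlier translation on conflict is only one possible policy; the point is that the choice has to be made explicitly.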
One suggestion was that it would be helpful to make a strong distinction between an archiving system, a working system, and the individual's own previous copies.
There are also, of course, all the storage and retrieval problems associated with a translation service that produces more than a million pages of translation a year. It is worth mentioning that one of the people interviewed pointed out that there is already a memory problem. Documents cannot be kept in electronic archives allowing easy access indefinitely, because there is not enough space. It sometimes happens that a document which has been removed from the live archives to tape archives is needed. It is possible to get it with the help of the computer staff, but that can sometimes take more time than is available.
With terminology, additional factors affect the issue. Several of the people interviewed mentioned that translators were sometimes reluctant to share terminology. This is not only because of a proprietary attitude to the fruit of their own labours. A translator may know that he has been forced to produce a translation in a hurry, and be somewhat uncomfortable about some of the solutions he has adopted, at least feeling that he would have liked more time to mull the problems over or to do research. In these circumstances, he will obviously be unhappy with any system which automatically seizes his solutions to feed a common resource.
Conversely, because any translator is aware of having been sometimes forced by urgency to take short-cuts, he can lack confidence in other people's solutions: they too may, after all, not have had the time to search for the really good solution. He will want at the very least to know the source of the solution.
At the time of writing, a strategy to deal with the practical issues of combining local data with general availability is being discussed. The proposal is that local data should have a structure which is a subset of the EURODICAUTOM structure, so that immediate uploading capability is available, and that individual translators should be encouraged to upload their local files into a central area. Subsequently the local data will be taken from the central area and validated by the terminologists before being included in the generally available terminology. The open question is still, of course, how to encourage translators to upload their material. One way, it is felt, is to ensure that updating of the generally available terminology is done speedily, so that the individual translator can see the results of his collaboration made concrete in terms of improved general resources within the space of a couple of months.
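The proposed flow (local file, upload, validation, general availability) can be sketched as follows. The actual EURODICAUTOM structure is not described in this report, so the field names in `CENTRAL_FIELDS` are invented for illustration, as are the function names; the only point taken from the proposal is that a local entry can be uploaded directly when its fields are a subset of the central ones, and that nothing becomes generally available before validation.

```python
# Hypothetical field names: the real EURODICAUTOM structure is not given here.
CENTRAL_FIELDS = {"term", "language", "subject", "source", "definition"}


def can_upload(local_entry):
    """A local entry is uploadable if its fields are a subset of the central ones."""
    return set(local_entry) <= CENTRAL_FIELDS


def upload(local_entries, staging):
    """Copy uploadable entries to the central staging area, marked unvalidated.
    Entries with fields outside the central structure are returned as rejected."""
    rejected = []
    for entry in local_entries:
        if can_upload(entry):
            staging.append(dict(entry, validated=False))
        else:
            rejected.append(entry)
    return rejected


def validate(staging, approve):
    """Return the entries approved by a terminologist, marked as validated."""
    return [dict(e, validated=True) for e in staging if approve(e)]


local = [
    {"term": "marché public", "language": "fr", "source": "unit A"},
    {"term": "x", "language": "fr", "my_note": "draft"},  # field outside the subset
]
staging = []
rejected = upload(local, staging)
# Stand-in for the terminologist's decision: accept entries that name a source.
available = validate(staging, approve=lambda e: "source" in e)
```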
Two different but parallel structures are foreseen, a "Translation Workshop" ("Atelier de Traduction") in Brussels, and a "Modernisation Network" ("Réseau de Modernisation") in Luxembourg.
The first of these is recruitment and training, where the possibility of organising common entry examinations is being investigated, and where training courses are being jointly organized to ensure that each course can reach the critical mass required.
The second of these is concerned with complementarity in management, the idea being that the translation service of one Institution can help out that of another in time of need, or, for example, during the summer when staff are less available and when work loads are very variable across the Institutions.
The third concerns terminology, documentation and new computer aids. There is considerable activity in these areas.
Any translation service faces a problem of obtaining, validating and storing terminology. A very large translation service dealing with documents in very many different areas and needing to ensure consistency both over large groups of people and over large numbers of documents faces the problem to an even greater degree. We shall return to some of these problems later.
Here, though, we should mention projects to alleviate some of the problems through collaboration with the other European Institutions. The Council of Ministers' terminology base, TIS, can already be consulted, and an update procedure is being implemented.
More ambitiously, a decision to create a single terminological data base for the European Union has been taken; implementing the decision will involve tackling and resolving a number of quite complex organisational issues.
The main issue in documentation is the creation of a single numbering scheme for document identification. (We have already mentioned the complications caused by COM, SEC and C numbers and their failure to correspond.) This too is quite a complex issue, and will have to be resolved at the level of the Secretary General.
There are also on-going inter-institutional discussions on document archiving and translators' workbenches, which are intended to lead to common technical specifications and to joint calls for tenders for translators' workbench tools.
Pilot tests of access to INTERNET have also been carried out and are continuing, although on a fairly limited scale.
A CD-ROM server will be put into prototype service in 1995.
Most tools are distributed freely to those who request them. However, some tools are perhaps best suited to be used by specialists, and there is some feeling that their distribution should be more limited. Examples include some machine-aided translation tools which are not yet totally satisfactory, being perhaps slow and working through an interface which is not very intuitive.
Some tools by their nature require varying levels of access for different people. SUIVI is an obvious example here. Other examples are certain data management tools, such as terminology bases and translation archives, where some users will have read-only access, some will be able to add but not modify entries, and so on.
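Graded access of this kind can be sketched with a small permission table. The role names below are hypothetical and are not taken from any SdT tool; the sketch only shows one common way of expressing read-only, add-only and full rights, using Python's standard `enum.Flag`.

```python
from enum import Flag, auto


class Access(Flag):
    READ = auto()
    ADD = auto()
    MODIFY = auto()


# Hypothetical roles, for illustration only.
ROLES = {
    "reader": Access.READ,
    "contributor": Access.READ | Access.ADD,
    "terminologist": Access.READ | Access.ADD | Access.MODIFY,
}


def allowed(role, action):
    """True if the role's access rights include the requested action.
    Unknown roles get no rights at all."""
    return action in ROLES.get(role, Access(0))
```

For example, `allowed("contributor", Access.ADD)` holds while `allowed("contributor", Access.MODIFY)` does not, which corresponds to the add-but-not-modify case.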
It should also be noted that where a variety of tools for doing roughly the same job are commercially available, for reasons of support and maintenance, a user cannot necessarily have the particular tool he would like, but only the tool which has been centrally approved. As might be expected, people do not always agree with the central decision, and may sometimes buy and install the tool of their choice.