MLCC is a corpus acquisition project under the EAGLES program for International Science and Technology Cooperation, funded by the EC Telematics program and the Swiss Federal Government.
The aim was to collect a set of texts representing a substantial improvement in the range, quantity and quality of available corpus material. Two subcorpora were defined to meet current needs for multilingual data: a comparable set of texts in six languages and a parallel set of data in nine languages.
The comparable text collection includes financial newspaper articles from the early 1990s. The parallel data is taken from the Official Journal of the CEC, sub-series Written Questions to Parliament and the Proceedings of the European Parliament.
The data has been converted to SGML with TEI-conformant mark-up. Negotiations are underway for distribution of the data.
Participants: LTG (Edinburgh) and ISSCO, with coordination by CNR (Pisa).