Wolfgang Teubert, Coordinator of TELRI

Language engineering is the core of information technology, and information technology will be the key industry of the next decades. The information superhighways conceived today will transport a variety of data: images, sounds, tables, figures, calculations, and process protocols. To make these data intelligible, they must be bound together by language. Without natural language processing, information remains incomprehensible.

More than any other continent, Europe is multilingual. This situation poses a challenge to European language technology. We all want information to cross borders freely. But countries can only uphold their cultural and linguistic identity if all the relevant information is accessible and available in the national language(s). This is an important principle of the European Union today, and it also holds for all European nations. For the emergent European information society, we have to develop a language technology that takes advantage of the multilingual challenge. It will have to support the production, revision, conversion, presentation, publication, documentation and, last but not least, translation of texts in technical and everyday language, and it will have to provide language-independent information retrieval through sophisticated interaction modes based on natural language. Language engineering in Europe will then play a leading role on the world market.

The quality of all language technology rests on the linguistic knowledge that determines the algorithms of any natural language processing application. This linguistic knowledge is accessible in and through language resources. We find it in scientifically designed text corpora and in lexicons based on existing dictionaries and on corpus analysis. We can extract it from textual and lexical resources and convert it into the form an application needs by means of powerful generic software, both language-specific and language-independent. Language resources are the raw material of all language technology.
The better these resources are, the more expensive they are to create. The language industry, small and medium-sized enterprises in particular, often cannot afford to build them up. On the other hand, in all European countries there are focal language centres with a long tradition in the creation and application of language resources. What we need, then, is a common infrastructure of (public domain) research and (private) industry. We need a common platform where providers and users of language resources come together, share expertise, discuss their needs, develop options and, most importantly, exchange resources.

In some European countries such an infrastructure already exists; in others it is gradually being built up. But most of the work was (and still is) devoted to monolingual applications. There has not been much cross-border cooperation. This is why several efforts have been made in Western Europe to build up a common infrastructure that can serve the needs of multilingual language technology applications. In March 1995, the European Language Resources Association (ELRA) was set up with strong backing by the Commission of the European Community.

But Europe is larger than the European Union. Linguistic expertise, language resources and computational linguistics are highly developed in most Central and Eastern European countries. There, too, we can observe the emergence of a powerful, if still small, language industry. If we want to make Europe a competitor on the world market of language technology, we must build up a common infrastructure for the whole of Europe. The Concerted Action TRANS-EUROPEAN LANGUAGE RESOURCES INFRASTRUCTURE (TELRI), a COPERNICUS project funded by the European Commission, brings together 22 institutions from 17 European countries, with strong links to relevant language centres all over Europe.
These institutions pool their resources, build up multilingual expertise, develop generic tools for multilingual applications and create a strong permanent platform for successful cooperation between research and industry. TELRI was initiated in January 1995. It has already succeeded in setting up several multinational joint ventures with academic and industrial partners, leading to concrete language technology products for today's and tomorrow's market. These projects will be presented in the next issues of this newsletter.

I strongly hope that our TELRI newsletter will make companies, organizations and institutes in research and industry aware of our activities. We need many partners with diverse backgrounds to develop a strong European network. Tell us about your needs. I am sure we will find a way to cooperate.

TELRI will set up a permanent network of leading national language and language technology centres in the whole of Europe. It will pool existing language resources: corpora, machine-readable dictionaries and lexicons, lexical databases, and generic software tools for the creation, re-use, maintenance, validation, and exploitation of linguistic data. It will complement these repositories with newly created multilingual resources, offering a wide range of language data to the NLP community. TELRI will establish a platform where research and industry meet, exchange resources and engage in product-oriented cooperation.

TELRI has a duration of three years (1995-1997). There are 22 participating institutions in 17 European countries (Albania, Germany, Great Britain, Slovakia, Italy, Bulgaria, the Czech Republic, Sweden, Slovenia, Romania, Estonia, France, the Netherlands, Latvia, Lithuania, Poland and Hungary). Links have been established with language centres elsewhere in Europe, with relevant European organisations and ventures, and with focal language institutions in other parts of the world.

TELRI is engaged in the following activities:

* Establishment of an Industrial User Group representing the software industry, publishers and translation services. TELRI partners and users carry out joint projects leading to marketable results.
* Documentation of language resources, generic software, institutions, projects and activities, to be made available on the Internet.
* Validation and quality assessment of taggers, alignment software and homograph disambiguation software.
* Software design for language-independent and language-specific validation of language resources.
* Infrastructure awareness improvement through the TELRI newsletter.
* Joint presentation of service facilities.
* Organisation of a seminar, Language Resources for Language Technology, as a dissemination and cooperation platform for research and industry.
* Creation of a special electronic TELRI network for online accessibility of all language resources among TELRI partners.
* Creation of a multilingual corpus and design of tools for the automatic detection of translation equivalents.

TELRI activities are organised in Working Groups of five to seven members.

