Information about TELRI partners

INDEX

Wolfgang Teubert

John Sinclair

Eva Hajičová

Elena Paskaleva

Andrejs Spektors

Dan Tufis

Ruta Marcinkeviciene

Julia Pajzs

Alexandra Jarosova

Haldur Õim

Hamdam Arzikulov

J.G. Kruyt

Kemal Oflazer

Martin Gellerstam

Primoz Jakopin

Tamas Varadi

Tomaz Erjavec

Geoff Barnbrook

Frantisek Cermak

Laurent Romary

Vladimir Benko

Dan Cristea

Antonio Zampolli

Michal Jankowski

Alexandre Zubov

Anatole Shaikevich

Iordan Penchev

Bahri Beci



Wolfgang Teubert

1.1 Curriculum Vitae

1946 Born in Eschwege, Germany

1966 - 1971 Heidelberg University. Majors: German Studies/English Studies

1971 1. Staatsexamen (Final exam)

since 1971 Researcher at the Institut für deutsche Sprache, Mannheim

1972/1973 Lecturer at Sarajevo University (on leave from the IDS)

1975 - 1978 Acting Head of the Department of Research Facilities

1979 Ph.D. Heidelberg University

1978 - 1992 Head of the Department of Research Facilities

1993 - 1996 Head of the Department of Language Change

since 1997 Senior Research Fellow, Department of Lexicology

1.2 List of Recent Publications

1992: Sprachwandel als Konzeptgeschichte. Linguistische Anmerkungen zur Abwicklung der DDR. In: NÉMET FILOLÓGIAI TANULMÁNYOK XXI/Arbeiten zur deutschen Philologie XXI. Debrecen (Ungarn). S. 102-114.

1992: Die Deutschen und ihre Identität. In: Burkhardt, Armin/ Fritzsche, Peter K. (Hrsg.): Sprache im Umbruch. Politischer Sprachwandel im Zeichen von "Wende" und "Vereinigung". Berlin. S. 233-252.

1992: Zur Behandlung von Präpositionalattributen im Wörterbuch. In: Cahiers d'Études Germaniques Nr. 23. S. 119-135.

1993: Sprachwandel und das Ende der DDR. In: Reiher, Ruth/Läzer, Rüdiger (Hrsg.): Wer spricht das wahre Deutsch? Berlin. S. 28-52.

1993: Neue Argumente. In: Sprachreport 3. S. 12f.

1994: Das Erhabene als Konzept der Wahrnehmung. In: Busse, Dietrich/Hermanns, Fritz/ Teubert, Wolfgang (Hrsg.): Zeichengeschichte, Begriffsgeschichte, Diskursgeschichte. Opladen. S. 212-258.

1994: User Needs. In: Alvar Ezquerra, Manuel/ Lafon, Pierre/ Sinclair, John M./ van Sterkenburg, Piet/ Teubert, Wolfgang/ Zampolli, Antonio (Hrsg.): Network of European Textual Reference Corpora. Final Report and Proposals. Presented to the Commission of the European Communities. S. 7-39.

1994: (with Busse, Dietrich) Ist Diskurs ein sprachwissenschaftliches Objekt? Überlegungen zu einer linguistischen Diskurssemantik. In: Busse, Dietrich/ Hermanns, Fritz/Teubert, Wolfgang (Hrsg.): Zeichengeschichte, Begriffsgeschichte, Diskursgeschichte. Opladen. S. 10-28.

1995: Vorwort. In: Born, Joachim/Schütte, Wilfried: Eurotexte. Textarbeit in einer Institution der EG. (= Studien zur deutschen Sprache 1/95). S. 11-15.

1996: (with Conway, Steve/ Steward, Fred/Townson, Mike) Environmental Innovation and Corporate Communication: a UK/German Comparative Study. Detailed Report. EC Environmental Research Programme: Research Area III Economic and Social Aspects of the Environment. Birmingham/Dublin/Mannheim.

1996: User Needs. In: Calzolari, Nicoletta/ Baker, Mona/ Kruyt, Johanna G. (Hrsg.): Towards a Network of European Reference Corpora. Pisa. S. 37-56. (Linguistica computazionale. Vol. XI.).

1996: Zum politisch-gesellschaftlichen Diskurs im Postsozialismus. In: Reiher, Ruth u. a. (Hrsg.): Von Buschzulage und Ossinachweis. S. 286-318.

1996: Das Ende des Sozialismus und die Sprache. In: Heinrich-Böll-Stiftung (Hrsg.): Die Sprache-Hort der Freiheit. Sprachwende und Sprachwandel nach 1989. S. 117-135.

1996: Language Resources for Language Technology. In: Tufis, Dan (Hrsg.): Limbaj si Tehnologie. Editura Academiei Române, Bukarest. S. 19-28.

1996: Comparable or Parallel Corpora? In: International Journal of Lexicography, Vol. 9/3, Oxford University Press. S. 38-64.

1996: Why Corpus Linguistics? Editorial. In: International Journal of Corpus Linguistics, vol. 1/1. S. iii-x.

1996: Language Resources: The Foundations of a Pan-European Information Society. In: Rettig, Heike u. a.: TELRI: Trans-European Language Resources Infrastructure. Proceedings of the First European Seminar. "Language Resources for Language Technology", Budapest. S. 105-128.

1996: Deutsch-französische Verständigung. Ein Übersetzungswerkzeug für das 21. Jahrhundert. In: Sprachreport, 2/96. S. 7-11.

1996: The Concept of Work in Europe. In: Musolff, Andreas/Schäffner, Christina/Townson, Michael: Conceiving of Europe: Diversity in Unity. Aldershot. S. 129-145.

1997: Korpus und Neologie. In: Teubert, Wolfgang (Hrsg.): Neologie und Korpus. (Schriften zur deutschen Sprache). (Forthcoming).

1997: Translation and the Corpus. In: Marcinkeviciene, Ruta/Volz, Norbert: Language Applications for a Multilingual Europe. Proceedings of the Second European Seminar: Kaunas. (Forthcoming).

Editorship

since 1996: International Journal of Corpus Linguistics. Amsterdam (John Benjamins).

1997: Teubert, Wolfgang (Hrsg.) Neologie und Korpus. (Schriften zur deutschen Sprache). Tübingen (Narr). (Forthcoming).

1.3 Brief Description of „Institut für deutsche Sprache“

The „Institut für deutsche Sprache“ in Mannheim is a central research institute founded in 1964. It researches and documents the German language in recent and contemporary usage. It is financed by the State and currently has a staff of over 100. The Institut undertakes mainly long-term projects in research groups. Its extensive library, archive, documentation and machine-readable corpora are also available to external researchers.

The Institute also hosts lectures, conferences and colloquia, thus acting as a central point of information and contact for German linguistics for both national and international visitors and indeed for all those interested in language.

The Institut publishes several series, including the following:

Studien zur deutschen Sprache, Narr

Deutsche Sprache: Zeitschrift für Theorie, Praxis, Dokumentation, Erich Schmidt

Sprachreport. Informationen und Meinungen zur deutschen Sprache, Eigenverlag

Phonai - Lautbibliothek der deutschen Sprache, Niemeyer

Deutsch im Kontrast, Groos

Studienbibliographien Sprachwissenschaft, Groos

Schriften des Instituts für deutsche Sprache, de Gruyter

Jahrbücher des Instituts für deutsche Sprache, de Gruyter




John Sinclair

2.1 Curriculum Vitae

John Sinclair is President of The Tuscan Word Centre, and long-time Professor of Modern English Language at the University of Birmingham (now part-time). He is also Adjunct Professor at Jiao Tong University, Shanghai, China, and Honorary Professorial Research Fellow at the University of Glasgow, Scotland. He has researched and taught in most aspects of the English Language, in particular stylistics, grammar, lexis, discourse and corpus linguistics. He is Founding Editor-in-Chief of the Cobuild Project, and has had continuous involvement in research and development projects for many years, financed by the British Government, the European Community (NERC, PAROLE, EAGLES, ET10, MLAP, TELRI) and commercial sponsors. In recent years he has been concentrating on improving research facilities in corpus linguistics.

2.2 List of Recent Publications

Professor Sinclair's publications in 1996-7 were:

"Introduction", in Corpus to Corpus, Special Volume of the International Journal of Lexicography, vol 9 no 3, OUP; pp 171-178

"Multilingual Databases", in Corpus to Corpus, Special Volume of the International Journal of Lexicography, vol 9 no 3, OUP; pp 179 - 196

Lexis and Lexicography, papers edited by Joe Foley, Unipress, Singapore; 181pp

"Implementation Plan" (with A. Zampolli), in Towards a Network of European Reference Corpora, report of an EU project; Giardini, Pisa; pp 1-36

"Text Representation: Spoken Language", in Towards a Network of European Reference Corpora, report of an EU project; Giardini, Pisa; pp 85 - 106

"Access and Management Software Tools" in Towards a Network of European Reference Corpora, report of an EU project; Giardini, Pisa; pp 116-126

"The Empty Lexicon" in The International Journal of Corpus Linguistics, vol 1 no 1; John Benjamins

"The Search for Units of Meaning" in TEXTUS IX, Journal of the AIA; Tilger, Genova; (2nd of two vols as Guest Editor, with L.Merlini)

"Fictional Worlds Revisited" in Le Trasformazioni del Narrare, Schena Editore, Brindisi

"Prospects for Automatic Lexicography"; The Otto Jespersen Memorial Lecture; in A. Zettersten and V. Pedersen (eds) Symposium on Lexicography VII; Proceedings of the Seventh Symposium on Lexicography, May 5-6, 1994 at the University of Copenhagen. Lexicographica Series Maior no 76; Tübingen, Max Niemeyer Verlag

Editing

Corpus to Corpus, Special Volume of the International Journal of Lexicography, vol 9 no 3, OUP (Guest Editor and Co-ordinator of the Council of Europe project reported in the book)

TEXTUS vol IX no 1, Journal of the AIA; Tilger, Genova (2nd of two vols as Guest Editor, with L.Merlini)

2.3 THE TUSCAN WORD CENTRE

The Centre was established in 1996 as a non-profit company in Italy, with the objective of offering expertise in language technology, particularly Corpus Engineering, to scholars, researchers, and staff of small and medium enterprises in Europe. Multilingual corpora and language-independent software are the main research interests. Intensive courses of a week or more duration are among the first events to be offered, and seminar organisation, study visits and short projects will follow. Residential accommodation is available, and the Centre has already received a number of distinguished visitors.





Eva Hajičová

3.1 Curriculum Vitae

Born: 23.8.1935, Prague, Czechoslovakia.

Education:

Graduated from the Faculty of Philosophy, Charles University, Prague in 1958 (English and Czech languages and literatures); received her PhD in general linguistics at the same university in 1967. Defended her CSc thesis ("candidate of sciences") on negation and presupposition in the semantic structure of the sentence in 1976 and her DrSc thesis ("doctor of sciences") on the semantic structure of the sentence in 1987; both academic degrees were received in general linguistics at the Faculty of Philosophy, Charles University, Prague.

Positions:

1958-1962 grammar school teacher

1962-1972 assistant professor, Faculty of Philosophy, Charles University, Prague

1972-1987 research worker at the Faculty of Mathematics and Physics, Charles University, Prague

1987- professor at the Faculty of Mathematics and Physics, Charles University, Prague

Awards and grants:

1970- research fellow at the Department of Applied Linguistics, University of Edinburgh (12/1970-3/1971)

1990- research fellow at NIAS (Netherlands Institute of Advanced Studies), Wassenaar, The Netherlands (1/2-30/6)

1993- visiting professor, University of Duisburg, Germany

1995- awarded the Alexander von Humboldt Research Prize

Present position:

Head of the Institute of Formal and Applied Linguistics, Faculty of Mathematics and Physics, Charles University, Prague

Membership in Scientific Societies:

President of the Association for Computational Linguistics (1996-97)

Prague Linguistic Circle (member of the executive committee)

Linguistic Society of Czechoslovakia

Czech Society for Cybernetics and Informatics (member of the executive committee)

Circle of Modern Philologists, Prague

AILA - International Association of Applied Linguistics (member of the executive committee)

ACL - Association for Computational Linguistics (1982-1987: chairperson of the European Chapter)

ICCL - International Committee for Computational Linguistics (vice-president)

SLE - European Linguistic Society

Membership of editorial boards:

The Prague Bulletin of Mathematical Linguistics (editor-in-chief); Kybernetika; Computers and Artificial Intelligence; Journal of Pragmatics; Applied Artificial Intelligence; Linguistica Pragensia

Since 1982 regular member of programme committees of the biennial international conferences in computational linguistics (COLING), chairperson of this committee for COLING 1988; member of the programme committee of the ECAI conference in 1992 and area chair in 1994. Member of the Int. committee for the evaluation of Swedish linguistics (1990-91). Chairperson of the organizing committee of the COLING conference in 1982 in Prague and of the International Summer School in Computational Linguistics in 1991 in Prague.

Main scientific interests and research activities:

Research interests: the syntactic and semantic structure of the sentence, discourse patterns, linguistic aspects of artificial intelligence. Principal investigator of several large-scale projects in the past; currently the main investigator of a large project on semi-automatic machine translation from English to Czech within the Academic Initiative of IBM and of the project concerning automatic retrieval from full-text documents.

Author or co-author of 6 books, four of which were published in English by foreign publishing houses; author or co-author of several dozen research papers published mostly in English in well-known journals or conference proceedings (see the enclosed list of publications).

Invited lectures abroad:

Invited paper at COLING 76 (Int. Computational Linguistics Conference) in Ottawa, Canada, and at the International Congress of Linguists 1992, Quebec, Canada. Invited paper at the Nobel Symposium on text analysis in Stockholm 1980. During the years 1971 - 1992 invited to lecture at universities in Germany (Hamburg, Saarbruecken, Stuttgart, Bielefeld, Heidelberg, Muenchen, Duisburg, Bochum), France (Paris, Grenoble), The Netherlands (Amsterdam, Groningen, Utrecht, Nijmegen), Denmark (Copenhagen, Odense, Aarhus), Norway (Oslo), Sweden (Stockholm, Gothenburg, Lund, Uppsala), Poland (Warsaw, Krakow), Hungary (Budapest), Russia (Moscow, Novosibirsk), Bulgaria (Sofia), Italy (Pisa, Udine, Venice, Pavia), Switzerland (Geneva), Great Britain (London, Colchester, Cambridge, Lancaster, Edinburgh), Japan (Kyoto, Tokyo), Canada (Montreal), USA (MIT Cambridge, Boston Univ., Univ. of Pennsylvania, Philadelphia, SUNY Buffalo, CUNY New York, Stanford Univ., Indiana Univ. Bloomington, Chicago University, Harvard University). In 1987 invited to lecture at the Summer School in Artificial Intelligence in Mexico City, in 1988 at the tutorial at COLING 88 in Budapest, in 1990 at the Int. Summer School in Artificial Intelligence in Varna, Bulgaria, in 1991 at the Int. Summer School in Computational Linguistics in Penang, Malaysia, and at the Int. Summer School in Computational Linguistics in Prague, Czechoslovakia, in 1992 at the Int. Summer School in Artificial Intelligence in Prague, in 1996 at the European Summer School in Language, Logic and Information, Prague.

3.2 List of recent publications

1992:

A challenge for universal grammar: Valency and "free" order in underlying structure. In: Plenary Session Texts. 15th International Congress of Linguists, Quebec, Université Laval, 1992, 50-61. Also in: Actes du 15e Congrès international des linguistes. Les presses de l'Université Laval, Sainte-Foy 1993, 59-69.

Linguistic aspects of natural language processing. In: Advanced topics in Artificial Intelligence, ed. by V. Marik, O. Stepankova and R. Trappl. Lecture Notes in Artificial Intelligence 617. Berlin: Springer, 477-484.

Focus on focus - Towards a dynamic account of discourse. In: Festschrift fuer V. J. Rozencvejg zum 80. Geburtstag, ed. by T. Reuther. Wiener Slawistischer Almanach, Sonderband 33, Wien: Gesellschaft zur Foerderung slawistischer Studien, 185-188.

Stock of shared knowledge - a tool for solving pronominal anaphora. (With V. Kubon and P. Kubon.) In: COLING 92, ed. by Ch. Boitet. Nantes, Vol. 1:127-133.

Linguistic meaning and semantic interpretation. (With P. Sgall.) In: Current advances in semantic theory, ed. by M. Stamenov. Amsterdam/Philadelphia:Benjamins, 299-310.

1993

Issues of sentence structure and discourse patterns. Prague:Charles University 1993, 194pp.

Identifying topic and focus by an automatic procedure. (With H. Skoumalova and P. Sgall.) In: Proceedings of the Sixth Conference of the European Chapter of the Association for Computational Linguistics, Utrecht 1993, 178-182.

1994

Grammatical data in the lexicon. In: Computational Approaches to the Lexicon, ed. by B.T.S. Atkins and A. Zampolli. Oxford: Oxford University Press, 265-277.

Cognitive prerequisites of anaphoric relations and topic-focus articulation (TFA). In: Writing versus speaking, ed. by S. Cmejrkova, F. Danes and E. Havlova. Tuebingen: G. Narr, 205-211.

Topic, focus and negation. In: Focus and Natural Language Processing, Vol. 2, ed. by Peter Bosch and Rob van der Sandt. Working Papers of the Institute for Logic and Linguistics, IBM Deutschland Informationssysteme GmbH, Scientific Centre, Heidelberg, 323-331.

Machine readable dictionary as a source of grammatical information. (With A. Rosen.) In: Current Issues in Computational Linguistics: In Honour of Don Walker, ed. by A. Zampolli, N. Calzolari and M. Palmer. Linguistica computazionale Vol. 9-10. Pisa: Giardini editori e stampatori, 191-199.

Syntactic and semantic properties of English focalizers (rhematizers). In: 4th conference of English, American and Canadian studies, Brno, Sept. 6-8, 1994: 15-16.

1995

An automatic procedure for topic-focus identification. (With H. Skoumalova and P. Sgall.) In: Computational Linguistics 21:81-94.

1996

Topic-Focus Articulation - A matter of langue or parole? The Case of Negation. In: Theoretical linguistics and grammatical description (ed. Robin Sackmann), Amsterdam: John Benjamins, 1996, 167-175

Focus and Prosody. In: Integration of language and speech, Proceedings (ed. by I.M. Boguslavskij, A.V. Lazourski), Moskva. 1996, 56-68

The information structure of the sentence and the coherence of discourse. In: Burning Issues in Discourse: An Interdisciplinary Account (eds. E. Hovy and D.R. Scott) Heidelberg: Springer Verlag, 1996, 111-126

3.3 Brief profile of The Institute of Formal and Applied Linguistics (UFAL)

The Institute of Formal and Applied Linguistics (UFAL), Faculty of Mathematics and Physics, headed by Prof. Eva Hajicova, is an institute of Charles University that teaches and carries out research in computational linguistics, with attention to both theoretical and applied aspects. The Institute has collaborated or is collaborating on several national projects (the build-up and processing of computerized corpora, taggers and parsers for Czech, machine translation between English and Czech) as well as international ones (a Czech-US project on quantifiers and the semantics of word order, with B.H. Partee from UMass, Amherst as a co-investigator, an EC joint research project PECO on grammar checkers of free word order languages, Copernicus joint research projects MULTEXT-East and CEGLEX, and several EC funded concerted actions in the field of language resources, computational lexicology and speech-language integrated research). At the same time, theoretical research in general linguistics, the syntax-semantics interface, discourse structure and Czech and English syntax is going on, with results acknowledged in the linguistic and computational linguistics communities, especially those concerning dependency syntax, topic-focus articulation and discourse structure. A post-graduate programme is supervised by the professors in the Institute, open to both Czech students and students from abroad. Two-week intensive courses in linguistics and semiotics called Vilem Mathesius Centre Courses (named after the founder of the Prague School of Linguistics, which began in the 1920s and continues today) have been organized twice a year since 1992, with ten to twelve leading specialists from abroad teaching in each cycle.

Address: Malostranske n. 25, 118 00 Praha 1, Czech Republic

E-mail: hajicova@ufal.mff.cuni.cz

Tel.: 420-2-2191 4252

Fax: 420-2-2191 4309





Elena Paskaleva

4.1 Curriculum Vitae

PERSONAL RECORD:

Full name: Elena Paskaleva Paskaleva

Date of birth: 21.12.1941

Nationality: Bulgarian

EDUCATION:

1960 - 1965 - M. Sc. in Slavonic Languages, University of Sofia, Faculty of Slavonic Languages

1973 - 1978 - Ph.D. in Linguistics (Machine Translation), Moscow State University, Department of Structural and Applied Linguistics

PROFESSIONAL RECORD:

1965 - 1973 - Philologist at the Group of machine translation

1978 - 1987 - Research Associate III, II and I degree at the Institute of Mathematics

1987 - 1991 - Senior Researcher at the Linguistic Modeling Laboratory

1992 - 1994 - Vice-Director of the Linguistic Modeling Laboratory, Bulgarian Academy of Sciences

1991 - present - Assoc. Prof. of the Linguistic Modeling Laboratory

1994 - present - Director of the Linguistic Modeling Laboratory

TEACHING:

1987 - 1990 - Lecturer, University of Sofia, Faculty of Classical and Western Languages. Courses taught: "Introduction to Computer Science for Philologists".

1984 - 1990 - Lecturer, Faculty of Slavonic Languages. Courses taught: "Computational Linguistics".

OTHER ACTIVITIES:

1991 - present - Bulgarian Translators' Union, member of the Managing Board, leader of the Terminology Commission; member of the Executive Board of the Foundation "Translation"

1991 - 1995 - Sofia Municipality Council: councilor

LANGUAGES:

French

English

Russian

Czech

4.2 List of Recent Publications

1.PAVLOV, R., ANGELOVA, G. and PASKALEVA, E. On Experimental Linguistic Processors for Man-Computer Dialogue in Bulgarian. Proceedings of the International Conference on Artificial Intelligence: Methodology, Systems, Applications (AIMSA'84), Varna, Bulgaria, North-Holland 1985.

2.PASKALEVA, E. and B. RUDIGER. Wortformenanalyse des Bulgarischen und Deutschen ohne Verwendung eines Lexemwörterbuchs. Mitteilungen zur automatischen Sprachverarbeitung, Berlin, September, 1985, pp.1-19.

3.PASKALEVA, E. Computer and Translation. Sofia, "Science and Art", 1988, 190 pages (in Bulgarian).

4.AVGUSTINOVA, T., K. OLIVA and E. PASKALEVA. An HPSG-based Parser for Bulgarian. International Seminar on Machine Translation "Computer and Translation 89", Publ. House of All-Union Transl. Centre, Moscow, 1989, 10-22.

5.SIMOV, K., ANGELOVA, G. and PASKALEVA, E. MORPHO-ASSISTANT: The Proper Treatment of Morphological Knowledge. Proceedings of the 13th Int. Conference COLING'90, Volume 3, pp.455-457.

6.SIMOV, K., PASKALEVA, E., DAMOVA, M. and M. SLAVCHEVA. MORPHO-ASSISTANT - A Knowledge Based System for Bulgarian Morphology. Demo description at the Third Conference on Applied Natural Language Processing, Trento, Italy, April 1992.

7.PASKALEVA, E. Extracting and Accumulating Linguistic Knowledge by SUPERLINGUA. - Project I-5 (NSRF). Technical report, Sofia 1993.

8.PASKALEVA, E, K.SIMOV, M.DAMOVA, M.SLAVCHEVA. The Long Journey from the Core to the Real Size of a large LDB. - In: "Acquisition of Lexical Knowledge from Text". Proc. of a Workshop sponsored by SIGL of ACL, Columbus, Ohio, 1993, 161-169.

9.PASKALEVA, E. Intelligent Acquisition and Processing of Language Knowledge from Real Large Text Corpora in Desk-Top Publishing Systems. - International Conference "Methodology of Mathematical Modeling", Varna, June, 1994.

10.JOHN NERBONNE, ELENA PASKALEVA, JAN PROVOOST, INNEKE SCHUURMANN. The Word in the Computational Linguistics, TEMPUS JEP 1941/92, 170 pp. (working report).

11. PASKALEVA, E., B.ZAHARIEVA. Tagging a Highly Inflected Language. In: Proceedings of the First European Seminar of TELRI "Language Resources for Language Technology", Tihany, Hungary, eds. Heike Rettig, Julia Pajzs and Gabor Kiss, 1995, pp. 179-191

12.E.PASKALEVA, European Language Resources and the Treasury of the Computerised Russian Language Fund. In: Proceedings of the First European Seminar of TELRI " Language Resources for Language Technology", Tihany, Hungary, eds. Heike Rettig, Julia Pajzs and Gabor Kiss, 1995, pp.

13.E.PASKALEVA, M. DOBREVA, New Tools for Old Language (Computer processing of Bulgarian texts). Proceedings of First International conference "Computer processing of Medieval Slavic Manuscripts", July 1995, Blagoevgrad

14.ANGELOVA, G., PASKALEVA, E. and DJERASSI, E. TermKIT: an Environment for Knowledge Based Research of Terminology. In Proc. of the 2nd Joint Conference of Knowledge Based Software Engineering (JCKBSE-96), Sozopol, Bulgaria, September 1996, pp. 154 - 162.

15.E.PASKALEVA, S.MICHOV, Second Language Acquisition from Aligned Corpora. In: Proceedings of the International Conference „Language Technology and Language Teaching“, Groningen, April 1997

16.E.PASKALEVA, Bulgarian Language Resources and Tools in Joint European Initiatives, Second European Seminar "Language Applications for a Multilingual Europe", April, 1997, Kaunas, Lithuania

17.JOHN NERBONNE, LAURI KARTTUNEN, ELENA PASKALEVA, GABOR PROCZEKY, TIJT ROOSMAA, Reading more in Foreign Languages, Fifth Applied Natural Language Processing Conference, April 1997, Washington, ACL.

4.3 Linguistic Modelling Laboratory

The Laboratory was set up in July 1987 within the Central Laboratory for Parallel Information Processing of the Bulgarian Academy of Sciences, bringing together researchers from different areas: computer scientists, linguists and logicians. The main objective of the Central Laboratory is to develop high-performance computer systems and algorithms for parallel processing, distributed computer systems, computer networks, intelligent man-machine interfaces, etc.

Natural Language Processing is regarded as a vital part of this enterprise.

Main research topics of LML:

- theoretical: natural language modelling and processing, formalisms for knowledge representation;

- applied: creation and support of unique (by volume and exhaustiveness) computer products concerning Bulgarian and multilingual resources.

2. International projects:

2.1. COPERNICUS'94 1202 TELRI (Trans-European Language Resources Infrastructure).

2.2. COPERNICUS'94 “ELSNET GOES EAST”

2.3. COPERNICUS'94 790 BILEDITA (Bilingual Electronic Dictionaries and Intelligent Text Alignment)

2.4. COPERNICUS'94 343 GLOSSER.

2.5. DocGen: Generation of Documentation using Object-Oriented Specifications (Bulgaria, Portugal)

2.6. DBR-MAT: an Intelligent MAT System for Structurally Different Languages (Bulgaria, Romania, Germany), sponsored by the Volkswagen Foundation

Outstanding achievements: Bulgarian Grammatical Dictionary; Bulgarian Large LDB; parallel corpora (French-Bulgarian, English-Bulgarian, French-English); systems for automatic markup, alignment and extraction of translation equivalents from parallel corpora; platform-independent high-speed analyzer (applied to Bulgarian and Russian) for semi-manual tagging and dictionary building; DB system for corpora administration.

Computing and communication facilities: PC Pentium machines, Power Mac, Internet facilities.

Operating systems used in the department: DOS, WINDOWS, NEXT and Macintosh.

National and international professional associations: Bulgarian Translators' Union, ACL, FOLLI.





Andrejs Spektors

5.1 CURRICULUM VITAE

Date of birth: October 6, 1943

Address: Palsmane Institute of Mathematics and Computer Science, University of Latvia, 29, Raina Blvd., Riga LV 1459, LATVIA

Education

University                               Degree     Discipline    Year received

University of Latvia                     M.Sc.      Mathematics   1974

P.N.Lebedev Physics Institute, Moscow    Cand.Sc.   Astrophysics  1981

University of Latvia                     Dr.phys.   Physics       1992

M.Sc. Supervisor: Dr. B.Martuzans, University of Latvia, Institute of Mathematics and Computer Science. Thesis: Heat transfer problem in high temperature plasma.

Cand.Sc.(Dr.phys.) Supervisor: Prof. S.I.Syrovatskii, P.N.Lebedev Physics Institute, Department of Theoretical Physics. Thesis: Numerical simulation of hydro-dynamical processes in Solar flare plasmas.

REPORT ON THE RESEARCH ACTIVITIES

1968 - 1974 Radioastrophysical Observatory of the Latvian Academy of Sciences, radio telescope construction; cosmology and general relativity.

1974 - 1979 Computer modelling of Solar flares

1979 - 1983 Head of Department of Solar Physics of the Radioastrophysical Observatory.

1983 - 1985 Chief of Laboratory for Computer Network Host Systems in the Institute of Electronics and Computer Technique of the Academy of Sciences of Latvia.

1985 - 1987 Head of Department of Programming Environment for Communication Systems at the Research Institute of the Industrial Corporation VEF.

1987 - present: Head of the Artificial Intelligence Laboratory, Institute of Mathematics and Computer Science of the University of Latvia.

A. Spektors has authored 45 articles.

He is a member of the Association of Latvian Scientists.

5.2 Some publications of A. Spektors:

A. Spektors, The Latvian Language and Computers. In: "Vispasaules latviesu zinatnu kongress, Riga. 12. - 17. 7. 1991." Riga, 1991 (in Latvian)

R. Chevere, I. Greitane, A. Spektors, The Problems of Word Classification in the Formation of Thematic Wordstock of Automatized Terminological Dictionaries. In: International Terminological Conference "Terminology Science and Terminology Planning" in commemoration of E. Drezen (1892-1992), TermNet, Vienna 1994, pp. 195-200

A.Spektors, M.Baltina, Project "Analysis, Computer Aided Processing and Development of Data Base for Latvian Historical Texts" - In: "Language and Technology in Europe 2000. Seminar, 10 - 11 November, 1994, Riga".

Andrejs Spektors, The Latvian Language Corpus. In: "Baltistica VII International Congress of Baltistics", Riga 1995, pp. 105-106.

Andrejs Spektors, Electronic Dictionaries. In: “Leksikografijas teorija un prakse”, Riga, Latviesu valodas instituts, 1977, pp. 36 - 40. (in Latvian)

Andrejs Spektors, The Latvian Language in the Internet and Resources of Computer Linguistics. In: “Latviesu valoda - esamība, vide, konteksti”, Riga, PBLA, 1997, pp. 46 - 53. (in Latvian)

J. Borzovs, G. Fricnovics, A. Spektors, The Latvian Language and IT&T. Baltic IT Review, 1997, Nr. 5, pp. 24 - 28

5.3 Profile of the Artificial Intelligence Laboratory, Institute of Mathematics and Computer Science of the University of Latvia.

The Institute of Mathematics and Computer Science was established in 1959 as a research-oriented Computing Centre. About 130 people are employed at the Institute, including 7 habilitated doctors, 43 doctors and 25 holders of an M.Sc. degree. Three are Members of the Academy of Sciences of Latvia. In 1996 the value of the work performed exceeded 380 thousand lats (about 540 thousand ECU). There are three departments at the Institute: the Department of Computer Science, the Department of Mathematics and the Department of Information Technologies. In Computer Science the main research directions are: inductive synthesis of programs and algorithmic learning theory; computational complexity of algorithms; automatic test generation and program verification and analysis; specification languages for telecommunication and information systems; teletraffic theory; simulation of discrete event systems; computer linguistics; machine translation and computer technology in education. The work in this field combines theoretical research with the implementation of practical projects.

For years the University of Latvia has been one of the world's leading institutions in the field of inductive synthesis. Although these studies were carried out mainly in the context of theoretical computer science, many of the ideas can also be used in the synthesis of natural language knowledge.

Since 1988 the Laboratory of Artificial Intelligence at the Institute of Mathematics and Computer Science of the University of Latvia has been concerned with natural language processing, starting with problems such as the storage and statistical analysis of historical and contemporary Latvian texts. Since 1991 the Laboratory has also been working on the generation of phrases used in offices. Some experiments in office document translation were carried out.

Studies of Latvian grammar were initiated in the AI Laboratory. A program for case generation of Latvian nouns and adjectives was written; for this purpose Latvian dictionaries were studied to find anomalies. A program for case generation of verbs was also written. Latvian verbs have three main forms: the infinitive, the present and the past. Latvian grammar uses all three verb forms for case generation. Attempts were made to develop a system of rules for deriving the infinitive and present forms from the past form. The infinitive can be derived from the past form, but deriving the present form yields about 40% anomalies. Practical systems were also developed, such as spelling checkers and programs for dividing words into their main parts (syllables, prefix, root, etc.).

During 1993 work was carried out on the project "Development of Methods and Software Tools for Computer Aided Knowledge Acquisition for the Latvian Language Machine Foundation (Knowledge Base) Creation" (supported by the Soros Foundation - Latvia), which revealed drawbacks of traditional knowledge acquisition methods. In 1994-1995 work was under way on the project "Automatic Synthesis of the Latvian Language Linguistic Knowledge", which mainly studied the possibilities of synthesising knowledge about Latvian syntax from morphological knowledge. Several methods were investigated, for example the use of a Latvian morphemic language model.

In 1995 a joint project with the University of Stockholm (Prof. B. Kangere), "An Automated Morphemic Analysis of Latvian Language Texts", was initiated with the aim of developing rules for the automated recognition of morphemes in Latvian word forms (without help from dictionaries). This project will exploit the inherent regularities of Latvian word formation principles.

Since 1993 an experimental MT system model has been under construction. This project is supported by a grant from the Latvian Council of Science. It will include both English-Latvian and Latvian-English translation directions. The model is being developed as an interlingua system and draws on ideas from the MT system SWETRA (Lund, Sweden).

In 1995 - 1997 a joint research project 058 "ONOMASTICA-COPERNICUS. Multi-Language Pronunciation Dictionary of Names in Central and Eastern European Countries" was carried out and a pronunciation lexicon for Latvian was created.

The Artificial Intelligence Laboratory participates in COPERNICUS Concerted Action 1202 "Trans-European Linguistic Resources Infrastructure".





Dan Tufis

6.1 Curriculum Vitae

BORN: 5 February 1954, Bucharest, Romania; married, 1 son

EDUCATION:

1992 - Ph.D. in Computer Science, Polytechnic Institute of Bucharest, Romania

1991 - Postgraduate studies in Computational Linguistics, Linguistic Institute, University of California at Santa Cruz, USA

1979 - M.S. in Computer Science (including undergraduate studies), Department of Computer Science, Polytechnic Institute of Bucharest, Romania

PRESENT POSITION

- Director of the "Center for Advanced Research on Machine Learning, Natural Language Processing and Conceptual Modeling" of the Romanian Academy, and Senior Researcher (grade I) and PhD Advisor at the Computer Science Department of the "Politehnica" University of Bucharest

- Head of the Laboratory for Computational Linguistics and Senior Researcher (grade I) at the Research Institute for Informatics

RESEARCH INTERESTS

Corpus Linguistics, Intelligent Computer Aided Language Learning, Machine Language Learning, Natural Language Processing, INTERNET technology. He is the main author of the first NL systems for Romanian (SDLR, IURES, PC-IURES, PARADIGM) and also the main author of TC-LISP. He coordinated the Romanian part of several international projects (with NPO/TPS, Russia; George Mason University, USA; ISSCO-Geneva, Switzerland; and IDSIA-Lugano, Switzerland). He is currently involved in three European projects: ELSNET Goes EAST, TELRI and MULTEXT-EAST. He is the director of the INTERNET education program of the Romanian Academy, which is funded by the Andrew Mellon Foundation in New York and George Mason University in Washington; more than 400 students have already completed the course.

OTHER SCIENTIFIC INVOLVEMENTS

- Member of the Advisory Board of the European Chapter of the Association for Computational Linguistics

- Member of the Editorial Board of the book series "Text and Language Technology", Kluwer Academic Publishers, USA

- Commissioning Editor of the International Journal of Corpus Linguistics, John Benjamins Publishers, UK

- Member of the Editorial Board of the International Journal on Functional Electronics, Romanian Academy, Romania

- Member of the Editorial Board of the International Journal on Information and Control, ICI, Romania

- UNESCO Expert on Artificial Intelligence, Functional Programming and Computational Linguistics

- Organiser of the main conferences and summer schools on language technology in Romania (EUROLAN'93, PAIL'94, EUROLAN'95, Awareness Days in Language Technology'96, EUROLAN'97)

SCIENTIFIC AWARDS

"Traian Vuia" prize, awarded by the Romanian Academy for technical sciences, 1989

"Tudor Tanasescu" prize, awarded by the Romanian Academy for information sciences, 1996

PUBLICATIONS

Dan Tufis has edited three books, authored three other books and published more than 80 scientific papers in journals, proceedings and collective volumes.

6.2 SELECTED PUBLISHED BOOKS, CONTRIBUTIONS AND ARTICLES

1. Mason, D. Tufis "Probabilistic Tagging in a Multilingual Environment: Making an English Tagger Understand Romanian", in John Sinclair (ed.) Proceedings of the Third International TELRI Seminar, Montecatini, October, 1997

2. D. Tufis "INTERNET-Lab: Exposing Romanian Academics from the Humanities to the Internet Technology", Proceedings of the ELSNET International Seminar on Internet Training, Rhodes, September 1997

3. D. Tufis, P. Andersen (eds.) Recent Advances in Romanian Language Technology, Editura Academiei, 1997, also available at http://www.racai.ro

4. D. Tufis, S. Bruda "Structure Markup in CES and Preliminary Statistics on Romanian Translation of Plato's 'Republica'", in TELRI News, nr. 5, May 1997.

5. T. Erjavec, N. Ide, D. Tufis "Encoding and Parallel Alignment of Linguistic Corpora in Six Central and Eastern European Languages", in Michael Levison (ed.) Proceedings of the Joint ACH/ALLC Conference, Queen's University, Kingston, Ontario, June 1997 (also available on the web: http://www.qucis.queensu.ca/achallc97)

6. D. Tufis "A Generalised Environment for Unification Based Natural Language Processing", in W. Teubert, R. Marcinkeviciene (eds.) Proceedings of the Second TELRI European Seminar, Kaunas, April 1997

7. D. Tufis, A.M. Barbu "A Reversible and Reusable Morpho-Lexical Description of Romanian", in D. Tufis, P. Andersen (eds.) Recent Advances in Romanian Language Technology, Editura Academiei, 1997.

8. D. Tufis, A.M. Barbu, V. Petrascu, G. Rotariu, C. Popescu "Corpora and Corpus-Based Morpho-Lexical Processing", in D. Tufis, P. Andersen (eds.) Recent Advances in Romanian Language Technology, Editura Academiei, 1997.

9. Gh. Tecuci, D. Tufis, St. Trausan-Matu, L. Negreanu, D. Trifenescu, S. Celinoiu, C. Niculescu, D. Marcu, M. Ciocoiu, S. Bruda, Introduction to INTERNET, Editura Academiei, Bucuresti, 1997 - in Romanian (new edition, extended and updated)

10. D. Tufis (ed.) Limbaj si Tehnologie, Editura Academiei, Bucuresti, 1996

11. D.Tufis. "CALL: The potential of Lingware and the Use of Empirical Linguistic Data" in B. Magaard (ed) Proceedings of COLING'96 (panel discussion) Copenhagen, 1996

12. D. Tufis, L. Diaconu, A.M. Barbu, C. Diaconu. "Morfologia limbii romane, o resursa lingvistica reversibila si reutilizabila", in D. Tufis (ed.) Limbaj si Tehnologie, 1996

13. D. Tufis, L. Diaconu, C. Diaconu, A.M. Barbu. "Dictionar morfo-lexical destinat traducerii automate", in D. Tufis (ed.) Limbaj si Tehnologie, 1996

14. D. Tufis. "Lexical Gaps and Weighted Abduction", in C. Unger and A. Letia (eds.), Intelligent Computer Communication, Casa Cartii de Stiinta, Cluj-Napoca, 1995

15. D.Tufis. "Overcoming Lexical Gaps", in G. Paun (ed.) Mathematical Linguistics and Related Topics, Romanian Academy Publishing House, Bucharest, 1995

16. D.Tufis, O.Popescu. "A Minimal Commitment Algorithm For Finding The Shortest Path in Multiple Inheritance Semantic Networks", in Fl.Gh.Filip (ed.) International Journal on Information and Control, vol.1, Bucharest, 1994

17. H.Hamburger, D.Tufis. "Situation Viewpoints For Generation", in D.Mc Donald (ed.) Proceedings of the 7th International Generation Workshop, Kennebunkport, Maine 1994

18. D.Tufis, H.Hamburger, R.Hashim, J.Pan. "Generation of Natural Language in an Immersive Language Learning System", in M.Browse (ed.) Basics of Man-Machine Communication for the Design of Educational Systems, Springer Verlag, 1994

19. H.Hamburger, D.Tufis, R.Hashim. "Structuring Two Medium Dialog For Learning Language And Other Things", in Ed Hovy (ed.) Proceedings of the ACL Workshop on Intentionality and Structure in Discourse Relations, Columbus, Ohio, 1993

20. D.Tufis, O.Popescu. "A Fast Algorithm For The Shortest Path Problem in Multiple Inheritance Semantic Networks", in Florin Gh.Filip (ed.) International Journal on Information and Control, vol.2, Bucharest, 1992

6.3 Brief profile of RACAI

Center For Machine Learning, Natural Language Processing & Conceptual Modeling, Romanian Academy Casa Stiintei, Room 240, Calea 13 Septembrie, no.13, sector 5, Bucharest, fax:+401 210 03913, telephone +401 631 1902

The Center was established in 1994 to conduct basic research in artificial intelligence and to promote international scientific research. It has a core of permanent staff (12 persons, 10 of them experienced researchers), affiliates (9, of whom 5 are reputed scientists from abroad) and a variable number of temporary (contract-based) collaborators, mainly MSc or PhD students.

The main research directions of the Center are: natural language processing, multistrategy learning and agent technologies, distributed intelligent processing and conceptual modelling. Some of the research problems addressed are:

Development of computational models of language, including formalizing linguistic and extra-linguistic knowledge involved in language comprehension and production; besides development of large linguistic resources (dictionaries, corpora, grammars) special emphasis is placed on communication modelling in man-machine and machine-machine dialogues. Development of robust natural language processing tools based on corpus evidence and probabilistic modelling.

Development of a general method for multistrategy task-adaptive learning based on plausible reasoning, integrating a wide range of basic learning strategies, depending on the features of the system's learning task; automated knowledge acquisition through multistrategy learning; development of a general framework for knowledge base refinement through multistrategy learning, active experimentation and guided knowledge elicitation; development of adaptive knowledge-based agents through apprenticeship and multistrategy learning.

New architectures for computer network services. The envisaged topics are:

- human-computer interface in the context of network services (formal models; some semiotic aspects);

- study of some general issues in open distributed systems: transparency requirements; objects as a modelling concept; problems of addressing, naming and routeing;

- a parallel study of the distributed processing models introduced by CORBA, DCE, RPC and Java;

- evaluation of the new solutions offered by Java (with an emphasis on Java RMI) for implementing client-server applications (possibilities to simplify application-level protocols; security and access control aspects; study of RMI marshalling techniques in conjunction with other existing ones; persistence of Java objects using RMI object serialization).

Development of general methods for intelligent tutoring that integrate AI technology with fundamental pedagogical principles; modelling the subject matter, the student and the tutorial strategy are key topics in intelligent tutoring research, as they are expected to allow the development of flexible and personalized tutorial systems able to optimize the instructional process.

Development of structural-phenomenological (SP) models, conceptual and symbolic. The program, based on previous work by Prof. Draganescu, aims to obtain conceptual SP-models for various forms of reality that cannot be completely explained by S (structural) models. This may be the case for living organisms, mind, consciousness and existence in general. In these models the notion of information plays a fundamental role. A further step will be to pass from conceptual SP-models to symbolic SP-models.

Collaborative Research

Participation in the contract between GMD FOKUS - Research Institute for Open Communication Systems (Berlin, Germany) and the Research Institute for Informatics (Bucharest): Accounting Component of an ATM Service Management System (in the framework of the MILAN (Management of Interconnected Local ATM Networks) project; MILAN joined different international activities in broadband network and service management, including some Eurescom and ACTS projects).

MAC-ELU: A unification-based framework for machine translation: this project was done in cooperation with ISSCO-Geneva, Switzerland.

MAC-PAILab: An integrated environment for teaching AI in universities: this project was done in cooperation with IDSIA-Lugano, Switzerland.

FLUENT2: A foreign language tutoring system: this project was done in cooperation with the Intelligent Tutoring Systems Lab of George Mason University, Virginia, USA, and the Language Technology Lab of MIT, USA.

ILPNET, European Concerted Action (PECO #0044)

http://www.ai.ijs.si/ilpnet.html

Ongoing international projects:

* MULTEXT-EAST, European Joint Research Project (COP #106)

http://www.lpl.univ-aix.fr/projects/multext-east/ (1995-1997)

* ELSNET Goes EAST, European Concerted Action (COP #200)

http://www.cogsci.ed.ac.uk/elsnet/home.html (1995-1997)

* TELRI, European Concerted Action (COP #1202)

http://www.ids-mannheim.de/telri/telri.html (1995-1997)

* A NATO research grant at LIMSI/CNRS Orsay, France, on conceptual modelling of natural language communication (1996-1998)

* A research grant was awarded by the U.S. National Research Council to our project called "Using Multistrategy Learning as a Framework for Building Knowledge-Based Systems" (Twinning Program with Bulgaria and Romania, 1995 - 1997), for which we have developed an intelligent learning system based on the MTL-JT methodology.

* CONCEDE, European Joint Research Project on encoding electronic dictionaries (1997-1999)

* INTERNET Education Program, a joint program with George Mason University in Fairfax, sponsored by the Andrew Mellon Foundation in New York (1996-1998)

Besides its research programme, the Center aims to become an active disseminator of know-how by organising national and international conferences, workshops, summer schools and seminars. The most significant dissemination activities include:

* PAIL On-Line International Summer School on AI Education, Bucharest, 1-5 August, 1994

* EUROLAN'95 International Summer School on "Language and Perception: Representations and Processes", Iasi, 18-27 July 1995 (co-organiser with the Al. I. Cuza University of Iasi)

* Awareness Days on Language & Technology, Bucharest, 30-31 January 1996 (European Commission funded seminar, within the "Awareness Campaign" Programme of the European Union)

* EUROLAN'97 International Summer School on "Corpus Linguistics: Lexicons, Language Engineering, Discourse Processing", Tusnad, 18-27 July 1997 (co-organiser with the Al. I. Cuza University of Iasi; jointly funded by the European Commission, the TELRI European Project and the Romanian Ministry of Research)

As one of the 8 national backbones, our Center ensures Internet communication for all 23 institutes of the Romanian Academy. In January 1996 our Center started an introductory course on using INTERNET-accessing programs. The course, although intended primarily for scientists from the academic community, is open (free of charge) to any interested person.





Ruta Marcinkeviciene

7.1 Curriculum Vitae

1. Family name: Marcinkeviciene

2. First name: Ruta

3. Date of birth: April 15, 1958

4. Nationality: Lithuanian

5. Civil status: Married

6. Education:

Institution: Vilnius University

Date: 1976-1981

Degree: University Degree in Germanic Philology

Institution: Vilnius University

Date: 1990

Degree: Doctor's Degree in Baltic Linguistics

7. Language skills (mark 1 to 5 for competence):

Language                    Reading  Speaking  Writing
Lithuanian (mother tongue)  5        5         5
English                     5        5         5
Russian                     5        5         4
Polish                      4        3         2
German                      4        3         2
Latvian                     3        3         2

8. Membership of professional bodies:

member of editorial board of International Journal of Corpus Linguistics

9. Other skills:

computer literacy, driving licence

10. Current position:

Head of the Center of Computational Linguistics at Vytautas Magnus University, Faculty of Humanities

Docent at the Department of Lithuanian Linguistics, Vytautas Magnus University

7.2 List of most recent publications:

Seeming Tautologies in Lithuanian / E-Prime Anthology III // Ed. by D. David Bourland, Jr., Paul Dennithorne Johnson, California, Concord, 1997.

Phonology, Morphology, Syntax, and Squirrely Semantic. A review of the book: D. David Bourland, Jr., and Elizabeth J. Bourland. A Course in Advanced Squirrelly Semantics / Et cetera, 1997, Number 2, p. 247-250.

Klausimas dėl klausimo arba ką gali kompiuterinis tekstynas. Darbai ir dienos, 1997, Nr. 5, p. 19-37.

Tekstynų lingvistika ir lietuvių kalbos tekstynas. Lituanistika, 1997, Nr. 1, p. 58-78.

Tarp Scilės ir Charibdės. Darbai ir dienos, 1996, Nr. 2, p. 67-79.

Kalba informacijos ir integracijos amžiuje. Lietuvos mokslas, 1996, t. IV, kn. 8, p. 57-60.

Kolokacija: tyrimo kryptys, metodai, mokyklos. Lituanistika, 1995, Nr. 2, p. 40-54.

Karo metafora. Darbai ir dienos, 1995, Nr. 1/10, p. 121-124.

Address, phone, fax, e-mail:

Ms. Ruta Marcinkeviciene

Stulginskio 15, Kaunas

Lithuania

Private phone +370 7 268395

Office phone +370 7 224515

Office fax +370 7 203858

E-mail ruta.marcinkeviciene@vdu.lt

7.3 Brief profile of Vytautas Magnus University

The origins of Vytautas Magnus University go back to January 1920, when the School of Higher Studies was established in Kaunas. In 1922 it was reorganized into the University of Lithuania, and in 1930 the university was renamed Vytautas Magnus University. The University was closed during the Nazi occupation, and again in 1950 the Soviets denied its existence. During Soviet times studies were reorganized to form Polytechnical and Medical Institutes, and Humanities and Social Studies were transferred to the University of Vilnius. With the rebirth of Lithuanian independence in 1989, through the efforts of Lithuanian scholars at home and abroad, Vytautas Magnus University was re-established. It is organized according to the principles of autonomy and academic freedom, with a structure similar to that of North American and Western European universities. The University accords special attention to the Humanities and to the Social Sciences.

In the seven Faculties there are 2,000 undergraduate students, 530 students in the Master's programs and 150 in Doctoral programs. Among the 270 teaching staff there are 69 professors and 108 docents (associate professors). Each year there are about 30 visiting professors and lecturers from universities in the United States and Western Europe. Vytautas Magnus University has close ties and reciprocal exchanges with institutions of higher education in Austria, Canada, Great Britain, Italy, the United States, Poland, Norway, Finland, Sweden and Germany. These relationships develop through visiting professors from universities in these countries working at VMU, Lithuanian students participating in study programs abroad, and joint research projects. The foreign universities support VMU financially with equipment and educational materials. There are also good relationships with universities in the United States such as the University of Illinois, Loyola, Rutgers, Fordham, Crayton, New England College and California State University, Northridge.

Vytautas Magnus University is located in Lithuania's second largest city, Kaunas, which for many years was the temporary capital. Kaunas lies at the confluence of the rivers Neris and Nemunas. Its central location places it at the very crossroads of trading routes, and Kaunas is rapidly developing into a business center.

The University occupies four buildings, which include classrooms, laboratories and auditoriums for performances and larger assemblies, as well as housing facilities for staff and students. The Central Administration Building of Vytautas Magnus University is at S.Daukanto 28, near Vienybes aikste (Unity Square), where the symbols of Lithuanian statehood stand: the graceful Liberty monument and the Tomb of the Unknown Soldier with the eternal flame. The Faculties of Fine Arts, Social Studies and Humanities are located at Donelaicio 52. The Environmental Studies and Informatics Faculties are at Vileikos 8. The Faculty of Business and Administration is in the Central Building at S.Daukanto 28. The Faculty of Theology is at Laisves aleja 53. Residences for professors and students from abroad are at Muitines 7, and student dormitories are located at Taikos pr. 119. The facilities and buildings are shown in the following illustration.

The University offers undergraduate and graduate studies. During the first two years of undergraduate studies students take general humanistic courses as well as basic and introductory courses in their chosen field of specialization. At present the study of the English language and of Informatics (computers) is especially emphasized. In the third and fourth years two thirds of the program's time is given to courses in the field of specialization.

In graduate studies VMU attempts to focus on fields of study which were non-existent or badly neglected during Soviet times. These areas are: Ethnological Studies, Integrated Arts, Theology, Psychology, Sociology, Social Work, Business Administration, Municipal Management, Training in Banking and Environmental Studies. There are two levels of graduate studies: a two-year Master's program, followed by a three- to four-year period of study leading to a Doctoral Degree. Many doctoral programs are organized in cooperation with other research institutes of Lithuania that have good facilities and libraries. For the effective organization of studies, 6 personal computer classrooms, special laboratories, other facilities and reading rooms are available.





Julia Pajzs

8.1 Curriculum Vitae

Date of birth: 9 January 1955, Budapest

Secondary School: Leovey Klara, Budapest (1969-1973)

University:

- Janus Pannonius University, Pecs (1973-1977): Hungarian linguistics and literature, history, BA (1977)

- Eotvos Lorand University, Budapest (1977-1983): mathematics, MA (1983)

Doctorate: Computational lexicography, Eotvos Lorand University, Budapest, 1989

PhD: Computational linguistics, Janus Pannonius University, Pecs, Hungary (1997)

Affiliations:

- Computer Center for State Administration (1978-1985)

- Research Institute for Linguistics, Hungarian Academy of Sciences, Department of Lexicography and Lexicology (1985-)

Research Projects:

- Programming and system plans for the Frequency Dictionary of Hungarian (1980-1981)

- Programs for the morphological synthesis and analysis of Hungarian (1981, 1987, 1990)

- Computer expert of the project for the Historical Dictionary of Hungarian (1985-)

- Project for creating a new French-Hungarian, Hungarian-French dictionary (1991-)

- Coordination and research in the GRAMLEX, MULTEXT-EAST and TELRI Copernicus projects (1995-)

- Computerization of the Hungarian Concise Dictionary (1995-)

- Project for setting up the National Corpus of Hungarian and making it available on the internet for research (1996)

Others:

- Organization of the COMPLEX conferences on Computational Lexicography and Text Research (1990, 1992, 1994, 1996)

- Organization of the first TELRI seminar on "Language Resources and Language Technology" (1995)

8.2 Publications

Pajzs J.: Szamitogep es lexikografia (Doctoral thesis) (in Hungarian) 'Computational lexicography' Research Institute for Linguistics of the HAS, Budapest 1990. p. 83.

Pajzs J.: Felmillio szo szamitogepen. (in Hungarian) 'Half a million words on-line' Computerworld - Szamitastechnika vol. III. no. 5. 9 March 1988. p. 24-25.

Pajzs J.: Dictionary digitalisiert. Oxford Englisch per Knopfdruck. Computerwelt Österreich Nr. 9. 13. 5. 1988. p. 9.

Pajzs J.: Francia szamitogepes nagyszotar. (in Hungarian) 'The Tresor de la langue francaise' Computerworld - Szamitastechnika vol. IV. no. 27. 1 July 1989. p. 26.

Pajzs J.: Magyar szamitogepes nagyszotar. (in Hungarian) 'The Historical Dictionary of Hungarian' Computerworld - Szamitastechnika vol. IV. no. 28. 8. July 1989. p. 14.

Kiss L. - Pajzs J.: A magyar irodalmi es koznyelv nagyszotara (1533-1990) (in Hungarian) 'The Historical Dictionary of Hungarian' Magyar Nyelv vol. 1989. no. 2. p. 129-136.

Pajzs J.: Szamitogepes szotarak (in Hungarian) 'Computerized Dictionaries' Nyelvtudomanyi Kozlemenyek vol. 89. no. 1-2. Budapest. 1987. p. 67-97.

Pajzs J.: Creating a Historical Dictionary of Hungarian with the Aid of Computer. In: T. Magay - J. Zigany (eds): BUDALEX '88 Proceedings. Akademiai Kiado, Budapest 1990. p. 559-563.

Pajzs J.: Realisation assistee par ordinateur de grands dictionnaires francais et hongrois. Cahiers d'etudes hongroises 3/91. Centre Interuniversitaire d'Etudes Hongroises, Universite Paris III, Institut Hongrois de Paris. p. 47-54.

Pajzs, J.: The Use of a Lemmatized Corpus for Compiling the Dictionary of Hungarian In: Using Corpora Proceedings of the 7th Annual Conference of the OUP & Centre for the New OED and Text Research. University of Waterloo Centre for the New OED, 1991. pp. 129-136.

Pajzs J.: A Debreceni Tezaurusz egy felhasznalasi lehetosegerol: a magyar nyelv szamitogepes alapszokincstara. (in Hungarian) 'A possible use of the "Thesaurus of Debrecen", the core vocabulary of Hungarian' Klaudy K. (ed): Konyv Papp Ferencnek 1992.

Pajzs J.: Magyar nyelvu szovegek vizsgalata szamitogeppel (in Hungarian) 'Computational methods for examination of Hungarian texts' Szekely G. (ed): Elso magyar alkalmazott nyelveszeti konferencia Nyiregyhaza II. 1992. pp. 717-722.

Pajzs J.: Le role de l'ordinateur dans la redaction du nouveau dictionnaire hongrois-francais/francais-hongrois. Cahiers d'etudes hongroises 4/92. Centre Interuniversitaire d'Etudes Hongroises, Universite Paris III, Institut Hongrois de Paris. pp. 119-125.

Pajzs, J. - Tihanyi, L. - Villo, I.: Writing Dictionaries with Grammar Defined Databases. In: (KIEFER F. - KISS G. - PAJZS J:, 1992) pp. 259-274.

Kiefer F. - Kiss G. - Pajzs J: (eds) Papers on Computational Lexicography and Text Research Proceedings of COMPLEX '92. Budapest: Research Institute for Linguistics of the HAS, 1992. p. 357.

Pajzs J: Project Report on the Historical Dictionary of Hungarian. in: (KIEFER F. - KISS G. - PAJZS J:, 1994) pp. 205-214.

Kiefer F. - Kiss G. - Pajzs J: (eds) Papers on Computational Lexicography and Text Research Proceedings of COMPLEX '94. Budapest: Research Institute for Linguistics of the HAS 1994. pp. 260.

Pajzs J.: A szamitogepes nagyszotari korpusz felhasznalasanak lehetosegei (in Hungarian) 'On the possible use of the corpus of Hungarian' Magyar Nyelv 1994. 3. pp. 287-302.

Pajzs J.: Szamitogepes szotarak mint adatbazisok. (in Hungarian) 'Computerized dictionaries as databases' NyK vol 93. no 1-2. Budapest, 1992-1993. pp. 161-177.

Kiefer F. - Kiss G. - Pajzs J: (eds) Papers on Computational Lexicography and Text Research Proceedings of COMPLEX '96. Budapest: Research Institute for Linguistics of the HAS 1996.

8.3 Brief profile of organization

RESEARCH INSTITUTE FOR LINGUISTICS

HUNGARIAN ACADEMY OF SCIENCES

Budapest, I., Szinhaz utca 5-9., Hungary

Postal address: Budapest, P.O.Box 19., 1250

Phone: (36-1) 175-8011/276 ext., (36-1) 175-8285

Fax: (36-1) 212-2050

E-mail: kiefer@nytud.hu banreti@nytud.hu

Founded in 1949, RIL is the only non-university-affiliated research center in the field of linguistics in Hungary. Its activities focus on the synchronic and diachronic description of Hungarian as well as on research in both theoretical and applied linguistics.

Director: Ferenc Kiefer (morphology, semantics, pragmatics)

Vice-director: Zoltan Banreti (syntax, patholinguistics)

Senior research fellows (full time):

Eva B. Lorinczy (dialectology), Marianne Bakro-Nagy (Finno-Ugric linguistics), Mihaly Brody (syntax), Sandor Csucs (Finno-Ugric linguistics), Ildiko Ecsedy (languages in China), Katalin Kiss (syntax), Judit Feher (Oriental studies), Karoly Gerstner (historical linguistics, lexicography), Maria T. Gosy (phonetics, child language), Erzsebet H. Nagy (linguistic norm), Lea Haader (historical linguistics), Gyorgy Hazai (Turkish), Laszlo Horvath (historical linguistics), Alexandr Jarovinszkij (child language), Laszlo Kalman (semantics, computational linguistics), Ilona Kassai (spoken language, child language), Gabor Kemeny (stylistics), Andras Komlosy (syntax, morphology), Miklos Kontra (spoken language, sociolinguistics), Gabor Olaszy (phonetics), Ferenc Papp (Russian linguistics, computational linguistics), Ildiko Posgay (dialectology), Zita Reger (child language, Romani), Robert Simon (Oriental studies), Tamas Szende (phonology), Laszlo Szits (linguistic norm), Ferenc Tokei (Chinese), Tamas Varadi (computational linguistics), Balazs Wacha (historical linguistics, Hungarian descriptive syntax).

Research fellows (full time): between 40 and 50. Part-time researchers: between 40 and 50.

Departments:

1. Oriental studies. Head: Ferenc Tokei. 2. Phonetics. Head: Maria T. Gosy. 3. Applied linguistics. Head: Zita Reger. 4. Theoretical linguistics. Head: Andras Komlosy. 5. Finno-Ugric languages. Head: Marianne Bakro-Nagy. 6. Dialectology. Head: Eva B. Lorinczy. 7. Spoken language. Head: Miklos Kontra. 8. Historical linguistics. Head: Lea Haader. 9. Linguistic norm. Head: Laszlo Szits. 10. Lexicography and Lexicology. Head: Karoly Gerstner. 11. Corpus Linguistics. Head: Tamas Varadi.

Major projects:

(a) Historical Grammar of Hungarian (Lea Haader)

The historical period described in the first stage of the project was Old Hungarian (up to the thirties of the 16th century). This period was split up into Early Old Hungarian to which Proto-Hungarian was appended (up to the middle of the 14th century) and Late Old Hungarian. The outcome was a three-volume grammar (morphology and syntax). The second stage comprises the development of Hungarian from the early 16th century until the end of the 18th century. This part of the project started in 1995.

(b) The Comprehensive Dictionary of the Hungarian Language (Karoly Gerstner)

The dictionary (containing 200,000-250,000 entries) will be based on a corpus of approximately 40 million running words taken from various periods. Two thirds of the texts are selected from 20th century writings (journals, novels, scientific and religious literature, etc.). The rest comes mainly from 19th century works and to some extent also from 18th century writings. A multi-functional lexical database is under construction. The text file is tagged by a morphological analyzer designed for this particular task, which can also be used as a spellchecker. The OPEN TEXT (PAT) software is used for retrieval.

(c) Survey of spoken Hungarian (Miklos Kontra)

Tape-recorded interviews have been conducted with 200 informants who constitute a random stratified sample of the population of Budapest. Transcription and computerization of the data is in progress. A pen-and-paper survey has also been conducted with a nationally representative sample of the adult Hungarian population. The data of the national survey are in computer-readable form. Several papers have been published on some grammatical (phonological, morphological as well as syntactic) peculiarities of spoken language.

(d) Speech synthesis and speech perception (Gabor Olaszy)

A multilingual real-time text-to-speech synthesis system has been developed for several languages (Hungarian, German, Dutch, Spanish, Portuguese, Esperanto). Refinement of the system is in preparation.

(e) The structural grammar of Hungarian (Ferenc Kiefer)

The project aims at a theoretical description of Hungarian. The first volume (published in 1992) was dedicated to syntax; the theoretical frameworks used were generative syntax and lexical functional grammar. The second volume (published in 1994) described the phonology of Hungarian, using mainly post-structural methodology. The third volume, which will provide a detailed description of Hungarian morphology, is in preparation (the manuscript will be completed by the end of 1996). There will also be a fourth volume devoted to the lexicon, i.e. to lexical representations.

(f) Child language (Zita Reger)

The project is constructing a Hungarian database for child language according to CHILDES (Child Language Data Exchange System). The database is used for an inquiry into the effects of adult-child interaction on the development of grammar, the lexicon and communicative competence (socialization) in the child. Further research topics include (i) the linguistic socialization of Gypsy children in traditional communities, and (ii) child bilingualism.

(g) Aphasia and patholinguistics (Zoltan Banreti)

Patholinguistic phenomena are investigated by means of various tests in order to answer some fundamental questions about the organization of grammar (e.g. the modular or non-modular structure of grammar, and the interrelationships between syntax, morphology and the lexicon) as well as about the relationship between grammar and the human language processor.

(h) The Dictionary of Hungarian Dialect Vocabulary (Eva B. Lorinczy)

The project started in the early 1950s and is expected to be finished by the end of 1997. The dictionary contains attested dialect words from all regions, including Hungarian-speaking regions outside Hungary. Thus far three volumes have been published: I (A-D), 1979, 1053 pages; II (E-J), 1988, 1175 pages; III (K-M), 1992, 1341 pages. Volume IV is in preparation.

(i) Uralic Etymological Archives (UEA) (Maria Sipos)

Based on the material of the Uralisches Etymologisches Wörterbuch, UEA is a computer-readable database which contains more than 3000 etymologies of the Uralic language family. The construction of a multi-functional retrieval system is in progress.

(j) Speech perception and comprehension (Gabor Olaszy)

Various models of lexical access in perception that treat the 'word' as a phonetic unit are being examined. Research is also being carried out to clarify serial processes in speech perception.

(k) Sociolinguistic study of the Hungarian language in neighboring countries (Miklos Kontra)

A sociolinguistic study is being carried out among the Hungarian minorities in Slovakia, Ukraine, Romania, Yugoslavia (Serbia), Slovenia and Austria. Special attention is being paid to bilingualism.

(l) Theoretical linguistics (individual projects)

The Hungarian language is used to clarify various theoretical issues in phonology, morphology, syntax and semantics. The theoretical frameworks used include autosegmental phonology, government and charm phonology, natural morphology, generative morphology, government and binding theory, lexical-functional grammar, and discourse representation theory.

Teaching

In 1990 RIL launched a joint program with Eotvos Lorand University, Budapest (ELTE), to teach theoretical linguistics. The program comprises a 4-year undergraduate course and (since 1994) a 3-year graduate course.





No.9

ALEXANDRA JAROSOVA

9.1 Curriculum Vitae

Country: Slovakia

Date of birth: 22 February 1952, Prague.

Education: 1970-1975 Faculty of Philology, University of St. Petersburg; 1975-1984 Russian teacher, interpreter, and research student at the L. Stur Linguistics Institute; 1985 PhD thesis.

Present position: senior research fellow, head of the Section of Linguistic Data Processing, member of the Slovak Corpus team at the L. Stur Linguistics Institute.

Research subject: lexicology, monolingual and bilingual lexicography, corpora.

Relevant visits abroad: Charles University, Prague - International Summer School in Computational Linguistics (July 1991); Westfaelische Universitaet, Muenster - Symposium on Corpora of European Languages (May 1992); EURALEX'94 (Amsterdam, August 1994); EURALEX'96 (Gothenburg, August 1996); TELRI (COPERNICUS) Seminars and Workshops (Tihany 1995, Ljubljana 1997, Kaunas 1997, Birmingham 1997).

9.2 List of recent publications (1992 - 1997)

1. Velky slovensko-rusky slovnik IV-VI (Great Slovak-Russian Dictionary, Vol. IV - VI), Editor-in-chief: Sekaninova, E., Veda, vydavatelstvo SAV, Bratislava, 1990-1995 (co-author)

2. Spajatelnost slov a jej odraz v slovniku (Collocability of words and its presentation in the dictionary). Jazykovedny casopis, 43, 1992, pp. 116-126

3. Korpus textov slovenskeho jazyka (The Slovak Text Corpus). Slovenska rec, 58, 1993, pp. 89-95

4. Dolnik, J. - Benkovicova, J. - Jarosova, A. : Porovnavaci opis lexikalnej zasoby (Contrastive Description of Vocabulary). Bratislava, Veda 1993.

5. Frazeologiceskaja periferija i jejo predstavlenije v tolkovom slovare (Phraseological "Periphery" and its Presentation in the Explanatory Dictionary). In: Kroslakova, E. and Durco, P. (Eds.): Proceedings, Second International Conference on Phraseology, Phraseology in Education, Science and Culture. Vysoka skola pedagogicka v Nitre, Fakulta humanitnych vied, Katedra slovenskeho jazyka 1993, pp. 187-194

6. Vychodiska dvojjazycnej lexikografie: Konfrontacna semaziologia a onomaziologia. (Theoretical Basis of the Bilingual Lexicography: Contrastive Semasiology and Onomasiology). Jazykovedny casopis, 45, 1994, pp. 19-30

7. Monokolokabilne slova v slovencine (Nonce-words in Slovak). Jazykovedny casopis, 46, 1995, pp. 83-99

8. Tvorba dvojjazycnych terminologickych slovnikov s pocitacovou podporou (Computer-aided Compilation of Bilingual Terminological Dictionaries). Kultura slova, 31, 1997, pp. 201-210

9. Comparative Analysis of English and Slovak Translation of Plato's Republic. In: TELRI Newsletter, April 1997, pp. 15-19

10. Lexicography and Computers: The Slovak Variant (in press). Proceedings of the International Conference, Slovak Language at the End of the XX century.

11. Parallel Corpora and Equivalents not Found in Bilingual Dictionaries: An Attempt at Their Generalization (in press). In N. Volz (ed.) Language Applications for a Multilingual Europe. Proceedings of the Second European TELRI Seminar.

9.3 Brief Profile of Proposer's organisation

The scholarly activity of the Ludovit Stur Linguistics Institute is concentrated on basic research into the Slovak language. The following projects are being carried out at present:

1. Slovak in Contact (and possibly in Conflict) with Other Languages: interference among Slovak speakers in the contact regions, their verbal behavior, national self-identification, auto- and heterotypization.

2. Language/Speech Culture in Theory and Practice: the functioning of the literary language in various forms of public communication, the relation between the literary language and other forms of the national language, general and particular problems of terminology.

3. The System of Contemporary Slovak: functional stratification of the Slovak language, the analysis of grammar/lexicon relations.

4. Slovak Corpus and Lexical Database.

5. Dictionary of the Contemporary Slovak Language.

6. Historical Dictionary of the Slovak Language.

7. Dictionary of the Slovak Dialects.

8. Elaboration of the Atlas of the Slavonic Languages in International Cooperation.





Haldur Õim

10.1 Curriculum Vitae

Born 22.01.1942 in Estonia

1965 graduated from the University of Tartu

1970 the degree of Candidate of Sciences in linguistics

1983 the degree of Doctor of Sciences in linguistics

1994 member of the Estonian Academy of Sciences

1969 - 1977 senior researcher at the University of Tartu, Language Processing Research Unit

1977 - 1983 senior lecturer at the University of Tartu, Department of Estonian

1983 - 1985 Associate Professor at the University of Tartu, Department of Estonian

1985 - 1991 Professor of Estonian at the University of Tartu, Department of Estonian

1991 - 1992 Guest Professor at the University of Koblenz-Landau, Institute of Computational Linguistics

1992 - present Professor of General Linguistics at the University of Tartu, Department of General Linguistics

Main areas of research:

1) Theoretical problems of Linguistics

2) Computational Linguistics and automated language processing

About 80 scientific publications, including two books.

10.2 Selected list of recent publications of Haldur Oim

1. Language understanding and problem solving: on the relation between computational linguistics and artificial intelligence. - Computational Linguistics. An International Handbook. Ed. by I. Batory, W. Lenders and W. Putschke. Walter de Gruyter, Berlin - New York 1989, 277-283.

2. An approach to the modelling of natural reasoning. - AIMSA-90. Varna, Bulgaria. Proceedings. Ed. by P.Jorrand and S.Sgurev 1990, 182-189.

3. A Formal Model of Communicative Strategy. - Proceedings of the Scandinavian Conference on Artificial Intelligence -'93. Stockholm 1993, 226-231.

4. The Need for a Theory of Folk Theories in Cognitive Semantics. - Papers in Theoretical and Computational Linguistics. Ed. by H. Oim. Tartu 1996, 193-210.

5. The University of Tartu Corpus of Written Estonian: A Survey of the Structure and Principles of Selection of Texts. - Papers in Theoretical and Computational Linguistics. Ed. by H. Oim. Tartu 1996, 7-32.

6. Linguistic Semantics and Naive Theories: A View of Language Competence. - Proceedings of the 16th International Congress of Linguists. Paris, July 1997.

10.3 University of Tartu

At the University of Tartu, the research unit which will carry out the Estonian part of the Concerted Action is the Research Group of Computational Linguistics (RGCL) at the Department of General Linguistics.

RGCL is the central research unit in Estonia dealing systematically with natural language processing in theory and in practice. Since its founding in the 1980s, the main results of its activities have been the following.

1) It has provided a morphological analyzer of Estonian, ESTMORF.

2) On its basis, such basic processing tools as a speller, a lemmatizer and a hyphenator for Estonian have been developed (these are also included in the Microsoft Office 97 package sold in Estonia).

3) RGCL has provided a computerized corpus of Estonian in accord with classical principles, tagged according to TEI requirements, and is currently collecting and tagging texts in electronic format (ca 3 million words per year).

4) RGCL has participated in the following COPERNICUS projects:

GLOSSER, coordinated by University of Groningen;

MULTEXT-EAST, coordinated by Institute of German Language, Mannheim;

CONCEDE, coordinated by University of Brighton.

The first three projects deal mainly with problems of creating and aligning multi-lingual corpora; the central theme of CONCEDE is building lexicons on the basis of parallel corpora. EuroWordNet-2, coordinated by the University of Amsterdam, should also start in December 1997.

5) Since 1996 the central task of RGCL has been building lexicons of various kinds on the basis of computer corpora. In 1997 this work has been enlarged considerably within the framework of the Estonian National Program of Language Technology, which started this year. Beyond research on lexicons it includes projects on morphological analysis and synthesis, syntactic analysis, and text retrieval tools.





Hamdam ARZIKULOV

11.1 CURRICULUM VITAE

NAME: Hamdam ARZIKULOV

DATE of birth: 1945, October 10

PLACE of birth: Samarkand, Uzbekistan

ADDRESS: 107, Amir Timur St., 705008 Samarkand, Uzbekistan

Phone: +3662.22.16.80

Fax: +3662.35.60.76

PLACE of work: Samarkand State Institute of Foreign Languages, Head of Language Engineering Centre

Phone: +3662.35.69.70

Fax: +3662.31.11.87

E-mail: hamdam@sam.silk.org

GRADE: Professor, Doctor

RESEARCH FIELDS: Language Engineering; Turkic and Iranian Languages; General Linguistics

11.2 LIST OF RECENT PUBLICATIONS

1. Arzikulov H. Meeting to Discuss Integration of Language Engineering Research in Central Asia // Journal of Quantitative Linguistics. -1996. -Vol. 3. -N 1. -Pp. 89-90.

2. Arzikulov H. Language Engineering in Central Asia. Speech system and its computer modelling. St.-Petersburg University Ed., 1996 (in Russian).

3. Arzikulov H. NLP for Turkic Languages: TURKLINGTON. Samarkand University Ed., 1997 (in Russian).

11.3 BRIEF PROFILE OF OUR ORGANISATION

Samarkand State Institute of Foreign Languages, where I now work, is one of the main linguistic universities of Uzbekistan. The Institute has a computer park and a Language Engineering Centre. In this Centre about fifteen researchers are developing a polyfunctional Linguistic Automaton for Turkic languages. Prof. Toshbolot SADYKOV from Bishkek (Kyrgyzstan), Dr. Makset AYIMBETOV from Nukus (Karakalpakstan), Olmos BELBOTAEV from Turkestan (Kazakhstan) and many other researchers from the Central Asian republics collaborate with it.

The Samarkand Language Engineering Centre has organised two international seminars (1992, 1996) on Language Engineering and Computer-Aided Language Learning and has published some papers.





Dr. J.G. Kruyt

12.1 Curriculum Vitae

1982 - 1990: editor of the historical dictionary "Woordenboek der Nederlandsche Taal" (WNT)

From 1990: project manager of the electronic version of the WNT (EWNT)

From 1992: additionally project manager Language Database (the Language Database "Taalbank" provides retrieval on large Dutch, linguistically enriched text corpora).

From 1994: project manager Language Database; project manager Dutch spelling guide; head of EDP department (shared position).

From 1996: head of Language Database department, including EWNT.

12.2 List of recent publications

Kruyt, J.G. (1996), Language Databases for Dutch. In: The Low Countries 1996-97, Year Book of the Flemish-Netherlands Foundation 'Stichting Ons Erfdeel', 279-280.

Kruyt, J.G. & P.G.J. van Sterkenburg (1996), A new Dutch Spelling Guide. In: H. Rettig (ed.), Language Resources for Language Technology, Proceedings of the first TELRI European Seminar in Tihany, 133-141.

Kruyt, J.G. & P.G.J. van Sterkenburg (1996), Corpus Design criteria. In: N. Calzolari, M. Baker & J.G. Kruyt (eds.), Towards a Network of European Reference Corpora: Report of the NERC Consortium Feasibility Study. Linguistica Computazionale vol. XI, Giardini, Pisa, 57-72.

Calzolari, N., M. Baker & J.G. Kruyt (eds.) (1996), Towards a Network of European Reference Corpora: Report of the NERC Consortium Feasibility Study. Linguistica Computazionale vol. XI, Giardini, Pisa.

Kruyt, J.G. & Dutilh, M.W.F. (1997), A 38 Million Words Dutch Text Corpus and its Users, in: Lexikos 7 (AFRILEX series 7:1997), pp. 1-16.

Kruyt, J.G. (1997), Electronische woordenboeken en tekstcorpora voor Europese taaltechnologie, in: Trefwoord, jaarboek 1997-1998, in press, SDU, The Hague.

12.3 Brief profile of INL

Founded in 1967 and subsidized by the Dutch and Belgian governments, the Institute for Dutch Lexicology (INL) is an organization doing contract research and providing other forms of professional service (including education) within the fields of Dutch lexicology, lexicography and corpus linguistics. One of the tasks of INL is the completion of the historical Dictionary of the Dutch Language, the "Woordenboek der Nederlandsche Taal" (WNT), which is the largest dictionary in the world. A language database comprising large text corpora of present-day Dutch, enriched with linguistic information, is being developed by the Language Database department.





Kemal Oflazer

13.1 Curriculum Vitae

CURRENT POSITION

Associate Professor, Department of Computer Engineering and Information Science Bilkent University 06533 Bilkent Ankara, Turkey

Phone: +90-312-266 4000 ext. 1258

Fax: +90-312-266 4126

E-mail: ko@cs.bilkent.edu.tr

URL: http://www.cs.bilkent.edu.tr/~ko/

EDUCATION

Ph.D. Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA. B.Sc. and M.Sc., Middle East Technical University, Ankara, Turkey.

RECENT PROFESSIONAL DUTIES

Member of the Editorial Board, Machine Translation Journal 1996 --

Program Committee Member for the ACL/EACL'97 Conference, Madrid, Spain, July 7 -- 11, 1997.

Program Committee Member, ACL/SIGDAT Second Conference on Empirical Methods in Natural Language Processing (EMNLP-2), held at Brown University, Providence, RI, USA, Aug 1-2, 1997.

Program Committee Co-chair and Organizing Committee Chair, „International Conference on New Methods in Language Processing“, Bilkent University, Ankara, Turkey, September 11-13, 1996. (Organized jointly with Centre for Computational Linguistics, UMIST, Manchester, UK)

Program Committee Member for the Morphology Program Subcommittee for International Conference on Computational Linguistics, COLING'96 Copenhagen, Denmark, August, 1996.

RESEARCH EFFORTS

Kemal Oflazer is currently directing a NATO-funded Science for Stability Project (USD 600,000, 1994 - 1998) on developing natural language resources and applications for Turkish. More information about the project can be found at http://www.nlp.cs.bilkent.edu.tr.

Prof. Oflazer is currently interested in advanced finite state processing and morphological disambiguation, mainly for Turkish but with potential applications in other highly inflected languages.

13.2 List of Recent Publications for Kemal Oflazer

1) Dilek Zeynep Hakkani, Gökhan Tür, Teruko Mitamura, Eric H. Nyberg 3rd, and Kemal Oflazer, Issues in Generating Turkish from Interlingua, Technical Report CMU-LTI-97-152, Carnegie Mellon University, Center for Machine Translation, Pittsburgh, PA, USA.

2) Dilek Zeynep Hakkani and Kemal Oflazer, Tactical Generation in a Free Constituent Order Language, to appear in Journal of Natural Language Engineering, Cambridge University Press, 1997.

3) Kemal Oflazer, Morphological Analysis, chapter in Syntactic Wordclass Tagging, Hans van Halteren, Editor, Kluwer Academic Publishers, 1998.

4) Gökhan Tür, Kemal Oflazer and Nihat Özkan, Tagging English by Path Voting Constraints, Bilkent University, Computer Engineering and Information Science Technical Report BU-CEIS-9704, March 1996.

5) Kemal Oflazer and Gökhan Tür, Morphological Disambiguation by Voting Constraints, in Proceedings of ACL'97/EACL'97, The 35th Annual Meeting of the Association for Computational Linguistics, July 7-12, 1997, Madrid, Spain.

6) Kemal Oflazer and Okan Yilmaz, A Constraint-based Case-frame Lexicon, in Proceedings of COLING'96, Copenhagen, Denmark, August 1996.

7) Kemal Oflazer, Error-tolerant Tree Matching, in Proceedings of COLING'96, Copenhagen, Denmark, August 1996.

8) Kemal Oflazer and Gökhan Tür, Combining Hand-crafted Rules and Unsupervised Learning in Constraint-based Morphological Disambiguation, in Proceedings of the ACL-SIGDAT Conference on Empirical Methods in Natural Language Processing, May 1996, Philadelphia, PA, USA.

9) Dilek Zeynep Hakkani, Kemal Oflazer, and Ilyas Cicekli, Tactical Generation in a Free Constituent Order Language, in Proceedings of the 8th International Workshop on Natural Language Generation, Sussex, UK, June 1996.

10) Kemal Oflazer, Error-tolerant Finite State Recognition with Applications to Morphological Analysis and Spelling Correction, Computational Linguistics, Vol. 22, No. 1, 1996.

11) Zelal Güngördü and Kemal Oflazer, Parsing Turkish using the Lexical-Functional Grammar Formalism, Machine Translation, Vol. 10, No. 4, 1995.

12) Kemal Oflazer, Error-tolerant Finite State Recognition, in Proceedings of the 4th International Workshop on Parsing Technologies, IWPT'95, Prague, Czech Republic, September 1995.

13) Kemal Oflazer and I. Kuruöz, Tagging and Morphological Disambiguation of Turkish Text, in Proceedings of the 4th ACL Conference on Applied Natural Language Processing, Stuttgart, Germany, Oct. 1994.

14) Kemal Oflazer and C. Güzey, Spelling Correction in Agglutinative Languages, in Proceedings of the 4th ACL Conference on Applied Natural Language Processing, Stuttgart, Germany, Oct. 1994.

15) Zelal Güngördü and Kemal Oflazer, Parsing Turkish using the Lexical-Functional Grammar Formalism, in Proceedings of COLING'94, The 15th Conference on Computational Linguistics, Kyoto, Japan, August 1994.

16) Kemal Oflazer, Two-level Description of Turkish Morphology, Literary and Linguistic Computing, Vol. 9, No. 2, 1994.

17) H.A. Güvenir and Kemal Oflazer, Using a Corpus to Teach Turkish Morphology, in Proceedings of the Seventh Twente Workshop on Language Technology, Enschede, The Netherlands, June 1994.

18) Aysin Solak and Kemal Oflazer, Design and Implementation of a Spelling Checker for Turkish, Literary and Linguistic Computing, Vol. 8, No. 3, 1993, Oxford University Press.

13.3 Site info

Bilkent University is the premier research university in Turkey, with highly qualified faculty and research teams in the Faculty of Engineering. The Department of Computer Engineering and Information Science has been very active in natural language processing in recent years, boosted by a 5-year $600,000 NATO Science for Stability Project grant to develop Turkish NLP resources and applications. The Department of Electrical Engineering is a partner in many EU/NATO funded projects and is very active in speech processing. The Faculty of Engineering attracts students from the top few hundred of the million or so high-school graduates who take university entrance exams every year.





Martin Gellerstam

14.1 Curriculum vitae

Martin Gellerstam, born 1936, doctor of Scandinavian languages, Associate Professor of Scandinavian Languages, director of the Bank of Swedish, editor of the Swedish Academy Glossary, editor of Svenska ord (a dictionary for immigrants), Chairman of the Scandinavian Association of Lexicography.

14.2 Some recent publications:

Translations as a source for cross-linguistic studies. (In: Aijmer & Altenberg (eds.) Languages in Contrast (=Lund Studies in English 88)). Lund University Press, 1996.

Anföringens estetik. Om dialogformer i tvärspråkligt perspektiv ("The aesthetics of direct speech. Dialogue formulae in a cross-linguistic perspective") (In: Olle Josephson (ed.) Stilstudier). Uppsala 1996.

Brolexikon ("Bridge Dictionaries"). (In: Nordiske studier i leksikografi 3). Reykjavik 1995.

Översättaren och ordboken ("The Translator and the Dictionary"). (In: Nordiske studier i leksikografi 4). To be published.

14.3 Profile of the Department of Swedish

The scientific activities of the department can be divided into the research branches of Nordic Languages, Modern Swedish, Lexicology and Natural Language Processing. The Bank of Swedish is a research supporting national service institution with the task of building up a digital reference bank which includes all kinds of linguistic information. The department is engaged in international projects like Parole and Aventinus.





Primoz Jakopin

15.1 Curriculum Vitae

Born June 30, 1949 in Ljubljana, Slovenia; 1972 a degree in mathematics at the University of Ljubljana; 1981 a master's degree in information sciences at the University of Zagreb. He is currently preparing a doctoral thesis: Upper Bound for the Entropy of Literary Texts in Slovenian.

1974-1979 assistant for computer science at the Medical faculty, 1979-1986 in charge of library information system at the University computing centre, 1986-1993 independent software developer, from 1993 assistant for text corpora analysis at the Faculty of Arts, from 1989 counselor at the Slovenian Academy for Sciences and Arts; all in Ljubljana.

Major pieces of software/lingware: STAT (Control Data Cyber, 1977), IBIS (Digital DEC 10, 1981), INES (Sinclair ZX Spectrum, 1985), STEVE (ATARI ST, 1987), EVA (DOS, 1992- and Windows 95, 1996-).

Recently an interactive tagger for Slovenian has been incorporated into EVA.

Married to Slava Ovcar, biologist, in 1971; daughters Ana, 1976 and Marija, 1979. Photographic exhibitions: Bled 1977, Ljubljana 1979, Laze 1994.

He speaks English, reads German, French, Italian, Russian and Czech.

Internet homepage: http://www.uni-lj.si/~ffjakopin

15.2 Recent publications in the field of text analysis

P. Jakopin, A. Bizjak: O oblikoslovnem oznacevanju slovenskega besedila [Part-of-speech Tagging of Slovenian Language]. Slavisticna revija, submitted for publication.

P. Jakopin, A. Musar: Kvantitativni prikaz prevoda zbirke zgodb [Quantitative Description of the Slovenian Translation of Lewis Carroll: Alice in Wonderland]. Slava, to be published.

P. Jakopin, A. Bizjak: Part-of-Speech Tagging in the Slovenian Translation of Plato's Republic. TELRI Newsletter 5 (April 1997).

P. Jakopin: Ali so rojstna imena krajsa od drugih samostalnikov? [Are Personal Names Shorter Than Other Nouns?]. Slavisticna revija, Vol. 44/2, 1996, p. 193-200.

M. Hajnsek-Holz, P. Jakopin: Odzadnji slovar slovenskega jezika [Reverse Dictionary of Slovenian Language]. ZRC-SAZU, Ljubljana 1996, 860 pp.

P. Jakopin: EVA - A Textual Data Processing Tool. Proceedings of the first TELRI seminar: Language Resources for Language Technology, Tihany 1995.

P. Jakopin: Nekaj stevilk iz Slovarja slovenskega knjiznega jezika [Some Figures from the Dictionary of the Slovenian Literary Language]. Slavisticna revija, Vol. 43/3, 1995, p. 341-375.

P. Jakopin: O n-terckih in o novem postopku za deljenje besed [On n-tuples and on a new method for word division]. TIP, Nr. 6, 1995, p. 38-40.

Slovar slovenskega knjiznega jezika [Dictionary of the Slovenian Literary Language]. DZS, Ljubljana 1994, 1714 pp. (software author)

P. Jakopin: STEVE: Text, Graphics, Data Base, DeskTop Publishing and Computer Aided Instruction on ATARI ST. Ljubljana, 1989, 605 pp.

15.3 Fran Ramovs Institute of the Slovenian Language, within the Centre of Scientific Research of the Slovenian Academy of Arts and Sciences

Novi trg 4, 1000 Ljubljana, Slovenia

tel. +386 61 1256 068, fax: +386 61 1255 253

Internet homepage: http://www.zrc-sazu.si/www/isjfr/isjfr-a.htm

The main activity of the Institute is research on the Slovenian language and preparation of the basic works in the field of Slovenian linguistics and quantitative linguistics:

- revised edition of the Dictionary of Standard Slovenian (in book and on CD),

- Orthographic Dictionary of Slovenian,

- compilation of a new dictionary of modern standard Slovenian (both in book and in electronic form, initial stage),

- implementation of the Lemmatization Dictionary of standard Slovenian,

- the continuation of the Etymological Dictionary of Slovenian,

- Dictionary of 16th Century Slovenian Protestant writers,

- Dictionary of the old standard language of the Prekmurje region,

- preparation of linguistic atlases (Slovenian - SLA, General Slavonic - OLA, European - ALE),

- monographs on dialects and dictionaries of dialects,

- terminological dictionaries.

The Institute is currently involved in several international projects: the OLA General Slavonic linguistic atlas, the ALE European linguistic atlas, and TELRI (Trans-European Language Resources Infrastructure), which is a concerted action within the framework of Copernicus.

The responsible person and head of the Institute:

Prof. Dr. Varja Cvetko-Oresnik.

The Institute employs 35 researchers and 8 associates.





Tamas Varadi

16.1 Curriculum Vitae

Dr. Tamás Váradi

Head of Department of Corpus Linguistics

Senior research fellow

Department of Sociolinguistics

Personal

Office address:

Linguistics Institute, Hungarian Academy of Sciences H-1250 Budapest, P.O.Box 19

Phone: 36 1 175 8011 X243

FAX: 36 1 212 2050

email: varadi@nytud.hu

Qualifications

Ph. D. (English Linguistics) 1996 Loránd Eötvös University, Budapest

M.A. (English Linguistics) 1981 Loránd Eötvös University, Budapest

B.A. (English, Spanish, Linguistics) 1976 Loránd Eötvös University, Budapest

Positions

1997 - Head of Corpus Linguistics Department; Senior research fellow, Sociolinguistics Department (part time)

1995 - 1997 Research fellow at the Linguistics Institute of the Hungarian Academy of Sciences

1991 - 1995 Lecturer at the School of Slavonic and East European Studies, University of London

1994 - 1995 Research assistant in „Speech Rhythm in English“ project in UCREL, Department of Linguistics and Modern English Language, Lancaster University (part-time)

1992 - 1994 Research Assistant in MARSEC project at the Department of Linguistics and Modern English Language, Lancaster University (part time)

1990 - 1991 TEMPUS visiting fellow at the Department of Linguistics and Modern English Language, Lancaster University

1986 - 1990 Research fellow at the Linguistics Institute of the Hungarian Academy of Sciences; Lecturer at the Department of English, Loránd Eötvös University (part time)

1983 - 1986 Research fellow of the Hungarian Academy of Sciences

1976 - 1983 Lecturer at the Department of English, College for Foreign Trade, Budapest

Relevant professional experience

Research background in NLP through participation in EU funded Copernicus joint research projects.

Several years' experience in corpus linguistic work including the speech corpus MARSEC, and the 20m word dictionary corpus at the Linguistics Institute

Extensive programming background in PERL, familiarity with a range of programming languages including C, C++, Pascal, Snobol, ICON, Prolog

Extensive experience in relational database design and implementation

Knowledge of HTML, SGML and TEI

Thorough knowledge and experience in CGI programming

Working knowledge of DOS/MS Windows/UNIX/X Window operating systems

Thorough familiarity with Unix text processing tools

Current Research:

Compiling the Hungarian National Corpus, targeted as a 100m word reference corpus of contemporary written Hungarian

Preparing a CD-ROM version of a digitized interview of the Budapest Sociolinguistic Interview database involving a multimedia user interface

Preparing a parallel English-Hungarian corpus of Plato's Republic

Working on word level alignment using grammatical clues

Building a WWW user interface to the PAT textual database management system

Projects:

MULTEXT- EAST (Copernicus) Linguistics Institute, Hungarian Academy of Sciences, 1995

TELRI (Copernicus) Linguistics Institute, Hungarian Academy of Sciences, 1995

Learner's Dictionary of English SSEES, London University, 1991 - 1995

Speech Rhythm in English UCREL, Dept of Linguistics and Modern English Language, Lancaster University 1994 - 1995

MARSEC - Machine Readable Spoken English Corpus UCREL, Dept of Linguistics and Modern English Language, Lancaster University 1992 - 1994

Lancaster Parsed Corpus of English UCREL, Dept of Linguistics and Modern English Language, Lancaster University 1990 - 1991

Budapest Sociolinguistic Survey Linguistics Institute, Hungarian Academy of Sciences 1986

Question-Answering in automated document Hungarian Academy of Sciences, 1977 - 1979

English-Hungarian Contrastive Linguistics Hungarian Academy of Sciences, 1973 - 1976

Courses taught:

English as a foreign language (with Business English as a specialisation

Dept. of English, College of Foreign Trade (1976 - 1983)

Contrastive linguistics and error analysis (MA)

Dept. of Linguistics and Modern English Language

Introduction to linguistics English morphology and syntax

Contrastive linguistics Second language acquisition studies

Department of English, Loránd Eötvös University, Budapest (1983-1990)

Contrastive Linguistics and Error Analysis Department of English, Klagenfurt University (1982,1983,1986)

Computer assisted language learning (MA) Introduction to linguistics

Department of Linguistics and Modern English Language, Lancaster University (1991)

Other professional activities:

1987, 1991 Course Tutor at the British Council Course on "The Use of Computers in English Language Teaching and Research", held at Lancaster University

1988 Resource person at the Dubrovnik Inter-University Course "Strategies in Second Language Communication"

1991 Collaborated on computational corpus linguistic work at the Unit for Computer Research on the English Language, Lancaster University

1981 - 1990 Guest lectures at the universities of Klagenfurt, Graz, Aarhus and Lancaster, the Linguistic Circle of Ljubljana, and the University of Southern California

16.2 Select publications

1980 Strategies of target language communication: Message adjustment. International Review of Applied Linguistics 1980/1:52-69

1980 A Guide to Hungarian-English Errors in Dezso, L. & Nemser, W.(eds.) Studies in English-Hungarian Contrastive Linguistics Akadémiai Kiadó pp. 513-589

1984 Reported statements in English in Stephanides, É. (ed.) Contrasting English with Hungarian Akadémiai Kiadó

1988 A beszédszünet szubjektiv es objektiv regisztrálásának összevetésérol. (On the comparison of the subjective and objective registration of pauses) in Kontra, M. Beszélt nyelvi tanulmányok Hungarian Academy of Sciences Linguistica Series pp. 44-59

1991 A számítógépes lexikográfia új eszközeirol. (On new tools for computational lexicography) In: Hunyadi, L. et al. (eds.) Könyv Papp Ferencnek. (Festschrift for Ferenc Papp) Debrecen: KLTE. pp. 367-374

1992 - Kontra Miklós (szerk.) Studies in Spoken Languages: English, German, Finno-Ugric. MTA Nyelvtudományi Intézet, Budapest

1992 Review article of E. Bialystok: Communication Strategies, Blackwell, 1990 and N. Poulisse: The Use of Compensatory Strategies by Dutch Learners of English, Foris Publications, 1990 Applied Linguistics 13:4 pp. 434-440

1993 An Integrated Analysis of the Spontaneous Speech of Second Language Learners. Annales Universitatis Scientiarum Budapestinensis de Rolando Eötvös Nominatae. Sectio Linguistica. Tomus XX. Redigit I. Szathmári. Budapest, 1993, 201-265

1993 Disfluency Phenomena in L2 Speech in Z. Kövecses (ed.) Voices of Friendship. Linguistic Essays in Honour of László T. András, Budapest, Department of American Studies, Loránd Eötvös University, Budapest, pp. 117-128

1993 - Roach, P. Knowles, G, Arnfield, S: MARSEC: A Machine-Readable Spoken English Corpus. Journal of the International Phonetic Association 23:2. pp. 47-53

1994 Hesitations between inessive and illative forms in Hungarian in Studies in Applied Linguistics Debrecen, KLTE, 1:1 pp. 3-20

1995 -- Kontra, Miklós: Degrees of Stigmatization: t-final Verbs in Hungarian In: Wolfgang Viereck (Hrsg.): ZDL-Beiheft 77: Verhandlungen des Internationalen Dialektologenkongresses Bamberg 1990. Franz Steiner Verlag, Stuttgart, 1995, Band 4., 132-142

1996 Stylistic variation and the (bVn) variable in the Budapest Sociolinguistic Interview in Acta Linguistica Hungarica 43:3-4 pp 295 - 311

1997 Adatok a nagyszótári korpusz összetételérol (On the structure of the Academy Dictionary Corpus) in Kiss G. (szerk.) Szavak - nevek - helyek Kiss Lajos 75. születésnapjára (Words - names - places: a Festschrift to celebrate Lajos Kiss' 75th birthday), MTA Linguistics Institute, pp. 437 - 446

Csaba Oravecz

Curriculum Vitae

Education

1991 MSc in electrical engineering, Technical University, Budapest

1997 MA in English language and literature, Eotvos Lorand University, Budapest

1997 MA in theoretical linguistics, Eotvos Lorand University, Budapest

1997- PhD student in theoretical linguistics, Eotvos Lorand University, Budapest

Work experience

1994-96 Teaching assistant at the Department of English Linguistics, Eotvos Lorand University, Budapest

1995- Research assistant at the Research Institute for Linguistics, Hungarian Academy of Sciences

Participation in international projects

1995-96 GRAMLEX COPERNICUS project

1995- MULTEXT-EAST COPERNICUS project

Recent publications

Oravecz Csaba (1996) "Application to Hungarian" in Sample Corpus Collection and Preparation. Multext-East Deliverable D2.1. Intermediate Report. Aix-en-Provence: Laboratoire Parole et Langage

16.3

RESEARCH INSTITUTE FOR LINGUISTICS

HUNGARIAN ACADEMY OF SCIENCES

Budapest, I., Szinhaz utca 5-9., Hungary; Budapest, P.O.Box 19, 1250; Tel.: (36-1) 175-8011/276 ext., (36-1) 175-8285; Fax: (36-1) 212-2050

E-mail: kiefer@nytud.hu banreti@nytud.hu

Founded in 1949, RIL is the only non-university-affiliated research center in the field of linguistics in Hungary. Its activities focus on the synchronic and diachronic description of Hungarian as well as on research in both theoretical and applied linguistics.

Director: Ferenc Kiefer (morphology, semantics, pragmatics)

Vice-director: Zoltan Banreti (syntax, patholinguistics)

Senior research fellows (full time):

Eva B. Lorinczy (dialectology), Marianne Bakro-Nagy (Finno-Ugric linguistics), Mihaly Brody (syntax), Sandor Csucs (Finno-Ugric linguistics), Ildiko Ecsedy (languages in China), Katalin Kiss (syntax), Judit Feher (Oriental studies), Karoly Gerstner (historical linguistics, lexicography), Maria T. Gosy (phonetics, child language), Erzsebet H. Nagy (linguistic norm), Lea Haader (historical linguistics), Gyorgy Hazai (Turkish), Laszlo Horvath (historical linguistics), Alexandr Jarovinszkij (child language), Laszlo Kalman (semantics, computational linguistics), Ilona Kassai (spoken language, child language), Gabor Kemeny (stylistics), Andras Komlosy (syntax, morphology), Miklos Kontra (spoken language, sociolinguistics), Gabor Olaszy (phonetics), Ferenc Papp (Russian linguistics, computational linguistics), Ildiko Posgay (dialectology), Zita Reger (child language, Romani), Robert Simon (Oriental studies), Tamas Szende (phonology), Laszlo Szuts (linguistic norm), Ferenc Tokei (Chinese), Tamas Varadi (computational linguistics), Balazs Wacha (historical linguistics, Hungarian descriptive syntax).

Research fellows (full time): between 40 and 50.

Part-time researchers: between 40 and 50.

Departments:

1. Oriental studies. Head: Ferenc Tokei. 2. Phonetics. Head: Maria T. Gosy. 3. Applied linguistics. Head: Zita Reger. 4. Theoretical linguistics. Head: Andras Komlosy. 5. Finno-Ugric languages. Head: Marianne Bakro-Nagy. 6. Dialectology. Head: Eva B. Lorinczy. 7. Spoken language. Head: Miklos Kontra. 8. Historical linguistics. Head: Lea Haader. 9. Linguistic norm. Head: Laszlo Szuts. 10. Lexicography and Lexicology. Head: Karoly Gerstner. 11. Corpus Linguistics. Head: Tamas Varadi.

Major projects:

(a) Historical Grammar of Hungarian (Lea Haader)

The historical period described in the first stage of the project was Old Hungarian (up to the thirties of the 16th century). This period was split up into Early Old Hungarian to which Proto-Hungarian was appended (up to the middle of the 14th century) and Late Old Hungarian. The outcome was a three-volume grammar (morphology and syntax). The second stage comprises the development of Hungarian from the early 16th century until the end of the 18th century. This part of the project started in 1995.

(b) The Comprehensive Dictionary of the Hungarian Language (Karoly Gerstner)

The dictionary (containing 200,000-250,000 entries) will be based on a corpus consisting of approximately 40 million running words taken from various periods. Two thirds of the texts are selected from 20th century writings (journals, novels, scientific and religious literature, etc.). The rest comes mainly from 19th century works and to some extent also from 18th century writings. A multi-functional lexical data base is under construction. The text file is tagged by a morphological analyzer designed for this particular task. This analyzer can also be used as a spellchecker. For retrieval the OPEN TEXT (PAT) software is used.

(c) Survey of spoken Hungarian (Miklos Kontra)

Tape-recorded interviews have been conducted with 200 informants who constitute a random stratified sample of the population of Budapest. Transcription and computerization of the data are in progress. A pen-and-paper survey has also been conducted with a nationally representative sample of the adult Hungarian population. The data of the national survey are in computer-readable form. Several papers have been published on some grammatical (phonological, morphological as well as syntactic) peculiarities of spoken language.

(d) Speech synthesis and speech perception (Gabor Olaszy)

A multilingual real-time text-to-speech synthesis system has been developed for several languages (Hungarian, German, Dutch, Spanish, Portuguese, Esperanto). The refinement of the system is under preparation.

(e) The structural grammar of Hungarian (Ferenc Kiefer)

The project aims at a theoretical description of Hungarian. The first volume was dedicated to syntax (published in 1992); the theoretical frameworks used were generative syntax and lexical functional grammar. The second volume described the phonology of Hungarian (published in 1994) using mainly post-structural methodology. The third volume, which will provide a detailed description of Hungarian morphology, is in preparation (the manuscript will be completed by the end of 1996). There will also be a fourth volume devoted to the lexicon, i.e. to lexical representations.

(f) Child language (Zita Reger)

The construction of a Hungarian data base for child language according to CHILDES (Child Language Data Exchange System). The data base is used for an inquiry into the effects of adult-child interaction on the development of grammar, the lexicon and the communicative competence (socialization) in the child. Further research topics include (i) the linguistic socialization of Gipsy children in traditional communities, and (ii) child bilingualism.

(g) Aphasia and patholinguistics (Zoltan Banreti)

Investigation of patholinguistic phenomena by means of various tests in order to answer some fundamental questions about the organization of grammar (e.g. the modular or non-modular structure of grammar, interrelationships between syntax, morphology and the lexicon) as well as about the relationship between grammar and the human language processor.

(h) The Dictionary of Hungarian Dialect Vocabulary (Eva B. Lorinczy)

The project started in the early 1950s and is expected to be finished by the end of 1997. The dictionary contains attested dialect words from all regions, including Hungarian-speaking regions outside Hungary. Thus far three volumes have been published: I (A-D), 1979, 1053 pages; II (E-J), 1988, 1175 pages; III (K-M), 1992, 1341 pages. Volume IV is in preparation.

(i) Uralic Etymological Archives (UEA) (Maria Sipos)

Based on the material of the Uralisches Etymologisches Wörterbuch, UEA is a computer-readable data base which contains more than 3000 etymologies of the Uralic language family. The construction of a multi-functional retrieval system is in progress.

(j) Speech perception and comprehension (Gabor Olaszy)

Various models of lexical access in perception that take the 'word' into consideration as a phonetic unit are being examined. Research is also being carried out to clarify serial processes in speech perception.

(k) Sociolinguistic study of the Hungarian language in neighboring countries (Miklos Kontra)

A sociolinguistic study is being carried out among the Hungarian minorities in Slovakia, Ukraine, Romania, Yugoslavia (Serbia), Slovenia and Austria. Special attention is being paid to bilingualism.

(l) Theoretical linguistics (individual projects)

The Hungarian language is used in order to clarify various theoretical issues in phonology, morphology, syntax and semantics. The theoretical frameworks used include autosegmental phonology, government and charm phonology, natural morphology, generative morphology, government and binding theory, lexical-functional grammar, discourse representation theory.

Teaching

In 1990 RIL launched a joint program with Eotvos Lorand University, Budapest (ELTE), to teach theoretical linguistics. The program comprises a 4-year undergraduate and (since 1994) a 3-year graduate education.





Tomaz Erjavec

17.1 Curriculum Vitae

Name: Tomaz Erjavec

Tel: +386 61 1773507

Fax: +386 61 1251038

Email: Tomaz.Erjavec@ijs.si

Department for Intelligent Systems

Jozef Stefan Institute

Jamova 39

SI-1000 Ljubljana

Slovenia

Dr. Erjavec is a research fellow at the Jozef Stefan Institute, working on the development of Slovene textual corpora, language technologies for the Slovene language, computational morphology and phonology and typed feature-structure formalisms and implementations.

He has recently been working on the Copernicus Joint Project MULTEXT-EAST, where he was the leader of Work Package 2: Tool application to sample corpus, and on the TELRI Copernicus Concerted Action, where he is the coordinator of the Working Group on Tool availability. While at the Centre for Cognitive Science of the University of Edinburgh, he was also involved in the U.K. SALT-IED funded project "Integrated Language Database". He is co-editor of the International Journal of Corpus Linguistics (IJCL), and has been a reviewer for a number of international journals and conferences.

17.2 RECENT PUBLICATIONS

- Tomaz Erjavec, Nancy Ide, Dan Tufis: Encoding and Parallel Alignment of Linguistic Corpora in Six Central and Eastern European Languages. In Greg Lessard, Michael Levison, eds.: Proceedings of the Joint International Conference of the ACH-ALLC '97, Queen's University, Kingston, Ontario, Canada, June 1997.

- Saso Dzeroski, Tomaz Erjavec: Induction of Slovene Nominal Paradigms. In N. Lavrac, S. Dzeroski, eds.: Inductive Logic Programming; 7th International Workshop ILP-97, Proceedings, Lecture Notes in Artificial Intelligence 1297, Springer, pp. 141-148, 1997.

- Saso Dzeroski, Tomaz Erjavec: Learning Slovene Declensions with FOIDL. In W. Daelemans, A. van den Bosch, T. Weijters, eds.: Empirical Learning of Natural Language Processing Tasks, ECML-97 MLnet Workshop Notes, pp. 49-60, Prague, 1997.

- Tomaz Erjavec: Racunalniske zbirke besedil (Computerized Text Collections). Jezik in Slovstvo, pp. 81-96, 42/2-3, 1996/7.

- Tomaz Erjavec, Claude de Loupy: The MULTEXT-East project: practical experience in multilingual corpus coding and processing. CEN/TC304 Workshop: Providing Multilingual Support for Middleware: Implementing the Universal Character Set ISO 10646 in the European Information Society, Bled, November 1996, p. 12, 1996.

- Tomaz Erjavec, Nancy Ide, Vladimir Petkevic, Jean Veronis: Multext-East: Multilingual Text Tools and Corpora for Central and Eastern European Languages. Proceedings of the First European TELRI Seminar: Language Resources for Language Technology, pp. 87-98, 1996.

- Tomaz Erjavec: Public Domain Generic Tools: an Overview. Proceedings of the First European TELRI Seminar: Language Resources for Language Technology, pp. 37-48, 1996.

- Carl Vogel, Tomaz Erjavec: 'Restricted Discontinuous Phrase Structure Grammar and its Ramifications'. Current Issues in Mathematical Linguistics, Carlos Martin-Vide (ed.), Elsevier/North Holland, pp. 131-140, 1994.

- Tomaz Erjavec: 'Formalizing Realizational Morphology in Typed Feature Structures'. Fourth Computational Linguistics in the Netherlands Meeting - CLIN IV, Groningen, Netherlands, pp. 47-58, 1994.

17.3 ORGANISATIONAL STRUCTURE: JSI

Jozef Stefan Institute

The J. Stefan Institute (JSI, founded 1949) is a research organisation for pure and applied research in the natural sciences and technology. Both are closely interconnected in research departments composed of different task teams. Emphasis in basic research is given to the growth and education of young scientists, while applied research and development serve for the transfer of advanced knowledge. At present the Institute has a total staff of about 750, including a research staff of nearly 450: about 200 of them are post-graduates temporarily employed while obtaining their degrees, almost 200 have doctorates, and 100 have permanent professorships or temporary teaching assignments at the Universities (Ljubljana and Maribor). In view of its activities and status, the J. Stefan Institute plays the role of a kind of national institute, complementing the role of the universities and bridging the gap between science and applications.

The Department for Intelligent Systems of JSI is the major Slovenian AI research group, with a 20-year tradition in R&D in artificial intelligence, intelligent systems, information systems, medical informatics, and natural language processing. The work in the area of NLP centers on the development of computational models and resources for the Slovene language: corpora, speech synthesis and models of inductive learning of language structure.





Geoff Barnbrook

18.1 Curriculum Vitae

1968 - 1979 Various posts (from Articled Clerk to Assistant Manager) with Peat Marwick Mitchell and Co., a professional firm of Chartered Accountants, in their Birmingham and Rome offices. Associate of the Institute of Chartered Accountants 1973.

1979 - 1986 Tutor and Senior Tutor with The Financial Training Company Limited, a training company catering mainly for Chartered and Certified Accountancy students.

1984 - 1988 B.A. course in Single Honours English, University of Birmingham (including a part-time first year taken over two years). Language options taken included Computational Linguistics and Middle English. First Class Honours awarded 1988.

1988 - 1989 Research at the University of Birmingham into the analysis of spelling variations in Chaucer's Canterbury Tales for the degree of M.Phil (awarded Dec. 1990)

1989 - 1991 Research Associate in Computing and English Studies, School of English, University of Birmingham, responsible for maintaining the School's computers, including IBM compatible PCs, Apple Macintosh machines, and unix-based workstations, and providing staff and student training and computer research and administration support.

1990 - 1995 Research at the University of Birmingham, under the supervision of Professor John Sinclair, into the automatic analysis of Cobuild dictionary definitions, for the degree of Ph.D. (awarded December 1995).

1991 - Lecturer in English Language, School of English, University of Birmingham, specialising in the History of the English Language, Computational Linguistics, Corpus-based Research Methods and Medieval English language and literature.

1996 - Co-ordinator of the Corpus Research group within the School of English, involved in international research projects (such as the EC's PAROLE project and TELRI).

Main research project participation

1989 - 1991 Chamberlain Project (IBM, University of Birmingham) exploring the automatic analysis of Cobuild dictionary definitions.

1992 - 1994 ET/10 - 51 project (Birmingham, Bochum and Pisa) developing an automatic parser for Cobuild definitions and using it to extract lexico-syntactic information for a test vocabulary.

1996 - LE PAROLE project (organisations within most EU countries) developing resources for language research, including corpora and lexica.

1996 - TELRI project (Trans-European Language Resources Infrastructure - 22 institutions within 17 European countries) creating a language research infrastructure in collaboration with Eastern European institutions.

18.2 Publications relevant to current proposal

1992 'Computer analysis of spelling variants in Chaucer's Canterbury Tales' in Leitner (ed.), New Directions in English Language Corpora

'The Nature of the Metalanguage', (with J.M.Sinclair) in EC research contract ET-10/51, progress report 1

1993 'The semantics of Definitions', (with J.M.Sinclair) in EC research contract ET-10/51, progress report 2

'The Automatic Analysis of Dictionaries - Parsing Cobuild Explanations', in Baker, Francis & Tognini-Bonelli (eds.), Text and Technology: in honour of John Sinclair, Amsterdam: John Benjamins

1995 'Parsing Cobuild Entries', with J.M.Sinclair, in Sinclair, Hoelter & Peters (eds.), The Languages of Definition: the Formalization of Dictionary Definitions for Natural Language Processing: Luxembourg: Office for Official Publications of the European Communities

1996 Language and Computers: A Practical Introduction to the Computer Analysis of Language: Edinburgh: Edinburgh University Press

18.3 Site profile

The University of Birmingham, founded in the late nineteenth century, has about 17,000 students, mainly on one campus in Edgbaston. The Department of English has about 300 undergraduate students on its single honours programme, and a similar number doing joint honours. It also has a large number of postgraduates working in both literary and linguistic studies. The Corpus Research unit within the Department was started by Professor John Sinclair. It has strong links with Cobuild, dictionary publishers and proprietors of the 'Bank of English' corpus, which began as a joint venture between the University and the publishers Collins. Its current full-time staff are Dr. Geoff Barnbrook and Oliver Mason, the unit's Computer Officer.





Frantisek Cermak

19.1 Curriculum Vitae

Institute of the Czech National Corpus, Faculty of Philosophy Charles University, director (Ústav ceského národního korpusu FFUK) nám J. Palacha 2, Praha 1, 110 00, fax: +42 02 2481 2166, email: Frantisek.Cermak@ff.cuni.cz

Born in 1940, I studied Czech, English and Dutch (1957-62) at the Faculty of Philosophy, Charles University, Prague. After graduating I became a member of the Institute of Czech Studies, Faculty of Philosophy, Charles University, which is oriented towards Czech as a second language. Between 1991 and 1993 I was head of the Lexicography Department of the Institute of the Czech Language, Czech Academy of Sciences. Since 1994 I have been head of the interdisciplinary Institute of the Czech National Corpus, Faculty of Philosophy, Charles University, which is devoted to building and developing the Czech National Corpus, the largest universal databank of the Czech language. I have taken part in the multinational TELRI project of the European Union within the Copernicus initiative, which is oriented towards corpora. I have been a guest lecturer at a number of universities (Poland: Lublin, Gdansk, Krakow; Denmark: Copenhagen, Aarhus; Norway: Oslo; Italy: Udine; Sweden: Göteborg, Stockholm; Macedonia: Skopje; Egypt: Cairo; Hungary: Budapest; Netherlands: Amsterdam; Germany: Berlin; Spain: Granada).

I took my PhD degree in philosophy, linguistics and phonetics in 1976 (thesis: A Dutch Grammar, 1976), the Candidatus Scientiae degree (CSc) in the Czech language (based on a collection of works: Phraseology, Idiomatics of Czech and Phraseology, 1990), the Doctor of Sciences degree (DrSc) in the Czech language (based on a collection of works: that of 1990 enlarged by several papers in 1991), the degree of Dozent in general linguistics in 1991 (with a collection of works: Aspects of the Language System and its Periphery: Czech and Other Languages, 1991) and the Professor's degree in the Czech language (based on a collection of works: Studies on the Czech Language, 1994).

My main language interests are Czech, the Germanic languages (especially Dutch, English and Scandinavian), Finnish, the Slavonic languages, etc. My special fields include lexicology and lexicography, phraseology and idiomatics, semantics, word formation, morphology, typology, theory of language, linguistic methodology, corpus linguistics, etc.

I am a member of Societas Linguistica Europaea, EURALEX (board member), Jazykovedne sdruzení, and Prazský lingvistický krouzek (Prague Linguistic Circle).

19.2 Selected Recent Publications

1995, Prague School of Linguistics Today. Linguistica Pragensia 1. 1-15.

1995, Systém, funkce, forma a sémantika ceských predlozek. Slovo a Slovesnost (in print) (System, Function, Form and Meaning of the Czech Prepositions).

1995, Functional System and Evaluation. In Travaux du cercle linguistique de Prague n.s. Prague linguistics circle papers, Vol. 1. eds. E. Hajicová, M. Cervenka, O. Leska, P. Sgall. Benjamins, Amsterdam/Philadelphia. 73-84.

1996, Ferdinand de Saussure and the Prague School of Linguistics. In Travaux du cercle linguistique de Prague n.s. Prague linguistics circle papers, Vol. 2. eds. E. Hajicová, M. Cervenka, O. Leska, P. Sgall. Benjamins, Amsterdam/Philadelphia. 59-72.

1997, Nizozemsko-ceský slovník. Nederlands-Tsjechisch Woordenboek. Leda Praha, 2. enlarged ed. (Dutch-Czech Dictionary, together with Z. Hrncírová) 1039 pp.

1997, Synchrony and Diachrony Revisited: Was R. Jakobson and the Prague Circle Right in Their Criticism of de Saussure? Folia Linguistica Historica XVII/1-2, 19-40.

19.3 The Institute of the Czech National Corpus

(Ústav ceského národního korpusu)

Faculty of Arts, Charles University

nám. J. Palacha 2, Praha 1, 110 00, Czech Republic

Head: Prof PhDr. Frantisek Cermak

The institute, founded in 1994 and made up, by mutual agreement, in part of members of several institutes and schools, aims at building up a very large, multifunctional academic computer corpus of the Czech language (The Czech National Corpus), both contemporary and historical (in text form) as well as spoken. It is intended to serve as a basis for A) research, B) lexicographical and other applications and, in the future, C) teaching (corpus linguistics and its applications); in a broader sense, The Czech National Corpus will become the largest and most universal source of data on the Czech language accessible with computer tools. The basic research fields represented are: corpus linguistics (with special reference to the Czech language and, to some extent, contrastive studies), theory of corpora, application and verification of linguistic theories, and lexicographical and other applications of the corpus (linguistic and other).

Current research project: The Czech National Corpus. Grants accorded by the State Grant Agency: The Corpus of Czech Written Texts, Programme Tools for the Computational Elaboration of Czech Texts, Czech Phraseology, its Research and Lexicographical Elaboration, Corpus of Spoken Czech in Computer Elaboration, etc.





Laurent Romary

20.1 Curriculum Vitae

Born: April 4th, 1964

Graduated from Ecole Supérieure d'Electricité in 1986

PhD in Computer Science in 1989

Current position: CNRS Senior Researcher (Chargé de Recherche) at the Centre de Recherche en Informatique de Nancy (CRIN-CNRS), head of the research team "Language and Dialogue" (ca. 20 members), a joint research unit with the INRIA.

Coordinates: CRIN-CNRS & INRIA Lorraine

Bâtiment Loria, B.P. 239

F-54506 Vandoeuvre-lès-Nancy

Tel : (33) 03 83 59 20 37 (sec. 2026)

Fax : (33) 03 83 41 30 79

e-mail: Laurent.Romary@loria.fr (sec. Isabelle.Blanchard@loria.fr)

http://www.loria.fr/~romary

Main areas of research:

- man-machine dialogue system design

- pragmatics of dialogue

- corpus design and management

- text encoding standards and tools

20.2 Main recent publications

Bellalem, N. and L. Romary (1996). Gestural Prosody in Man-Machine Task-Oriented Dialogues. Workshop on the Integration of Gesture in Language and Speech, Newark, Delaware.

Bellalem, N. and L. Romary (1996). Structural Analysis of Co-verbal Deictic Gesture in Multimodal Dialogue System. Gesture Workshop'96, York, UK.

Bonhomme, P., F. Bruneseaux, et al. (1996). Codage, documentation et diffusion de ressources textuelles. Cahiers de GUTenberg(24): 177-180.

Bonhomme, P., S. Cruz-Lara, et al. (1997). SILFIDE : Serveur Interactif pour la Langue Française, son Identité, sa Diffusion et son Etude. TALN'97.

Bruneseaux, F. and L. Romary (1997). Codage des références et coréférences dans les dialogues homme-machine. Joint International Conference of the Association for Computers and the Humanities and the Association for Literary and Linguistic Computing.

Gaiffe, B. and L. Romary (1996). Do we co-refer to the same object? ECAI Workshop on Intelligent Multimedia Presentation Systems (W32), Budapest.

Romary, L. and J.-M. Pierrel (1996). Le projet Silfide : vers un accès ouvert aux ressources linguistiques francophones. Revue Française de Linguistique Appliquée.

20.3 Organization profile

"LANGUE ET DIALOGUE" TEAM

Combined project of the CRIN-CNRS (Centre de Recherche en Informatique de Nancy/Centre National de la Recherche Scientifique) and the INRIA (Institut National de Recherche en Informatique et Automatique)

Team leader:

Laurent Romary

The introduction of natural language and, in particular, of speech into a man-machine interface requires systems which are both robust from the point of view of recognition and understanding of language and well adapted to the underlying task. The objective of the DIALOGUE project is to define and put into operation robust and reliable communication systems with a man-machine language component. To this end we also carry out substantial work on the underlying linguistic engineering, which is required for both fundamental and applied studies.

MAIN AREAS OF RESEARCH

- The use of knowledge for the recognition and comprehension of speech and natural language, from the treatment of speech signals to pragmatics: recognition of words, understanding of sentences, use of prosody, use of structural information; syntax and semantics and their interactions in the text; modeling and interpretation of dialogues.

- The integration of natural language in applications; the supply of data to multi-modal "natural languages": oral dialogue and interactive applications (the PARTNER and DIAPASON systems, integrating natural language, speech, graphics and direct designation by mouse or glove); help dialogue systems (application to telephone information).

- The definition of methods and tools for the management of and access to linguistic resources. In particular, we are coordinating the Silfide project, which aims at defining a framework for the distribution and use of such resources across the web for the French-speaking academic community.

SCIENTIFIC AND INDUSTRIAL LINKS

We have, and have had, several research and development contracts with industrial partners such as ALCATEL, THOMSON-SINTRA-DASM, THOMSON-CSF-SDC, DRET, SOLLAC.

We have been participating in the following European projects:

- MULTIWORKS Esprit project with BULL, OLIVETTI, PHILIPS;

- ROARS Esprit project with THOMSON, ENA, TELECOM;

- Aquarelle Esprit project (Information Engineering) with Bull, Finsiel, Grif, Euroclid;

- Master Eureka project with Alcatel;

- Lingua project on multilingual parallel concordancing.

We have numerous scientific relations, in particular with the following European partners: the Universities of Namur, Valencia, Ghent, Saarbrücken (DFKI), Mannheim (IDS), Birmingham, Sheffield, Edinburgh (HCRC), etc.





Vladimir Benko

21.1 Curriculum Vitae

Academic title: (Dipl.) Ing.

Date of birth: 28 April 1954, Liptovsky Mikulas, Slovakia

Nationality: Slovak

Sex: male

Marital status: married

Biographical notes:

1972-1978 -- student at the Faculty of Electrical Engineering, Slovak Technical University, Bratislava

1978-1981 -- Institute for Information and Management - research assistant (operating systems, software engineering)

1981-1985 -- Slovak Academy of Sciences, Computing Centre - junior research worker (database systems, text-processing)

1986-1992 -- SAS, Information Centre - senior research worker (textual databases, machine-readable dictionaries, computerized typesetting)

1990 -- Joint working group for computational linguistics (with the L. Stur Linguistics Institute) - computational lexicography

1992 -- Computational Linguistics Laboratory, Faculty of Education, Comenius University, Bratislava

1993 -- L. Stur Linguistics Institute, Slovak Academy of Sciences, computational lexicography (head of research project -- part time)

1995 -- invited by Microsoft Slovakia to act as language consultant in software localization projects

1997 -- appointed head of the Computational Linguistics Laboratory, Faculty of Education, Comenius University, Bratislava

Research subject:

Language corpora, lexical databases, use of language resources in teaching foreign languages

Present position:

Senior research worker & lecturer

Mailing address:

Vladimir Benko

Comenius University, Faculty of Education

Computational Linguistics Laboratory

Moskovska 3, SK-81334 Bratislava

Slovakia

21.2 RECENT PUBLICATIONS

Benko, V.: Slovak Language Lexical Database (SLLD), In: Complex '90 International Conference, Balatonfured 1990

Benko, V.: Konverzia Kratkeho slovnika slovenskeho jazyka na lexikalnu databazu (Conversion of the "Concise Dictionary of the Slovak Language" into a Lexical Database), In: Algoritmy '91 International Conference, April 1991

Benko, V.: (Late) Computational Support for a Dictionary Project: The Dictionary of Slovak Dialects, COMPLEX'92 International Conference, Budapest, October 1992

Benko, V.: Pocitacove korpusy a analyza textu (Computerized Corpora and Text Analysis), In: Text a kontext, Pedagogicka fakulta v Presove UPJS v Kosiciach, Presov 1993

Benko, V. -- Kostolansky, E: Corpus Work at the Faculty of Education, Comenius University. In: TALC '94, 1st International Conference on Teaching and Language Corpora, Lancaster University, April 1994.

Pisarcikova, M. -- Benko, V.: Slovak Synonym Dictionary. In: EURALEX '96 Proceedings, Part II, Gothenburg 1996

21.3 Site profile

Comenius University (Univerzita Komenskeho) in Bratislava is the oldest university in the Slovak Republic. It was founded in 1919 and follows the university tradition of the Academia Istropolitana which was established in Bratislava by Matthias Corvinus, the Hungarian King, in 1467.

The Faculty of Education (Pedagogicka fakulta) prepares teachers for Basic Schools (Primary: 1st--4th and Middle: 5th--8th grades), and in some subjects also for Secondary General Education Schools (9th--12th grades). The subjects taught cover Elementary Education, Pedagogy, Psychology, Slovak, English, German, French, Spanish, History, Civics, Mathematics, Natural Sciences, Art, Music, and Physical Education. The Faculty, as the sole University institution in Slovakia, is responsible for the preparation of teachers and educators for special education schools and institutions (visually and hearing impaired, physically handicapped, speech disorders, mentally retarded, and emotionally and socially disturbed). There is a special branch of study for education therapists and for specialists in the field of social work. The Faculty has 21 departments and 2 research units.

Computational Linguistics Laboratory (Laboratorium pocitacovej lingvistiky), a small research unit founded in 1992, is aimed at carrying out NLP research and developing LI applications. The main questions to be addressed are as follows:

1. General issues in computational processing of Slovak language, which include creation of formal representations of language, computational interpretation of text, evolution of speech in various stages of child development, computer-aided language (and humanities in general) learning, adaptation of new information technologies in CL research.

2. Creation of universal tools and re-usable resources to support application development, such as integrated development and testing environment for lexical research, tools for transformation of existing MRD's and other lexical resources, developing of own lexicons and language corpora.

3. Application of research results in processing large text resources (quantitative text analysis, summarization), computer-aided language learning, creation of specialized lexicons, machine translation.

4. Developing methodologies of using computerized language resources in teaching languages (both first and foreign).

At present there are two projects being carried out in the framework of the Slovak Grant Agency for Science funding programme:

1. Formal Models for Computerized Processing of Slovak Language, part Morphology (coordinator Eduard Kostolansky)

2. Slovak Corpus and Lexical Database (coordinator Vladimir Benko).

The Laboratory also participates in several dictionary projects with the Slovak Pedagogical Publishers and Forma software house.

The Laboratory is equipped with PC-compatible computers connected via an Ethernet LAN. The software environments include DOS, MS-Windows and Unix; C, Lex/Yacc and dBase; and a number of in-house developed tools.





Dan Cristea

22.1 Curriculum Vitae

A. Education

1994 University POLITEHNICA Bucharest - Ph.D. in Computer Science

1981 University "A.I.Cuza" of Iasi - Master in Mathematics

1975 University POLITEHNICA Bucharest - Master in Computer Science

B. Academic Honors

September - November 1996. Research fellowship, Universiti Sains Malaysia (Penang, Malaysia), Department of Computer Science. Research on computational discourse theory.

July 1996. Invited lecturer, ACM Summer School in Intelligent Natural Language interfaces (Belis-Fintinele, Romania).

December 1995 - March 1996. CNR Rome Research fellowship, Istituto per la Ricerca Scientifica e Tecnologica (Trento, Italy). Research on computational discourse theory.

December 1994 - June 1995. Research fellowship, University of Venice, Department of Computational Linguistics. Research on speech processing, prosody, and computer aided language learning

September 1994. Invited lecturer, Summer School on "Contemporary Topics in Computational Linguistics" (Tuzlata, Bulgaria)

December 1993 - February 1994. Research fellowship, University of Paris-Sud, Orsay. Research on multilingual morphology generation and computer-aided language learning.

June 1993. Research fellowship, University of Nijmegen. Research on Semantic Syntax.

May 1993. TEMPUS fellowship, University of Edinburgh, Department of Artificial Intelligence. Research on morphology generation.

C. Foreign Languages

English - Proficiency in reading, writing and conversation

French - Proficiency in reading, writing and conversation

Italian - Proficiency in reading and conversation

Romanian - Native proficiency

D. Employment and Experience

March 1995-present -- Associate Professor

University "A.I.Cuza" of Iasi, Faculty of Computer Science, Chair of Applied Computer Science. Artificial intelligence, natural language processing, expert systems

1988-1995 -- Lecturer

University "A.I.Cuza" of Iasi, Faculty of Computer Science, Chair of Applied Computer Science. Artificial intelligence, natural language processing, expert systems

1985-1988 -- Assistant Professor

University "A.I.Cuza" of Iasi, Faculty of Computer Science. Artificial intelligence, LISP, PROLOG

1984-1988 -- Project manager

Computer Center of the University of Iasi. Natural language processing, natural language interfaces to databases, knowledge representation

1983 -- Researcher

Research Institute in Computer Science, Bucharest. Question-answering systems, representation of knowledge, inference in semantic nets

1981-1983 -- Researcher

Computer Center of the University of Iasi. Morphology, parsing

1976-1981 -- System engineer

Computer Center of the University of Iasi. Computer systems maintenance and development, design and realization of electronic equipment for speech processing

22.2 LIST OF PAPERS

Tufis,D.; Cristea,D (1985): IURES:A Human Engineering Approach to Natural Language Question-Answering Systems. In: Artificial Intelligence. Methodology, Systems, Applications. (Eds: Bibel,W;Petkoff,B) North-Holland, Amsterdam, pag.177-184.

Cristea,D; Tufis,D; Mihaescu,T (1985):IURES: A Computer Natural Language Question-Answering System with Possible Medical Applications. Rev.Med.Chir.Soc.Med.Nat.Iasi LXXXIX(3), pag.511-516.

Cristea,D.; Mihaescu,T. (1986): Sistem de codificare automata a limbajului medical, in Al IX-lea Simpozion de Informatica Medicala - MEDINF'86. Rapoarte si lucrari, Iasi, pag.136-142.

Cristea,D. (1987): Sistemul QUERNAL, in Giumale,Cr.; Preotescu,D.; Serbanati,L.; Tufis,D.; Tecuci,Gh.; Cristea,D.: LISP, Editura Tehnica, Bucuresti, vol.2, pag.215-229.

Cristea,D.; Mihaescu,T. (1988): Combining Menues with Natural Language Processing in Recording Medical Data. Jou. of Clin. Comp., New York, XVI(5-6), pag.156-166.

Cristea,D. (1990): Procesarea limbajului natural. Bull.Ass.Rom.Sci. 1(1-3), pag.73-80.

Cristea,D.; Teodorescu,H.N. (1991): A Fuzzy Approach to a Mixed Syntactic-Semantic Parser. In: Fuzzy Systems and Artificial Intelligence. An Advanced Textbook. (Eds: Teodorescu,H.N.; Yamakawa,T.; Rascanu,A.) Iasi University Publishing House, Iasi, pag.67-78.

Cristea,D.; Teodorescu,H.N. (1992): Case Studies of Uncertainty in Natural Language Queries. Second International Conference of the Balkanic Union for Fuzzy Systems and Artificial Intelligence, (31 August - 5 September 1992, Karadeniz Technical University, Trabzon, Turkey)

Cristea,D. (1993): A Computational Insight into Semantic Syntax. The Nijmegen Report. Research report. Nijmegen University.

Cristea,D. (1993): The generation of Romanian Morphology. Research Report. University of Edinburgh.

Cristea,D.; Galescu,L. (1993): L-exp - A Language for Programming Natural Language Applications. The 9-th Romanian Symposium on Computer Science, ROSYCS-93. University "Al.I.Cuza" Iasi.

Cristea,D. (1994): The Classification Language MICH, Research Report, LIMSI-CNRS, Universite Paris-Sud, Orsay.

Cristea,D.; Galescu,L.; Bacalu C. (1994): L-exp - A Language for Building Natural Language Applications, in Proceedings of The 14th International Conference Avignon'94: Artificial Intelligence, KBS, Expert Systems, Natural Language, Paris.

Cristea,D. (1994): Probleme de analiza limbajului natural: Aplicatii in accesul la baze de date. Teza de doctorat, Universitatea POLITEHNICA Bucuresti, iulie 1994.

Cristea,D.; Delmonte,R.; Petrea,M.M. (1995): PROSODICS - a way to enhance your English. Proceedings of the Fifth Symposium on Automatic Control and Computer Science. Iasi, Romania, 26-27 October 1995.

Cristea,D.; Delmonte,R.; Petrea,M.M. (1996): PROSODICS - sau de ce copi'i e altceva decat co'pii. In D.Tufis (Ed.): Limbaj si Tehnologie, Editura Academiei, Bucharest.

D.Cristea: Representing and Understanding Discourse (1996). Technical Report #9609-03, IRST-Trento, Italy, Sept. 1996.

D.Cristea; B.Webber (1997): Expectations in Incremental Discourse Processing, Technical Report, Univ. of Iasi and Pennsylvania, June 1997.

D.Cristea; B.Webber (1997): Expectations in Incremental Discourse Processing, in Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics, Madrid, July 1997.

M.M.Petrea, D.Cristea (1997): Dealing with prosody. A computer assisted language learning approach, in Recent Advances on Language Technology, Ed. Academiei, Bucharest.

22.3

History.

The Faculty of Computer Science was established in 1991. It continues the Computer Science section that functioned from 1966 to 1991 as part of the Faculty of Mathematics of the University "A.I. Cuza" Iasi.

As of 1997 it is still the only faculty with a Computer Science profile among the non-technical universities in Romania.

Formation Tracks

* The Computer Science section (4+1 years)

Prepares specialists in the following domains:

o Software Engineering

o High school teaching careers in Computer Science

o Research in Computer Science

* The College of Information Technology (3 years)

Prepares programmers and analysts for skillful use of existing software

* The College of Administrative Staff (3 years)

Prepares specialists in secretariat work and administration

Address

Mail: Faculty of Computer Science, University "A. I. Cuza" Iasi

16, Gen. Berthelot St., 6600 - Iasi, Romania

Phone: +40.32.216 560 (Secretariat, The Dean)

Fax: +40.32.213 330

URL: http://www.infoiasi.ro

Antonio Zampolli

23.1 Curriculum Vitae

Full Professor of Computational Linguistics, University of Pisa

Director of the Istituto di Linguistica Computazionale, (ILC-CNR) Pisa

Academic Experience

Researcher at CAAL (Centro per l'Automazione dell'Analisi Linguistica e Letteraria), 1960-1965.

Researcher at CNUCE (Centro Nazionale Universitario di Calcolo Elettronico), 1965-1970

Senior System Engineer at the IBM Scientific Center, Pisa (1967-1973)

Associate Professor of Computational Linguistics at Pisa University (1970-1974)

Full Professor of General and Applied Linguistics at the University of Genova (1974-1977)

Full Professor of Computational Linguistics at the University of Pisa, from 1977.

Director of the International Summer School "Computational and Mathematical Linguistics", from 1970.

Director of the CNUCE Linguistics Division (1968-1978)

Director of the Institute of Computational Linguistics of CNR (since 1978)

Coordinator of the Italian National Strategic Project "National Language Processing" (1985-86)

Coordinator of the Italian National Strategic Project "Methods and Tools for Language Industries in the International Cooperation" (1987-89)

Professional Activities

· Italian Representative and Member of the Executive Committee of ALLC (Association of Literary and Linguistic Computing) since 1973

· Past vicepresident of ACH (Association for Computers in the Humanities)

· President of ALLC (Association of Literary and Linguistic Computing)

· Vicepresident of EURALEX 1985-1988

· President of EURALEX 1988-1990

· Member of ICCL (COLING) since 1973

· Vicepresident of ICCL (1978-1984)

· Vicepresident of AILA (Association Internationale de Linguistique Appliquée) from 1975 to 1984

· President of GILA (Italian Group of Applied Linguistics) since 1973

· Representative of ALLC in the Executive Committee of ACL (Association for Computational Linguistics) since 1986

· Member of the Standing Committee for Humanities of the ESF (European Science Foundation) 1985 - 1990: subject representative for computational linguistics;

· Co-chairman of the group "Computers and the Humanities" of the ESF 1989-1992

· Vicepresident and Member of CETIL 1979-1988

· Member and vicepresident of the CIDST ad hoc group for Linguistic Problems, 1979-84

· Member of the Group of Experts of the Council of Europe for Language Industries, and Member of the Bureau (1986 - 1990)

· Member of the group for linguistic technology in documentation of the FID

· Member of the Steering Committee of the TEI (Initiative for Text Encoding Guidelines and a Common Interchange Format for Literary and Linguistic Data), promoted by ALLC, ACH, ACL and supported by EEC, NEH, Mellon Foundation, and coordinator of the European participation

· Coordinator of the LRE supported project for American-European cooperation

· Coordinator of the NSF-ESPRIT joint project for lexical resources

· Member of the EAGLES management board

· Founder and member of the ELSNET management board

· Coordinator of the ELSNET Task Group for the reusability of resources

· Coordinator of the EAGLES corpus working group

· Member of the USA Linguistic Data Consortium Advisory Board

· Member of the CEC coordination group for lexical resources

· Invited speaker in more than 100 International Conferences

· Member of the EUROTRA ACM and CGM

· President of the Technical Task Force of the EUROTRA ACM and CGM

· Participant in the European Projects MULTILEX, ONOMASTICA, DELIS, ACQUILEX-II, MULTEXT

· Participant in the EUREKA Projects EUROLANG and GENELEX

· Member of the ELTA and DANZIN EEC strategic panel 1991 - 1992

· Chairman of the Scientific Program Committee of several Conferences (COLING-78, COLING-92, ALLC-78, etc.)

· Member of the Editorial Board of several Scientific Journals (Computers and the Humanities, ALLC Journal, Histoire et Mesure, TA, Journal of Applied Linguistics, Industries de la langue, etc.)

· Director of the Journal "Linguistica Computazionale"

· Coordinator of various projects (MLAP NERC, Esprit BRA ACQUILEX-1, LRE EAGLES, LRE RELATOR, LE PAROLE 2, International co-operation of EAGLES)

· Organizer of several specialized conferences (COLING 1973, COLING 1978, ALLC 1983, ESF 1981 Workshop on the Possibilities and Limits of the Computer in Producing and Publishing Dictionaries, EEC International Workshop "On Automating the Lexicon", 1986; International Workshops on the Multifunctional Lexicon, 1986-1987-1988, International Workshop on corpora, 1992, Council of Europe Workshops 1987-88, etc.).

Research Interests

Main Fields: Computational Linguistics; Natural Language Processing; Language Industries.

In particular: Computational lexicology and lexicography; reusability of lexical resources; text processing; analysis of multilingual corpora; lexicographical and linguistic workstation; formal grammars and parsers; machine translation; quantitative linguistics; office automation; computer-assisted language teaching, document production and analysis, multimedia literary research, etc.

23.2 Selected Publications

The Parole Project in the Context of the European Actions for Language Resources, in print, (1997).

Survey of the State of the Art in Human Language Technology, Jointly Sponsored by the European Commission and the National Science Foundation of the United States of America, Pisa, Giardini Editori, 1997, (with Varile, G.B., (Managing eds., 1997), Cole, R., Mariani, J., Uszkoreit, H., Zaenen, A., Zue, V. (Editorial Board, 1997)).

„Introduction“, in Varile, G.B., (Managing eds., 1997), Cole, R., Mariani, J., Uszkoreit, H., Zaenen, A., Zue, V. (Editorial Board, 1997).

„Introduzione“, in Ridolfi, P., Piraino, (eds., 1997): Trattamento automatico delle lingue nella società dell'informazione, Atti del Convegno, Roma, 13-14 gennaio, 1997, in stampa come numero speciale di „La Comunicazione“, Pubblicazione Trimestrale dell'Istituto Superiore PT. (eds., 1997).

„Introduction“, in ELRA Newsletter, Vol. 1, N.3 (October, 1996).

„The International Cooperation in the Field of Language Resources“, COLING-96, Copenhagen (August, 1996).

„The Expert Advisory Group on Language Engineering Standards“, in ERCIM News, 26 (July, 1996), pp. 16, (with Calzolari, N., and McNaught, J.).

„Introduction“, in ERCIM News, 26 (July, 1996), pp. 8.

(Guest Editor), „Natural Language Processing“, in ERCIM News, 26 (July, 1996).

„Introduction“, in ELRA Newsletter, Vol. 1, N.2 (July, 1996).

„Multilingual Access to Information“, La situazione italiana nel settore del Language Engineering, Main-96, Volterra (July, 1996).

„Introduction“, in ELRA Newsletter, Vol. 1, N.1 (March, 1996).

„Il Progetto PAROLE: Programma di lavoro e prospettive di disseminazione dei risultati“, Istituto di Linguistica Computazionale del CNR, Pisa (1996).

„Il Progetto Nazionale per il TAL“, Ministero PTT, Roma (1996).

„A Scientific Strategy for Computational Linguistics in Europe“, TELRI, Birmingham (March, 1996).

„Scientific Problems of Standardization“, in LE EAGLES Corpus Workshop, Madrid, (January, 1996).

„Introduction“, in „Towards a Network of European Reference Corpora: Report of the NERC Consortium Feasibility Study“, in Linguistica Computazionale XI - XII, Giardini, Pisa, (with N. Calzolari, M. Baker, and T. Kruyt (eds., 1995)).

Automating the Lexicon: Research and Practice in a Multilingual Environment, Proceedings of a Workshop held in Grosseto, Oxford University Press, Oxford, (with Calzolari, N., Walker, D. (eds., 1995)).

(The total number of publications is more than 100)

23.3 Istituto di Linguistica Computazionale del CNR (ILC)

The Pisa Group, based on the co-operation between the Istituto di Linguistica Computazionale of the CNR (National Research Council), the Dipartimento di Linguistica dell'Università and the Consorzio Pisa Ricerche, has been active in the field of Computational Linguistics since 1967, when a Division of Computational Linguistics was formed at CNUCE (Centro Nazionale Universitario di Calcolo Elettronico). In 1970 the first chair of Computational Linguistics (which is still the only one in Italy) was created. In 1978 an independent Institute of the CNR for research and development in the field of Computational Linguistics was founded. In addition, the PhD programme in the Dipartimento di Linguistica dell'Università envisages a curriculum in Computational Linguistics.

The Pisa Group is now involved in a large number of national and international projects, with 30 permanent staff members, and an almost equal number of research personnel temporarily involved in these projects. These projects range from Text Processing (concordances, indices, lemmatisation, statistical analyses, etc.), to the establishment of a Reference Corpus of the Italian Language in co-ordination with parallel initiatives on other languages, the development of large Textual Databases, the use and analysis of Machine-Readable Dictionaries, the development of large Lexical Databases (monolingual and bilingual), the study of parallel/contrastive multilingual corpora, the study and implementation of morphology for several languages, the design of formal and computational grammars and the development of parsers (in different frameworks), the study and implementation of Knowledge Representation languages and systems, the study of dialogue and of natural language interfaces, the development of large Lexical Knowledge Bases, Machine Translation, digital image processing, etc. Pisa was one of the two Eurotra sites in Italy, dealing with lexical and structural transfer, bilingual dictionaries as well as the implementation of linguistic phenomena in generation. Moreover, the ET Pisa group has performed research work on lexical semantics, sentential complementation, tense and aspects, noun arguments and translation theory.

In recent years the Pisa group has strongly promoted the concept of reusability of linguistic resources at the international level, and it co-ordinates the European participation in the Text Encoding Initiative for the standardisation of text and dictionary encoding and in the European EAGLES initiative.

The Pisa group is involved in extensive national and international co-operation with Universities, Research Institutes (private and public, national and international), Industries, etc. It is/was involved in national and international research projects (co-ordinating some of them), such as the Italian National Strategic Project on Language Industries, EUROTRA, ESPRIT Projects (Basic Research Action ACQUILEX-I and II and IDEAL, MULTILEX), ET-7, the Text Encoding Initiative, ET-10/51/63/75, the Data Collection Initiative, Survey of Linguistic Resources for NLP, NERC, ELSNET, Eureka GENELEX and EUROLANG, ESPRIT DARPA/NSF co-operation, LRE DELIS, ONOMASTICA, MULTEXT, RELATOR, CRISTAL, RENOS, EAGLES, COLSIT, LS-GRAM, MLAP PAROLE and MEMORIA, LE PAROLE, SPARKLE, EuroWordNet, TAMIC. It has research contracts with industries and it is active in the organisation of International Conferences, Workshops, Summer schools, etc.





Michal Jankowski

24.1 Curriculum Vitae

School of English

Adam Mickiewicz University

Aleja Niepodleglosci 4

61 687 Poznan, Poland

e-mail: mjank@ifa.amu.edu.pl, mjank@hum.amu.edu.pl

phone: (+48 61) 8528820

fax: (+48 61) 8523103

Principal research fields

- lexicography

- computational linguistics

Dictionary Projects

- English-Polish Computer Science Dictionary (1985 - 1990)

- English Semibilingual Dictionary for Speakers of Polish (1988 - 1990)

- A Dictionary of English Idioms (1986 - 1993)

- An English-Polish Picture Dictionary (1990 - 1991)

- Collins English-Polish and Polish-English Dictionary (1993 - )

- English-Polish Computer Science Dictionary 2nd ed. (1994 - )

- English Semibilingual Dictionary for Speakers of Polish 2nd ed. (1994)

- Collins English-Polish and Polish-English GEM Dictionary (1995 - 96)

- Collins English-Polish and Polish-English Dictionary 2nd ed. (1996 - 97)

24.2 Recent publications:

Dictionaries:

- (with J. Fisiak (ed.) et al) Collins English-Polish, Polish-English Dictionary. Two vols. Warszawa: BGW. 1996. Pp. 512+520.

- (with J.Fisiak (ed.) et al) Learner's English-Polish Dictionary. Warszawa: PWN. 1996. Pp. 920.

- (with J. Fisiak (ed.) et al) Collins Gem English-Polish, Polish-English Dictionary. Warszawa: BGW. 1996. Pp. 966.

Articles:

- "Automatising lexicography projects". In Abramowicz W. and Z. Vetulani (eds). Language and Technology. Warszawa: PLJ. 1996. 126-131. (in Polish)

- "English-Polish Dictionary of Idioms: the computing background". In: Hickey, R and S. Puppel (eds.). Language history and linguistic modelling: A festschrift for Jacek Fisiak on his 60th birthday. Berlin - New York. Mouton. 1997. 1743-1750

24.3 Profile of the organisation

The Poznan School of English is currently the largest English department in Poland. Its teaching staff numbers 124 members, with 22 professors, 13 adjunct professors, and 18 foreign lecturers. The number of students currently participating in the five year programme exceeds 900.

Among the courses offered are Linguistics, History of English, English and American Literature, Sociolinguistics, Contrastive and Applied Linguistics, Translation, Lexicography, Practical English.

The department has a large library, audio-visual centre, a computer network, and language labs.

Several publishing and lexicography projects have been carried out in the department on a continuing basis in co-operation with major publishers. The department is involved in the TEMPUS and TELRI programmes.





Alexandre Zubov

25.1 Curriculum Vitae

In 1964 I graduated from Moscow State University as a mathematician. During 1965-1968 I took a post-graduate course at Minsk State Pedagogical Institute of Foreign Languages (MSPIFL) in the speciality “Structural, Applied and Mathematical Linguistics”.

Since then I have held the positions of senior teacher, assistant professor, professor, and Head of the Computer Science and Applied Linguistics Department at Minsk State Linguistic University (the former MSPIFL).

In 1970 I received the degree of Candidate of Science in philology on the topic “Natural text processing in the ‘man-computer system’”. I defended my dissertation at the University of Leningrad.

In 1985 at Moscow Military Institute I defended my Doctoral thesis “The probabilistic-statistical model of text generation (semantic and syntactical aspects)”.

Since 1997 I have been an Academician of the International Academy of Informatization. I have published about 170 scientific works (including 7 books) on the following problems: formalization of text structure and creation of programs for text generation; computer understanding and creation of programs for information condensing of texts; probabilistic-statistical analysis of texts; machine translation; computer-assisted learning theory and creation of multimedia programs. The list of my publications relevant to the problems of text generation is enclosed. I was the managing editor of eight books of scientific works on the above-mentioned problems. Ten important commissioned projects on these problems were carried out under my supervision. I was the organizer of four international scientific conferences. Eight of my students received the degree of Candidate of Science in Computational Linguistics.

My scientific works have been published in Russia, Germany (Berlin), Italy (Bologna), Luxembourg, Poland (Poznan, Warsaw), Estonia (Tallinn, Tartu). I have taken part in scientific workshops and conferences in Russia, England, Poland, Austria, Italy, Lithuania, Estonia.

I now work as a vice-rector of Minsk State Linguistic University. At the same time I hold the position of Head of the Computer Science and Applied Linguistics Department.

25.2 List of recent publications of Alexandre Zubov

Books

Zubov A.V., Zubova I.I., Gabis A.A. The Linguistic Computer Science. Textbook. - Minsk, 1995. - 201 p. (in Russian).

Zubov A.V., Zubova I.I. Principles of Linguistic Computer Science. Part 3. Artificial Intelligence. Textbook. - Minsk, 1993. - 202 p. (in Russian).

Articles and Theses.

Zoubov Alexandre. Quantitative Methods of Estimation for Foreign Language Textbooks. - In: Journal of Quantitative Linguistics, 1997, Vol 4, Sassenheim, Swets & Zeitlinger. - 13 pages, (in print, in English).

Zubov A.V. Knowledge Base of a System of Poetic Text Generation. - In: Cognitive Linguistics at the Turn of the XX Century. Materials of International Conference. Part 1. - Minsk, 1997, p. 27-35. (in Russian).

Zubow Aleksander. Jezyk i technologia na Bialorusi. - Jezyk i Technologia. Warszawa: Akademicka Oficyna Wydawnicza PLJ, 1996, p. 56-58 (in Polish).

Zubov A.V., Zubova I.I. Modern MULTIMEDIA Technology of Foreign Languages Training and its weak points. - In: Language Nomination. Theses of International Conference. - Minsk, 1996, p. 202-204 (in Russian).

Zubov Alexander. Department of Information Science and Applied Linguistics of Minsk Linguistic University. - In: Survey of Language Engineering Organization in Central and Eastern Europe. ELSNET. University of Edinburgh, October, 1994, p. 76-77 (in English).

Zubov Alexander. Language and Technology in Belarus. - In: Language and Technology. Reports on the State of Affairs in Belarus, Bulgaria, ... Luxembourg, 1994, p. 13-14 (in English).

25.3 Brief profile of Minsk State Linguistic University

The Minsk State Linguistic University is one of the most important scientific and teaching centres in the Republic of Belarus. The university was set up in 1948 and at present includes 9 faculties and 40 departments. Its student body is over 3660 and it has a staff of 550 lecturers, including 260 candidates of science and 42 professors. More than 550 young teachers graduate from the university each year. The university is known far beyond the borders of Belarus as a major research centre. About 70 promising young specialists enter the post-graduate course annually. Scores of theses are presented to the two academic councils - doctoral dissertations in German and Romance languages or general linguistics and candidate's theses in foreign language teaching methodology. Since 1990 there have been 48 scholarly titles conferred.

New teaching technologies are being introduced. There are classes for computer-assisted learning, satellite television equipment, all kinds of video facilities, etc. Record and video libraries are constantly restocked with hundreds of new tapes, feature and educational films, videotape recordings and computer programs. Up-to-date xerox machines are available. A publishing house attached to the university regularly brings out textbooks and manuals in English, French, German and Spanish. Likewise, a great number of books and articles containing teaching materials are printed by local and CIS publishers. Many of the foreign language teachers are authors and translators of books and pamphlets concerned with Belarus' history, literature and culture.

The university has been working on problems of natural language and speech since 1966. The first research was connected with computer-aided compilation of word and word combination frequency lists from English and German texts.

The major current research areas are: implementation of computer programs for natural language training, text generation (poems, tales, advertisements, proverbs), implementation of natural language understanding systems, machine translation, and creation of programs for information condensing of texts.

Theoretical achievements:

Designing of the special semantic-syntactical intermediary language (SEMSYNT);

Designing of an algorithm for probabilistic-algorithmic text generation;

Designing of methods for engineering foreign-language training programs;

Designing of methods for estimating the lexical content of foreign language textbooks.





Anatole Shaikevich

26.2 Publications

Shaikevich, A. Distributional statistical analysis in semantics, in: Principy i metody semanticheskih issledovaniy, Moscow, Nauka, 1976.

Shaikevich, A. Hypotheses of natural classes and possibilities of numerical taxonomy in linguistics, in: Gipoteza v sovremennoy lingvistike, Moscow.

26.3 The Computer Fund of Russian Language

The Computer Fund of Russian Language (CFRL) is a research and development department within the V.V. Vinogradov Institute for Russian Language of the Russian Academy of Sciences. It was started in 1985 with a double objective:

1. serving as a nucleus for the step-by-step computerization of all departments of the Institute, and

2. serving as a computer center for studies of the Russian language in the academic community of the USSR at large.

The progressive implementation of the first objective let CFRL concentrate its efforts on the second.

The concept of CFRL crystallized in the course of discussions in the mid-1980s, attended by many dozens of academics. The CFRL as an ideal construction [Andrjuscenko, 1989] was conceived as a system of several subfunds such as:

General Russian Word List,

several dictionary data bases,

collection of texts,

a terminological data base,

an information system for Russian grammar, and

subfunds for dialectology, language history and phonetics.

Each of these subfunds is a cluster of activities in its particular field rather than a mere collection of data.

At the first stage of the development of the CFRL, special tools (named Linguistic Programmed Source Packages - LPSP) were worked out for each subfund, e.g. a program system for the Russian Dialectological Atlas and automatic concordances of colloquial Russian, of folklore, of political texts and of Russian texts of the XI-XVII centuries. The complexity of the tasks at that time can be gauged from the mere fact that Old Russian script differs radically from the present-day writing system (a difference perhaps equal to that between the Latin and Greek scripts).

Later a general-purpose package for various kinds of sources was constructed: for texts, as an automatic concordance; for dictionaries, as a dictionary data base control system; for dialectology and psycholinguistics, as a maintenance system for processing questionnaires and maps, etc. The main features of those LPSP programs are their ability to function in any computer network, to communicate with similar LPSPs and to serve as tools for the development of new computerized linguistic sources. They can be combined with many other programs such as automatic dictionaries, concordance-making programs, language processors, typesetting programs and other tools for language processing.

At the same time machine-readable linguistic sources were gathered and created. The archives of CFRL at present contain texts totalling more than 10 million word occurrences (although of very uneven linguistic value). There are various dictionaries, such as Zaliznyak's Russian grammatical dictionary, a Russian orthographical dictionary, the popular medium-size Ozhegov dictionary and others of this kind. A special feature is a formalized software system, with its own description language, for designing new dictionaries. A description of a dictionary in this language is simultaneously a data base scheme for that dictionary. Dictionary data bases are connected with automatic concordances and other automatic dictionaries, and there are tools to transform one dictionary into another.

The creation of the CFRL corpora in the 1980s was carried out in a rather haphazard way because it relied on contributions from outside. Some of those contributions could never have been produced by CFRL itself, e.g. three corpora of Russian colloquial speech (some 500 thousand words) created at Lumumba University (Moscow), Saratov University and the Institute for Russian Language, and collections of folklore texts (some 50 thousand words). Among the sources of CFRL there are even contributions from abroad. L. Loenngren kindly permitted us to use his one-million-word corpus of modern Russian [Loenngren, 1993]. Our text of Lermontov's "A Hero of Our Time" and some texts of Chekhov came from A. Barentsen of Amsterdam University.

Some collections are due to the initiative of CFRL itself. A one-million-word corpus of the journal "Kommunist" is now outdated, as political Russian has changed enormously since then. On the other hand, a more sizable collection of Russian poetic texts (XVIII-XX cc.) cannot become obsolete. At the time all the texts were keyboarded manually, which was a formidable task. Next in line was a project for a statistically representative corpus of Russian. The corpus was to be composed of 4000 text fragments of 250 words each. The planned collection was thus to be much more diversified than Loenngren's corpus (4000 fragments against 600). Chronologically the texts were to be limited to one year (1990); eventual expansion of the corpus was also considered.
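The planned design of 4000 fragments of 250 words each can be illustrated with a minimal sketch. The source does not say how fragments were to be selected, so the random selection below, the function name sample_fragments and its parameters are all assumptions made for illustration only.

```python
import random


def sample_fragments(texts, n_fragments, fragment_len, seed=0):
    """Draw fixed-length word fragments at random positions.

    Hypothetical helper: the random selection rule is an assumption,
    not the procedure planned for the 1990 corpus.
    """
    rng = random.Random(seed)
    # Split each text on whitespace and keep only texts long enough
    # to yield at least one full fragment.
    tokenized = [t.split() for t in texts]
    eligible = [words for words in tokenized if len(words) >= fragment_len]
    if not eligible:
        raise ValueError("no text is long enough for one fragment")
    fragments = []
    for _ in range(n_fragments):
        words = rng.choice(eligible)
        start = rng.randrange(len(words) - fragment_len + 1)
        fragments.append(words[start:start + fragment_len])
    return fragments


# Toy demonstration with an artificial 1000-word "text".
demo_text = " ".join(f"w{i}" for i in range(1000))
demo = sample_fragments([demo_text], n_fragments=40, fragment_len=250)
total_words = sum(len(f) for f in demo)  # 40 fragments of 250 words
```

With the figures given in the text, 4000 fragments of 250 words amount to one million words in total. In this sketch fragments may overlap and a text may be drawn more than once; a real sampling plan would presumably control for both.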

The collapse of the communist regime (with its state support of research) put CFRL into an extremely difficult situation. The lack of funding obliged us to freeze some projects half-done, such as work on the above-mentioned corpus of the Russian of 1990 or a project for a large terminological word list (a kind of information system for terminological work).

Fortunately, CFRL lived through the low tide of 1993-94 and is trying to adapt to the present slow recovery of research support. The diversification of funding sources (Soros Foundation, Central European University, Russian Fundamental Research Foundation and Russian Foundation for Research in Humanities) has to a great extent superseded the former bureaucratic, impersonal and rather conservative system of research funding.

However, the list of priorities of those funding organizations has changed radically. Nobody will support long-term projects; the general trend now is towards two very different directions: 1) computerization of research in Russia and its integration into world and European networks, and 2) short-term projects that are impressive from the point of view of an academic expert. This explains the present research strategy of CFRL.

On the one hand, there are plans (and some money for their implementation) for adapting CFRL software and data to international systems. CFRL now has a page on the Internet (WWW: http://cfrl0.cfrl.synapse.ru and FTP: cfrl2.cfrl.synapse.ru) and a subsidiary file server in Mannheim (at the Institute for German Language). We at CFRL are strong adherents of the idea of free circulation and exchange of electronic texts. We do not expect any money from this activity, but would welcome cooperation and electronic texts (not only Russian texts) from abroad.

On the other hand, we have definite plans for extending our corpora. The advent of scanners has greatly facilitated in-house input of full texts. This, together with the changed climate of grant-giving, has brought about certain changes of priorities. Our main concern now is the extension of the corpus of Russian literary texts of the XIX century.

In the framework of Yu.N. Karaulov's project "Dostoevski's Dictionary" all literary texts of the writer have been entered into the computer. A by-product of this project is "Dostoevski's frequency dictionary" (1.8 million words). Another important project is the "Distributional statistical atlas of the classic Russian novel", which includes 30 novels covering the period from 1855 to 1880. Those texts are already in the computer (some 3 million words).

Of course, such corpora cannot serve as a textual foundation for sophisticated descriptions of the present-day language, adaptable to the demands of teaching Russian as a second language. Good samples of the Russian of today would do the job much better.

However, our present reliance on literary full-text corpora has a justification of its own. Without trying to make a virtue of necessity, some positive aspects of this approach should be stressed. The results of linguistic processing of such corpora shed light both on the general system of the language and on the particular system exemplified by the corpus under study. The analysis of the corpus of novels, for example, should lead to the discovery of functional reasons for using this or that word in the peculiar context of the novel as a literary genre. Multiple reclassification of subtexts makes possible the analysis of some stylistic features which are hard to catch in a corpus made of samples. E.g. regrouping the texts of our corpus into microgenres (such as dialogues, monologues, tales by the characters, their letters, remarks, author's text, etc.) would enrich the lexicographic information accompanying word entries of the planned atlas. Another reclassification of the corpus into texts by particular literary characters would further enhance the discriminating power of the analysis, adding a social dimension to the resulting picture.

This second direction of our research implies the elaboration of a complex software system for a new kind of text processing aimed at discovering hidden structures lying behind the text. This methodological approach (called "distributional statistical analysis of texts") was worked out back in the 1970s [Shaikevich, 1976, 1979], when it was tested manually on some paper concordances, especially on Spevack's computer-made concordance to Shakespeare [Spevack, 1967].

The method is based on the systematic comparison of the real frequencies of text events with their mathematical expectation, calculated on the basis of some a priori probability. The text events are occurrences and co-occurrences of words (or any other symbol strings). In its extreme variant this language-independent method is quite formal, i.e. it does not imply prior knowledge of the language or of the text contents. The only thing needed is a large text corpus with discrete symbol strings. The results of the analysis are: a morphological description of the language in terms of paradigms and word (or morpheme) classes, semantic links between words, semantic fields, various semantic and stylistic classes of words, and classes of texts and text fragments.
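The comparison of real frequencies with their mathematical expectation can be sketched as follows. This is only an illustration under stated assumptions: the source does not specify the a priori probability or the test statistic, so the sketch assumes independence of the two words across text fragments and a standardized deviation (observed - expected) / sqrt(expected). The function name cooccurrence_deviations and the toy data are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_deviations(fragments):
    """Compare observed co-occurrence counts with their expectation.

    fragments: list of token lists (any discrete symbol strings).
    Assumption: under the a priori model the two words occur in
    fragments independently, so the expected number of fragments
    containing both is count(a) * count(b) / n.
    """
    n = len(fragments)
    word_count = Counter()   # number of fragments containing each word
    pair_count = Counter()   # number of fragments containing both words
    for tokens in fragments:
        types = sorted(set(tokens))
        word_count.update(types)
        pair_count.update(combinations(types, 2))
    scores = {}
    for (a, b), observed in pair_count.items():
        expected = word_count[a] * word_count[b] / n
        # Standardized deviation; one possible choice of statistic.
        deviation = (observed - expected) / math.sqrt(expected)
        scores[(a, b)] = (observed, expected, deviation)
    return scores


# Toy data: five tiny "fragments" of whitespace-separated strings.
toy = [
    "king crown".split(),
    "king crown".split(),
    "sea ship".split(),
    "sea ship".split(),
    "king sea".split(),
]
scores = cooccurrence_deviations(toy)
for pair, (obs, exp, dev) in sorted(scores.items(), key=lambda kv: -kv[1][2]):
    print(pair, obs, round(exp, 2), round(dev, 2))
```

Pairs with a large positive deviation co-occur more often than the independence assumption predicts and are candidates for a semantic link. Nothing in the computation depends on knowing the language, which matches the formal character of the method described above.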

An interesting feature of the analysis is that it retains some peculiarities of the source text, thus stopping halfway between the level of generality of a linguist and that of a philologist, historian, political scientist or information scientist. Now that we have large corpora of texts at our disposal, we feel it urgent to turn the method into a working computer system, so that a new tool for the humanities and information science can be created.





Iordan Penchev

27.1 Curriculum Vitae

- 12.02.1931 born in Sofia

- 1955 MA in Bulgarian philology, St Ohridski, University of Sofia

- 1979 DrLitt, Bulgarian Academy of Sciences

- 1984 Senior Research Associate, I rank, BASc

- 1978-1987 Head of Department of Applied Linguistics, Institute for Bulgarian Language

- 1987-1993 Head of Language Models Sector at the Linguistic Modelling Laboratory, CICT, BASc

- 1994 - present Head of Research Group for Formal Linguistics, Institute of Bulgarian Language, BASc

- Fields of research: syntax, semantics, grammatical categories

- Membership in professional bodies: Union of Scientists in Bulgaria

- Foreign languages: German, English, Russian

27.2 Recent Publications

In co-authorship: Iordan Penchev, Aneta Dineva, Maria Stambolieva:

- Savremenni gramaticni modeli / Modern Grammatical Models /. Plovdiv University Publishing House.

- Kompjutar i estestven ezik / Computer and Natural Language /. Publication of CICT, Bulgarian Academy of Sciences, Sofia, 1987.

Iordan Penchev:

- "Kam vaprosa za vremenata v balgarskiya ezik" / Towards the Definition of the Bulgarian Tenses /. Balgarski ezik N2, 1967.

- "Refleksivnite, medialnite i pasivnite izrecenija v balgarskija ezik" / Reflexive, medial and passive sentences in Bulgarian /. Izvestija na instituta za balgarski ezik 21, 1972.

- "The Conjunctions DA and ZA DA ("in order to") in Standard Bulgarian". International Journal of Slavic Linguistics and Poetics 25-26, 1982.

- Stroezh na balgarsko izrecenie / The Structure of the Bulgarian Sentence /. Sofia: Nauka i izkustvo, 1984.

27.3 Site Profile

CICT is a research institute within the framework of the Bulgarian Academy of Sciences comprising four basic laboratories, one of which is the Language Modelling Lab (LML).

The main institutional goals and mandates are to develop research in high performance computer systems, computer networks, signal and data processing, language technologies and information retrieval, and to promote international cooperation in these areas of research.

