TELRI
Trans-European Language Resources Infrastructure - II


The Structure of Bilingual Lexicon Entry Viewed by DICTATOR (A Software Tool for Dictionary Annotation, Compilation and Upgrade)

Elena Paskaleva
Linguistic Modeling Department
Bulgarian Academy of Sciences
Sofia, Bulgaria
e-mail: hellen@lml.bas.bg

The tool DICTATOR, created in the Linguistic Modeling Department, was designed to compile a bilingual (English-Bulgarian) dictionary in order to strengthen and refine the multilingual processing of various parallel text collections (most often based on a data-driven approach). While the tool was being built, problems of general lexicographical interest and of the standardization of dictionary presentation came to light and raised many questions of an ontological character: types of lexical entries, types of dictionaries, the applicability of TEI standards and many other problems of presenting linguistic knowledge. Together they provide the well-known "added value" of computer modeling of linguistic phenomena, in our case the compilation and organization of bilingual dictionaries.

The combined DB-TEI approach - sources and uses

Three kinds of data have to be handled and combined:

  • the information in a dictionary entry, which may be bilingual, explanatory or intermediate (semi-bilingual);
  • the requirements that a database (DB) organization imposes on a system working with these data;
  • the TEI guidelines, in some ‘core’ (minimally obligatory) subset.
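
As an illustration only, the three kinds of data listed above can be sketched in Python: a record-like entry of the sort a DB would store, carrying translations (bilingual), a definition (explanatory) or both (semi-bilingual), and serialized to a small TEI-style subset. This is not DICTATOR's actual schema. The class names `Entry` and `Sense`, the function `entry_to_tei` and the sample words are hypothetical, and the element names (`entry`, `form`, `orth`, `gramGrp`, `pos`, `sense`, `def`, `cit`, `quote`) are assumed from the dictionary module of the current TEI P5 guidelines, which may differ from the TEI version available when the tool was built.

```python
from dataclasses import dataclass, field
from typing import List, Optional
import xml.etree.ElementTree as ET


@dataclass
class Sense:
    # Hypothetical record: translations make the sense "bilingual",
    # a definition makes it "explanatory", both together "semi-bilingual".
    translations: List[str] = field(default_factory=list)
    definition: Optional[str] = None


@dataclass
class Entry:
    # Hypothetical DB-style record for one headword.
    headword: str
    pos: str
    senses: List[Sense] = field(default_factory=list)


def entry_to_tei(entry: Entry, target_lang: str = "bg") -> ET.Element:
    """Serialize an Entry to an assumed minimal TEI-style <entry> element."""
    root = ET.Element("entry")
    form = ET.SubElement(root, "form")
    ET.SubElement(form, "orth").text = entry.headword
    gram = ET.SubElement(root, "gramGrp")
    ET.SubElement(gram, "pos").text = entry.pos
    for number, sense in enumerate(entry.senses, start=1):
        sense_el = ET.SubElement(root, "sense", {"n": str(number)})
        if sense.definition is not None:
            ET.SubElement(sense_el, "def").text = sense.definition
        for translation in sense.translations:
            cit = ET.SubElement(
                sense_el, "cit", {"type": "translation", "xml:lang": target_lang}
            )
            ET.SubElement(cit, "quote").text = translation
    return root


if __name__ == "__main__":
    # Illustrative data, not taken from any real dictionary.
    sample = Entry("book", "noun", [Sense(["книга"], "a written work")])
    print(ET.tostring(entry_to_tei(sample), encoding="unicode"))
```

Keeping the record structure separate from the serialization is one way to let the same stored entry serve both the DB side and the TEI side; other designs are equally possible.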

The sources of these data can be:

  • annotated dictionaries (the best and the rarest case);
  • manual entry of the data (made as easy as possible by the system interface);
  • a collection of translational equivalents with contextual information, obtained from a system for processing parallel corpora (in DICTATOR's case, the results of the search procedure of MARK ALISTER, an aligning tool).
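
To make the third source more concrete, the following Python sketch groups translational equivalents and their contexts by source word, which is one plausible first step towards dictionary-like entries. The output format of MARK ALISTER is not described here, so the `(source word, equivalent, context)` tuple layout, the function name `group_equivalents` and the sample data are assumptions made purely for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical search hit: (English word, Bulgarian equivalent, context sentence).
Hit = Tuple[str, str, str]


def group_equivalents(hits: Iterable[Hit]) -> Dict[str, Dict[str, List[str]]]:
    """Group hits as source word -> equivalent -> list of contexts."""
    grouped: Dict[str, Dict[str, List[str]]] = defaultdict(lambda: defaultdict(list))
    for source, target, context in hits:
        grouped[source][target].append(context)
    # Convert the nested defaultdicts to plain dicts for a stable result type.
    return {source: dict(targets) for source, targets in grouped.items()}


if __name__ == "__main__":
    # Illustrative data only.
    hits = [
        ("bank", "банка", "She went to the bank."),
        ("bank", "бряг", "They sat on the river bank."),
        ("bank", "банка", "The bank was closed."),
    ]
    for source, targets in group_equivalents(hits).items():
        for target, contexts in targets.items():
            print(source, target, len(contexts))
```

Each distinct equivalent of a source word could then become a candidate sense of the entry, with the collected contexts kept as supporting examples.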

Applications

Besides its general purpose, the creation of a dictionary-like base of translational equivalents, the tool can be used for tasks at a higher level of generalization: comparing and evaluating different dictionaries, and assessing how well existing annotation standards cover existing models and data.



Back to Newsletter no. 9.

© TELRI, 19.11.1999