
Language specific applications : 5.2. Application to German

Steiner, Petra ; Lemnitzer, Lothar (1995)
Language specific applications : 5.2. Application to German.
Report, Bibliography

Abstract

The work carried out in this task aims at formulating harmonized specifications and at proposing a notation for the lexica and the tagsets, to be contributed by each language group involved in the MULTEXT Project.

MULTEXT's general aim is to develop tools for corpus annotation that contribute to the standardization of this kind of work in academic and industrial environments. These tools will be provided with resources from six different languages to ensure their validity. The resources used to feed the tools include lexical lists for the six languages, containing the information necessary to run the tools. The tools that will use lexica are mainly those that perform morphological analysis and generation, and lexical lookup tools. MULTEXT proposes to deliver a morphological tool together with basic morphological rules and a number of base-form entries, duly coded with respect to the rules. The morphological tool is intended to expand these base forms into word-form lists with corresponding morphosyntactic information. These word forms will, in turn, be used by the tagger, provided that a correspondence is defined between the morphosyntactic information and the tags used by the tagger. The morphological tool must guarantee the extensibility of the MULTEXT tools, as it is intended to be used by end users to enlarge the lexical material handled by the tools. It is also expected that morphological analysis will be able to "guess" at least the category of unknown words and, where possible, their morphosyntactic features. Within MULTEXT, therefore, "lexical list" refers to a list of forms with related information: both base-form lexica, coded in such a way as to feed the morphological tool, and word-form lexica, containing the information relevant for corpus annotation.
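The pipeline described in the abstract (base forms expanded by rules into word forms, morphosyntactic information mapped to tagger tags, and a guess for unknown words) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the paradigm names, the feature attributes, the tag strings, and the guessing heuristic are hypothetical and are not taken from the MULTEXT specifications. The toy noun paradigm also ignores case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WordForm:
    form: str        # inflected surface form
    lemma: str       # base form the entry was generated from
    features: tuple  # sorted (attribute, value) pairs


# Hypothetical toy paradigms: name -> list of (suffix, features).
# Real German paradigms are far richer (case, gender, umlaut, ...).
PARADIGMS = {
    "noun_s": [
        ("", {"cat": "noun", "number": "sg"}),
        ("s", {"cat": "noun", "number": "pl"}),
    ],
    "verb_weak_present": [
        ("e", {"cat": "verb", "person": "1", "number": "sg"}),
        ("st", {"cat": "verb", "person": "2", "number": "sg"}),
        ("t", {"cat": "verb", "person": "3", "number": "sg"}),
    ],
}

# Base-form lexicon, coded with respect to the rules: (stem, lemma, paradigm).
BASE_FORMS = [
    ("Auto", "Auto", "noun_s"),
    ("sag", "sagen", "verb_weak_present"),
]


def expand(base_forms, paradigms):
    """Expand base-form entries into a word-form list with features."""
    word_forms = []
    for stem, lemma, paradigm in base_forms:
        for suffix, feats in paradigms[paradigm]:
            word_forms.append(
                WordForm(stem + suffix, lemma, tuple(sorted(feats.items())))
            )
    return word_forms


def to_tag(features):
    """Map morphosyntactic features to a tag (hypothetical tagset)."""
    feats = dict(features)
    if feats["cat"] == "noun":
        return "N-" + feats["number"]
    return "V-" + feats["person"] + feats["number"]


def guess_category(token):
    """Crude category guess for an unknown token (illustrative heuristic)."""
    if token[:1].isupper():
        return "noun"  # German nouns are capitalized
    if token.endswith("en"):
        return "verb"  # typical infinitive ending
    return "unknown"


if __name__ == "__main__":
    for wf in expand(BASE_FORMS, PARADIGMS):
        print(wf.form, wf.lemma, to_tag(wf.features))
    print(guess_category("Haus"), guess_category("laufen"))
```

Keeping the feature-to-tag mapping in a separate function mirrors the abstract's requirement that the correspondence between morphosyntactic information and tagger tags be defined explicitly, rather than being baked into the lexicon entries.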

Item type: Report
Published: 1995
Author(s): Steiner, Petra ; Lemnitzer, Lothar
Entry type: Bibliography
Title: Language specific applications : 5.2. Application to German
Language: German
Publication date: March 1995
Series: MULTEXT Project

Additional information:

Common Specifications and Notation for Lexicon Encoding and Preliminary Proposal for the Tagsets

Department(s)/field(s): Central Facilities
Central Facilities > Universitäts- und Landesbibliothek (ULB)
Date deposited: 22 Jun 2023 12:42
Last modified: 22 Jun 2023 12:42