Steiner, Petra ; Lemnitzer, Lothar (1995)
Language specific applications : 5.2. Application to German.
Report, Bibliography
Abstract
The work carried out in this task aims at formulating harmonized specifications and at proposing a notation for the lexica and the tagsets, to be contributed by each language group involved in the MULTEXT Project.
MULTEXT's general aim is to develop tools for corpus annotation which contribute to the standardization of this kind of work in academic and industrial environments. These tools will be provided with resources from six different languages to ensure their validity. Resources used to feed the tools are, among others, lexical lists for the six languages, containing the information necessary to run the tools. Tools that will use lexica are mainly those which perform morphological analysis and generation, and lexical lookup tools. MULTEXT proposes to deliver a morphological tool together with basic morphological rules and a number of base-form entries, duly coded with respect to the rules. The morphological tool is intended to expand these base forms into word-form lists, with corresponding morphosyntactic information. These word-forms will, in turn, be used for the tagger, provided that a correspondence between the morphosyntactic information and the tags to be used by the tagger is defined. The morphological tool must guarantee extensibility of the MULTEXT tools, as it is intended to be used by end-users to enlarge the lexical material treated by the tools. It is also expected that morphological analysis will be able to make a "guess" on at least the category of unknown words and, where possible, on morphosyntactic features. Within MULTEXT, therefore, "lexical list" refers to a list of forms with related information: both to base-form lexica, coded in such a way as to feed the morphological tool, and to the word-form lexica, containing relevant information for corpus annotation purposes.
Item type: | Report |
---|---|
Published: | 1995 |
Author(s): | Steiner, Petra ; Lemnitzer, Lothar |
Type of entry: | Bibliography |
Title: | Language specific applications : 5.2. Application to German |
Language: | German |
Date of publication: | March 1995 |
Series: | MULTEXT Project |
Additional information: | Common Specifications and Notation for Lexicon Encoding and Preliminary Proposal for the Tagsets |
Divisions: | Central Facilities; Central Facilities > Universitäts- und Landesbibliothek (ULB) |
Date deposited: | 22 Jun 2023 12:42 |
Last modified: | 22 Jun 2023 12:42 |