Worth its Weight in Gold or Yet Another Resource – A Comparative Study of Wiktionary, OpenThesaurus and GermaNet

Meyer, Christian M. ; Gurevych, Iryna
Ed.: Gelbukh, Alexander (2010)
Worth its Weight in Gold or Yet Another Resource – A Comparative Study of Wiktionary, OpenThesaurus and GermaNet.
In: Computational Linguistics and Intelligent Text Processing: Proceedings of the 11th International Conference
doi: 10.1007/978-3-642-12116-6_4
Book chapter, Bibliography

Abstract

In this paper, we analyze the topology and the content of a range of lexical semantic resources for the German language constructed either in a controlled (GermaNet), semi-controlled (OpenThesaurus), or collaborative, i.e., community-based, manner (Wiktionary). For the first time, the comparison of the corresponding resources is performed at the word sense level. For this purpose, the word senses of terms are automatically disambiguated in Wiktionary and the content of all resources is converted to a uniform representation. We show that the resources' topology is well comparable as they share the small world property and contain a comparable number of entries, although differences in their connectivity exist. Our study of content-related properties reveals that the German Wiktionary has a different distribution of word senses and contains more polysemous entries than both other resources. We identify that each resource contains the highest number of a particular type of semantic relation. We finally increase the number of relations in Wiktionary by considering symmetric and inverse relations that have been found to be usually absent in this resource.

Item type: Book chapter
Published: 2010
Editor: Gelbukh, Alexander
Author(s): Meyer, Christian M. ; Gurevych, Iryna
Entry type: Bibliography
Title: Worth its Weight in Gold or Yet Another Resource – A Comparative Study of Wiktionary, OpenThesaurus and GermaNet
Language: English
Date of publication: March 2010
Place of publication: Berlin/Heidelberg
Publisher: Springer
Book title: Computational Linguistics and Intelligent Text Processing: Proceedings of the 11th International Conference
Series: Lecture Notes in Computer Science
Series volume: 6008
Event location: Iaşi, Romania
DOI: 10.1007/978-3-642-12116-6_4
URL / URN: https://link.springer.com/chapter/10.1007%2F978-3-642-12116-...
Uncontrolled keywords: UKP_a_NLP4Wikis;UKP_p_EduWeb;reviewed
ID number: TUD-CS-2010-0012
Department(s)/field(s): 20 Department of Computer Science
20 Department of Computer Science > Ubiquitous Knowledge Processing
Date deposited: 31 Dec 2016 14:29
Last modified: 24 Jan 2020 12:03