High Performance Word Sense Alignment by Joint Modeling of Sense Distance and Gloss Similarity

Matuschek, Michael and Gurevych, Iryna
Tsujii, Junichi and Hajic, Jan (eds.) (2014):
High Performance Word Sense Alignment by Joint Modeling of Sense Distance and Gloss Similarity.
In: Proceedings of the 25th International Conference on Computational Linguistics (COLING 2014), Dublin City University and Association for Computational Linguistics, Dublin, Ireland, [Online-Edition: http://www.aclweb.org/anthology/C14-1025],
[Conference or Workshop Item]

Abstract

In this paper, we present a machine learning approach for word sense alignment (WSA) which combines distances between senses in the graph representations of lexical-semantic resources with gloss similarities. In this way, we significantly outperform the state of the art on each of the four datasets we consider. Moreover, we present two novel datasets for WSA between Wiktionary and Wikipedia in English and German. The latter dataset is not only of unprecedented size, but also created by the large community of Wiktionary editors instead of expert annotators, making it an interesting subject of study in its own right as the first crowdsourced WSA dataset. We will make both datasets freely available along with our computed alignments.

Item Type: Conference or Workshop Item
Published: 2014
Editors: Tsujii, Junichi and Hajic, Jan
Creators: Matuschek, Michael and Gurevych, Iryna
Title: High Performance Word Sense Alignment by Joint Modeling of Sense Distance and Gloss Similarity
Language: English
Title of Book: Proceedings of the 25th International Conference on Computational Linguistics (COLING 2014)
Publisher: Dublin City University and Association for Computational Linguistics
Uncontrolled Keywords: UKP_reviewed;UKP_p_UBY;UKP_a_LangTech4eHum
Divisions: 20 Department of Computer Science; 20 Department of Computer Science > Ubiquitous Knowledge Processing
Event Location: Dublin, Ireland
Date Deposited: 31 Dec 2016 14:29
Official URL: http://www.aclweb.org/anthology/C14-1025
Identification Number: TUD-CS-2014-0113