Matuschek, Michael ; Gurevych, Iryna
Eds.: Tsujii, Junichi ; Hajic, Jan (2014)
High Performance Word Sense Alignment by Joint Modeling of Sense Distance and Gloss Similarity.
Dublin, Ireland
Conference publication, Bibliography
Abstract
In this paper, we present a machine learning approach for word sense alignment (WSA) which combines distances between senses in the graph representations of lexical-semantic resources with gloss similarities. In this way, we significantly outperform the state of the art on each of the four datasets we consider. Moreover, we present two novel datasets for WSA between Wiktionary and Wikipedia in English and German. The latter dataset is not only of unprecedented size, but also created by the large community of Wiktionary editors instead of expert annotators, making it an interesting subject of study in its own right as the first crowdsourced WSA dataset. We will make both datasets freely available along with our computed alignments.
Item type: | Conference publication |
---|---|
Published: | 2014 |
Editors: | Tsujii, Junichi ; Hajic, Jan |
Author(s): | Matuschek, Michael ; Gurevych, Iryna |
Type of entry: | Bibliography |
Title: | High Performance Word Sense Alignment by Joint Modeling of Sense Distance and Gloss Similarity |
Language: | English |
Year of publication: | August 2014 |
Publisher: | Dublin City University and Association for Computational Linguistics |
Book title: | Proceedings of the 25th International Conference on Computational Linguistics (COLING 2014) |
Event location: | Dublin, Ireland |
URL / URN: | http://www.aclweb.org/anthology/C14-1025 |
Uncontrolled keywords: | UKP_reviewed;UKP_p_UBY;UKP_a_LangTech4eHum |
ID number: | TUD-CS-2014-0113 |
Department(s)/field(s): | 20 Department of Computer Science; 20 Department of Computer Science > Ubiquitous Knowledge Processing |
Date deposited: | 31 Dec 2016 14:29 |
Last modified: | 24 Jan 2020 12:03 |