MoverScore: Text Generation Evaluating with Contextualized Embeddings and Earth Mover Distance

Zhao, Wei ; Peyrard, Maxime ; Liu, Fei ; Gao, Yang ; Meyer, Christian M. ; Eger, Steffen (2019)
MoverScore: Text Generation Evaluating with Contextualized Embeddings and Earth Mover Distance.
The 2019 Conference on Empirical Methods in Natural Language Processing. Hong Kong, China (03.11.2019-07.11.2019)
doi: 10.18653/v1/D19-1053
Conference publication, bibliography

This is the latest version of this entry.

Abstract

A robust evaluation metric has a profound impact on the development of text generation systems. A desirable metric compares system output against references based on their semantics rather than surface forms. In this paper, we investigate strategies to encode system and reference texts to devise a metric that shows a high correlation with human judgment of text quality. We validate our new metric, namely MoverScore, on a number of text generation tasks including summarization, machine translation, image captioning, and data-to-text generation, where the outputs are produced by a variety of neural and non-neural systems. Our findings suggest that metrics combining contextualized representations with a distance measure perform best. Such metrics also demonstrate strong generalization capability across tasks. For ease of use, we make our metrics available as a web service.

Item type: Conference publication
Published: 2019
Author(s): Zhao, Wei ; Peyrard, Maxime ; Liu, Fei ; Gao, Yang ; Meyer, Christian M. ; Eger, Steffen
Entry type: Bibliography
Title: MoverScore: Text Generation Evaluating with Contextualized Embeddings and Earth Mover Distance
Language: English
Publication date: 14 August 2019
Place: Hong Kong, China
Publisher: Association for Computational Linguistics
Book title: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
Event title: The 2019 Conference on Empirical Methods in Natural Language Processing
Event location: Hong Kong, China
Event dates: 03.11.2019-07.11.2019
DOI: 10.18653/v1/D19-1053
Department(s)/field(s): 20 Department of Computer Science
20 Department of Computer Science > Ubiquitous Knowledge Processing
DFG Research Training Groups
DFG Research Training Groups > Research Training Group 1994 Adaptive Preparation of Information from Heterogeneous Sources
Date deposited: 11 Sep 2019 12:13
Last modified: 29 May 2024 07:52