Rücklé, Andreas ; Eger, Steffen ; Peyrard, Maxime ; Gurevych, Iryna (2018)
Concatenated Power Mean Word Embeddings as Universal Cross-Lingual Sentence Representations.
In: arXiv:1803.01400
Article, Bibliography
Abstract
Average word embeddings are a common baseline for more sophisticated sentence embedding techniques. However, they typically fall short of the performances of more complex models such as InferSent. Here, we generalize the concept of average word embeddings to power mean word embeddings. We show that the concatenation of different types of power mean word embeddings considerably closes the gap to state-of-the-art methods monolingually and substantially outperforms these more complex techniques cross-lingually. In addition, our proposed method outperforms different recently proposed baselines such as SIF and Sent2Vec by a solid margin, thus constituting a much harder-to-beat monolingual baseline.
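As a rough illustration of the idea in the abstract, the following Python sketch computes element-wise power means over a sentence's word vectors and concatenates them. It assumes the standard power mean definition, (1/n · Σ x_i^p)^(1/p), where p = 1 gives the ordinary average and p = ±∞ gives the element-wise maximum and minimum. The function names, the default choice of p values, and the toy vectors are illustrative assumptions and are not taken from this record or from the authors' code.

```python
import math
from typing import List, Sequence

Vector = List[float]


def power_mean(vectors: Sequence[Vector], p: float) -> Vector:
    """Element-wise power mean over equal-length word vectors."""
    if not vectors or any(len(v) != len(vectors[0]) for v in vectors):
        raise ValueError("need a non-empty list of equal-length vectors")
    columns = list(zip(*vectors))  # one tuple per embedding dimension
    if p == math.inf:
        return [max(col) for col in columns]
    if p == -math.inf:
        return [min(col) for col in columns]
    if p != int(p) or p < 1 or int(p) % 2 == 0:
        # Assumption of this sketch: embeddings may contain negative values,
        # so only odd positive integer powers are supported.
        raise ValueError("p must be +/-inf or an odd positive integer")
    means = [sum(x ** int(p) for x in col) / len(col) for col in columns]
    # Signed p-th root, so a negative mean maps back to a negative value.
    return [math.copysign(abs(m) ** (1.0 / p), m) for m in means]


def sentence_embedding(
    vectors: Sequence[Vector],
    ps: Sequence[float] = (-math.inf, 1, math.inf),
) -> Vector:
    """Concatenate the power means for each p into one sentence vector."""
    out: Vector = []
    for p in ps:
        out.extend(power_mean(vectors, p))
    return out


# Hypothetical 2-dimensional word vectors for a two-word sentence.
toy_sentence = [[1.0, -2.0], [3.0, 4.0]]
print(sentence_embedding(toy_sentence))
```

With three values of p, the resulting sentence vector has three times the dimensionality of the word embeddings, so the representation grows with each additional power mean that is concatenated.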
Item type: | Article |
---|---|
Published: | 2018 |
Author(s): | Rücklé, Andreas ; Eger, Steffen ; Peyrard, Maxime ; Gurevych, Iryna |
Entry type: | Bibliography |
Title: | Concatenated Power Mean Word Embeddings as Universal Cross-Lingual Sentence Representations |
Language: | English |
Publication date: | March 2018 |
Journal, newspaper, or series title: | arXiv:1803.01400 |
URL / URN: | https://arxiv.org/abs/1803.01400 |
Uncontrolled keywords: | UKP_p_QAEduInf;AIPHES_area_b2 |
ID number: | TUD-CS-2018-0050 |
Department(s)/field(s): | 20 Department of Computer Science; 20 Department of Computer Science > Ubiquitous Knowledge Processing; DFG Research Training Groups; DFG Research Training Groups > Research Training Group 1994 Adaptive Preparation of Information from Heterogeneous Sources |
Date deposited: | 06 Mar 2018 08:34 |
Last modified: | 24 Jan 2020 12:03 |