A Reflective View on Text Similarity

Bär, Daniel ; Zesch, Torsten ; Gurevych, Iryna (2011)
A Reflective View on Text Similarity.
Hissar, Bulgaria
Conference publication, bibliography

Abstract

While the concept of similarity is well grounded in psychology, text similarity is less well-defined. Thus, we analyze text similarity with respect to its definition and the datasets used for evaluation. We formalize text similarity based on the geometric model of conceptual spaces along three dimensions inherent to texts: structure, style, and content. We empirically ground these dimensions in a set of annotation studies, and categorize applications according to these dimensions. Furthermore, we analyze the characteristics of the existing evaluation datasets, and use those datasets to assess the performance of common text similarity measures.

Item type: Conference publication
Published: 2011
Author(s): Bär, Daniel ; Zesch, Torsten ; Gurevych, Iryna
Entry type: Bibliography
Title: A Reflective View on Text Similarity
Language: English
Publication date: September 2011
Book title: Proceedings of the International Conference on Recent Advances in Natural Language Processing
Event location: Hissar, Bulgaria
URL / URN: http://www.aclweb.org/anthology/R11-1071

Uncontrolled keywords: UKP_a_NLP4Wikis;UKP_p_WIKULU;UKP_s_DKPro_Similarity
ID number: TUD-CS-2011-0189
Department(s)/field(s): 20 Department of Computer Science
20 Department of Computer Science > Ubiquitous Knowledge Processing
Date deposited: 31 Dec 2016 14:29
Last modified: 24 Jan 2020 12:03