Bär, Daniel ; Zesch, Torsten ; Gurevych, Iryna (2011)
A Reflective View on Text Similarity.
Hissar, Bulgaria
Conference publication, Bibliography
Abstract
While the concept of similarity is well grounded in psychology, text similarity is less well-defined. Thus, we analyze text similarity with respect to its definition and the datasets used for evaluation. We formalize text similarity based on the geometric model of conceptual spaces along three dimensions inherent to texts: structure, style, and content. We empirically ground these dimensions in a set of annotation studies, and categorize applications according to these dimensions. Furthermore, we analyze the characteristics of the existing evaluation datasets, and use those datasets to assess the performance of common text similarity measures.
Item type: | Conference publication |
---|---|
Published: | 2011 |
Author(s): | Bär, Daniel ; Zesch, Torsten ; Gurevych, Iryna |
Type of entry: | Bibliography |
Title: | A Reflective View on Text Similarity |
Language: | English |
Date of publication: | September 2011 |
Book title: | Proceedings of the International Conference on Recent Advances in Natural Language Processing |
Event location: | Hissar, Bulgaria |
URL / URN: | http://www.aclweb.org/anthology/R11-1071 |
Uncontrolled keywords: | UKP_a_NLP4Wikis;UKP_p_WIKULU;UKP_s_DKPro_Similarity |
ID number: | TUD-CS-2011-0189 |
Department(s)/Field(s): | 20 Department of Computer Science; 20 Department of Computer Science > Ubiquitous Knowledge Processing |
Date deposited: | 31 Dec 2016 14:29 |
Last modified: | 24 Jan 2020 12:03 |