A Reflective View on Text Similarity

Bär, Daniel and Zesch, Torsten and Gurevych, Iryna (2011):
A Reflective View on Text Similarity.
In: Proceedings of the International Conference on Recent Advances in Natural Language Processing, Hissar, Bulgaria, pp. 515-520, [Online-Edition: http://www.aclweb.org/anthology/R11-1071]
[Conference or Workshop Item]

Abstract

While the concept of similarity is well grounded in psychology, text similarity is less well-defined. Thus, we analyze text similarity with respect to its definition and the datasets used for evaluation. We formalize text similarity based on the geometric model of conceptual spaces along three dimensions inherent to texts: structure, style, and content. We empirically ground these dimensions in a set of annotation studies, and categorize applications according to these dimensions. Furthermore, we analyze the characteristics of the existing evaluation datasets, and use those datasets to assess the performance of common text similarity measures.

Item Type: Conference or Workshop Item
Published: 2011
Creators: Bär, Daniel and Zesch, Torsten and Gurevych, Iryna
Title: A Reflective View on Text Similarity
Language: English
Title of Book: Proceedings of the International Conference on Recent Advances in Natural Language Processing
Uncontrolled Keywords: UKP_a_NLP4Wikis;UKP_p_WIKULU;UKP_s_DKPro_Similarity
Divisions: 20 Department of Computer Science
20 Department of Computer Science > Ubiquitous Knowledge Processing
Event Location: Hissar, Bulgaria
Date Deposited: 31 Dec 2016 14:29
Official URL: http://www.aclweb.org/anthology/R11-1071
Identification Number: TUD-CS-2011-0189