Bridging the gap between extractive and abstractive summaries: Creation and evaluation of coherent extracts from heterogeneous sources

Benikova, Darina and Mieskes, Margot and Meyer, Christian M. and Gurevych, Iryna (2016):
Bridging the gap between extractive and abstractive summaries: Creation and evaluation of coherent extracts from heterogeneous sources.
In: Proceedings of the 26th International Conference on Computational Linguistics (COLING), Osaka, Japan, ISBN 978-4-87974-702-0,
[Online-Edition: http://aclweb.org/anthology/C16-1099],
[Conference or Workshop Item]

Abstract

In our work, we present a corpus of heterogeneous documents for summarization to address the issue that information seekers usually face a range of different types of information sources. A second issue we address is the summary type: most manual summaries are abstractive, whereas automatic methods mainly create extractive summaries, and the two are hard to compare using standard evaluation methods. Therefore, we suggest a multi-step process for creating coherent extracts, which are based on information taken directly from the source documents but minimally redacted and meaningfully ordered to form a coherent text. Our qualitative and quantitative evaluations show that quantitative results are not sufficient to judge the quality of a summary and that other quality criteria, such as coherence, should also be taken into account. We find that our corpus is of high quality and that it has the potential to bridge the gap between reference corpora of abstracts and automatic methods producing extracts. Our corpus is available to the research community for further development.

Item Type: Conference or Workshop Item
Published: 2016
Creators: Benikova, Darina and Mieskes, Margot and Meyer, Christian M. and Gurevych, Iryna
Title: Bridging the gap between extractive and abstractive summaries: Creation and evaluation of coherent extracts from heterogeneous sources
Language: English
Title of Book: Proceedings of the 26th International Conference on Computational Linguistics (COLING)
ISBN: 978-4-87974-702-0
Uncontrolled Keywords: UKP_reviewed;AIPHES_corpus
Divisions: 20 Department of Computer Science
20 Department of Computer Science > Ubiquitous Knowledge Processing
DFG Research Training Groups (DFG-Graduiertenkollegs)
DFG Research Training Groups (DFG-Graduiertenkollegs) > Research Training Group 1994 Adaptive Preparation of Information from Heterogeneous Sources
Event Location: Osaka, Japan
Date Deposited: 31 Dec 2016 14:29
Official URL: http://aclweb.org/anthology/C16-1099
Identification Number: TUD-CS-2016-1445