Benikova, Darina and Mieskes, Margot and Meyer, Christian M. and Gurevych, Iryna (2016):
Bridging the gap between extractive and abstractive summaries: Creation and evaluation of coherent extracts from heterogeneous sources.
In: Proceedings of the 26th International Conference on Computational Linguistics (COLING), pp. 1039-1050,
Osaka, Japan, ISBN 978-4-87974-702-0,
[Conference or Workshop Item]
Abstract
In our work, we present a corpus of heterogeneous documents for summarization to address the issue that information seekers usually face a range of different types of information sources. A second issue we address is the summary type: most manual summaries are abstractive, whereas automatic methods mainly create extractive summaries, and the two are hard to compare using standard evaluation methods. Therefore, we suggest a multi-step process for creating coherent extracts, which are based on information taken directly from the source documents but minimally redacted and meaningfully ordered to form a coherent text. Our qualitative and quantitative evaluations show that quantitative results are not sufficient to judge the quality of a summary and that other quality criteria, such as coherence, should also be taken into account. We find that our corpus is of high quality and that it has the potential to bridge the gap between reference corpora of abstracts and automatic methods producing extracts. Our corpus is available to the research community for further development.
Item Type: | Conference or Workshop Item |
---|---|
Published: | 2016 |
Creators: | Benikova, Darina and Mieskes, Margot and Meyer, Christian M. and Gurevych, Iryna |
Title: | Bridging the gap between extractive and abstractive summaries: Creation and evaluation of coherent extracts from heterogeneous sources |
Language: | English |
Title of Book: | Proceedings of the 26th International Conference on Computational Linguistics (COLING) |
ISBN: | 978-4-87974-702-0 |
Uncontrolled Keywords: | UKP_reviewed;AIPHES_corpus |
Divisions: | 20 Department of Computer Science; 20 Department of Computer Science > Ubiquitous Knowledge Processing; DFG-Graduiertenkollegs; DFG-Graduiertenkollegs > Research Training Group 1994 Adaptive Preparation of Information from Heterogeneous Sources |
Event Location: | Osaka, Japan |
Date Deposited: | 31 Dec 2016 14:29 |
Official URL: | http://aclweb.org/anthology/C16-1099 |
Identification Number: | TUD-CS-2016-1445 |