Beyond Generic Summarization: A Multi-faceted Hierarchical Summarization Corpus of Large Heterogeneous Data

Tauchmann, Christopher ; Arnold, Thomas ; Hanselowski, Andreas ; Meyer, Christian M. ; Mieskes, Margot (2018)
Beyond Generic Summarization: A Multi-faceted Hierarchical Summarization Corpus of Large Heterogeneous Data.
Miyazaki, Japan
Conference or Workshop Item, Bibliography

Abstract

Automatic summarization has so far focused on datasets of ten to twenty rather short documents of mostly news articles. But automatic systems could in theory analyze hundreds of documents from a range of sources and provide an overview to the interested reader. Such a summary would ideally present the most general issues in a specific topic and allow for more in-depth information on specific aspects within said topic. In this paper, we present a new approach for creating hierarchical summarization corpora by first, extracting relevant content from large, heterogeneous document collections using crowdsourcing and second, ordering the relevant information hierarchically by trained annotators. Our resulting corpus can be used to develop and evaluate hierarchical summarization systems.

Item Type: Conference or Workshop Item
Published: 2018
Creators: Tauchmann, Christopher ; Arnold, Thomas ; Hanselowski, Andreas ; Meyer, Christian M. ; Mieskes, Margot
Type of entry: Bibliography
Title: Beyond Generic Summarization: A Multi-faceted Hierarchical Summarization Corpus of Large Heterogeneous Data
Language: English
Date: May 2018
Publisher: European Language Resources Association
Book Title: Proceedings of the 11th International Conference on Language Resources and Evaluation (LREC)
Event Location: Miyazaki, Japan
URL / URN: http://www.lrec-conf.org/proceedings/lrec2018/summaries/252....
Uncontrolled Keywords: reviewed;AIPHES_corpus;AIPHES_area_c1
Identification Number: TUD-CS-2018-0007
Divisions: 20 Department of Computer Science
20 Department of Computer Science > Ubiquitous Knowledge Processing
DFG Research Training Groups (DFG-Graduiertenkollegs)
DFG Research Training Groups (DFG-Graduiertenkollegs) > Research Training Group 1994 Adaptive Preparation of Information from Heterogeneous Sources
Date Deposited: 14 Dec 2017 14:24
Last Modified: 15 Oct 2018 09:10