Interactive Summarization of Large Document Collections

Hättasch, Benjamin ; Meyer, Christian M. ; Binnig, Carsten (2019)
Interactive Summarization of Large Document Collections.
Workshop on Human-In-the-Loop Data Analytics. Amsterdam (05.07.2019-05.07.2019)
doi: 10.1145/3328519.3329129
Conference publication, Bibliography

Abstract

We present a new system for custom summarizations of large text corpora at interactive speed. The task of producing textual summaries is an important step to understand large collections of topic-related documents and has many real-world applications in journalism, medicine, and many more. Key to our system is that the summarization model is refined by user feedback and called multiple times to improve the quality of the summarization iteratively. To that end, the human is brought into the loop to gather feedback in every iteration about which aspects of the intermediate summaries satisfy their individual information needs. Our system consists of a sampling component and a learned model to produce a textual summary. As we show in our evaluation, our system can provide a similar quality level as existing summarization models that are working on the full corpus and hence cannot provide interactive speeds.
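
The loop described in the abstract (sample the corpus, summarize the sample, collect feedback, repeat) can be illustrated with a minimal sketch. This is not the authors' implementation: the word-weight scoring stands in for the learned model, and the names `tokenize`, `summarize`, and `interactive_round`, as well as the toy corpus and feedback sets, are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def tokenize(text):
    """Lowercase a sentence and strip basic punctuation from each word."""
    words = (w.strip(".,;:!?").lower() for w in text.split())
    return [w for w in words if w]


def summarize(sentences, weights, k):
    """Return the k sentences whose distinct words have the highest total weight.

    Assumption: a simple weight sum replaces the paper's learned model.
    """
    def score(sentence):
        return sum(weights[w] for w in set(tokenize(sentence)))
    return sorted(sentences, key=score, reverse=True)[:k]


def interactive_round(corpus, weights, liked, disliked, sample_size, k, rng):
    """One iteration: apply user feedback, draw a sample, summarize the sample."""
    for w in liked:
        weights[w] += 1.0   # terms the user marked as relevant
    for w in disliked:
        weights[w] -= 1.0   # terms the user marked as irrelevant
    # Working on a sample rather than the full corpus keeps each round cheap.
    sample = rng.sample(corpus, min(sample_size, len(corpus)))
    return summarize(sample, weights, k)


# Hypothetical toy corpus and feedback, for illustration only.
corpus = [
    "Vaccine trials started in spring.",
    "The election results were contested.",
    "Rainfall was below average.",
]
weights = Counter()
rng = random.Random(0)
first = interactive_round(corpus, weights, {"vaccine"}, set(), 3, 1, rng)
second = interactive_round(corpus, weights, {"election"}, {"vaccine"}, 3, 1, rng)
```

In this sketch the summary shifts from the vaccine sentence to the election sentence once the feedback changes, which mirrors the idea of refining the summary over several rounds. A smaller `sample_size` would trade summary quality for speed.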

Item type: Conference publication
Published: 2019
Author(s): Hättasch, Benjamin ; Meyer, Christian M. ; Binnig, Carsten
Entry type: Bibliography
Title: Interactive Summarization of Large Document Collections
Language: English
Date of publication: July 2019
Place: Amsterdam, Netherlands
Book title: HILDA'19: Proceedings of the ...
Event title: Workshop on Human-In-the-Loop Data Analytics
Event location: Amsterdam
Event date: 05.07.2019-05.07.2019
DOI: 10.1145/3328519.3329129
URL / URN: https://hilda.io/2019/proceedings/HILDA2019_paper_4.pdf
Uncontrolled keywords: Text Summarization, Machine Learning, Approximate Computing, AIPHES_area_d2, dm, dm_vi_ml, dm_sherlock
Additional information:

Article No 9

Division(s)/department(s): 20 Department of Computer Science
20 Department of Computer Science > Data Management (renamed Data and AI Systems in 2022)
DFG Research Training Groups
DFG Research Training Groups > Research Training Group 1994 Adaptive Preparation of Information from Heterogeneous Sources
Date deposited: 26 Apr 2019 13:27
Last modified: 22 Apr 2020 07:41