Real-Time Summarization of Big Data Streams

Rücklé, Andreas (2015):
Real-Time Summarization of Big Data Streams.
Technische Universität Darmstadt, [Online-Edition: https://download.hrz.tu-darmstadt.de/media/FB20/Dekanat/Publ...],
[Master Thesis]

Abstract

Events such as natural disasters, riots, or protests increase the need for information among many people, whether because of geographic proximity, social ties, or general interest. Because many publishers produce a large number of news articles during such events, it is nearly impossible for an individual to process all of the information and stay up to date. Real-time summarization systems can help by providing updates on an event while the situation is still developing, without requiring the reader to analyze a large number of news articles manually. This master thesis presents a framework for real-time summarization and introduces multiple summarization systems based on it. Besides achieving good summarization quality, a further focus of the work was to retain real-time properties, both in terms of summarization and in terms of computational performance. Starting from a simple approach, referred to as the Baseline, several improvements were made with the goal of creating an advanced system whose performance is similar to that of other state-of-the-art temporal summarization systems. The best resulting system is an adaptive approach that can change configurations and algorithms at run time to automatically select the best method for summarizing each target event. The adaptive selection is performed by detecting the importance of an event based on its news coverage. The system also requires all information to be reported by multiple sources before it can be included in an update. The adaptive summarization system achieved better summarization quality than the Baseline system. Furthermore, in a comparison with a state-of-the-art temporal summarization system, the adaptive approach also showed better results. At the same time, all real-time goals were achieved.
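
The multi-source requirement mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the class name `CorroborationFilter`, the Jaccard token-overlap measure, and the `min_sources` and `threshold` parameters are all assumptions chosen for illustration. The idea shown is only that a sentence is released into an update once sufficiently similar sentences have arrived from enough distinct sources.

```python
from __future__ import annotations

from dataclasses import dataclass, field


def tokens(text: str) -> frozenset[str]:
    """Lowercase the text and return its set of alphanumeric word tokens."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return frozenset(cleaned.split())


def jaccard(a: frozenset[str], b: frozenset[str]) -> float:
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


@dataclass
class Candidate:
    """A piece of information seen in the stream, with the sources that reported it."""
    text: str
    toks: frozenset[str]
    sources: set[str] = field(default_factory=set)
    emitted: bool = False


class CorroborationFilter:
    """Hypothetical filter: emit a sentence only after enough distinct sources report it."""

    def __init__(self, min_sources: int = 2, threshold: float = 0.5) -> None:
        self.min_sources = min_sources  # assumed number of required distinct sources
        self.threshold = threshold      # assumed similarity cut-off for "same information"
        self.candidates: list[Candidate] = []

    def offer(self, source: str, sentence: str) -> str | None:
        """Register a sentence from a source; return text to publish, or None."""
        toks = tokens(sentence)
        # Find the first stored candidate that is similar enough to this sentence.
        match = next(
            (c for c in self.candidates if jaccard(c.toks, toks) >= self.threshold),
            None,
        )
        if match is None:
            match = Candidate(sentence, toks)
            self.candidates.append(match)
        match.sources.add(source)
        # Publish each candidate at most once, when it is first corroborated.
        if not match.emitted and len(match.sources) >= self.min_sources:
            match.emitted = True
            return match.text
        return None
```

In this sketch a repeated report from the same source does not count as corroboration, and a candidate is published only once. A production system would presumably also expire old candidates and use a stronger similarity measure.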

Item Type: Master Thesis
Published: 2015
Creators: Rücklé, Andreas
Title: Real-Time Summarization of Big Data Streams
Language: English
Divisions: 20 Department of Computer Science
20 Department of Computer Science > Ubiquitous Knowledge Processing
Date Deposited: 31 Dec 2016 14:29
Official URL: https://download.hrz.tu-darmstadt.de/media/FB20/Dekanat/Publ...
Identification Number: TUD-CS-2015-1312
Referees: Eugster, Patrick and Gurevych, Iryna