Combining Topic Models for Corpus Exploration: Applying LDA for Complex Corpus Research Tasks in a Digital Humanities Project

Schnober, Carsten and Gurevych, Iryna (2015):
Combining Topic Models for Corpus Exploration: Applying LDA for Complex Corpus Research Tasks in a Digital Humanities Project.
In: Proceedings of the 2015 Workshop on Topic Models: Post-Processing and Applications (TM'15), Sheridan Communications, New York, NY, USA, ISBN 978-1-4503-3784-7,
DOI: 10.1145/2809936.2809939,
[Online-Edition: http://doi.acm.org/10.1145/2809936.2809939],
[Conference or Workshop Item]

Abstract

We investigate new ways of applying LDA topic models: rather than optimizing a single model for a specific use case, we train multiple models based on different parameters and vocabularies which are combined on-the-fly to comply with varying information retrieval tasks. We also show a semi-automatic method which helps users to identify relevant topics across multiple models. Our methods are demonstrated and evaluated on a real-world use case: a large-scale corpus-based digital humanities project called Welt der Kinder (“Children and their World”). We illustrate our approach in that context and show that it can be generalized to other scenarios. We evaluate this work using empirical methods from information retrieval, but also show visualizations and use cases as actually applied in the project.
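As an illustration of what combining several topic models for a retrieval task could look like, here is a minimal sketch. It is not the authors' method: the record gives no implementation details, so the scoring rule (averaging, over models, the probability mass each document assigns to user-selected topics) is an assumption, as are the names `combined_scores` and `rank_documents` and all of the numbers. The document-topic matrices are assumed to come from separately trained LDA models.

```python
from typing import Dict, List, Sequence

# One row per document, one column per topic (hypothetical type alias).
DocTopics = List[List[float]]


def combined_scores(
    models: Dict[str, DocTopics],
    selected: Dict[str, Sequence[int]],
) -> List[float]:
    """Average, over models, the probability mass that each document
    assigns to the topics a user marked as relevant in that model.

    Assumption: this averaging rule is illustrative only.
    """
    # Only models with at least one selected topic contribute.
    names = [name for name in models if selected.get(name)]
    if not names:
        raise ValueError("no model has selected topics")
    n_docs = len(models[names[0]])
    scores = [0.0] * n_docs
    for name in names:
        doc_topics = models[name]
        if len(doc_topics) != n_docs:
            raise ValueError("models must cover the same documents")
        for d, row in enumerate(doc_topics):
            scores[d] += sum(row[t] for t in selected[name])
    return [s / len(names) for s in scores]


def rank_documents(scores: Sequence[float]) -> List[int]:
    """Return document indices sorted by descending score."""
    return sorted(range(len(scores)), key=lambda d: scores[d], reverse=True)


# Invented document-topic distributions for three documents, standing in
# for two LDA models trained with different numbers of topics.
model_a = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4]]
model_b = [[0.6, 0.4], [0.2, 0.8], [0.5, 0.5]]

scores = combined_scores(
    {"a": model_a, "b": model_b},
    {"a": [0], "b": [0]},  # topic 0 marked relevant in both models
)
ranking = rank_documents(scores)
```

Because the models may have different topic counts and vocabularies, the sketch only combines them at the document level, where their outputs are comparable.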

Item Type: Conference or Workshop Item
Published: 2015
Creators: Schnober, Carsten and Gurevych, Iryna
Title: Combining Topic Models for Corpus Exploration: Applying LDA for Complex Corpus Research Tasks in a Digital Humanities Project
Language: English
Title of Book: Proceedings of the 2015 Workshop on Topic Models: Post-Processing and Applications
Series Name: TM'15
Publisher: Sheridan Communications
ISBN: 978-1-4503-3784-7
Uncontrolled Keywords: UKP_p_WeltDerKinder; UKP_reviewed; Semantic Information Management; Digital Humanities; Topic Models; Information Retrieval
Divisions: 20 Department of Computer Science
20 Department of Computer Science > Ubiquitous Knowledge Processing
Event Location: New York, NY, USA
Date Deposited: 31 Dec 2016 14:29
DOI: 10.1145/2809936.2809939
Official URL: http://doi.acm.org/10.1145/2809936.2809939
Identification Number: TUD-CS-2015-1197