Combining Topic Models for Corpus Exploration: Applying LDA for Complex Corpus Research Tasks in a Digital Humanities Project

Schnober, Carsten ; Gurevych, Iryna (2015):
Combining Topic Models for Corpus Exploration: Applying LDA for Complex Corpus Research Tasks in a Digital Humanities Project.
In: Proceedings of the 2015 Workshop on Topic Models: Post-Processing and Applications (TM'15), pp. 11-20,
Sheridan Communications, New York, NY, USA, ISBN 978-1-4503-3784-7,
DOI: 10.1145/2809936.2809939,
[Conference or Workshop Item]

Abstract

We investigate new ways of applying LDA topic models: rather than optimizing a single model for a specific use case, we train multiple models based on different parameters and vocabularies which are combined on-the-fly to comply with varying information retrieval tasks. We also show a semi-automatic method which helps users to identify relevant topics across multiple models. Our methods are demonstrated and evaluated on a real-world use case: a large-scale corpus-based digital humanities project called Welt der Kinder (“Children and their World”). We illustrate our approach in that context and show that it can be generalized to other scenarios. We evaluate this work using empirical methods from information retrieval, but also show visualizations and use cases as actually applied in the project.

Item Type: Conference or Workshop Item
Published: 2015
Creators: Schnober, Carsten ; Gurevych, Iryna
Title: Combining Topic Models for Corpus Exploration: Applying LDA for Complex Corpus Research Tasks in a Digital Humanities Project
Language: English
Book Title: Proceedings of the 2015 Workshop on Topic Models: Post-Processing and Applications
Series: TM'15
Publisher: Sheridan Communications
ISBN: 978-1-4503-3784-7
Uncontrolled Keywords: UKP_p_WeltDerKinder;UKP_reviewed;Semantic Information Management;Digital Humanities, Topic Models, Information Retrieval
Divisions: 20 Department of Computer Science
20 Department of Computer Science > Ubiquitous Knowledge Processing
Event Location: New York, NY, USA
Date Deposited: 31 Dec 2016 14:29
DOI: 10.1145/2809936.2809939
URL / URN: http://doi.acm.org/10.1145/2809936.2809939
Identification Number: TUD-CS-2015-1197