Schnober, Carsten ; Gurevych, Iryna (2015)
Combining Topic Models for Corpus Exploration: Applying LDA for Complex Corpus Research Tasks in a Digital Humanities Project.
New York, NY, USA
doi: 10.1145/2809936.2809939
Conference publication, Bibliography
Abstract
We investigate new ways of applying LDA topic models: rather than optimizing a single model for a specific use case, we train multiple models based on different parameters and vocabularies which are combined on-the-fly to comply with varying information retrieval tasks. We also show a semi-automatic method which helps users to identify relevant topics across multiple models. Our methods are demonstrated and evaluated on a real-world use case: a large-scale corpus-based digital humanities project called Welt der Kinder (“Children and their World”). We illustrate our approach in that context and show that it can be generalized to other scenarios. We evaluate this work using empirical methods from information retrieval, but also show visualizations and use cases as actually applied in the project.
Item type: | Conference publication |
---|---|
Published: | 2015 |
Author(s): | Schnober, Carsten ; Gurevych, Iryna |
Type of entry: | Bibliography |
Title: | Combining Topic Models for Corpus Exploration: Applying LDA for Complex Corpus Research Tasks in a Digital Humanities Project |
Language: | English |
Publication date: | October 2015 |
Publisher: | Sheridan Communications |
Book title: | Proceedings of the 2015 Workshop on Topic Models: Post-Processing and Applications |
Series: | TM'15 |
Event location: | New York, NY, USA |
DOI: | 10.1145/2809936.2809939 |
URL / URN: | http://doi.acm.org/10.1145/2809936.2809939 |
Free keywords: | UKP_p_WeltDerKinder; UKP_reviewed; Semantic Information Management; Digital Humanities, Topic Models, Information Retrieval |
ID number: | TUD-CS-2015-1197 |
Department(s): | 20 Department of Computer Science; 20 Department of Computer Science > Ubiquitous Knowledge Processing |
Date deposited: | 31 Dec 2016 14:29 |
Last modified: | 24 Jan 2020 12:03 |