Topic Modeling for Search and Exploration in Multivariate Research Data Repositories

Scherer, Maximilian and Landesberger, Tatiana von and Schreck, Tobias (2013):
Topic Modeling for Search and Exploration in Multivariate Research Data Repositories.
Springer, Berlin, Heidelberg, New York, In: Research and Advanced Technology for Digital Libraries, In: Lecture Notes in Computer Science (LNCS); 8092, DOI: 10.1007/978-3-642-40501-3_39,
[Conference or Workshop Item]

Abstract

Huge amounts of multivariate research data are produced and made publicly available in digital libraries. Little research has focused on similarity functions that take multivariate data documents as a whole into account. Such similarity functions benefit users by enabling them to browse and query large collections of multivariate data using nearest-neighbor indexing. In this paper we tackle this challenge and propose a novel similarity function for multivariate data documents based on topic modeling. Building on a previously developed bag-of-words approach for multivariate data, we learn a topic model for a collection of multivariate data documents and represent each document as a mixture of topics. This representation is well suited to efficient nearest-neighbor indexing and to clustering according to the topic distribution of a document. We present a use case in which we apply this approach to the retrieval of multivariate data in the field of climate research.
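
The retrieval step described in the abstract can be illustrated with a minimal sketch. It assumes the topic mixtures have already been learned and only shows a nearest-neighbor query over them. The abstract does not name a distance measure, so the Hellinger distance used here is an assumption, as are the document ids and mixture values, which are hypothetical. A brute-force scan stands in for a real nearest-neighbor index.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete probability distributions.

    Assumption: the abstract does not specify a distance measure; Hellinger
    is one common choice for comparing topic mixtures.
    """
    return math.sqrt(
        0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    )


def nearest_neighbors(query, collection, k=2):
    """Return the ids of the k documents whose topic mixtures are closest.

    This is a linear scan over all documents, used here in place of a
    proper nearest-neighbor index.
    """
    ranked = sorted(collection, key=lambda doc_id: hellinger(query, collection[doc_id]))
    return ranked[:k]


# Hypothetical topic mixtures over three topics, one per data document.
# Each row sums to 1, as a topic model would produce.
topic_mixtures = {
    "doc_a": [0.80, 0.15, 0.05],
    "doc_b": [0.10, 0.70, 0.20],
    "doc_c": [0.75, 0.20, 0.05],
    "doc_d": [0.05, 0.15, 0.80],
}

# Hypothetical topic mixture of the query document.
query = [0.78, 0.17, 0.05]
result = nearest_neighbors(query, topic_mixtures, k=2)
print(result)
```

Because every document is reduced to a short, fixed-length vector, the same representation could be passed to a clustering method or a spatial index instead of the linear scan shown here.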

Item Type: Conference or Workshop Item
Published: 2013
Creators: Scherer, Maximilian and Landesberger, Tatiana von and Schreck, Tobias
Title: Topic Modeling for Search and Exploration in Multivariate Research Data Repositories
Language: English
Series Name: Lecture Notes in Computer Science (LNCS); 8092
Publisher: Springer, Berlin, Heidelberg, New York
Uncontrolled Keywords: Forschungsgruppe Visual Search and Analysis (VISA), Multivariate data, Content based retrieval, Bag-of-words
Divisions: 20 Department of Computer Science
20 Department of Computer Science > Interactive Graphics Systems
Event Title: Research and Advanced Technology for Digital Libraries
Date Deposited: 12 Nov 2018 11:16
DOI: 10.1007/978-3-642-40501-3_39