Unsupervised Latent Dirichlet Allocation for supervised question classification

Momtazi, Saeedeh and Gurevych, Iryna (2018):
Unsupervised Latent Dirichlet Allocation for supervised question classification.
In: Information Processing & Management, 54 (3), pp. 380-393, ISSN 0306-4573,
DOI: 10.1016/j.ipm.2018.11.007,
[Online-Edition: https://www.sciencedirect.com/science/article/pii/S030645731...],
[Article]

Abstract

Question answering systems assist users in satisfying their information needs more precisely by providing focused responses to their questions. Among the various systems developed for this purpose, community-based question answering has recently received researchers’ attention due to the large amount of user-generated questions and answers on social question-and-answer platforms. Reusing such data sources requires an accurate information retrieval component enhanced by a question classifier. Question classification tells the system which category a question belongs to, so that retrieval can focus on questions and answers from categories relevant to the input question. In this paper, we propose a new method based on unsupervised Latent Dirichlet Allocation for classifying questions in community-based question answering. Our method first uses unsupervised topic modeling to extract topics from a large amount of unlabeled data. The learned topics are then used in the training phase to find their association with the available category labels in the training data. The category mixture of topics is finally used to predict the label of unseen data.
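
The three steps named in the abstract (topics from unlabeled data, topic-to-category association, prediction from the category mixture of topics) can be illustrated with a small sketch. This is one plausible reading of the abstract, not the authors' implementation: the toy questions, the function names (`fit_lda`, `topic_mixture`, `category_mixtures`, `predict`), the hyperparameters, the simplified fold-in inference, and the dot-product scoring rule are all assumptions made for illustration.

```python
import random
from collections import Counter

def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA on tokenized, unlabeled documents."""
    rng = random.Random(seed)
    V = len({w for doc in docs for w in doc})
    topic_word = [Counter() for _ in range(n_topics)]
    topic_total = [0] * n_topics
    doc_topic = [[0] * n_topics for _ in docs]
    assign = []
    for d, doc in enumerate(docs):  # random initial topic per token
        zs = [rng.randrange(n_topics) for _ in doc]
        for w, z in zip(doc, zs):
            topic_word[z][w] += 1
            topic_total[z] += 1
            doc_topic[d][z] += 1
        assign.append(zs)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assign[d][i]  # remove the token, then resample its topic
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                doc_topic[d][z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta) / (topic_total[k] + V * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights)[0]
                assign[d][i] = z
                topic_word[z][w] += 1
                topic_total[z] += 1
                doc_topic[d][z] += 1
    return topic_word, topic_total, V

def topic_mixture(doc, model, n_iter=20, alpha=0.1, beta=0.01):
    """Estimate a document's topic distribution with fixed topics (simplified fold-in)."""
    topic_word, topic_total, V = model
    K = len(topic_total)
    theta = [1.0 / K] * K
    for _ in range(n_iter):
        counts = [alpha] * K
        for w in doc:
            r = [theta[k] * (topic_word[k][w] + beta) / (topic_total[k] + V * beta)
                 for k in range(K)]
            s = sum(r)
            for k in range(K):
                counts[k] += r[k] / s
        total = sum(counts)
        theta = [c / total for c in counts]
    return theta

def category_mixtures(labeled, model):
    """Average the topic mixtures of the labeled questions in each category."""
    sums, n = {}, Counter()
    for doc, label in labeled:
        theta = topic_mixture(doc, model)
        acc = sums.setdefault(label, [0.0] * len(theta))
        for k, t in enumerate(theta):
            acc[k] += t
        n[label] += 1
    return {c: [v / n[c] for v in acc] for c, acc in sums.items()}

def predict(doc, model, cat_mix):
    """Pick the category whose topic mixture best matches the question (dot product)."""
    theta = topic_mixture(doc, model)
    return max(cat_mix, key=lambda c: sum(a * b for a, b in zip(theta, cat_mix[c])))

# Hypothetical toy data: unlabeled questions for topic learning, a few labeled ones.
unlabeled = [q.split() for q in [
    "how long to bake bread in oven", "best flour for bread dough",
    "oven temperature for pizza dough", "how to fix python import error",
    "python loop over list error", "install python package with pip",
]]
labeled = [("bake pizza in oven".split(), "Food"),
           ("python pip install error".split(), "Programming")]

model = fit_lda(unlabeled, n_topics=2)
cat_mix = category_mixtures(labeled, model)
label = predict("bread dough oven".split(), model, cat_mix)
```

Only the topic model sees the unlabeled questions; the labeled set is used solely to map topics to categories, which is the division of labor the abstract describes.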

Item Type: Article
Published: 2018
Creators: Momtazi, Saeedeh and Gurevych, Iryna
Title: Unsupervised Latent Dirichlet Allocation for supervised question classification
Language: English
Journal or Publication Title: Information Processing & Management
Volume: 54
Number: 3
Uncontrolled Keywords: UKP_p_QAEduInf
Divisions: 20 Department of Computer Science
20 Department of Computer Science > Ubiquitous Knowledge Processing
Date Deposited: 18 Dec 2018 10:55
DOI: 10.1016/j.ipm.2018.11.007
Official URL: https://www.sciencedirect.com/science/article/pii/S030645731...