Exploiting Debate Portals for Semi-supervised Argumentation Mining in User-Generated Web Discourse

Habernal, Ivan and Gurevych, Iryna (2015):
Exploiting Debate Portals for Semi-supervised Argumentation Mining in User-Generated Web Discourse.
In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP), Association for Computational Linguistics, Lisbon, Portugal, [Online-Edition: http://www.aclweb.org/anthology/D15-1255]
[Conference or Workshop Item]

Abstract

Analyzing arguments in user-generated Web discourse has recently gained attention in argumentation mining, an evolving field of NLP. Current approaches, which employ fully supervised machine learning, are usually domain-dependent and suffer from the lack of large and diverse annotated corpora. However, annotating arguments in discourse is costly, error-prone, and highly context-dependent. We ask whether leveraging unlabeled data in a semi-supervised manner can boost the performance of argument component identification, and to what extent the approach is independent of domain and register. We propose novel features that exploit clustering of unlabeled data from debate portals based on a word embeddings representation. Using these features, we significantly outperform several baselines in the cross-validation, cross-domain, and cross-register evaluation scenarios.

Item Type: Conference or Workshop Item
Published: 2015
Creators: Habernal, Ivan and Gurevych, Iryna
Title: Exploiting Debate Portals for Semi-supervised Argumentation Mining in User-Generated Web Discourse
Language: English
Title of Book: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Publisher: Association for Computational Linguistics
Uncontrolled Keywords: UKP_a_ArMin;UKP_reviewed;argumentation mining
Divisions: 20 Department of Computer Science
20 Department of Computer Science > Ubiquitous Knowledge Processing
DFG-Graduiertenkollegs
DFG-Graduiertenkollegs > Research Training Group 1994 Adaptive Preparation of Information from Heterogeneous Sources
Event Location: Lisbon, Portugal
Date Deposited: 31 Dec 2016 14:29
Official URL: http://www.aclweb.org/anthology/D15-1255
Identification Number: TUD-CS-2015-1178